keyongtech



 #1  
11-14-08, 06:58 PM
Mark Sizzler
We have problems with the Java Garbage Collector.
It is very slow when we hold large tables in memory and perform many inserts and updates.

As far as I remember there are 3rd party improved garbage collector products.
Does someone have a recommendation?

Are there any best practice hints on how to improve the Java built-in GC?

Mark
 #2  
11-14-08, 07:27 PM
John B. Matthews
In article <491dca53$0$32682$9b4e6d93>,
marksiz (Mark Sizzler) wrote:

> We have problems with the Java Garbage Collector. It is very slow
> when we hold large tables in memory and perform many inserts and
> updates.


This makes me wonder if you are unintentionally retaining objects in a
collection that implements the Map interface.

[...]
> Are there any best practice hints on how to improve the Java built-in GC?


Avoid unintentional object retention and consider WeakHashMap:

<http://java.sun.com/javase/6/docs/api/java/util/WeakHashMap.html>
<http://www-128.ibm.com/developerworks/java/library/j-jtp11225/>
<http://www.ibm.com/developerworks/java/library/j-perf08273.html>
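
A minimal sketch of the WeakHashMap suggestion, assuming a cache keyed by some session-like object. The names WeakCacheDemo, SessionKey and lookup are invented for illustration and are not from the original poster's code. An entry survives only while something else holds a strong reference to its key; when the entry actually disappears is up to the collector, so the example does not rely on timing.

```java
import java.util.Map;
import java.util.WeakHashMap;

final class WeakCacheDemo {
    // Hypothetical key type; it uses identity equality, as WeakHashMap keys usually should.
    static final class SessionKey {
        final String id;
        SessionKey(String id) { this.id = id; }
    }

    // Each entry is held only as long as its key is strongly reachable elsewhere.
    static final Map<SessionKey, String> CACHE = new WeakHashMap<>();

    static String lookup(SessionKey key) {
        return CACHE.get(key);
    }

    public static void main(String[] args) {
        SessionKey key = new SessionKey("abc");
        CACHE.put(key, "expensive result");
        System.out.println(lookup(key));   // key is still strongly held here

        key = null;                        // the entry is now eligible for removal
        System.gc();                       // only a hint; removal is not guaranteed
        System.out.println("entries left: " + CACHE.size());
    }
}
```

A value that refers back to its own key keeps that key strongly reachable, so such an entry is never dropped.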
 #3  
11-14-08, 08:04 PM
Lew
Mark Sizzler wrote:
>> We have problems with the Java Garbage Collector. It is very slow
>> when we hold large tables in memory and perform many inserts and
>> updates.



John B. Matthews wrote:
> This makes me wonder if you are unintentionally retaining objects in a
> collection that implements the Map interface.


Other possibilities are references from long-lived objects to short-
lived ones, and intentional retention of objects in the mistaken
belief that it will reduce GC overhead.

Mark Sizzler wrote:
>> Are there any best practice hints on how to improve the Java built-in GC?


John B. Matthews wrote:
> Avoid unintentional object retention and consider WeakHashMap:


Also avoid intentional retention of references.

If the algorithm requires that an object live a long time, let it
live. But an antipattern in Java is to keep an object and reuse it
for different values over and over, e.g.,

public class GcAntiPattern
{
    public static void main( String [] args )
    {
        Foo foo = new Foo();
        for ( int ix = 0; ix < 10000; ++ix )
        {
            fillWithValues( foo );
            doSomethingWithFoo( foo );
        }
    }
}

Usually in such cases it is better to allocate the 'Foo' inside the
loop so that it can be GCed by a minor collection instead of a major
one.
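
A sketch of the alternative, with the object created inside the loop. The Foo class here is a made-up stand-in that carries one int, and sumOfSquares is a hypothetical replacement for the fill-and-process steps; neither comes from real code.

```java
final class FreshAllocationDemo {
    // Hypothetical value holder standing in for 'Foo'.
    static final class Foo {
        final int value;
        Foo(int value) { this.value = value; }
    }

    static long sumOfSquares(int iterations) {
        long total = 0;
        for (int ix = 0; ix < iterations; ++ix) {
            // A fresh object whose scope is a single iteration; it is
            // unreachable again before the next pass starts.
            Foo foo = new Foo(ix);
            total += (long) foo.value * foo.value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10000));
    }
}
```

Because each Foo is constructed with its values, the class can have final fields and needs no separate fill step.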

> <http://java.sun.com/javase/6/docs/api/java/util/WeakHashMap.html>
> <http://www-128.ibm.com/developerworks/java/library/j-jtp11225/>
> <http://www.ibm.com/developerworks/java/library/j-perf08273.html>


Also:
<http://www.ibm.com/developerworks/java/library/j-jtp09275.html>
<http://www.ibm.com/developerworks/java/library/j-jtp01274.html>
 #4  
11-14-08, 08:27 PM
rossum
On 14 Nov 2008 18:58:27 GMT, marksiz (Mark Sizzler) wrote:

>We have problems with the Java Garbage Collector.
>It is very slow when we hold large tables in memory and perform many inserts and updates.
>
>As far as I remember there are 3rd party improved garbage collector products.
>Does someone have a recommendation?
>
>Are there any best practice hints on how to improve the Java built-in GC?

Try to allocate enough space for your tables right at the start so
insertions/updates do not cause the table to expand, and hence have to
be moved.

Free up things as soon as possible.

Avoid finalizers if at all possible.

Allocate more memory to the JVM.
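
A minimal sketch of the first point, assuming the tables are ordinary java.util collections and that the row count is roughly known up front. PresizedTables, capacityFor and the figure of one million rows are assumptions for illustration. A HashMap resizes once its size exceeds capacity times the load factor (0.75 by default), so the requested capacity has to be larger than the expected entry count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PresizedTables {
    // Initial capacity at which 'expected' entries fit under the default
    // load factor of 0.75 without triggering a rehash.
    static int capacityFor(int expected) {
        return (int) (expected / 0.75f) + 1;
    }

    static Map<Integer, String> newTable(int expected) {
        return new HashMap<>(capacityFor(expected));
    }

    static List<String> newRows(int expected) {
        // ArrayList takes the element count directly as its initial capacity.
        return new ArrayList<>(expected);
    }

    public static void main(String[] args) {
        int expected = 1_000_000;          // assumed row count
        Map<Integer, String> table = newTable(expected);
        for (int i = 0; i < expected; i++) {
            table.put(i, "row " + i);
        }
        System.out.println(table.size());
    }
}
```

For the last point, the heap limits are set when the JVM is started, with the -Xms (initial) and -Xmx (maximum) options.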

rossum
 #5  
11-14-08, 09:00 PM
Daniel Pitts
Mark Sizzler wrote:
> We have problems with the Java Garbage Collector.
> It is very slow when we hold large tables in memory and perform many inserts and updates.
>
> As far as I remember there are 3rd party improved garbage collector products.
> Does someone have a recommendation?
>
> Are there any best practice hints on how to improve the Java built-in GC?
>
> Mark

Don't hold large tables in memory :-)
Are you sure it's the garbage collector? Have you tried tuning the
performance parameters:
<http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html>
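
One way to check whether the collector really is the bottleneck is to compare the time the JVM reports spending in GC with the wall-clock time of the workload. The sketch below uses the standard java.lang.management API; GcTimeReport and runWorkload are invented names, and the workload is a placeholder that would be replaced by the real table inserts and updates.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcTimeReport {
    // Approximate total milliseconds spent in all collectors since JVM start.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Placeholder workload that churns short-lived arrays.
    static long runWorkload(int rounds) {
        long checksum = 0;
        for (int i = 0; i < rounds; i++) {
            int[] scratch = new int[1024];
            scratch[i % scratch.length] = i;
            checksum += scratch[i % scratch.length];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long wallStart = System.nanoTime();
        long gcStart = totalGcMillis();

        runWorkload(1_000_000);

        long gcMillis = totalGcMillis() - gcStart;
        long wallMillis = (System.nanoTime() - wallStart) / 1_000_000L;
        System.out.println("GC time:   " + gcMillis + " ms");
        System.out.println("Wall time: " + wallMillis + " ms");
    }
}
```

If the GC share of the wall time is small, the slowdown is more likely in the table operations themselves than in the collector.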
 #6  
11-15-08, 12:26 AM
Lew
Lew wrote:

John B. Matthews wrote:
>>> <http://java.sun.com/javase/6/docs/api/java/util/WeakHashMap.html>
>>> <http://www-128.ibm.com/developerworks/java/library/j-jtp11225/>
>>> <http://www.ibm.com/developerworks/java/library/j-perf08273.html>


Lew wrote:
>> Also:
>> <http://www.ibm.com/developerworks/java/library/j-jtp09275.html>
>> <http://www.ibm.com/developerworks/java/library/j-jtp01274.html>


Avinash Ramana wrote:
> In the case of that anti-pattern...
>
> Can you elaborate a little more on why the anti-pattern is bad? Why
> is reusing an object bad? Is it worse than creating an object in a
> loop like that?


The referenced links explain it better than I probably can, but the key is
that Java uses a generational garbage collector. Young generation collections
are very fast, and object creation is blazingly fast. Tenured generation
collections take much longer. Also, the JVM can often optimize away object
creation altogether with temporary short-lived objects, depending on their
shape and usage. Basically, as one of the cited articles points out, the
programmer usually cannot do better than the compiler in Java.

This is only a rule of thumb. It could be that creation of umpty-gazillion
objects inside the loop would trigger so many young-generation GC cycles that
it would help to create one outside the loop. Or, Hotspot might figure that
out for you. It's hard to tell.

That is the crux. We as programmers really don't know. It's better to scope
a variable for its natural life - if an object is only used inside the loop,
declare it inside the loop. This will prevent bugs that would be much, much
worse than a putative, unprovable slowdown due to GC.
 #7  
11-15-08, 04:03 PM
Tom Anderson
On Fri, 14 Nov 2008, Mark Sizzler wrote:

> We have problems with the Java Garbage Collector. It is very slow when
> we hold large tables in memory and perform many inserts and updates.
>
> As far as I remember there are 3rd party improved garbage collector
> products. Does someone have a recommendation?
>
> Are there any best practice hints on how to improve the Java built-in
> GC?


I'm not aware of any way to plug a third-party GC into Sun's JVM. You
could switch to using another JVM - IBM make a good one, but i'm not aware
of any more which are anywhere near as good. They're mostly research VMs,
or fairly basic open-source ones. AFAIK, anyway.

However, there are a lot of flags you can use to tune the way Sun's GC
works. This guide discusses the most important stuff:

http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html

Googling for things like 'java garbage collection tuning' will find you
more.

tom
 #8  
11-15-08, 04:14 PM
Arne Vajhøj
Tom Anderson wrote:
> I'm not aware of any way to plug a third-party GC into Sun's JVM. You
> could switch to using another JVM - IBM make a good one, but i'm not
> aware of any more which are anywhere near as good. They're mostly
> research VMs, or fairly basic open-source ones. AFAIK, anyway.


BEA and Oracle have their own JVMs too (Oracle owns BEA now, but
I don't think they have merged product lines yet).

BEA JRockit has a very good reputation.

Arne
 #9  
11-17-08, 09:47 AM
J. Davidson
Lew wrote:
> Other possibilities are references from long-lived objects to short-
> lived ones, and intentional retention of objects in the mistaken
> belief that it will reduce GC overhead.


The garbage collector used to be very different.

Old Java garbage collectors like mark-sweep had to do work in proportion
to the number of dead objects. So to make code run fast you reused
objects and avoided discarding too many.

The relatively new generational garbage collector has to do work in
proportion to the number of surviving objects instead. So to make code
run fast you now should discard objects rather than retain them.

GC optimization has basically been turned 180 degrees and stood up on
its head by this. The advice to get the fastest GC performance now is
diametrically opposite what was best with the older GC.

If Mark is working with a Java project with very many tree-rings in it,
it's quite likely the thing was coded exactly as horribly as possible
from the stand-point of GC optimization, because of GC-optimization.

Unfortunately, making it run fast with the newer GC means a lot of work
or even a total rewrite of big chunks of it in that case. On the plus
side, the results will be well worth it, making everything far faster
than the old code used with the older GC. And it's very unlikely that
the next big revolution in GC will flip everything over again, so the
new code will be future-proofed in one respect that the old code
(obviously) wasn't.

Another good reason is that treating most objects as disposable can
greatly simplify a lot of the logic, getting rid of pools and other
scaffolding and allowing some of them to be made immutable, which lets
you get rid of setters and possibly a lot of other code. That is code that
had to be maintained and might contain bugs, and that was bloating the
size of the running image and the jar files, bloating the
documentation, and cluttering the IDE's method listing and code view.

You can also afford to make classes that encapsulate a few primitives
where you might have once just used the primitives directly to cut down
on object creation. To use a worn-out old example, you can get rid of
all those xs and ys worked on in tandem and use Point2D or Complex or
whatever fits the situation without all those used-up Point2Ds or
Complexes gumming up the garbage collector the way they would have, five
or ten years ago. The code might get more readable, simpler, and less
error-prone.
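
A minimal sketch of such a disposable value class. Vec2 is a made-up name and not a class from any library. Every operation returns a new instance instead of mutating one, so there are no setters and the xs and ys travel together.

```java
final class Vec2 {
    final double x;
    final double y;

    Vec2(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Returns a new vector; the operands are left untouched.
    Vec2 plus(Vec2 other) {
        return new Vec2(x + other.x, y + other.y);
    }

    Vec2 scaled(double factor) {
        return new Vec2(x * factor, y * factor);
    }

    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }

    public static void main(String[] args) {
        Vec2 position = new Vec2(1.0, 2.0);
        Vec2 velocity = new Vec2(0.5, -0.5);
        // Each step creates short-lived temporaries.
        for (int step = 0; step < 3; step++) {
            position = position.plus(velocity.scaled(2.0));
        }
        System.out.println(position);
    }
}
```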

Or you might just end up wishing for operator overloading to be in Java
7. :)

- jenny
 #10  
11-17-08, 01:46 PM
Tom Anderson
On Mon, 17 Nov 2008, J. Davidson wrote:

> Lew wrote:
>> Other possibilities are references from long-lived objects to short-
>> lived ones, and intentional retention of objects in the mistaken
>> belief that it will reduce GC overhead.

>
> The garbage collector used to be very different.
>
> Old Java garbage collectors like mark-sweep had to do work in proportion to
> the number of dead objects.


I don't think that was ever true. Mark-and-sweep and stop-and-copy both do
work proportional to the number of live objects - dead objects are never
reached during traversal of the object graph, and never touched.

I think only C-style memory managers, which put each deleted block on a
free list, do work proportional to dead objects.

Rather, it was that the pre-generational collectors would do work
proportional to the *total* number of live objects on every collection,
whereas generational collectors do work proportional to the number of live
objects *in the nursery* (roughly). That meant that if you had a
significant number of long-lived objects (which most apps do), you wanted
to avoid frequent collections, because each one would walk all your
objects, and thus ...

> So to make code run fast you reused objects and avoided discarding too
> many.


Bingo.

> GC optimization has basically been turned 180 degrees and stood up on
> its head by this. The advice to get the fastest GC performance now is
> diametrically opposite what was best with the older GC.


Also bingo. This is why i shudder when i read things like this:

http://lab.polygonal.de/2008/06/18/using-object-pools/

Right now, Flash has a pretty basic collector, and object pooling is
apparently a very big win. At some point in the next few years, it'll get
a more sophisticated collector (along with a JIT and all the other goodies
needed to keep up with javascript), and then there's an excellent chance
that all this stuff will end up being a big smoking hole in everyone's
feet.

With any luck, there will be a small number of pool libraries in use, and
they'll all be written so that they can be replaced with no-op non-pooling
versions when the time comes. Fingers crossed.
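
The idea of a swappable pool can be sketched as follows, in Java rather than ActionScript. All names here (PoolDemo, Pool, ReusingPool, NoOpPool) are hypothetical and not taken from the linked article or any library. Callers depend only on the interface, so the pooling implementation can later be replaced by one that simply allocates.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

final class PoolDemo {
    interface Pool<T> {
        T acquire();
        void release(T obj);
    }

    // Hands released instances back out before creating new ones.
    static final class ReusingPool<T> implements Pool<T> {
        private final Deque<T> free = new ArrayDeque<>();
        private final Supplier<T> factory;

        ReusingPool(Supplier<T> factory) { this.factory = factory; }

        public T acquire() {
            T obj = free.poll();
            return obj != null ? obj : factory.get();
        }

        public void release(T obj) { free.push(obj); }
    }

    // Same interface, no pooling: always allocates, and lets the GC reclaim.
    static final class NoOpPool<T> implements Pool<T> {
        private final Supplier<T> factory;

        NoOpPool(Supplier<T> factory) { this.factory = factory; }

        public T acquire() { return factory.get(); }

        public void release(T obj) { /* intentionally empty */ }
    }

    public static void main(String[] args) {
        // Switching strategy means changing only this one line.
        Pool<StringBuilder> pool = new NoOpPool<>(StringBuilder::new);
        StringBuilder sb = pool.acquire();
        sb.append("hello");
        System.out.println(sb);
        pool.release(sb);
    }
}
```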

tom
 #11  
11-21-08, 07:36 AM
J. Davidson
Tom Anderson wrote:
> I think only C-style memory managers, which put each deleted block on a
> free list, do work proportional to dead objects.


I recall reading about garbage collectors that did so as well. I
definitely recall old Java versions performing better if you reused
objects rather than discarded them -- exactly opposite to current Java.
About the latter you seem to be in agreement.
 #12  
11-21-08, 12:20 PM
Tom Anderson
On Fri, 21 Nov 2008, J. Davidson wrote:

> Tom Anderson wrote:
>> I think only C-style memory managers, which put each deleted block on a
>> free list, do work proportional to dead objects.

>
> I recall reading about garbage collectors that did so as well. I
> definitely recall old Java versions performing better if you reused
> objects rather than discarded them -- exactly opposite to current Java.
> About the latter you seem to be in agreement.


Yes, absolutely. I think this was due to their slow implementation of
allocation, rather than their doing per-dead-object work. I think.

tom
 #13  
11-21-08, 03:25 PM
Roedy Green
On 14 Nov 2008 18:58:27 GMT, marksiz (Mark Sizzler) wrote,
quoted or indirectly quoted someone who said :

>We have problems with the Java Garbage Collector.
>It is very slow when we hold large tables in memory and perform many inserts and updates.


Others will tackle your problem directly. Here are some things to
check before you invest big bucks in a new GC package.

Have you instrumented the code to be sure the problem is GC, and not
the operations on the tables themselves?

Does your table technique keep allocating new objects frequently? It
should be doing something like ArrayList does, using a buffer bigger
than needed and only growing it when it overflows.

Have you done a study of the objects to make sure there is no
packratting? No GC is going to work well if you accidentally hold on
to objects you don't really need. See
http://mindprod.com/jgloss/packratting.html

Finally there is the ballerina in phone booth problem. What is your
ratio of live object space to heap space? No GC will work well when
that ratio gets too large.
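
A rough way to look at that ratio from inside the program, using only java.lang.Runtime. HeapOccupancy and usedFraction are invented names. The figure counts garbage that has not been collected yet as well as live objects, so it over-estimates the live share; treat it as an upper bound.

```java
final class HeapOccupancy {
    // Fraction of the maximum heap currently occupied. This includes
    // not-yet-collected garbage, so it is an upper bound on the live ratio.
    static double usedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    public static void main(String[] args) {
        System.gc();   // a hint only; makes the reading closer to live data if honoured
        System.out.printf("heap in use: %.1f%%%n", usedFraction() * 100.0);
    }
}
```

Sampling this after the tables are loaded gives a first idea of how much headroom the collector has to work with.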
 #14  
11-21-08, 04:51 PM
Tom Anderson
On Fri, 21 Nov 2008, Roedy Green wrote:

> On 14 Nov 2008 18:58:27 GMT, marksiz (Mark Sizzler) wrote,
> quoted or indirectly quoted someone who said :
>
>> We have problems with the Java Garbage Collector.

>
> Finally there is the ballerina in phone booth problem.


I hadn't heard it called that before. That's a great name!

tom