The Garbage Collection Handbook: The Art of Automatic Memory Management (Chapman & Hall/CRC Applied Algorithms and Data Structures series)

Author: Richard Jones, Antony Hosking, Eliot Moss

Comments

by anonymous   2018-03-19

I wrote the Qish garbage collector (not really maintained any more, but feel free to ask). It is a free copying generational GC for C (with some coding-style restrictions).

The GCC MELT [meta-]plugin (free, GPLv3 licensed), providing a high level language, MELT, to extend the GCC compiler, also has a copying generational GC above the existing Ggc garbage collector of GCC. Look into gcc/melt-runtime.c

With generational copying GC, generating the application's code in C is quite useful. See my DSL2011 paper on MELT

Feel free to ask me more, I love talking about my GC-s.

Of course, reading the Garbage Collection Handbook: The Art of Automatic Memory Management (Jones, Hosking, Moss) [ISBN-13: 978-1420082791] is a must


(added in 2017)

Look also into Ravenbrook's Memory Pool System which can be used for generational GC.

Look also into the runtime of Ocaml, which has a good (single-threaded) generational GC.


PS. Debugging a generational copying GC is painful.

by anonymous   2018-03-19

If you are looking for a vendor-independent resource that thoroughly describes the various GC algorithms that have been researched and designed, I recommend:

  • The Garbage Collection Handbook - Explains the theory and implementation of the main GC research done since the first GC algorithm was designed. It also references the related research articles, where you can find all the nasty details. I really like that book; I think it is THE BIBLE of GC-related research.

by anonymous   2017-08-20

Garbage collection is a pretty complicated topic, and while you could learn all the details about this, I think what’s happening in your case is pretty simple.

Sun’s Garbage Collection Tuning guide, under the “Explicit Garbage Collection” heading, warns:

applications can interact with garbage collection … by invoking full garbage collections explicitly … This can force a major collection to be done when it may not be necessary … One of the most commonly encountered uses of explicit garbage collection occurs with RMI … RMI forces full collections periodically

That guide says that the default time between garbage collections is one minute, but the sun.rmi Properties reference, under sun.rmi.dgc.server.gcInterval says:

The default value is 3600000 milliseconds (one hour).

If you’re seeing major collections every hour in one application but not another, it’s probably because the application is using RMI, possibly only internally, and you haven’t added -XX:+DisableExplicitGC to the startup flags.

Disable explicit GC, or test this hypothesis by setting -Dsun.rmi.dgc.server.gcInterval=7200000 and observing if GCs happen every two hours instead.
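Before running the two-hour experiment, it can help to confirm which interval the JVM was actually started with. The following is a minimal sketch, not part of the original answer: the class name RmiGcInterval is hypothetical, and the fallback value is simply the default quoted from the properties reference above, which may differ between JDK versions. It only reports the property as seen by the current JVM; it does not prove that RMI is in use.

```java
// Hypothetical helper: reports the RMI DGC interval configured for this JVM.
final class RmiGcInterval {
    // Property name taken from the sun.rmi properties reference quoted above.
    static final String PROP = "sun.rmi.dgc.server.gcInterval";
    // Assumption: the default quoted above (one hour); real JDKs may differ.
    static final long DOCUMENTED_DEFAULT_MS = 3_600_000L;

    /** Returns the configured interval, or the documented default if unset. */
    static long effectiveMillis() {
        return Long.getLong(PROP, DOCUMENTED_DEFAULT_MS);
    }

    public static void main(String[] args) {
        long ms = effectiveMillis();
        System.out.printf("%s = %d ms (%.1f h)%n", PROP, ms, ms / 3_600_000.0);
    }
}
```

Running it with -Dsun.rmi.dgc.server.gcInterval=7200000 should report two hours; if the observed major collections then move to a two-hour rhythm, the RMI hypothesis is supported.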

by anonymous   2017-08-20

I believe the memory that stays is the memory allocated by the shared library you are dlclose-ing, and you have no simple way to release it (because you don't know which other parts of your process, e.g. which other dlopen-ed libraries, are using it). If you want to understand more, read a good book on garbage collection, or at least the wikipage. Whether a piece of memory is still useful to the process is a global property of the entire process, not of particular libraries.

However, some libraries have conventions regarding memory usage, and might offer you a way to clean up memory and resources. Other libraries don't release resources. Some libraries let you pass in, as parameters, the allocation routines they call.

You might consider using the Boehm conservative garbage collector, or chase your leaks with a utility like valgrind.

Good luck, since your problem has no general solution. Perhaps telling us more about the actual libraries you are dlopen-ing might help.

And of course, there is the work-around of restarting your process from time to time.

by anonymous   2017-08-20

From various resources I have compiled a sanity checklist that I use to analyze the GC behavior and performance of my applications. These guidelines are general and apply to any vendor's JVM, but they also contain HotSpot-specific information for illustration.

  1. Disable Explicit GC. Explicit GC is a bad coding practice; it never helps. Use -XX:+DisableExplicitGC.

  2. Enable Full GC logging. Lightweight yet powerful.

    • Compute Live Data Set, Allocation Rate, and Promotion Rate. This will tell you whether you need a bigger heap, whether e.g. your Young Gen is too small, whether your Survivor spaces are overflowing, etc.
    • Compute total GC time; it should be <5% of total running time.
    • Use -XX:+PrintTenuringDistribution -XX:+UnlockDiagnosticVMOptions -XX:+LogVMOutput -XX:LogFile=jvm.log -XX:+HeapDumpOnOutOfMemoryError -Xloggc:gc.log -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -showversion

  3. Consider additional means of collecting information about your GC. Logging is fine, but lightweight command-line tools are sometimes available that will give you even more insight, e.g. jstat for HotSpot, which shows the occupancy/capacity of Eden, Survivor and Old Gen.

  4. Collect Class Histograms. These are lightweight and will show you the content of the heap. You can take snapshots whenever you notice some strange GC activity, or you can take them before/after Full GC:

    • Content of the OldGen space: You can find out which objects reside in the OldGen. You need to print histograms before and after Full GC. And since a YoungGen collection is executed before the Full GC, these Histograms will show you the content of the Old generation. Use -XX:+PrintClassHistogramBeforeFullGC -XX:+PrintClassHistogramAfterFullGC.
    • Detecting prematurely promoted objects: To determine whether any instances are promoted early, you need to study the Histograms to see which classes are expected to reside in the OldGen and which classes should be seen only in the YoungGen. This cannot be done automatically; you need to reason about the purpose of each class and its instances to determine whether the objects are temporary or not.

  5. Consider a different GC algorithm. VMs usually come with several GC implementations that offer various tradeoffs: throughput, footprint, pause-less/short pauses, real-time, etc. Consider the options you have and pick the one that suits your needs.

  6. Beware of finalize(). Check that GC keeps up with classes using finalize(). The execution of this method may be quite costly and this can impact GC and application throughput.

  7. Heap Dumps. This is the first step that is heavyweight and will impact the running application. Collect the Heap Dump to further study the heap content or to confirm a hypothesis observed in step 4.
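
The "<5% of total running time" check from step 2 can also be approximated from inside the process, without parsing gc.log. The following is a rough sketch, assuming the standard java.lang.management beans; the class name GcOverhead and its methods are hypothetical. Note that getCollectionTime() is only an approximate accumulated time, and for concurrent collectors it may not correspond to pause time, so treat the result as an indication rather than a measurement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: estimates the share of JVM uptime spent in GC.
final class GcOverhead {
    /** GC time as a percentage of uptime; 0 if uptime is not positive. */
    static double percent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    /** Sums the approximate collection time reported by all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms uptime (%.2f%%)%n",
                gc, uptime, percent(gc, uptime));
    }
}
```

Calling GcOverhead.percent(GcOverhead.totalGcMillis(), uptime) periodically from a monitoring thread gives a running figure to compare against the 5% guideline; the GC log remains the more precise source.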
