Friday, December 27, 2013

JRockit: Thread Local Area Size and Large Objects

Large objects can be a problem for an application. If your Java application allocates a lot of objects, especially large ones, it helps to experiment with the Thread Local Area (TLA) settings.[1] For a case study, read [11].

If you try to find the default settings of TLA by using:
  • -XX (an alias for -Xprintflags)
it prints out the following information in JRockit R28.2.5:
        TlaWasteLimit = 0 (default)
                (Alias: -XXlargeObjectLimit)
                - Internal. Use -XXtlaSize:wasteLimit instead.
        TlaMinSize = 0 (default)
                (Alias: -XXminBlockSize)
                - Internal. Use -XXtlaSize:min instead.
        TlaPreferredSize = 0 (default)
                - Internal. Use -XXtlaSize:preferred instead.

Hmm. That is not very helpful. In this article, we will discuss:
  • Thread local area size and large objects
  • How to find out the default TLA settings?

Thread Local Area Size and Large Objects[1-4]


The thread local area (TLA) is a chunk of free space reserved on the heap or the nursery and given to a thread for its exclusive use. A thread can allocate small objects in its own TLA without synchronizing with other threads. When the TLA gets full the thread simply requests a new TLA. The objects allocated in a TLA are accessible to all Java threads and are not considered “thread local” in any way after they have been allocated.
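The idea can be illustrated with a small model. This is only a sketch and not JRockit code: the class names Heap and Tla are hypothetical, and addresses are plain offsets. It assumes that reserving a chunk from the shared heap is the only step that needs coordination between threads, while allocation inside a TLA is a simple pointer bump.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of thread local areas; all names here are illustrative. */
final class TlaSketch {

    /** Shared heap: reserving a chunk is the only cross-thread operation. */
    static final class Heap {
        private final AtomicLong top = new AtomicLong();

        /** Reserves a chunk of the given size and returns its start offset. */
        long reserve(long bytes) {
            return top.getAndAdd(bytes);
        }
    }

    /** A TLA owned by exactly one thread, so allocate() needs no locking. */
    static final class Tla {
        private final Heap heap;
        private final long size;
        private long start;
        private long used;
        int refills; // number of chunks requested from the heap so far

        Tla(Heap heap, long size) {
            this.heap = heap;
            this.size = size;
            refill();
        }

        private void refill() {
            start = heap.reserve(size);
            used = 0;
            refills++;
        }

        /** Bump-pointer allocation; requests a new TLA when the current one is full. */
        long allocate(long bytes) {
            if (bytes > size) {
                throw new IllegalArgumentException("object larger than a TLA");
            }
            if (used + bytes > size) {
                refill(); // the remaining space in the old chunk is abandoned
            }
            long address = start + used;
            used += bytes;
            return address;
        }
    }
}
```

In this model each thread would hold its own Tla instance, and only refill() touches shared state. A larger TLA means fewer refills, but more space is abandoned each time a chunk is replaced.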

Increasing the TLA size is beneficial for multi-threaded applications where each thread allocates a lot of objects. Increasing the TLA size is also beneficial when the average size of the allocated objects is large, as this allows larger objects to be allocated in the TLAs. Increasing the TLA size too much may, however, cause more fragmentation and more frequent garbage collections. Before any JRockit performance tuning, you need to assess the sizes of the objects allocated by your application. One way to assess them is to view object allocation statistics. There are multiple ways of doing that:
  • You can create a JRockit Flight Recorder recording and view object allocation statistics in the JRockit Flight Recorder[7]
  • You can use the following JVM option:
    • -Xverbose:memdbg -Xverbose:gc -Xverbosedecorations=level,module,timestamp,millis,pid

Large Object: Pre-R28 vs. Post-R28[1]


In JRockit versions earlier than R28, large objects were allocated immediately on the heap and never in a TLA. A flag called -XXlargeObjectLimit was provided to tell JRockit the minimum size, in bytes, at which an object is treated as "large". The default was 2 KB.

JRockit R28 and later use a waste limit for TLA space instead. This limits the amount of TLA space that can be thrown away for each TLA when large objects are allocated, and is a more flexible solution.

The R28 allocation algorithm works like this: JRockit tries to allocate every object, regardless of its size, in the current TLA. If the object doesn't fit and the waste limit is less than the space left in the TLA, the object goes directly on the heap. Otherwise, JRockit will "waste" the rest of this TLA and try to allocate the object in a new TLA or directly on the heap, depending on the size of the object.
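The decision described above can be written down as a small function. This is a sketch of the description, not JRockit source code. The names are hypothetical, and the last step assumes that "depending on the size of the object" means "whether the object fits in a fresh TLA", which the text does not spell out.

```java
/** Illustrative model of the R28 placement decision; names are hypothetical. */
final class TlaAllocationSketch {

    enum Placement { CURRENT_TLA, NEW_TLA, HEAP }

    /**
     * @param objectSize size of the object to allocate, in bytes
     * @param freeInTla  free bytes left in the thread's current TLA
     * @param wasteLimit TLA waste limit, in bytes
     * @param newTlaSize size of a fresh TLA, in bytes (assumed parameter)
     */
    static Placement place(long objectSize, long freeInTla,
                           long wasteLimit, long newTlaSize) {
        if (objectSize <= freeInTla) {
            return Placement.CURRENT_TLA; // fits: plain TLA allocation
        }
        if (wasteLimit < freeInTla) {
            // Discarding this TLA would waste more than allowed, so keep it
            // and put this one object directly on the heap.
            return Placement.HEAP;
        }
        // The remaining space is within the waste limit: give up the TLA.
        // Assumption: the object goes into a new TLA only if it fits there.
        return objectSize <= newTlaSize ? Placement.NEW_TLA : Placement.HEAP;
    }
}
```

With a waste limit of 2048 bytes and a 16384-byte TLA, a 4000-byte object goes to the heap when 3000 bytes are still free in the current TLA, but triggers a new TLA when only 1000 bytes are free.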

The TLA sizes are set using the following option:
  • -XXtlaSize:min=size,preferred=size,wasteLimit=size[8]
    • min
      • Sets the minimum TLA size. 
    • preferred
      • Sets a preferred TLA size.  This means that TLAs will be the preferred size whenever possible, but can be as small as the min size.
    • wasteLimit
      • Sets the waste limit for TLAs. This is the maximum amount of free memory that a TLA is allowed to have when a thread requires a new TLA.
The following relation is true for the TLA size parameters:
  • -XXtlaSize:wasteLimit <= -XXtlaSize:min <= -XXtlaSize:preferred
Read [9] for the tuning advice.
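When the option string is generated by a launch script or tool, the relation above can be checked before the JVM is started. The helper below is a hypothetical sketch, not part of any JRockit API. It takes sizes in kilobytes and assumes the k size suffix is accepted; check the command-line reference [10] for the units your release supports.

```java
/** Hypothetical helper that builds a -XXtlaSize option after checking the relation. */
final class TlaSizeOption {

    /** All sizes are in kilobytes; the "k" suffix is an assumption, see [10]. */
    static String build(int minKb, int preferredKb, int wasteLimitKb) {
        if (wasteLimitKb < 0) {
            throw new IllegalArgumentException("wasteLimit must not be negative");
        }
        // Enforce wasteLimit <= min <= preferred.
        if (wasteLimitKb > minKb || minKb > preferredKb) {
            throw new IllegalArgumentException(
                    "expected wasteLimit <= min <= preferred");
        }
        return "-XXtlaSize:min=" + minKb + "k"
                + ",preferred=" + preferredKb + "k"
                + ",wasteLimit=" + wasteLimitKb + "k";
    }
}
```

For example, build(2, 16, 2) returns "-XXtlaSize:min=2k,preferred=16k,wasteLimit=2k", while build(2, 16, 4) is rejected because the waste limit would exceed the minimum size.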

What are the default TLA settings?


To find out the default TLA settings, you can use the following JVM option:[12]
  • -Xverbose:memdbg
For example, here is the output:

[DEBUG][memory ][19709] Minimum TLA size is 2048 bytes.
[DEBUG][memory ][19709] Preferred TLA size is 16384 bytes.
[DEBUG][memory ][1388121278654][19709] TLA waste limit is 2048 bytes.


This output was produced with the following JVM options:
  • -Xms2g -Xmx2g -Xverbose:memdbg -Xgc:pausetime -Xverbose:gc -Xverbosedecorations=level,module,timestamp,millis,pid
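If you want to pick these values out of a larger log automatically, a regular expression is enough. The class below is a minimal sketch with a hypothetical name. It assumes the messages keep exactly the wording shown in the output above, and it ignores the decoration prefixes, so it does not depend on which -Xverbosedecorations fields are enabled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the TLA lines printed by -Xverbose:memdbg. */
final class TlaLogParser {

    // Matches the three messages shown above, wherever they occur in a line.
    private static final Pattern TLA_LINE = Pattern.compile(
            "(Minimum TLA size|Preferred TLA size|TLA waste limit) is (\\d+) bytes");

    /** Returns the size in bytes reported for the label, or -1 if it is absent. */
    static long find(String log, String label) {
        Matcher m = TLA_LINE.matcher(log);
        while (m.find()) {
            if (m.group(1).equals(label)) {
                return Long.parseLong(m.group(2));
            }
        }
        return -1;
    }
}
```

Calling find(log, "Preferred TLA size") on the output above yields 16384.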

References

  1. Oracle JRockit: The Definitive Guide
  2. Tuning Java Virtual Machines (JVMs)
  3. On Nursery Sizing (Migrated)
    • The goal of nursery sizing is to get as much memory as possible freed by young collections rather than old collections.
    • The rule of thumb is that a nursery size of approximately half of the free memory on the heap is nearly optimal.
    • If your application is sensitive to latencies you may want to decrease the nursery size to shorten the young collection pause times.
    • If you're using the garbage collection mode optimizing for short pauses (-Xgcprio:pausetime) or the static generational concurrent garbage collector (-Xgc:gencon) you will most likely want to tune the nursery size manually.
  4. First Steps for Tuning the Oracle JRockit JVM
  5. Oracle® JRockit Diagnostics and Troubleshooting Guide (Release R28)
  6. Oracle® JRockit Performance Tuning Guide (Release R28)
  7. Oracle® JRockit Flight Recorder Run Time Guide (Release R28)
  8. -XXtlaSize Parameters
  9. Optimizing Memory Allocation Performance (Section 4.4 of this pdf)
  10. Oracle® JRockit Command-Line Reference (Release R28)
  11. JRockit: A Case Study of Thread Local Area (TLA) Tuning (Xml and More)
  12. JRockit: Analyzing GC With JRockit Verbose Output (-Xverbose:memdbg)  (Xml and More)
  13. Where did all of these ConstPoolWrapper objects come from?!
