Detecting Hanging / Looping VMs

Procedure

Problems may arise that involve hangs or looping processes. A hang can have many causes, such as a deadlock in the application code, in API or library code, or even a bug in the VM. Sometimes an apparent hang is not a hang at all: the VM process is consuming all available CPU cycles, most likely because a bug causes one or more threads to enter an infinite loop.

The following example demonstrates a very simple program which causes an endless loop in the VM:

Syntax

    public class EndlessLoop {

        public static void loopForever() {
            boolean loop = true;
            int i = 0;

            System.out.println("Entering endless loop");

            // 'loop' is never set to false, so this loop never terminates.
            while (loop) {
                i++;
            }
            // Never reached while 'loop' stays true.
            System.out.println("Finished endless loop");
        }

        public static void main(String[] args) {
            loopForever();
        }
    }
    
End of the code.

If a Java program is not behaving well (for example, it is not responding, or does not show any progress), the Management Console is a good starting point to diagnose the error.

In addition, you can get information about the system and the running VMs by connecting to the NetWeaver Application Server Java with a browser and then choosing System Information.

An initial step when diagnosing a hang is to find out if the VM process is idle or consuming all available CPU cycles. This information can be seen in the AS Java Process Table tab of the Management Console (see below). If the process appears to be busy and is consuming all available CPU cycles then it is likely that the issue is a looping thread rather than a deadlock.
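The distinction between a deadlock and a looping thread can also be checked from inside a VM. The following is a minimal sketch that uses the standard java.lang.management API rather than the Management Console; the class name DeadlockCheck is hypothetical, and the sketch assumes the code can be run inside the affected VM (for example from a monitoring thread).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of the Management Console.
class DeadlockCheck {

    /** Returns one description per deadlocked thread, or an empty array if no deadlock is detected. */
    static String[] describeDeadlockedThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        // Both methods return null when no deadlock is detected.
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()          // monitors and java.util.concurrent locks
                : threads.findMonitorDeadlockedThreads();  // object monitors only
        if (ids == null) {
            return new String[0];
        }

        ThreadInfo[] infos = threads.getThreadInfo(ids);
        String[] result = new String[infos.length];
        for (int i = 0; i < infos.length; i++) {
            ThreadInfo info = infos[i];
            // An entry is null if the thread ended between the two calls.
            result[i] = (info == null)
                    ? "(thread no longer alive)"
                    : info.getThreadName() + " waits for " + info.getLockName()
                            + " held by " + info.getLockOwnerName();
        }
        return result;
    }

    public static void main(String[] args) {
        String[] deadlocked = describeDeadlockedThreads();
        if (deadlocked.length == 0) {
            System.out.println("No deadlock detected; a busy VM is more likely looping.");
        } else {
            for (String line : deadlocked) {
                System.out.println(line);
            }
        }
    }
}
```

An empty result does not prove that a thread is looping; it only rules out a lock cycle that the VM can detect.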

Figure: AS Java Process Table in MMC

If a VM process appears to be looping, the first step is to try to obtain a thread dump. If a thread dump can be obtained, it is often clear which thread is looping. Once the looping thread has been identified, its stack trace in the thread dump should indicate in which application or component the thread is looping (and possibly why).

Thread Analysis

If there is a problem, it is often useful to know where in the code the VM is currently running or waiting. A thread dump shows a stack trace for each Java thread in the VM, enriched with additional information: the CPU time the thread has consumed, the number of bytes the thread has allocated during its lifetime, and the user the thread is working for. A typical section of a thread dump looks like this:

Example

"main" cpu=13296.88 ms allocated memory=40.52 MB (42493080 B)
        user="Hans Moleman" prio=6 tid=0x044f3fb8 nid=0x71c runnable
        at EndlessLoop.loopForever()V(EndlessLoop.java:24)
        at EndlessLoop.main([Ljava/lang/String;)V(EndlessLoop.java:32)

End of the example.

As shown in the snapshot, the 'main' thread has consumed 13296.88 milliseconds running on one of the available CPUs. During its lifetime it has allocated 40.52 megabytes of heap memory and is running for the user “Hans Moleman”.
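When a dump contains many threads, the header lines can be read mechanically. The following is a minimal sketch that extracts the thread name, CPU time and allocated bytes from a header line. The class name ThreadDumpHeader is hypothetical, and the pattern is inferred only from the single example line above, so it is an assumption that other dumps use the same layout and units.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser; the line format is assumed from the one example in this section.
class ThreadDumpHeader {

    // "<name>" cpu=<ms> ms allocated memory=<value> <unit> (<bytes> B)
    private static final Pattern HEADER = Pattern.compile(
            "^\"([^\"]*)\"\\s+cpu=([0-9.]+)\\s+ms\\s+allocated memory=\\S+\\s+\\S+\\s+\\((\\d+)\\s+B\\)");

    final String threadName;
    final double cpuMillis;
    final long allocatedBytes;

    private ThreadDumpHeader(String threadName, double cpuMillis, long allocatedBytes) {
        this.threadName = threadName;
        this.cpuMillis = cpuMillis;
        this.allocatedBytes = allocatedBytes;
    }

    /** Parses a header line; returns null if the line does not match the assumed format. */
    static ThreadDumpHeader parse(String line) {
        Matcher m = HEADER.matcher(line.trim());
        if (!m.find()) {
            return null;
        }
        return new ThreadDumpHeader(
                m.group(1),
                Double.parseDouble(m.group(2)),
                Long.parseLong(m.group(3)));
    }

    public static void main(String[] args) {
        ThreadDumpHeader h = parse(
                "\"main\" cpu=13296.88 ms allocated memory=40.52 MB (42493080 B)");
        System.out.println(h.threadName + ": cpu=" + h.cpuMillis
                + " ms, allocated=" + h.allocatedBytes + " B");
    }
}
```

The exact byte count in parentheses is used instead of the rounded value, so the sketch does not depend on the unit shown before it.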

Creating a Thread Dump

A thread dump can be generated in several ways:

  • The Management Console (MMC or Java MC) can trigger thread dumps. In the AS Java Process Table, right-click the server process and choose Dump stack trace. The thread dump of all threads of the server process is then written to the developer trace.

  • On Windows platforms, pressing Ctrl+Break in the command shell in which the VM was started generates a thread dump. On Unix platforms, sending a "Quit" signal (SIGQUIT) to the VM process triggers a thread dump.
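For comparison, a plain stack listing can also be produced from inside a running Java program with the standard Thread.getAllStackTraces() method. The following is a minimal sketch under that assumption; the class name InProcessThreadDump is hypothetical, and the output contains none of the extended information (CPU time, allocated memory, user) that the thread dumps described above provide.

```java
import java.util.Map;

// Hypothetical helper; produces a plain stack listing, not the extended thread dump.
class InProcessThreadDump {

    /** Returns the name, priority, state and stack frames of every live thread as text. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" prio=").append(thread.getPriority())
               .append(" state=").append(thread.getState())
               .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("        at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

This only helps if some thread in the VM is still able to run the code, so the external triggers listed above remain the more reliable option for a VM that does not respond.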

Once the thread dump has been triggered, it can be found on the console or in a redirected file (such as the dev_server_<n> file in the case of the AS Java). The extended thread dump provides the CPU time per thread. Because we know that the VM is looping, we only have to look for a thread with a high CPU time. The graphic below shows that the main thread has consumed a lot of CPU time.
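The search for the thread with the highest CPU time can be sketched in code with the standard java.lang.management.ThreadMXBean interface. This is an illustration only, not the mechanism that produces the extended thread dump; the class name BusiestThread is hypothetical, and the sketch assumes that the VM supports and has enabled per-thread CPU time measurement.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper that mirrors the manual search for a high CPU time.
class BusiestThread {

    /** Returns the ID of the live thread with the highest CPU time, or -1 if it cannot be determined. */
    static long findBusiestThreadId() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return -1;
        }
        long busiestId = -1;
        long busiestNanos = -1;
        for (long id : threads.getAllThreadIds()) {
            // CPU time in nanoseconds, or -1 if the thread is no longer alive.
            long nanos = threads.getThreadCpuTime(id);
            if (nanos > busiestNanos) {
                busiestNanos = nanos;
                busiestId = id;
            }
        }
        return busiestId;
    }

    public static void main(String[] args) {
        long id = findBusiestThreadId();
        if (id < 0) {
            System.out.println("Per-thread CPU time is not available in this VM.");
            return;
        }
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        ThreadInfo info = threads.getThreadInfo(id, Integer.MAX_VALUE);
        if (info == null) {
            return; // the thread ended in the meantime
        }
        System.out.println("\"" + info.getThreadName() + "\" cpu="
                + threads.getThreadCpuTime(id) / 1_000_000.0 + " ms");
        for (StackTraceElement frame : info.getStackTrace()) {
            System.out.println("        at " + frame);
        }
    }
}
```

A high accumulated CPU time alone does not prove a loop, because a long-running worker thread can legitimately accumulate CPU time. Comparing two measurements taken a few seconds apart gives a clearer picture.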

Figure: Thread Dump

As the call stack looks suspicious, we decide to have a closer look. With the debugging on demand feature, we can switch the VM to debug mode, attach a debugger to it, and inspect the situation.

Figure: Debugging the Hanging Program

The graphic shows the attached debugger. Within the debugger, it is easy to see what is going wrong. The error can be fixed by setting the variable loop to false, after which the program continues running.
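In the debugger, the local variable is changed by hand. As an illustration only, the same effect can be reproduced in code if the flag is a volatile field that another thread clears. The class StoppableLoop and its method names are hypothetical and are not part of the original example.

```java
// Hypothetical variant of EndlessLoop whose loop flag can be cleared from another thread.
class StoppableLoop {

    // volatile, so that a write from another thread becomes visible to the looping thread
    private static volatile boolean loop = true;

    /** Spins until the flag is cleared and returns the number of iterations performed. */
    static long loopUntilStopped() {
        long i = 0;
        while (loop) {
            i++;
        }
        return i;
    }

    /** Starts a helper thread that clears the flag after the given delay, then runs the loop. */
    static long runFor(long millis) {
        loop = true;
        Thread stopper = new Thread(() -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // Same effect as setting the variable to false in the debugger.
            loop = false;
        });
        stopper.setDaemon(true);
        stopper.start();
        return loopUntilStopped();
    }

    public static void main(String[] args) {
        System.out.println("Entering loop");
        long iterations = runFor(1000);
        System.out.println("Finished loop after " + iterations + " iterations");
    }
}
```

Without the volatile modifier, the looping thread is not guaranteed to observe the change made by the other thread.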

More Information

Detecting Memory Leaks