public class Main { public static void main(String[] args) { new Thread(new Idle(), "Idle").start(); new Thread(new Busy(), "Busy").start(); } } class Idle implements Runnable { @Override public void run() { try { TimeUnit.HOURS.sleep(1); } catch (InterruptedException e) { } } } class Busy implements Runnable { @Override public void run() { while(true) { "Foo".matches("F.*"); } } }

Idle

Busy

Busy

PID

java

$ top -n1 | grep -m1 java

top

java

22614 tomek 20 0 1360m 734m 31m S 6 24.3 7:36.59 java

top

grep

cut

java

$ top -n1 | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' '

cut -f1 -d' '

22614

top -H

-H

$ top -n1 -H | grep -m1 java $ top -n1 -H | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' '

25938 tomek 20 0 1360m 748m 31m S 2 24.8 0:15.15 java 25938

"Busy" prio=10 tid=0x7f3bf800 nid=0x6552 runnable [0x7f25c000] java.lang.Thread.State: RUNNABLE at java.util.regex.Pattern$Node.study(Pattern.java:3010)

nid=0x645a

top -H

top

jstack

$ printf "%x" 25938 6552

#!/bin/bash PID=$(top -n1 | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' ') NID=$(printf "%x" $(top -n1 -H | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' ')) jstack $PID | grep -A500 $NID | grep -m1 "^$" -B 500

PID

java

NID

grep

nid

$ ./profile.sh "Busy" prio=10 tid=0x7f3bf800 nid=0x6552 runnable [0x7f25c000] java.lang.Thread.State: RUNNABLE at java.util.regex.Pattern$Node.study(Pattern.java:3010) at java.util.regex.Pattern$Curly.study(Pattern.java:3854) at java.util.regex.Pattern$CharProperty.study(Pattern.java:3355) at java.util.regex.Pattern$Start.<init>(Pattern.java:3044) at java.util.regex.Pattern.compile(Pattern.java:1480) at java.util.regex.Pattern.<init>(Pattern.java:1133) at java.util.regex.Pattern.compile(Pattern.java:823) at java.util.regex.Pattern.matches(Pattern.java:928) at java.lang.String.matches(String.java:2090) at com.blogspot.nurkiewicz.Busy.run(Main.java:27) at java.lang.Thread.run(Thread.java:662)

watch

Busy

Multiple threads

new Thread(new Idle(), "Idle").start(); new Thread(new Busy(), "Busy-1").start(); new Thread(new Busy(), "Busy-2").start();

Busy-1

Busy-2

Pattern

Idle

What do you do when your Java application consumes 100% of the CPU? Turns out you can easily find the problematic thread(s) using built-in UNIX and JDK tools. No profilers or agents required.For the purpose of testing we'll use this simple program:As you can see, it starts two threads.is not consuming any CPU (remember, sleeping threads consume memory, but not CPU) whileeats the whole core as regular expression parsing and executing is a surprisingly complex process. Let's run this program and forget about it. How can we quickly find out thatis the problematic piece of our software? First of all we use top to find out the process id () of theprocess consuming most of the CPU. This is quite straightforward:This will display the first line ofoutput containing "" sentence:The first column is the PID, let's extract it. Unfortunately it turned out thatuses ANSI escape codes for colors - invisible characters that are breaking tools likeand. Luckily I found a perl script to remove these characters and was finally able to extract the PID ofprocess exhausting my CPU:Theinvocation simply takes the first value out of space-separated columns:Now when we now the problematic JVM PID, we can useto find problematic Linux threads. Theoption prints a list of allas opposed to, the PID column now represents the internal Linux thread ID:The output is surprisingly similar, but the first value is now the thread ID:So we have a process ID of our busy JVM and Linux thread ID (most likely from that process) consuming our CPU. Here comes the best part: if you look at jstack output (available in JDK), each thread has some mysterious ID printed next to its name:That's right, theparameter is the same as thread ID printed by. Of course to not make it too simple,uses decimal notation whileprints in hex. Again there is a simple solution, printf "%x" Let's wrap all we have now into a script and combine the results:holds thePID andholds the thread ID, most likely from that JVM. The last line simply dumps the JVM stack trace of the given PID and filters out (using) the thread which has matching. Guess what, it works:Running the script multiple times (or with, see below) will capturethread in different places, but almost always inside regular expression parsing - which is our problematic piece!In case your application has multiple CPU-hungry threads, you can usecommand to run the script every second and get semi real-time stack dumps, most likely from different threads. Testing with the following program:you'll see stack traces either ofor ofthreads (in different places insideclass), but never