February 23, 2011

In this article we'll discuss Java bytecode: how to use objects and call methods, and how to pass parameters to methods and return values.

Looking Back at Java Bytecode

After publishing Java Bytecode Fundamentals I received some valuable feedback from the community. The topic turned out to be quite popular, and it seems that Java developers miss the old days of programming in assembly language and fitting a program into 64k of memory. I thought it would be a good idea to revisit the post and touch on some of the aspects that were left uncovered the first time.

The Java Bytecode Fundamentals post covered the basics of Java bytecode: how to obtain the listings, how to read the mnemonics, the constant table, the frame structure, the local variable table, and so on. After compiling Java source code with javac you obtain Java executables, the *.class files. Looking at the raw contents of those files you can actually recognize some of the opcodes and even full frames, as in the screenshot below.

What Is javap?

javap is the standard disassembler from the JDK tools. It prints a mnemonic representation of a compiled Java class. You cannot do much with the output that javap provides, but you can read and understand it fairly well.

DISCLAIMER: if you haven't read the first article, it would probably be a good idea to take a look at that one first and then return to the text below, as at some points I assume that you've read my previous post.

Now let's take a dive into more specific aspects of Java bytecode: using classes, calling methods, and how the stack is involved in the whole process of passing the parameters to the methods.

Using Objects & Calling Methods

Creating class instances, calling methods, obtaining field values: each of these operations has a dedicated opcode in the Java bytecode instruction set. Our aim is to reveal the code constructs that produce the desired bytecode instructions. Let's create a simple example: a class Scheduler calls a method on another class, JobImpl, via its interface Job. JobImpl then implements some logic to produce the result.

//Scheduler.java
public class Scheduler {

    Job job = new JobImpl();

    public void main() {
        String result = (String) job.execute();
        print(result);
    }

    private static void print(String message) {
        System.out.println(message);
    }
}

//Job.java
public interface Job {
    Object execute();
}

// JobImpl.java
import java.util.Random;

public class JobImpl implements Job {

    public Object execute() {
        Integer value = createRandomValue();
        return incValue(value);
    }

    private Integer createRandomValue() {
        return new Random().nextInt(42);
    }

    private Integer incValue(Integer value) {
        return value + 1;
    }
}

To see the bytecode listings for Scheduler and JobImpl we'll use javap as follows:

javap -c -private Scheduler

javap -c -private JobImpl

The -private option is required as the source code includes some private methods for which bytecode listings will not be printed otherwise. We're not using -verbose for brevity.

public class Scheduler extends java.lang.Object{
Job job;

public Scheduler();
  Code:
   0:  aload_0
   1:  invokespecial #1; //Method java/lang/Object."<init>":()V
   4:  aload_0
   5:  new #2; //class JobImpl
   8:  dup
   9:  invokespecial #3; //Method JobImpl."<init>":()V
   12: putfield #4; //Field job:LJob;
   15: return

public void main();
  Code:
   0:  aload_0
   1:  getfield #4; //Field job:LJob;
   4:  invokeinterface #5, 1; //InterfaceMethod Job.execute:()Ljava/lang/Object;
   9:  checkcast #6; //class java/lang/String
   12: astore_1
   13: aload_1
   14: invokestatic #7; //Method print:(Ljava/lang/String;)V
   17: return

private static void print(java.lang.String);
  Code:
   0:  getstatic #8; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:  aload_0
   4:  invokevirtual #9; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   7:  return

}

Let's start our review with Scheduler's constructor, which was generated by the compiler. Some may have expected the generated constructor to be empty, but there's the Job field to be initialized. The first few lines of the constructor are the same as if we had an empty class without any fields:

   0:  aload_0
   1:  invokespecial #1; //Method java/lang/Object."<init>":()V

Next, we can see the field initializer included in the constructor. First, the reference to the Scheduler instance is loaded with aload_0 again, because the previous copy was popped from the stack by the invokespecial call.

On the next line, we can see the new instruction, which creates a new object of the type identified by the class reference in the constant pool. Indeed, the constant #2 refers to JobImpl. The newly created object reference is pushed onto the stack. The instructions that follow are invokespecial and putfield, and both of them pop that reference off the stack. Therefore the reference to the JobImpl instance has to be on the stack twice. This is done using the dup instruction.



   5:  new #2;           // create the instance of JobImpl
   8:  dup               // duplicate the instance on the stack
   9:  invokespecial #3; // call "<init>" and pop the stack
   12: putfield #4;      // stores the object reference in Job field, and pop the stack

Opcode 0xBB: new

As you may have noticed, the new opcode only "creates a reference" of the type. To initialize the object it is still necessary to call <init> on that object reference. In fact, the four-instruction sequence (new/dup/invokespecial/astore) is a common pattern whenever an object is new'ed and stored into a local variable. You can read bytecode faster if you remember this rule :)
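As a small illustration of that pattern, here is a sketch with a hypothetical class NewPattern (not part of the article's example). Compiling it and running javap -c NewPattern should show the new/dup/invokespecial/astore sequence for the first statement of the method; the exact constant pool indexes will depend on the compiler.

```java
// Hypothetical example class, used only to illustrate the pattern.
class NewPattern {

    static int length() {
        // Expected bytecode for this statement:
        //   new            (allocate an uninitialized StringBuilder)
        //   dup            (second copy of the reference for <init>)
        //   invokespecial  (call StringBuilder."<init>":()V, pops one copy)
        //   astore_0       (store the remaining copy in local variable 0)
        StringBuilder sb = new StringBuilder();
        sb.append("abc");
        return sb.length();
    }

    public static void main(String[] args) {
        System.out.println(length()); // prints 3
    }
}
```

The method is static, so there is no this reference and the first local variable lives in slot 0. In an instance method the same statement would end with astore_1.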

invokeinterface (0xB9) and invokestatic (0xB8)

If we continue reading the bytecode listing for the Scheduler class, in the main() method we can see a couple of instructions that are related to method calls: invokeinterface and invokestatic. Scheduler's field job was declared using the Job interface, i.e. all the calls will actually be interface calls. Therefore we can see the execute() method being called using the invokeinterface instruction rather than invokevirtual, which is explained later.



   0:  aload_0
   1:  getfield #4; //Field job:LJob;
   4:  invokeinterface #5, 1; //InterfaceMethod Job.execute:()Ljava/lang/Object;

invokestatic is used to call class methods, i.e. methods declared with the static keyword. The method is identified by a reference in the constant pool, so there is no need to load a target object reference onto the stack. Only the parameters have to be passed in.



   13: aload_1
   14: invokestatic #7; //Method print:(Ljava/lang/String;)V

invokevirtual (0xB6)

Further on in the Scheduler class bytecode listing we find the print(..) method, which includes one more instruction related to method invocation on an object reference: invokevirtual. If you see the invokevirtual opcode, you can be pretty sure that the method is being called directly on the class instance, without using an interface, and that the method is not private. In our example, we can see that invokevirtual is called on an instance of the java.io.PrintStream class:



   4:  invokevirtual #9; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
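To put the three call instructions seen so far side by side, here is a minimal sketch. The names Shape, Square and CallKinds are hypothetical and were made up for this illustration. The comments state which instruction javac is expected to emit for each call, which can be checked with javap -c CallKinds.

```java
// Hypothetical types, used only to illustrate the three kinds of calls.
interface Shape {
    double area();
}

class Square implements Shape {
    final double side;

    Square(double side) {
        this.side = side;
    }

    public double area() {
        return side * side;
    }

    static Square unit() {
        return new Square(1.0);
    }
}

class CallKinds {

    static double demo() {
        // Static method: invokestatic, no object reference on the stack.
        Square square = Square.unit();

        // Receiver declared as an interface type: invokeinterface.
        Shape shape = square;
        double viaInterface = shape.area();

        // Receiver declared as a class type: invokevirtual.
        double viaClass = square.area();

        return viaInterface + viaClass;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2.0
    }
}
```

Both area() calls end up in the same method at run time. What differs is only the declared type of the receiver at the call site, and that is what decides which instruction the compiler emits.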

invokespecial (0xB7)

In the example above you probably spotted the invokespecial instruction in use. The instruction is used to invoke an instance method on an object reference, and here is a good place for a question: what is the difference between invokespecial and invokevirtual?

The answer can be easily found if one reads the Java VM Spec carefully:

The difference between the invokespecial and the invokevirtual instructions is that invokevirtual invokes a method based on the class of the object. The invokespecial instruction is used to invoke instance initialization methods as well as private methods and methods of a superclass of the current class.

In other words, invokespecial is used to call methods without concern for dynamic binding, in order to invoke the particular class' version of a method.
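The difference in binding can be observed from plain Java. The sketch below uses hypothetical classes Base, Derived and MoreDerived (they are not part of the article's example). A super.name() call is compiled to invokespecial and always reaches Base.name(), while an unqualified name() call is compiled to invokevirtual and is dispatched on the runtime class of the object.

```java
// Hypothetical class hierarchy, used only to illustrate the two kinds of binding.
class Base {
    String name() {
        return "Base";
    }
}

class Derived extends Base {
    @Override
    String name() {
        return "Derived";
    }

    // super.name() compiles to invokespecial: the target is fixed to the
    // superclass version, regardless of the runtime class of this.
    String viaSuper() {
        return super.name();
    }

    // name() compiles to invokevirtual: the target is looked up
    // in the runtime class of this.
    String viaThis() {
        return name();
    }
}

class MoreDerived extends Derived {
    @Override
    String name() {
        return "MoreDerived";
    }
}

class BindingDemo {
    public static void main(String[] args) {
        Derived d = new MoreDerived();
        System.out.println(d.viaSuper()); // prints Base
        System.out.println(d.viaThis());  // prints MoreDerived
    }
}
```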

Passing the Parameters to Methods, Returning Values

For the last topic in this post, let's look at how parameters are passed to methods and how the result is returned. For this example we will use the JobImpl class introduced earlier in the article. Using javap -c -private JobImpl, the bytecode listing is printed as follows:



public class JobImpl extends java.lang.Object implements Job{
public JobImpl();
  Code:
   0:  aload_0
   1:  invokespecial #1; //Method java/lang/Object."<init>":()V
   4:  return

public java.lang.Object execute();
  Code:
   0:  aload_0
   1:  invokespecial #2; //Method createRandomValue:()Ljava/lang/Integer;
   4:  astore_1
   5:  aload_0
   6:  aload_1
   7:  invokespecial #3; //Method incValue:(Ljava/lang/Integer;)Ljava/lang/Integer;
   10: areturn

private java.lang.Integer createRandomValue();
  Code:
   0:  new #4; //class java/util/Random
   3:  dup
   4:  invokespecial #5; //Method java/util/Random."<init>":()V
   7:  bipush 42
   9:  invokevirtual #6; //Method java/util/Random.nextInt:(I)I
   12: invokestatic #7; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   15: areturn

private java.lang.Integer incValue(java.lang.Integer);
  Code:
   0:  aload_1
   1:  invokevirtual #8; //Method java/lang/Integer.intValue:()I
   4:  iconst_1
   5:  iadd
   6:  invokestatic #7; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   9:  areturn

}

The incValue(..) method is the one we're looking for: it takes a parameter in and returns a value. The incValue(..) method is called from execute() using the invokespecial instruction because it is declared private. So how is the parameter being passed to incValue?

Let's read the execute() method code:

   0:  aload_0           // load the reference to this
   1:  invokespecial #2; // call createRandomValue()
   4:  astore_1          // store the result to local variable #1
   5:  aload_0
   6:  aload_1           // load the value of local variable #1 to the stack
   7:  invokespecial #3; //Method incValue:(Ljava/lang/Integer;)Ljava/lang/Integer;
   10: areturn

Before calling invokespecial, the program loads the reference to this (aload_0) and then the value of local variable number 1 (aload_1) onto the stack. This is how the parameter is passed. invokespecial pops as many values off the stack as the method signature declares parameters, plus the object reference the method is invoked on. In other words, aload is the kind of instruction that prepares the parameters for a method call.
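The mechanics of execute() can be mimicked with a toy model in plain Java. This is only a sketch under simplifying assumptions and not how a real JVM is implemented: the FrameModel and ParameterPassingSketch classes are hypothetical, a frame is reduced to an operand stack plus a local variable array, and the body of incValue is replaced by a direct addition.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a frame: an operand stack plus a local variable table.
class FrameModel {
    final Deque<Object> stack = new ArrayDeque<>();
    final Object[] locals;

    FrameModel(int maxLocals) {
        locals = new Object[maxLocals];
    }

    void aload(int index) {   // push a local variable onto the stack
        stack.push(locals[index]);
    }

    void astore(int index) {  // pop the stack into a local variable
        locals[index] = stack.pop();
    }
}

class ParameterPassingSketch {

    // Models "invokespecial incValue": the argument and the receiver are
    // popped from the caller's stack and become locals 1 and 0 of the callee.
    static void invokeIncValue(FrameModel caller) {
        Object argument = caller.stack.pop();
        Object receiver = caller.stack.pop();

        FrameModel callee = new FrameModel(2);
        callee.locals[0] = receiver;
        callee.locals[1] = argument;

        // Simplified body of incValue: value + 1.
        int value = (Integer) callee.locals[1];
        Integer result = value + 1;

        // Models areturn: the result lands on the caller's operand stack.
        caller.stack.push(result);
    }

    // Models execute(); randomValue stands in for createRandomValue().
    static Object execute(Object self, Integer randomValue) {
        FrameModel frame = new FrameModel(2);
        frame.locals[0] = self;
        frame.stack.push(randomValue); // result of the first call
        frame.astore(1);               // 4: astore_1
        frame.aload(0);                // 5: aload_0
        frame.aload(1);                // 6: aload_1
        invokeIncValue(frame);         // 7: invokespecial #3
        return frame.stack.pop();      // 10: areturn
    }

    public static void main(String[] args) {
        System.out.println(execute("this", 41)); // prints 42
    }
}
```

The string "this" is just a placeholder for the receiver object. The point of the model is the order of operations: values are pushed by the caller, consumed by the invocation, and replaced by the return value.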

To return a value, the program uses the areturn instruction, which returns an object reference from a method. The instruction is prefixed with the 'a' character to indicate that we are dealing with an object reference here. If we tried to return a value of type int, the instruction would be ireturn. There are also lreturn, dreturn and freturn, used for long, double and float respectively. The mechanics of the instruction are as follows: the result is popped from the operand stack of the current frame and pushed onto the operand stack of the frame of the invoker. The interpreter returns control to the invoker afterwards.
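For reference, the mapping from return type to return instruction can be seen by compiling a small class such as the hypothetical ReturnKinds below and inspecting it with javap -c ReturnKinds. The comments name the instruction expected at the end of each method.

```java
// Hypothetical example class: one method per kind of return instruction.
class ReturnKinds {

    static int asInt() {
        return 1;          // ireturn
    }

    static long asLong() {
        return 2L;         // lreturn
    }

    static float asFloat() {
        return 3.0f;       // freturn
    }

    static double asDouble() {
        return 4.0;        // dreturn
    }

    static Object asReference() {
        return "five";     // areturn
    }

    static void nothing() {
        // plain return, no value is passed back
    }
}
```

Methods returning boolean, byte, char or short also end with ireturn, since those types are handled as int on the operand stack.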

Final Thoughts

In the two sections above we had a chance to observe how objects are created at the bytecode level, what the opcodes for method invocation are, how parameters are passed to method invocations and how the return value is passed back. The important part to understand is how the stack is involved in these operations, along with the part played by the local variable table.

Additional Resources

There are some good articles on the topic to be found on the internet. I'd like to give credit to Peter Haggar's article at developerWorks. That is a superb piece, though a little outdated by now. Also, there's Ted Neward's The Working Developers Guide to Java Bytecode, which explains all the stuff above and even more. With all this said, even though such articles exist and you can find good coverage of the topic in the documentation of bytecode crunching utilities like ObjectWeb ASM, it is still worth referring to The Java Virtual Machine Specification in order to get the information "from the source".

Finally, check out our blog post about mastering Java byte code here.
