Pattern Centric Blog

New Control Structures for Java

by Howard Lovatt

October 15, 2008



Summary

Java has many traditional control structures, like if and for. There are also proposals, BGGA, JCA, and ARM, to add more. This post examines what new structures might be of interest and suggests that more traditional control structures would add little and instead parallel-functional structures would be more useful.


This post is primarily about the semantics of control structures and inner classes/closures, but it also discusses some syntax aspects. Traditional control structures are constructs like if and for that are found in many languages; Java's, for example, are almost identical to C's. Traditional control structures are characterised by:

- Not having a return value; they are statements rather than expressions.
- The code blocks to be executed do not have arguments; instead new or existing variables are used (e.g. in "for ( int i = ... )" i is a new variable).
- Being sequential in nature; for evaluates each loop in turn.

The only non-traditional control structure that Java has is ?:, which does return a value. Proposals to extend Java, like BGGA and FCM + JCA, suggest adding the ability to write more traditional control structures. In fact this is a major driver for the BGGA proposal, and it has two costs: the power of a BGGA closure is limited so that it can be used in a traditional control structure, and two types of closure are required, a restricted closure and an unrestricted closure. There may be a case for some extra traditional control structures, e.g. a for-each loop with an index or Automatic Resource Management (ARM), but I contend that traditional structures are already well covered.
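To make the statement/expression distinction concrete, here is a minimal sketch in plain Java. The class and method names are made up for illustration; the point is only that for and if deliver results through variables and side effects, whereas ?: yields a value directly.

```java
import java.util.Arrays;
import java.util.List;

class TraditionalVersusExpression {
    // for is a statement: it introduces its own variable i, runs each
    // iteration in turn, and passes its result out via the variable total.
    static int sumOfLengths(final List<String> words) {
        int total = 0;
        for (int i = 0; i < words.size(); i++) {
            total += words.get(i).length();
        }
        return total;
    }

    // if is a statement: each branch has to assign to a variable.
    static String signWithIf(final int n) {
        final String sign;
        if (n < 0) {
            sign = "negative";
        } else {
            sign = "non-negative";
        }
        return sign;
    }

    // ?: is an expression: the structure itself is the value.
    static String signWithConditional(final int n) {
        return n < 0 ? "negative" : "non-negative";
    }

    public static void main(final String[] args) {
        System.out.println(sumOfLengths(Arrays.asList("if", "for"))); // prints 5
        System.out.println(signWithIf(-3));                           // prints negative
        System.out.println(signWithConditional(3));                   // prints non-negative
    }
}
```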

Newer languages, e.g. Fortress, emphasise parallel control structures and control structures that return a value. They emphasise parallel-functional programming because this style is easier to exploit on the multi-core machines that are now commonplace. What I would like is the ability to write these new control structures. The inner class construct provides the ideal basis for writing parallel-functional structures. Consider a forEach loop that has an index, executes each loop in parallel, and returns the results as a map associating index with returned value.

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.concurrent.Callable;
    import java.util.concurrent.CancellationException;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Future;

    // Both declarations below are members of an enclosing class.
    public static abstract class ForEach<O, K, I> {
        // Shared signal thrown by a block to skip its own index
        protected static final Exception CONTINUE = new Exception( "forEach Continue" );
        static { CONTINUE.setStackTrace( new StackTraceElement[ 0 ] ); } // Delete irrelevant stack trace

        // The body of the loop; called once per entry
        public abstract O block( K index, I in ) throws Exception;

        // Wrap one call of block as a task that can be submitted to a pool
        Callable<O> aLoop( final K index, final I in ) {
            return new Callable<O>() {
                public O call() throws Exception { return block( index, in ); }
            };
        }
    }

    public static <O, K, I> LinkedHashMap<K, O> forEach( final ExecutorService pool,
            final Map<K, I> in, final ForEach<O, K, I> block ) {
        final int size = in.size();
        final Map<K, Future<O>> futures = new LinkedHashMap<K, Future<O>>( size );
        for ( final Map.Entry<K, I> entry : in.entrySet() ) { // Execute in parallel
            final K index = entry.getKey();
            final Callable<O> oneLoop = block.aLoop( index, entry.getValue() );
            final Future<O> future = pool.submit( oneLoop );
            futures.put( index, future );
        }
        final LinkedHashMap<K, O> results = new LinkedHashMap<K, O>( size );
        for ( final Map.Entry<K, Future<O>> entry : futures.entrySet() ) { // Collect results
            final K index = entry.getKey();
            try {
                results.put( index, entry.getValue().get() );
            } catch ( ExecutionException e ) {
                final Throwable cause = e.getCause();
                if ( cause != ForEach.CONTINUE ) { // Ignore CONTINUE otherwise abort
                    for ( final Future<O> f : futures.values() ) { f.cancel( true ); }
                    throw new IllegalStateException( "Exception thrown when evaluating forEach block index "
                            + index + " of value " + in.get( index ), cause );
                }
            } catch ( CancellationException e ) {
                // Ignore cancelled result
            } catch ( InterruptedException e ) {
                Thread.currentThread().interrupt(); // Restore the interrupted status
            }
        }
        return results;
    }

Which is executed like this:

    // Assumes java.util.concurrent.Executors is imported, plus
    // import static java.lang.System.out;
    final int numProc = 2 * Runtime.getRuntime().availableProcessors();
    final ExecutorService pool = Executors.newFixedThreadPool( numProc );
    final String text = "*Hello*";
    final int size = text.length();
    final Map<Integer, Character> input = new LinkedHashMap<Integer, Character>( size );
    for ( int i = 0; i < size; i++ ) { input.put( i, text.charAt( i ) ); }
    final ForEach<Character, Integer, Character> trim = new ForEach<Character, Integer, Character>() {
        public Character block( final Integer index, final Character in ) throws Exception {
            if ( index <= 0 || index >= size - 1 ) { throw CONTINUE; } // Skip first and last character
            return in;
        }
    };
    final Map<Integer, Character> output = forEach( pool, input, trim );
    out.println( output );
    pool.shutdownNow();

and gives the expected result:

{1=H, 2=e, 3=l, 4=l, 5=o}
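The skipping of the first and last index relies on a detail worth spelling out: an exception thrown inside a submitted task comes back from Future.get() wrapped in an ExecutionException, and getCause() returns the very same object that was thrown. That is why forEach can compare the cause with != against the shared CONTINUE instance. The sketch below isolates just that mechanism; ContinueSignalDemo and its CONTINUE field are hypothetical stand-ins, not part of the code above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ContinueSignalDemo {
    // Hypothetical stand-in for ForEach.CONTINUE: one preallocated instance
    // whose stack trace is emptied because it is never used for diagnostics.
    static final Exception CONTINUE = new Exception("forEach Continue");
    static { CONTINUE.setStackTrace(new StackTraceElement[0]); }

    // Throws CONTINUE inside a pooled task and reports whether the cause
    // seen by the caller is the identical object.
    static boolean causeIsSameInstance() {
        final ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            final Future<String> future = pool.submit(new Callable<String>() {
                public String call() throws Exception { throw CONTINUE; }
            });
            try {
                future.get();
                return false; // Not reached: the task always throws
            } catch (ExecutionException e) {
                return e.getCause() == CONTINUE; // Identity, not equals()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // Restore the interrupted status
                throw new IllegalStateException(e);
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(final String[] args) {
        System.out.println(causeIsSameInstance());           // prints true
        System.out.println(CONTINUE.getStackTrace().length); // prints 0
    }
}
```

Because the instance is shared, it should only ever be used as a signal; any per-call information attached to it would be visible to every thread.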

The important thing demonstrated is that inner classes have the ideal semantics for parallel-functional structures, but not the ideal syntax (see below). The semantics of BGGA-style closures are not ideal for this type of control structure because you need to use a restricted closure (not an unrestricted one) and you have to be careful not to inadvertently box this closure inappropriately (e.g. into a function type). The scope of variables etc. is not ideal with a BGGA-style closure either. This is not to say that something similar cannot be written in BGGA, just that it will be worse than what we can currently write. This raises the question of why a BGGA-style closure should be added when the existing inner classes are better. The BGGA and JCA proposals also have further syntax support for calling control structures, but unfortunately this can only be used with traditional control structures.

As mentioned above, there are some syntax improvements that can be made in terms of calling the new control structure, forEach. The declaration of the forEach method itself could also be shorter, but since the method is called more often than it is written, this is less important. The inner class trim, which is the block of code executed in parallel by forEach, is the main area where syntax improvements can be made. The inner class/closure proposals, C3S (this is my suggestion!), CICE, BGGA, and FCM, all provide shorter syntax for inner classes/closures, e.g. using C3S syntax the call to forEach and the inner class trim could be written in one line as:

final Map<Integer, Character> output = forEach( pool, input, method( index, in ) { if ( index <= 0 || index >= size - 1 ) { throw CONTINUE; } return in; } );

The primary syntax support provided by C3S is the keyword method that makes an instance of an anonymous class and overrides the only abstract method in its super class and also infers the super-class type, method to be overridden, argument types, return type, and exception types. This short syntax for defining an inner-class instance helps a great deal with the readability of the code and I think some form of short syntax for inner classes would be a useful addition to Java.
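As a rough illustration of how much the method keyword would have to infer, here is the equivalent boilerplate written by hand in current Java. This is only a sketch: Block, run, and lengthOf are hypothetical names standing in for ForEach and forEach, and the C3S line in the comment follows the syntax shown above rather than anything a Java compiler accepts.

```java
class MethodKeywordExpansion {
    // Hypothetical class with a single abstract method, standing in for ForEach.
    abstract static class Block<O, I> {
        abstract O apply(I in) throws Exception;
    }

    // Hypothetical control structure that takes a Block and returns its value.
    static <O, I> O run(final I in, final Block<O, I> block) {
        try {
            return block.apply(in);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Proposed C3S form (not valid Java):
    //     run( text, method( in ) { return in.length(); } )
    // Hand-written form, with the parts that would be inferred marked:
    static int lengthOf(final String text) {
        return run(text, new Block<Integer, String>() {       // super-class type and type arguments
            Integer apply(final String in) throws Exception { // method name, argument, return and exception types
                return in.length();
            }
        });
    }

    public static void main(final String[] args) {
        System.out.println(lengthOf("*Hello*")); // prints 7
    }
}
```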

C3S also provides further, secondary, syntax support and the line can be written as:

final output = forEach pool, input, method( index, in ) { if ( index <= 0 || index >= size - 1 ) { throw CONTINUE } in };

This secondary syntax support provided by C3S is:

- The type of output is inferred.
- Brackets, (), are not needed for method calls provided that the call is not ambiguous.
- The semicolon before a close brace, }, is not needed.
- return is optional at the end of a method.

I think this extra secondary support for inner classes and method calling is worth having, but the gain is not as great as that provided by short syntax for inner-class instances.

This post has demonstrated that the ideal semantics for new control structures are those of the inner class and that the BGGA and JCA proposals do not give us a useful new construct because they are biased, both in their closure semantics and in their syntactic support, towards traditional control structures. Java has a good selection of traditional control structures and I contend that what is really needed is the ability to write parallel-functional control structures; further, I suggest that inner classes with shorter syntax are ideal for this purpose, perhaps with further short syntax for calling these structures. What do you think?

About the Blogger

Dr. Howard Lovatt is a senior scientist with CSIRO, an Australian government-owned research organization, and is the creator of the Pattern Enforcing Compiler (PEC) for Java. PEC is an extended Java compiler that allows Software Design Patterns to be declared and hence checked by the compiler. PEC forms the basis of Howard's second PhD; his first concerned the design of Switched Reluctance Motors.

This weblog entry is Copyright © 2008 Howard Lovatt. All rights reserved.