Wednesday, September 16, 2009

Programming sockets using temporary ports

In an earlier posting, I showed where there was code in Apache Harmony that had some unsafe parameter checking logic, and I gave a pattern for how to do it right.

Another "anti-pattern" that I see recurring in the Harmony test cases is around client-server socket programming. The typical scenario is that the tester wants to exercise some socket code, so they spin up a new Thread to act as the server, and run the tests on the main thread. The tester either picks a port that they assume will be free, or use some horrible code that tries to find a free port:


/*
* Returns a different port number every 6 seconds or so. The port number
* should be about += 100 at each 6 second interval
*/
private static int somewhatRandomPort() {
Calendar c = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
int minutes = c.get(Calendar.MINUTE);
int seconds = c.get(Calendar.SECOND);

return 6000 + (1000 * minutes) + ((seconds / 6) * 100);
}

Guessing port numbers like this is simply awful and hopeless. Once the port has been established, the poorly written test case opens a server socket to accept connections in the server Thread, while to get the timing right the client goes into a sleep for a while to give the server time to start up. Apart from the fact that the code is going to fail intermittently, it means the test case always takes a minimum length of time to execute as it spends time sleeping.

There is no need for guessing ports or separate threads. In Java, just like other programming languages that expose the underlying platform's TCP/IP stack behaviour, binding to port zero instructs the stack to allocate an ephemeral port.

Furthermore, server sockets have a listen backlog queue capable of remembering 50 outstanding connect requests by default. Here is a example of simple client-server code using an ephemeral port and a single thread to exchange a simple message over TCP/IP sockets.

// Set-up
ServerSocket server = new ServerSocket(0);

Socket client = new Socket();
client.connect(server.getLocalSocketAddress());

Socket worker = server.accept();

// Do some stuff
client.getOutputStream().write("Hello world!".getBytes("UTF-8"));
byte[] buffer = new byte[1024];
int length = worker.getInputStream().read(buffer);

// Tidy-up
client.close();
worker.close();
server.close();

System.out.println(new String(buffer, 0, length, "UTF-8"));

Hopefully the code is simple enough to understand without further comment. I'll just point out that the server socket is created with an argument of "0" to mean the listening socket should be opened on any network adapter, with a stack allocated port number, and supporting up to fifty pending connections. Then the client connects to the actual interface and port that was used using getLocalSocketAddress().

The same simple example can also be written using the NIO APIs, which requires passing null to the bind() method, like this:

// Set-up
ServerSocketChannel server = ServerSocketChannel.open();
server.socket().bind(null);

SocketChannel client = SocketChannel.open();
client.connect(server.socket().getLocalSocketAddress());

SocketChannel worker = server.accept();

// Do some stuff
client.write(ByteBuffer.wrap("Hello world!".getBytes("UTF-8")));

ByteBuffer readBuffer = ByteBuffer.allocate(1024);
worker.read(readBuffer);
readBuffer.flip();

// Tidy-up
worker.close();
client.close();
server.close();

System.out.println(Charset.forName("UTF-8").decode(readBuffer));

Of course, the simple examples still ignore a few return codes and exceptions that should be considered to make the code safer, but the purpose here is to show the structure for tests and applications that need to use sockets to exchange information.

Monday, September 07, 2009

Shoddy reporting from El Reg :-(

I stop by The Register website quite frequently to keep up on the gossip, and quite enjoy their laid back reporting style, but the September 4th article titled "Oracle should relax Sun's Java Community control grip" was a very poor piece of reporting.

It hit my radar with the reference to Apache Harmony, where Clarke writes:

For all its evangelism - and its initial decision to open source Java - Sun has refused to open the TCKs, infuriating and frustrating the open-source community.

This has led to accusations that Sun is hindering - not helping - open-source Java projects such as Harmony from the Apache Software Foundation (ASF), backed strongly by IBM.

While Apache's has been able to build an implementation of Java Standard Edition under Project Harmony thanks to the opening of Java, Harmony cannot be certified because the TCKs contain proprietary code the open-source code cannot touch. Harmony, therefore, remains stuck in a limbo of having been built but being uncertified.
Unfortunately this just serves to illustrate that Clarke doesn't understand the situation, or history, around open source Java SE.

Apache have never asked, or expected, Sun to open source their TCKs. While that would be a fine outcome in itself, there is no reason why Sun cannot maintain their tests as proprietary code and make them available under a variety of license terms as they do with other Java specification test suites. Apache have asked for license terms that allow the code we have written to be released under an open source license, that is to say Apache will not entertain restrictions placed on our code that passes the TCK.

http://www.apache.org/jcp/sunopenletter.html

Furthermore, Apache Harmony and GNU Classpath existed well before OpenJDK, so it can hardly be said that OpenJDK was a prerequisite to alternative implementations being created!

The story attempts to cover a great deal of ground, the possible effects of the acquisition, the Apache dispute, JCP reform, and so on -- each of these is a major story in itself and this shoddy attempt does each a disservice.

Improving parameter checking in Apache Harmony

I'm going through some Apache Harmony code at the moment, and spotting a few places where there is a common pattern of code that needs fixing.

Consider this method:


public synchronized int read(byte[] target, int offset, int length) throws IOException {
if (length + offset > target.length || length < 0 || offset < 0) {
throw new ArrayIndexOutOfBoundsException();
}
if (0 == length) {
return 0;
}
ByteBuffer buffer = ByteBuffer.wrap(target, offset, length);
return channel.read(buffer);
}

Can you see the problem?
Hint: the parameter checking in the if() statement is wrong.

The original author rightly wants to ensure that the offset and length represent plausible values for the data to be put into the buffer; that is, the method should not attempt to place data outside the bounds of the 'target' byte array.

The problem is that in Java, like a number of other languages, (int) values overflow without any exception being thrown. In the code above we are checking for illegal values by computing (length + offset) and throwing an exception if it is greater than the buffer's length. This ensures we don't "write off the end" of the target buffer. However, if the sum overflows an int representation then it may indeed pass this safety check, and permit a value for offset or length that is too large.

As a practical illustration, consider that the 'offset' actual parameter value is Integer.MAX_VALUE = (int)2147483647, and the 'length' value is, say 42. In this case the sum of these positive values is a negative value '(int)(2147483647 + 42) = (int)-2147483607' so the less-than test passes and the method is broken!

The fix is to avoid the sum, and convert it into a subtraction, e.g.
if (length < 0 || offset < 0 || length > target.length - offset) {
throw new ArrayIndexOutOfBoundsException();
}
Now we can be assured that there will be no int overflow. The target.length cannot be less than zero, so the subtraction can only produce values in the range (0 - Integer.MAX_VALUE = Integer.MIN_VALUE + 1) to (Integer.MAX_VALUE - 0 = Integer.MAX_VALUE).

I know this is not news for Java developers. Josh Bloch and others have been highlighting the problem for a number of years, yet I've found about four places in the code so far that exhibit this potential problem. They'll all be fixed in time for the next milestone build.