Thursday, July 26, 2007

Here be (multiple) dragons

Are you thinking about the impact of multi-core processors on the code you write? Are you just thinking about it or actually writing code that will exploit multiple cores?

"The computing industry changed course in 2005 when Intel followed the lead of IBM’s Power 4 and Sun Microsystems’ Niagara processor in announcing that its high performance microprocessors would henceforth rely on multiple processors or cores." [The Landscape of Parallel Computing Research: A View from Berkeley]

Programming for multiple processors is hard. I find myself having to make an explicit mental switch from a 'functional view' of what the program is trying to accomplish to a 'multi-threaded view' of how the flow of control helps or hinders the program's effectiveness when multiple threads run on multiple cores. Then I switch back and forth until both worlds are reconciled. I'm sure smarter people can do this in a single pass, but I can't.

There are multiple dragons to consider too: those that live in the regular world of parallel algorithms (deadlock, stalling, synchronization, ...), in the nuances of the system's memory model (visibility guarantees, NUMA, data races, ...), and in the language implementation choices (a particular compiler's instruction reordering, the effects of Java final, the pthreads version, ...).
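To make one of those dragons concrete, here is a minimal Java sketch of the classic deadlock hazard and one common way around it: always acquire locks in a fixed global order. The `Account` class, the `transfer` method and the `runOpposingTransfers` helper are names invented for this illustration, not part of any library, and the sketch assumes every account has a unique `id`.

```java
final class OrderedLocks {
    // Hypothetical account type; `id` exists only to impose a global lock order.
    static final class Account {
        final int id;
        long balance; // only read or written while holding this account's monitor

        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    // Locks the account with the smaller id first, whatever the transfer direction.
    // If each thread instead locked `from` and then `to`, two opposing transfers
    // could each hold one lock and wait forever for the other.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Runs two threads that transfer in opposite directions and returns the final balances.
    static long[] runOpposingTransfers(int rounds) {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join(); // join() also makes the workers' writes visible to this thread
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.balance, b.balance };
    }

    public static void main(String[] args) {
        long[] balances = runOpposingTransfers(10_000);
        System.out.println(balances[0] + " " + balances[1]);
    }
}
```

Lock ordering is only one of several remedies (timeouts and lock-free structures are others), but it shows how much of the 'multi-threaded view' has to be held in your head even for a two-line update.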

In Java there is a defined memory model (JSR 133) and a set of concurrency utilities that significantly help with writing correct, portable programs. Intel have released a set of libraries and utilities for C++ called 'Threading Building Blocks' that should improve the correctness and portability of C++ programs too.
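As a rough illustration of what the Java concurrency utilities buy you, the sketch below has several threads bump a shared counter through `java.util.concurrent.atomic.AtomicLong` and a fixed thread pool. The class name `SafeCounter` and the helper `countWith` are made up for this example; the lambda syntax assumes a Java 8 or later compiler.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class SafeCounter {
    // Hypothetical helper: starts `threads` workers, each incrementing a
    // shared counter `perThread` times, and returns the final count.
    static long countWith(int threads, int perThread) {
        AtomicLong counter = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    // Atomic read-modify-write; a plain `long++` here would be a data race.
                    counter.incrementAndGet();
                }
            });
        }
        pool.shutdown(); // accept no new tasks, let the submitted ones finish
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 100_000)); // prints 400000
    }
}
```

The point is less the counter itself than the fact that the visibility and atomicity guarantees are documented by the platform rather than left to whatever a particular compiler and CPU happen to do.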

The Intel suite (which appears to be descended from the OpenMP code) runs on Windows/P4 or Xeon, Linux/P4 or Xeon or Itanium2, and MacOSX/Core Duo and has been released under a modified-GPLv2 license, with the option to commercially license at a measly 300 dollars a platform.

That commercial license option means this is another 'parallel open source project'. If you want to contribute to Threading Building Blocks then you must assign copyright to Intel, and grant them a broad patent license. In return they license your code back to you to use as you choose. The governance model is 'Intel decide' too. I have to say that I'm no fan of this style of project, but back to the technology...

Abstracting the 'multithreaded view' into a set of consumable APIs and pragmas helps the application writer focus on the functional correctness, and allows the compiler writers to optimize for known patterns without the disruption of language changes.
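A small sketch of that idea, using Java rather than the Intel library itself: the standard `java.util.stream.LongStream` API (Java 8 and later) lets the code state what to compute, while the runtime decides how to split the work across cores. This is a stand-in for the general approach, not an example of the Threading Building Blocks API, and `ParallelSum` and `sumOfSquares` are names invented here.

```java
import java.util.stream.LongStream;

final class ParallelSum {
    // Hypothetical example function: sums x*x for x in 1..n.
    // The body reads as the 'functional view'; `.parallel()` hands the
    // 'multi-threaded view' (splitting, scheduling, combining) to the library.
    static long sumOfSquares(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(x -> x * x)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000)); // prints 333833500
    }
}
```

Because the mapping function has no shared mutable state, the parallel and sequential versions give the same answer, which is exactly the kind of known pattern a library can exploit safely.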

It will be interesting to see if these building blocks are adopted by the other major CPU/compiler vendors, and whether they are extended into languages beyond C++. Will they do it under the Intel terms for this project?

The technology is sound. Anything that helps us voyagers to navigate the uncharted waters and avoid the dragons will be most welcome.
