Oct 23, 2008
Just got back from Chicago, where over the past 2 days a small group of scientists from academia and industry discussed various aspects of cloud computing and related topics.
One of the topics was comparing extremely large-scale analytical problems and the systems leveraged to solve them. In order to compare classes of supercomputers, Alex Szalay (Johns Hopkins University) explained a simple yet interesting figure: the AMDAHL number (Amdahl's Law; Bell, Gray and Szalay 2006).
Alex explained the Amdahl number (BW) as the ratio of bits of IO/sec to instructions/sec. A balanced system delivers one bit of IO/sec per instruction/sec, which gives BW = 1.
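To make the definition concrete, here is a minimal sketch of the calculation. The function name and the node figures are my own illustrative assumptions, not numbers from Alex's talk or from any of the systems mentioned here.

```python
def amdahl_number(io_bytes_per_sec: float, instructions_per_sec: float) -> float:
    """Return bits of IO per second divided by instructions per second."""
    if instructions_per_sec <= 0:
        raise ValueError("instructions_per_sec must be positive")
    # IO throughput is usually quoted in bytes/sec, so convert to bits first.
    return (io_bytes_per_sec * 8) / instructions_per_sec


# Hypothetical node (assumed figures): 250 MB/s sustained IO,
# 4 billion instructions per second.
example_bw = amdahl_number(250e6, 4e9)
print(f"BW = {example_bw:.3f}")  # prints BW = 0.500
```

The result depends heavily on which IO figure goes in: peak sequential throughput and sustained random throughput for the same hardware can give very different numbers.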
Why is this figure interesting? To compare the analytical capabilities of various extremely large clusters, it is important to categorize them into different groups based on their processing capabilities. Typical commercial applications of large-scale analytics require large amounts of IO per available CPU, while various scientific applications require less IO per available CPU.
For a Blue Gene the BW = 0.013; for the JHU cluster BW = 0.664. So off I went and calculated the Amdahl number of our main processing clusters. While different in size, they are configured in identical proportions. The biggest difference from the other systems: our clusters are built and optimized for high levels of random access, while most other systems' IO capability is measured in single-stream serial IO - obviously a major differentiation.
Anyway, the result is BW = 0.8, the closest to 1 of any system listed by Alex. Again, I calculated that metric with our mixed and highly random sustained IO throughput. When you have hundreds of concurrent workloads hitting the same platform simultaneously, everything becomes pretty much random in terms of IO patterns.
Finally, I compared this figure to a typical Map-Reduce cluster based on 8 cores per node and 4 internal disks as the main IO source. The BW for such a configuration is around 0.005, which indicates an extremely high ratio of CPU to IO. For random access patterns, that BW slips to 0.001.
What is the importance of the Amdahl number, you might ask? The first and very obvious application is to segment analytical applications into groups, with various scientific applications on one end of the spectrum and business-oriented analytics on the other.
Think of it this way: does your workload scan massive amounts of data for rare patterns, perform large sorts, or join large data sets beyond the memory footprint of your system?
If so, you really need the highest achievable Amdahl number or you will not be able to leverage the CPU cycles available. Furthermore, you will consume considerably more energy in performing the same task, as the power consumption of individual nodes is fairly constant whether you run the cores at 5% or 99% busy.
This brings us to another topic: the energy consumed for a particular task. Together with traditional $$$-based TCO figures, it is becoming a more and more interesting metric for measuring the efficiency of a particular platform performing various tasks for a set of data processing requirements.
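A rough sketch of that energy argument, under simplifying assumptions of my own: node power draw is constant regardless of load, IO and CPU work overlap perfectly, and the task is limited by whichever takes longer. All figures and names below are hypothetical.

```python
def task_energy_joules(data_bytes, instructions, io_bits_per_sec,
                       instructions_per_sec, node_watts):
    """Return (energy in joules, CPU utilization) for one task on one node."""
    io_seconds = data_bytes * 8 / io_bits_per_sec
    cpu_seconds = instructions / instructions_per_sec
    # Assumption: IO and compute overlap, so the slower one sets elapsed time.
    elapsed = max(io_seconds, cpu_seconds)
    # Assumption: constant power draw, so energy is simply power x time.
    return node_watts * elapsed, cpu_seconds / elapsed


# Hypothetical task: scan 1 TB while executing 1e12 instructions
# on a 300 W node that runs 1e10 instructions/sec.
low_bw = task_energy_joules(1e12, 1e12, 1e9, 1e10, 300.0)   # BW = 0.1
high_bw = task_energy_joules(1e12, 1e12, 8e9, 1e10, 300.0)  # BW = 0.8
print(low_bw, high_bw)
```

In this toy model both nodes are IO-bound, so raising the Amdahl number from 0.1 to 0.8 cuts elapsed time, and with it the energy for the task, by the same factor of eight.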
More about that at another time.

