HPC on the (relative) cheap using public cloud providers

For the past several years, I’ve been working on applying high-performance computing techniques to high-throughput, data-intensive processing on desktop computers, for things like image and video processing. It’s been fun tracking what the multi-processing end of HPC has been doing, where the Top 500 supercomputer list has been very competitive and very active. Countries, IHVs and universities vie to generate the most teraflops, spending millions and millions of dollars on the cooling plants alone for their dedicated data centers. These supercomputers exist to solve the BIG PROBLEMS of computing, and aren’t really useful beyond that.

At the same time, I’ve been following the public computing clouds like Amazon’s EC2, Google’s App Engine and Rackspace’s Public Cloud. These are interesting because they provide compute at the other end of the spectrum: occasional compute tasks, or higher average workloads that need the occasional spike in capacity (like web apps). The public clouds are made up of thousands of servers and certainly rival or best the supercomputers in number of cores and raw compute power, but they exist for a different purpose.

This article in The Register really got me excited, especially when I read this:

Stowe tells El Reg that during December last year, Cycle Computing set up increasingly large clusters on behalf of customers to start testing the limits. First, it did a 2,000-core cluster in early December, and then a 4,096-core cluster in late December. The 10,000-core cluster that Cycle Computing set up and ran for eight hours on behalf of Genentech would have ranked at 114 on the Top 500 computing list from last November (the most current ranking), so it was not exactly a toy even if the cluster was ephemeral.

The cost of running this world-class supercomputer?

Genentech loaded up its code and ran the job for eight hours at a total cost of $8,480, including EC2 compute and S3 storage capacity charges from Amazon and the fee for using the Cycle Computing tools as a service.
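
To put that figure in perspective, here is a rough back-of-the-envelope sketch in Python using only the numbers quoted above (10,000 cores, eight hours, $8,480). Since the total also covers S3 storage and the Cycle Computing fee, the result is a blended rate for the whole job, not Amazon's EC2 price per core. The function and variable names are my own, purely for illustration.

```python
# Figures quoted from The Register article.
CORES = 10_000          # size of the cluster
HOURS = 8               # how long the job ran
TOTAL_COST_USD = 8_480  # EC2 + S3 + Cycle Computing fee, all in


def blended_cost_per_core_hour(total_cost, cores, hours):
    """Total cost divided by the number of core-hours consumed."""
    return total_cost / (cores * hours)


core_hours = CORES * HOURS
rate = blended_cost_per_core_hour(TOTAL_COST_USD, CORES, HOURS)
cluster_cost_per_hour = TOTAL_COST_USD / HOURS

print(f"core-hours used:            {core_hours:,}")
print(f"blended cost per core-hour: ${rate:.3f}")
print(f"whole cluster, per hour:    ${cluster_cost_per_hour:,.2f}")
```

That works out to roughly a dime per core-hour, or a little over a thousand dollars an hour for the whole cluster.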

Real-world HPC is now reaching price points where it is accessible to even small companies or research groups. This seems like a ripe opportunity for companies who can apply HPC techniques to solve real problems for others, and for tools vendors who can make these ephemeral clouds easier to use for companies that want to take advantage of them without having to build up high-end expertise in-house.