In an article for Nature, Cory Doctorow, co-editor of Boing Boing, describes some of the world’s most colossal data centres. These include facilities for gene sequencing, particle physics, internet archiving, and so forth. The article includes some vivid descriptions of the massive scale at which data is now being gathered, stored, and handled, as well as some of the associated technologies. Describing the ‘PetaBoxes’ that contain copies of much of the web, he explains:
[H]oused in these machines are hundreds of copies of the web — every splenetic message-board thrash; every dry e-government document; every scientific paper; every pornographic ramble; every libel; every copyright infringement; every chunk of source code (for sufficiently large values of ‘every’, of course).
They have the elegant, explosive compactness of plutonium.
Far from being static repositories, many of these places have been designed for a near-constant process of upgrading. They maintain spare capacity into which 1 terabyte drives can be installed when the 500 gigabyte drives become dated (and then 2 terabyte drives, and then 4 terabyte drives). The ones with the greatest capacity use huge arrays of magnetic tapes, archived and accessed by robotic arms. The data centre at CERN (where the Large Hadron Collider will soon begin collecting data) includes two robots, each of which manages five petabytes of data. That’s five million gigabytes: equivalent to more than 585,000 double-sided DVDs.
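If you want to sanity-check those figures, a quick back-of-the-envelope calculation lands in the same neighbourhood. This little Python sketch assumes decimal units and roughly 8.5 GB per disc, neither of which the article spells out:

```python
# Back-of-the-envelope check of the article's figures.
# Assumptions (mine, not the article's): decimal units, ~8.5 GB per disc.
petabytes = 5
gigabytes = petabytes * 1_000_000        # 1 PB = 1,000,000 GB (decimal)
dvd_capacity_gb = 8.5                    # assumed capacity per disc
dvds = gigabytes / dvd_capacity_gb
print(f"{petabytes} PB = {gigabytes:,} GB ≈ {dvds:,.0f} DVDs")
# -> 5 PB = 5,000,000 GB ≈ 588,235 DVDs, consistent with "more than 585,000"
```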
One of the most interesting issues discussed is heat, and how it is managed. The section on the emergency shutdowns required when cooling fails comes across especially powerfully. Describing a facility in the Netherlands, the article says:
The site manager Aryan Piets estimates that if it broke down and the emergency system didn’t come on, the temperature in the centre would hit 42 °C in ten minutes. No one could cleanly bring down all those machines in that time, and the dirtier the shutdown, the longer the subsequent start-up, with its rebuilding of databases and replacement of crashed components. Blow the shutdown and stuff starts to melt — or burn.
The main system being discussed is actually surprisingly climate-friendly: it uses cool lake water and pumps, rather than air conditioning equipment, to keep the drives and servers at an acceptable temperature. Hopefully, other firms with massive server farm needs are paying attention to it. The article mentions Google several times.
For the geeky and the curious, the whole article deserves a read.
New “petascale” computer models depicting detailed climate dynamics, and building the foundation for the next generation of complex climate models, are in the offing. Researchers at the University of Miami Rosenstiel School of Marine and Atmospheric Science (RSMAS), the National Center for Atmospheric Research (NCAR) in Boulder, Colo., the Center for Ocean-Land-Atmospheric Studies (COLA) in Calverton, Md., and the University of California at Berkeley are using a $1.4 million award from the National Science Foundation (NSF) to generate the new models.
The development of powerful supercomputers able to analyze decades of data in the blink of an eye marks a technological milestone, the scientists say, one capable of bringing comprehensive changes to science, medicine, engineering, and business worldwide.
The speed of a supercomputer is measured by how many calculations it can perform per second.
Petascale computers can make 1,000,000,000,000,000 (one quadrillion) calculations per second, a staggeringly high rate even when compared to existing supercomputers.
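To get a rough sense of that number, here is a purely illustrative comparison; the 10^9 calculations per second assumed for an ordinary desktop core is my ballpark figure, not something from the researchers:

```python
# Rough sense of scale. The "desktop" rate below is an illustrative assumption,
# order of magnitude only.
petascale_ops_per_sec = 1e15   # one quadrillion calculations per second
desktop_ops_per_sec = 1e9      # assumed rate for a single ordinary core

# A workload that keeps a petascale machine busy for one second...
seconds_on_desktop = petascale_ops_per_sec / desktop_ops_per_sec
days_on_desktop = seconds_on_desktop / 86_400
print(f"{seconds_on_desktop:,.0f} seconds ≈ {days_on_desktop:.1f} days on the desktop")
# -> 1,000,000 seconds ≈ 11.6 days
```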