The world’s most extensive data centres

In an article for Nature, Cory Doctorow, co-editor of Boing Boing, describes some of the world’s most colossal data centres. These include facilities for gene sequencing, particle physics, and internet archiving, among others. The article includes some vivid descriptions of the massive scale at which data is being handled, as well as some of the associated technologies. Describing the ‘PetaBoxes’ that contain copies of much of the web, he explains:

[H]oused in these machines are hundreds of copies of the web — every splenetic message-board thrash; every dry e-government document; every scientific paper; every pornographic ramble; every libel; every copyright infringement; every chunk of source code (for sufficiently large values of ‘every’, of course).

They have the elegant, explosive compactness of plutonium.

Far from being static repositories, many of these places have been designed for a near-constant process of upgrading. They maintain spare capacity into which 1 terabyte drives can be installed when the 500 gigabyte drives become dated (and then 2 terabyte drives, and then 4 terabyte drives). The ones with the greatest capacity use huge arrays of magnetic tapes, archived and accessed by robotic arms. The data centre at CERN (where the Large Hadron Collider will soon begin collecting data) includes two robots, each of which manages five petabytes of data. That’s five million gigabytes: equivalent to more than 585,000 dual-layer DVDs.
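For anyone who wants to check the DVD comparison, here is a rough sketch of the arithmetic. It assumes decimal units (one petabyte is a million gigabytes) and nominal disc capacities; those capacity figures are my own assumption and do not come from the article.

```python
import math

# Assumption: decimal units, so 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000

# Nominal DVD capacities in GB. These figures are assumptions for
# illustration, not numbers taken from the Nature article.
DVD_CAPACITY_GB = {
    "single-sided, single-layer": 4.7,
    "single-sided, dual-layer": 8.54,
    "double-sided, single-layer": 9.4,
}


def discs_needed(petabytes: float, capacity_gb: float) -> int:
    """Return the whole number of discs needed to hold the given data."""
    total_gb = petabytes * GB_PER_PB
    # Round up, since a partly filled final disc still counts as a disc.
    return math.ceil(total_gb / capacity_gb)


for disc_type, capacity in DVD_CAPACITY_GB.items():
    print(f"{disc_type}: {discs_needed(5, capacity):,} discs")
```

Under these assumptions, a figure near 585,000 corresponds to the 8.54 GB dual-layer capacity.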

One of the most interesting issues described is heat and the mechanisms through which it is addressed. The section describing how emergency shutdowns need to occur in the event of a cooling failure is especially striking. Describing a facility in the Netherlands, it says:

The site manager Aryan Piets estimates that if it broke down and the emergency system didn’t come on, the temperature in the centre would hit 42 °C in ten minutes. No one could cleanly bring down all those machines in that time, and the dirtier the shutdown, the longer the subsequent start-up, with its rebuilding of databases and replacement of crashed components. Blow the shutdown and stuff starts to melt — or burn.

The main system being discussed is actually surprisingly climate friendly, since it uses cool lake water and pumps rather than air conditioning equipment to keep the drives and servers at an acceptable temperature. Hopefully, it is something that other firms with massive server farm needs are paying attention to. The article mentions Google several times.

For the geeky and the curious, the whole article deserves a read.

Replacing the keyboard on a G4 iBook

Back in April, I managed to spill coffee on the keyboard of my 14″ iBook, disabling a number of keys. Now, I have managed to return it to functionality at a lower cost than anticipated.

The authorized Apple repair places in Ottawa wanted $45 just to diagnose the problem – specifically, to determine if the failure lay in the keyboard itself or the logic board it connects to. Replacing the keyboard would then cost extra for parts and labour. Replacing the logic board would be quite a significant expense, largely because the machine would have to be seriously taken apart.

Instead of taking it into a shop, I bought a replacement keyboard on eBay for about $30. Had it been a logic board issue, I would at least have diagnosed it myself at a lower cost, which could have been further reduced by re-selling the replacement keyboard. As it happens, the new keyboard works fine. The process of installing it is pretty straightforward:

  1. Shut down computer. Remove power cord and battery.
  2. Lift plastic tabs at the top of the keyboard so it can swing upward towards you. Lay the partially removed keyboard flat across the area with the touchpad.
  3. Ground yourself by touching something metal, to prevent static shock to the components.
  4. If present, remove the AirPort card by gently pulling it towards the screen. Gently remove the plug connecting it to the motherboard.
  5. Use a tiny screwdriver to remove the four tiny screws holding down the aluminum plate under the space where an AirPort card goes.
  6. Lift off that plate.
  7. Pull the keyboard connector out of the motherboard. In my experience, it takes a moderate amount of force to make it disconnect.
  8. Position the new keyboard where the old one was, lying keys-down on the trackpad area.
  9. Plug the new keyboard into the logic board, as before.
  10. Replace the aluminum plate. Replace the four screws.
  11. If present, replace the AirPort card by plugging the connector into it, then clicking it back where it was previously.
  12. Place the keyboard back in its normal position, allowing the tabs to click it into place.

I am always suspicious that stuff I buy on eBay is counterfeit. This keyboard certainly looks identical to the old one. I am less sure about the sound and feel of the keys, but that may just be because I had grown used to how an old keyboard feels, and then to the feel of Apple’s nice new aluminum external keyboards.

The replacement keyboard is definitely squeakier than would be ideal (particularly the spacebar). Hopefully, it will mellow with use.

Paper backups of digital files

One thing well illustrated by history is that the records that endure are the ones that got chiseled into stone or, failing that, at least put on paper. Given the issues of long-term reliability relating to hard drives, flash memory, and writable optical media, someone wishing to preserve information for the distant future might be well advised to make a paper copy of the parts that are most critical.

PaperBack is a mechanism for facilitating exactly that. It includes software to convert about half a megabyte of any kind of data into a pattern that can be printed onto a single page. For some kinds of highly compressible information, it can manage three megabytes per page – as much as two old 3.5″ diskettes. It also includes code for scanning the data back into a digital form. While I doubt anybody will be doing this for multi-gigabyte video files, it may be a worthwhile thing for some kinds of information. Anyone building the modern equivalent of an ancient Greek tomb might be especially well advised to consider the software. Hopefully, future generations will prove as capable at deciphering JPEG images as those in the recent past did at deciphering Linear B.
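To get a feel for why multi-gigabyte files are impractical, here is a rough sketch of the page count at the capacities quoted above. It assumes decimal megabytes and takes the half-megabyte and three-megabyte figures at face value; the function name is just something I made up for illustration and has nothing to do with PaperBack’s own code.

```python
import math

# Per-page capacities as quoted in the post, in decimal megabytes.
TYPICAL_MB_PER_PAGE = 0.5    # arbitrary data
BEST_CASE_MB_PER_PAGE = 3.0  # highly compressible data


def pages_needed(file_size_mb: float,
                 mb_per_page: float = TYPICAL_MB_PER_PAGE) -> int:
    """Return how many printed pages a file of the given size would need."""
    # Round up, since a partly filled last page is still a page.
    return math.ceil(file_size_mb / mb_per_page)


# Two 1.44 MB diskettes of arbitrary data.
print(pages_needed(2.88))                          # 6
# The same data, if it compresses extremely well.
print(pages_needed(2.88, BEST_CASE_MB_PER_PAGE))   # 1
# A 4 GB video file (assuming 1 GB = 1,000 MB).
print(pages_needed(4000))                          # 8000
```

Eight thousand sheets for one video is a lot of paper, which is why this seems better suited to small, critical files.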

A compiled version of the software is available for Windows. Mac and Linux users will need to compile the code for themselves.

Google’s web browser

Google is in the process of rolling out a web browser, called Chrome. The defining characteristics are mostly on the back end, in terms of how it deals with processes and memory addressing. That being said, the foundation is being laid for what ought to be an unusually stable and secure browser.

The whole thing is explained in this comic book. The beta version is available for Windows, but we Mac users need to keep waiting for a while yet.

P.S. Another piece of software I am excited about is Spore. I have been a big fan of SimCity, SimAnt, and the like. The opportunity to evolve intelligent organisms on my shiny new computer is one I anticipate eagerly.

A supercomputer on every desk

One product of globalization and technological advance is the amplification of the ‘pygmy and giant’ phenomenon. On measures like wealth or fame, the world is probably more unequal than ever before. There are faces that would be recognized by a significant majority of those alive on Earth – a situation that has probably existed for only a few decades at most.

At the same time, technology is sometimes a great equalizer. For instance, the world wide web lets virtually anyone with literacy and moderate wealth speak to a worldwide audience. The range of capabilities is also narrowing in other areas. For example, Wal-Mart supposedly has about 583 terabytes of sales and inventory data stored at its headquarters. That sounds impressive until I remember the 1 terabyte drive sitting on my desk. It cost about three days’ worth of after-tax pay and serves mainly to protect my data from the failure of the disk in my main computer. At a moderate personal expense, I have 0.17% of Wal-Mart’s storage capacity.
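The percentage is simple division, sketched here using only the two figures quoted above (the function name is my own):

```python
def share_percent(mine_tb: float, theirs_tb: float) -> float:
    """Return one storage capacity as a percentage of another."""
    return 100 * mine_tb / theirs_tb


# One desktop drive against the reported 583 TB.
print(f"{share_percent(1, 583):.2f}%")  # 0.17%
```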

The amount of computing power you can get per dollar (or per watt of electricity) continues to increase dramatically. For the price of a sports car, you can build yourself a supercomputer. It is interesting to speculate upon what the democratization of computing power will lead to. Will it just mean increasingly realistic games and ever-more-bloated word processors, or will some genuinely game-changing applications emerge? The fact that someone can host a webpage like this for under $40 a year suggests the potential importance of this confluence in technology, economics, and innovation.

Link rot

Anyone who has been running a website for a few years (and paying attention) will be familiar with the reality of link rot. Sites get redesigned or removed from the web and, as a result, links you made to them in the past stop working or no longer lead to the right content.

Unfortunately, there isn’t a huge amount that can be done about this. For the people doing the linking, there is only so much effort that can be devoted to making sure old links are still current. It is feasible for a few critical links (blogroll items, links in key posts), but not in the case of hundreds or even thousands of old entries. If the content has been moved, there is at least the theoretical possibility of combatting link rot through updating. If the content is simply gone, there is really very little that can be done.

Those being linked to can probably do the most in response. When they move from one type of site organization (or one site location) to another, they can provide tools to help those brought in through old links. The gold standard is to automatically redirect people to the correct pages in new locations. At the very least, sites should provide a mechanism for lost visitors to search for the content they wanted.
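As a minimal sketch of what that might look like, here is a tiny server using Python’s standard library. It sends a permanent (301) redirect for known old addresses, and otherwise returns a 404 page that points to a search page. The paths in the table and the search address are made up for illustration; a real site would more likely do this in its web server or CMS configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# Hypothetical mapping from old addresses to new ones (made-up paths).
REDIRECTS = {
    "/archives/2008/09/link-rot.html": "/2008/09/link-rot/",
    "/about.php": "/about/",
}
SEARCH_PAGE = "/search/"  # hypothetical search address


def resolve(request_path: str):
    """Return the new location for an old path, or None if it is unknown."""
    # Ignore any query string when looking up the old address.
    return REDIRECTS.get(urlsplit(request_path).path)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        new_path = resolve(self.path)
        if new_path is not None:
            # Gold standard: send the visitor straight to the new page.
            self.send_response(301)
            self.send_header("Location", new_path)
            self.end_headers()
        else:
            # Fallback: admit the page is gone, but offer a way to search.
            body = (
                f'Page not found. Try the <a href="{SEARCH_PAGE}">site search</a>.'
            ).encode("utf-8")
            self.send_response(404)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)


def run(port: int = 8000) -> None:
    """Serve on localhost until interrupted (not called automatically)."""
    HTTPServer(("localhost", port), RedirectHandler).serve_forever()
```

Calling `run()` and visiting `http://localhost:8000/about.php` should bounce the browser to `/about/`, while any unknown address gets the page with the search link.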

Steganography challenge

In the past, I have posted a few cipher challenges for the cryptographically inclined. Here is a new one:

The above is an example of steganography rather than cryptography, though the two can be easily combined. Indeed, the same approach used above could be applied in a far more subtle and effective fashion. To save people some trouble, I can tell you that the hidden message is in the actual text shown, not hidden somewhere in the data file.

Here is a hint, weakly enciphered using ROT13: Guvf sbez bs frperg jevgvat jnf vairagrq ol Senapvf Onpba.
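For anyone who has not met ROT13 before, it just rotates each letter thirteen places through the alphabet, so applying it twice gets you back where you started. A short sketch of a decoder follows (the function name is my own, and everything other than letters is left untouched):

```python
import string


def rot13(text: str) -> str:
    """Rotate every ASCII letter 13 places; leave other characters alone."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[13:] + lower[:13] + upper[13:] + upper[:13],
    )
    return text.translate(table)


hint = "Guvf sbez bs frperg jevgvat jnf vairagrq ol Senapvf Onpba."
print(rot13(hint))
```

Because the alphabet has 26 letters, the same function both enciphers and deciphers.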

Sleep and slime moulds

Since I spent the last fourteen hours sleeping, I don’t have much of interest to convey right now.

As a consolation, here is a time lapse video of slime moulds and fungus growing. I have always found slime moulds rather fascinating. They start off as single-celled, bacteria-eating organisms resembling amoebas. If two with matching mating types encounter one another, they can form a zygote. That, in turn, becomes a macroscopic organism with many nuclei, but no membranes between cells – an “interconnected network of protoplasmic strands.” Once this has eaten everything nearby, fruiting bodies form that disperse spores. These hatch into single-celled bacteria-eating eukaryotes once again.

One of the more odd and charming sections from the Wikipedia entry on slime moulds is this:

In 2006, researchers at the University of Southampton and the University of Kobe reported that they had built a six-legged robot whose movement was remotely controlled by a Physarum slime mold. The mold directed the robot into a dark corner most similar to its natural habitat.

It is disconcerting to consider that an entity consisting of an amalgamation of amoebas can apparently display something akin to preferences when put in control of a robot (though I think the ‘control’ just consisted of watching how the slime mould moved and copying it). This article has a picture of the robot.

In any case, I am hoping that my period of hibernation will reset my brain. During the last few days, it has sunk into something akin to – but nonetheless more profound than – the normal August lull which permeates Ottawa.

The future of plate tectonics

The PALEOMAP project has created some interesting projections of how the continents will be arranged in the distant future. Fifty million years out, Africa will have pushed into Europe, eliminating the Mediterranean. In 100 million years, all the continents will be drawing together. In 250 million years, only two landmasses will be left: a combination of Australia and Antarctica near the south pole, and North and South America massed with Eurasia and Africa around a central sea.

The projections may prove entirely incorrect, but it is nonetheless remarkable to see the world thus transformed. It is a reminder of just how variable the world is over long time horizons.