Robert Cringely published an article a couple of weeks ago in which, besides recounting an amusing anecdote about a 'pet psychic' he consulted, he covers several unrelated but interesting topics. One example:

First there is Google, which runs four enormous data centers around the world containing in excess of 10,000 servers. It is the largest Linux cluster of all, and is constructed entirely of generic beige box PCs interconnected by 10/100 Ethernet. These are not racks and racks of state-of-the-art blade servers, just el cheapo PCs. So the magic must be in the software.

Now here is the part that sticks in my mind: the fault tolerant nature of the cluster is such that if a machine fails, the other machines simply take over its functions. As a result, whenever a server fails at Google, THEY DO NOTHING. They don't replace the broken machine. They don't remove the broken machine. They don't even turn it off. In an army of drones, it isn't worth the cost of labor to locate and replace the bad machines. Hundreds, maybe thousands of machines lie dead, uncounted among the 10,000 plus.

We have reached the point where we are totally dependent on computers, yet the marginal cost of a computer -- at least for Google -- is nothing. This may be an historical first.


Posted by diego on April 21 2003 at 8:00 PM

