Sunday, December 13, 2009

Commodity Hardware

According to last week's Economist, the United States Air Force is buying 2,200 Sony PlayStation 3 units. The idea is to get cheap hardware (subsidized by Sony, as its actual revenue comes from selling the games), install Linux and have a super-powerful cluster at 10% of the normal cost. I had a similar idea last year - coming out of the economic boom left me with a number of spare notebooks, and with a number of customers who needed a server but, with the recession advancing, had less and less money for a new one. Out of necessity I set up temporary servers using my spare laptops. I did expect some problems - failing hard drives, performance issues and so on - but surprisingly enough, one year later not one of those laptop servers has failed. Not only that - I even found the performance on a par with my other regular servers.

That got me thinking... According to Moore's law, computing power is doubling fast enough to make last month's average laptop faster than last year's high-end server. Over the years I've worked with a number of servers and a number of brands - from Dell, through HP and Compaq, to NEC. Strangely enough, no matter how expensive the server (or any computer for that matter), the only sure thing is that it will break. E.g. I use a 17" MacBook Pro that cost me over $5,000. Yet everything (with the exception of the keyboard) has already failed, and I have spent more on repairs than the initial price. Just the display with the motherboard came to more than $4,000. The worst thing about it was that every time something happened, the notebook spent at least a month (up to 2 months) in the workshop. My old Dell notebook broke down just as much, and it used to be very frustrating to wait until the next business day for a technician to come to my office and fix it. But having to actually go to the workshop, spend 1 hour there + 1 hour on the way and have your bloody expensive notebook taken away for a month... Anyway, the point is that even my high-end servers fail. I once had 4 out of 5 hot-swap SCSI drives fail at the same time. Another time just 3 of them did. In the end I found it much more important to prepare for hardware failure than to try to prevent it.

Coming back to the cheap notebooks - you can get a reasonably good notebook with 4GB RAM for around $800. With Dell's next business day service (or even better, the 4 hour service) you pretty much don't have to worry about it for the next 2 or 3 years. Should anything happen, you can simply swap it for a spare one. This actually addresses a whole set of issues I face with conventional servers. Installation and troubleshooting usually require a senior person to go onsite, wasting at least half a day. The notebooks, on the other hand, can be pre-installed in the office and simply delivered to the customer by a junior staff member, who just needs to plug the notebook into the network and everything is done. I still remember moving 4U servers (54kg each) a few years back.

This obviously requires a proper backup strategy. The software part of it (i.e. what to back up, when and how, and the offsite strategy) is the same as on a regular server. For faster recovery on notebook servers I clone the drive to an external USB drive and also keep a copy of the backup on this external drive (pgpool-II is really great for this). The cloning can be automated on a monthly or weekly basis to keep a reasonably fresh image on the external drive. If the drive fails (by far the most common failure on my servers as well as notebooks), all that needs to be done is to replace the drive. This takes a couple of minutes on my Dell and Compaq notebooks, and even a non-technical person can do it. Once the system is back online, with everything installed and the backup conveniently on the drive, it takes less than 5 minutes to restore the database and deploy the latest revision of the system. And of course, that can be done from anywhere.
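For what it's worth, the periodic cloning could be scripted along these lines. This is only a rough sketch and not the exact setup I run: the device paths `/dev/sda` and `/dev/sdb`, the weekly threshold and the helper names are all assumptions made for illustration, and the script only prints the `dd` command instead of running it, because `dd` overwrites the target disk without asking.

```python
import datetime as dt

# Hypothetical device paths (assumptions); verify yours before cloning anything.
SOURCE_DISK = "/dev/sda"   # internal notebook drive
TARGET_DISK = "/dev/sdb"   # external USB drive


def clone_is_stale(last_clone, today, max_age_days=7):
    """Return True when no clone exists yet or it is at least max_age_days old."""
    if last_clone is None:
        return True
    return (today - last_clone).days >= max_age_days


def build_clone_command(source, target):
    """Build a dd invocation that copies the whole source disk onto target."""
    if source == target:
        raise ValueError("source and target must differ")
    return ["dd", f"if={source}", f"of={target}", "bs=4M", "conv=fsync"]


if __name__ == "__main__":
    # In a real setup this date would be read from a small state file.
    last_clone = dt.date(2009, 12, 1)
    if clone_is_stale(last_clone, dt.date.today()):
        # Printed rather than executed: running it needs root and wipes the target.
        print(" ".join(build_clone_command(SOURCE_DISK, TARGET_DISK)))
```

Run from cron once a day, something like this would refresh the image only when the last clone is a week old or more; switching to a monthly schedule is just a matter of changing `max_age_days`.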

I am not saying that notebook servers are fit for every situation - I am only saying that they're fit for most of our customers - situations where there's a limited number of users (typically around 30 - 50) and most of them are in the same location. While the internet connection for businesses in SG is as slow as it is expensive, it is still usable even for several offices accessing the system. I still find it hard to understand why, in such a high-tech place as SG, it costs around $100 a month to get 100MB downstream/10MB upstream for a residential connection, yet it costs over $200 to get 1.5MB down/374kB up for the office. There are so many initiatives for bringing internet to consumers - from free Wireless@SG, through iCell's lighting up East Coast Park and food courts and internet on the buses, to great quality and dirt cheap 3G internet (I pay $20 for 50GB a month) - yet there is very little for the content providers.

There's been a big push from the government and SingTel for cloud computing. I haven't really heard of any local cloud provider worth mentioning - there are several offering either ridiculously high prices or ridiculously low specs, or both. We've been using Rackspace's cloud servers ever since they became available - besides the lowest price, they offer really great support - yet I've been moving more and more customers away to the notebook servers - from the peak of 19 servers I now only have 4 left. The main reason is actually the price/performance ratio. My usual application requires (at least) a 1GB memory instance, the price of which comes to around S$70 (around 50 USD) a month, or around S$800 a year. The problem is that a 1GB instance is sufficient only for average performance - it is not enough during peaks - and even small performance hiccups during peaks result in great frustration for the users. As S$1000 can buy you a 4GB Dell with next day on-site service for 2 years, it's quite compelling - especially when budgets are tight.
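To put rough numbers on that comparison, here is a small back-of-the-envelope calculation. It uses only the figures quoted above (S$70 a month for the 1GB instance, S$1000 for the notebook with 2 years of service); the function names are made up for illustration, and the sketch deliberately ignores electricity, bandwidth and labour, so treat it as an estimate rather than a proper cost model.

```python
CLOUD_MONTHLY_SGD = 70      # 1GB cloud instance, per month (figure from the post)
NOTEBOOK_PRICE_SGD = 1000   # 4GB Dell with 2 years of next-day on-site service
SERVICE_MONTHS = 24         # length of the service contract


def cloud_cost(months, monthly=CLOUD_MONTHLY_SGD):
    """Cumulative cloud spend after the given number of months."""
    return months * monthly


def break_even_months(notebook_price=NOTEBOOK_PRICE_SGD, monthly=CLOUD_MONTHLY_SGD):
    """First month in which cumulative cloud spend reaches the notebook price."""
    return -(-notebook_price // monthly)  # ceiling division on integers


if __name__ == "__main__":
    print("Cloud, 1 year: S$", cloud_cost(12))                 # 840
    print("Cloud, 2 years: S$", cloud_cost(SERVICE_MONTHS))    # 1680
    print("Break-even after", break_even_months(), "months")   # 15
```

So under these assumptions the notebook pays for itself in a bit over a year, and over the 2-year service period it comes out about S$680 cheaper while having four times the memory.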