High Scalability has a great link to a video TechTalk with Cuong Do, YouTube’s engineering manager. He talks about the challenges YouTube faces (past and present) to meet it’s skyrocketing user demand, as well as the infrastructure that allows them to scale. I enjoyed the anecdotes: especially the frantic email sent at 2am alerting the dev team that they only had 3 days of storage left… I always thought Google/YouTube would be immune to emergencies like that… ignorance on my part
(requires Adobe Flash plugin… click HERE to watch it on YouTube)
I found this information interesting:
The application code is written mostly in Python (the web app is not the bottleneck… the database RPC is)
They use Apache for page content and lighttpd for serving video
HW RAID-10 across multiple disks was too slow. HW RAID-1 with SW RAID-0 was faster because the Linux I/O scheduler could see the multiple volumes and would therefore schedule more I/O
You can read a good summary of the talk HERE from the High Scalability website.
CIO has a good interview with Rob Reeg, president of Mastercard’s Global Technology and Operations. He discusses their infrastructure and processing architecture. Definitely worth looking at if you’re interested in how credit card transactions are processed.
Interviewer: How big of an infrastructure do you have to support and maintain? It must be huge.
Reeg: Actually from a pure server footprint standpoint… we probably have fewer actual footprint servers because of techniques like virtualization that help us leverage one box to do multiple things.
Where it gets interesting is philosophically: We try to put [transaction] processing as close to our customers, the banks, as possible. When we talk about the global network, we have small servers that sit with the bank customers that connect to our network. What it does is it gives us intelligence there at the end of the network. So as a transaction comes through, we can take a look at that transaction and decide how do we best process that transaction for the benefit of all those four parties in the model.
As to processing, the majority of transactions we’re looking at relate to how do we process them as fast as possible in the most accurate way. The way to do that is by peer to peer: If you’re using your card in Europe, in London, say, and you swipe your card as you check out of hotel, we can route that transaction to the hotel’s acquiring bank in London directly to your issuing bank and get that message back for approval without ever going through St. Louis or some big data center in the middle of all that.
Tom Distler is a senior software engineer at Pelco. He has a beautiful wife and a baby on the way. He enjoys playing the drums and writing about himself in the 3rd-person (not really :-)). Read more...