P16: a blog by Matt Kangas
08 Jul 2008

Two new scalability-enhancing projects to keep an eye on

Hola! I hope everyone in the USA had a pleasant July 4th weekend. The weather in NYC was too overcast and drizzly to go to the beach, so I spent it bicycling around Brooklyn with my brother and some friends.

Here are two newly released open source projects that look quite interesting for building scalable web services, if for no other reason than the big names behind them.

1) "The Cassandra Project" by Facebook. This is a structured peer-to-peer storage system, roughly along the lines of Google's Bigtable, Amazon's Dynamo, and Apache HBase. HBase is open source but embryonic, while Bigtable and Dynamo are proprietary. It's nice to see another open-sourced option in this space.

A nice summary of the initial open-sourced code: Amie Street Dev Blog: Cassandra source on Google Code

2) "Graphite" by Orbitz is a "highly scalable real-time graphing system". It began life using RRD as its back-end, but Orbitz decided to write a new storage engine from scratch for better scalability. Their FAQ page says the graphs are "very real-time", and their current production system can handle "approximately 160,000 distinct metrics per minute" on a two-server cluster.
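
To put that FAQ figure in more familiar units, here is a quick back-of-the-envelope sketch in Python. It assumes each distinct metric is updated once per minute and that load is split evenly across the two servers; neither assumption comes from the Graphite FAQ, and the function names are just illustrative.

```python
# Figures quoted from the Graphite FAQ, as cited above.
METRICS_PER_MINUTE = 160_000
SERVERS = 2


def per_second(per_minute: int) -> float:
    """Convert a per-minute rate into a per-second rate."""
    return per_minute / 60


def per_server_per_second(per_minute: int, servers: int) -> float:
    """Per-second rate on each server.

    Assumption (not from the FAQ): load is balanced evenly across servers.
    """
    return per_second(per_minute) / servers


if __name__ == "__main__":
    cluster = per_second(METRICS_PER_MINUTE)
    single = per_server_per_second(METRICS_PER_MINUTE, SERVERS)
    print(f"about {cluster:.0f} updates/s cluster-wide")
    print(f"about {single:.0f} updates/s per server")
```

Under those assumptions the cluster is absorbing a few thousand data points every second, which gives a rough sense of what the new storage engine has to keep up with.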