P16: a blog by Matt Kangas
01 Aug 2008

More scalable datastores: Scalaris, nmdb

Scalaris is a new distributed key-value datastore. Its code was recently posted to Google Code.

It was announced and demoed at Erlang eXchange 2008. Joe Armstrong (father of Erlang) later wrote on his blog: "my gut feeling is that what Alexander Reinefeld showed us will be the first killer application in Erlang."

Armstrong's summary:

  1. They make a peer to peer system based on the chord algorithm
  2. They added a replication layer using the paxos algorithm
  3. They added a transaction layer
  4. They injected the wikipedia
  5. It went faster than the existing wikipedia

    "Applied to Wikipedia, Scalaris serves 2,500 transactions per second with just 16 CPUs, which is better than the public Wikipedia."

One downside: it's presently a memory-only store, so it's quite useless for permanent data storage. (One full power-outage in a data center will obliterate all of your data. Doh!)
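For readers who haven't met Chord (step 1 of Armstrong's summary), here is a minimal sketch of the core idea: nodes and keys are hashed onto the same identifier ring, and each key belongs to the first node at or after its position. This is my own illustration, not Scalaris code. The `Ring` class, `ring_id` helper, and node names are hypothetical, and real Chord routes through finger tables rather than keeping a global sorted list of nodes.

```python
import bisect
import hashlib


def ring_id(name: str) -> int:
    """Map a node name or key onto a 160-bit ring position using SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical global view of a Chord-style ring (illustration only)."""

    def __init__(self, nodes):
        # Sort nodes by their position on the ring.
        self._points = sorted((ring_id(n), n) for n in nodes)
        self._ids = [point[0] for point in self._points]

    def successor(self, key: str) -> str:
        """Return the node responsible for key: the first node whose
        position is >= the key's position, wrapping around past the end."""
        i = bisect.bisect_left(self._ids, ring_id(key))
        return self._points[i % len(self._points)][1]


ring = Ring(["node-a", "node-b", "node-c"])
owner = ring.successor("Main_Page")
```

The nice property is that adding or removing a node only moves the keys adjacent to it on the ring. Every other key keeps its owner.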

nmdb is yet another distributed key-value store, this one implemented in ~5000 lines of C and using QDBM or Berkeley DB as the back-end store. It looks simple and stable. Major limitations: it's distributed, not replicated, so it is more like a persistent memcache (like Tugela and memcachedb). There is also a hard 64kB size limit on key+value packets.
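To make "distributed, not replicated" concrete, here is a rough sketch of what that means for a client: each key hashes to exactly one server, so if that server is gone, so is the key. This is not nmdb's actual client code. The CRC32 hash, the `pick_server` and `fits_in_packet` names, and the server names are my assumptions, and the size check ignores whatever header overhead the real protocol adds.

```python
import zlib

MAX_PACKET = 64 * 1024  # the 64kB key+value limit mentioned above


def pick_server(key: bytes, servers: list) -> str:
    """Choose the single server that holds key (assumed hash: CRC32)."""
    return servers[zlib.crc32(key) % len(servers)]


def fits_in_packet(key: bytes, value: bytes) -> bool:
    """Check key+value against the size limit, ignoring protocol headers."""
    return len(key) + len(value) <= MAX_PACKET


servers = ["db1", "db2", "db3"]  # hypothetical server names
home = pick_server(b"user:42", servers)

# With no replication, the key lives on that one server and nowhere else.
holders = [s for s in servers if s == home]
```

Anything bigger than the packet limit would have to be split by the application, or stored somewhere else entirely.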

As you might have guessed from my articles on this topic -- I am looking for a "Bigtable-like" datastore that I can recommend to clients. My criteria are:

I still haven't found anything I'd recommend. Dang it guys, finish one of these projects! :) Maybe I'll have to build something custom on top of MogileFS from scratch after all?