06 Aug 2008

Reply to Viktor, re "Tokyo Cabinet + Tyrant"

Viktor Sovietov left an interesting comment on my "More scalable datastores" post the other day:

Why do not try Tokyo Cabinet + Tyrant as per node storage? In combination with Scalaris and tcerl it could work quite well, I think.
Also, using distributed filesystem can be an option as well. I even prefer this approach, because it gives much better performance.

Well, I am currently using a distributed filesystem: MogileFS. The best thing about Mogile is that it's simple and it scales. But it stores its metadata in MySQL, which is ultimately a bottleneck. Also, in Mogile, each record is just a separate file on a filesystem; fetching one record requires one or two HTTP connections.

Right now I'm using Mogile to store images (photos). This fits nicely with the one-record = one-file paradigm. But I have another dataset which is ~95 GB and growing, append-only; record size is typically 1 kB. If I store these directly as individual files on a large disk, I'm going to get killed by the block size in most filesystems.

(I presently have 4 boxes with 6x 700 GB data drives each, using ext3 with a block/fragment size of 4096 bytes. So I'd waste 75% of my disk if I'm writing 1 kB files.)
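
The 75% figure is simple block arithmetic, and a rough sketch makes it easy to check. This assumes every record is exactly 1 kB (taken as 1024 bytes), that the filesystem allocates space in whole 4096-byte blocks, and that "95 GB" means 95 GiB; it ignores inode and directory overhead, so real usage would be somewhat worse. The function names are made up for illustration.

```python
import math


def allocated_bytes(record_size: int, block_size: int = 4096) -> int:
    """Bytes a file occupies when space is handed out in whole blocks."""
    return math.ceil(record_size / block_size) * block_size


def wasted_fraction(record_size: int, block_size: int = 4096) -> float:
    """Share of the allocated space that holds no data."""
    allocated = allocated_bytes(record_size, block_size)
    return (allocated - record_size) / allocated


# Assumed figures: 1 kB records, ~95 GB of data, 4096-byte blocks.
record_size = 1024
dataset_bytes = 95 * 1024**3
num_records = dataset_bytes // record_size

# Total space consumed if each record becomes its own file.
on_disk = num_records * allocated_bytes(record_size)

print(f"wasted per file: {wasted_fraction(record_size):.0%}")
print(f"space on disk:   {on_disk / 1024**3:.0f} GiB")
```

Under those assumptions, one file per record turns roughly 95 GB of data into about four times that much allocated disk.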

Tokyo Cabinet is an interesting suggestion. At first glance, it seems a viable alternative to BerkeleyDB -- promising performance close to cdb. Tyrant looks like a straightforward network interface to a DB instance, too.

I'm sure it's possible to weave these together with tcerl and make a new persistent backend for Scalaris. But would it be a net win? I am not interested in forking Scalaris; I'd rather see the Scalaris folks express an interest in developing a persistent backend, and giving the nod to this as a potential solution.

To me, an open-source project is only useful if, three years after I've launched a system, there is still a community of people talking about that project and available to help support it. I want to launch a system, then hand it off to someone else for the maintenance phase. If the project doesn't have "legs", then I will be the only person who can answer those support calls down the road, and I'd like to avoid that. :)