CouchDb updates are slow

I'm doing some very early performance analysis of CouchDb, and it looks like my design is going to be a little slow for updates. Compared to how quickly Lotus Notes/Domino NSF databases can commit new documents, CouchDb will take about twice as long, maybe more. I thought my design would be significantly slower; now I know for sure.

The reason is that NSF uses a single flush for atomic updates, while CouchDb uses a double flush. By "flush", I mean the database code forces any memory-buffered data it wrote to the database file to actually be written to disk. Flushes are expensive operations and generally something to avoid if possible.
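For concreteness, here's a minimal sketch of what I mean by a flush, using POSIX-style calls from Python (the path and data are just placeholders):

```python
import os

def append_and_flush(path, data):
    # Append some bytes to a file, then force them onto the physical disk.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT)
    try:
        os.write(fd, data)   # the data may still be sitting in OS buffers
        os.fsync(fd)         # the "flush": block until the disk actually has it
    finally:
        os.close(fd)
```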

The single flush scheme NSF uses is pretty fast, but it has a hidden cost. I'll explain how it works at a high level.

When NSF opens a database for writing, it sets an "Open" flag in the header and writes it back to disk. Then, as documents get added to the database, NSF writes each document to the file and flushes it to disk before returning to the caller; it also updates the in-memory copy of the file header and other cached data structures, but those are not written to disk. This happens for any number of updates while the database is still open. Once the caller is done with the database and closes it, NSF clears the "Open" flag, writes the up-to-date header and other structures to the database file, and flushes everything to disk. At that point everything on disk is in a completely consistent state. This is an efficient way to achieve atomic updates, but there is a hidden cost: the fixup.
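Here's a toy sketch of that scheme, under a bunch of simplifying assumptions of my own (a reserved header block, length-prefixed records, a single writer). It isn't how NSF is actually coded, just the shape of the idea:

```python
import os, struct

OPEN_FLAG, CLEAN_FLAG = b"\x01", b"\x00"
HEADER_SIZE = 4096          # assume the first block is reserved for the header

class NsfLikeDb:
    def __init__(self, path):
        # (A real implementation would first check whether the flag was
        # already set and run a fixup; more on that below.)
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT)
        self.end = max(os.lseek(self.fd, 0, os.SEEK_END), HEADER_SIZE)
        self.doc_count = 0                        # header state cached in memory only
        os.pwrite(self.fd, OPEN_FLAG, 0)          # mark the file "Open" on disk
        os.fsync(self.fd)

    def add_document(self, doc_bytes):
        record = struct.pack("<I", len(doc_bytes)) + doc_bytes
        os.pwrite(self.fd, record, self.end)      # write the document...
        os.fsync(self.fd)                         # ...the single flush per update
        self.end += len(record)
        self.doc_count += 1                       # header changes stay in memory

    def close(self):
        header = CLEAN_FLAG + struct.pack("<Q", self.doc_count)
        os.pwrite(self.fd, header, 0)             # up-to-date header, "Open" flag cleared
        os.fsync(self.fd)                         # now everything on disk is consistent
        os.close(self.fd)
```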

Fixups are necessary any time a database is opened and it already has the "Open" flag set in the header. Usually this is the result of a crash or a power failure the last time the database was opened, and it means certain data structures may not be up-to-date and consistent. When NSF encounters a database like this, it must read significant portions of the database file and examine the data structures in question, checking for invariants in the data structures and making sure they agree with other related structures. When it encounters inconsistencies it corrects them. With a large database, this can be very time consuming and very disk I/O intensive.
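Continuing the toy sketch above (and keeping the assumed length-prefixed record layout, which is my invention, not NSF's), a fixup pass looks roughly like this: scan every record, rebuild the header state from what's actually there, and only then clear the flag:

```python
import os, struct

OPEN_FLAG = b"\x01"
HEADER_SIZE = 4096

def fixup(fd):
    # Walk the whole file, re-deriving header state from the records actually
    # present. The cost grows with the size of the database, not with the
    # amount of damage.
    file_end = os.lseek(fd, 0, os.SEEK_END)
    pos, doc_count = HEADER_SIZE, 0
    while pos + 4 <= file_end:
        (length,) = struct.unpack("<I", os.pread(fd, 4, pos))
        if length == 0 or pos + 4 + length > file_end:
            break                                  # torn write: stop at the last good record
        doc_count += 1
        pos += 4 + length
    os.pwrite(fd, b"\x00" + struct.pack("<Q", doc_count), 0)   # rebuilt header, flag cleared
    os.fsync(fd)

def open_database(path):
    fd = os.open(path, os.O_RDWR)
    if os.pread(fd, 1, 0) == OPEN_FLAG:            # the last writer never closed cleanly
        fixup(fd)
    return fd
```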

Now, this is a simplification of the real NSF design, but conceptually that's how it works. The fixup approach is often a good compromise: it offers good performance in the average case, and provided you don't crash too often and your databases aren't huge, the actual fixups aren't that big a deal. But when you do crash often or your databases are huge, the fixup becomes a big availability problem; you might have real trouble getting to your data right when you need it.

To alleviate this, Lotus Domino NSF also offers transaction journaling. I won't go into the details of that, but provided you dedicate a disk to the journal file, it offers the fastest update performance. And since it eliminates the fixup, it's also a great high-availability option. However, it increases code complexity and performs poorly if used without the dedicated disk.

Instead of the fixup approach, CouchDb uses a simple double-flush design that always keeps the database file in a consistent state. When writing a document into the database, CouchDb first flushes the actual document data and meta-data structures to disk; like NSF, it doesn't write the file header during this first flush. Then, once that data is definitely on disk, it writes the file header and flushes that to disk too, and only then does it return to the caller. CouchDb has a zero-overwrite design, meaning nothing in the database file is ever overwritten or modified after the initial write (except the file header, so maybe it should be called a one-overwrite design? meh). That means no crash will ever leave the database in an inconsistent state (assuming the underlying file system is robust and doesn't corrupt already-written data): incomplete writes are simply data and metadata the header can't see, and all the old data is exactly as it was before. A fixup is never necessary, so CouchDb should be highly crash tolerant.
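Here's the same toy setup reworked in the double-flush, append-only style. Again, this is my own simplified sketch with an invented header layout (the header just stores the committed end-of-file), not the actual CouchDb code:

```python
import os, struct

HEADER_SIZE = 4096          # assume a reserved header block, as in the earlier sketch

class CouchLikeDb:
    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT)
        header = os.pread(self.fd, 8, 0)
        # The header says where committed data ends; anything written past that
        # point by an interrupted update is simply invisible.
        self.end = struct.unpack("<Q", header)[0] if len(header) == 8 else HEADER_SIZE

    def add_document(self, doc_bytes):
        record = struct.pack("<I", len(doc_bytes)) + doc_bytes
        os.pwrite(self.fd, record, self.end)        # append only; never touch existing data
        os.fsync(self.fd)                           # flush #1: the document data is on disk
        new_end = self.end + len(record)
        os.pwrite(self.fd, struct.pack("<Q", new_end), 0)
        os.fsync(self.fd)                           # flush #2: the header now points at it
        self.end = new_end
        # A crash between the two flushes just leaves data the old header never
        # referenced; nothing already committed is ever overwritten.
```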

But there is a write-performance cost: CouchDb will be much slower at adding documents than Notes, and that's a bad thing…right? Under certain circumstances it is, but generally speaking, most real-world Notes and Domino NSF performance problems stem from indexing already-written documents to create "views", not from writing documents into the database.

However, CouchDb also offers a way to batch a large number of updates into a single commit when high-performance updates are needed. When doing updates this way, CouchDb's performance increases dramatically (like 100-fold). This can make copying or modifying large numbers of documents much quicker. But if a crash happens in the middle of these updates, some or all of the new documents might not be available the next time the database file is opened. NSF actually has a feature like this too, but I don't know how big a performance boost it gets from it.
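In the sketch above, batching amounts to writing many records and paying for the two flushes only once per batch; roughly like this (again hypothetical, just to show where the savings come from):

```python
import os, struct

def batch_commit(fd, end, docs):
    # Append every record first; nothing is committed yet.
    pos = end
    for doc_bytes in docs:
        record = struct.pack("<I", len(doc_bytes)) + doc_bytes
        os.pwrite(fd, record, pos)
        pos += len(record)
    os.fsync(fd)                                    # one data flush for the whole batch
    os.pwrite(fd, struct.pack("<Q", pos), 0)
    os.fsync(fd)                                    # one header flush commits all of them at once
    return pos                                      # the new end-of-file, cached by the caller
```

If a crash lands anywhere before that final header flush, the old header still points at the pre-batch state, which is why some or all of the batched documents can go missing after a reopen.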

So CouchDb updates are slow, but hopefully I've chosen a design that is better optimized for "indexing": the creation and incremental update of computed tables (CouchDb's close analog to Notes views). I'm hoping some related decisions I'm making pay off. I haven't yet gotten the code to the point where I can validate those performance characteristics, but if I was right about updates being slower, maybe I'll be right about indexing being faster. (crosses fingers)

Posted October 29, 2005 3:00 PM