Replication insight

Tonight I had the most beautiful insight into distributed revision management. Originally the CouchDb revision conflict management consisted of maintaining a linear list of revisions ids, such that every time a document is edited it generates a revision id that is added to the revision list. To detect conflicts in edits, it is simple enough to check if one document's revision list is a subset of another. If not, there's a conflict.

The problem is when a conflict is detected, what to do? If you want to just deterministically declare one a winner and jettison the loser then the problem is simple. However if you wish to preserve the losing conflict information so that it might be resolved later it becomes a much more difficult problem, the edge cases in a peer distributed model make it difficult. One approach is to deterministically generate a new conflict document, preserving the losing conflict information but now as a new and separate document. This is the approach Notes/Domino replication takes and it works ok in some applications, but it can cause problems.

What I realized tonight is that by converting the revision lists to a tree, then conflicts are simple tree merges (conflicts in the revisions lists becomes branches in the revision tree) and then by using a simple algorithm for deterministically selecting the winning leaf node, it makes distributed and deterministic conflict resolution and merging possible while still providing single document semantics. No separate document is necessary to preserve losing conflict data and all the edge cases work. I'm glossing over lots of details, but thankfully little existing CouchDb code needs to change to make this happen.

I know I'm still not explaining CouchDb in people terms (more like barely intelligible dorkinese), but this is stuff that has very real impact on how well it "just works" in the real world. I had to tell someone.

Posted November 8, 2006 12:33 AM

Comments

I've been watching too much election coverage... when I read your topic, I saw "Republican Insight." *sigh* :-)

Sean, November 8, 2006 1:15 AM

I'm turning dorkinese,
I think I'm turning dorkinese,
I really think so!

Dan Sickles, November 8, 2006 1:56 AM

If the system is peer-distributed, then what about conflicts within the tree itself? E.g., when two users on one server make a change to a single field, and two users on another server make changes to the same field. Notes has edit conflicts, and replication conflicts to handle this which is one of the reasons I've always disliked the document storage model for production data.

Michael, November 8, 2006 3:01 AM

I love the way Wikis handle conflicts and revisions. It really does not matter much which is the winner if there is an audit trail with diff highlighting.

Alan Bell, November 8, 2006 6:12 AM

with the Wiki style updates there is no semantic difference between editing a document normally and a conflict happening. When you post an updated version that new version "conflicts" with the existing version. It would be possible to keep all versions, perhaps purge old things on a schedule, or even keep everything as a stack of diffs. I think the Notes style conflict handling does make sense, but it breaks applications too often. I think that conflicts should not show up in computed tables, or when retrieving a document, but the prior versions should be retrievable with a url syntax below document level. Maybe http://couchserver/somedatabase/some_doc_name:$revisions
for a special computed table of revision information about that document including revisionID, datetime, username etc. and
http://couchserver/somedatabase/some_doc_name/$revisionID to retrieve a particular revision of that document.

Alan Bell, November 8, 2006 6:24 AM

Using a tree for the revision ids sounds good. I can see how that helps determining what revision differences across many replicas. From your description though, I'm not sure how you preserve the losing conflict information. Do you keep the losing information on the winning document, tagged against the revision tree?

On a related note, if you had a mechanism for tagging data against a revision, does that mean that you could have a built in revision control, whereby all revisions are stored. Obviously this would be a big overhead, and should only be offered as an option, but it would be extremely useful on many use cases.

I was wondering also if you had any more thoughts on how you see Couch in the market place. You have said before that you see Couch as an end user technology that can allow end users to build useful tools applications themselves. You've also posted before on the Turing Tarpit of having a framework so flexible that you get no benefit from using it. As a Domino developer, Couch is sounding very interesting and I'm planning to have a good dig around it. How do you see the a role of application developer using Couch. Any plans to offer other language APIs alongside the REST one?

Kerr, November 8, 2006 7:18 AM

"Do you keep the losing information on the winning document, tagged against the revision tree?"

Correct, the losing information (and actually the leaf nodes of all revision branches) are kept around and merged into a single document entity that's replicated around.

The simple algorithm for selecting the winning revision (the one that is actually "displayed" in the database) is to find the non-deletion revision node with the longest path from the root. (This causes deletion conflicts to always "lose" and ensures revisions with the most edits win. It's an arbitrary criteria, but it is determinstic and mostly works like people would expect). So even though a winner is selected, the losers are kept around to be resolved.

Later, the conflicts can be merged/resolved by a tool or a human, and to clear out conflict branch you simply append a deletion revision to end of that losing branch, thereby discarding the conflicting data (which could have been merged onto the winning branch or discarded completely).

Damien, November 8, 2006 9:30 AM

how would this interact with multiple saves of a document, like a periodic autosave?
so for example I open up version 2 of a document and start editing it, every 10 minutes a "save" is performed and after an hour I save and close what I am doing. Would all the interim saves be branches of the tree?

Alan Bell, November 8, 2006 10:07 AM

Multiple saves work just fine. Each save will be a new revisionid node added to the main revisionid node, so it becomes the new winning leaf.

For the client, each time to you save the document it just needs to keep track of the last revisionId. When it saves a new revisionid is created and sent back to the client, the old data is discarded (though recoverable, until database compaction old revisions are still available on disk). As long as the client keeps track of the last revision id everything works.

If someone edits the document from underneath, then the client gets back a conflict error when it tries to save. The client can do several things: it can update anyway and allow the update to become a conflict loser, it can force an update to be the conflict winner, it can merge the conflicts, or it can ask the user what to do.

Damien, November 8, 2006 3:30 PM

I am not sure how you are going to handle attachments, will they be part of the document and subject to the same revision mechanism, or will they be separate, and associated with the document revision tree as a whole?

Alan Bell, November 8, 2006 3:38 PM

Alan, regarding your earlier comments about the prior revisions, CouchDb will make the old revision data retrievable as long as they are are still on disk. So the revision history will maintain the whole revisionID tree, but only the data in the last revision of the branches (the winner and the conflicts) are actually guaranteed to be on disk.

Damien, November 8, 2006 3:39 PM

Conceptually attachments are handled uniformly with the conflict data. The losing conflict document's attachments are also available like everything else.

It just works.

Damien, November 8, 2006 3:41 PM

Post a comment




Remember Me?

(you may use HTML tags for style)