November 27, 2006

In Mountain View tomorrow

I'm flying out to see a certain search engine giant so I'll be in the Mountain View area tomorrow (Tuesday) afternoon. Anyone want to have dinner or have something cool to do?

Email me or call me on my cell: 704-323-1125


CouchDb 0.6.0

CouchDb 0.6.0 is now available on the project download page. Thanks to Jan Lehnardt and William Beh for their contributions to the project.

This has the new replication stuff in it, which I'll be writing about more later. For now, here's what's new from the readme.txt file:


The replication facility is now available in CouchDb. To replicate a database from the Erlang console:

> couch_rep:replicate( "local_database_name_a", "local_database_name_b").

You can also specify an HTTP URL to a remote database:

> couch_rep:replicate( "local_database_name_a", "http://remoteserver:8888/remote_database_name_b/").

Either or both databases can be remote:

> couch_rep:replicate( "http://remoteserver:8888/remote_database_name_a/", "http://remoteserver:8888/remote_database_name_b/").


CouchPeek can now create and delete documents and view their XML source, as well as display tables.

Build from Source

The process of building from source has been made easier and less error-prone.

One more thing: this release has a new disk format that will not work with database files from older versions of CouchDb. Beta versions of CouchDb will likely have database migration.


Jan has built a new Mac OS X package, also available from the downloads page.


November 21, 2006

CouchDb Demo Site

Jan announces the new CouchDb PHP Demo Site he's created:

In related news, we've been heads-down busy trying to get the next alpha release (0.6.0) out. In addition to lots of build tweaks by Jan to make building from source easier, this release has the new-style replication system, which dawned on me shortly after I got the old-style replication working. I'm really excited about it. It's a big improvement over the Notes replication system, and it gives CouchDb a better foundation to serve both as a distributed document revision control system and as a general database server. Anyway, the new replicator is now working and I'm just HTTP-enabling it. The next release is coming soon.


November 9, 2006

How Not to Pitch to Y Combinator

I've gotten a few emails over the past couple of days from people worried I was disappointed or discouraged about the outcome of the Y Combinator thing. Disappointed? A little. Discouraged? Not in the least. Why, this is just the first in a long string of rejections I have planned.

Anyway, I was just contemplating how damn funny it was, me trying to explain this thing to Paul Graham and company.

Picture this: four people who knew nothing about CouchDb ahead of time ("oh? it's database software?"), my pitch is about as slick and polished as a Ford Pinto, and I've got only 15 minutes to not only explain it, but to sell it to them as investors.

I was probably a textbook case of what not to do. Really, when I think about it, it's hysterically funny shit.

I first tried to write this advice as though you were really trying to mess up the interview, but that was confusing in a very "Don't do what Donny-Don't does…" way. So instead this is conventional advice you should actually try to follow.

Don't Assume They Read Up Ahead of Time

I figured that because I was accepted for an interview, they must have read up on my project and done research. A few seconds into the interview it became apparent that wasn't the case.

Just because you are accepted for an interview, do not assume they know anything at all about your idea. All the stuff you put online, they didn't look at it, or if they did they didn't remember it. The amount of time they allot for you to pitch is really all the consideration you're going to get. If you didn't explain it, don't expect them to figure it out.

Here's the deal: they do one round of funding every six months. Apparently they do all the interviews to fill a fixed number of funding slots on a single day. I don't know how many interviews they did that day, but Paul mentioned something about thirty more rejection calls he had to make. It could be they interviewed 30+ groups on a single day, it could be he was calling each person involved in the groups, or it could be hyperbole meaning he had a lot of phone calls.

But regardless, they can't spend a bunch of additional brainpower on your pitch. Think about it. In addition to evaluating a ton of pitches in a single day, they are also doing all the other stuff involved with their currently funded startups. Even if they did read up ahead of time, they probably forgot almost everything by the time you pitch to them.

This is what you must keep in mind when presenting, because more than with other investors, you are really going to have to work hard to sell your project if it's difficult to explain. They simply won't have the time to figure it out on their own.

Be Prepared to Be Derailed

I was trying to show them a demo, and they kept asking me questions and I kept getting sidetracked and I think I barely showed them anything.

Practice your demos for live people and have them ask you questions as you go. Figure out how to keep the focus on the demo.

And of course expect them to ask the wrong questions. And not the right-wrong questions you think they'll ask, but wrong-wrong stuff where you can't imagine why they asked it.

Be a Jobs

Don't be like me. Have a slick polished presentation. Give this presentation to as many people as you can. Note the questions they ask, note what confuses them. Practice practice practice practice. Don't just wing it!

Paul Graham says I'm a Woz. No excuses: if I'd just spent some serious time on presentation and worked real hard on it, I too could be a Jobs. So could a lot of people with the right amount of effort.

And that's not really the point. What they need to know is that you can communicate this stuff. Even if your idea is one they really "get", they are also evaluating you on your ability to push the project forward with your communication skills. They'll only give you so much leeway, because they are far from the last people you'll need to sell your project to.

It's a Flawed Process. Tough Shit, Fella

I've gotten a few emails from other rejected applicants, some of whom seemed pretty upset with how the process works. That's understandable. The problem is not that the interview process they set up is bad. It's not. It's flawed. But it might also be brilliant.

VC funding is a numbers game. They recognize that and are trying to come up with a better system. And for certain types of ventures they may have found it.

That may not help you in your exact case. Tough shit. If this was your only option for funding, you really shouldn't be trying to launch a venture. The truth is, they were quite generous and kind to deal with in this process. Way better than you'll get just about anywhere else. Really.

If they pass on your project, just take it as an opportunity to learn as much as you can from the experience and polish your act for the next investors. I definitely got the impression the rejections are hard for Paul too. So if you're rejected, try to thank him; he's actually out there trying new ways of getting funding to deserving people. He might be a rich internet mogul but I don't think he's a greedy one.


November 8, 2006

Replication insight

Tonight I had the most beautiful insight into distributed revision management. Originally the CouchDb revision conflict management consisted of maintaining a linear list of revision ids, such that every time a document is edited it generates a revision id that is added to the revision list. To detect conflicts in edits, it is simple enough to check if one document's revision list is a subset of another. If not, there's a conflict.
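
A minimal sketch of that check, in Python rather than CouchDb's Erlang. It is not CouchDb's actual code: the function names and the `r1`, `r2` style revision ids are made up for illustration, and "subset" is interpreted here as "leading portion of the other list", which is an assumption about how ordered histories would be compared.

```python
def is_prefix(shorter, longer):
    """Return True if `shorter` is a leading slice of `longer`."""
    return len(shorter) <= len(longer) and longer[:len(shorter)] == shorter


def in_conflict(revs_a, revs_b):
    """Two revision lists conflict when neither one extends the other."""
    return not (is_prefix(revs_a, revs_b) or is_prefix(revs_b, revs_a))


base = ["r1", "r2"]  # hypothetical shared history
print(in_conflict(base, base + ["r3"]))             # False: one copy is simply ahead
print(in_conflict(base + ["r3a"], base + ["r3b"]))  # True: both edited after r2
```

When one list simply continues the other, the longer one can replace the shorter one with nothing lost. A conflict only shows up when both copies were edited independently after their last shared revision.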

The problem is what to do when a conflict is detected. If you want to just deterministically declare one a winner and jettison the loser, then the problem is simple. However, if you wish to preserve the losing conflict information so that it might be resolved later, it becomes a much more difficult problem because of the edge cases in a peer distributed model. One approach is to deterministically generate a new conflict document, preserving the losing conflict information but now as a new and separate document. This is the approach Notes/Domino replication takes, and it works ok in some applications, but it can cause problems.

What I realized tonight is that by converting the revision lists to a tree, conflicts become simple tree merges (conflicts in the revision lists become branches in the revision tree). Then, by using a simple algorithm for deterministically selecting the winning leaf node, distributed and deterministic conflict resolution and merging become possible while still providing single document semantics. No separate document is necessary to preserve losing conflict data and all the edge cases work. I'm glossing over lots of details, but thankfully little existing CouchDb code needs to change to make this happen.
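
To make the idea concrete, here is a rough Python sketch of a revision tree as nested dicts, with a merge and a winner pick. Everything in it is an assumption for illustration: the nested-dict layout, the function names, and especially the winner rule (deepest leaf, ties broken by comparing paths), since the post only says the rule is simple and deterministic. The sketch also assumes the tree is non-empty.

```python
def merge_trees(a, b):
    """Return a new tree containing every revision path from both inputs."""
    merged = {}
    for rev_id in a.keys() | b.keys():
        merged[rev_id] = merge_trees(a.get(rev_id, {}), b.get(rev_id, {}))
    return merged


def leaf_paths(tree, prefix=()):
    """Yield the root-to-leaf path (a tuple of revision ids) for every leaf."""
    for rev_id, children in tree.items():
        path = prefix + (rev_id,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


def winning_leaf(tree):
    """Pick the deepest leaf; break ties by comparing the paths themselves."""
    return max(leaf_paths(tree), key=lambda path: (len(path), path))[-1]


# Two replicas that diverged after the hypothetical revision r2.
replica_a = {"r1": {"r2": {"r3a": {}}}}
replica_b = {"r1": {"r2": {"r3b": {"r4b": {}}}}}

merged = merge_trees(replica_a, replica_b)
assert merged == merge_trees(replica_b, replica_a)  # merge order does not matter
print(winning_leaf(merged))  # r4b
```

Because the merge is a plain union of paths, every replica that has seen the same edits ends up with the same tree and therefore picks the same winner without any coordination. The losing branch (`r3a` here) stays in the tree, so nothing has to be copied out into a separate conflict document.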

I know I'm still not explaining CouchDb in people terms (more like barely intelligible dorkinese), but this is stuff that has a very real impact on how well it "just works" in the real world. I had to tell someone.


November 5, 2006

The Woz

Today I met with Y Combinator about getting funding for CouchDb. I'm sure you want to know, how did it go?

Well, 15 minutes simply isn't enough time to explain this stuff, and I don't know that I was able to explain any of what makes this technology compelling. They asked a lot of what seemed like rapid-fire questions and I felt I didn't answer any of them terribly well.

But one highlight was when Paul Graham turned to me and said I was a Steve Wozniak and that I need a Steve Jobs. Cool. Being a "Wozniak" is quite a compliment just about anytime. Except when doing demos and selling your project, in which case you *really* want to be Jobs.

So how did it go?

What am I, a mind reader? Come back tomorrow, I'll know more.

Just got off the phone with Paul Graham. They passed. Paul made the point once more that I'm the Woz and I need a Steve Jobs. He's wrong of course, not that he could possibly know it from that meeting.

This was my very first funding application, and on the plus side it hammered home what I need to focus on: presentation and presence. With a project this big I cannot focus on the technical aspects when pitching to money folks, even if they are technical. I have to be both grandiose and believable, infectious with enthusiasm without appearing infectious with disease, if you know what I mean.


November 2, 2006

15 minutes

Check out these two CouchDb demo applications Jan Lehnardt created and put online:
Sofr - A threaded discussion.
BugShrink - A simple bug database.

Update: The demos are offline for now; we're setting up a dedicated server to host the demos.

The PHP source is available from the project source control.

I'm going to show these as part of the CouchDb demo for Y Combinator. I want to show how easy it is to create these types of applications, but the killer feature is to show them replicating without any special design consideration.

The Y Combinator funding process is a long, drawn out and complicated one. It works like this:
1. I, like all accepted applicants, get a 15 minute interview to talk about my project and answer questions.
2. Later that day I might get a funding offer. I can accept or decline.

That's it. Yes, it is kinda crazy. But maybe in a good way.

So I know you are probably asking yourself, how can Damien possibly fill an entire 15 minutes talking about CouchDb? My plan so far:

I spend 5 minutes explaining CouchDb from a high level point of view. I'll draw a bunch of crap on a whiteboard and probably get real animated (Win, Lose or Draw, for a number of reasons, comes to mind).

Then show the demos and show them replicating.

Then present a big list of topics related to the project (reliability, performance, atomicity, security, scaling up, scaling down, portability, internationalization, missing features, features not demo'd, etc.) and answer as many questions as possible in the remaining time.

That's the plan. 15 minutes to explain an entire distributed database system.

How would you do it?