Document Oriented Development

Two nights ago, I was editing the "So what? Who cares? Why would I ever want to use CouchDb?"* section on the home page of the CouchDb Wiki. As I was feebly trying to explain what CouchDb is good for, the words "document oriented application" popped into my head. I immediately liked it. It felt like I finally had a term to concisely describe the sorts of applications CouchDb is made for.

Today, I decided to Google the term "document oriented". Turns out it's not new: here's an article I found, "Towards truly document oriented Web services", on the O'Reilly site. The article gives an example of a REST API that is similar to the one I will be exposing with CouchDb. Cool.

I think "document oriented development" may be a poorly served yet hugely important area of application development, particularly in storage and management. For document storage, you pretty much have two options in mainstream development: direct file system access and relational databases.

Traditional file based systems are simple enough; this is how most PC applications have dealt with documents for a long time. MS Office is a prime example: all documents are files. But the lack of reporting capabilities and concurrency control limits what can be done, particularly in web applications.

And relational databases? There is nothing "relational" about documents, yet the vast majority of document management systems are built on top of an RDBMS. Unless you normalize to the 4th normal form, you'll need a fixed document schema, which limits flexibility. But when you do normalize to 4th normal form, performance suffers. Badly. Not to mention that the SQL queries become unwieldy.
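
To make the fixed-schema point a little more concrete, here is a minimal sketch in Python. It is not CouchDb's API or anything close to it; the sample documents and the `find` and `field_names` helpers are made up purely for illustration. The idea is only that each document carries its own set of fields, and a query simply skips documents that lack the field in question, where a single relational table would need every column declared up front.

```python
# Hypothetical documents: plain dicts with no shared schema.
# Each one has only the fields that make sense for it.
documents = [
    {"type": "sticky_note", "text": "Call Bob"},
    {"type": "contract", "title": "Lease", "pages": 10000, "approved_by": ["alice"]},
    {"type": "memo", "title": "Budget", "author": "carol", "tags": ["finance"]},
]


def find(docs, **criteria):
    """Return the documents whose fields equal every given criterion.

    A document that lacks one of the requested fields simply does not match.
    """
    return [
        doc
        for doc in docs
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


def field_names(docs):
    """Collect, in sorted order, every field name used by any document."""
    return sorted({key for doc in docs for key in doc})


contracts = find(documents, type="contract")
by_carol = find(documents, author="carol")
all_fields = field_names(documents)
```

Forcing these three documents into one table would take a column for every name in `all_fields`, most of them empty on any given row. That is the flexibility problem in miniature.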

XML databases are meant to solve these sorts of problems. There is even a standardized query language for them: XQuery. XML databases are great if you want to think of everything in terms of XML. But from what I've seen, XML databases will simplify development only if your data is already XML. Even then, I'm not so sure.

It seems ridiculous that there aren't more mainstream tools to deal with this style of development. Lotus Notes got so much of this right over 15 years ago, and it's still unique in its capabilities.

Define It?

I'd like to come up with a good definition of document oriented development, but the idea is still pretty nascent in my brain. This is what I wrote on the wiki to describe the applications:

A typical document oriented application in the real world, if it weren't computerized, would consist mostly of actual paper documents. These documents would need to get sent around, edited, searched, photocopied, approved, pinned to the wall, filed away, etc. They could be simple yellow sticky notes or 10000 page legal documents. Not all document-oriented applications have real world counterparts.

The Wikipedia has a good definition of document:

A document contains information. It often refers to an actual products of writing and is usually intended to communicate or store collections of data. Documents are often the focus and concern of Administration.

Documents could be seen to include any discrete representation of meaning, but usually it refers to something like a physical book, printed page(s) or a virtual document in electronic/digital format.

Hmmm... getting closer.

"Document Oriented Development" - By Ben Batchelder

Anyone want to take a crack at a definition of document oriented development? Or am I all wrong, and there's nothing particularly special about being "document oriented"?

* (that section heading, along with a bunch of others, was added by Jeff Atwood of Coding Horror. Thanks Jeff).

Posted May 31, 2006 12:18 PM