A Comfortable Couch

Tuesday, September 21, 2004

Couch Components

Here is a high level description of the Couch components. A couch system will consist of:
  • 2 or more Data Servers
  • 1 or more Map & Volume Servers
  • 1 or more Clients
A single machine can be data server, map server and client all in one, or any combination.

Data Server:
The data servers are storage nodes. In a couch network there must be at least 2 data servers, but Couch should be able to scale to hundreds of servers for a single network. They will have the simplest network interface. Mostly the RPCs will be limited to:
  • Store some piece of data
  • Retrieve some piece of stored data
  • Query for metadata about the stored data
  • Copy in a piece of data from another node
Also, should any data corrupt, it will identify the bad data and return an error to any clients who attempt to read the data, and it will alert the map server. It will use internal checksums to validate integrity. When corruptions occur clients can automatically failover to another server with a correct copy of the data.

Map & Volume Server:
The Map & Volume (MV) Servers act as a middle man between a client and the data servers.

It provides a representation of a virtual disk volume that the clients can query, like directory operations, for example. It also tells clients which nodes have a piece of data for reads and where to store data for a write. It will also provide additional functions such as a file locking service.

A MV server’s view of what the storage servers hold is a comprehensive cache of all its storage servers’ metadata. Everything it needs can be rebuilt from querying the storage servers for their metadata.

It will also manage rebalancing data throughout its Couch network as storage servers go down and new ones come online.

Reliability of the MV servers will be achieved with designated backup servers. The backup server’s job is to frequently and incrementally replicate in all the metadata from the storage servers and the primary server. If the primary goes offline, the next designated backup will take over and resume normal operation once it has completely synced up with all storage servers. Everything should be recoverable, including lock operations.

The clients are machines that have the Couch virtual disk volume mounted or installed as a drive. They won’t contain much data about the network, just caches mostly. Most operations, except read and write, will be delegated to the MV server.

For read/writes, the client asks the MV server for nodes it should read/write to. All writes will be sent to at least 2 storage servers, Raid 1 style. One benefit of Raid 1 storage is the ability to do read-aheads, although this only works for large sequential reads.

So that’s very high level overview of the system architecture. There is a TON of missing detail here, but much of this may change anyway as I begin development.


Anonymous said...

I'd like for you to comment on how your design compares to that of the google FS.


12:43 PMlink  

Post a Comment

<< Home