October 28, 2012

How to achieve lots of code?

I get mail.

hello damien

I read about you from a book on erlang.

Your couchdb application is really a rave.

please can you help me out ,i've got questions only a working programmer can answer.

i'm shooting now:

i've been programming in java for over 3 years

i know all about the syntax and so on but recently i ran a code counter on my apps and
the code sizes were dismal. 2-3k

commercial popular apps have code sizes in the 100 of thousands.

so tell me- for you and what you know of other developers how long does it take to write those large applications ( i.e over 30k lines of code)

what does it take to write large applications - i.e move from the small code size to really large code sizes?

thank you.

Never try to make your project big. Functionality is an asset, code is a liability. What does that mean? I love this Bill Gates quote:

Measuring programming progress by lines of code is like measuring aircraft building progress by weight.

More code than necessary bloats your app binaries, which means larger downloads and more disk space used. It also uses more memory and slows down execution through more frequent cache misses. It makes the code harder to understand and harder to debug, and it will typically have more flaws.

CouchDB, when we hit 1.0, was less than 20k lines of production code, not including dependencies. This included a storage engine (crash-tolerant, highly concurrent MVCC with a pauseless compactor), a language-agnostic map/reduce materialized indexing engine (also crash-tolerant, highly concurrent MVCC with a pauseless compactor), master/master replication with conflict management, an HTTP API with a security model, and a simple JS application server.

The small size is partly because it was written in Erlang, which generally requires one-fifth or less of the code of the equivalent in C or C++, and also because the original codebase was mostly written by one person (me), giving the design a level of coherency and simplicity that is harder to accomplish -- but still very possible -- in teams.

Tests are different. Lines of code are more of an asset in tests. More tests (generally) mean more reliable production code. Tests also document code functionality in a way that can't get out of sync the way comments and design docs can (out-of-sync documentation is worse than no documentation), and they don't slow down or bloat the compiled output. There are caveats to this, but generally more code in tests is a good thing.
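As a small illustration of tests acting as documentation, here is a sketch in Python. The `slugify` helper and the test names are made up for this example and have nothing to do with CouchDB. Each test name states one behavior, and each assertion fails the moment the code stops behaving that way.

```python
import re


def slugify(title: str) -> str:
    """Turn a post title into a URL-friendly slug (hypothetical helper)."""
    # Keep only runs of lowercase letters and digits, then join them with hyphens.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


# Each test reads like a line of documentation, but unlike a comment
# it breaks loudly if the behavior it describes ever changes.
def test_lowercases_and_joins_words_with_hyphens():
    assert slugify("How to Achieve Lots of Code") == "how-to-achieve-lots-of-code"


def test_drops_punctuation():
    assert slugify("Code: asset, or liability?") == "code-asset-or-liability"


def test_empty_title_gives_empty_slug():
    assert slugify("") == ""
```

The three tests are longer than the function they cover, and that is fine: none of it ships in the production build.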

Also, you can go overboard trying to make code short (CouchDB has some WTFs from terseness that are my fault). But generally you should try to make code compact and modular, with clear variable and function names. Early code should be verbose enough to be understandable by those who will work on it, and no more. You should never strive for lots of code; instead, you want reliable, understandable, easily modifiable code. Sometimes that requires a lot of code. And sometimes -- often for performance reasons -- the code must be hard to understand to accomplish the project goals.
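To make the terseness trade-off concrete, here is a hedged sketch in Python. Everything in it (`f`, `Task`, `MAX_SHOWN`, `next_tasks_to_show`) is invented for illustration. Both functions do the same thing; the second costs a few more lines and can be read without guessing.

```python
from typing import List, NamedTuple


# Too terse: what are x[1] and x[2], and why 3?
def f(xs):
    return [x for x in xs if x[1] and not x[2]][:3]


class Task(NamedTuple):
    """A hypothetical work item, used only for this example."""
    title: str
    is_assigned: bool
    is_done: bool


MAX_SHOWN = 3  # assumed display limit, named so the reader need not guess


# Slightly longer, still compact, and the names carry the meaning.
def next_tasks_to_show(tasks: List[Task]) -> List[Task]:
    pending = [t for t in tasks if t.is_assigned and not t.is_done]
    return pending[:MAX_SHOWN]
```

The readable version is not padded with extra code; it only spends lines where they buy understanding.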

But often, with careful thought and planning, you can write simple, elegant, efficient, high-quality code that is easy to understand. That should be your goal.