December 31, 2005

Abandoned writings

I was looking over the draft posts I have in my blog system, and I came across this beauty from last July. I don't even remember writing it, but I wish I had finished it. It really seemed like I was onto something, like a great mystery of the universe on the verge of being revealed.

Oh well, here it is in its unfinished glory:

---
A Buffer Too Far

Ok, so I'm writing CouchDb in C++, and lately I'm starting to question that decision. The reason I'm writing it in C++ is because it makes it easier to embed and integrate with other products/languages/devices.

Ok, so I'm trying to use the STL more and invent my own code less. And I'm realizing it's fucked. Or maybe I'm just an idiot. Nah, I'm pretty sure it's fucked.

Ok, so I have a type structure like this:
vector< pair<StreamPointer, vector<BYTE> > > x;

Now, for those that can't read this monstrosity above (which I really wrote in CouchDb), it says:

Make a thing called x. That thing will be a vector (a resizable array) that holds pair values, and those pair values each will hold a StreamPointer and also a vector that holds bytes.

For those who are still with me, that's ugly.

So I have an alternative, I can typedef stuff to make the declaration cleaner:

typedef pair<StreamPointer, vector<BYTE> > StreamPointerBufferPair;
...
vector<StreamPointerBufferPair> x;

Now, the line above says:
Make a thing called x. That thing will be a vector (a resizable array) that holds StreamPointerBufferPair values.

StreamPointerBufferPair is a bit, eh, um....wordy. I guess. It's just that it's long. But it's sort of clear what it is, except for the Buffer part, which is actually a vector<BYTE>. I'm not sure how to describe it clearly otherwise. So maybe this is cleaner:


typedef vector<BYTE> Buffer;
typedef pair<StreamPointer, Buffer > StreamPointerBufferPair;
...
vector<StreamPointerBufferPair> x;

Ok, that's maybe a little better. So let's continue coding. The developer now wants to use
grdz

--

Apparently I was clubbed in the head by an intruder at the end there, and I managed to get out "grdz" before the intruder saved my work and tucked me into bed. He's thoughtful that way.
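For anyone who wants to poke at the declarations from the draft, here is a minimal compilable sketch. BYTE and StreamPointer are hypothetical stand-ins (the draft never shows how CouchDb defines them), and appendEntry is a helper invented purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins: the draft does not show the real definitions.
typedef unsigned char BYTE;
struct StreamPointer {
    unsigned long offset;  // assumed: a position within some stream
};

// The typedefs from the draft, with std:: spelled out.
typedef std::vector<BYTE> Buffer;
typedef std::pair<StreamPointer, Buffer> StreamPointerBufferPair;

// Illustrative helper: append one (pointer, buffer) entry and
// return the new number of entries.
std::size_t appendEntry(std::vector<StreamPointerBufferPair>& x,
                        unsigned long offset, const Buffer& bytes) {
    StreamPointer p;
    p.offset = offset;
    x.push_back(std::make_pair(p, bytes));
    return x.size();
}

// Usage example.
void typedefExample() {
    std::vector<StreamPointerBufferPair> x;
    Buffer bytes(3, 0xFF);  // three bytes, all 0xFF
    appendEntry(x, 42, bytes);
    assert(x[0].first.offset == 42);
    assert(x[0].second.size() == 3);
}
```

The typedefs change nothing about the type itself; they only give the nested pieces names that can be read aloud.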


December 22, 2005

A brief introduction to Fabric

Fabric is a simple programming language I'm building as a query and validation language for my Couch project. Fabric is intended to be a pragmatic and simplified language for the purpose of querying and processing data in CouchDb.

Fabric will have many of the same operators and functions as Notes Formula Language and should be most familiar to those who already know Formula language.

This is a brief introduction to Fabric. I'm sure it will help if you know Notes Formula Language already.

Basics
Hello World in Fabric:

"Hello World!";

Ok, not terribly useful; it's sort of like a Hello World in SQL. Here is an expression that concatenates the fields FirstName and LastName and assigns the result to FullName.

FullName := FirstName + " " + LastName;

The colon-equals (:=) is the assignment operator and plus (+) is the string concatenation operator (and it's the number addition operator when used with numbers). The semicolon (;) indicates the end of the expression.

Here is the same expression, but it converts the full name to uppercase:

FullName := Uppercase(FirstName + " " + LastName);

It has integer and floating point math support, some simple examples:

x := 5;
squared := x*x; // squared is 25
foo := (x + 7)*2; // foo is 24
bar := 5/2.5; // bar is 2

The syntax is similar to Notes Formula Language, the most noticeable difference being the lack of the @ sign in function identifiers. Also, argument separators are commas, not semicolons:

SomeFunction(arg1,arg2,arg3)

Lists
All values are lists (or arrays if you prefer) and they may contain many different types of elements. One difference is instead of having typed lists like Notes Formula Language (string lists, numbers lists, dates lists, etc), there is only one list type, and it can contain any mix of strings, numbers, dates, etc.

Colon is the list concatenation operator. This example concatenates the lists in quantity1 through quantity4 into one list, then sums the elements of that list:

foo := quantity1 : quantity2 : quantity3 : quantity4;
Sum(foo);

A mixed type list:

Foo := 1 : "123 Fake st" : 3 : 5.67 : [10/24/1973];

Type Conversion
Fabric has implicit type conversion:

Foo := "1" + 3; // Foo is "13"
Bar := Number("1") + 3; // Bar is 4
Baz := Sum("1" : 2); // Baz is 3

Functions and operations that operate on elements of one type will automatically convert, if possible, elements to the correct datatype. Otherwise an error is generated.
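Here is a rough C++ sketch of how that conversion rule could behave. None of this is the actual Fabric engine: Element, num, txt, toNumber, toText, add and sum are all names made up for illustration, and the rule that "+" concatenates whenever the left operand is text is an assumption drawn from the single "1" + 3 example above.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical element type: either a number or a piece of text.
struct Element {
    bool isNumber;
    double number;
    std::string text;
};

Element num(double n) { Element e; e.isNumber = true; e.number = n; return e; }
Element txt(const std::string& s) { Element e; e.isNumber = false; e.number = 0; e.text = s; return e; }

// Convert to a number if possible, otherwise raise an error.
double toNumber(const Element& e) {
    if (e.isNumber) return e.number;
    std::size_t used = 0;
    double value = std::stod(e.text, &used);  // throws std::invalid_argument on "abc"
    if (used != e.text.size()) throw std::invalid_argument("not a number: " + e.text);
    return value;
}

// Convert to text; numbers use default stream formatting.
std::string toText(const Element& e) {
    if (!e.isNumber) return e.text;
    std::ostringstream out;
    out << e.number;
    return out.str();
}

// "+" operator. Assumed rule: text on the left means concatenation,
// a number on the left means addition.
Element add(const Element& a, const Element& b) {
    if (!a.isNumber) return txt(a.text + toText(b));
    return num(a.number + toNumber(b));
}

// Sum(): works on numbers, so every element is converted first.
double sum(const std::vector<Element>& list) {
    double total = 0;
    for (std::size_t i = 0; i < list.size(); ++i) total += toNumber(list[i]);
    return total;
}

// Usage example mirroring the three Fabric lines above.
void conversionExample() {
    assert(add(txt("1"), num(3)).text == "13");
    assert(add(num(toNumber(txt("1"))), num(3)).number == 4);
    std::vector<Element> list;
    list.push_back(txt("1"));
    list.push_back(num(2));
    assert(sum(list) == 3);
}
```

An element like "abc" passed to sum would throw here, which stands in for the "otherwise an error is generated" case.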

Field concatenation
Concatenating a bunch of numbered fields is a very common operation in Formula Language. I've created a simple syntax for that:

foo := date1..15;

This takes the values in fields date1 through date15 and combines them into a single list.

This will combine all fields date1, date2, date3 and so on until a field in the sequence is missing:

foo := date1..*;
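To make the two range forms concrete, here is a small C++ sketch of the lookup they imply. The Document type and the functions fieldName, fieldRange and fieldRangeOpen are hypothetical, and skipping missing fields in the bounded form is my assumption; the text above only defines what happens to a missing field in the ..* form.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical document model: field name -> list of values (as text).
typedef std::vector<std::string> List;
typedef std::map<std::string, List> Document;

// Build "date7" from ("date", 7).
std::string fieldName(const std::string& prefix, int n) {
    std::ostringstream out;
    out << prefix << n;
    return out.str();
}

// prefix<first>..<last>: concatenate every field in the range.
// Assumption: a field missing from the document is skipped.
List fieldRange(const Document& doc, const std::string& prefix, int first, int last) {
    List result;
    for (int n = first; n <= last; ++n) {
        Document::const_iterator it = doc.find(fieldName(prefix, n));
        if (it != doc.end())
            result.insert(result.end(), it->second.begin(), it->second.end());
    }
    return result;
}

// prefix<first>..*: concatenate fields until one in the sequence is missing.
List fieldRangeOpen(const Document& doc, const std::string& prefix, int first) {
    List result;
    for (int n = first; ; ++n) {
        Document::const_iterator it = doc.find(fieldName(prefix, n));
        if (it == doc.end()) break;
        result.insert(result.end(), it->second.begin(), it->second.end());
    }
    return result;
}

// Usage example: date3 is absent, so the open range stops after date2.
void fieldRangeExample() {
    Document doc;
    doc["date1"].push_back("2005-12-01");
    doc["date2"].push_back("2005-12-02");
    doc["date4"].push_back("2005-12-04");
    assert(fieldRange(doc, "date", 1, 15).size() == 3);
    assert(fieldRangeOpen(doc, "date", 1).size() == 2);
}
```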

Branching

foo := if(cond)(
   bar1
)(cond2)(
   bar2
)(
   bar3
);

and the terse form of the same expression:

foo := if(cond, bar1, cond2, bar2, bar3);

User Defined Functions
You can define and call your own functions:

// Foo is a function that multiplies two number lists
// and converts them to text
Function(Foo)(
   Text($1 * $2);
);


Bar := %Foo(2, 8); // Bar is "16"
x := 2 : 3;
y := 8 : 9;
Bar2 := %Foo(x, y); // Bar2 is "16" "27"

All user defined functions need the identifier prefixed with "%". I'm doing it this way so there is no collision with the core functions. I've also considered using the "@" symbol, but I fear it might confuse Notes people. Although it would be cool to be able to brag that Fabric lets you create your own @Functions...

So inside of function bodies, you reference the args passed in by the predefined argument variables $1, $2...$n, or by using the Arg(n) function, with n being the 1-based index of the passed argument. $ArgCount contains the number of arguments passed.
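For readers who think better in C++, the %Foo example corresponds roughly to the sketch below. The names NumberList, TextList, toText, multiply and foo are invented for illustration. The example in the post only multiplies lists of equal length, so this sketch assumes equal lengths and raises an error otherwise; what Fabric itself does with uneven lists under "*" is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::vector<double> NumberList;
typedef std::vector<std::string> TextList;

// Stand-in for Text(): format each number as text.
TextList toText(const NumberList& values) {
    TextList result;
    for (std::size_t i = 0; i < values.size(); ++i) {
        std::ostringstream out;
        out << values[i];
        result.push_back(out.str());
    }
    return result;
}

// Stand-in for "$1 * $2" on lists: element-by-element product.
// Assumption: both lists have the same length.
NumberList multiply(const NumberList& a, const NumberList& b) {
    if (a.size() != b.size()) throw std::invalid_argument("list lengths differ");
    NumberList result;
    for (std::size_t i = 0; i < a.size(); ++i) result.push_back(a[i] * b[i]);
    return result;
}

// Rough equivalent of: Function(Foo)( Text($1 * $2); );
TextList foo(const NumberList& arg1, const NumberList& arg2) {
    return toText(multiply(arg1, arg2));
}

// Usage example mirroring Bar and Bar2 above.
void fooExample() {
    TextList bar = foo(NumberList(1, 2.0), NumberList(1, 8.0));
    assert(bar.size() == 1 && bar[0] == "16");

    NumberList x; x.push_back(2); x.push_back(3);
    NumberList y; y.push_back(8); y.push_back(9);
    TextList bar2 = foo(x, y);
    assert(bar2[0] == "16" && bar2[1] == "27");
}
```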

Looping
The only looping construct I'm including is forall, which iterates over the elements in a list. The expression is evaluated for each element and the results are used to construct a new list.

In this example, it checks each element in List to see if it starts with the letter "b", if it does, it returns the Uppercase version of the element, otherwise it returns the element unchanged. These elements then go to constructing a new list and assigning it to Foo:

Foo := forall(Element in List)(
   if( Begins(Element, "b") )(
      Uppercase(Element)
   )(
      Element
   )
);
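In C++ terms, this forall is a map over the list: one output element per input element. A minimal sketch follows, with begins and uppercase as hypothetical stand-ins for the Fabric functions; treating Begins as case-sensitive is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string> List;

// Stand-in for Begins(): does text start with prefix? (assumed case-sensitive)
bool begins(const std::string& text, const std::string& prefix) {
    return text.compare(0, prefix.size(), prefix) == 0;
}

// Stand-in for Uppercase().
std::string uppercase(std::string text) {
    for (std::size_t i = 0; i < text.size(); ++i)
        text[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(text[i])));
    return text;
}

// Rough equivalent of the forall above: build a new list where elements
// starting with "b" are uppercased and all others are copied unchanged.
List uppercaseBWords(const List& list) {
    List result;
    for (std::size_t i = 0; i < list.size(); ++i)
        result.push_back(begins(list[i], "b") ? uppercase(list[i]) : list[i]);
    return result;
}

// Usage example.
void forallExample() {
    List in;
    in.push_back("apple");
    in.push_back("banana");
    List out = uppercaseBWords(in);
    assert(out[0] == "apple");
    assert(out[1] == "BANANA");
}
```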

You can process multiple lists in parallel. This example compares each element in two lists (1st compared with 1st, 2nd compared with 2nd, etc.) and returns a third list that is composed of the larger element of each corresponding pair:

Foo := forall(A in ListA, B in ListB)(
   if(A > B, A, B)
);

And if one list is shorter than another, the last element is reused in subsequent iterations.

So for example:

ListA := 5 : 8 : 2 : 10 : 3;
ListB := 6 : 7 : 9;
Foo := forall(A in ListA, B in ListB)(
   if(A > B, A, B)
);

This produces a Foo of 6 : 8 : 9 : 10 : 9.
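The last-element reuse rule is easy to get wrong, so here is a small C++ sketch of it. The names elementOrLast and pairwiseMax are made up for this example, and returning an empty list when either input is empty is my assumption, since the rule above does not cover that case.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<double> List;

// Element at position i, reusing the last element once the list runs out.
// Precondition: the list is not empty.
double elementOrLast(const List& list, std::size_t i) {
    return list[std::min(i, list.size() - 1)];
}

// Rough equivalent of: forall(A in ListA, B in ListB)( if(A > B, A, B) )
// Iterates as many times as the longer list has elements.
List pairwiseMax(const List& a, const List& b) {
    List result;
    if (a.empty() || b.empty()) return result;  // assumption: nothing to iterate
    std::size_t n = std::max(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        double x = elementOrLast(a, i);
        double y = elementOrLast(b, i);
        result.push_back(x > y ? x : y);
    }
    return result;
}

// Usage example with the lists from the post.
void parallelForallExample() {
    List a; a.push_back(5); a.push_back(8); a.push_back(2); a.push_back(10); a.push_back(3);
    List b; b.push_back(6); b.push_back(7); b.push_back(9);
    List foo = pairwiseMax(a, b);
    assert(foo.size() == 5);
    assert(foo[3] == 10 && foo[4] == 9);  // the 9 from the shorter list is reused
}
```

Walking through it: the pairs are (5,6), (8,7), (2,9), (10,9), (3,9), where the last two 9s are the reused final element of the shorter list.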

Array/List indexing
Brackets are used for array addressing, and indexing is 1-based:

colors := "red" : "blue" : "yellow";
foo := colors[1]; // foo is "red"
foo := colors[2]; // foo is "blue"
foo := colors[3]; // foo is "yellow"

Two arguments have a special subset meaning:

colors := "red" : "violet" : "blue" : "green" : "yellow" : "orange";
foo := colors[2, 4]; // foo is "violet" : "blue" : "green"

The first argument is the start position, the second is the end position and a new list is created with all the elements from start to end, inclusive.
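Since most of us are used to 0-based arrays and half-open ranges, here is a sketch of how the 1-based, inclusive forms map onto a C++ vector. The names at1 and slice1 are hypothetical, and throwing on an out-of-range position is an assumption; the post does not say what Fabric does in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::vector<std::string> List;

// list[index] with a 1-based index.
const std::string& at1(const List& list, std::size_t index) {
    if (index < 1 || index > list.size()) throw std::out_of_range("index out of range");
    return list[index - 1];
}

// list[start, end]: new list with the elements from start to end, inclusive.
List slice1(const List& list, std::size_t start, std::size_t end) {
    if (start < 1 || end > list.size() || start > end)
        throw std::out_of_range("subset out of range");
    // 1-based inclusive [start, end] is 0-based half-open [start - 1, end).
    return List(list.begin() + (start - 1), list.begin() + end);
}

// Usage example with the colors from the post.
void indexingExample() {
    const char* names[] = {"red", "violet", "blue", "green", "yellow", "orange"};
    List colors(names, names + 6);
    assert(at1(colors, 1) == "red");
    List foo = slice1(colors, 2, 4);
    assert(foo.size() == 3);
    assert(foo[0] == "violet" && foo[2] == "green");
}
```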


Ok, even though this isn't a very complete description of the language and how it can be used, I'm going to stop writing about it for now anyway. I'd love to hear any feedback about the language and its design to shape it as I go forward. Everything I've described above is already working, but I still have a lot of work to do, and I'm sure as I integrate it back into CouchDb I'll be revisiting some of my previous decisions. That's just how it goes.

A couple of people have inquired as to why I'm creating a new programming language. Why not just use some other popular language?

The answer is simplicity.

First of all, Fabric isn't a general purpose programming language, it's a domain specific language (DSL to programming language geeks). It is designed to be tightly integrated into the CouchDb data model and make querying and processing it easy. I couldn't find another language that would be a good match without also burdening the user with lots to learn.

These queries should be authorable by people with spreadsheet level programming experience. I'm creating a language that is similar in syntax to Excel Formula Language and very much like Notes Formula Language. By sticking to simplified language demands and a familiar syntax, the learning curve is lowered and people should be productive quickly. But I'm also choosing to base this language off Notes Formula Language because it has been proven very powerful and appropriately concise for these types of uses. I'm hoping to tweak its usefulness a little bit for the better.

It will become more clear once I create some real world examples with CouchDb. It's kind of hard to see the usefulness of a single part of Couch; it will become clearer once I demonstrate how the parts work together.


December 21, 2005

What is Couch?

Ok, I'm going to briefly describe my vision of what Couch will be. I've been a little shy about discussing it as of late, mostly because I have changed it drastically in the past and I wanted to wait until I felt things had solidified more. And so now that I feel pretty good about its direction and the likelihood of success, I'd like to say more about what I'm building.

What is Couch? Concise version:

Couch is Lotus Notes built from the ground up for the Web.

What is Couch? Long version:

Couch is a simplified database application platform that is document centric and non-relational. It is a distributed database system, with bi-directional replication built in, enabling high scalability, failover and offline access.

Couch Components


CouchDb:

The core database storage and reporting facilities. CouchDb is not a relational database, but rather a schema-free document database.

Documents are objects that have any number of named field values. Since there is no fixed schema, documents can have any number of fields and those fields can have any arbitrary name and value. Unlike a relational database, where records are inserted and deleted out of named tables with a fixed schema, CouchDb has a flat object space and objects do not follow a defined schema.

CouchDb is a bi-directionally replicatable database, with the ability to incrementally replicate changes and resolve conflicts between two replica instances of the same database.

Reporting on the database is accomplished with "Computed Tables" that use Fabric (the CouchDb query language) to select documents and generate column values. Computed tables are very similar to Lotus Notes views. Each document that is selected by Fabric corresponds to a row in the table, and the column values are the result of Fabric expressions that reference the document fields. The same document can appear in many computed tables, or no computed tables, depending on the computed table Fabric selections.
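A computed table boils down to a selection test plus one expression per column. The C++ sketch below shows the idea under loud assumptions: Document, Row, ComputedTable, field and buildRows are names invented here, documents are simplified to one text value per field, and plain C++ functions stand in for the Fabric selection and column expressions. It is not how CouchDb stores or indexes anything.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified document: field name -> single text value.
typedef std::map<std::string, std::string> Document;
typedef std::vector<std::string> Row;

struct ComputedTable {
    // Stands in for the Fabric selection: should this document appear?
    std::function<bool(const Document&)> select;
    // Stand in for the Fabric column expressions, one per column.
    std::vector<std::function<std::string(const Document&)> > columns;
};

// Read a field, treating a missing field as empty (no fixed schema).
std::string field(const Document& doc, const std::string& name) {
    Document::const_iterator it = doc.find(name);
    return it == doc.end() ? std::string() : it->second;
}

// One row per selected document, one value per column expression.
std::vector<Row> buildRows(const ComputedTable& table, const std::vector<Document>& docs) {
    std::vector<Row> rows;
    for (std::size_t d = 0; d < docs.size(); ++d) {
        if (!table.select(docs[d])) continue;
        Row row;
        for (std::size_t c = 0; c < table.columns.size(); ++c)
            row.push_back(table.columns[c](docs[d]));
        rows.push_back(row);
    }
    return rows;
}

// Usage example: a table of bug documents with a single Title column.
void computedTableExample() {
    ComputedTable bugs;
    bugs.select = [](const Document& d) { return field(d, "Type") == "bug"; };
    bugs.columns.push_back([](const Document& d) { return field(d, "Title"); });

    std::vector<Document> docs(2);
    docs[0]["Type"] = "bug";  docs[0]["Title"] = "Crash on save";
    docs[1]["Type"] = "note"; docs[1]["Body"] = "no Title field at all";

    std::vector<Row> rows = buildRows(bugs, docs);
    assert(rows.size() == 1);
    assert(rows[0][0] == "Crash on save");
}
```

The same document list could be fed to any number of tables, each with its own selection, which is how one document ends up in many tables or none.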

Fabric:
This is the CouchDb query, processing and validation language. It's closely related to Notes Formula Language. It's a simple list processing language, designed to be easy to learn and use, and tightly integrated into CouchDb and the Couch Web Application Server.

Web Application Server:
This is a piece that will integrate into a web server (most likely Apache) and serve Couch applications and database content out of CouchDb. This layer will interface tightly with one or more popular scripting languages (PHP, Python, Ruby, etc). Design elements (forms, views, tables, etc) are simply documents in the database that store the client side HTML and server side markup and application scripts. These design documents are evaluated and translated into HTML by the Couch Web Application Server and its components.

The web application server also provides a REST based XML replication facility for clients and peer web servers to bidirectionally replicate CouchDb databases.

Offline Personal Web Application Server:
This is a client-installable version of the web application server that can replicate server-based Couch applications for your local web browser. This provides the disconnected user with access to Couch applications.

Couch Web IDE:
A web browser based IDE hosted by Couch Web Application Server that allows end users to create simple Couch applications (think Lotus Notes V3). With DHTML and Ajax, it will provide a simplified and intuitive UI for creating forms, views and simple scripts (in other words: build Couch applications). People who can create Word documents and Excel spreadsheets should be productive in this environment very quickly.


So that's the grand vision for Couch. Yes, there is much work to be done. And no, I don't plan on doing all of it alone.

So what sorts of applications will Couch be good for? Email, bug databases, time sheets, RSS feeds, CRM, content management, blogs, forums.... Basically anything document-ish that you might also want to access while you're offline.

And developing those sorts of Couch applications will make Ruby on Rails look clumsy and heavyweight by comparison (that's right ROR, I'm talking smack). Plus it can do something no other mainstream web development platform can do, replicate the whole thing offline. Notes people know it's not a pipe dream.

I'm going to need a lot of help and feedback from the Notes community to make this project successful. If you're a Notes geek, I'd love to hear your questions and thoughts about Couch. That goes for anybody actually, but double for the Notes crowd. ;)


December 20, 2005

Quiet Lately

I haven't been posting much lately, and the reason is mostly because I'm spending a great deal of time coding and it's sapping most of my mental energy. But I've been busy, very busy. So here's a brief update on what I've been up to.

I now have the Fabric engine compiling and executing formulas. It still has a way to go, but all the big complicated stuff is in place and now I'm mostly replacing scaffolding code and adding the remaining simple features. If I keep up this pace, I'll have it integrated into CouchDb in less than a month. So I'm very pleased with my progress.

I have an article I'm working on about some of the details of Fabric, I've been sitting on it for weeks (I've been holding it off until I finalized some things). Anyway, I really need to stop being so quiet about it, so I'm going to give myself a deadline of Thursday night to have that article posted. There, I've publicly said it, now I have to do it.


December 16, 2005

This I need

ErgoQuest500_3.jpg
The Ergopod 500


December 11, 2005

Waking Life

You can't fight city hall. Death and taxes. Don't talk about politics or religion. This is all the equivalent of enemy propaganda rolling across the picket line. Lay down GI, lay down GI. We saw it all through the 20th century. And now in the 21st century, it's time to stand up and realize that we should not allow ourselves to be crammed into this rat maze. We should not submit to dehumanization. I don't know about you, but I'm concerned about what's happening in this world. I'm concerned with the structure. I'm concerned with the systems of control, those that control my life and those that seek to control it even more. I want freedom. That's what I want. And that's what you should want. It's up to each and every one of us to turn loose and show them the greed, the hatred, the envy, and yes, the insecurities, because that's the central mode of control. Make us feel pathetic, small, so we'll willingly give up our sovereignty, our liberty, our destiny. We have got to realize that we're being conditioned on a mass scale. Start challenging this corporate slave state. The 21st century is going to be a new century. Not the century of slavery, not the century of lies and issues of no significance, of classism, of sadism, and all the rest of the modes of control. It's going to be the age of human kind standing up for something pure and something right. What a bunch of garbage: liberal, democrats, conservative, republican, it's all there to control us, it's two sides of the same coin. Two management teams bidding for control, the CEO job of Slavery, Incorporated! The truth is out there in front of you, but they lay out this buffet of lies. I'm sick of it! And I'm not going to take a bite of it. Do you got me? Our existence is not futile. We're going to win this thing. Humankind is too, good. We're not a bunch of underachievers. We're going to stand up, and we're going to be human beings. 
We're going to get fired up about the real things, the things that matter, creativity and the dynamic human spirit that refuses to submit.

Waking Life


December 10, 2005

Quotes

Are you quite sure that all those bells and whistles, all those wonderful facilities of your so called powerful programming languages, belong to the solution set rather than the problem set? (Edsger Dijkstra)

Part of the reason so many companies continue to develop software using variations of waterfall is the misconception that the analysis phase of waterfall completes the design and the rest of the process is just non-creative execution of programming skills. (Steven Gordon)

Quotes about Computer Languages

There are some good quotes there. This one is particularly relevant to me right now:

Should array indices start at 0 or 1? My compromise of 0.5 was rejected without, I thought, proper consideration. (Stan Kelly-Bootle)

So not to start a flame war or anything, but I need to decide on the array indexing start for my new Fabric language (AKA the new Formula Language). I'd prefer 0, since every other language I code in starts arrays at 0. But I want this language to be instantly productive for those who already know Formula language, which is 1 based. And until just now, I had never even considered 0.5.

Any thoughts welcome.


December 7, 2005

Most excellent new blog

If there was ever a symbiotic relationship, it is one between programmers and traders on Wall Street.

In very simple terms, they cannot do their job without our software, and we do not get paid without them using the software.

We feed each other’s families…

The interesting thing is that as a customer, they don’t actually want the product…in fact, nothing would make traders happier than if we took our software and just disappeared. This logic originates in the way a trader works. When doing his job, a good trader enters a state not unlike the thing we programmers call ‘The Zone’…only they are able to sustain it for much longer periods of time than the 2 or 3 hours we’re capable of. The dynamic mental map which is in play within the traders head when he is working is a thing of wonder…and something I find myself envious of. You can’t imagine the strategies which need to be balanced when you’re trading options across multiple sectors, across multiple expiries, across multiple exchanges, for multiple portfolios…at the same time.

…and yet they don’t want the software…

The very same product we create for them to be able to do their job is a hindrance. It is an absolute necessity, but it’s treated with equally absolute hatred. The 3 millisecond delay imposed by anything within the software is a hindrance. The ‘kewl’ feature implemented by the newbie programmer which moved the trader’s “SEND” button 5 pixels to the left is a hindrance. Consistency, stability and speed are of supreme importance…and this is just not the way of the programmer…our entire job is defined by improvement and change…

Wall Street Programmer

Great stuff so far. I'm subscribed.

Update:
It looks like he took the site down. A little too honest I suppose.


December 6, 2005

Generated code under source control?

My Fabric project (which is coming along well BTW) is a formula compiler and runtime meant to be used with my object database (CouchDb). It uses Antlr to generate C++ compiler code from grammars I wrote. I have it set up so that every time a build is run, it checks to see if the generated cpp and hpp files are out of date with the grammar, and if so it runs Antlr to regenerate the source files.
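The staleness check itself is just a timestamp comparison. Here is a sketch of the decision in C++, not my actual build step: Timestamp, kMissing and needsRegeneration are hypothetical names, timestamps are assumed to be plain seconds obtained elsewhere, and a generated file with exactly the grammar's timestamp is assumed to count as up to date.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical: modification time in seconds, obtained elsewhere by the build.
typedef long long Timestamp;
const Timestamp kMissing = -1;  // marks a generated file that does not exist yet

// Regenerate when any generated file is missing or older than the grammar.
bool needsRegeneration(Timestamp grammar, const std::vector<Timestamp>& generated) {
    for (std::size_t i = 0; i < generated.size(); ++i) {
        if (generated[i] == kMissing || generated[i] < grammar) return true;
    }
    return false;
}

// Usage example: the cpp file is fresh but the hpp file predates the grammar.
void stalenessExample() {
    std::vector<Timestamp> outputs;
    outputs.push_back(200);  // generated .cpp
    outputs.push_back(90);   // generated .hpp
    assert(needsRegeneration(100, outputs));
}
```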

Currently I have these generated files included in the Visual Studio project solution browser and I have them under source control (Subversion). Neither is necessary, but it is convenient. I can't think of any problems this will cause, but sometime somewhere I seem to remember someone frowning on the practice. I just can't remember who or why. Anyone have a good reason not to do it this way?

(BTW I haven't been blogging much lately because I've been too focused on coding)
