A Comfortable Couch

Monday, November 15, 2004

Formula Language's dirty secret

This post by Stan Rogers reminded me the Lotus Notes Formula Language engine has a dirty secret, it's seriously slow at building lists.

When you build a list using the list concatenation operator (:), the engine is so inefficient it's embarrassing. For example, you probably thought the following construct incurred no cost at runtime:
foo := "damien" : "is" : "super" : "cool";

But the truth is, each list concatenation incurs memory allocation and copying. The way the old engines interpreted the line is:

  1. build a single element list with the value "damien"
  2. build a single element list with the value "is"
  3. create a new longer list and copy the "damien" list and "is" list into it. Discard the old lists.
  4. build a single element list with the value "super"
  5. create a new longer list and copy the "damien" "is" list and the "super" list into it. Discard the old lists.
  6. etc....
Blech! This isn't so bad with small lists, but as the number of concatenations increase, the more unnecessary copying that occurs.

So as the lists get longer, the inefficiencies add up exponentially, for CS geeks the big O complexity is x^2, little o is (x^2)/2. Now this isn't that bad since lists in are fairly limited in size (no more than 32k of actual data is possible in a R5 list), so you are actually saved from horrific waits by other limitations that occur earlier, but the waits can be bad enough with lots of these formulas thats things start to crawl. If you have formulas like this running on the web server or view formulas, all that inefficiency can add up big time.

How much slower in real world terms? Well, I did a test where I measured the load time of a form in milliseconds. I created 3 forms:

Form 1: This is a test control that has a field that just returns a NULL string:
Form 2: A field that builds a thousand element list using the list concatenation operator and assigns it to a field (I assign to non displayed field so that rendering the lists to the UI doesn't become a factor):
FIELD foo := "001" : "002" : "003" : "003" : "004" ....... "999";

Form 3: A field that builds a thousand element list using the permuted string concatenation operator and assigns it to a field.
digits := @Explode( "0;1;2;3;4;5;6;7;8;9");
FIELD foo := @Subset(digits *+ digits *+ digits; -999);
The average load time results are as follows (Notes R5.11, 2.4 ghz P4 machine):
Form 1 (control, no list building): 16ms
Form 2 (list concatenation): 62ms
Form 3 (permuted string concatenation): 16ms

Basically, the permuted string concatenation is happening so fast as to not affect the load time. So let's assume it's taking a least 1 ms to run, then the list concatenation formula is taking at least 46 times as long to run. My methodology isn't really thorough, but it gives a pretty indication of how slow things are.

Here is the result in R6.5 (Notes R6.5.1, 2.4 ghz P4 machine):
Form 1 (control, no list building): 15ms
Form 2 (list concatenation): 31ms
Form 3 (permuted string concatenation): 15ms

So the new engine is able to run the list concatenation formula significantly faster, but still not as nearly fast as the permuted string concatenation. I'm little surprised by that, I think it's because it's still is O(X^2), but it doesn't copy the actual values around, just pointers (I'm not really sure why, I'm trying to remember the details of how I wrote it, it's a lot more complicated in the new engine). Fortunately the new engine will do partial result caching, so if it's used in view column or selection formulas, it only takes the hit once for the first document during a view refresh, then it uses a cached result for each subsequent document evaluation. I'm not sure if the Domino web server takes advantage of that however (it really should if it doesn't, but it's often not simple thing).

Anyway, I hope someone's slow app out there is helped by this.


Stan Rogers said...

Great explanation of what's going on -- I had no idea that the allocation was static per member on the initial build (I figured it would be rather like dealing with a string, with a first-case list formula looking more like a string literal). Would you happen to know if the partial results caching would be the explanation behing the seemingly-miraculous increase in speed of the NotesDatabase.Search method when the formula exactly replicates a view selection formula? (I thought I'd sacrifice on Search to get StampAll, and it went so quickly I had to log everything in a second run to make sure.)

5:32 PMlink  
Anonymous said...

Any idea how exploding a string into a list compares in terms of performance?


5:35 PMlink  
Damien said...

Exploding a string has very good performance, probably slightly faster than the permuted string concatenation.

7:56 PMlink  
Anonymous said...

And you said you didn't have a clue as to what to write about if you did an article for e-ProWire... :-)


11:33 PMlink  
Anonymous said...

I did a bit more testing on this, and posted the results on my blog here:


Thanks for bringing this up, Damien - this is good stuff! Anything else you want to get off your chest? :)


9:55 AMlink  
Nahum Prashizky said...

Back then, in september 1999... Damien did an "Damien":"is":"a":"sexiest":"man":"in":"a":"world" list during his presentation...
I see that with age amount of elements gets shorter, as we can see that from rewriting an engine he became more optimization - oriented...

Nahum, the one you wanted to beat up after that presentation... ;-)

6:03 AMlink  

Post a Comment

<< Home