Comparing XSLT and XQuery

This paper will attempt an objective side-by-side comparison of the two languages: not just from the point of view of technical features, but also looking at usability, vendor support, performance, portability, and other decision factors.

When I started building CouchDb, I considered making it an XML document database with XQuery support. While I do love me some angle brackets, I was turned off by the complexity and ugly syntax of it all. Not to mention that XML really seems like the wrong level of abstraction for database queries: it's designed to be an easy-to-parse interchange format, not a storage format. Skimming this document, I still feel I've made the right decision, but I can't be bothered to read the whole thing (it's loooong).

I can't back up my decisions with anything other than my opinion, so if anyone wants to tell me why I'm wrong go right ahead.

Posted April 19, 2006 3:43 PM

Comments

Nope, I'd say you're right.

:o)

Ben Poole, April 19, 2006 4:43 PM

I think the jury is still out on that one. Relational DB vendors have been struggling with this issue for a while.

I'll agree that an "XML document database with XQuery support" is probably not the right approach, but some sort of hybrid may be. For example, if you look at Workplace Designer, it's more than just a design tool. The runtime supports a document store that's quite similar to Notes, but documents are represented as XML. Views are very similar to Notes views: they're pre-computed queries that can use XPath or JavaScript expressions to extract or compute column values.
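To make the idea of a view as a pre-computed query concrete, here is a minimal sketch in Python. It is not how Workplace Designer works internally; the document shapes, the column names, and the `build_view` helper are all invented for illustration, and the path expressions use only the limited XPath subset that the standard library's ElementTree supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents; the element names are invented for illustration.
DOCS = [
    "<order><customer><name>Ann</name></customer><total>40</total></order>",
    "<order><customer><name>Raj</name></customer><total>15</total></order>",
]

# A "view" here is just a mapping from column name to a path expression.
VIEW_COLUMNS = {"customer": "customer/name", "total": "total"}


def build_view(docs, columns):
    """Precompute one row per document by evaluating each column's path."""
    rows = []
    for text in docs:
        root = ET.fromstring(text)
        rows.append({col: root.findtext(path) for col, path in columns.items()})
    return rows


rows = build_view(DOCS, VIEW_COLUMNS)
for row in rows:
    print(row)
```

The point of the sketch is only that a column can reach into nested structure through a path, which a flat item model cannot express directly.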

One difference between Notes and this store is that in Notes, documents are composed of items and items are flat. They may be multi-valued but they aren't hierarchical. By using XML, Designer applications can express hierarchy within documents. And the documents themselves can be strongly typed by using XML Schema.

If that sounds more complex than Notes, it is, but the Designer IDE does a pretty good job of hiding the complexity. For example, if you change a document's schema or a view definition, it will automatically fix up the data. Also, it's worth noting that under the covers, the XML documents, views, etc. are stored in a relational DB.

What about content that's not representable as XML? Just like in Notes, you can attach arbitrary files to each document.

By the way, a long time ago (R5?) there was work at Iris to support XML as a datatype in Notes. It didn't get too far. I think too much time was spent figuring out how to store XML efficiently and too little attention was paid to how to use it in the programming model. Kind of a shame, really. And the thing is, storage can be an issue depending on how you want to access the XML again, but if you can just store it as a blob, it usually compresses extremely well -- fancy binary encoding mechanisms usually aren't needed.
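A quick way to check the compression claim on your own data is a sketch like the following, using Python's standard `zlib` module. The XML payload is made up for illustration; real documents will compress more or less depending on how repetitive their markup is.

```python
import zlib

# Hypothetical XML payload; repeated tag names are what make it compressible.
xml_blob = ("<items>" + "".join(
    f"<item><id>{i}</id><status>open</status></item>" for i in range(200)
) + "</items>").encode("utf-8")

compressed = zlib.compress(xml_blob)    # what would be stored as the blob
restored = zlib.decompress(compressed)  # what comes back on read

print(len(xml_blob), len(compressed))
```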

Bob, April 20, 2006 12:27 AM

We'll see how it goes with Viper, the new DB2 with a new on-disk structure for XML. I personally would prefer an RDF model. RDF is a directed graph data model that says nothing about serialized representation. A graph can represent a hierarchy just as a table can represent a list. But in reverse, hierarchical structures in tables are difficult to read. For the same reason, graph structures in XML are difficult to read...hence the ugliness of RDF/XML. See Deakin:

http://www.xulplanet.com/ndeakin/article/133?show=c

I don't like XML as a storage format for that reason. I like persistence engines that are play-dough, not molds.

Bob, the lack of hierarchical representation in Notes is a pita. I wish Notes were more efficient at storing the relationships at the document level. Perhaps CouchDb will be better at that...

As for schema support, Workplace is exactly what Notes is not. They use some fancy table manipulation foo in the background, but Viper should clean that up. As has been said before, the lack of schema in Notes is its greatest strength and its greatest weakness. What fun would Notes be if you couldn't arbitrarily add items on the fly and stuff data types in items that don't match the data type of the field that created the item ;-)

Damien, keep CouchDB XML agnostic. And gimme a python api ;-)

Dan Sickles, April 20, 2006 2:26 AM

It depends on the data you want to store. If the data schema is volatile, use XML; otherwise use an RDBMS.

>What about content that's not representable as XML? Just like in Notes, you can attach arbitrary files to each document.

Why is this not representable in XML?
You could store them as base64 encoded values, or am I wrong?

Tobias Mueller, April 20, 2006 4:16 AM

Dan: I'm not sure what you mean by "Workplace is exactly what Notes is not". I used to work on Workplace Designer and I think it may be more flexible than you think. Yes, it's a little more structured than Notes but not as rigid as a lot of other tools. We'll see where it goes next...

The mapping that happens under the covers is because the Designer runtime sits on top of a relational store. Taking advantage of something like Viper would be challenging since Designer supports multiple database backends.

Tobias: yes, you can store attachments in XML as base64 encoded values. My point was that attachments can be handled in a similar fashion to Notes, as related uninterpreted data.
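For reference, embedding binary content in XML as base64 looks roughly like this in Python. It is a minimal sketch: the `document` and `attachment` element names and the `data.bin` file name are invented for illustration, not taken from Notes or Workplace Designer.

```python
import base64
import xml.etree.ElementTree as ET

payload = bytes(range(256))  # stand-in for arbitrary binary file content

# Build a document with the binary payload embedded as base64 text.
doc = ET.Element("document")
att = ET.SubElement(doc, "attachment", name="data.bin")
att.text = base64.b64encode(payload).decode("ascii")

xml_text = ET.tostring(doc, encoding="unicode")

# Reading it back: parse the XML, then decode the element text.
parsed = ET.fromstring(xml_text)
decoded = base64.b64decode(parsed.find("attachment").text)
```

Base64 grows the data by about a third, which is one practical argument for keeping attachments alongside the document as uninterpreted data rather than inside it.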

Bob, April 20, 2006 1:52 PM

I understand that an XML store is powerful and flexible, but I question how easy it is to develop for. IMO, SQL databases have been dominant for a very long time because they are built on extremely simple principles: tables and set theory.

XML is hierarchical, which is another way of saying it has a tree structure. The problem with tree structures is that the query languages must be able to deal with them: you need some way to retrieve those deeply nested bits of data. Even if the applications use the XML in a flat name-value pair fashion, it still makes the tools, languages and queries more complex and harder to understand, because of the power that *might* be needed.

That's not to say XML storage models won't be a slam dunk for some application domains, but we already have plenty of powerful databases and languages that can accomplish the same things. XML data models aren't intrinsically easier, just different from the alternatives that already exist.

It's like the whole SOAP vs. REST battle. The designers of SOAP realized that by introducing just a little more complexity into the design, it could free itself from its HTTP bonds and become a transport-independent protocol. How's that working out for it?
Well, that tiny little bit of complexity is making it far less useful to real people than simpler bare bones REST protocols. I'm sure someone out there is using SOAP over some other non-HTTP transport and it's great for them, but the vast vast majority of its use will be over HTTP, and it's making things needlessly complex.

My relatively brief experience in this industry has taught me that the market consistently selects for the simplest alternatives that still get the job done.

This is what I'm aiming for with CouchDb: simple simple simple...and it gets the job done.

Damien, April 20, 2006 3:13 PM

Is SQL easier than XQuery using XPath or XSLT using XPath?
I think we're just used to working with SQL. I've seen stored procedures that were not easy to understand.

>This is what I'm aiming for with CouchDb: simple simple simple...and it gets the job done.

If you want flexibility, things start getting complicated. I really like working with the Ruby on Rails framework: convention over configuration, you always have a working model, and the code generated by the scaffolding gives you an entry point to modify.

Tobias Mueller, April 21, 2006 4:35 AM

"Is SQL easier than XQuery using XPath or XSLT using XPath?"

At its core? Yes. Although SQL queries that express complex relationships can be complicated, the language and its core concepts are exceedingly simple. And stored procedures are typically written in something like PL/SQL, which is far more complex than core SQL.

But, CouchDb is not going to be a SQL db, not in any traditional sense anyway. However, I do plan at some point to allow SQL queries against computed tables.

"If you want flexibility, things start getting complicated."

You are right, and CouchDb isn't going for power and flexibility, but for simplicity. If you want flexibility, use Oracle, DB2 or SQL Server. But don't look for simple there; they lost that long ago as they kept expanding the functionality of their offerings.

Damien, April 21, 2006 10:55 AM

This is just an etymological aside, but I've noticed a lot of people saying things like you did above.

'SQL databases have been dominant for a very long time because they are built on extremely simple principles: tables and set theory'

Back in 'the day' (and you can see Bob doing it above), we would have all called these 'relational databases'. As I'm sure you all know, SQL came much later, as its name says, to address the differences of working with different vendors' relational solutions.

I'm not correcting anybody, everyone knows what you mean when you say either, I just find it interesting how word use evolves.

Pete, April 23, 2006 12:52 PM

Pete, as a guy who is now designing his own db system, I figure I'd use the most precise terminology possible. The truly die-hard relational weenies will deny that SQL databases are relational, because they don't meet some criteria set by Codd a zillion years ago, as though Codd's notions of relational are the only ones allowed.

I do this so the pedants don't say "How can he design a DBMS, he doesn't even know the difference between SQL and relational? BLATHER BLATHER BLATHER BLATHER". You know how some people are:
http://www.dbdebunk.com/index.html

Damien, April 23, 2006 1:03 PM

Actually, this is the better link to understand what I'm talking about:
http://www.dbdebunk.com/about.html

These guys apparently base their careers on "debunking" SQL databases.

Damien, April 23, 2006 4:34 PM

XML is totally flexible. And a complete mess for humans to read. I think the recent efforts to xml-ify the world are a little unrealistic since not everything can be represented accurately using a hierarchical schema.

Having said that, in Tornado we use XSLT to display the UI for views. Again, completely non-intuitive, but very flexible and easyish when you get the hang of it.

Brendon Upson, April 23, 2006 6:37 PM

I hadn't seen the dbdebunk web site before. Pretty ugly. By the way, saying that C.J. Date is "some guy" is kinda funny. He worked with Codd at IBM on the original relational model. Maybe he's a curmudgeon now, but he deserves some respect for the work he did in the past.

I haven't read their articles but, in general, I'm perfectly fine with the idea that what we do now with database systems sucks. As long as they provide cogent arguments, it's a good thing. Maybe it's impractical to do anything about it now, but things change. For example, the Lisp crowd has been pontificating for decades on what sucks about more popular languages, but over time, more and more of what was good in Lisp has gone mainstream.

Bob, April 24, 2006 6:01 PM
