Tuesday, February 23, 2010

Using s-expressions instead of XML

The last time I needed to manipulate a large XML document, I remembered Paul Graham's comment in What Made Lisp Different that programs communicating with s-expressions is an idea recently reinvented as XML. I began to wonder whether I could just use s-expressions instead of dealing with XML directly.

Step 0: Define an s-expression representation for XML.

 (tagname (@ attr "value" attr2 "value2")
   (tagname3 "data"))
If the attributes are optional, an extra token (@) is needed to distinguish the attribute list from the first nested tag.

If the attributes are not optional, an extra token (nil) is needed whenever a tag has no attributes.

In most XML documents I've used, tags without attributes outnumber tags with them, so I opted for @.

Since @ can't be a tag name, the rule is unambiguous: if the first item in the list (after the tag name) is itself a list whose car is @, then that list holds the XML attributes for the tag. I dubbed this representation SML (S-expression Meta Language).
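To make the rule concrete, here is a minimal sketch in Python rather than Lisp. It assumes SML is held as nested Python lists, with strings standing in for symbols. The name split_node is hypothetical and is not part of sml.arc.

```python
def split_node(node):
    """Split an SML list into (tag, attrs, children).

    Assumption: SML is modeled as nested Python lists, e.g.
    ["tagname", ["@", "attr", "value"], ["tagname3", "data"]].
    """
    tag, rest = node[0], node[1:]
    attrs = {}
    # The attribute list, if present, is the first item after the tag
    # name and is itself a list whose first element (its car) is "@".
    if rest and isinstance(rest[0], list) and rest[0][:1] == ["@"]:
        pairs = rest[0][1:]
        # Attributes are stored flat: name, value, name, value, ...
        attrs = dict(zip(pairs[0::2], pairs[1::2]))
        rest = rest[1:]
    return tag, attrs, rest
```

Called on the example from Step 0, this would return the tag name, a dictionary with attr and attr2, and a single child list for tagname3. A tag without attributes simply yields an empty dictionary, which is the case the @ token is meant to keep cheap.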

UPDATE: I came up with a simpler representation.

Step 1: Convert XML to s-expressions.
This seems like a job for Perl, which is great at manipulating data formats. So I wrote xml2sexp.pl, which works great.

But it feels like a hack, because there might be some XML syntax that it doesn't handle. XSLT was designed for transforming XML, so it's a good choice for this too. I did some Googling and found an existing xml2sexp.xsl, but it's not complete: it can't even convert itself. So I decided to write my own. Yikes! Now I'm back to writing XML, which is exactly what I was trying to avoid! I can't think of a programming language that is more unpleasant than XML. But it was a chance to learn XSLT, so I wrote xml2sexp.xsl too.
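For readers who want to see the shape of the conversion without Perl or XSLT, here is a rough sketch in Python using the standard xml.etree.ElementTree parser. It is not a port of xml2sexp.pl or xml2sexp.xsl. The function names are hypothetical, and the sketch assumes that comments, processing instructions, and whitespace-only text can be dropped.

```python
import xml.etree.ElementTree as ET


def element_to_sml(elem):
    """Convert an ElementTree element into SML held as nested lists."""
    node = [elem.tag]
    if elem.attrib:
        attr_list = ["@"]
        for name, value in elem.attrib.items():
            attr_list += [name, value]
        node.append(attr_list)
    # Assumption: surrounding whitespace in text is not significant.
    if elem.text and elem.text.strip():
        node.append(elem.text.strip())
    for child in elem:
        node.append(element_to_sml(child))
        # Text that follows a child element belongs to the parent.
        if child.tail and child.tail.strip():
            node.append(child.tail.strip())
    return node


def quote(text):
    """Render text as a double-quoted s-expression string."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def sml_to_text(node):
    """Print nested-list SML using the notation from Step 0."""
    if isinstance(node, str):
        return quote(node)
    if node[0] == "@":
        parts = ["@"]
        items = node[1:]
        for name, value in zip(items[0::2], items[1::2]):
            parts += [name, quote(value)]
        return "(" + " ".join(parts) + ")"
    return "(" + " ".join([node[0]] + [sml_to_text(c) for c in node[1:]]) + ")"
```

Parsing a document with ET.fromstring and passing the root through element_to_sml and then sml_to_text would give a one-line s-expression in the Step 0 notation. Leaning on a real XML parser sidesteps the worry about unhandled XML syntax, at the cost of losing whatever the parser discards.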

Step 2: Convert SML back to XML.
Now I'm in the Lisp world, so I can use my Lisp of choice, which happens to be Arc at the moment. So I wrote an Arc library, sml.arc, to convert SML back to XML. There's also a function to pretty-print the SML, since the output of the conversion from XML is pretty ugly.
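The reverse direction can be sketched the same way. The following is an illustration in Python, not a translation of sml.arc. It assumes SML is held as nested lists of strings, and the name sml_to_xml is hypothetical. Escaping is delegated to the standard xml.sax.saxutils helpers.

```python
from xml.sax.saxutils import escape, quoteattr


def sml_to_xml(node):
    """Serialize nested-list SML back to an XML string."""
    if isinstance(node, str):
        # Text content: escape &, < and >.
        return escape(node)
    tag, rest = node[0], node[1:]
    attrs = ""
    # An attribute list is a leading child whose first element is "@".
    if rest and isinstance(rest[0], list) and rest[0][:1] == ["@"]:
        items = rest[0][1:]
        attrs = "".join(
            f" {name}={quoteattr(value)}"
            for name, value in zip(items[0::2], items[1::2])
        )
        rest = rest[1:]
    if not rest:
        # No children: emit a self-closing tag.
        return f"<{tag}{attrs}/>"
    body = "".join(sml_to_xml(child) for child in rest)
    return f"<{tag}{attrs}>{body}</{tag}>"
```

Applied to the Step 0 example, this would produce a tagname element with both attributes and a nested tagname3 element containing the text data. No indentation is added, so a separate pretty-printer would still be needed for readable output.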

Adios, XML! I'll never need to deal with you again. I can just use SML whenever I need to work with XML files.


  1. You may want to look at http://en.wikipedia.org/wiki/SXML

  2. http://docs.plt-scheme.org/xml/index.html

  3. Thank you.
    thank you.

    This is just what I needed to see.

  4. I prefer XML over s-expression tbh, although great info.
    I did find a good book/site for brushing up on your xml for those that don't like working with it too well!
    XML For Dummies Homepage! and
    XML Dummies

    not saying that only dummies should use it, lol!

