Thursday, February 25, 2010

Simplified s-expression XML (SML)

Have you ever finished something and released it to the world and then almost immediately afterward realized a way to improve it? Well, that happened with my last blog post. I woke up in the middle of the night and realized that I don't need the damn @ sign in SML.

This will work just fine:
 (tagname attr "value" attr2 "value2"
   (tagname2)
   (tagname3 "data"))
When parsing the SML, any symbol that follows the tag name must be an attribute name, because the only other things allowed there are sublists and strings. That makes the format a lot more readable. I've updated sml.arc to use this new format.
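To make the rule concrete, here is a minimal sketch in Python (not the actual sml.arc code). The names Symbol, read_sml and to_xml are made up for illustration, string literals are assumed to have no escaped quotes, and only the new format without @ is handled.

```python
from xml.sax.saxutils import escape, quoteattr

OPEN, CLOSE = object(), object()  # sentinels, so a literal "(" string can't be confused with a paren


class Symbol(str):
    """A bare (unquoted) token, kept distinct from quoted string literals."""


def tokenize(text):
    """Turn SML text into OPEN/CLOSE markers, Symbols, and plain str literals."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == "(":
            tokens.append(OPEN)
            i += 1
        elif ch == ")":
            tokens.append(CLOSE)
            i += 1
        elif ch == '"':
            j = text.index('"', i + 1)  # assumption: no escaped quotes inside strings
            tokens.append(text[i + 1:j])
            i = j + 1
        else:
            j = i
            while j < len(text) and not text[j].isspace() and text[j] not in '()"':
                j += 1
            tokens.append(Symbol(text[i:j]))
            i = j
    return tokens


def parse(tokens, pos=0):
    """Parse the list starting at tokens[pos]; return (nested list, next position)."""
    assert tokens[pos] is OPEN
    pos += 1
    items = []
    while tokens[pos] is not CLOSE:
        if tokens[pos] is OPEN:
            item, pos = parse(tokens, pos)
        else:
            item, pos = tokens[pos], pos + 1
        items.append(item)
    return items, pos + 1


def read_sml(text):
    tree, _ = parse(tokenize(text))
    return tree


def to_xml(node):
    """Serialize one SML node: symbols are attribute names, strings are text, lists are child tags."""
    tag, rest = node[0], node[1:]
    attrs, body, i = [], [], 0
    while i < len(rest):
        item = rest[i]
        if isinstance(item, Symbol):
            # A symbol after the tag name can only be an attribute name; its value follows.
            attrs.append(f" {item}={quoteattr(rest[i + 1])}")
            i += 2
        elif isinstance(item, list):
            body.append(to_xml(item))
            i += 1
        else:
            body.append(escape(item))
            i += 1
    if not body:
        return f"<{tag}{''.join(attrs)}/>"
    return f"<{tag}{''.join(attrs)}>{''.join(body)}</{tag}>"


print(to_xml(read_sml('(tagname attr "value" attr2 "value2" (tagname2) (tagname3 "data"))')))
# prints <tagname attr="value" attr2="value2"><tagname2/><tagname3>data</tagname3></tagname>
```

The only thing the reader has to remember is whether a token was quoted, which is why the sketch wraps bare tokens in a Symbol class.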

It will also still accept the old format with the @ signs. Might as well keep it backward compatible. I hope I won't wake up in the middle of the night again tonight.

Tuesday, February 23, 2010

Using s-expressions instead of XML

Last time I needed to manipulate a large XML document, I remembered Paul Graham's comment in "What Made Lisp Different" that programs communicating with s-expressions is an idea recently reinvented as XML. I began to wonder if I could just use s-expressions instead of having to deal with XML.

Step 0: Define an s-expression representation for XML.

 (tagname (@ attr "value" attr2 "value2")
   (tagname2)
   (tagname3 "data"))
If the attributes are optional, then that requires an extra token (@) to distinguish between attributes and the first nested tag.

If the attributes are not optional, then that requires an extra token (nil) when there are no attributes specified.

Most XML documents I've used have more tags without attributes than with them, so I opted for @, which only costs an extra token on the tags that actually have attributes.

Since @ can't be a tag name, if the first thing in the list (after the tag name) is a list whose car is @ then it is the XML attributes for that tag. I dubbed this representation SML (S-expression Meta Language).
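As a rough illustration of that rule, here is a small Python sketch that works on SML already read into nested lists. The function name split_node and the nested-list representation are assumptions for the example, not part of sml.arc.

```python
def split_node(node):
    """Split an SML node into (tag, attributes, children).

    The attributes are present only when the first item after the tag name
    is a list whose first element (its car) is "@".
    """
    tag, rest = node[0], node[1:]
    attrs = {}
    if rest and isinstance(rest[0], list) and rest[0] and rest[0][0] == "@":
        pairs = rest[0][1:]
        # Names and values alternate: attr "value" attr2 "value2"
        attrs = dict(zip(pairs[0::2], pairs[1::2]))
        rest = rest[1:]
    return tag, attrs, rest


doc = ["tagname", ["@", "attr", "value", "attr2", "value2"],
       ["tagname2"],
       ["tagname3", "data"]]

tag, attrs, children = split_node(doc)
print(tag, attrs, children)
```

A tag with no attributes, like ["tagname2"], simply comes back with an empty attribute dictionary, which is the case the @ token was chosen to keep cheap.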

UPDATE: I came up with a simpler representation.

Step 1: Convert XML to s-expressions.
This seems like a job for Perl. It's great at manipulating data formats. So I wrote xml2sexp.pl which works great.

But it seems like a hack because there might be some XML syntax that it doesn't handle. XSLT was designed for transforming XML so it's a good choice for this also. So of course, I did some Googling and found this xml2sexp.xsl, but it's not complete. It can't even convert itself. So I decided to write my own. Yikes! Now I'm back to writing XML, which I was trying to avoid! I can't think of a programming language that is more unpleasant than XML. But it was a chance to learn XSLT, so I wrote xml2sexp.xsl too.
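For comparison, the same conversion can be sketched with a real XML parser doing the heavy lifting. This is not xml2sexp.pl or xml2sexp.xsl, just a minimal Python version using the standard xml.etree.ElementTree module. The function names are made up, whitespace-only text is dropped, backslash escaping inside strings is an assumed convention, and namespaces, comments and processing instructions are ignored.

```python
import xml.etree.ElementTree as ET


def quote(s):
    """Render text as a double-quoted string (backslash escaping is an assumption)."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def element_to_sml(elem):
    """Convert one element, its attributes, text and children to SML using the @ form."""
    parts = [elem.tag]
    if elem.attrib:
        pairs = " ".join(f"{name} {quote(value)}" for name, value in elem.attrib.items())
        parts.append(f"(@ {pairs})")
    if elem.text and elem.text.strip():
        parts.append(quote(elem.text.strip()))
    for child in elem:
        parts.append(element_to_sml(child))
        # Text that follows a child element lives in that child's .tail
        if child.tail and child.tail.strip():
            parts.append(quote(child.tail.strip()))
    return "(" + " ".join(parts) + ")"


def xml_to_sml(xml_text):
    return element_to_sml(ET.fromstring(xml_text))


print(xml_to_sml('<tagname attr="value" attr2="value2"><tagname2/><tagname3>data</tagname3></tagname>'))
# prints (tagname (@ attr "value" attr2 "value2") (tagname2) (tagname3 "data"))
```

Stripping whitespace makes the output tidy but loses information for documents where whitespace matters, so a faithful converter would need to keep it.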

Step 2: Convert SML back to XML.
Now I'm in the Lisp world, so I can use my Lisp of choice, which happens to be Arc at the moment. So I wrote an Arc library, sml.arc, to convert SML back to XML. There's also a function to pretty-print the SML, since the SML created by the conversion from XML is pretty ugly SML.

Adios, XML! I'll never need to deal with you again. I can just use SML whenever I need to work with XML files.

Monday, February 22, 2010

Arc (Jarc) or anything as a scripting language

I'm testing the hypothesis that Arc could be used as a scripting language. I'm using Jarc, my own Arc interpreter, of course. This is easy on OS X because the kernel splits the rest of the #! line into separate arguments before it runs the interpreter. So I can start a script with
#!/usr/bin/java -cp /usr/local/lib/jarc.jar jarc.Jarc
But it doesn't work on Linux, since there the kernel passes everything after the interpreter path as one single argument. :-( So I wrote a little C program, which was fun since I haven't written any C in over a decade. So now I can do
#!/usr/local/bin/jarc
And voilà! I can write scripts for Linux too. You can read the whopping 26 lines of jarc.c if you are interested in the not-so-fascinating details. Yeah, it'll probably need to be enhanced so I can pass JVM args also. But I haven't needed that yet, and I'm on a write-it-when-you-need-it regimen.
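To see why the java one-liner breaks, here is a small Python simulation of how the interpreter's argument list gets built from a #! line. It only models the two behaviours as I understand them (splitting on whitespace versus passing the remainder as one argument). The function name interpreter_argv and the script name hello.arc are made up for the example.

```python
def interpreter_argv(shebang_line, script_path, split_args):
    """Model the argv handed to the interpreter named on a #! line.

    split_args=True models splitting the rest of the line on whitespace;
    split_args=False models passing the rest of the line as one argument.
    """
    rest = shebang_line[2:].strip()           # drop the leading "#!"
    interpreter, _, arg_text = rest.partition(" ")
    arg_text = arg_text.strip()
    if not arg_text:
        extra = []
    elif split_args:
        extra = arg_text.split()
    else:
        extra = [arg_text]
    # The script's own path is appended after any #! arguments.
    return [interpreter, *extra, script_path]


line = "#!/usr/bin/java -cp /usr/local/lib/jarc.jar jarc.Jarc"
print(interpreter_argv(line, "hello.arc", split_args=True))
print(interpreter_argv(line, "hello.arc", split_args=False))
```

In the second case java receives "-cp /usr/local/lib/jarc.jar jarc.Jarc" as a single option it can't make sense of, which is the problem a tiny wrapper binary with no #! arguments sidesteps.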

The other ugly bit is that I actually had to change the Jarc parser. Of course, this is the great thing about writing your own language implementation: you can change whatever you want! Jarc has to ignore the first line of the file, so it treats # in line 1 column 1 as a comment character. Yes, I could have had jarc.c make a temporary file without the first line, but that seems inelegant, though much more general purpose. So this requires the latest Jarc (version 2), which I released last week on the Jarc SourceForge download page. Now whatever will I do with it?
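The first-line rule is simple enough to sketch in a few lines of Python. This is not Jarc's actual parser code (Jarc is written in Java), and strip_shebang is a made-up name. It just shows the idea of treating a # in line 1 column 1 as a comment that runs to the end of the line.

```python
def strip_shebang(source):
    """Drop the first line if the file starts with '#', keeping its newline.

    Keeping the newline means line numbers in later error messages still
    match the original file.
    """
    if source.startswith("#"):
        newline = source.find("\n")
        return "" if newline == -1 else source[newline:]
    return source


script = '#!/usr/local/bin/jarc\n(prn "hello")\n'
print(repr(strip_shebang(script)))
# prints '\n(prn "hello")\n'
```

A # anywhere else in the file is left alone, so the change doesn't affect ordinary Arc code.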

Thursday, February 11, 2010

Working for myself is a pain in the back

My first challenge as president and factotum was a pain in the back. After only a few hours at my desk I was hurting. Day after day! Yow! I've worked a desk job for decades spending half a day (12 hours) at my desk. But suddenly my new desk was leaving me twinging after a few hours.

The chair height is right, the desk height is right. I finally realized the problem was that the chair doesn't fit under the desk far enough. So I was leaning in to type, hence not sitting up straight, hence pain. Simple fix—remove the middle drawer. Ah. Now I can really work. Yea!

Monday, February 1, 2010

Working for Equity

In 1992 I was hired by a small (5-person) startup. Shortly after it went public in 1999, my equity in the company was worth more than all the salary I had earned there. Wow! Unfortunately, most of that equity disappeared in the dot-com bust. But I did learn about dollar cost averaging and diversification.

In 2001 I joined another (10-person) startup company. After 8 years there, my equity stake was also worth more than all the salary I had earned there.

So now I'm emboldened to forgo salary for a while and just work for equity at my own (1-person) company. The incorporation (e)paperwork was submitted today. I now have no salary. In this blog I'll share with you all my adventures in working for equity as I build a web site to make it easy to find all the live performances of your favorite bands.