CL-WHO

The CL-WHO library is one of many that make it easy to generate HTML. When first checking out CL-WHO, I thought that it must be at least a couple thousand lines of code long. As it turns out, it is only several hundred. At the core of CL-WHO is with-html-output (hence the name “who”), which allows one to use a DSL for generating HTML. With-html-output works like all macros. At a high level, it takes your code in the DSL, and compiles it into Lisp code that will generate the desired HTML (here are some examples).

With-html-output does little by itself. Almost all of the work is done by three functions: tree-to-templateprocess-tag, and convert-tag-to-string-list. Most of the time these functions call one another recursively in order to process the entire DSL. It is possible to customize the control flow, but I will get to that later. Here is a link to a gist of the output after tracing all of the functions and using macroexpand-1 to expand a simple example. The example only shows what happens when using basic tags in CL-WHO. It doesn’t show what happens when you embed Lisp expressions in the DSL.

Tree-to-template is the entry point into the compilation process. It loops through the DSL tree, and builds up a “template”. A template is just a list of strings and expressions. The strings in the template contain HTML and are meant to be printed directly to the HTML stream. On the other hand, the expressions contain code that will print objects to that stream. Eventually all of this output put together will be the desired HTML. As tree-to-template loops through the code, if it sees a non-tag, it will just collect that into the list.1 When it does see a tag, tree-to-template calls process-tag to process it, and then concatenates the result of that into the template.

Process-tag will extract the tag as well as the attribute list. Everything after the attribute list makes up the “body” of the tag. How is the body processed? Well, process-tag takes an additional argument, body-fn, which specifies how to process the body. Process-tag will then call convert-tag-to-string-list with the tag, the attribute list, the body, and body-fn. The reason process-tag doesn’t process the body itself is that convert-tag-to-string-list is a generic function, making it possible to customize its behavior.

Convert-tag-to-string-list handles the semantics of the tag. It takes all of the arguments above and returns a list of strings and expressions. That list will become part of the template eventually returned by tree-to-template. Since convert-tag-to-string-list is a generic function, it is possible to extend it. The documentation for CL-WHO gives an example of how one could create a custom “red” tag which changes the font of the text to red, even though there is no such HTML tag. In the default case, convert-tag-to-string-list takes the result from calling body-fn on body and surrounds that with strings for the opening and closing tags. Since convert-tag-to-string-list is customizable, it is possible to change the control flow and ultimately how the body is processed. If one wanted, they could make a call to process-tag, but with a different body-fn argument, changing how the code is processed further up (down?) the tree.

With the help of these functions with-html-output converts the DSL into a template. The template is then turned into a list of valid Lisp code.2 With-html-output then wraps the body with a macrolet which binds several local macros. These macros are: htm, fmt, esc, str. These macros make it easier to print objects to the stream used for output. Check out the documentation for CL-WHO for a more detailed description of what these macros do.

I really like CL-WHO. It is a great example of an embedded DSL. A Lisp hacker still has full access to Lisp from within what is a great DSL. The only problem I have with CL-WHO is the inability to have macros expand into code for the DSL. This decreases the flexibility of CL-WHO somewhat. The only way I can see to fix this problem would be to use a library such as :hu.dwim.walker to expand all of the macros in advance.

Once-only

One of the most common mistakes made when writing macros is evaluating one of the arguments multiple times. Not only can this be inefficient, but when side effects are involved, it leads to quirky behavior. Take a macro square, which simply squares its argument (in reality one would use a function to do this):

(defmacro square (x)
  `(* ,x ,x))

The above implementation is buggy. Why? Because the x argument is evaluated twice. To see why this is a bad thing, check out the following code:

(square (incf a))

The above winds up expanding into:

(* (incf a) (incf a))

Which is buggy since it increments a twice. A way to fix this problem is to bind the value of x to a gensym, and then use that gensym throughout the rest of the macro. Here is a bug free definition of square that uses with-gensyms:

(defmacro square (x)
  (with-gensyms (gx)
    `(let ((,gx ,x))
       (* ,gx ,gx))))

Is there a way to automate this? Yes, there is, by using a macro called once-onlyOnce-only is a relatively complicated macro, but it eliminates lots of boilerplate code. Once-only takes a list of expressions, generally arguments to a macro, and makes sure they are evaluated only once in the final macro expansion. Here is an implementation of once-only based on the one from Practical Common Lisp:

(defmacro once-only ((&rest names) &body body)
  (let ((gensyms (loop for n in names collect (gensym))))
    `(with-gensyms (,@gensyms)
      `(let (,,@(loop for g in gensyms
                      for n in names
                      collect ``(,,g ,,n)))
        ,(let (,@(loop for n in names
                       for g in gensyms
                       collect `(,n ,g)))
           ,@body)))))

In order to explain how once-only works, I’m first going to show how to rewrite square using it. From there I will show what square looks like after once-only has been expanded. After that I will show what the macro expansion of square looks like. Finally, I will give an explanation as to what is going on. If you are reading on a computer, I strongly recommed you open this page in another window so you can follow along with the code and the explanation at the same time. Here is an implementation of square that uses once-only:

(defmacro square (x)
  (once-only (x)
    `(* ,x ,x)))

Here is what square looks like after once-only has been expanded inline:

(defmacro square (x)
  (with-gensyms (#:g830)
    `(let (,`(,#:g830 ,x))
       ,(let ((x #:g830))
          `(* ,x ,x)))))

So a usage of square such as the following:

(square (incf x))

will wind up looking like the code below after macro expansion.

(let ((#:g831 (incf x)))
  (* #:g831 #:g831))

So what the heck is going on? In line 2 of once-only, it creates a list of gensyms, one for each of the expressions that should only be evaluated once. We then take these gensyms and on line 3, generate code that will bind them to fresh gensyms. That generated code becomes line 2 of square after once-only has been expanded. We need to do this because we are writing a macro that writes a macro or code that writes code that writes code. So, after once-only has been expanded, square’s body will contain a use of with-gensyms which will bind a bunch of gensyms to new gensyms every time square is ran. These fresh gensyms will eventually be the ones used to store the value of the expressions we want to be evaluated once only.

Now for lines 4-6. By using the double backquote, this code generates code that will generate code that will be part of the expansion of square. Lines 4-6 of once-only become line 3 of the definition of square, which becomes line 1 of the expansion of square. Basically the little segment

``(,,g ,,n)

says to generate code that will generate code (double backquote), that will be a list containing the value of the value of g, and the value of the value of n. The value of g will be one of the gensyms we created in once-only. From line 3 of square after once-only has been expanded, we see that this gensym was #:g830. The value of #:g830 will be another gensym, whatever it was bound to by with-gensyms. From the code will can see that this gensym was #:g831. The value of n will be one of the arguments to once-only. From the original code for square we see that the only argument to once-only is x. Then the value of x, or the value of the value of n, will be whatever is passed as the argument to the square macro, in this case (incf x). Ultimately the code looks like this as it goes through the multiple expansions:

``(,,g ,,n) => `(,#:g830 ,x) => (#:g831 (incf x))

Lines 4-6 take a list of expressions similar to those in the middle of the above process, splices them into a let by using the comma-at, then evaluates each one of them by using the comma in order to evaluate them once more. This works because the single comma in ,,@ actually applies to every element in the spliced list. Here is an example that demonstrates this:

``(,,@ '(x y z)) => `(,x ,y ,z)

Then on line 3 of square after once-only has been expanded, we wind up with the comma followed by a backquote which wind up canceling each other out. So this is how lines 4-6 of once-only get us line 3 of square which then gives us line 1 of the expansion of square.

Now for lines 7-10 of once-only. These lines generate lines 4 and 5 of the code for square after once-only has been expanded. All these lines do is generate code that will bind the given names to the gensyms that will contain their values at runtime. In this case we want to bind x to the gensym #:g831. Since the value of #:g830 is #:g831, we can just bind x to the value of #:g830. Then we just evaluate the body in this environment. By doing this, we bind x to an expression that will give us the same value as the expression previously contained in x! And that is how once-only ultimately works. In the expansion of square, we bind #:g831 to the value of (incf x). Then we bind x to #:g831 so any where we insert the expression x, we get #:g831, a gensym which is bound to the value of the expression that was initially bound to x, but only evaluated once.

Ultimately, once-only is a fairly useful macro. Like with-gensyms it is a utility for writing other macros. Once-only greatly reduces boiler plate and complexity in cases where it is used. It is because of these reasons once-only is one of the most popular macros out there.

Automatically Binding Gensyms

One of the most common macros that almost everyone keeps in their utilities file is with-gensymsWith-gensyms is a macro that binds a list of variables to gensyms. That’s it! All with-gensyms does it take a list of symbols and generates code which binds each of those symbols to a gensym.  Although with-gensyms is simple, it removes a lot of boiler plate code. Here is a simple implementation of with-gensyms:

(defmacro with-gensyms (vars &body body)
  `(let ,(loop for v in vars collect `(,v (gensym)))
     ,@body))

Looking at my implementation of accum, here is how one could simplify it by using with-gensyms. Pay attention to how much boiler plate is removed.

(defmacro accum (accfn &body body)
  (with-gensyms (ghead gtail garg)
    `(let* ((,ghead (list nil))
            (,gtail ,ghead))
       (macrolet ((,accfn (,garg)
                    `(setf ,',gtail
                           (setf (cdr ,',gtail)
                                 (list ,,garg)))))
         ,@body
         (cdr ,ghead)))))

By removing so much boiler plate, with-gensyms helps greatly reduce the cognitive load in certain cases. This will be important when I introduce once-only, the next macro I plan to talk about. There are also other variations of with-gensyms such as the one in Alexandria which makes it easier to have base names associated with the gensyms created.

Efficiently Building Lists

It is a common problem to need to build up a list of objects and keep all of the objects in the same order they were generated in. The idiom I see everyone use to solve this problem is to push the objects onto the head of a list as they are generated. Unfortunately, doing this reverses the initial order of the elements, so in order to preserve the ordering, a call to either reverse or nreverse is needed. For example, here is code from Alexandria that converts an alist into a plist: 1

(defun alist-plist (alist)
  (let (plist)
    (dolist (pair alist)
      (push (car pair) plist)
      (push (cdr pair) plist))
    (nreverse plist)))

A better method to build up a list is to add the elements to the tail of the list. By adding them to the tail, you avoid having to reverse the list later. This is the same technique that some implementations of Loop use for the “collect” clause in order to efficiently collect objects into lists.  In order to build up a list this way, you need to keep track of the head of the list (what you are ultimately going to return) and the tail of the list (where you add the next element). Doing all of that by hand would be painful. Instead we can write a macro to handle that for us automagically. Here is an implementation of accum2, a macro which makes accumulating elements easy and efficient.

(defmacro accum (accfn &body body)
  (let ((ghead (gensym "HEAD"))
        (gtail (gensym "TAIL"))
        (garg  (gensym "ARG")))
    `(let* ((,ghead (list nil))
            (,gtail ,ghead))
       (macrolet ((,accfn (,garg)
                    `(setf ,',gtail
                           (setf (cdr ,',gtail)
                                 (list ,,garg)))))
         ,@body
         (cdr ,ghead)))))

Accum is an example of a macro that writes a macro, or code that writes code that writes code. In order to explain how accum works, I am going to show how to rewrite alist-plist using accum and then explain how accum expands into code that does what we desire. Here is an implementation of alist-plist that uses accum instead of using nreverse:

(defun alist-plist (alist)
  (accum a
    (dolist (pair alist)
      (a (car pair))
      (a (cdr pair)))))

The above is both simpler and more efficient than the original version! Now for an explanation as to how it works. Accum works by generating a macrolet binding. Macrolet allows for the creation of lexically scoped macros. The lexically scoped macro is named by whatever is passed in as the “accfn” argument to accum, in this case “a”. That macro, named “a”, will take its argument and generate code that will evaluate the argument, and add the result of that to the tail of the list. Similar to how push is a macro that takes an argument and generates code that adds the result of evaluating that to the front of the list passed in as its second argument. After accum is expanded, the code will look like the following:

(defun alist-plist (alist)
  (let* ((#:head1460 (list nil))
         (#:tail1461 #:head1460))
    (macrolet ((a (#:arg1462)
                 `(setf #:tail1461
                        (setf (cdr #:tail1461)
                              (list ,#:arg1462)))))
      (dolist (pair alist)
        (a (car pair))
        (a (cdr pair)))
      (cdr #:head1460))))

As you should be able to see, “a” becomes a local macro which does what I described above.

I find accum to be one of many similar macros. Macros that abstract out a common pattern, and at the same time, are more efficient than what a human would normally write by hand. While many of these macros may be trivial, they both simplify code and make it faster!

The Ret Macro

The ret macro is simple, but useful. It allows you to bind a variable and simultaneously specify the final value of that variable as the result of the ret expression, hence the name “ret”, a blend of let and return. Here is the implementation of ret:

(defmacro ret (var val &body body)
  `(let ((,var ,val))
     ,@body
     ,var))

And now for an example. If you want to write a function that converts a hash-table into an alist, here is how you could do that using ret.

(defun table->alist (table)
  (ret result '()
    (maphash (lambda (k v)
               (push (cons k v) result))
             table)))

Personally, I find code using ret to be much clearer than the equivalent code using let. When writing the let version, I commonly find myself forgetting to return the value I want to at the end. Ret eliminates this problem by returning the result for me. The only downside I see to using ret is that it has the same problem as all of the “do” macros (dotimes, dolist, etc), they change where the resulting value comes from. Even so, this problem is much more mild with ret because ret will always change the control flow, whereas the “do” macros may or may not change the final value depending on the number of arguments passed to them.