Loops in Lisp Part 4: Series

This is part four of Loops in Lisp. Follow one of the following links for part one, two, or three).

One of the many advantages of programming in a functional style (by this, I mean manipulating your data through the operations, map, fold, and filter) is that your program winds up being made up a bunch of tiny and composable pieces. Since each piece is so small, usually only a few lines each, it becomes trivial to unit test the entire program. Additionally, it is easy to express new features as just the composition of several existing functions. One disadvantage of programming through map and friends, is that there is fairly large time penalty for allocating the intermediate results. For example, every time filter is called on a list, a new list needs to be allocated. These costs add up pretty quickly and can make a functional program much slower than its imperative equivalent.

One solution to this problem is laziness. Instead of allocating a new list every time an operation is performed on a list, you instead keep track of all of the transformations made on the list. Then when you fold over the list, you perform all of the transformations as you are folding over it. By doing this, you don’t need to allocate intermediate lists. Although laziness doesn’t allocate any intermediate lists, there is still a small cost for keeping track of the laziness. An alternative solution that makes functional programming just as fast as imperative programming is provided by the Series library.1 Series lets you write your program in a functional style without any runtime penalty at all!

Personally, the Series library is my favorite example of the magic that can be pulled off with macros. In short, Series works by taking your functional code and compiling it down into a single loop. In this loop, there is one step per transformation performed on the original list. The loop iterates over the values of the original sequence on at a time. On each iteration, the loop takes a single element, performs all of the transformations performed on the list on that single element, and then accumulates that value into the result according to the folding operation. This loop requires no additional memory allocation at runtime, and their is no time penalty either! As an example, here is a program that sums the first N squares, written using Series:

(defun integers ()
  "Returns a 'series' of all of the natural numbers."
  (declare (optimizable-series-function))
  (scan-range :from 1))

(defun squares ()
  "Returns a 'series' of all of the square numbers."
  (declare (optimizable-series-function))
  (map-fn t 
          (lambda (x) (* x x)) 
          (integers)))

(defun sum-squares (n)
  "Returns the sum of the first N square numbers."
  (collect-sum (subseries (squares) 0 n)))

(sum-squares 10)
=> 385

The above code certainly looks functional, there are no side effects in sight. Now let’s look at the code generated by Series. Here is what the macroexpansion of collect-sum looks like:

(common-lisp:let* ((#:out-969 n))
  (common-lisp:let ((#:numbers-966
                     (coerce-maybe-fold (- 1 1) 'number))
                    #:items-967
                    (#:index-965 -1)
                    (#:sum-959 0))
    (declare (type number #:numbers-966)
             (type (integer -1) #:index-965)
             (type number #:sum-959))
    (tagbody
       #:ll-970
       (setq #:numbers-966
             (+ #:numbers-966
                (coerce-maybe-fold 1 'number)))
       (setq #:items-967
             ((lambda (x) (* x x)) #:numbers-966))
       (incf #:index-965)
       (locally
          (declare (type nonnegative-integer #:index-965))
         (if (>= #:index-965 #:out-969)
             (go end))
         (if (< #:index-965 0)
             (go #:ll-970)))
       (setq #:sum-959 (+ #:sum-959 #:items-967))
       (go #:ll-970)
     end)
    #:sum-959))

What series does it looks at the entire lifetime of the sequence from its creation until it is folded. It uses this information to build the above loop which simultaneously generates the original sequence, maps over it, filters elements out of it, and folds it into the final result. Here is the breakdown of the expansion. Lines 1-9 are just initialization. They define all of the variables the loop will be using and set them to their starting values. The important variables to keep track of are #:NUMBERS-966, #:ITEMS-967, and #:SUM-959. As the code “iterates” over the original sequence, #:NUMBERS-966 is the value of the original sequence, #:ITEMS-967 is the square of that value, and #:SUM-959 is the sum of the squares so far. The rest of the code is the actual loop.

The loop first takes #:NUMBERS-966, the previous value of the sequence, and increments it in order to set it to current value of the sequence (since the sequence is the range from 1 to infinity). Next the loop takes the square of #:NUMBERS-966 to get the ith square number and stores that in #:ITEMS-967. Then the loop checks if it ha taken more than N elements out of the sequence, and if so, terminates. Finally the loop takes the value in #:ITEMS-967 and accumulates that into #:SUM-959.

Although the imperative version is equivalent to the original functional code, it is much faster than the functional code if the functional code were to allocate intermediate results or use laziness. This idea of turning transformations on a list into a loop doesn’t just work for this simple example, it also works for much more complicated programs. I just find it incredible that Series is able to take such pretty code and compile it into code that is extremely fast.

Loops in Lisp Part 3: Iterate

This is part 3 of Loops in Lisp. For part 1 on how you can build any kind of looping construct you want out of just goto and macros, click here. For part 2 on Loop, click here.

The Iterate library is pretty awesome. It provides a macro iterate (and an alias for it, iter) that is basically a Lispy version of loop. The most obvious consequence of this is that iterate uses a lot more parens than loop does:

;; Loop code
(loop for i from 1 to 10
      collect i)

;; Iterate code
(iter (for i from 1 to 10)
      (collect i))

Even though all of the extra parens make iterate much uglier than loop, they give iterate all of the advantages of Lisp syntax. One such advantage is the ability to embed iterate clauses within Lisp code and vice versa. While you can’t do this with loop, you can do it with iterate because the syntax of iterate is so similar to the syntax of ordinary Lisp code. Here is what happens when you try to embed a collect clause within Lisp code with loop and with iterate:1

;; Not valid loop code.
(loop for i from 1 to 10
      do (when (evenp i)
           (collect i)))

;; Valid iterate code
(iter (for i from 1 to 10)
      (when (evenp i)
        (collect i)))

Although the ability to seamlessly go between Lisp code and iterate is pretty awesome, the greatest feature provided by iterate is also the entire reason why Lisp syntax has so many parens in the first place. Lisp syntax (and by extension iterate) makes it easy to write macros! Because of this, you can add pretty much any feature you want to iterate. As a simple example, here’s how you could define an iterate clause specifically for looping over the digits of a number:2

(defun digits (n)
  "Returns a list of the digits of N."
  (map 'list #'digit-char-p (princ-to-string n)))

(defmacro-clause (for var in-digits-of n)
  `(for ,var in (digits ,n)))

And here is how you would use it:

(iter (for i in-digits-of 123)
      (sum i))
=> 6

I cannot express how awesome this is. If you want an iterate clause for iterating over SQL queries, you can add it. If you want an iterate clause for looping over your own custom data structure, you can add it. You can add any feature you want all because iterate allows for the use of macros!

Personally, I prefer to use iterate over loop. Even though it is uglier, it is much more extensible than loop because it decides to use a Lisp like syntax.

Loops in Lisp Part 2: Loop

This is part 2 of Loops in Lisp. Click here to view the previous post on how you can build any iteration abstraction you want out of just goto and macros.

The loop macro is probably the most well known of all macros. It provides a DSL for performing any kind of iteration imaginable. To give you an idea of just how powerful loop is, here are the first two Project Euler problems, solved using just loop:

;; Solution for problem #1.
(loop for i from 1 below 1000
      if (or (= 0 (mod i 3))
             (= 0 (mod i 5)))
        sum i)

;; Solution for problem #2.
(loop for a = 1 then (+ a b)
      and b = 0 then a
      while (< a 4000000)
      if (evenp a)
        sum a)

The coolest part of loop is that it is just a macro! That means it would be possible to build loop in Common Lisp, even if it wasn’t provided as a builtin (here is one such implementation). That also means any loop code is eventually compiled down to goto! For example, here is the expansion of the solution to the first Project Euler problem:

(block nil
  (let ((i 1))
    (declare (type (and real number) i))
    (let ((#:loop-sum-2482 0))
      (declare (type number #:loop-sum-2482))
      (tagbody
       sb-loop::next-loop
        (if (>= i '1000)
            (progn (go sb-loop::end-loop))
            nil)
        (if (let ((#:g2483 (= 0 (mod i 3))))
              (if #:g2483
                  #:g2483
                  (the t (= 0 (mod i 5)))))
            (setq #:loop-sum-2482 (+ #:loop-sum-2482 i)))
        (setq i (1+ i))
        (go sb-loop::next-loop)
       sb-loop::end-loop
        (return-from nil #:loop-sum-2482)))))

If you look carefully, the expansion is nothing more than a mix of a few gotos and conditionals. Also, even though the generated code is a complete mess, you are able to work with it through interface provided by loop. Even though loop is fairly complex, it is still much simpler than raw gotos. If you think about it, loop is really just a convenient way of specifying a combination of patterns of gotos and conditionals.

I don’t have much to add about loop that others haven’t already said. If you are looking for a basic introduction to loop you should read Peter Seibel’s guide which can be found here. If you are looking for a more complete reference, check out the loop chapter in Common Lisp the Language which can be found here.

While all of the features of loop compose well with each other, they do not compose well with the rest of Common Lisp. You cannot embed a loop clause (e.g. collect) within ordinary lisp code. That brings us to what will be next week’s topic, iterateIterate is basically a more lispy version of loop. It allows you to seamlessly go between iterate clauses and regular Lisp code. More importantly, iterate allows you to define macros that then become part of the iterate DSL!

Loops in Lisp Part 1: Goto

At its core, Common Lisp provides two primitives for performing iteration. The first of those primitives is recursion. Recursion is an amazing technique, but in this post I am going to focus on the other primitive – goto.

Goto is extremely powerful. It lets you manipulate the control flow of your program in anyway you can think of. This freedom to do whatever you want is also what makes goto so dangerous. In any given piece of code that uses goto, it is difficult to tell what the purpose of the goto is because it could be used for so many different reasons. Because of this, most languages provide various kinds of builtin loops instead of providing raw goto. Even though loops aren’t as general as goto, they express the intention of the code much more clearly.

As an example, let’s say you want to print all of the characters in a file. If your language provided while loops, you could do this by printing characters from the file one at a time while there are more characters left. If Common Lisp had while loops,1 the code for this procedure would look like this:

(while (peek-char file nil nil)
  (write-char (read-char file)))

If your language only had goto, it becomes much more difficult to implement the procedure. In the end, you have to, in some way, simulate a while loop. One way to code the procedure with just goto is the following. First check if there are any characters left in the file. If there aren’t any, goto the end. Otherwise print the next character and go back to the start. Here is Common Lisp code that implements this:2

(tagbody
  start
  (if (not (peek-char file nil nil))
      (go end))
  (write-char (read-char file))
  (go start)
  end)

Not only is the version with goto much more verbose, it is also much harder to understand. The code lacks clarity because goto is so general. It gives you no context into how it is being used. The reader of the code will have to think about the positioning of all of the gotos before they can think about the overall flow of the program. On the other hand, in the version with the while loop, merely the fact that a while loop is being used gives whoever is reading the code a decent idea of the control flow.

In reality all loops are eventually compiled down to gotos. Whenever the compiler for a language that provides loops sees a loop, it generates code that simulates the loop through goto. You can do the same thing with Lisp macros!

If you don’t know, Lisp macros are compile time functions which take code as their input and return code as their output. When Lisp code is being compiled, all of the macros in the code are called and each one is replaced with its result. This means you can write a macro that looks like a while loop when you use it, but at compile time generates code to simulate a while loop through goto. You are in effect adding while loops to the Lisp compiler! Here is code that defines such a macro:

(defmacro while (test &body body)
  (let ((gtop (gensym))
        (gend (gensym)))
    `(tagbody
       ,gtop
       (if (not ,test)
           (go ,gend))
       ,@body
       (go ,gtop)
       ,gend)))

With this macro, the first code example is now valid lisp code! The while macro takes as arguments a test and a body. It then generates code that uses the method used in the second example to simulate a while loop with goto. You can actually see what the first example looks like after expanding the macro by using the function macroexpand. Here is what the generated code looks like:

(tagbody
  #:g729
  (if (not (peek-char file nil nil))
      (go #:g730))
  (write-char (read-char file))
  (go #:g729)
  #:g730)

The generated code is the exact same as the code in the second example except for the names of the labels. This means the two examples are the same functionally! The only real difference between them is that the first one is expressed in terms of loops, and the second one is expressed in terms of goto. Since it is so much easier to think in terms of loops than goto, there is no reason why you wouldn’t use the first example over the second.

Macros allow you to build any feature you want as long as it is possible to simulate that feature through lower level features. With respect to goto, this means you can build any kind of control flow construct you want by simulating it with goto and then putting a macro on top. In Common Lisp, all of the looping constructs (do, do*, dotimes, dolist, loop) are really just macros that expand into goto. This is what Alan Kay meant when he said “Lisp isn’t a language, it’s a building material”. It bears repeating. In Lisp, you can build any feature you want as long as it is possible to simulate that feature in terms of lower level features.