How to Generate Self-Referential Programs

In this post, I am going to show you how to write programs that are self-referential. By self-referential, I mean programs that are able to obtain their own source code without any external input; in other words, they don’t simply read their own source files. This post is based on section 6.1 of the book Introduction to the Theory of Computation.

Before we can start generating self-referential programs we are first going to need some techniques for generating programs in general. The first technique we need is a method of taking a given program and writing a second program that outputs the given program. As an example, given (+ 2 2), we would need to write a program that outputs (+ 2 2). In most languages this is easy. One way to do it in Lisp is to put a quote in front of the program:

'(+ 2 2)
=> (+ 2 2)

We are also going to need a function that automates this process. Such a function would take a program as its argument and return a new program that, when run, outputs the program that was originally passed to the function. In most languages, doing this is fairly tricky. In Lisp, we can write this function easily using backquote:

(defun code-that-generates (program)
  `',program)

(code-that-generates '(+ 2 2))
=> '(+ 2 2)

If you don’t understand how backquote works, you can read this. Even though it’s for Emacs Lisp, everything there is still applicable to other Lisps. Just make sure that you understand that code-that-generates can be used to generate a program that outputs a given program.
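
To convince yourself that this works, you can evaluate the program returned by code-that-generates and check that running it yields the original program:

(eval (code-that-generates '(+ 2 2)))
=> (+ 2 2)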

Now that we have these two techniques, we can begin writing programs that are able to refer to themselves. The first self-referential program we will write will be an example of a quine. If you don’t know, a quine is a program that outputs its own source code. The quine we are going to write is made up of two parts, part A and part B, where part A is a function that is applied to part B:

(A B)

To describe how the quine works, it is easiest to start with part B. All that part B needs to do is return the source code of part A:

(A 'A)

Part A’s job is to take its own source code, and use it to obtain the source code of the entire quine. Since B is a program that outputs A, A can use code-that-generates on its own source code in order to obtain the source code of B. Once A has the source code of both A and B, it becomes trivial to combine the two to obtain the source code of the entire quine. Here is the complete quine, with the call to code-that-generates inlined:

((lambda (a)
   (let ((b `',a))
     `(,a ,b)))
 '(lambda (a)
    (let ((b `',a))
      `(,a ,b))))
=>
((lambda (a)
   (let ((b `',a))
     `(,a ,b)))
 '(lambda (a)
    (let ((b `',a))
      `(,a ,b))))

Now this is where things start getting interesting. A quine can be thought of as a program that generates its own source code and immediately returns it. What if, instead of immediately returning its own source code, the quine applied a function to it first and then returned the result? The steps for building such a program are almost exactly the same as the steps we took for building the quine. This time, there is a third part F, for the function we want to call. The structure of the program will look like the following:

(F AB)

Where AB has a similar structure to our quine. After breaking AB into the two parts, A and B, the program looks like the following:

(F (A B))

Part B in the above program has the same responsibility as B in the quine. It returns the source code for A:

(F (A 'A))

Then once A has the source code for itself, it can use code-that-generates to obtain the source code for B. Now that it has the source of A and B, it is easy for it to construct AB. Once part A has the code for AB, it can easily generate the source of the entire program. Here is what the program becomes after filling in everything except F:

(F
 ((lambda (a)
    (let ((b `',a))
      (let ((ab `(,a ,b)))
        `(F ,ab))))
  '(lambda (a)
     (let ((b `',a))
       (let ((ab `(,a ,b)))
         `(F ,ab))))))

What makes this so awesome is that F can be any function we want, and the above program will run F with the source code of the entire program! For example, replacing F with identity causes the program to become a quine:

(identity
 ((lambda (a)
    (let ((b `',a))
      (let ((ab `(,a ,b)))
        `(identity ,ab))))
  '(lambda (a)
    (let ((b `',a))
      (let ((ab `(,a ,b)))
        `(identity ,ab))))))
=>
(identity
 ((lambda (a)
    (let ((b `',a))
      (let ((ab `(,a ,b)))
        `(identity ,ab))))
  '(lambda (a)
    (let ((b `',a))
      (let ((ab `(,a ,b)))
        `(identity ,ab))))))

But we can also do some much more impressive things. We can replace F with a function that lists its argument twice, and get a program that returns a list containing its own source code twice:

((lambda (x) (list x x))
 ((lambda (a)
    (let ((b `',a))
      (let ((ab `(,a ,b)))
        `((lambda (x) (list x x)) ,ab))))
  '(lambda (a)
     (let ((b `',a))
       (let ((ab `(,a ,b)))
         `((lambda (x) (list x x)) ,ab))))))

=>

(((lambda (x) (list x x))
  ((lambda (a)
     (let ((b `',a))
       (let ((ab `(,a ,b)))
         `((lambda (x) (list x x)) ,ab))))
   '(lambda (a)
      (let ((b `',a))
        (let ((ab `(,a ,b)))
          `((lambda (x) (list x x)) ,ab))))))
 ((lambda (x) (list x x))
  ((lambda (a)
     (let ((b `',a))
       (let ((ab `(,a ,b)))
         `((lambda (x) (list x x)) ,ab))))
   '(lambda (a)
      (let ((b `',a))
        (let ((ab `(,a ,b)))
          `((lambda (x) (list x x)) ,ab)))))))

To make writing these self-referential programs easier, we can define a function that fills in F for us. It just requires a little nested backquote trickery.1

(defun self-referential-version-of (f)
  `(,f
     ((lambda (a)
        (let ((b `',a))
          (let ((ab `(,a ,b)))
            `(,',f ,ab))))
       '(lambda (a)
          (let ((b `',a))
            (let ((ab `(,a ,b)))
              `(,',f ,ab)))))))

(self-referential-version-of '(lambda (x) (list x x)))
=>
((lambda (x) (list x x))
 ((lambda (a)
    (let ((b `',a))
      (let ((ab `(,a ,b)))
        `(,'(lambda (x) (list x x)) ,ab))))
  '(lambda (a)
     (let ((b `',a))
       (let ((ab `(,a ,b)))
         `(,'(lambda (x) (list x x)) ,ab))))))

Now that we’ve got a function that can generate self-referential programs for us, I am going to show you how to build something called a quine-relay. A quine-relay is like a normal quine, except it passes through multiple languages. The quine-relay we are going to write is a Lisp program that outputs a C program that outputs the original Lisp program. All we have to do is write a function that takes its argument and prints a C program which, in turn, prints that argument. Then we can pass that function to self-referential-version-of to get the quine-relay. That’s it! Here is a program that will generate the quine-relay:

(self-referential-version-of
  '(lambda (self)
     (format t

"#include <stdio.h>~%int main(){printf(\"%s\",~(~s~));}"

             (remove #\newline (prin1-to-string self)))))

I’ve omitted the actual quine-relay for brevity, but you can find it here if you are curious. There are a few idiosyncrasies in the above program and in the quine-relay because of the differences in behavior between Lisp and C. For example, C string literals can’t span multiple lines, so it is easier to remove all of the newlines from the Lisp program than to keep them.

And that’s all it takes to write self-referential programs. After seeing how easy it is to generate a quine-relay, it shouldn’t be hard to imagine how to write one with many more steps. You may even be able to get up to 100 if you work at it long enough.

Loops in Lisp Part 4: Series

This is part four of Loops in Lisp. Follow one of the following links for part one, two, or three.

One of the many advantages of programming in a functional style (by this, I mean manipulating your data through the operations map, fold, and filter) is that your program winds up being made up of a bunch of tiny, composable pieces. Since each piece is so small, usually only a few lines each, it becomes trivial to unit test the entire program. Additionally, it is easy to express new features as just the composition of several existing functions. One disadvantage of programming through map and friends is that there is a fairly large time penalty for allocating the intermediate results. For example, every time filter is called on a list, a new list needs to be allocated. These costs add up pretty quickly and can make a functional program much slower than its imperative equivalent.

One solution to this problem is laziness. Instead of allocating a new list every time an operation is performed on a list, you keep track of all of the transformations made on the list. Then, when you fold over the list, you perform all of the transformations as you are folding over it. By doing this, you don’t need to allocate intermediate lists. Although laziness avoids the intermediate lists, there is still a small cost for keeping track of the deferred transformations. An alternative solution, one that makes functional programming just as fast as imperative programming, is provided by the Series library.1 Series lets you write your program in a functional style without any runtime penalty at all!
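
To make the idea of laziness concrete, here is a toy sketch (just an illustration of the idea, not how Series or any real lazy-sequence library is implemented): a "lazy" sequence is a generator function plus the composition of all of the transformations queued up so far, and nothing is computed until the fold drives the loop.

(defstruct lazy gen xform)

(defun lazy-range (from)
  "An infinite lazy sequence FROM, FROM+1, ..."
  (let ((n (1- from)))
    (make-lazy :gen (lambda () (incf n)) :xform #'identity)))

(defun lazy-map (f seq)
  "Defer F by composing it onto the pending transformations."
  (make-lazy :gen (lazy-gen seq)
             :xform (let ((g (lazy-xform seq)))
                      (lambda (x) (funcall f (funcall g x))))))

(defun lazy-fold (f init seq n)
  "Fold the first N elements, applying the deferred transformations
as each element is produced, without building intermediate lists."
  (let ((acc init))
    (dotimes (i n acc)
      (setf acc (funcall f acc
                         (funcall (lazy-xform seq)
                                  (funcall (lazy-gen seq))))))))

(lazy-fold #'+ 0 (lazy-map (lambda (x) (* x x)) (lazy-range 1)) 10)
=> 385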

Personally, the Series library is my favorite example of the magic that can be pulled off with macros. In short, Series works by taking your functional code and compiling it down into a single loop. In this loop, there is one step per transformation performed on the original list. The loop iterates over the values of the original sequence one at a time. On each iteration, the loop takes a single element, applies all of the transformations to that single element, and then accumulates the value into the result according to the folding operation. This loop requires no additional memory allocation at runtime, and there is no time penalty either! As an example, here is a program that sums the first N squares, written using Series:

(defun integers ()
  "Returns a 'series' of all of the natural numbers."
  (declare (optimizable-series-function))
  (scan-range :from 1))

(defun squares ()
  "Returns a 'series' of all of the square numbers."
  (declare (optimizable-series-function))
  (map-fn t 
          (lambda (x) (* x x)) 
          (integers)))

(defun sum-squares (n)
  "Returns the sum of the first N square numbers."
  (collect-sum (subseries (squares) 0 n)))

(sum-squares 10)
=> 385

The above code certainly looks functional; there are no side effects in sight. Now let’s look at the code generated by Series. Here is what the macroexpansion of collect-sum looks like:

(common-lisp:let* ((#:out-969 n))
  (common-lisp:let ((#:numbers-966
                     (coerce-maybe-fold (- 1 1) 'number))
                    #:items-967
                    (#:index-965 -1)
                    (#:sum-959 0))
    (declare (type number #:numbers-966)
             (type (integer -1) #:index-965)
             (type number #:sum-959))
    (tagbody
       #:ll-970
       (setq #:numbers-966
             (+ #:numbers-966
                (coerce-maybe-fold 1 'number)))
       (setq #:items-967
             ((lambda (x) (* x x)) #:numbers-966))
       (incf #:index-965)
       (locally
          (declare (type nonnegative-integer #:index-965))
         (if (>= #:index-965 #:out-969)
             (go end))
         (if (< #:index-965 0)
             (go #:ll-970)))
       (setq #:sum-959 (+ #:sum-959 #:items-967))
       (go #:ll-970)
     end)
    #:sum-959))

What Series does is look at the entire lifetime of the sequence, from its creation until it is folded. It uses this information to build the above loop, which simultaneously generates the original sequence, maps over it, filters elements out of it, and folds it into the final result. Here is the breakdown of the expansion. Lines 1-9 are just initialization. They define all of the variables the loop will be using and set them to their starting values. The important variables to keep track of are #:NUMBERS-966, #:ITEMS-967, and #:SUM-959. As the code “iterates” over the original sequence, #:NUMBERS-966 is the current value of the original sequence, #:ITEMS-967 is the square of that value, and #:SUM-959 is the sum of the squares so far. The rest of the code is the actual loop.

The loop first takes #:NUMBERS-966, the previous value of the sequence, and increments it in order to set it to the current value of the sequence (since the sequence is the range from 1 to infinity). Next the loop takes the square of #:NUMBERS-966 to get the ith square number and stores that in #:ITEMS-967. Then the loop checks if it has taken more than N elements out of the sequence, and if so, terminates. Finally the loop takes the value in #:ITEMS-967 and accumulates it into #:SUM-959.

Although the imperative version is equivalent to the original functional code, it is much faster than that code would be if it allocated intermediate results or used laziness. This idea of turning transformations on a list into a loop doesn’t just work for this simple example; it also works for much more complicated programs. I just find it incredible that Series is able to take such pretty code and compile it into code that is extremely fast.

Loops in Lisp Part 3: Iterate

This is part 3 of Loops in Lisp. For part 1 on how you can build any kind of looping construct you want out of just goto and macros, click here. For part 2 on Loop, click here.

The Iterate library is pretty awesome. It provides a macro iterate (and an alias for it, iter) that is basically a Lispy version of loop. The most obvious consequence of this is that iterate uses a lot more parens than loop does:

;; Loop code
(loop for i from 1 to 10
      collect i)

;; Iterate code
(iter (for i from 1 to 10)
      (collect i))

Even though all of the extra parens make iterate much uglier than loop, they give iterate all of the advantages of Lisp syntax. One such advantage is the ability to embed iterate clauses within Lisp code and vice versa. While you can’t do this with loop, you can do it with iterate because the syntax of iterate is so similar to the syntax of ordinary Lisp code. Here is what happens when you try to embed a collect clause within Lisp code with loop and with iterate:1

;; Not valid loop code.
(loop for i from 1 to 10
      do (when (evenp i)
           (collect i)))

;; Valid iterate code
(iter (for i from 1 to 10)
      (when (evenp i)
        (collect i)))

Although the ability to seamlessly go between Lisp code and iterate is pretty awesome, the greatest feature provided by iterate is also the entire reason why Lisp syntax has so many parens in the first place. Lisp syntax (and by extension iterate) makes it easy to write macros! Because of this, you can add pretty much any feature you want to iterate. As a simple example, here’s how you could define an iterate clause specifically for looping over the digits of a number:2

(defun digits (n)
  "Returns a list of the digits of N."
  (map 'list #'digit-char-p (princ-to-string n)))

(defmacro-clause (for var in-digits-of n)
  `(for ,var in (digits ,n)))

And here is how you would use it:

(iter (for i in-digits-of 123)
      (sum i))
=> 6

I cannot express how awesome this is. If you want an iterate clause for iterating over SQL queries, you can add it. If you want an iterate clause for looping over your own custom data structure, you can add it. You can add any feature you want, all because iterate allows for the use of macros!
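
As one more example, here is a hypothetical clause for looping over the lines of a stream. It leans on iterate’s for ... next driver and its terminate form, and it assumes the stream expression is a simple variable, since the expansion evaluates it on every iteration:

(defmacro-clause (for var in-lines-of stream)
  `(for ,var next (or (read-line ,stream nil) (terminate))))

;; (iter (for line in-lines-of some-stream)
;;       (collect line))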

Personally, I prefer to use iterate over loop. Even though it is uglier, it is much more extensible than loop because it uses a Lisp-like syntax.

Loops in Lisp Part 2: Loop

This is part 2 of Loops in Lisp. Click here to view the previous post on how you can build any iteration abstraction you want out of just goto and macros.

The loop macro is probably the most well known of all macros. It provides a DSL for performing any kind of iteration imaginable. To give you an idea of just how powerful loop is, here are the first two Project Euler problems, solved using just loop:

;; Solution for problem #1.
(loop for i from 1 below 1000
      if (or (= 0 (mod i 3))
             (= 0 (mod i 5)))
        sum i)

;; Solution for problem #2.
(loop for a = 1 then (+ a b)
      and b = 0 then a
      while (< a 4000000)
      if (evenp a)
        sum a)

The coolest part of loop is that it is just a macro! That means it would be possible to build loop in Common Lisp, even if it wasn’t provided as a builtin (here is one such implementation). That also means any loop code is eventually compiled down to goto! For example, here is the expansion of the solution to the first Project Euler problem:

(block nil
  (let ((i 1))
    (declare (type (and real number) i))
    (let ((#:loop-sum-2482 0))
      (declare (type number #:loop-sum-2482))
      (tagbody
       sb-loop::next-loop
        (if (>= i '1000)
            (progn (go sb-loop::end-loop))
            nil)
        (if (let ((#:g2483 (= 0 (mod i 3))))
              (if #:g2483
                  #:g2483
                  (the t (= 0 (mod i 5)))))
            (setq #:loop-sum-2482 (+ #:loop-sum-2482 i)))
        (setq i (1+ i))
        (go sb-loop::next-loop)
       sb-loop::end-loop
        (return-from nil #:loop-sum-2482)))))

If you look carefully, the expansion is nothing more than a mix of a few gotos and conditionals. Also, even though the generated code is a complete mess, you are able to work with it through the interface provided by loop. Even though loop is fairly complex, it is still much simpler than raw gotos. If you think about it, loop is really just a convenient way of specifying a combination of patterns of gotos and conditionals.

I don’t have much to add about loop that others haven’t already said. If you are looking for a basic introduction to loop you should read Peter Seibel’s guide which can be found here. If you are looking for a more complete reference, check out the loop chapter in Common Lisp the Language which can be found here.

While all of the features of loop compose well with each other, they do not compose well with the rest of Common Lisp. You cannot embed a loop clause (e.g. collect) within ordinary Lisp code. That brings us to what will be next week’s topic, iterate. Iterate is basically a more Lispy version of loop. It allows you to seamlessly go between iterate clauses and regular Lisp code. More importantly, iterate allows you to define macros that then become part of the iterate DSL!

Loops in Lisp Part 1: Goto

At its core, Common Lisp provides two primitives for performing iteration. The first of those primitives is recursion. Recursion is an amazing technique, but in this post I am going to focus on the other primitive – goto.

Goto is extremely powerful. It lets you manipulate the control flow of your program in any way you can think of. This freedom to do whatever you want is also what makes goto so dangerous. In any given piece of code that uses goto, it is difficult to tell what the purpose of the goto is because it could be used for so many different reasons. Because of this, most languages provide various kinds of builtin loops instead of providing raw goto. Even though loops aren’t as general as goto, they express the intention of the code much more clearly.

As an example, let’s say you want to print all of the characters in a file. If your language provided while loops, you could do this by printing characters from the file one at a time while there are more characters left. If Common Lisp had while loops,1 the code for this procedure would look like this:

(while (peek-char file nil nil)
  (write-char (read-char file)))

If your language only had goto, it becomes much more difficult to implement the procedure. In the end, you have to, in some way, simulate a while loop. One way to code the procedure with just goto is the following. First check if there are any characters left in the file. If there aren’t any, goto the end. Otherwise print the next character and go back to the start. Here is Common Lisp code that implements this:2

(tagbody
  start
  (if (not (peek-char file nil nil))
      (go end))
  (write-char (read-char file))
  (go start)
  end)

Not only is the version with goto much more verbose, it is also much harder to understand. The code lacks clarity because goto is so general. It gives you no context into how it is being used. The reader of the code will have to think about the positioning of all of the gotos before they can think about the overall flow of the program. On the other hand, in the version with the while loop, merely the fact that a while loop is being used gives whoever is reading the code a decent idea of the control flow.

In reality all loops are eventually compiled down to gotos. Whenever the compiler for a language that provides loops sees a loop, it generates code that simulates the loop through goto. You can do the same thing with Lisp macros!

If you don’t know, Lisp macros are compile time functions which take code as their input and return code as their output. When Lisp code is being compiled, all of the macros in the code are called and each one is replaced with its result. This means you can write a macro that looks like a while loop when you use it, but at compile time generates code to simulate a while loop through goto. You are in effect adding while loops to the Lisp compiler! Here is code that defines such a macro:

(defmacro while (test &body body)
  (let ((gtop (gensym))
        (gend (gensym)))
    `(tagbody
       ,gtop
       (if (not ,test)
           (go ,gend))
       ,@body
       (go ,gtop)
       ,gend)))

With this macro, the first code example is now valid Lisp code! The while macro takes as arguments a test and a body. It then generates code that uses the same approach as the second example to simulate a while loop with goto. You can actually see what the first example looks like after expanding the macro by using the function macroexpand. Here is what the generated code looks like:

(tagbody
  #:g729
  (if (not (peek-char file nil nil))
      (go #:g730))
  (write-char (read-char file))
  (go #:g729)
  #:g730)

The generated code is the exact same as the code in the second example except for the names of the labels. This means the two examples are the same functionally! The only real difference between them is that the first one is expressed in terms of loops, and the second one is expressed in terms of goto. Since it is so much easier to think in terms of loops than goto, there is no reason why you wouldn’t use the first example over the second.
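
If you want to reproduce the expansion yourself, macroexpand-1 will show it (the gensym labels will differ from run to run):

(macroexpand-1 '(while (peek-char file nil nil)
                  (write-char (read-char file))))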

Macros allow you to build any feature you want as long as it is possible to simulate that feature through lower level features. With respect to goto, this means you can build any kind of control flow construct you want by simulating it with goto and then putting a macro on top. In Common Lisp, all of the looping constructs (do, do*, dotimes, dolist, loop) are really just macros that expand into goto. This is what Alan Kay meant when he said “Lisp isn’t a language, it’s a building material”. It bears repeating. In Lisp, you can build any feature you want as long as it is possible to simulate that feature in terms of lower level features.

Defmemo

In my last post I talked about memoization i.e. caching the results of a function. Memoization is a fairly common technique for optimization. It is common enough to warrant writing a macro that makes it easy to define memoized functions. When demonstrating memoization, I had a memoized Fibonacci function that looked like this:1

(let ((table (make-hash-table)))
  (defun fib (n)
    (or (gethash n table)
        (setf (gethash n table)
              (if (<= 0 n 1)
                  n
                  (+ (fib (- n 1))
                     (fib (- n 2))))))))

There are a couple problems with the above code. One problem is the boilerplate. If you wanted ten different memoized functions, you would have to copy lines 1, 3, and 4 for every single memoized function. Some people like to call programmers who do this kind of needless duplication "human compilers", since they are writing code that the compiler should be writing for them.

Another issue with the above code is the lack of abstraction. If you wanted to change the caching mechanism to say, only cache the last hundred values, you would have to change the definition of every single function! Ideally you would only need to modify the code in one place in order to change how the caching is implemented.

Defmemo is one way to solve both of these problems. Here is what the above code would look like if it were to use defmemo:

(defmemo fib (n)
  (if (<= 0 n 1)
      n
      (+ (fib (- n 1))
         (fib (- n 2)))))
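
With this definition, calls that would be hopeless for the naive version return instantly; for example, the 100th Fibonacci number:

(fib 100)
=> 354224848179261915075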

Defmemo solves both of the problems extremely well. It removes all of the differences between the memoized version and the regular version except for having to use “defmemo” instead of “defun”. Defmemo also solves the abstraction problem by moving all of the code relevant to memoization into the body of defmemo. If you want to change how memoization works, all you have to do is change the code for defmemo.

Now for the implementation of defmemo. The implementation is made up of two separate parts. First, a higher order function, memo, which takes a function as an argument, and returns a memoized version of that function. The second part is the actual macro, defmemo. Instead of just defining the function like defun, defmemo first builds a lambda expression for the body. Then it generates code that calls memo on that lambda function. Finally defmemo uses the result of memo as the implementation of the function being defined.2

Here is the code for memo:34

(defun memo (f)
  (let ((cache (make-hash-table :test #'equalp)))
    (lambda (&rest args)
      (or (gethash args cache)
          (setf (gethash args cache)
                (apply f args))))))

Memo works by returning a function that has an internal hash-table. When that function is called, it first checks its hash-table to see if it has been called with the same arguments before. If so, it returns the value it had calculated the first time it was called.5 If it hasn’t been called with the same arguments before, the function will instead call the function that was passed in to memo, and then store the result of that inside the table. This way, if the memoized function is called with the same arguments a second time, it can just look up the result in the table.
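
This is also the single place where a different caching policy would go. For instance, here is a rough sketch of a memo that only keeps the hundred most recently added entries (just an illustration of the idea, not a production-quality eviction scheme):

(defun memo (f &key (limit 100))
  (let ((cache (make-hash-table :test #'equalp))
        (keys '()))                             ; newest keys first, oldest last
    (lambda (&rest args)
      (or (gethash args cache)
          (progn
            (when (>= (hash-table-count cache) limit)
              (remhash (car (last keys)) cache) ; evict the oldest entry
              (setf keys (butlast keys)))
            (push args keys)
            (setf (gethash args cache) (apply f args)))))))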

Next, for defmemo itself, we need to generate code that takes the body as a lambda expression, passes that lambda function through memo, and uses that as the implementation of the function. One way to set the implementation of a function to be a lambda function is to use setf with symbol-function.6 For example, here is how you could set the implementation of square to be a lambda function that squares its argument:

(setf (symbol-function 'square) (lambda (x) (* x x)))

(square 5) => 25

Based on the paragraph above, here is the code for defmemo:

(defmacro defmemo (name args &body body)
 `(setf (symbol-function ',name) 
        (memo (lambda ,args ,@body))))

Now instead of defining a function with defun, we can define it with defmemo and it will automatically be memoized! Defmemo is a great example of how you can define your own ways to define functions. Many libraries provide similar features in which you use the same syntax as defun, only with a bit of magic thrown in.

Or=

This post makes use of places. If you are unfamiliar with places, see my post Getting Places.

There are many cases where caching the results of a function (also called memoization) makes the function much more efficient. For example, a function that calculates the Fibonacci numbers:

(defun fib (n)
  (if (<= 0 n 1)
      n
      (+ (fib (- n 1))
         (fib (- n 2)))))

If you try running fib on different values, you will notice that around 35 or so, it starts to take quite a long time to run. The problem is that fib calculates the smaller Fibonacci numbers many more times than it needs to. When calculating the 35th Fibonacci number, the second Fibonacci number is calculated a total of 5702887 times.1

This is where memoization comes in. If the above function were memoized, it would only need to calculate each Fibonacci number once. Then, whenever fib is asked to calculate a number it has already calculated, it can just look up the result in the table. Here is what the above code would look like if it were to take advantage of memoization:

(let ((table (make-hash-table)))
  (defun fib (n)
    (or (gethash n table)
        (setf (gethash n table)
              (if (<= 0 n 1)
                  n
                  (+ (fib (- n 1))
                     (fib (- n 2))))))))

With the memoized version, you will hit a stack overflow before you find a value that takes more than a moment to calculate. The problem with the above implementation is that it has some duplicate code. There are two calls made to gethash. The first call checks to see if the value has already been calculated. If not, fib calculates the value manually, and then uses the second call to store it into the table. The fact that the gethash call is repeated may not seem like a problem, but when the expression for the place is more complicated, it can become a much bigger deal.

Or= is a macro that fixes this problem. It does so by first checking whether its first argument, which should be a place, has a non-nil value.2 If it does, or= will just return that value. Otherwise it evaluates its remaining arguments until one of them evaluates to a non-nil value. Or= will then write the value of that expression into the place designated by the first argument. Here is the above code rewritten to use or=:

(let ((table (make-hash-table)))
  (defun fib (n)
    (or= (gethash n table)
         (if (<= 0 n 1)
             n
             (+ (fib (- n 1))
                (fib (- n 2)))))))

The implementation of or= looks very similar to the incf ‘template’3 that is used when writing a macro that works with places. Here is the implementation of or=:

(defmacro or= (place &rest args)
  (multiple-value-bind 
        (temps exprs stores store-expr access-expr) 
      (get-setf-expansion place)
    `(let* (,@(mapcar #'list temps exprs)
            (,(car stores) (or ,access-expr ,@args)))
       ,store-expr)))

This time, the value being stored to the place is the or of the place and whatever other arguments are passed in. Since or evaluates its arguments lazily, we get the desired behavior of or= – evaluate the expression (and store the result) only if the place doesn’t have a value already. One problem with or= is that it determines if a value has already been stored in the place by testing if the value is non-nil. This can lead to a problem if the value stored in the place is actually nil! As an exercise, try writing a version of or= that takes advantage of the multiple values returned by gethash in order to properly handle nil.

In my next post, I am going to continue with the memoization example and demonstrate how to write a macro defmemo, which makes it easy to define memoized functions.

Zap

This post makes use of places. If you are unfamiliar with how places work, see my post Getting Places.

Many languages provide syntactic sugar for evaluating an expression involving a variable and assigning the result of that expression to the variable at the same time. In these languages you can do something such as the following:

x += 5

The above expression both adds five to the value of x and writes that new value back to x. In this post, I’m going to show you how you can write a macro zap that is a generalized version of this technique. With zap the above example would look like the following:

(zap #'+ x 5)

There are a couple things that make zap really cool. First of all, it can be used with any function. For example, if you wanted to cube the value in x, you could use the following:

(zap #'expt x 3)

The other thing that makes zap so awesome is that it can be used on any place. If you want to use zap on the value stored in a hash table with key 5, you can do that:

(zap #'+ (gethash 5 table) 5)

Now that you’ve seen how zap is used, here is how it can be implemented:

(defmacro zap (fn place &rest args)
  (multiple-value-bind 
        (temps exprs stores store-expr access-expr) 
      (get-setf-expansion place)
    `(let* (,@(mapcar #'list temps exprs)
            (,(car stores) 
              (funcall ,fn ,access-expr ,@args)))
       ,store-expr)))
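
To see what this generates, here is roughly what (zap #'+ x 5) expands into when x is an ordinary variable (the gensym name will vary, and a more complicated place would produce additional temporary bindings):

(let* ((#:new (funcall #'+ x 5)))
  (setq x #:new))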

You should be able to see that the code for zap is eerily similar to that of incf (from Getting Places). They are exactly the same, except that instead of binding the gensym that will hold the new value to one plus the value already in the place:

(,(car stores) (+ 1 ,access-expr))

The gensym is bound to the result of calling the function with the value in the place and all of the other arguments passed to zap:

(,(car stores) (funcall ,fn ,access-expr ,@args))

Although zap is just a nice syntactic shortcut, it is a great example of the crazy things you can do with places.

Getting Places

This post will serve as an introduction to writing macros that work with places. I will refer back to it whenever I examine a macro which deals with places.

Places are an incredible part of Common Lisp. In short, a “place” is any location that can hold a value. The obvious example of a place is a variable. Less obvious examples include the elements of an array, or the slots of an object. What makes the concept of places special is that Common Lisp provides a standard interface for reading and writing to them. You can write macros on top of this interface that work for every kind of place. As an example, look at the macro incf. It takes a place as an argument, adds one to its value, and stores the new value back into the place. If you want to increment a variable x, you would use:

(incf x)

And if you wanted to increment the element at index x of a sequence, you would use:

(incf (elt seq x))

They use the exact same syntax even though a variable is very different from an element of a sequence. Because it takes advantage of the interface for places, incf will work on any place, be it a variable, the slot of an object, or a user defined place.
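
For instance, the same syntax increments a counter stored in a hash table (using gethash’s optional default so that missing keys start at zero), or a slot read through a hypothetical point-x accessor:

(incf (gethash :hits counts 0)) ; counts is assumed to be a hash table
(incf (point-x p))              ; p is assumed to be a struct with an x slot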

So at this point you are probably wondering: how does incf work, and more generally, how do you write macros that use places? To write such a macro, you need to use the function get-setf-expansion.1 Get-setf-expansion takes an expression representing a place and returns a total of five values (if you are unfamiliar with multiple values, see my post on multiple-value-bind). Altogether, these five values tell you everything you need to know about the place in order to read and write to it.

To show you how you are supposed to use get-setf-expansion, I’m first going to demonstrate how you could use it to write the expansion of incf by hand. After that, I will show code that will automate this, which winds up being an implementation of incf. Let’s start by writing the expansion of the example above. The one where the element of a sequence is being incremented. To write the expansion of that by hand, you would first call get-setf-expansion to obtain all of the information:2

(get-setf-expansion '(elt seq x))

In SBCL this call will return the following values:

;; (1) temps
(#:seq1017 #:x1016)

;; (2) exprs
(seq x) 

;; (3) stores
(#:new1015) 

;; (4) store-expr
(sb-kernel:%setelt #:seq1017 #:x1016 #:new1015) 

;; (5) access-expr
(elt #:seq1017 #:x1016)

From now on, I will refer to each value returned by get-setf-expansion by the name in the comment before it (e.g. temps refers to the first value).

In order to uniquely identify the element of a sequence (the place we are working with in this example), you need two things. You need the sequence itself and the index into the sequence. That is exactly what the two expressions in exprs evaluate to! Since incf needs to use these values multiple times, the two values have to be bound to gensyms in order to prevent multiple evaluation (see my post on once-only for why multiple evaluation is a problem). You are supposed to bind the values of the expressions to the gensyms in temps so that the other expressions returned by get-setf-expansion can use those gensyms to easily determine the place being worked with. The bindings need to be made with let* because it is possible for an expression in exprs to refer to the value of a previous expression in exprs. So the first part of the expansion will bind all of the symbols in temps to values of the expressions in exprs with let*:

(let* ((#:seq1017 seq) (#:x1016 x))
  ...)

Now the gensyms in temps can be used to uniquely identify the place. As I mentioned previously, the other expressions can now easily determine the place through the gensyms. For example, access-expr can be used to retrieve the value currently in the place. Since the place we are dealing with is the element of a sequence,  access-expr is just a call to elt using the gensyms in temps as the arguments. We are going to use access-expr in a moment, but first I have to talk about how to write to the place.

In order to write to the place, you need to use stores and store-expr. Stores is a list of gensyms that need to be bound to the values that are to be stored in the place (it is possible for a single place to hold multiple values). In this case we want to bind the gensym in stores to one plus the value already in the place. We can easily obtain the value in the place through access-expr. Once the gensyms have been bound, you can use store-expr to actually write the values in stores to the place. Notice how store-expr is a call to an internal SBCL function, sb-kernel:%setelt, that uses the gensyms in temps and stores as arguments. Presumably sb-kernel:%setelt sets the element of a sequence. After adding the binding for the gensym in stores and store-expr, we wind up with the final expansion, which looks like:3

(let* ((#:seq1017 seq) 
       (#:x1016 x) 
       (#:new1015 (+ 1 (elt #:seq1017 #:x1016))))
  (sb-kernel:%setelt #:seq1017 #:x1016 #:new1015))
 

To review, the above code first binds the gensyms in temps to the values of the expressions in exprs. This allows access-expr and store-expr to use the gensyms in temps in order to determine the place being worked with. Then the code uses access-expr to retrieve the value, adds one to that, and binds that value to the gensym in stores. This is because the value of the gensym in stores is ultimately going to be the one written to the place. Finally the code evaluates store-expr in order to actually store the value in the gensym into the place.

Now here is one possible implementation of incf,4 which is code for everything we just did by hand. I called it incf% so that it doesn’t have the same name as the builtin version.

(defmacro incf% (place)
  (multiple-value-bind
        (temps exprs stores store-expr access-expr)
      (get-setf-expansion place)
    `(let* (,@(mapcar #'list temps exprs)
            (,(car stores) (+ 1 ,access-expr)))
       ,store-expr)))

The above code first binds the five values returned by get-setf-expansion to variables. It then generates a let* binding which binds the symbols in temps to the expressions in exprs and also binds the gensym in stores to one plus the result of evaluating access-expr. Finally the above code splices in store-expr to actually write the value. And that is everything there is to incf.
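
As a quick check that incf% behaves like the builtin on a non-variable place:

(let ((seq (list 1 2 3)))
  (incf% (elt seq 0))
  seq)
=> (2 2 3)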

Incf is but a single example of what can be done with places. In the next couple of posts, I plan to cover some really cool macros that encapsulate a bunch of common patterns related to places.

Defasm

This post is the second part of a two part series exploring the emulator cl-6502. If you haven’t read the first part exploring the implementation of addressing modes in cl-6502, you can find it here.

This post is going to go over how cl-6502 implements the instruction set of the 6502. Most of the work in defining the instruction set is done by a single macro, defasm. But before I can go into the details of defasm, I have to explain how cl-6502 represents instructions.

cl-6502 represents each instruction as a function inside an array called *opcode-funs*. The function for a specific instruction is indexed by that instruction’s opcode.1 To execute an instruction, cl-6502 looks up the opcode of the current instruction and calls the function at that location inside of *opcode-funs*. There is also a second array, *opcode-meta*, which keeps track of some metadata about each instruction, such as the number of bytes each one takes up. All defasm does is make it easy to generate all of the functions and metadata that wind up inside of those two arrays.
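
To make the dispatch concrete, executing a single instruction presumably boils down to something like the following (a hypothetical sketch, not cl-6502’s actual code):

(defun step-instruction (cpu)
  ;; Fetch the opcode at the program counter and call its handler.
  (let ((opcode (get-byte (cpu-pc cpu))))
    (funcall (aref *opcode-funs* opcode) cpu)))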

To show you just how easy it is to implement instructions with defasm, here is the implementation of the adc (add with carry) instruction:

(defasm adc (:docs "Add to Accumulator with Carry")
    ((#x61 6 2 indirect-x)
     (#x65 3 2 zero-page)
     (#x69 2 2 immediate)
     (#x6d 4 3 absolute)
     (#x71 5 2 indirect-y)
     (#x75 4 2 zero-page-x)
     (#x79 4 3 absolute-y)
     (#x7d 4 3 absolute-x))

  (let ((result (+ (cpu-ar cpu) 
                   (getter) 
                   (status-bit :carry))))
    (set-flags-if 
      :carry (> result #xff)
      :overflow (overflow-p result (cpu-ar cpu) (getter))
      :negative (logbitp 7 result)
      :zero (zerop (wrap-byte result)))
    (setf (cpu-ar cpu) (wrap-byte result))))

There are two main parts to the above code. The first part specifies all of the addressing modes the instruction is compatible with along with the metadata for each variant of the instruction (there is a different version of the instruction for every possible addressing mode the instruction can be used with).

After that is the body – the code that actually implements the instruction being defined. The body is responsible for setting all of the appropriate flags and memory locations to the values they should have after executing the instruction. Make sure you note that just like in defaddress, the variable cpu can be used in the body to reference an object that represents the current state of the cpu.

Defasm takes these two pieces, and generates one lambda expression for each variant of the instruction. All of the generated lambda expressions use the same body, except defasm generates some additional code that allows the body to work across all of the different addressing modes.

Now to get into the specifics of the DSL. In the addressing mode part of the DSL, there are four pieces of metadata that need to be associated with each version of the instruction. The first part is the opcode, the machine code representation of the instruction. Next up is the number of cycles it takes for the instruction to execute. After that is the size of the instruction, the number of bytes it takes up in memory. Last is the name of the addressing mode used for that specific variant of the instruction. As an example, here is the metadata for the adc instruction in the indirect-x addressing mode:

(#x61 6 2 indirect-x)

What it is saying is that this version of the instruction has the opcode #x61, takes six cycles to run, takes two bytes in memory, and uses the indirect-x addressing mode. The fact that an instruction uses a different number of clock cycles and takes up a different amount of space depending on its addressing mode is one reason why assembly languages provide different addressing modes in the first place.

For the body, defasm does something very clever to make the body work for every possible addressing mode. Within the body, the functions getter and setter are bound to local functions that can be used to obtain and modify the argument to the instruction. For each variant of the instruction, defasm generates the definitions of these two functions differently so that they will always calculate the correct argument for the given addressing mode.

For example, in the version of adc that uses immediate addressing, getter will just return the value of the operand, but in the version that uses absolute addressing, getter will use the operand as an address and look up the value at that location in memory. In the definition of the adc instruction above, the body uses getter to obtain the argument, adds that to the value in the accumulator, adds in the carry, and then sets all of the appropriate flags and registers depending on the final value it winds up with. Since getter and setter work across all of the different addressing modes, so does the body!

Now let’s look at the actual implementation of defasm:

(defmacro defasm (name (&key (docs "") raw-p (track-pc t))
                  modes &body body)
  `(progn

     ,@(loop for (op cycles bytes mode) in modes collect
         `(setf (aref *opcode-meta* ,op) 
                ',(list name docs cycles bytes mode)))

     ,@(loop for (op cycles bytes mode) in modes collect
         `(setf (aref *opcode-funs* ,op)
                (lambda (cpu)
                  (incf (cpu-pc cpu))
                  (flet ((getter ()
                           ,(make-getter name mode raw-p))
                         (setter (x)
                           (setf (,mode cpu) x)))
                    ,@body)
                  ,@(when track-pc
                     `((incf (cpu-pc cpu) ,(1- bytes))))
                  (incf (cpu-cc cpu) ,cycles))))))

As usual, I’m going to show a snippet of the implementation of defasm and then show what the macroexpansion of that piece looks like. The first part of the implementation handles the addressing modes and metadata:

(loop for (op cycles bytes mode) in modes collect
  `(setf (aref *opcode-meta* ,op) 
         ',(list name docs cycles bytes mode)))

For each addressing mode, this generates code which will store a list containing the metadata into the proper place in the *opcode-meta* array. In other words it takes each part that looks like:

(#x61 6 2 indirect-x)

and generates code that looks like:

(setf (aref *opcode-meta* #x61)
      '(adc "Add to Accumulator with Carry" 6 2 indirect-x))

After that we have the part that will generate the actual lambda expressions for the functions that will be stored in *opcode-funs*:

(loop for (op cycles bytes mode) in modes collect
  `(setf (aref *opcode-funs* ,op)
         (lambda (cpu)
           (incf (cpu-pc cpu))
           (flet ((getter ()
                   ,(make-getter name mode raw-p))
                 (setter (x)
                   (setf (,mode cpu) x)))
             ,@body)
          ,@(when track-pc
              `((incf (cpu-pc cpu) ,(1- bytes))))
          (incf (cpu-cc cpu) ,cycles))))

This code loops over all of the metadata for the different addressing modes and uses this information to generate the expression for each variant of the instruction. As mentioned previously, the function will be stored by the variant’s opcode. As for the actual function itself, it does something along these lines. First, it advances the pc. This is done so that the pc now points to the operand of the instruction. By doing this, the job of defaddress becomes much easier since it can use the pc as a pointer to the operand. Next, the function evaluates the body in an environment with getter and setter bound to functions that can be used to read and write to the argument. After that it will advance the pc forward to the next instruction (unless track-pc was false, which happens for instructions that modify the pc themselves such as jumps). Finally, the function will increment the cycle count by the number of cycles it takes the instruction to execute.

The definitions of getter and setter are really just calls to the function with the same name as the addressing mode associated with the variant of the instruction.2 If you look back at the last post, you will see that defaddress automatically generates these “mode” functions. All they do is calculate the effective address of the argument for the given addressing mode, which is exactly what getter needs. As an example of what the expansion looks like, here is the lambda expression generated for the adc instruction in the indirect-x addressing mode.

(setf (aref *opcode-funs* #x61)
      (lambda (cpu)
        (incf (cpu-pc cpu))
        (flet ((getter ()
                 (get-byte (indirect-x cpu)))
               (setter (x)
                 (setf (indirect-x cpu) x)))
         (let ((result (+ (cpu-ar cpu) 
                          (getter) 
                          (status-bit :carry))))
          (set-flags-if :carry (> result 255) 
                        :overflow (overflow-p result 
                                              (cpu-ar cpu)
                                              (getter))
                        :negative (logbitp 7 result) 
                        :zero (zerop (wrap-byte result)))
          (setf (cpu-ar cpu) (wrap-byte result))))
        (incf (cpu-pc cpu) 1)
        (incf (cpu-cc cpu) 6)))

And that’s all there is to defasm! There are a couple really cool things you should note about cl-6502. First off, the macros expand into a lot of code. The definition of adc at the beginning of this post expands into roughly 500 lines of code. Here is a link to a gist of it if you want to see it. More incredibly, cl-6502 implements an entire emulator in under 1000 lines of code. cl-6502 is a fantastic example of how effective macros are at creating concise DSLs.