Zap

This post makes use of places. If you are unfamiliar with how places work, see my post Getting Places.

Many languages provide syntactic sugar for evaluating an expression involving a variable and assigning the result of that expression to the variable at the same time. In these languages you can do something such as the following:

x += 5

The above expression both adds five to the value of x and writes that new value back to x. In this post, I’m going to show you how you can write a macro zap that is a generalized version of this technique. With zap the above example would look like the following:

(zap #'+ x 5)

There are a couple things that make zap really cool. First of all, it can be used with any function. For example, if you wanted to cube the value in x, you could use the following:

(zap #'expt x 3)

The other thing that makes zap so awesome is that it can be used on any place. If you want to use zap on the value stored in a hash table with key 5, you can do that:

(zap #'+ (gethash 5 table) 5)

Now that you’ve seen how zap is used, here is how it can be implemented:

(defmacro zap (fn place &rest args)
  (multiple-value-bind 
        (temps exprs stores store-expr access-expr) 
      (get-setf-expansion place)
    `(let* (,@(mapcar #'list temps exprs)
            (,(car stores) 
              (funcall ,fn ,access-expr ,@args)))
       ,store-expr)))

You should be able to see that the code for zap is eerily similar to that of incf (from Getting Places). They are the exact same except instead of binding the gensym that will hold the new value to one plus the value already in the place:

(,(car stores) (+ 1 ,access-expr))

The gensym is bound to the result of calling the function with the value in the place and all of the other arguments passed to zap:

(,(car stores) (funcall ,fn ,access-expr ,@args))

Although zap is just a nice syntactic shortcut, it is a great example of the crazy things you can do with places.

Getting Places

This post will serve as an introduction to writing macros that work with places. I will refer back to it whenever I examine a macro which deals with places.

Places are an incredible part of Common Lisp. In short, a “place” is any location that can hold a value. The obvious example of a place is a variable. Less obvious examples include the elements of an array, or the slots of an object. What makes the concept of places special is that Common Lisp provides a standard interface for reading and writing to them. You can write macros on top of this interface that work for every kind of place. As an example, look at the macro incf. It takes a place as an argument, adds one to its value, and stores the new value back into the place. If you want to increment a variable x, you would use:

(incf x)

And if you wanted to increment the element at index x of a sequence, you would use:

(incf (elt seq x))

They use the exact same syntax even though a variable is very different from an element of a sequence. Because it takes advantage of the interface for places, incf will work on any place, be it a variable, the slot of an object, or a user defined place.

So at this point you are probably wondering how does incf work and more generally, how do you write macros that use places? To write such a macro, you need to use the function get-setf-expansion.1 Get-setf-expansion takes an expression representing a place and returns a total of five values (if you are unfamiliar with multiple values, see my post on multiple-value-bind). Altogether, these five values tell you everything you need to know about the place in order to read and write to it.

To show you how you are supposed to use get-setf-expansion, I’m first going to demonstrate how you could use it to write the expansion of incf by hand. After that, I will show code that will automate this, which winds up being an implementation of incf. Let’s start by writing the expansion of the example above. The one where the element of a sequence is being incremented. To write the expansion of that by hand, you would first call get-setf-expansion to obtain all of the information:2

(get-setf-expansion '(elt seq x))

In SBCL this call will return the following values:

;; (1) temps
(#:seq1017 #:x1016)

;; (2) exprs
(seq x) 

;; (3) stores
(#:new1015) 

;; (4) store-expr
(sb-kernel:%setelt #:seq1017 #:x1016 #:new1015) 

;; (5) access-expr
(elt #:seq1017 #:x1016))

From now on, I will refer to each value returned by get-setf-expansion by the name in the comment before it (e.g. temps refers to the first value).

In order to uniquely identify the element of a sequence (the place we are working with in this example), you need two things. You need the sequence itself and the index into the sequence. That is exactly what the two expressions in exprs evaluate to! Since incf needs to use these values multiple times, the two values have to be bound to gensyms in order to prevent multiple evaluation (see my post on once-only for why multiple evaluation is a problem). You are supposed to bind the values of the expressions to the gensyms in temps so that the other expressions returned by get-setf-expansion can use those gensyms to easily determine the place being worked with. The bindings need to be made with let* because it is possible for an expression in exprs to refer to the value of a previous expression in exprs. So the first part of the expansion will bind all of the symbols in temps to values of the expressions in exprs with let*:

(let* ((#:seq1017 seq) (#:x1016 x))
  ...)

Now the gensyms in temps can be used to uniquely identify the place. As I mentioned previously, the other expressions can now easily determine the place through the gensyms. For example, access-expr can be used to retrieve the value currently in the place. Since the place we are dealing with is the element of a sequence,  access-expr is just a call to elt using the gensyms in temps as the arguments. We are going to use access-expr in a moment, but first I have to talk about how to write to the place.

In order to write to the place, you need to use stores and store-exprStores is a list of gensyms that need to be bound to the values that are to be stored in the place (it is possible for a single place to hold multiple values).  In this case we want to bind the gensym in stores to one plus the value already in the place. We can easily obtain the value in the place through access-expr. Once the gensyms have been bound, you can use store-expr to actually write the values in stores to the place. Notice how store-expr is a call to an internal SBCL function sb-kernel:setelt% that uses the gensyms in temps and stores as arguments. Presumably sb-kernel:setelt% sets the element of a sequence. After adding the binding for the gensym in stores and store-expr, we wind up with the final expansion which looks like:3

(let* ((#:seq1017 seq) 
       (#:x1016 x) 
       (#:new1015 (+ 1 (elt #:seq1017 #:x1016))))
  (sb-kernel:%setelt #:seq1017 #:x1016 #:new1015))
 

To review, the above code first binds the gensyms in temps to the values of the expressions in exprs. This allows access-expr and store-expr to use the gensyms in temps in order to determine the place being worked with. Then the code uses access-expr to retrieve the value, adds one to that, and binds that value to the gensym in stores. This is because the value of the gensym in stores is ultimately going to be the one written to the place. Finally the code evaluates store-expr in order to actually store the value in the gensym into the place.

Now here is one possible implementation of incf,4 which is code for everything we just did by hand. I called it incf% so that it doesn’t have the same name as the builtin version.

(defmacro incf% (place)
  (multiple-value-bind
        (temps exprs stores store-expr access-expr)
      (get-setf-expansion place)
    `(let* (,@(mapcar #'list temps exprs)
            (,(car stores) (+ 1 ,access-expr)))
       ,store-expr)))

The above code first binds the five values returned by get-setf-expansion to variables. It then generates a let* binding which binds the symbols in temps to the expressions in exprs and also binds the gensym in stores to one plus the result of evaluating access-expr. Finally the above code splices in store-expr to actually write the value. And that is everything there is to incf.

Incf is but a single example of what can be done with places. In the next couple of posts, I plan to cover some really cool macros that encapsulate a bunch of common patterns related to places.

Defasm

This post is the second part of a two part series exploring the emulator cl-6502. If you haven’t read the first part exploring the implementation of addressing modes in cl-6502, you can find it here.

This post is going to go over how cl-6502 implements the instruction set of the 6502. Most of the work in defining the instruction set is done by a single macro, defasm. But before I can go into the details of defasm, I have to explain how cl-6502 represents instructions.

cl-6502 represents each instruction as a function inside an array called *array-funs*. The function for a specific instruction is indexed by that instruction’s opcode.1 To execute an instruction, cl-6502 looks up the opcode of the current instruction and calls the function at that location inside of *array-funs*. There is also a second array, *opcode-metadata*, which keeps track of some metadata about each instruction such as the number of bytes each one takes up. All defasm does is make it easy to generate all of the functions and metadata that wind up inside of those two arrays.

To show you just how easy it is to implement instructions with defasm, here is the implementation of the adc (add with carry) instruction:

(defasm adc (:docs "Add to Accumulator with Carry")
    ((#x61 6 2 indirect-x)
     (#x65 3 2 zero-page)
     (#x69 2 2 immediate)
     (#x6d 4 3 absolute)
     (#x71 5 2 indirect-y)
     (#x75 4 2 zero-page-x)
     (#x79 4 3 absolute-y)
     (#x7d 4 3 absolute-x))

  (let ((result (+ (cpu-ar cpu) 
                   (getter) 
                   (status-bit :carry))))
    (set-flags-if 
      :carry (> result #xff)
      :overflow (overflow-p result (cpu-ar cpu) (getter))
      :negative (logbitp 7 result)
      :zero (zerop (wrap-byte result)))
    (setf (cpu-ar cpu) (wrap-byte result))))

There are two main parts to the above code. The first part specifies all of the addressing modes the instruction is compatible with along with the metadata for each variant of the instruction (there is a different version of the instruction for every possible addressing mode the instruction can be used with).

After that is the body – the code that actually implements the instruction being defined. The body is responsible for setting all of the appropriate flags and memory locations to the values they should have after executing the instruction. Make sure you note that just like in defaddress, the variable cpu can be used in the body to reference an object that represents the current state of the cpu.

Defasm takes these two pieces, and generates one lambda expression for each variant of the instruction. All of the generated lambda expressions use the same body, except defasm generates some additional code that allows the body to work across all of the different addressing modes.

Now to get into the specifics of the DSL. In the addressing mode part of the DSL, there are four pieces of metadata that need to be associated with each version of the instruction. The first part is the opcode, the machine code representation of the instruction. Next up is the number of cycles it takes for the instruction to execute. After that is the size of the instruction, the number of bytes it takes up in memory. Last is the name of the addressing mode used for that specific variant of the instruction. As an example, here is the metadata for the adc instruction in the indirect-x addressing mode:

(#x61 6 2 indirect-x)

What it is saying is that this version of the instruction has the opcode #x61, takes six cycles to run, takes two bytes in memory, and uses the indirect-x addressing mode. The fact that when an instruction is used in different addressing modes, it uses a different number of clock cycles and takes up a different amount of space is one reason why different addressing modes are provided in assembly language.

For the body, defasm does something very clever to have the body work for every possible addressing modes. Within the body, the functions getter and setter are bound to local functions that can be used to obtain and modify the argument to the instruction. For each variant of the instruction, defasm generates the definition of these two functions differently so that they will always calculate the correct argument for the given addressing mode.

For example, in the version of adc that uses immediate addressing, getter will just return the value of the operand, but in the version that uses absolute addressing, getter will use the operand as an address and look up the value at that location in memory. In the definition of the adc instruction above, the body uses getter to obtain the argument, adds that to the value in the accumulator, adds in the carry, and then sets all of the appropriate flags and registers depending on the final value it winds up with. Since getter and setter work across all of the different addressing modes, so does the body!

Now let’s look at the actual implementation of defasm:

(defmacro defasm (name (&key (docs "") raw-p (track-pc t))
                  modes &body body)
  `(progn

     ,@(loop for (op cycles bytes mode) in modes collect
         `(setf (aref *opcode-meta* ,op) 
                ',(list name docs cycles bytes mode)))

     ,@(loop for (op cycles bytes mode) in modes collect
         `(setf (aref *opcode-funs* ,op)
                (lambda (cpu)
                  (incf (cpu-pc cpu))
                  (flet ((getter ()
                           ,(make-getter name mode raw-p))
                         (setter (x)
                           (setf (,mode cpu) x)))
                    ,@body)
                  ,@(when track-pc
                     `((incf (cpu-pc cpu) ,(1- bytes))))
                  (incf (cpu-cc cpu) ,cycles))))))

As usual, I’m going to show a snippet of the implementation of defasm and then show what the macroexpansion of that piece looks like. The first part of the implementation handles the addressing modes and metadata:

(loop for (op cycles bytes mode) in modes collect
  `(setf (aref *opcode-meta* ,op) 
         ',(list name docs cycles bytes mode)))

For each addressing mode, this generates code which will store a list containing the metadata into the proper place in the *opcode-meta* array. In other words it takes each part that looks like:

(#x61 6 2 indirect-x)
and generates code that looks like:
(setf (aref *opcode-meta* #x61)
     '(adc "Add to accumulator with carry" 6 2 indirect-x))

After that we have the part that will generate the actual lambda expressions for the functions that will be stored in *array-funs*:

(loop for (op cycles bytes mode) in modes collect
  `(setf (aref *opcode-funs* ,op)
         (lambda (cpu)
           (incf (cpu-pc cpu))
           (flet ((getter ()
                   ,(make-getter name mode raw-p))
                 (setter (x)
                   (setf (,mode cpu) x)))
             ,@body)
          ,@(when track-pc
              `((incf (cpu-pc cpu) ,(1- bytes))))
          (incf (cpu-cc cpu) ,cycles))))

This code loops over all of the metadata for the different addressing modes and uses this information to generate the expression for each variant of the instruction. As mentioned previously, the function will be stored by the variant’s opcode. As for the actual function itself, it does something along these lines. First, it advances the pc. This is done so that the pc now points to the operand of the instruction. By doing this, the job of defaddress becomes much easier since it can use the pc as a pointer to the operand. Next, the function evaluates the body in an environment with getter and setter bound to functions that can be used to read and write to the argument. After that it will advance the pc forward to the next instruction (unless track-pc was false, which happens for instructions that modify the pc themselves such as jumps). Finally, the function will increment the cycle count by the number of cycles it takes the instruction to execute.

The definitions of getter and setter are really just calls to the function with the same name as the addressing mode associated with the variant of the instruction.2 If you look back at the last post, you will see that defaddress automatically generates these “mode” functions. All they do is calculate the effective argument for the given addressing mode! Exactly what getter does. As an example of what the expansion looks like, here is the lambda expression generated for the adc instruction in the indirect-x addressing mode.

(setf (aref *opcode-funs* #x61)
      (lambda (cpu)
        (incf (cpu-pc cpu))
        (flet ((getter ()
                 (get-byte (indirect-x cpu)))
               (setter (x)
                 (setf (indirect-x cpu) x)))
         (let ((result (+ (cpu-ar cpu) 
                          (getter) 
                          (status-bit :carry))))
          (set-flags-if :carry (> result 255) 
                        :overflow (overflow-p result 
                                              (cpu-ar cpu)
                                              (getter))
                        :negative (logbitp 7 result) 
                        :zero (zerop (wrap-byte result)))
          (setf (cpu-ar cpu) (wrap-byte result))))
        (incf (cpu-pc cpu) 1)
        (incf (cpu-cc cpu) 6)))

And that’s all there is to defasm! There are a couple really cool things you should note about cl-6502. First off, the macros expand into a lot of code. The definition of adc at the beginning of this post expands into roughly 500 lines of code. Here is a link to a gist of it if you want to see it. More incredibly, cl-6502 implements an entire emulator in under 1000 lines of code. cl-6502 is a fantastic example of how effective macros are at creating concise DSLs.