This post is the second part of a two-part series exploring the emulator cl-6502. If you haven’t read the first part exploring the implementation of addressing modes in cl-6502, you can find it here.
This post is going to go over how cl-6502 implements the instruction set of the 6502. Most of the work in defining the instruction set is done by a single macro, defasm. But before I can go into the details of defasm, I have to explain how cl-6502 represents instructions.
cl-6502 represents each instruction as a function inside an array called *opcode-funs*. The function for a specific instruction is indexed by that instruction's opcode.1 To execute an instruction, cl-6502 looks up the opcode of the current instruction and calls the function at that location inside of *opcode-funs*. There is also a second array, *opcode-meta*, which keeps track of some metadata about each instruction, such as the number of bytes each one takes up. All defasm does is make it easy to generate all of the functions and metadata that wind up inside of those two arrays.
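To make the lookup-and-call idea concrete, here is a minimal sketch of a dispatch table. Every name in it (toy-cpu, *toy-ram*, *toy-funs*, toy-step) is hypothetical and made up for this illustration; it is not cl-6502's code. The toy INX also ignores the status flags that a real 6502 would update.

```lisp
;; Toy dispatch table. All names are hypothetical, not cl-6502's own.
(defstruct toy-cpu
  (pc 0)   ; program counter: an index into *toy-ram*
  (xr 0))  ; X register

(defparameter *toy-ram* (make-array 16 :initial-element 0))

;; One slot per possible opcode byte; unimplemented slots stay NIL.
(defparameter *toy-funs* (make-array 256 :initial-element nil))

;; #xea is NOP: do nothing except move past the one-byte instruction.
(setf (aref *toy-funs* #xea)
      (lambda (cpu)
        (incf (toy-cpu-pc cpu))))

;; #xe8 is INX: increment X, wrapping at 8 bits (flags ignored here).
(setf (aref *toy-funs* #xe8)
      (lambda (cpu)
        (incf (toy-cpu-pc cpu))
        (setf (toy-cpu-xr cpu) (logand #xff (1+ (toy-cpu-xr cpu))))))

(defun toy-step (cpu)
  "Fetch the opcode at the pc and call the function stored under it."
  (let* ((opcode (aref *toy-ram* (toy-cpu-pc cpu)))
         (fn (aref *toy-funs* opcode)))
    (unless fn
      (error "Unimplemented opcode ~2,'0x" opcode))
    (funcall fn cpu)))
```

Storing the bytes #xe8 and #xea at the start of *toy-ram* and calling toy-step twice on a fresh toy-cpu runs both instructions in order. Executing an instruction costs one array lookup and one function call.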
To show you just how easy it is to implement instructions with defasm, here is the implementation of the adc (add with carry) instruction:
(defasm adc (:docs "Add to Accumulator with Carry")
    ((#x61 6 2 indirect-x)
     (#x65 3 2 zero-page)
     (#x69 2 2 immediate)
     (#x6d 4 3 absolute)
     (#x71 5 2 indirect-y)
     (#x75 4 2 zero-page-x)
     (#x79 4 3 absolute-y)
     (#x7d 4 3 absolute-x))
  (let ((result (+ (cpu-ar cpu) (getter) (status-bit :carry))))
    (set-flags-if :carry (> result #xff)
                  :overflow (overflow-p result (cpu-ar cpu) (getter))
                  :negative (logbitp 7 result)
                  :zero (zerop (wrap-byte result)))
    (setf (cpu-ar cpu) (wrap-byte result))))
There are two main parts to the above code. The first part specifies all of the addressing modes the instruction is compatible with along with the metadata for each variant of the instruction (there is a different version of the instruction for every possible addressing mode the instruction can be used with).
After that is the body, the code that actually implements the instruction being defined. The body is responsible for setting all of the appropriate flags and memory locations to the values they should have after executing the instruction. Note that, just like in defaddress, the variable cpu can be used in the body to reference an object that represents the current state of the cpu.
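To see what the body of adc computes for particular values, here is a sketch of the same arithmetic as a standalone function. The helpers toy-wrap-byte, toy-overflow-p and toy-adc are hypothetical stand-ins written for this example. cl-6502 has its own wrap-byte and overflow-p, and their definitions may differ from what is assumed here.

```lisp
;; Hypothetical helpers for illustration only, not cl-6502's definitions.
(defun toy-wrap-byte (n)
  "Keep only the low 8 bits of N."
  (logand n #xff))

(defun toy-overflow-p (result a m)
  "True when A and M share a sign bit that the 8-bit RESULT does not."
  (logbitp 7 (logand (logxor a result) (logxor m result))))

(defun toy-adc (a m carry)
  "Return the new accumulator and a plist of flags for A + M + CARRY."
  (let ((result (+ a m carry)))
    (values (toy-wrap-byte result)
            (list :carry (> result #xff)
                  :overflow (toy-overflow-p result a m)
                  :negative (logbitp 7 result)
                  :zero (zerop (toy-wrap-byte result))))))
```

For example, (toy-adc #x50 #x50 0) leaves #xa0 in the accumulator with overflow and negative set, because adding two positive signed bytes produced a negative one. (toy-adc #xff 1 0) wraps to 0 with carry and zero set.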
Defasm takes these two pieces, and generates one lambda expression for each variant of the instruction. All of the generated lambda expressions use the same body, except defasm generates some additional code that allows the body to work across all of the different addressing modes.
Now to get into the specifics of the DSL. In the addressing mode part of the DSL, there are four pieces of metadata that need to be associated with each version of the instruction. The first part is the opcode, the machine code representation of the instruction. Next up is the number of cycles it takes for the instruction to execute. After that is the size of the instruction, the number of bytes it takes up in memory. Last is the name of the addressing mode used for that specific variant of the instruction. As an example, here is the metadata for the adc instruction in the indirect-x addressing mode:
(#x61 6 2 indirect-x)
This says that this version of the instruction has the opcode #x61, takes six cycles to run, occupies two bytes in memory, and uses the indirect-x addressing mode. The same instruction takes a different number of clock cycles and a different amount of space depending on the addressing mode it is used with, which is one reason why different addressing modes are provided in assembly language.
For the body, defasm does something very clever to make the body work for every possible addressing mode. Within the body, the functions getter and setter are bound to local functions that can be used to obtain and modify the argument to the instruction. For each variant of the instruction, defasm generates the definition of these two functions differently so that they will always calculate the correct argument for the given addressing mode.
For example, in the version of adc that uses immediate addressing, getter will just return the value of the operand, but in the version that uses absolute addressing, getter will use the operand as an address and look up the value at that location in memory. In the definition of the adc instruction above, the body uses getter to obtain the argument, adds that to the value in the accumulator, adds in the carry, and then sets all of the appropriate flags and registers depending on the final value it winds up with. Since getter and setter work across all of the different addressing modes, so does the body!
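The trick of sharing one body across modes can be shown with a much smaller macro. This is only a sketch of the idea. The names *demo-ram*, demo-getter-form and define-demo-variants are hypothetical, and the sketch supports just two made-up modes that take the operand as a function argument instead of reading it through the pc.

```lisp
;; Toy version of the idea: one body, one generated function per mode.
;; All names are hypothetical; the real macro is defasm.
(defparameter *demo-ram* (make-array 16 :initial-element 0))

;; The helper must exist when the macro expands, hence the eval-when.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun demo-getter-form (mode)
    "Return the form that reads the argument for MODE from the variable OPERAND."
    (ecase mode
      (immediate 'operand)                       ; the operand is the value
      (absolute '(aref *demo-ram* operand)))))   ; the operand is an address

(defmacro define-demo-variants (name modes &body body)
  "Define NAME-MODE for each mode. BODY may call (getter) to read its argument."
  `(progn
     ,@(loop for mode in modes
             collect `(defun ,(intern (format nil "~a-~a" name mode)) (operand)
                        (flet ((getter () ,(demo-getter-form mode)))
                          ,@body)))))

;; One body, two generated functions: add-one-immediate and add-one-absolute.
(define-demo-variants add-one (immediate absolute)
  (1+ (getter)))
```

With this in place, (add-one-immediate 5) adds one to the operand itself, while (add-one-absolute 5) adds one to whatever is stored at index 5 of *demo-ram*. The body never has to know which mode it was compiled for.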
Now let's look at the actual implementation of defasm:
(defmacro defasm (name (&key (docs "") raw-p (track-pc t))
                  modes &body body)
  `(progn
     ,@(loop for (op cycles bytes mode) in modes collect
            `(setf (aref *opcode-meta* ,op) ',(list name docs cycles bytes mode)))
     ,@(loop for (op cycles bytes mode) in modes collect
            `(setf (aref *opcode-funs* ,op)
                   (lambda (cpu)
                     (incf (cpu-pc cpu))
                     (flet ((getter () ,(make-getter name mode raw-p))
                            (setter (x) (setf (,mode cpu) x)))
                       ,@body)
                     ,@(when track-pc
                         `((incf (cpu-pc cpu) ,(1- bytes))))
                     (incf (cpu-cc cpu) ,cycles))))))
As usual, I’m going to show a snippet of the implementation of defasm and then show what the macroexpansion of that piece looks like. The first part of the implementation handles the addressing modes and metadata:
(loop for (op cycles bytes mode) in modes collect
     `(setf (aref *opcode-meta* ,op) ',(list name docs cycles bytes mode)))
For each addressing mode, this generates code that stores a list containing the metadata into the proper place in the *opcode-meta* array. In other words, it takes each part that looks like:
(#x61 6 2 indirect-x)
and generates code that looks like:
(setf (aref *opcode-meta* #x61)
      '(adc "Add to Accumulator with Carry" 6 2 indirect-x))
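Once a metadata list like this is stored, other code can read it back by opcode. As a hedged sketch of how that might look, the following uses a hypothetical table *demo-meta* and a hypothetical function demo-instruction-bytes, neither of which is part of cl-6502, to pull the byte count out of an entry with the same layout.

```lisp
;; Hypothetical stand-in for the metadata table, not cl-6502's own array.
(defparameter *demo-meta* (make-array 256 :initial-element nil))

;; Same layout as above: (name docs cycles bytes mode).
(setf (aref *demo-meta* #x61)
      '(adc "Add to Accumulator with Carry" 6 2 indirect-x))

(defun demo-instruction-bytes (opcode)
  "Return the size in bytes recorded for OPCODE, or NIL if nothing is stored."
  (let ((entry (aref *demo-meta* opcode)))
    (when entry
      (destructuring-bind (name docs cycles bytes mode) entry
        (declare (ignore name docs cycles mode))
        bytes))))
```

A disassembler, for instance, could use a lookup like this to decide how many bytes to skip before decoding the next instruction.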
After that we have the part that will generate the actual lambda expressions for the functions that will be stored in *opcode-funs*:
(loop for (op cycles bytes mode) in modes collect
     `(setf (aref *opcode-funs* ,op)
            (lambda (cpu)
              (incf (cpu-pc cpu))
              (flet ((getter () ,(make-getter name mode raw-p))
                     (setter (x) (setf (,mode cpu) x)))
                ,@body)
              ,@(when track-pc
                  `((incf (cpu-pc cpu) ,(1- bytes))))
              (incf (cpu-cc cpu) ,cycles))))
This code loops over all of the metadata for the different addressing modes and uses this information to generate the expression for each variant of the instruction. As mentioned previously, the function will be stored under the variant's opcode. As for the actual function itself, it does something along these lines. First, it advances the pc by one so that the pc now points to the operand of the instruction. This makes the job of defaddress much easier, since it can use the pc as a pointer to the operand. Next, the function evaluates the body in an environment with getter and setter bound to functions that can be used to read and write the argument. After that, it advances the pc past the remaining bytes to the next instruction (unless track-pc was false, which happens for instructions that modify the pc themselves, such as jumps). Finally, the function increments the cycle count by the number of cycles the instruction takes to execute.
The definitions of getter and setter are really just calls to the function with the same name as the addressing mode associated with the variant of the instruction.2 If you look back at the last post, you will see that defaddress automatically generates these mode functions. All they do is calculate the effective argument for the given addressing mode, which is exactly what getter needs to do. As an example of what the expansion looks like, here is the lambda expression generated for the adc instruction in the indirect-x addressing mode:
(setf (aref *opcode-funs* #x61)
      (lambda (cpu)
        (incf (cpu-pc cpu))
        (flet ((getter () (get-byte (indirect-x cpu)))
               (setter (x) (setf (indirect-x cpu) x)))
          (let ((result (+ (cpu-ar cpu) (getter) (status-bit :carry))))
            (set-flags-if :carry (> result 255)
                          :overflow (overflow-p result (cpu-ar cpu) (getter))
                          :negative (logbitp 7 result)
                          :zero (zerop (wrap-byte result)))
            (setf (cpu-ar cpu) (wrap-byte result))))
        (incf (cpu-pc cpu) 1)
        (incf (cpu-cc cpu) 6)))
And that’s all there is to defasm! There are a couple of really cool things you should note about cl-6502. First off, the macros expand into a lot of code. The definition of adc at the beginning of this post expands into roughly 500 lines of code. Here is a link to a gist of it if you want to see it. More incredibly, cl-6502 implements an entire emulator in under 1000 lines of code. cl-6502 is a fantastic example of how effective macros are at creating concise DSLs.