3 Execution Model
The execution model of a86 programs is the same as that of x86, but this section gives a brief overview of the most important aspects.
Execution proceeds instruction by instructions. Instructions may read or modify the state of Registers, Flags, and Memory and, except for jumping instructions, execution proceeds to the next instruction in memory.
3.1 Flags
The processor makes use of flags to handle comparisons. For our purposes, there are four flags to be aware of: zero (ZF), sign (SF), carry (CF), and overflow (OF).
These flags are set by each of the arithmetic operations, which are appropriately annotated in the Instruction Set. Each of these operations is binary (meaning they take two arguments), and the flags are set according to properties of the result of the arithmetic operation. Many of these properties look at the most-significant bit (MSB) of the inputs and output.
ZF is set when the result is 0.
SF is set when the MSB of the result is set.
CF is set when a bit was set beyond the MSB.
OF is set when one of two conditions is met:
The MSB of each input is set and the MSB of the result is not set.
The MSB of each input is not set and the MSB of the result is set.
Note that CF is only useful for unsigned arithmetic, while OF is only useful for signed arithmetic. In opposite cases, they provide no interesting information.
These flags, along with many others, are stored in a special FLAGS register that cannot be accessed by normal means. Each flag is represented by a single bit in the register, and they all have specific bits assigned by the x86 specification. For example, CF is bit 0, ZF is bit 6, SF is bit 7, and OF is bit 11, as indexed from the least-significant bit position (but you don’t need to know these numbers).
The various conditions that can be tested for correspond to combinations of the flags. For example, the Jc instruction will jump if CF is set, otherwise execution will fall through to the next instruction. Most of the condition suffixes are straightforward to deduce from their spelling, but some are not. The suffixes (e.g., the c in Jc) and their meanings are given below. For brevity’s sake the flags’ names are abbreviated by ommitting the F suffix and prefixing them with either + or - to indicate set and unset positions, respectively, as needed. Some of the meanings require use of the bitwise operators | (OR), & (AND), ^ (XOR), and =? (equality).
Suffix | Flag | Suffix | Flag |
z | +Z | nz | -Z |
e | +Z | ne | -Z |
s | +S | ns | -S |
c | +C | nc | -C |
o | +O | no | -O |
l | (S ^ O) | g | (-Z & (S =? O)) |
le | (+Z | (S ^ O)) | ge | (S =? O) |
The e suffix (“equal?”) is just a synonym for the z suffix (“zero?”). This is because it is common to use the Cmp instruction to perform comparisons, but Cmp is actually identical to Sub with the exception that the result is not stored anywhere (i.e., it is only used for setting flags according to subtraction). If two values are subtracted and the resulting difference is zero (ZF is set), then the values are equal.
3.2 Stack
The a86 execution model includes access to memory that can be used as a stack data structure. There are operations that manipulate the stack, such as Push, Pop, Call, and Ret, and the stack register pointer 'rsp is dedicated to the stack. Stack memory is allocated in “low” address space and grows downward. So pushing an element on to the stack decrements 'rsp.
The stack is useful as a way to save away values that may be needed later. For example, let’s say you have two (assembly-level) functions and you want to produce the sum of their results. By convention, functions return their result in 'rax, so doing something like this won’t work:
(seq (Call 'f) (Call 'g) (Add 'rax ...))
The problem is the return value of 'f gets clobbered by 'g. You might be tempted to fix the problem by moving the result to another register:
(seq (Call 'f) (Mov 'rbx 'rax) (Call 'g) (Add 'rax 'rbx))
This works only so long as 'g doesn’t clobber 'rbx. In general, it might not be possible to avoid that situation. So the solution is to use the stack to save the return value of 'f while the call to 'g proceeds:
(seq (Call 'f) (Push 'rax) (Call 'g) (Pop 'rbx) (Add 'rax 'rbx))
This code pushes the value in 'rax on to the stack and then pops it off and into 'rbx after 'g returns. Everything works out so long as 'g maintains a stack-discipline, i.e. the stack should be in the same state when 'g returns as when it was called.
We can make a complete example to confirm that this works as expected. First let’s set up a little function for letting us try out examples:
Examples
> (define (eg asm) (asm-interp (prog (Global 'entry) (Label 'entry) asm ; the example code we want to try out (Ret) (Label 'f) ; calling 'f returns 36 (Mov 'rax 36) (Ret) (Label 'g) ; calling 'g returns 6, but (Mov 'rbx 4) ; it clobbers 'rbx just for the lulz (Mov 'rax 6) (Ret))))
Now let’s try it, using the stack to confirm it does the right thing:
Examples
> (eg (seq (Call 'f) (Push 'rax) (Call 'g) (Pop 'rbx) (Add 'rax 'rbx))) 42
Compare that with the first version that used a register to save the result of 'f:
Examples
> (eg (seq (Call 'f) (Mov 'rbx 'rax) (Call 'g) (Add 'rax 'rbx))) 10
The Push and Pop instructions offer a useful illusion, but of course, there’s not really any data structure abstraction here; there’s just raw memory and registers. But so long as code abides by conventions, the illusion turns out to be the true state of affairs.
What’s really going on under the hood of Push and Pop is that the 'rsp register is decremented and the value is written to the memory location pointed to by the value of 'rsp.
The following code is mostly equivalent to what we wrote above (and we will discuss the difference in the next section):
Examples
> (eg (seq (Call 'f) (Sub 'rsp 8) ; "allocate" a word on the stack (Mov (Offset 'rsp 0) 'rax) ; write 'rax to top frame (Call 'g) (Mov 'rbx (Offset 'rsp 0)) ; load top frame into 'rbx (Add 'rsp 8) ; "deallocate" word on the stack (Add 'rax 'rbx))) 42
As you can see from this code, it would be easy to violate the usual invariants of stack data structure to, for example, access elements beyond the top of the stack. The value of Push and Pop is they make clear that you are using things in a stack-like way and they keep you from screwing up the accesses, offsets, and adjustments to 'rsp.
Just as Push and Pop are useful illusions, so too are Call and Ret. They give the impression that there is a notion of a procedure and procedure call mechanism in assembly, but actually there’s no such thing.
Think for a moment about what it means to “call” 'f in the examples above. When executing (Call 'f), control jumps to the instruction following (Label 'f). When we then get to (Ret), somehow the CPU knows to jump back to the instruction following the (Call 'f) that we started with.
What’s really going on is that (Call 'f) is pushing the address of subsequent instruction on to the stack and then jumping to the label 'f. This works in concert with Ret, which pops the return address off the stack and jumping to it.
Just as we could write equivalent code without Push and Pop, we can write the same code without Call and Ret.
We do need one new trick, which is the Lea instruction, which loads an effective address. You can think of it like Mov except that it loads the address of something rather than what is pointed to by an address. For our purposes, it is useful for loading the address of a label:
(Lea 'rax 'f)
This instruction puts the address of label 'f into rax. You can think of this as loading a function pointer into 'rax. With this new instruction, we can illuminate what is really going on with Call and Ret:
Examples
> (eg (seq (Lea 'rax 'fret) ; load address of 'fret label into 'rax (Push 'rax) ; push the return pointer on to stack (Jmp 'f) ; jump to 'f (Label 'fret) ; <– return point for "call" to 'f (Push 'rax) ; save result (like before) (Lea 'rax 'gret) ; load address of 'gret label into 'rax (Push 'rax) ; push the return pointer on to stack (Jmp 'g) ; jump to 'g (Label 'gret) ; <– return point for "call" to 'g (Pop 'rbx) ; pop saved result from calling 'f (Add 'rax 'rbx))) 42
The above shows how to encode Call as Lea, Push, and Jmp. The encoding of Ret is just:
(seq (Pop 'rbx) (Jmp 'rbx))
While the Push and Pop operations are essentially equivalent to manually adjusting the stack pointer and target register. The one difference is that these special stack-manipulation operations do not set any flags like Add and Sub do. So while you can often choose to manually implement stack manipulation, you’ll need to use these instructions specifically if you want to preserve the condition flags after adjusting the stack.
3.3 Memory
The stack is really just a pointer some location in memory, but it is possible to alloacte, read, and modify memory elsewhere too. The stack memory is allocated by the operating system and the location of this memory is initially placed in the rsp register.
It is possible to statically allocate memory within the program itself using the Data section and psuedo-instructions such as Dq, etc.
For example, this program statically allocates a quad-word (64-bits), initialized to 0. The program then modifies the memory by writing 42 and then returning the value obtained by dereferencing that memory, i.e. 42:
Examples
> (asm-interp (Mov r8 42) (Mov (Offset 'm) r8) (Mov rax (Offset 'm)) (Ret) (Data) (Label 'm) (Dq 0)) 42
It is also possible to dynamically allocate memory. This can be done by a wrapper, written e.g. in C, that allocates memory and passes in a pointer to that memory as an argument to the assembly code. It’s also possible to call standard C library function like malloc to allocate memory within an assembly program.
This program is analogous to the one above, but instead of statically allocating a quad-word of memory, it makes a call to malloc with an argument of 8 in order to allocate 8 bytes of memory. A pointer to the newly allocated memory is returned in rax, which is then written to with the value 42, before being dereferenced and returned:
Examples
> (asm-interp (Mov rdi 8) (Extern 'malloc) (Call 'malloc) (Mov r8 42) (Mov (Offset rax) r8) (Mov rax (Offset rax)) (Ret)) 42