Let's write a Toy Emulator in Python

In this tutorial, we will build an Emulator for Ben Eater's 8-bit breadboard computer.

I'm a big fan of Ben Eater because of his video series on the 8-bit breadboard computer and the 6502 computer. If you haven't seen any of his videos, I recommend starting with the 8-bit computer playlist. He builds everything from the ground up on a breadboard and teaches you about the internals and how they work.

In this tutorial we will create an emulator in Python that can run programs such as printing the Fibonacci series or triangular numbers.

You can find the entire code on GitHub

Before we get started, we need to know the internals of the CPU that we are going to emulate.

CPU Specs:

If you know the basics of Microprocessors and Microcontrollers, then you might be familiar with the following words. It's okay even if you don't know about them.

The CPU has three 8 bit registers (A, B, and IR), a 4 bit address bus, a whopping 16 bytes of RAM, 3 flags for carry, zero, and halt, and an 8 bit OUTPUT register.

What is an Address Bus

A 4 bit address bus can address a memory of 2^4 = 16 bytes; in this case it can address the memory locations 0 to 15, or 0x0 to 0xF in hexadecimal. Here's how it works.

A BUS is nothing but a bunch of wires connecting components. The address bus selects which memory location the CPU wants to access, and the data bus carries the value that is read from or written to that location.

For example, your CPU has a 4 bit address bus. Since a BUS is just a bunch of wires, think of a 4 bit address bus as 4 wires, each of which can be turned ON or OFF.

As there are 4 wires, and each wire can be turned ON or OFF, we have 2^4 = 16 combinations, as follows.

				
I'll just consider ON as 1 and OFF as 0,
now you'll be left with the following options for 4 wires.

	 ---------------------------------------
	| Binary|           |HEX|          |DEC |
	 ---------------------------------------
	|0 0 0 0| ---------> 0x0 ---------> 0   |
	|0 0 0 1| ---------> 0x1 ---------> 1   |
	|0 0 1 0| ---------> 0x2 ---------> 2   |
	|0 0 1 1| ---------> 0x3 ---------> 3   |
	|0 1 0 0| ---------> 0x4 ---------> 4   |
	|0 1 0 1| ---------> 0x5 ---------> 5   |
	|0 1 1 0| ---------> 0x6 ---------> 6   |
	|0 1 1 1| ---------> 0x7 ---------> 7   |
	|1 0 0 0| ---------> 0x8 ---------> 8   |
	|1 0 0 1| ---------> 0x9 ---------> 9   |
	|1 0 1 0| ---------> 0xA ---------> 10  |
	|1 0 1 1| ---------> 0xB ---------> 11  |
	|1 1 0 0| ---------> 0xC ---------> 12  |
	|1 1 0 1| ---------> 0xD ---------> 13  |
	|1 1 1 0| ---------> 0xE ---------> 14  |
	|1 1 1 1| ---------> 0xF ---------> 15  |
	 ---------------------------------------
So, there are 16 different values for a 4 bit address bus.
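The table above can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper name address_table is made up for this example and is not part of the emulator.

```python
def address_table(bus_width):
    """Return (binary, hex, decimal) for every address a bus of this width can select."""
    rows = []
    for address in range(2 ** bus_width):
        # format() pads the binary form with zeros up to the bus width
        binary = format(address, f"0{bus_width}b")
        rows.append((binary, hex(address), address))
    return rows

for binary, hexadecimal, decimal in address_table(4):
    print(binary, hexadecimal, decimal)
```

Calling address_table(4) gives 16 rows, from 0000 up to 1111.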

			

Let's consider memory or RAM as a city and each byte of memory as a restaurant. If you want to order food from a restaurant, how will you do it? Obviously, you need to know its name, right? Similarly, if you want to read data from memory, you need to know its address. If you have a 4 bit address bus, then you can address 16 bytes of memory; if you have a 32 bit address bus, then you can address 2^32 bytes, or 4 GiB, of memory. So, that's all about memory and the address bus.

Registers

Simply think of registers as variables in a programming language. An 8 bit register stores an 8 bit value, a 16 bit register stores a 16 bit value, and so on.

Registers are made up of flip-flops, and flip-flops are made up of fundamental gates. There are many types of registers; a few of them are the shift registers Serial In Serial Out (SISO), Serial In Parallel Out (SIPO), Parallel In Serial Out (PISO), and Parallel In Parallel Out (PIPO).

Each flip-flop stores one bit; connect a bunch of them and you have a register.
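To make the "variable with a fixed number of bits" idea concrete, here is a small sketch of an 8 bit register in Python. The class name Register8 is hypothetical and used only for illustration; the emulator itself stores registers as plain integers.

```python
class Register8:
    """A toy 8 bit register: it keeps only the low 8 bits of whatever is loaded."""

    def __init__(self):
        self.value = 0

    def load(self, value):
        # Masking with 0xFF throws away every bit above the 8th one
        self.value = value & 0xFF
        return self.value

reg = Register8()
print(reg.load(200))  # 200 fits in 8 bits, so it is stored unchanged
print(reg.load(300))  # 300 needs 9 bits, only the low 8 bits (44) survive
```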

Flags

Flags are a bunch of bits that are set or reset every time a particular operation occurs on the CPU.

For example, the carry flag stays at zero unless an arithmetic operation produces a carry, that is, a result too large to fit in 8 bits.

There are three flags, CF (carry flag), ZF (zero flag), and HALT (halt flag), and they do what they say.
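As a rough sketch of how the carry and zero flags behave for an 8 bit addition, consider the following. The function add8 is a made-up helper for this example, and it assumes the result is truncated to 8 bits the way an 8 bit register would truncate it.

```python
def add8(a, b):
    """Add two 8 bit values and report (result, carry flag, zero flag)."""
    total = a + b
    carry = total > 0xFF     # the sum did not fit in 8 bits
    result = total & 0xFF    # what an 8 bit register would actually hold
    zero = result == 0       # the stored result is exactly zero
    return result, carry, zero

print(add8(7, 3))      # (10, False, False)
print(add8(200, 100))  # (44, True, False)
print(add8(255, 1))    # (0, True, True)
```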

Well, that's all about this CPU.

How does CPU execute programs?

A CPU can only understand two things, a 0 and a 1. That's all! CPUs are dumb. A program written as raw bits, a bunch of 0s and 1s, is called machine code.

For example: A program that performs 7+3-2 will look something like this

				
0x18 0x29 0x3a 0xe0
0x00 0x00 0x00 0x00
0x07 0x03 0x02 0x00
0x00 0x00 0x00 0x00

Well, that's how machine code looks (shown here in hexadecimal, one byte per memory location), and its assembly will look like this

LDA 8
ADD 9
SUB 10
OUT 0
NOP
NOP
NOP
NOP
7
3
2
NOP
NOP
NOP
NOP
NOP

			

Assembly language represents machine code in mnemonic form. The first line, LDA 8, loads the value from address 8 into the A register, so the A register now contains the value 7. Then, ADD 9 adds the value at address 9 to the A register, so the value in the A register becomes 7+3 = 10. SUB 10 subtracts the value at memory address 10, which sets the A register to 10-2 = 8. OUT 0 outputs the A register's value on the screen.

These mnemonics are called instructions. Every CPU out there has an instruction set, and instruction sets differ from one CPU to another.

For this CPU, we have the following instruction set. 0000 is NOP, 0001 is LDA and so on. So, if the machine code contains 0000, the CPU will understand it as NOP, and if the machine code has 0001 1010, our CPU will treat it as an LDA instruction and load the value from address 1010 into the A register.

				
 ------------------------------------------------------------------------------------------
| 0000 - NOP --> No Operation                                                              |
| 0001 - LDA --> Load contents of a memory address XXXX into A register                    |
| 0010 - ADD --> Load contents of a memory address XXXX into B register, then performs A+B |
| 0011 - SUB --> Load contents of a memory address XXXX into B register, then performs A-B |
| 0100 - STA --> Store contents of A register at memory address XXXX                       |
| 0101 - LDI --> Load 4 bit immediate value into A register                                |
| 0110 - JMP --> Unconditional jump: sets program counter to XXXX and executes from there  |
| 0111 - JC  --> Jump if carry: sets program counter to XXXX when carry flag is set        |
| 1000 - JZ  --> Jump if zero: sets program counter to XXXX when zero flag is set          |
| 1110 - OUT --> Output contents of A register to 7 segment display                        |
| 1111 - HLT --> Halts the execution                                                       |
 ------------------------------------------------------------------------------------------
			
			

For this particular CPU, every instruction is one byte long: a 4 bit opcode followed by a 4 bit operand. So, it will be easy for us to emulate this CPU.
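To see how a mnemonic and its operand turn into a byte, here is a minimal sketch of the encoding. The OPCODES dictionary and the encode helper are assumptions made for this example; this is not the assembler from the GitHub repo.

```python
# Opcode values taken from the instruction set table above
OPCODES = {
    "NOP": 0x0, "LDA": 0x1, "ADD": 0x2, "SUB": 0x3,
    "STA": 0x4, "LDI": 0x5, "JMP": 0x6, "JC": 0x7,
    "JZ": 0x8, "OUT": 0xE, "HLT": 0xF,
}

def encode(mnemonic, operand=0):
    """Pack the 4 bit opcode into the high nibble and the operand into the low nibble."""
    return (OPCODES[mnemonic] << 4) | (operand & 0x0F)

# The first four instructions of the 7+3-2 program
program = [encode("LDA", 8), encode("ADD", 9), encode("SUB", 10), encode("OUT")]
print([hex(byte) for byte in program])  # ['0x18', '0x29', '0x3a', '0xe0']
```

The printed bytes match the first row of the machine code shown earlier.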

Let's define our CPU class

We just put all the CPU components and initialize them to 0 in the constructor or the __init__() method.

				
class CPU:
    def __init__(self):
        #Program counter
        self.pc = 0

        #Registers A, B, IR
        self.A = 0
        self.B = 0
        self.IR = 0

        #RAM or self.memory, 4 bit address bus --> 2^4 = 16 bytes of RAM
        self.memory = [0]*16

        #Output register
        self.OUT = 0

        #Flags
        self.CF = False
        self.ZF = False
        self.HALT = False
				
			

We have defined our CPU class, but how do we read the machine code and execute it? Let's first write a method that loads the program or machine code into the CPU's memory or RAM. This program will be at most 16 bytes long.

				
def loadProgram(self, filename):
    #open the binary file and read its bytes into a buffer
    try:
        with open(filename, "rb") as f:
            buffer = f.read()
    except OSError:
        print(f"Error: cannot open file: {filename}")
        raise SystemExit(1)

    #copy at most 16 bytes into the CPU's memory;
    #a shorter program leaves the remaining bytes at 0
    for i in range(min(len(buffer), 16)):
        self.memory[i] = buffer[i]
			
			

The loadProgram() method reads the binary file containing the machine code and copies the bytes into the CPU's memory. That's all it does.
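If you don't have a bin file yet, one way to produce one is to write the raw bytes from Python. The sketch below writes the 7+3-2 program from earlier and reads it back the same way the loader does. The file name add_sub.bin and the use of the temporary directory are arbitrary choices for this example.

```python
import os
import tempfile

# The 7+3-2 program from earlier, one byte per RAM location
program = bytes([
    0x18, 0x29, 0x3A, 0xE0,
    0x00, 0x00, 0x00, 0x00,
    0x07, 0x03, 0x02, 0x00,
    0x00, 0x00, 0x00, 0x00,
])

# Hypothetical output path, chosen only for this example
path = os.path.join(tempfile.gettempdir(), "add_sub.bin")
with open(path, "wb") as f:
    f.write(program)

# Reading in "rb" mode returns bytes; indexing bytes gives integers
with open(path, "rb") as f:
    buffer = f.read()

print(len(buffer), hex(buffer[0]))  # 16 0x18
```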

We initialized our CPU class and created a method that loads the machine code into the CPU's memory. Now, all we need to do is execute the machine code.

Execution cycle:

Programs are executed in a cycle, and this is called the execution cycle.

We have seen the machine code above, and how the CPU understands it. To execute the program on the CPU, we need to decode the machine code. Each byte of machine code holds one instruction: the part that says what to do is called the OPCODE, and the rest is the address it works on.

Typically, the execution cycle has 3 steps, FETCH -> DECODE -> EXECUTE.

Fetch:

In this step, we fetch the current instruction from memory, using the Program Counter to find it, and store it in the Instruction Register.

Decode:

In this step, we decode the instruction, that is, we separate out the opcode and the address.

Execute:

In this step, we execute the instruction and change the values in the registers.

Fetching is easy: we store the instruction that is being executed in the instruction register as follows

cpu.IR = cpu.memory[cpu.pc]

pc is the program counter, which points to the currently executing instruction. With its help, we copy the current instruction into the instruction register.

Decoding the instruction can be complex for bigger CPUs, but for this one it is easy, because each instruction on this CPU is always 8 bits long.

For example, if the machine code is 00011001 or 0x19, then we can decode it with simple bit manipulations as follows

The higher 4 bits represent the opcode and the lower 4 bits represent the memory address

				
we have the machine code as follows
00011001

To extract the higher 4 bits, we will perform & operation with 11110000 as follows

      0 0 0 1 1 0 0 1
    & 1 1 1 1 0 0 0 0
   -------------------
      0 0 0 1 0 0 0 0

You can see that the lower 4 bits are all set to zero and higher 4 bits are same as before.
We can now right shift the result by 4 to extract the exact higher 4 bits as follows.

      0 0 0 1 0 0 0 0 >> 4 = 0 0 0 1 -----> This is an LDA instruction

To extract the lower 4 bits, we will perform & operation with 00001111 as follows

      0 0 0 1 1 0 0 1
    & 0 0 0 0 1 1 1 1
   -------------------
      0 0 0 0 1 0 0 1 -----> This is memory address 9

You can see that the higher 4 bits are all set to zero and lower 4 bits are same as before.
We don't need to right shift the result because 0s on the left side of a binary number don't change its value.

The same thing can be performed using hexadecimal numbers as follows
00011001 is 0x19, 11110000 is 0xf0 and 00001111 is 0x0f, if we perform & operation with each of them,
the result will be as follows

      0x19         0x19
    & 0xf0       & 0x0f
   --------     --------
      0x10         0x09
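The masking and shifting steps above can be written as a small Python function. This is just a sketch: decode is a hypothetical helper name, and the emulator performs the same operations inline.

```python
def decode(instruction):
    """Split one instruction byte into (opcode, operand)."""
    opcode = (instruction & 0xF0) >> 4  # keep the high 4 bits, then shift them down
    operand = instruction & 0x0F        # keep the low 4 bits as they are
    return opcode, operand

print(decode(0b00011001))  # (1, 9): opcode 0001 is LDA, operand is address 9
print(decode(0xE0))        # (14, 0): opcode 1110 is OUT
```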
  
			

Once we decode the instruction, we compare the opcode with our instruction set and perform the matching operation. The code will look something like this. We will create this method inside the CPU class.

				
class CPU:
    def __init__(self):
        #all the previous code goes here
        pass

    def execute(self):
        #decode the opcode from the instruction by masking the higher 4 bits
        opcode = (self.IR & 0xF0) >> 4

        if(opcode == 0x0):
            #NOP
            pass
        elif(opcode == 0x1):
            #LDA
            self.A = self.memory[self.IR & 0x0F]
        elif(opcode == 0x2):
            #ADD
            self.B = self.memory[self.IR & 0x0F]
            self.CF = (self.A + self.B) > 255
            self.A = (self.A + self.B) & 0xFF  #keep A within 8 bits
            self.ZF = self.A == 0
        elif(opcode == 0x3):
            #SUB
            self.B = self.memory[self.IR & 0x0F]
            #no borrow (A >= B) sets the carry flag, as in two's complement subtraction
            self.CF = self.A >= self.B
            self.A = (self.A - self.B) & 0xFF  #keep A within 8 bits
            self.ZF = self.A == 0
        elif(opcode == 0x4):
            #STA
            self.memory[(self.IR & 0x0F)] = self.A
        elif(opcode == 0x5):
            #LDI
            self.A = self.IR & 0x0F
        elif(opcode == 0x6):
            #JMP - jump to address (self.IR & 0x0F); the -1 cancels the increment done after execute()
            self.pc = (self.IR & 0x0F) - 1
        elif(opcode == 0x7):
            #JC
            if self.CF:
                self.pc = (self.IR & 0x0F) - 1
        elif(opcode == 0x8):
            #JZ
            if self.ZF:
                self.pc = (self.IR & 0x0F) - 1
        elif(opcode == 0xE):
            #OUT
            self.OUT = self.A
            print(f"OUT : {self.OUT}")
        elif(opcode == 0xF):
            #HLT
            self.HALT = True
        else:
            print(f"Illegal opcode {hex(opcode)}")
				
			

We have seen how to perform the Fetch, Decode and Execute operations; we are now left with the execution cycle. The above function executes one single instruction, but we need to execute all the instructions in memory. To do that, let's create a main function with a while loop as follows.

				
def main(filename, speed):
    cpu = CPU()
    cpu.loadProgram(filename)
    delay = float(speed) #seconds to wait between instructions

    while not cpu.HALT:
        try:
            #fetch instruction into Instruction Register
            cpu.IR = cpu.memory[cpu.pc]
        except IndexError:
            #the program counter ran past the 16 bytes of RAM
            print("HALTING System...")
            break
        cpu.execute()
        cpu.pc += 1
        time.sleep(delay) #clock speed
				
			

The CPU stops executing instructions when the HALT flag is set, and that's why the while loop keeps running as long as the flag is false. The loop also stops if the program counter runs past the end of memory.

In each iteration, we fetch an instruction from the memory that we loaded using the loadProgram() method. With the help of the program counter or pc, we fetch it and store it in the Instruction Register or IR.

Then we execute the current instruction and, once that is done, increment the program counter by 1 to fetch the next instruction. This is also why the jump instructions set pc to the target address minus 1: the increment that follows brings it to the target.
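To watch the whole fetch, decode, execute loop in one place, here is a stripped-down sketch that handles only LDA, ADD, SUB, OUT and HLT and collects the OUT values in a list instead of printing them. The function run, the max_steps guard, and the HLT placed at address 4 are all assumptions made for this example; the real emulator uses the CPU class.

```python
def run(memory, max_steps=100):
    """Run a small subset of the instruction set and return every value sent to OUT."""
    a = 0
    pc = 0
    outputs = []
    for _ in range(max_steps):
        if pc >= len(memory):
            break                       # ran past the end of RAM
        ir = memory[pc]                 # fetch
        opcode = (ir & 0xF0) >> 4       # decode: high nibble
        operand = ir & 0x0F             # decode: low nibble
        if opcode == 0x1:               # LDA
            a = memory[operand]
        elif opcode == 0x2:             # ADD
            a = (a + memory[operand]) & 0xFF
        elif opcode == 0x3:             # SUB
            a = (a - memory[operand]) & 0xFF
        elif opcode == 0xE:             # OUT
            outputs.append(a)
        elif opcode == 0xF:             # HLT
            break
        pc += 1                         # move on to the next instruction
    return outputs

# The 7+3-2 program, with an HLT (0xF0) added at address 4 for this example
memory = [0x18, 0x29, 0x3A, 0xE0, 0xF0, 0, 0, 0, 7, 3, 2, 0, 0, 0, 0, 0]
print(run(memory))  # [8]
```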

We now have to run the main function by passing the filename and the speed of execution. time.sleep() slows down the execution in the main function, which helps us see how the instructions are executed.

Let's write some more code to do that

				
import sys
import time

if __name__ == "__main__":
    try:
        filename = sys.argv[1]
        speed = sys.argv[2]
    except IndexError:
        print(" ------------------------------------------- ")
        print("|Usage: python3 cpu.py <bin file> <speed>   |")
        print(" ------------------------------------------- \n")
        print(" ------------------------------------------- ")
        print("| <bin file> : compiled asm file            |\n| <speed> : (0 to 1), 0 fastest, 1 slowest  |\n")
        print("| Default program: Triangular Numbers       |\n| Run: python3 cpu.py default <speed>       |")
        print(" ------------------------------------------- \n")
        print(" ------------------------------------------- ")
        print("| Example: python3 cpu.py fib.bin 0.05      |")
        print(" ------------------------------------------- \n")
        exit()
    main(filename, speed)
				
			

That's all, save the program and run it by providing a bin file that has some program in it. The entire code should look something like this

				
import sys
import time

class CPU:
    def __init__(self):
        #Program counter
        self.pc = 0

        #Registers A, B, IR
        self.A = 0
        self.B = 0
        self.IR = 0

        #RAM or self.memory, 4 bit address bus --> 2^4 = 16 bytes of RAM
        self.memory = [0]*16

        #Output register
        self.OUT = 0

        #Flags
        self.CF = False
        self.ZF = False
        self.HALT = False

    def loadProgram(self, filename):
        #Load your own program manually
        if filename == "default":
            #prints triangular numbers

            self.memory[0x0] = 0x1F
            self.memory[0x1] = 0x2E
            self.memory[0x2] = 0x79
            self.memory[0x3] = 0xE0
            self.memory[0x4] = 0x4F
            self.memory[0x5] = 0x1E
            self.memory[0x6] = 0x2D
            self.memory[0x7] = 0x4E
            self.memory[0x8] = 0x60
            self.memory[0x9] = 0x50
            self.memory[0xA] = 0x4F
            self.memory[0xB] = 0x1D
            self.memory[0xC] = 0x4E
            self.memory[0xD] = 1
            self.memory[0xE] = 1
            self.memory[0xF] = 0

        else:
            #if the file name was specified open it
            try:
                with open(filename, "rb") as f:
                    buffer = f.read()
            except OSError:
                print(f"Error: cannot open file: {filename}")
                sys.exit(1)

            #copy at most 16 bytes into the CPU's memory
            for i in range(min(len(buffer), 16)):
                self.memory[i] = buffer[i]

    def execute(self):
        #decode the opcode from the instruction by masking the higher 4 bits
        opcode = (self.IR & 0xF0) >> 4

        if(opcode == 0x0):
            #NOP
            pass
        elif(opcode == 0x1):
            #LDA
            self.A = self.memory[self.IR & 0x0F]
        elif(opcode == 0x2):
            #ADD
            self.B = self.memory[self.IR & 0x0F]
            self.CF = (self.A + self.B) > 255
            self.A = (self.A + self.B) & 0xFF  #keep A within 8 bits
            self.ZF = self.A == 0
        elif(opcode == 0x3):
            #SUB
            self.B = self.memory[self.IR & 0x0F]
            #no borrow (A >= B) sets the carry flag, as in two's complement subtraction
            self.CF = self.A >= self.B
            self.A = (self.A - self.B) & 0xFF  #keep A within 8 bits
            self.ZF = self.A == 0
        elif(opcode == 0x4):
            #STA
            self.memory[(self.IR & 0x0F)] = self.A
        elif(opcode == 0x5):
            #LDI
            self.A = self.IR & 0x0F
        elif(opcode == 0x6):
            #JMP - jump to address (self.IR & 0x0F) and do not increment the program counter
            self.pc = (self.IR & 0x0F) - 1
        elif(opcode == 0x7):
            #JC
            if self.CF:
                self.pc = (self.IR & 0x0F) - 1
        elif(opcode == 0x8):
            #JZ
            if self.ZF:
                self.pc = (self.IR & 0x0F) - 1
        elif(opcode == 0xE):
            #OUT
            self.OUT = self.A
            print(f"OUT : {self.OUT}")
        elif(opcode == 0xF):
            #HLT
            self.HALT = True
        else:
            print(f"Illegal opcode {hex(opcode)}")


def main(filename, speed):
    cpu = CPU()
    cpu.loadProgram(filename)

    delay = float(speed) #seconds to wait between instructions

    while not cpu.HALT:
        try:
            #fetch instruction into Instruction Register
            cpu.IR = cpu.memory[cpu.pc]
        except IndexError:
            #the program counter ran past the 16 bytes of RAM
            print("HALTING System...")
            break
        cpu.execute()
        cpu.pc += 1
        time.sleep(delay) #clock speed

if __name__ == "__main__":
    try:
        filename = sys.argv[1]
        speed = sys.argv[2]
    except IndexError:
        print(" ------------------------------------------- ")
        print("|Usage: python3 cpu.py <bin file> <speed>   |")
        print(" ------------------------------------------- \n")
        print(" ------------------------------------------- ")
        print("| <bin file> : compiled asm file            |\n| <speed> : (0 to 1), 0 fastest, 1 slowest  |\n")
        print("| Default program: Triangular Numbers       |\n| Run: python3 cpu.py default <speed>       |")
        print(" ------------------------------------------- \n")
        print(" ------------------------------------------- ")
        print("| Example: python3 cpu.py fib.bin 0.05      |")
        print(" ------------------------------------------- \n")
        exit()
    main(filename, speed)
				
			

Running the program

Once you have written the code, run it by specifying a bin file and the speed of execution, and the output should look something like this.

That's all! You can find the code on my GitHub repo. I have also provided a simple assembler; you can find more info in the README.md file on GitHub.