Welcome back. Picking up where our last post left off, today we’re going to peek behind the curtain and see how the Little Computer 3 (LC-3) actually “understands” our instructions. Imagine you’re writing a letter to someone who only reads binary: you’d need a very specific format for them to understand your message. That’s exactly what we’re doing when we write LC-3 assembly code. We write human-readable instructions, and an assembler translates them into the 1s and 0s that the computer understands.

Here is a simple example. When you write ADD R1, R2, R3 in assembly, you’re saying “take the values in registers R2 and R3, add them together, and put the result in R1.” But the computer needs this instruction packaged in a very specific way—as a 16-bit pattern that tells it exactly what to do. Think of it like filling out a standardized form where each box has a specific meaning:

; Human writes this:
ADD R1, R2, R3    ; "Add R2 and R3, put result in R1"

; Assembler converts to this format:
0001 001 010 0 00 011
|    |   |   | |  |
|    |   |   | |  └── R3 (second source)
|    |   |   | └───── Unused (always 00)
|    |   |   └─────── Mode (0 for register)
|    |   └─────────── R2 (first source)
|    └─────────────── R1 (destination)
└──────────────────── Opcode for ADD
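
To make the "standardized form" idea concrete, here is a minimal Python sketch that packs the same fields into a 16-bit word. The function name encode_add_reg is hypothetical, chosen only for illustration. In register mode, bit 5 is the mode flag (0) and bits 4 and 3 are unused zeros, so only the opcode and the three register numbers need to be shifted into place.

```python
def encode_add_reg(dr: int, sr1: int, sr2: int) -> int:
    """Pack ADD DR, SR1, SR2 (register mode) into one 16-bit word."""
    for reg in (dr, sr1, sr2):
        if not 0 <= reg <= 7:
            raise ValueError(f"register number out of range: {reg}")
    opcode = 0b0001  # ADD
    # Bits 15-12 opcode, 11-9 DR, 8-6 SR1, 5 mode (0), 4-3 unused (00), 2-0 SR2
    return (opcode << 12) | (dr << 9) | (sr1 << 6) | sr2


word = encode_add_reg(1, 2, 3)  # ADD R1, R2, R3
print(f"{word:016b}")  # 0001001010000011
print(f"x{word:04X}")  # x1283
```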

Quick Refresher: The Building Blocks

Before we dive deeper, let’s understand what we’re working with. Think of the LC-3 as a small office with these key components:

  1. Eight filing cabinets (our registers R0-R7) for quick access to values we’re working with.
  2. A massive storage room with exactly 65,536 numbered shelves (our memory locations). Why 65,536? Because with a 16-bit address, we can specify any location from x0000 to xFFFF—that’s 2^16 unique addresses. Each shelf holds one 16-bit value.
  3. A set of standardized procedures (instructions) for moving and processing information. Each procedure starts with a 4-bit code (opcode) that tells us what kind of operation we’re performing.

This setup gives us a perfect balance of speed (registers) and space (memory), while keeping our instructions simple and uniform—every instruction fits in exactly 16 bits.


Basic Arithmetic: ADD and AND

Let’s start with the most fundamental operations. These are like the basic arithmetic you do every day, just formatted in a very specific way for the computer.

The ADD Instruction

ADD comes in two flavors, just like you might add either two numbers from your calculator’s memory or a number from memory plus a small constant. In LC-3 terms:

  1. Register Mode: ADD R3, R1, R2 (Add the contents of R1 and R2)
  2. Immediate Mode: ADD R3, R1, #5 (Add 5 to the contents of R1; the constant must fit in 5 bits, so it can range from -16 to 15)

Let’s see how ADD R3, R1, #5 gets encoded:

0001 011 001 1 00101
|    |   |   | |
|    |   |   | └── 5 in binary (immediate value)
|    |   |   └──── Mode (1 for immediate)
|    |   └──────── R1 (first source)
|    └──────────── R3 (destination)
└───────────────── ADD opcode
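
The same packing can be sketched in Python for immediate mode. The helper name encode_add_imm is an assumption made for this example. The one new wrinkle is that the 5-bit immediate is a two's complement value, so negative constants are masked down to their low five bits.

```python
def encode_add_imm(dr: int, sr1: int, imm5: int) -> int:
    """Pack ADD DR, SR1, #imm5 (immediate mode) into one 16-bit word."""
    if not (0 <= dr <= 7 and 0 <= sr1 <= 7):
        raise ValueError("register number out of range")
    if not -16 <= imm5 <= 15:
        raise ValueError(f"immediate does not fit in 5 bits: {imm5}")
    opcode = 0b0001  # ADD
    mode = 1         # bit 5 set means "immediate"
    # imm5 & 0x1F keeps the low five bits, which is the two's complement form
    return (opcode << 12) | (dr << 9) | (sr1 << 6) | (mode << 5) | (imm5 & 0x1F)


print(f"{encode_add_imm(3, 1, 5):016b}")   # 0001011001100101
print(f"{encode_add_imm(0, 0, -1):016b}")  # 0001000000111111
```

The second call is the ADD R0, R0, #-1 decrement: -1 becomes 11111 in the immediate field.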

When would you use this? Imagine you’re counting items in a loop—you’d use ADD R0, R0, #1 to increment your counter by 1.

The AND Operation

AND works exactly like ADD but performs a bitwise AND operation instead of addition. It’s particularly useful when you need to check specific bits in a value—like checking if a number is odd by ANDing it with 1.
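
As a rough illustration, the Python sketch below encodes AND R1, R0, #1 and mimics the odd-number check. AND uses opcode 0101 with the same field layout as ADD. The names encode_and_imm and low_bit are hypothetical helpers for this example.

```python
def encode_and_imm(dr: int, sr1: int, imm5: int) -> int:
    """Pack AND DR, SR1, #imm5 into one 16-bit word (same layout as ADD)."""
    if not (0 <= dr <= 7 and 0 <= sr1 <= 7):
        raise ValueError("register number out of range")
    if not -16 <= imm5 <= 15:
        raise ValueError(f"immediate does not fit in 5 bits: {imm5}")
    opcode = 0b0101  # AND
    return (opcode << 12) | (dr << 9) | (sr1 << 6) | (1 << 5) | (imm5 & 0x1F)


def low_bit(value: int) -> int:
    """What AND Rx, Rx, #1 leaves behind: 1 for odd values, 0 for even ones."""
    return value & 0x0001


print(f"{encode_and_imm(1, 0, 1):016b}")  # 0101001000100001
print(low_bit(7), low_bit(10))            # 1 0
```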

The NOT Operation

NOT is a simpler operation that flips all the bits in a value—turning every 1 to 0 and every 0 to 1. The instruction looks like this:

NOT R2, R1        ; Flip all bits in R1 and store in R2

In machine code, NOT has a unique pattern:

1001 010 001 111111
|    |   |   |
|    |   |   └── Always 111111 for NOT
|    |   └────── R1 (source)
|    └────────── R2 (destination)
└─────────────── NOT opcode (1001)

You might use NOT when you need to find the opposite of a binary pattern, or as part of calculating other operations like subtraction (by NOTing a number and adding 1, you get its negative).
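
The negation trick is easy to check with a short Python sketch that models 16-bit registers. The helper names (lc3_not, negate, to_signed) are assumptions for this example; masking with 0xFFFF stands in for the fixed 16-bit register width.

```python
MASK16 = 0xFFFF  # keep every result inside 16 bits, like an LC-3 register


def lc3_not(value: int) -> int:
    """Flip all 16 bits."""
    return ~value & MASK16


def negate(value: int) -> int:
    """Two's complement negation: NOT the value, then add 1."""
    return (lc3_not(value) + 1) & MASK16


def to_signed(word: int) -> int:
    """Interpret a 16-bit word as a signed number."""
    return word - 0x10000 if word & 0x8000 else word


print(f"x{lc3_not(0x00FF):04X}")   # xFF00
print(to_signed(negate(5)))        # -5
print((7 + negate(5)) & MASK16)    # 2, i.e. 7 - 5 done with NOT and ADD
```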


Working with Memory: Loading and Storing

Now, let’s talk about moving data between our registers and memory. Think of this like retrieving files from your storage room (loading) or filing them away (storing).

Loading Values (LD)

When you write:

LD R0, MESSAGE    ; Load the value stored at MESSAGE into R0

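The LC-3 needs to know how far away MESSAGE is from our current position. We call this the “offset,” and it’s like saying “walk forward 4 shelves” or “go back 3 shelves” in our storage room analogy. Strictly speaking, the distance is counted from the instruction right after the LD, because the program counter (PC) has already moved on by the time the load happens. The offset is stored in 9 bits, so MESSAGE has to sit within -256 to +255 locations of that point. The assembler calculates this distance for us and encodes it in the instruction.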
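
Here is a small Python sketch of the calculation the assembler performs. The addresses are made up for illustration, and pc_offset9 and encode_ld are hypothetical helper names. The offset is measured from the address after the LD instruction, since the PC has already been incremented when the load executes, and it must fit in 9 signed bits.

```python
def pc_offset9(instr_addr: int, target_addr: int) -> int:
    """Return the 9-bit field for a PC-relative reference."""
    offset = target_addr - (instr_addr + 1)  # measured from the next instruction
    if not -256 <= offset <= 255:
        raise ValueError(f"target is out of range for a 9-bit offset: {offset}")
    return offset & 0x1FF  # two's complement in 9 bits


def encode_ld(dr: int, instr_addr: int, target_addr: int) -> int:
    """Pack LD DR, LABEL: opcode 0010, then DR, then the 9-bit offset."""
    return (0b0010 << 12) | (dr << 9) | pc_offset9(instr_addr, target_addr)


# Assumed layout: LD R0, MESSAGE sits at x3000 and MESSAGE is at x3005.
print(f"{encode_ld(0, 0x3000, 0x3005):016b}")  # 0010000000000100 (forward 4)

# Assumed layout: the LD sits at x3005 and the label is behind it at x3003.
print(f"{encode_ld(0, 0x3005, 0x3003):016b}")  # 0010000111111101 (back 3)
```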

Storing Values (ST)

ST is the opposite of LD—it takes what’s in a register and saves it to memory. You’ll use this when you need to save results for later use:

ST R0, RESULT     ; Save R0's content to the memory location labeled RESULT

Getting Addresses (LEA)

Sometimes you don’t want the contents of a memory location—you want its address instead. That’s where LEA (Load Effective Address) comes in. Think of it like getting the shelf number instead of what’s on the shelf:

LEA R2, MESSAGE   ; Put the address of MESSAGE into R2

This is particularly useful when you’re working with strings or arrays and need to remember where they start in memory. The machine code follows the same pattern as LD, but with opcode 1110.
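
The difference between LD, ST, and LEA can be modeled in a few lines of Python. This is only a toy model: the MEMORY dictionary, the addresses, and the function names are all assumptions for this example, and next_pc stands for the already-incremented program counter.

```python
# Toy memory: assume MESSAGE lives at x3005 and holds the character 'H'.
MEMORY = {0x3005: 0x0048}


def ld(next_pc: int, offset: int) -> int:
    """LD: return the contents of the shelf at next_pc + offset."""
    return MEMORY[(next_pc + offset) & 0xFFFF]


def lea(next_pc: int, offset: int) -> int:
    """LEA: return the shelf number itself, without touching memory."""
    return (next_pc + offset) & 0xFFFF


def st(next_pc: int, offset: int, value: int) -> None:
    """ST: write a register value onto the shelf at next_pc + offset."""
    MEMORY[(next_pc + offset) & 0xFFFF] = value & 0xFFFF


next_pc = 0x3001  # the instruction itself sits at x3000
print(f"x{ld(next_pc, 4):04X}")   # x0048, what is on the shelf
print(f"x{lea(next_pc, 4):04X}")  # x3005, which shelf it is
st(next_pc, 5, 0x1234)
print(f"x{MEMORY[0x3006]:04X}")   # x1234
```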


Control Flow: Making Decisions

Sometimes your program needs to make decisions or jump to different sections. LC-3 provides several ways to do this.

Branching (BR)

Branching is like a road sign that says “if condition X is true, go this way.” For example:

ADD R0, R0, #-1   ; Subtract 1 from R0
BRz DONE          ; If the result was zero, jump to DONE

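The BR instruction looks at the “condition codes” (negative, zero, or positive) and decides whether to jump or continue straight ahead. The codes reflect the last value written to a register, so ADD, AND, NOT, and LD all update them, while instructions such as ST and BR leave them alone. In machine code, it looks like this: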

0000 nzp offset-9-bits
|    |   |
|    |   └── How far to jump (-256 to +255 locations)
|    └────── Which conditions to check (n=4, z=2, p=1)
└─────────── BR opcode (0000)

For example, BRz sets only the ‘z’ bit (010), while BRnzp sets all three bits (111). You can combine these flags any way you need—BRnp would jump if the last result was either negative or positive (but not zero).
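
A short Python sketch shows how the flag letters turn into the 3-bit field and how the branch decision is made. The names FLAGS, nzp_mask, encode_br, and branch_taken are hypothetical, and the addresses are assumed for illustration. As with LD, the offset is counted from the instruction after the branch.

```python
FLAGS = {"n": 0b100, "z": 0b010, "p": 0b001}


def nzp_mask(conditions: str) -> int:
    """Turn a suffix such as 'z' or 'np' into the 3-bit nzp field."""
    mask = 0
    for letter in conditions:
        mask |= FLAGS[letter]
    return mask


def encode_br(conditions: str, instr_addr: int, target_addr: int) -> int:
    """Pack BR<conditions> LABEL; the opcode is 0000, so it adds no bits."""
    offset = target_addr - (instr_addr + 1)
    if not -256 <= offset <= 255:
        raise ValueError(f"branch target out of range: {offset}")
    return (nzp_mask(conditions) << 9) | (offset & 0x1FF)


def branch_taken(conditions: str, result: int) -> bool:
    """Decide the branch from the sign of the last result (a signed int)."""
    code = FLAGS["n"] if result < 0 else FLAGS["z"] if result == 0 else FLAGS["p"]
    return bool(nzp_mask(conditions) & code)


print(f"{encode_br('z', 0x3001, 0x3004):016b}")    # 0000010000000010
print(f"{encode_br('nzp', 0x3002, 0x3000):016b}")  # 0000111111111101
print(branch_taken("np", 0), branch_taken("np", -4))  # False True
```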

Jumping Around (JMP and RET)

Sometimes you want to jump unconditionally to a specific location. The JMP instruction (opcode 1100) does exactly this:

JMP R3            ; Jump to address stored in R3

; Machine code format:
1100 000 011 000000
|    |   |   |
|    |   |   └── Always zeros
|    |   └────── R3 (register containing target address)
|    └────────── Always zeros
└─────────────── JMP opcode (1100)

RET, which we use to return from subroutines, is actually just a special case of JMP—specifically, JMP R7. Since we always store our return address in R7, this gets us back to where we came from.
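
That relationship shows up directly in the bits. In this minimal Python sketch (encode_jmp is a hypothetical helper name), RET is produced by the very same function with register 7.

```python
def encode_jmp(base_r: int) -> int:
    """Pack JMP BaseR: opcode 1100, three zeros, BaseR, six zeros."""
    if not 0 <= base_r <= 7:
        raise ValueError(f"register number out of range: {base_r}")
    return (0b1100 << 12) | (base_r << 6)


print(f"{encode_jmp(3):016b}")  # 1100000011000000  (JMP R3)
print(f"{encode_jmp(7):016b}")  # 1100000111000000  (RET, i.e. JMP R7)
```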

Subroutine Calls (JSR/JSRR)

Sometimes you want to jump to another part of your program but remember where you came from—like marking your page in a book before checking the index. JSR (Jump to Subroutine) and JSRR (Jump to Subroutine Register) handle this:

JSR PRINT_NUM     ; Jump to PRINT_NUM routine, saving return address in R7
; ... (PRINT_NUM does its work)
RET              ; Return to where we came from using R7

This is particularly useful for code you use repeatedly, like printing numbers or handling input. The computer automatically saves the return address in R7 (that’s why we call R7 the “link register”).
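
For completeness, here is a hedged Python sketch of how the two forms are encoded. Both share opcode 0100; bit 11 picks the form. JSR carries an 11-bit PC-relative offset, while JSRR names a base register. The helper names and the addresses are assumptions for this example.

```python
def encode_jsr(instr_addr: int, target_addr: int) -> int:
    """Pack JSR LABEL: opcode 0100, bit 11 set, 11-bit PC-relative offset."""
    offset = target_addr - (instr_addr + 1)
    if not -1024 <= offset <= 1023:
        raise ValueError(f"subroutine out of range: {offset}")
    return (0b0100 << 12) | (1 << 11) | (offset & 0x7FF)


def encode_jsrr(base_r: int) -> int:
    """Pack JSRR BaseR: opcode 0100, bit 11 clear, register in bits 8-6."""
    if not 0 <= base_r <= 7:
        raise ValueError(f"register number out of range: {base_r}")
    return (0b0100 << 12) | (base_r << 6)


def return_address(instr_addr: int) -> int:
    """The value saved in R7: the address right after the call."""
    return (instr_addr + 1) & 0xFFFF


# Assumed layout: the JSR sits at x3000 and PRINT_NUM starts at x3010.
print(f"x{encode_jsr(0x3000, 0x3010):04X}")  # x480F
print(f"x{encode_jsrr(2):04X}")              # x4080
print(f"x{return_address(0x3000):04X}")      # x3001
```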


A Complete Example

Let’s put it all together with a simple program that counts down from 5 to 0:

        .ORIG x3000           ; Start our program at memory location x3000
        
        AND R0, R0, #0        ; Clear R0 so it starts from a known value
        ADD R0, R0, #5        ; R0 ← 5 (our counter)
LOOP    ADD R0, R0, #-1       ; Subtract 1
        BRp LOOP             ; If R0 > 0, keep counting down
        HALT                 ; Stop when we reach zero

        .END
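
To see what the loop does step by step, here is a rough Python model of it. It assumes R0 holds 5 when LOOP is first reached, and run_countdown and to_signed are hypothetical names for this example. It is a trace of the loop's behavior, not an LC-3 simulator.

```python
def to_signed(word: int) -> int:
    """Interpret a 16-bit word as a signed number."""
    return word - 0x10000 if word & 0x8000 else word


def run_countdown(start: int = 5) -> list[int]:
    """Record R0 after each pass through the loop."""
    r0 = start & 0xFFFF
    trace = []
    while True:
        r0 = (r0 - 1) & 0xFFFF   # LOOP  ADD R0, R0, #-1
        trace.append(r0)
        if to_signed(r0) > 0:    # BRp LOOP: branch only while R0 is positive
            continue
        break                    # fall through to HALT
    return trace


print(run_countdown())  # [4, 3, 2, 1, 0]
```

The branch is taken four times and falls through on the fifth pass, when the ADD produces zero and the p flag is no longer set.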

System Calls and Pseudo-Ops

TRAP Instructions

The LC-3 provides several built-in routines through TRAP instructions. Think of these as pre-written helper functions that handle common tasks:

TRAP x21   ; OUT - Display character from R0
TRAP x23   ; IN - Read character into R0
TRAP x25   ; HALT - Stop the program

Each TRAP instruction has an 8-bit “trap vector” (like x21, x23, x25) that tells the computer which helper routine to use. In machine code, TRAP uses opcode 1111, then four zero bits, then the 8-bit vector number, so TRAP x25 assembles to xF025.
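
A minimal Python sketch of the TRAP layout follows. The name encode_trap is a hypothetical helper; the layout is the opcode in the top four bits, four zero bits, and the vector in the low eight bits.

```python
def encode_trap(vector: int) -> int:
    """Pack TRAP vector: opcode 1111, four zero bits, 8-bit trap vector."""
    if not 0 <= vector <= 0xFF:
        raise ValueError(f"trap vector does not fit in 8 bits: {vector}")
    return (0b1111 << 12) | vector


for name, vector in (("OUT", 0x21), ("IN", 0x23), ("HALT", 0x25)):
    print(f"{name:<4} x{encode_trap(vector):04X}")
# OUT  xF021
# IN   xF023
# HALT xF025
```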

Assembler Directives (Pseudo-ops)

While not actual CPU instructions, pseudo-ops help organize our program:

  • .ORIG x3000: “Start putting the program here in memory”
  • .END: “This is the end of our program”
  • .FILL x1234: “Put this value (x1234) at this spot in memory”
  • .BLKW #5: “Reserve 5 words of memory here”
  • .STRINGZ "Hello": “Store this string here, with a zero at the end”
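
One way to picture the data directives is as functions that return the words they place in memory. The Python sketch below uses hypothetical names (fill, blkw, stringz), and it assumes that .BLKW leaves the reserved words as zeros, which is a common assembler choice rather than something this post specifies. Each character of a string occupies one full 16-bit word.

```python
def fill(value: int) -> list[int]:
    """.FILL: one word holding the given value."""
    return [value & 0xFFFF]


def blkw(count: int) -> list[int]:
    """.BLKW: reserve count words (assumed here to start out as zero)."""
    return [0] * count


def stringz(text: str) -> list[int]:
    """.STRINGZ: one word per character, followed by a zero terminator."""
    return [ord(ch) for ch in text] + [0]


print([f"x{w:04X}" for w in fill(0x1234)])  # ['x1234']
print(len(blkw(5)))                         # 5
print(stringz("Hello"))                     # [72, 101, 108, 108, 111, 0]
```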

Quick Reference Guide

Here’s a handy reference for the terms we’ve covered:

  • Opcode: The 4-bit code that starts each instruction (like 0001 for ADD)
  • Immediate Value: A constant number built right into the instruction
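  • Offset: The signed distance to a memory location, counted from the instruction after the current one; used in LD/ST/LEA/BR/JSR instructions
  • Condition Codes: Flags (negative/zero/positive) set by instructions that write a register, such as ADD, AND, NOT, and LD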
  • Trap Vector: An 8-bit code identifying which system routine to call
  • Link Register: R7, used to store return addresses for subroutines

Wrapping Up: From Assembly to Bits

Understanding how assembly maps to machine code demystifies how computers work at their core. Each 16-bit instruction is like a tiny, self-contained message telling the CPU exactly what to do. While we’ve covered the basics here, in our next post we’ll dive into building an actual assembler that can translate assembly code into these precise patterns.

Whether we’re planning to write low-level code or work with high-level AI models, I believe this foundation helps us understand how computers actually process our instructions under the hood.

That’s it for today. Until next time, happy coding—and happy bit-wrangling!