Bytecode Instruction Set Architecture
Complete specification of ilusm bytecode opcodes, operand formats, and VM instruction set. The authoritative implementation lives in lib/backend/mcde.ilu.
Overview
ilusm bytecode is a stack-based instruction set designed for compact encoding and efficient execution. The VM evaluates each instruction against an operand stack: inputs are popped from the top of the stack and results are pushed back onto it.
Key Features
- Stack-based - All operations work on an operand stack
- Compact encoding - Single-byte opcodes with variable-length operands
- Type-aware - Instructions include type information for optimization
- Extensible - Versioned instruction set for future extensions
Version Information
Current bytecode version: v6 (ILU_BC_VER in mcde.ilu)
- v1: Base instruction set
- v2: Added swp, ovr, mod, neg, not (opcodes 28-32)
- v3: Added dp2, rot, nip, neq, le, ge, gt, ld0, ld1, inc, dec, bpt, asr (opcodes 33-45)
- v4: Added dr2, tck (tuck), pck (pick depth) (opcodes 46-48)
- v6: Added channel operations and task spawning (opcodes 55-60)
Instruction Format
Basic Format
[opcode:1] [operand1] [operand2] ...
Operand Types
| Type | Size | Description |
|---|---|---|
| u8 | 1 byte | Unsigned 8-bit integer |
| i8 | 1 byte | Signed 8-bit integer |
| u16 | 2 bytes | Unsigned 16-bit integer (little-endian) |
| i16 | 2 bytes | Signed 16-bit integer (little-endian) |
| u32 | 4 bytes | Unsigned 32-bit integer (little-endian) |
| i32 | 4 bytes | Signed 32-bit integer (little-endian) |
| var | variable | Variable length (string, list, object) |
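The fixed-width operand types map directly onto little-endian integer packing. The following is a minimal Python sketch, not the mcde.ilu encoder; the names `OPERAND_FORMATS`, `encode_operand`, and `decode_operand` are hypothetical and exist only for illustration.

```python
import struct

# Operand type name -> struct format string (little-endian, per the table above).
OPERAND_FORMATS = {
    "u8": "<B", "i8": "<b",
    "u16": "<H", "i16": "<h",
    "u32": "<I", "i32": "<i",
}

def encode_operand(kind, value):
    """Pack an integer into the byte layout of the given operand type."""
    return struct.pack(OPERAND_FORMATS[kind], value)

def decode_operand(kind, data, offset=0):
    """Return (value, next_offset) for an operand that starts at `offset`."""
    fmt = OPERAND_FORMATS[kind]
    (value,) = struct.unpack_from(fmt, data, offset)
    return value, offset + struct.calcsize(fmt)

print(encode_operand("u16", 0x1234).hex())   # 3412
print(encode_operand("i16", -4660).hex())    # cced
print(decode_operand("i8", b"\xfb"))         # (-5, 1)
```

Signed types use two's complement, so the same two bytes decode to different values depending on whether the instruction expects u16 or i16.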
Constant Pool
Complex values are stored in the constant pool and referenced by index:
- Strings - UTF-8 encoded
- Numbers - IEEE 754 format
- Lists and Objects - Serialized representation
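As an illustration of index-based references, here is a small Python sketch of a pool that hands out indices and emits the matching LDC_POOL instruction. It assumes a one-byte index operand and deduplicates equal constants; both are assumptions of this sketch, and `ConstantPool` and `emit_ldc_pool` are hypothetical names. It handles only hashable values (strings and numbers), not lists or objects.

```python
LDC_POOL = 0x05  # opcode value from the opcode table

class ConstantPool:
    """Collects constants and assigns each distinct value a stable index."""

    def __init__(self):
        self.values = []
        self._index = {}

    def add(self, value):
        # Key on the type name as well, so 1 and 1.0 get separate slots.
        key = (type(value).__name__, value)
        if key not in self._index:
            self._index[key] = len(self.values)
            self.values.append(value)
        return self._index[key]

def emit_ldc_pool(pool, value):
    """Return the bytes of an LDC_POOL instruction that loads `value`."""
    index = pool.add(value)
    if index > 0xFF:
        raise ValueError("constant pool index does not fit in one byte")
    return bytes([LDC_POOL, index])

pool = ConstantPool()
print(emit_ldc_pool(pool, "hello").hex())   # 0500
print(emit_ldc_pool(pool, 3.14159).hex())   # 0501
print(emit_ldc_pool(pool, "hello").hex())   # 0500 (existing entry reused)
```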
Complete Opcode Table
| Hex | Dec | Mnemonic | Stack Effect | Description |
|---|---|---|---|---|
| 0x00 | 0 | NOP | - | No operation |
| 0x01 | 1 | LDC_U8 | → u8 | Load unsigned 8-bit constant |
| 0x02 | 2 | LDC_I8 | → i8 | Load signed 8-bit constant |
| 0x03 | 3 | LDC_U16 | → u16 | Load unsigned 16-bit constant |
| 0x04 | 4 | LDC_I16 | → i16 | Load signed 16-bit constant |
| 0x05 | 5 | LDC_POOL | → value | Load from constant pool |
| 0x06 | 6 | LDC_TRUE | → tru | Load boolean true |
| 0x07 | 7 | LDC_FALSE | → fls | Load boolean false |
| 0x08 | 8 | LDC_NIL | → nil | Load nil value |
| 0x09 | 9 | GLO_LOAD | → value | Load global variable |
| 0x0A | 10 | GLO_STORE | value → | Store global variable |
| 0x0B | 11 | LOC_LOAD | → value | Load local variable |
| 0x0C | 12 | LOC_STORE | value → | Store local variable |
| 0x0D | 13 | DUP | x → x x | Duplicate top of stack |
| 0x0E | 14 | DROP | x → | Drop top of stack |
| 0x0F | 15 | SWAP | x y → y x | Swap top two stack items |
| 0x10 | 16 | ADD | x y → x+y | Addition |
| 0x11 | 17 | SUB | x y → x-y | Subtraction |
| 0x12 | 18 | MUL | x y → x*y | Multiplication |
| 0x13 | 19 | DIV | x y → x/y | Division |
| 0x14 | 20 | MOD | x y → x%y | Modulo |
| 0x15 | 21 | CALL | ... → result | Call function (address in operand) |
| 0x16 | 22 | CALLV | ... → result | Call variable function |
| 0x17 | 23 | RET | value → | Return from function |
| 0x18 | 24 | JMP | - | Unconditional jump |
| 0x19 | 25 | JMP_IF | cond → | Conditional jump (if truthy) |
| 0x1A | 26 | JMP_IF_NOT | cond → | Conditional jump (if falsy) |
| 0x1B | 27 | EQ | x y → bool | Equality comparison |
| 0x1C | 28 | LT | x y → bool | Less than comparison |
| 0x1D | 29 | LE | x y → bool | Less than or equal comparison |
| 0x1E | 30 | GT | x y → bool | Greater than comparison |
| 0x1F | 31 | GE | x y → bool | Greater than or equal comparison |
| 0x20 | 32 | NEQ | x y → bool | Not equal comparison |
| 0x21 | 33 | AND | x y → bool | Logical AND |
| 0x22 | 34 | OR | x y → bool | Logical OR |
| 0x23 | 35 | NOT | x → !x | Logical NOT |
| 0x24 | 36 | NEG | x → -x | Negation |
| 0x25 | 37 | STR_AT | str i → char | String character access |
| 0x26 | 38 | STR_SUB | str i j → sub | String substring |
| 0x27 | 39 | STR_LEN | str → len | String length |
| 0x28 | 40 | STR_CAT | str1 str2 → str3 | String concatenation |
| 0x29 | 41 | FILE_READ | path → content | Read file (syscall) |
| 0x2A | 42 | FILE_WRITE | path content → | Write file (syscall) |
| 0x2B | 43 | LIST_NEW | → list | Create new list |
| 0x2C | 44 | LIST_GET | list i → value | List element access |
| 0x2D | 45 | LIST_SET | list i value → | List element assignment |
| 0x2E | 46 | LIST_LEN | list → len | List length |
| 0x2F | 47 | OBJ_NEW | → obj | Create new object |
| 0x30 | 48 | OBJ_GET | obj key → value | Object property access |
| 0x31 | 49 | OBJ_SET | obj key value → | Object property assignment |
| 0x32 | 50 | TYP | value → type | Get type of value |
| 0x33 | 51 | INT | value → int | Convert to integer |
| 0x34 | 52 | STR | value → str | Convert to string |
| 0x35 | 53 | | value → | Print value to stdout |
| 0x36 | 54 | HALT | - | Stop execution |
| 0x37 | 55 | SWP | x y z → y x z | Swap with third item |
| 0x38 | 56 | OVR | x y → x y x | Over (copy second item to top) |
| 0x39 | 57 | ROT | x y z → y z x | Rotate top three items |
| 0x3A | 58 | NIP | x y → y | Nip (remove second item) |
| 0x3B | 59 | DP2 | x y → x y x y | Duplicate pair |
| 0x3C | 60 | TCK | x y → y x | Tuck (copy under) |
| 0x3D | 61 | PCK | ... → item | Pick from stack depth |
| 0x3E | 62 | LD0 | → 0 | Load zero constant |
| 0x3F | 63 | LD1 | → 1 | Load one constant |
| 0x40 | 64 | INC | x → x+1 | Increment |
| 0x41 | 65 | DEC | x → x-1 | Decrement |
| 0x42 | 66 | BPT | - | Breakpoint (debug) |
| 0x43 | 67 | ASR | cond msg → | Assertion (error if false) |
| 0x44 | 68 | DR2 | x y → y x y x | Duplicate and rotate |
| 0x45 | 69 | CHN_NEW | cap → chan | Create new channel |
| 0x46 | 70 | CHN_SEND | chan val → | Send to channel |
| 0x47 | 71 | CHN_RECV | chan → val | Receive from channel |
| 0x48 | 72 | CHN_CLOSE | chan → | Close channel |
| 0x49 | 73 | SPAWN | argc → task | Spawn new task |
| 0x4A | 74 | WAIT | task → result | Wait for task completion |
| 0x4B | 75 | YIELD | - | Yield to scheduler |
| 0x4C | 76 | PANIC | msg → | Panic with message |
| 0x4D | 77 | DEBUG | value → | Debug output |
| 0x4E | 78 | GARBAGE | - | Force garbage collection |
| 0x4F | 79 | PROF_BEGIN | name → | Begin profiling block |
| 0x50 | 80 | PROF_END | name → | End profiling block |
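To make the Stack Effect column concrete, the sketch below interprets a handful of opcodes from the table in Python. It is an illustration only, not the ilusm VM: it covers eight opcodes, assumes LDC_U8 carries a one-byte immediate, and the function name `run` is hypothetical.

```python
# Opcode values copied from the table above; only this subset is handled.
NOP, LDC_U8, DUP, DROP = 0x00, 0x01, 0x0D, 0x0E
ADD, SUB, MUL, HALT = 0x10, 0x11, 0x12, 0x36

def run(code):
    """Execute a byte string and return the final operand stack."""
    stack = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == NOP:
            continue
        elif op == LDC_U8:
            stack.append(code[pc])  # one-byte immediate operand
            pc += 1
        elif op == DUP:
            stack.append(stack[-1])
        elif op == DROP:
            stack.pop()
        elif op in (ADD, SUB, MUL):
            y = stack.pop()  # top of stack is the right-hand operand
            x = stack.pop()
            stack.append({ADD: x + y, SUB: x - y, MUL: x * y}[op])
        elif op == HALT:
            break
        else:
            raise ValueError(f"unsupported opcode 0x{op:02X} at offset {pc - 1}")
    return stack

# (2 + 3) * 4
program = bytes([LDC_U8, 2, LDC_U8, 3, ADD, LDC_U8, 4, MUL, HALT])
print(run(program))  # [20]
```

For binary operations the table's `x y → x-y` notation means `y` was pushed last, which is why the sketch pops `y` before `x`.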
Operand Encoding Details
Constant Loading
    # Small integers (direct encoding)
    0x01 42          ; LDC_U8 42
    0x02 0xFB        ; LDC_I8 -5 (two's complement)
    0x03 0x34 0x12   ; LDC_U16 0x1234
    0x04 0xCC 0xED   ; LDC_I16 -4660 (two's complement of 0x1234)

    # Constant pool
    0x05 0x00        ; LDC_POOL index 0 (string "hello")
    0x05 0x01        ; LDC_POOL index 1 (number 3.14159)

    # Built-in constants
    0x06             ; LDC_TRUE
    0x07             ; LDC_FALSE
    0x08             ; LDC_NIL
Variable Access
    # Global variables
    0x09 0x00   ; GLO_LOAD global index 0
    0x0A 0x01   ; GLO_STORE global index 1

    # Local variables (function frame)
    0x0B 0x00   ; LOC_LOAD local index 0
    0x0C 0x01   ; LOC_STORE local index 1
Control Flow
    # Jumps (relative offsets)
    0x18 0x0A 0x00   ; JMP +10 bytes
    0x19 0x05 0x00   ; JMP_IF +5 bytes (if truthy)
    0x1A 0x08 0x00   ; JMP_IF_NOT +8 bytes (if falsy)

    # Function calls
    0x15 0x50 0x00   ; CALL to address 0x0050
    0x16 0x03        ; CALLV function at constant pool index 3
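The encodings in this section are enough to sketch a small disassembler for the operand-carrying instructions. The Python below is illustrative only: the operand widths are inferred from the examples in this section rather than taken from mcde.ilu, jump offsets are assumed to be signed 16-bit values, and `OPERANDS` and `disassemble` are hypothetical names. Opcodes outside the table raise `KeyError`.

```python
import struct

# opcode -> (mnemonic, struct format of its operand, or None for no operand).
# Widths are inferred from the encoding examples; treat them as assumptions.
OPERANDS = {
    0x01: ("LDC_U8", "<B"), 0x02: ("LDC_I8", "<b"),
    0x03: ("LDC_U16", "<H"), 0x04: ("LDC_I16", "<h"),
    0x05: ("LDC_POOL", "<B"),
    0x06: ("LDC_TRUE", None), 0x07: ("LDC_FALSE", None), 0x08: ("LDC_NIL", None),
    0x09: ("GLO_LOAD", "<B"), 0x0A: ("GLO_STORE", "<B"),
    0x0B: ("LOC_LOAD", "<B"), 0x0C: ("LOC_STORE", "<B"),
    0x15: ("CALL", "<H"), 0x16: ("CALLV", "<B"),
    0x18: ("JMP", "<h"), 0x19: ("JMP_IF", "<h"), 0x1A: ("JMP_IF_NOT", "<h"),
}

def disassemble(code):
    """Return one text line per instruction found in `code`."""
    out = []
    pc = 0
    while pc < len(code):
        name, fmt = OPERANDS[code[pc]]
        pc += 1
        if fmt is None:
            out.append(name)
        else:
            (value,) = struct.unpack_from(fmt, code, pc)
            pc += struct.calcsize(fmt)
            out.append(f"{name} {value}")
    return out

sample = bytes([0x03, 0x34, 0x12, 0x02, 0xFB, 0x18, 0x0A, 0x00, 0x06])
print(disassemble(sample))
# ['LDC_U16 4660', 'LDC_I8 -5', 'JMP 10', 'LDC_TRUE']
```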
Stack Operations
Basic Stack Manipulation
    # Before: [a, b, c]
    0x0D   ; DUP
    # After: [a, b, c, c]

    # Before: [a, b, c]
    0x0E   ; DROP
    # After: [a, b]

    # Before: [a, b, c]
    0x0F   ; SWAP
    # After: [a, c, b]
Advanced Stack Operations
    # Before: [a, b, c]
    0x37   ; SWP
    # After: [b, a, c]

    # Before: [a, b, c]
    0x38   ; OVR
    # After: [a, b, c, b]

    # Before: [a, b, c]
    0x39   ; ROT
    # After: [b, c, a]

    # Before: [a, b, c]
    0x3A   ; NIP
    # After: [a, c]
Pick Operations
    # Pick from stack depth (PCK)
    # Stack: [a, b, c, d, e]
    0x3D 0x02   ; PCK depth 2 → returns c
    # Stack: [a, b, c, d, e, c]

    # Tuck (TCK)
    # Stack: [a, b, c]
    0x3C        ; TCK
    # Stack: [a, c, b]
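The before/after listings can be checked by modelling the operand stack as a Python list whose last element is the top of the stack. The sketch below covers a subset of the operations (DUP, DROP, SWAP, ROT, NIP, PCK); the function names are hypothetical and the code is not taken from the VM.

```python
def dup(stack):
    """x → x x"""
    stack.append(stack[-1])

def drop(stack):
    """x →"""
    stack.pop()

def swap(stack):
    """x y → y x"""
    stack[-1], stack[-2] = stack[-2], stack[-1]

def rot(stack):
    """x y z → y z x"""
    stack.append(stack.pop(-3))

def nip(stack):
    """x y → y"""
    del stack[-2]

def pck(stack, depth):
    """Copy the item `depth` slots below the top (depth 0 is the top itself)."""
    stack.append(stack[-1 - depth])

s = ["a", "b", "c"]
rot(s)
print(s)  # ['b', 'c', 'a']

s = ["a", "b", "c", "d", "e"]
pck(s, 2)
print(s)  # ['a', 'b', 'c', 'd', 'e', 'c']
```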
Control Flow Instructions
Function Calls
    # Direct call (fixed address)
    0x15 addr_lo addr_hi   ; CALL to absolute address

    # Variable call (indirect)
    0x16 const_idx         ; CALLV function at constant pool index

    # Return
    0x17                   ; RET (return value on stack)
Conditional Jumps
    # If statement pattern
    ; ... condition evaluation ...
    0x19 true_offset_lo true_offset_hi   ; JMP_IF true_branch
    ; ... false branch code ...
    0x18 end_offset_lo end_offset_hi     ; JMP to end
    true_branch:
    ; ... true branch code ...
    end:
Loops
    # While loop pattern
    loop_start:
    ; ... condition evaluation ...
    0x1A end_offset_lo end_offset_hi       ; JMP_IF_NOT end
    ; ... loop body code ...
    0x18 start_offset_lo start_offset_hi   ; JMP to start
    end:
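A compiler emitting the while-loop pattern has to backpatch the forward jump, because the distance to `end` is unknown until the body has been emitted. The Python sketch below shows one way to do this. It assumes that jump offsets are signed 16-bit little-endian values measured from the byte following the jump's operand; the specification states only that offsets are relative, so check that base against the VM before relying on it. `emit_while` is a hypothetical helper.

```python
import struct

JMP, JMP_IF_NOT = 0x18, 0x1A  # opcode values from the opcode table

def emit_while(cond, body):
    """Assemble `while cond: body` from already-encoded cond and body bytes."""
    code = bytearray()
    loop_start = len(code)
    code += cond
    # Forward jump: write a placeholder operand and patch it once `end` is known.
    code.append(JMP_IF_NOT)
    patch_at = len(code)
    code += b"\x00\x00"
    code += body
    # Backward jump to loop_start (negative offset under the stated assumption).
    code.append(JMP)
    after_jmp = len(code) + 2
    code += struct.pack("<h", loop_start - after_jmp)
    end = len(code)
    struct.pack_into("<h", code, patch_at, end - (patch_at + 2))
    return bytes(code)

cond = bytes([0x0B, 0x00])                    # LOC_LOAD 0
body = bytes([0x0B, 0x00, 0x41, 0x0C, 0x00])  # LOC_LOAD 0, DEC, LOC_STORE 0
print(emit_while(cond, body).hex(" "))
# 0b 00 1a 08 00 0b 00 41 0c 00 18 f3 ff
```

In the output, `1a 08 00` skips 8 bytes forward to `end`, and `18 f3 ff` jumps 13 bytes back to `loop_start`.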
Object and Collection Operations
List Operations
    # Create list
    0x2B                     ; LIST_NEW (empty list)
    ; ... push elements ...
    0x2D 0 0                 ; LIST_SET index 0
    0x2D 1 1                 ; LIST_SET index 1

    # Access list
    0x2C index_lo index_hi   ; LIST_GET

    # List length
    0x2E                     ; LIST_LEN
Object Operations
    # Create object
    0x2F           ; OBJ_NEW (empty object)

    # Set property (stack: obj key value →)
    0x05 key_idx   ; LDC_POOL (push key)
    ; ... push value ...
    0x31           ; OBJ_SET

    # Get property (stack: obj key → value)
    0x05 key_idx   ; LDC_POOL (push key)
    0x30           ; OBJ_GET
Type Operations
    # Get type
    0x32   ; TYP (returns type string)

    # Type conversion
    0x33   ; INT (convert to integer)
    0x34   ; STR (convert to string)
Version History
v6 (Current)
Added concurrency and debugging support:
- Channel operations (0x45-0x48)
- Task spawning and waiting (0x49-0x4A)
- Yield instruction (0x4B)
- Enhanced debugging (0x4C-0x4D)
- Profiling support (0x4F-0x50)
v4
Added advanced stack operations:
- DR2 - Duplicate and rotate (0x44)
- TCK - Tuck operation (0x3C)
- PCK - Pick from depth (0x3D)
v3
Added comparison and arithmetic extensions:
- Extended comparisons (LE, GT, GE, NEQ)
- Logical operations (AND, OR, NOT)
- Negation (NEG)
- Convenience constants (LD0, LD1)
- Increment/Decrement (INC, DEC)
- Assertion (ASR)
v2
Added stack manipulation primitives:
- SWP - Swap with third
- OVR - Over (duplicate second)
- ROT - Rotate three
- NIP - Nip (remove second)
- MOD - Modulo operation
v1
Base instruction set with:
- Stack operations (DUP, DROP, SWAP)
- Arithmetic (ADD, SUB, MUL, DIV)
- Basic control flow (JMP, CALL, RET)
- Object and list operations
- String operations
Implementation Notes
Reference Implementation
The authoritative bytecode interpreter is in lib/backend/ilusm_vm.ilu. If VM behavior differs from this specification, fix the VM to match the specification.
Validation
Bytecode validation is performed by mcde_prog_ok in lib/backend/mcde.ilu. Validation includes:
- Byte range checking
- Operand validation
- Jump target verification
- Constant pool bounds checking
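The four checks listed above can be sketched in a single pass over the bytecode. The Python below is an illustrative analogue, not a port of mcde_prog_ok. It assumes the operand widths shown in the encoding examples (every other opcode is treated as having no operand), signed 16-bit jump offsets measured from the byte after the operand, and one-byte constant pool indices. `validate` and the lookup tables are hypothetical names.

```python
import struct

MAX_OPCODE = 0x50  # highest opcode in the table
# Operand byte counts inferred from the encoding examples; others assumed 0.
WIDTHS = {0x01: 1, 0x02: 1, 0x03: 2, 0x04: 2, 0x05: 1, 0x09: 1, 0x0A: 1,
          0x0B: 1, 0x0C: 1, 0x15: 2, 0x16: 1, 0x18: 2, 0x19: 2, 0x1A: 2,
          0x3D: 1}
JUMPS = {0x18, 0x19, 0x1A}   # JMP, JMP_IF, JMP_IF_NOT
POOL_REFS = {0x05, 0x16}     # LDC_POOL, CALLV

def validate(code, pool_size):
    """Return a list of error strings; an empty list means all checks passed."""
    errors = []
    starts = set()   # offsets where an instruction begins
    jumps = []       # (instruction offset, computed target offset)
    pc = 0
    while pc < len(code):
        op = code[pc]
        starts.add(pc)
        if op > MAX_OPCODE:
            errors.append(f"{pc}: opcode 0x{op:02X} out of range")
            return errors  # the next instruction boundary is unknown
        width = WIDTHS.get(op, 0)
        if pc + 1 + width > len(code):
            errors.append(f"{pc}: truncated operand")
            return errors
        if op in POOL_REFS and code[pc + 1] >= pool_size:
            errors.append(f"{pc}: constant pool index {code[pc + 1]} out of bounds")
        if op in JUMPS:
            (offset,) = struct.unpack_from("<h", code, pc + 1)
            jumps.append((pc, pc + 1 + width + offset))
        pc += 1 + width
    for at, target in jumps:
        # A jump may land on any instruction start or exactly at the end.
        if target != len(code) and target not in starts:
            errors.append(f"{at}: jump target {target} is not an instruction boundary")
    return errors

print(validate(bytes.fromhex("0b001a08000b00410c0018f3ff"), pool_size=0))  # []
print(validate(bytes([0x05, 0x02]), pool_size=2))
# ['0: constant pool index 2 out of bounds']
```

Jump targets are checked after the scan so that forward jumps can be compared against the complete set of instruction boundaries.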
Debugging Support
Debug instructions for development:
- BPT - Breakpoint for debugger
- DEBUG - Debug output
- PANIC - Unrecoverable error
- ASR - Assertion checking