Bytecode Instruction Set Architecture

Complete specification of ilusm bytecode opcodes, operand formats, and VM instruction set. The authoritative implementation lives in lib/backend/mcde.ilu.

Overview

ilusm bytecode is a stack-based instruction set designed for compact representation and efficient execution. Instructions take their inputs from an operand stack and push their results back onto it.

Key Features

  • Stack-based - All operations work on an operand stack
  • Compact encoding - Single-byte opcodes with variable-length operands
  • Type-aware - Instructions include type information for optimization
  • Extensible - Versioned instruction set for future extensions

Version Information

Current bytecode version: v6 (ILU_BC_VER in mcde.ilu)

  • v1: Base instruction set
  • v2: Added swp, ovr, mod, neg, not (opcodes 28-32)
  • v3: Added dp2, rot, nip, neq, le, ge, gt, ld0, ld1, inc, dec, bpt, asr (opcodes 33-45)
  • v4: Added dr2, tck (tuck), pck (pick depth) (opcodes 46-48)
  • v6: Added channel operations and task spawning (opcodes 55-60)

Instruction Format

Basic Format

[opcode:1] [operand1] [operand2] ...

Operand Types

Type  Size      Description
u8    1 byte    Unsigned 8-bit integer
i8    1 byte    Signed 8-bit integer
u16   2 bytes   Unsigned 16-bit integer (little-endian)
i16   2 bytes   Signed 16-bit integer (little-endian)
u32   4 bytes   Unsigned 32-bit integer (little-endian)
i32   4 bytes   Signed 32-bit integer (little-endian)
var   variable  Variable length (string, list, object)
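
The fixed-width rows of this table map directly onto little-endian reads. The following is a minimal Python sketch, assuming operands follow the opcode byte with no padding. `read_operand` and `OPERAND_FORMATS` are hypothetical names for illustration and are not part of mcde.ilu.

```python
import struct

# struct format codes for the fixed-width operand types.
# "<" forces little-endian, matching the u16/i16/u32/i32 rows.
OPERAND_FORMATS = {
    "u8": "<B", "i8": "<b",
    "u16": "<H", "i16": "<h",
    "u32": "<I", "i32": "<i",
}

def read_operand(code: bytes, offset: int, kind: str):
    """Return (value, next_offset) for one fixed-width operand."""
    fmt = OPERAND_FORMATS[kind]
    (value,) = struct.unpack_from(fmt, code, offset)
    return value, offset + struct.calcsize(fmt)

# LDC_U16 0x1234: the opcode byte is at offset 0, the operand starts at 1.
value, next_offset = read_operand(bytes([0x03, 0x34, 0x12]), 1, "u16")
print(hex(value), next_offset)  # 0x1234 3
```

The `var` type is left out because its length encoding is not described in this section.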

Constant Pool

Complex values are stored in the constant pool and referenced by index:

  • Strings - UTF-8 encoded
  • Numbers - IEEE 754 format
  • Lists and Objects - Serialized representation
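
As a rough illustration of the first two entry kinds, the sketch below encodes a string and a number in Python. The number width and byte order are assumptions (64-bit binary64, little-endian); this section only says "IEEE 754 format". The helper names are hypothetical, and the serialized form of lists and objects is not covered here.

```python
import struct

def encode_string(text: str) -> bytes:
    """Strings are stored as UTF-8 bytes."""
    return text.encode("utf-8")

def encode_number(value: float) -> bytes:
    """Assumption: IEEE 754 binary64, little-endian."""
    return struct.pack("<d", value)

# A toy pool: index 0 is a string, index 1 is a number.
pool = [encode_string("hello"), encode_number(3.14159)]

# Byte length and character count differ for non-ASCII text.
print(len("héllo"), len(encode_string("héllo")))  # 5 6
```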

Complete Opcode Table

Hex   Dec  Mnemonic    Stack Effect      Description
0x00  0    NOP         -                 No operation
0x01  1    LDC_U8      → u8              Load unsigned 8-bit constant
0x02  2    LDC_I8      → i8              Load signed 8-bit constant
0x03  3    LDC_U16     → u16             Load unsigned 16-bit constant
0x04  4    LDC_I16     → i16             Load signed 16-bit constant
0x05  5    LDC_POOL    → value           Load from constant pool
0x06  6    LDC_TRUE    → tru             Load boolean true
0x07  7    LDC_FALSE   → fls             Load boolean false
0x08  8    LDC_NIL     → nil             Load nil value
0x09  9    GLO_LOAD    → value           Load global variable
0x0A  10   GLO_STORE   value →           Store global variable
0x0B  11   LOC_LOAD    → value           Load local variable
0x0C  12   LOC_STORE   value →           Store local variable
0x0D  13   DUP         x → x x           Duplicate top of stack
0x0E  14   DROP        x →               Drop top of stack
0x0F  15   SWAP        x y → y x         Swap top two stack items
0x10  16   ADD         x y → x+y         Addition
0x11  17   SUB         x y → x-y         Subtraction
0x12  18   MUL         x y → x*y         Multiplication
0x13  19   DIV         x y → x/y         Division
0x14  20   MOD         x y → x%y         Modulo
0x15  21   CALL        ... → result      Call function (address in operand)
0x16  22   CALLV       ... → result      Call variable function
0x17  23   RET         value →           Return from function
0x18  24   JMP         -                 Unconditional jump
0x19  25   JMP_IF      cond →            Conditional jump (if truthy)
0x1A  26   JMP_IF_NOT  cond →            Conditional jump (if falsy)
0x1B  27   EQ          x y → bool        Equality comparison
0x1C  28   LT          x y → bool        Less than comparison
0x1D  29   LE          x y → bool        Less than or equal comparison
0x1E  30   GT          x y → bool        Greater than comparison
0x1F  31   GE          x y → bool        Greater than or equal comparison
0x20  32   NEQ         x y → bool        Not equal comparison
0x21  33   AND         x y → bool        Logical AND
0x22  34   OR          x y → bool        Logical OR
0x23  35   NOT         x → !x            Logical NOT
0x24  36   NEG         x → -x            Negation
0x25  37   STR_AT      str i → char      String character access
0x26  38   STR_SUB     str i j → sub     String substring
0x27  39   STR_LEN     str → len         String length
0x28  40   STR_CAT     str1 str2 → str3  String concatenation
0x29  41   FILE_READ   path → content    Read file (syscall)
0x2A  42   FILE_WRITE  path content →    Write file (syscall)
0x2B  43   LIST_NEW    → list            Create new list
0x2C  44   LIST_GET    list i → value    List element access
0x2D  45   LIST_SET    list i value →    List element assignment
0x2E  46   LIST_LEN    list → len        List length
0x2F  47   OBJ_NEW     → obj             Create new object
0x30  48   OBJ_GET     obj key → value   Object property access
0x31  49   OBJ_SET     obj key value →   Object property assignment
0x32  50   TYP         value → type      Get type of value
0x33  51   INT         value → int       Convert to integer
0x34  52   STR         value → str       Convert to string
0x35  53   PRINT       value →           Print value to stdout
0x36  54   HALT        -                 Stop execution
0x37  55   SWP         x y z → y x z     Swap with third item
0x38  56   OVR         x y → x y x       Over (copy second item to top)
0x39  57   ROT         x y z → y z x     Rotate top three items
0x3A  58   NIP         x y → y           Nip (remove second item)
0x3B  59   DP2         x y → x y x y     Duplicate pair
0x3C  60   TCK         x y → y x y       Tuck (copy top under second item)
0x3D  61   PCK         ... → item        Pick from stack depth
0x3E  62   LD0         → 0               Load zero constant
0x3F  63   LD1         → 1               Load one constant
0x40  64   INC         x → x+1           Increment
0x41  65   DEC         x → x-1           Decrement
0x42  66   BPT         -                 Breakpoint (debug)
0x43  67   ASR         cond msg →        Assertion (error if false)
0x44  68   DR2         x y → y x y x     Duplicate and rotate
0x45  69   CHN_NEW     cap → chan        Create new channel
0x46  70   CHN_SEND    chan val →        Send to channel
0x47  71   CHN_RECV    chan → val        Receive from channel
0x48  72   CHN_CLOSE   chan →            Close channel
0x49  73   SPAWN       argc → task       Spawn new task
0x4A  74   WAIT        task → result     Wait for task completion
0x4B  75   YIELD       -                 Yield to scheduler
0x4C  76   PANIC       msg →             Panic with message
0x4D  77   DEBUG       value →           Debug output
0x4E  78   GARBAGE     -                 Force garbage collection
0x4F  79   PROF_BEGIN  name →            Begin profiling block
0x50  80   PROF_END    name →            End profiling block
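
To show how the table is read in practice, here is a minimal disassembler sketch in Python for a handful of opcodes. It is an illustration, not the decoder in mcde.ilu: the `OPCODES` mapping and `disassemble` are hypothetical, only a subset of the table is included, and the operand widths are taken from the encoding examples in this document.

```python
import struct

# opcode -> (mnemonic, struct format of the operand, or None if no operand).
# Subset only; operand widths are assumptions based on this document's examples.
OPCODES = {
    0x00: ("NOP", None),
    0x01: ("LDC_U8", "<B"),
    0x02: ("LDC_I8", "<b"),
    0x03: ("LDC_U16", "<H"),
    0x04: ("LDC_I16", "<h"),
    0x0D: ("DUP", None),
    0x10: ("ADD", None),
    0x35: ("PRINT", None),
    0x36: ("HALT", None),
}

def disassemble(code: bytes):
    """Yield (offset, mnemonic, operand) for each instruction in code."""
    pc = 0
    while pc < len(code):
        mnemonic, fmt = OPCODES[code[pc]]
        start, pc = pc, pc + 1
        operand = None
        if fmt is not None:
            (operand,) = struct.unpack_from(fmt, code, pc)
            pc += struct.calcsize(fmt)
        yield start, mnemonic, operand

# LDC_U8 2, LDC_U8 3, ADD, PRINT, HALT
program = bytes([0x01, 0x02, 0x01, 0x03, 0x10, 0x35, 0x36])
listing = list(disassemble(program))
for offset, mnemonic, operand in listing:
    print(offset, mnemonic, "" if operand is None else operand)
```

An opcode missing from the mapping raises `KeyError`, which is adequate for a sketch; a real tool would report the offending offset.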

Operand Encoding Details

Constant Loading

# Small integers (direct encoding)
0x01 0x2A      ; LDC_U8 42
0x02 0xFB      ; LDC_I8 -5 (two's complement)
0x03 0x34 0x12 ; LDC_U16 0x1234
0x04 0xCC 0xED ; LDC_I16 -4660 (0xEDCC, two's complement)

# Constant pool
0x05 0x00      ; LDC_POOL index 0 (string "hello")
0x05 0x01      ; LDC_POOL index 1 (number 3.14159)

# Built-in constants
0x06           ; LDC_TRUE
0x07           ; LDC_FALSE  
0x08           ; LDC_NIL
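
The direct integer forms can be produced mechanically. The sketch below picks the narrowest LDC_* encoding for a Python integer. The selection policy is an assumption made for illustration (a real compiler may choose differently), and `encode_int_constant` is a hypothetical helper. Values outside the 16-bit range are assumed to need a constant pool entry.

```python
import struct

def encode_int_constant(value: int) -> bytes:
    """Encode an integer using the narrowest direct LDC_* instruction."""
    if 0 <= value <= 0xFF:
        return bytes([0x01]) + struct.pack("<B", value)  # LDC_U8
    if -0x80 <= value < 0:
        return bytes([0x02]) + struct.pack("<b", value)  # LDC_I8
    if 0 <= value <= 0xFFFF:
        return bytes([0x03]) + struct.pack("<H", value)  # LDC_U16
    if -0x8000 <= value < 0:
        return bytes([0x04]) + struct.pack("<h", value)  # LDC_I16
    # Assumption: anything wider goes through LDC_POOL instead.
    raise ValueError("value does not fit a direct LDC_* form")

print(encode_int_constant(-5).hex(" "))     # 02 fb
print(encode_int_constant(-4660).hex(" "))  # 04 cc ed
```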

Variable Access

# Global variables
0x09 0x00      ; GLO_LOAD global index 0
0x0A 0x01      ; GLO_STORE global index 1

# Local variables (function frame)
0x0B 0x00      ; LOC_LOAD local index 0
0x0C 0x01      ; LOC_STORE local index 1

Control Flow

# Jumps (relative offsets)
0x18 0x0A 0x00 ; JMP +10 bytes
0x19 0x05 0x00 ; JMP_IF +5 bytes (if truthy)
0x1A 0x08 0x00 ; JMP_IF_NOT +8 bytes (if falsy)

# Function calls
0x15 0x50 0x00 ; CALL to address 0x0050
0x16 0x03      ; CALLV function at constant pool index 3
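
Resolving a relative jump needs two details this section does not state: whether the offset is signed, and which address it is relative to. The sketch below assumes a signed 16-bit little-endian offset measured from the byte after the 3-byte jump instruction. Both are assumptions to check against mcde.ilu, and `jump_target` is a hypothetical helper.

```python
import struct

def jump_target(code: bytes, pc: int) -> int:
    """Resolve the target of the JMP/JMP_IF/JMP_IF_NOT instruction at pc."""
    (offset,) = struct.unpack_from("<h", code, pc + 1)  # assumed signed i16
    return pc + 3 + offset  # assumed base: the next instruction

forward = bytes([0x18, 0x0A, 0x00])   # JMP +10
backward = bytes([0x18, 0xFD, 0xFF])  # JMP -3 (jumps back to itself)
print(jump_target(forward, 0), jump_target(backward, 0))  # 13 0
```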

Stack Operations

Basic Stack Manipulation

# Before: [a, b, c]
0x0D           ; DUP
# After:  [a, b, c, c]

# Before: [a, b, c]  
0x0E           ; DROP
# After:  [a, b]

# Before: [a, b, c]
0x0F           ; SWAP  
# After:  [a, c, b]

Advanced Stack Operations

# Before: [a, b, c]
0x37           ; SWP
# After:  [b, a, c]

# Before: [a, b, c]
0x38           ; OVR
# After:  [a, b, c, b]

# Before: [a, b, c]
0x39           ; ROT
# After:  [b, c, a]

# Before: [a, b, c]
0x3A           ; NIP
# After:  [a, c]
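
Several of the Before/After listings above can be checked with a Python list whose last element is the top of the stack. This is a sketch of the stack effects only, not the VM's implementation, and the function names are hypothetical.

```python
def dup(stack):
    stack.append(stack[-1])                      # [a, b, c] -> [a, b, c, c]

def drop(stack):
    stack.pop()                                  # [a, b, c] -> [a, b]

def swap(stack):
    stack[-1], stack[-2] = stack[-2], stack[-1]  # [a, b, c] -> [a, c, b]

def swp(stack):
    stack[-3], stack[-2] = stack[-2], stack[-3]  # [a, b, c] -> [b, a, c]

def rot(stack):
    stack.append(stack.pop(-3))                  # [a, b, c] -> [b, c, a]

def nip(stack):
    del stack[-2]                                # [a, b, c] -> [a, c]

def apply(op, items):
    """Run one operation on a copy of items and return the resulting stack."""
    stack = list(items)
    op(stack)
    return stack

print(apply(rot, ["a", "b", "c"]))  # ['b', 'c', 'a']
```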

Pick Operations

# Pick from stack depth (PCK)
# Stack: [a, b, c, d, e]
0x3D 0x02      ; PCK depth 2 (0 = top) → copies c to the top
# Stack: [a, b, c, d, e, c]

# Tuck (TCK)
# Stack: [a, b, c]
0x3C           ; TCK
# Stack: [a, c, b, c]
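
The PCK listing above implies that depth is counted from the top of the stack starting at 0, since depth 2 on [a, b, c, d, e] selects c. A small Python sketch of that indexing follows. `pick` is a hypothetical helper, and the bounds check is an assumption about how an out-of-range depth should be handled.

```python
def pick(stack: list, depth: int) -> None:
    """Copy the item `depth` slots below the top onto the top (0 = top)."""
    if not 0 <= depth < len(stack):
        raise IndexError("pick depth exceeds stack height")
    stack.append(stack[-1 - depth])

stack = ["a", "b", "c", "d", "e"]
pick(stack, 2)
print(stack)  # ['a', 'b', 'c', 'd', 'e', 'c']
```

With this convention, depth 0 behaves like DUP.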

Control Flow Instructions

Function Calls

# Direct call (fixed address)
0x15 addr_lo addr_hi  ; CALL to absolute address

# Variable call (indirect)
0x16 const_idx       ; CALLV function at constant pool index

# Return
0x17                 ; RET (return value on stack)

Conditional Jumps

# If statement pattern
; ... condition evaluation ...
0x19 offset_lo offset_hi  ; JMP_IF true_branch
; ... false branch code ...
0x18 end_offset_lo end_offset_hi  ; JMP to end
true_branch:
; ... true branch code ...
end:
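
The two offsets in this pattern can be computed once the lengths of the branches are known. The Python sketch below lays out the same sequence. It assumes signed 16-bit offsets measured from the end of each 3-byte jump, which this document does not confirm, and `rel16` and `emit_if_else` are hypothetical helpers.

```python
import struct

JMP, JMP_IF = 0x18, 0x19

def rel16(from_pc: int, to_pc: int) -> bytes:
    """Signed 16-bit offset from the end of a 3-byte jump at from_pc."""
    return struct.pack("<h", to_pc - (from_pc + 3))

def emit_if_else(cond: bytes, then_code: bytes, else_code: bytes) -> bytes:
    """Lay out: cond, JMP_IF true_branch, else_code, JMP end, then_code."""
    branch_pc = len(cond)                     # address of JMP_IF
    skip_pc = branch_pc + 3 + len(else_code)  # address of JMP to end
    true_pc = skip_pc + 3                     # start of the true branch
    end_pc = true_pc + len(then_code)         # first byte after the pattern
    return (cond
            + bytes([JMP_IF]) + rel16(branch_pc, true_pc)
            + else_code
            + bytes([JMP]) + rel16(skip_pc, end_pc)
            + then_code)

# if tru then LD1 else LD0
code = emit_if_else(bytes([0x06]), bytes([0x3F]), bytes([0x3E]))
print(code.hex(" "))  # 06 19 04 00 3e 18 01 00 3f
```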

Loops

# While loop pattern
loop_start:
; ... condition evaluation ...
0x1A end_offset_lo end_offset_hi  ; JMP_IF_NOT end
; ... loop body code ...
0x18 start_offset_lo start_offset_hi  ; JMP to start
end:
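
The closing JMP of a loop points backwards, so its offset is negative. The Python sketch below assembles the while pattern, under the assumption that offsets are signed 16-bit values measured from the end of the 3-byte jump. `rel16` and `emit_while` are hypothetical helpers.

```python
import struct

JMP, JMP_IF_NOT = 0x18, 0x1A

def rel16(from_pc: int, to_pc: int) -> bytes:
    """Signed 16-bit offset from the end of a 3-byte jump at from_pc."""
    return struct.pack("<h", to_pc - (from_pc + 3))

def emit_while(cond: bytes, body: bytes) -> bytes:
    """Lay out: cond, JMP_IF_NOT end, body, JMP loop_start."""
    exit_pc = len(cond)                # address of JMP_IF_NOT
    back_pc = exit_pc + 3 + len(body)  # address of the closing JMP
    end_pc = back_pc + 3               # first byte after the loop
    return (cond
            + bytes([JMP_IF_NOT]) + rel16(exit_pc, end_pc)
            + body
            + bytes([JMP]) + rel16(back_pc, 0))

# while tru: NOP
code = emit_while(bytes([0x06]), bytes([0x00]))
print(code.hex(" "))  # 06 1a 04 00 00 18 f8 ff
```

Because every offset is relative, the emitted bytes can be placed at any address without patching.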

Object and Collection Operations

List Operations

# Create list
0x2B                 ; LIST_NEW (empty list)
; ... push elements ...
0x2D 0 0             ; LIST_SET index 0
0x2D 1 1             ; LIST_SET index 1

# Access list
0x2C index_lo index_hi  ; LIST_GET

# List length
0x2E                 ; LIST_LEN

Object Operations

# Create object
0x2F                 ; OBJ_NEW (empty object)

# Set property
0x05 key_idx         ; LDC_POOL (push key)
; ... push value ...
0x31                 ; OBJ_SET

# Get property  
0x30 key_idx         ; OBJ_GET

Type Operations

# Get type
0x32                 ; TYP (returns type string)

# Type conversion
0x33                 ; INT (convert to integer)
0x34                 ; STR (convert to string)

Version History

v6 (Current)

Added concurrency and debugging support:

  • Channel operations (0x45-0x48)
  • Task spawning and waiting (0x49-0x4A)
  • Yield instruction (0x4B)
  • Enhanced debugging (0x4C-0x4D)
  • Profiling support (0x4F-0x50)

v4

Added advanced stack operations:

  • DR2 - Duplicate and rotate (0x44)
  • TCK - Tuck operation (0x3C)
  • PCK - Pick from depth (0x3D)

v3

Added comparison and arithmetic extensions:

  • Extended comparisons (LE, GT, GE, NEQ)
  • Logical operations (AND, OR, NOT)
  • Negation (NEG)
  • Convenience constants (LD0, LD1)
  • Increment/Decrement (INC, DEC)
  • Assertion (ASR)

v2

Added stack manipulation primitives:

  • SWP - Swap with third
  • OVR - Over (duplicate second)
  • ROT - Rotate three
  • NIP - Nip (remove second)
  • MOD - Modulo operation

v1

Base instruction set with:

  • Stack operations (DUP, DROP, SWAP)
  • Arithmetic (ADD, SUB, MUL, DIV)
  • Basic control flow (JMP, CALL, RET)
  • Object and list operations
  • String operations

Implementation Notes

Reference Implementation

The reference bytecode interpreter is in lib/backend/ilusm_vm.ilu. If VM behavior differs from this specification, treat the VM as wrong and fix it to match the specification.

Validation

Bytecode validation is performed by mcde_prog_ok in lib/backend/mcde.ilu. Validation includes:

  • Byte range checking
  • Operand validation
  • Jump target verification
  • Constant pool bounds checking
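
The checks listed above can be sketched in Python as a single pass over the code followed by a jump-target pass. This is not mcde_prog_ok: `validate` is a hypothetical function, the opcode subset and operand widths are assumptions taken from the examples in this document, and jump offsets are assumed to be signed 16-bit values relative to the next instruction.

```python
import struct

# opcode -> operand width in bytes (subset; widths are assumptions).
OPERAND_WIDTH = {0x00: 0, 0x01: 1, 0x05: 1, 0x10: 0,
                 0x18: 2, 0x19: 2, 0x1A: 2, 0x36: 0}
JUMPS = {0x18, 0x19, 0x1A}
LDC_POOL = 0x05

def validate(code: bytes, pool_size: int) -> list:
    """Return a list of problems; an empty list means every check passed."""
    errors, starts, jumps = [], set(), []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op not in OPERAND_WIDTH:              # byte range check
            errors.append(f"{pc}: unknown opcode 0x{op:02X}")
            break
        width = OPERAND_WIDTH[op]
        if pc + 1 + width > len(code):           # operand validation
            errors.append(f"{pc}: truncated operand")
            break
        starts.add(pc)
        if op == LDC_POOL and code[pc + 1] >= pool_size:
            errors.append(f"{pc}: constant pool index out of range")
        if op in JUMPS:
            (offset,) = struct.unpack_from("<h", code, pc + 1)
            jumps.append((pc, pc + 3 + offset))
        pc += 1 + width
    for pc, target in jumps:
        # Assumption: a jump may land on an instruction start or exactly at the end.
        if target not in starts and target != len(code):
            errors.append(f"{pc}: jump target {target} is not an instruction start")
    return errors

# LDC_U8 1, JMP_IF +1, NOP, HALT
print(validate(bytes([0x01, 0x01, 0x19, 0x01, 0x00, 0x00, 0x36]), 0))  # []
```

Collecting instruction starts first lets the second pass reject jumps that land inside an operand.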

Debugging Support

Debug instructions for development:

  • BPT - Breakpoint for debugger
  • DEBUG - Debug output
  • PANIC - Unrecoverable error
  • ASR - Assertion checking