ilusm.dev

ilusm Language Specification

Version 1.0 | April 2026

Introduction

ilusm is a self-hosting programming language whose surface names (keywords, stdlib names, builtins) follow a 5-character convention. Its large standard library ships as ilusm source under lib/stdlib/. The in-tree compiler targets bytecode; the default shipped binary is a minimal C bootstrap (ilusm-vm), while the full stack is developed under lib/, per the repository README.

This specification defines the syntax, semantics, and standard library interface for ilusm version 1.0.

For the implementation shape (lexer, parser, AST, tree interpreter, bytecode compiler, VM) mapped onto the usual compiler-course pipeline, see Compiler pipeline.

Notation

The syntax is specified using Extended Backus-Naur Form (EBNF):

  • | alternation
  • () grouping
  • [] option (0 or 1 times)
  • {} repetition (0 or more times)

Source Code Representation

Source code is Unicode text encoded in UTF-8. The text is not canonicalized.

Each code point is distinct; for instance, uppercase and lowercase letters are different characters.

Implementation restriction: For compatibility with other tools, a compiler may disallow the NUL character (U+0000) in the source text.

Lexical Elements

Comments

Comments provide program documentation and are ignored by the compiler. There is one form:

  • Line comments start with # and continue until the end of the line.

Tokens

Tokens form the vocabulary of the ilusm language. There are four classes: identifiers, keywords, operators, and literals.

Identifiers

An identifier is a sequence of letters and digits. The first character must be a letter. Identifiers are case-sensitive.

Surface names (keywords, stdlib modules, builtins) are limited to at most 5 characters by language convention. User identifiers may be longer, but typically follow the same convention for consistency.

Keywords

The following identifiers are reserved and cannot be used as variable names:

tru fls nil if whl brk cnt def fun use mat try err asr

These keywords form the core control flow and declaration constructs of the language.

Operators

+ - * / == != < <= > >= and or ! .. | <- = =>
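
The four token classes can be illustrated with a small scanner. The following is a minimal sketch in Python, not the ilusm lexer: the name tokenize, the regular expressions, the decimal-point form for floating-point literals, and the absence of string escapes are all assumptions made for illustration. Delimiters such as ":" and brackets are not assigned a token class by this section, so the sketch rejects them.

```python
import re

# Reserved words copied from the Keywords section.
KEYWORDS = {"tru", "fls", "nil", "if", "whl", "brk", "cnt", "def",
            "fun", "use", "mat", "try", "err", "asr"}

# Symbolic operators from the Operators section. Two-character spellings come
# first so that "<=" is not split into "<" and "=".
OPERATORS = ["==", "!=", "<=", ">=", "<-", "=>", "..",
             "+", "-", "*", "/", "<", ">", "!", "|", "="]

TOKEN_RE = re.compile(
    r'(?P<comment>#[^\n]*)'                                 # line comment
    r'|(?P<space>\s+)'
    r'|(?P<number>0x[0-9A-Fa-f]+|0b[01]+|\d+\.\d+|\d+)'     # float form is assumed
    r'|(?P<string>"[^"\n]*")'                               # no escapes (assumed)
    r'|(?P<word>[^\W\d_][^\W_]*)'                           # letter, then letters/digits
    r'|(?P<op>' + "|".join(re.escape(op) for op in OPERATORS) + r')'
)


def tokenize(source):
    """Return (class, text) pairs; comments and whitespace are dropped."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind in ("comment", "space"):
            continue
        if kind == "word":
            if text in ("and", "or"):       # word-shaped operators
                kind = "operator"
            elif text in KEYWORDS:
                kind = "keyword"
            else:
                kind = "identifier"
        elif kind == "op":
            kind = "operator"
        else:                               # number or string
            kind = "literal"
        tokens.append((kind, text))
    return tokens
```

For example, tokenize("x = 42 # answer") yields an identifier, an operator, and a literal; the comment is discarded.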

Values

Booleans

A boolean value is either tru or fls.

Nil

The value nil represents the absence of a value, similar to null in other languages.

Numbers

ilusm supports integers and floating-point numbers. Integers can be written in decimal, hexadecimal (0x prefix), or binary (0b prefix).
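
The three integer notations can be illustrated as follows. This is a sketch in Python under stated assumptions, not the ilusm implementation: parse_number is a hypothetical helper, only lowercase prefixes are accepted, and floating-point literals are assumed to use a decimal point.

```python
def parse_number(text):
    """Convert the text of a numeric literal to a Python number."""
    if text.startswith("0x"):
        return int(text[2:], 16)    # hexadecimal, e.g. 0x1F
    if text.startswith("0b"):
        return int(text[2:], 2)     # binary, e.g. 0b101
    if "." in text:
        return float(text)          # floating-point (assumed decimal-point form)
    return int(text, 10)            # decimal
```

Under these assumptions, 0x1F, 0b11111, and 31 all denote the same integer.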

Strings

A string is a sequence of bytes. Strings are immutable. String literals are enclosed in double quotes.

Interpolated strings start with $: $"hello {name}"
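
One way to read the interpolation rule is as a substitution over the literal's text. The sketch below is written in Python and rests on assumptions this section does not settle: each pair of braces holds a single identifier (not an arbitrary expression), there is no syntax for escaping a brace, and values are converted with Python's str. The helper name interpolate is hypothetical.

```python
import re

# An identifier in braces: a letter followed by letters and digits.
PLACEHOLDER_RE = re.compile(r"\{([^\W\d_][^\W_]*)\}")


def interpolate(template, variables):
    """Replace each {name} in template with the string form of variables[name]."""
    def substitute(match):
        return str(variables[match.group(1)])
    return PLACEHOLDER_RE.sub(substitute, template)
```

With name bound to "Alice", the body of $"hello {name}" would expand to hello Alice under this reading.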

Lists

A list is a sequence of values. List literals are enclosed in brackets: [1, 2, 3]

Objects

An object is a collection of key-value pairs. Object literals use braces: {name: "Alice", age: 30}

Variables

A variable is a storage location for holding a value. Variables are created by assignment:

x = 42

Variables declared with def are immutable:

def pi = 3.14159
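
The difference between the two binding forms can be modeled with a toy variable store. This is a Python sketch, not the ilusm runtime: the class Scope is hypothetical, and raising an error on reassignment is an assumption, since this section says only that such variables are immutable and not how or when that is enforced.

```python
class Scope:
    """Toy variable store: plain assignment rebinds, def binds once."""

    def __init__(self):
        self.values = {}
        self.frozen = set()     # names introduced with def

    def assign(self, name, value):
        # Models `x = 42`: creates the variable or rebinds it.
        if name in self.frozen:
            raise RuntimeError(f"cannot reassign immutable binding {name!r}")
        self.values[name] = value

    def define(self, name, value):
        # Models `def pi = 3.14159`: binds the name, then freezes it.
        self.assign(name, value)
        self.frozen.add(name)
```

In this model, assigning to x twice succeeds, while assigning to pi after define raises an error.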

Types

ilusm is dynamically typed with runtime type checking. Values have intrinsic types, while variables are untyped storage locations. The built-in types are:

  • nil - the nil type (single value: nil)
  • bol - boolean type (values: tru, fls)
  • num - number type (integers and floating-point)
  • str - immutable UTF-8 string type
  • lst - dynamic array type with O(1) indexing
  • obj - map type with string keys
  • fun - user function/closure type
  • bif - builtin function type

Type conversion is performed via built-in functions: int(), str(), typ().

Expressions

Primary Expressions

Primary = identifier | literal | "(" Expression ")" | ListLiteral | ObjectLiteral | Lambda

Operators

Operator precedence (highest to lowest):

  1. Postfix: ?. ?[] (optional chaining)
  2. Prefix: ! -
  3. Multiplicative: * /
  4. Additive: + -
  5. Range: ..
  6. Comparison: == != < <= > >=
  7. Logical AND: and
  8. Logical OR: or
  9. Pipeline: |
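
The precedence table can be exercised with a small precedence-climbing parser. The sketch below is in Python and is not the ilusm parser: it takes a pre-split list of token strings, produces nested tuples, and omits the postfix level. Associativity is not stated in this section, so the sketch assumes every binary operator is left-associative. All function names are hypothetical.

```python
# Binary levels from the table, loosest first; a larger number binds tighter.
BINARY_LEVELS = [
    ["|"],                                  # 9. pipeline
    ["or"],                                 # 8. logical OR
    ["and"],                                # 7. logical AND
    ["==", "!=", "<", "<=", ">", ">="],     # 6. comparison
    [".."],                                 # 5. range
    ["+", "-"],                             # 4. additive
    ["*", "/"],                             # 3. multiplicative
]
PRECEDENCE = {op: level
              for level, ops in enumerate(BINARY_LEVELS, start=1)
              for op in ops}
PREFIX = {"!", "-"}                         # 2. prefix operators


def parse(tokens):
    """Parse a list of token strings into nested (operator, ...) tuples."""
    tree, rest = parse_binary(list(tokens), 1)
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree


def parse_binary(tokens, min_level):
    left, tokens = parse_prefix(tokens)
    while tokens and PRECEDENCE.get(tokens[0], 0) >= min_level:
        op = tokens[0]
        # Assumed left-associativity: the right operand may only contain
        # operators that bind tighter than op.
        right, tokens = parse_binary(tokens[1:], PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, tokens


def parse_prefix(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of expression")
    head, rest = tokens[0], tokens[1:]
    if head in PREFIX:
        operand, rest = parse_prefix(rest)
        return (head, operand), rest
    if head == "(":
        inner, rest = parse_binary(rest, 1)
        if not rest or rest[0] != ")":
            raise SyntaxError("expected ')'")
        return inner, rest[1:]
    if head in PRECEDENCE or head == ")":
        raise SyntaxError(f"expected an operand, found {head!r}")
    return head, rest
```

For example, parse("1 + 2 * 3".split()) groups the multiplication first, and parse("a or b and c".split()) groups the and first, matching the ordering in the table.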

Lambda Expressions

A lambda creates an anonymous function:

Lambda = "\\" [ Parameters ] Expression

Examples:

  • \(x) x * 2
  • \(a, b) a + b
  • \() 42

Statements

If Statement

IfStmt = "if" Expression ":" Statement [ "|" Statement ]

While Statement

WhileStmt = "whl" Expression ":" Statement

For-Each Statement

ForStmt = identifier "<-" Expression ":" Statement

Match Statement

MatchStmt = "mat" Expression ":" { Pattern "=>" Expression "|" }

Functions

Functions are first-class values. A function declaration creates a variable and assigns a function value to it.

Function Declaration

Fnc = [ "fun" ] identifier Parameters [ "=" | "=\\n" Block ]
Parameters = "(" [ identifier { "," identifier } [ ",..." identifier ] ] ")"
Block = { Statement }

Rest Parameters

sum(...nums) =
  s = 0
  n <- nums: s = s + n
  s
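
For readers more familiar with other languages, the sum example corresponds roughly to the following Python. This is a comparison sketch only: it assumes that a rest parameter collects all remaining arguments into a list-like value and that the final expression s is the function's result. The name total is used to avoid shadowing Python's built-in sum.

```python
def total(*nums):
    # *nums plays the role of ...nums: it collects the remaining arguments.
    s = 0
    for n in nums:      # corresponds to `n <- nums:`
        s = s + n
    return s            # assumed meaning of the trailing expression `s`
```

Called with no arguments, this version returns 0, since the loop body never runs.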

Modules

A module is a collection of declarations in a single file. Programs are constructed from one or more modules.

Import Declaration

Import = "use" identifier | "use" string

Examples:

  • use trl - import standard library module
  • use "lib.ilu" - import local file
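
The two alternatives of the Import production can be told apart by whether the operand is an identifier or a string. The following Python sketch shows that distinction only; it does not resolve paths or load anything. The helper classify_import and its return labels are hypothetical, and tolerating a trailing # comment is an assumption.

```python
import re

# "use" identifier | "use" string, optionally followed by a line comment.
IMPORT_RE = re.compile(
    r'^use\s+(?:(?P<module>[^\W\d_][^\W_]*)|"(?P<path>[^"\n]*)")\s*(?:#.*)?$'
)


def classify_import(line):
    """Return ("module", name) for `use name` or ("file", path) for `use "path"`."""
    match = IMPORT_RE.match(line.strip())
    if match is None:
        raise SyntaxError(f"not an import declaration: {line!r}")
    if match.group("module") is not None:
        return ("module", match.group("module"))
    return ("file", match.group("path"))
```

Under this sketch, use trl is classified as a standard library module reference and use "lib.ilu" as a local file reference.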