ilusm.dev

Benchmarks

Cross-language microbenchmarks for Ilusm are driven by one Python script in the language repo. This page explains how they work and shows the latest documentation snapshot captured for the site; it is not a real-time cluster or CI feed.

Overview

The repository ships release/benchmarks/run_multi_lang_bench.py. It runs three tiny numeric workloads on Ilusm plus whatever scripting/compiled languages are on your PATH (Python, Perl, Node, Awk, Bash, C, Rust, C++, Go, POSIX sh, Tcl, Guile; Java when javac/java and the snippet are available).

Each cell is the median wall time in seconds over three full process runs. That measures startup and loop together: fair for comparing scripting runtimes, harsh on Ilusm when the driver uses the bootstrap seed path.
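The median-of-three process timing can be sketched in a few lines of Python. This is not the driver's actual code; the function name `median_wall_time` is illustrative, but the shape matches what the page describes: each sample times one full process launch, so startup cost is included in every cell.

```python
import statistics
import subprocess
import time

def median_wall_time(cmd, runs=3):
    """Time `runs` full process launches and return the median, in seconds.

    Each sample covers interpreter startup plus the workload body,
    mirroring how every cell in the tables below is measured.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Example: time a trivial Python one-liner three times.
print(f"{median_wall_time(['python3', '-c', 'pass']):.4f}")
```

The median (rather than the mean) makes a single slow outlier run, e.g. from a cold filesystem cache, much less likely to distort a cell.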

For execution model (ILBC VM vs tree-walk, memory, optional JIT syscalls), read Performance. For opcode-level detail, Bytecode ISA.

Not claimed here: fixed “Ilusm vs Python” speedups for arbitrary apps, generational GC pause budgets, or a guaranteed JIT; the driver only times micro-loops under a specific runner.

Live results

What "live" means on this page: results are updated periodically by re-running the driver and refreshing this HTML. There is no automated benchmark service on ilusm.dev.

Capture below:

  • Host: Linux x86_64 (Steam Deck–class machine)
  • Source: development checkout
  • Ilusm runner: ./ilusm (ilusm-min bootstrap seed; ilusm-vm not on PATH)
  • Total driver wall: ~2m 47s

Reproduce exactly: export ILUSM_HOME="$(pwd)", ./build.sh, then python3 release/benchmarks/run_multi_lang_bench.py. Set ILUSM_BENCH_CMD='ilusm-vm run' when you want the Ilusm row to match a VM-shaped install.

Workload 1 - Integer sum (1 .. 1_000_000) → 500000500000

Language             Median (s)   Notes
Ilusm                3.1582       ./ilusm seed
Python 3.13          0.6076       python3 -c
Perl                 0.1113
JavaScript (Node)    0.1288       v23.4.0
Bash                 6.9140       arith for-loop
Awk                  0.2089
C (gcc -O2)          0.0067       compiled snippet
Rust (-O)            0.0036       fastest on this row
C++ (-O2)            0.0039
Go                   0.0107
POSIX sh             11.5725
Tcl                  1.2661
Guile                0.5700
Java                 -            skipped (tooling/snippet)

Relative to fastest (Rust 0.0036 s): Ilusm ~877×, Python ~169×, C ~1.9× (see the driver's “Relative to fastest” block for the full list).
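The “relative to fastest” factors are just each median divided by the smallest median on the row. A minimal recomputation from the table above (only a subset of rows shown; the driver prints the full list):

```python
# Medians (seconds) from the Workload 1 table above.
medians = {
    "Ilusm": 3.1582,
    "Python 3.13": 0.6076,
    "C (gcc -O2)": 0.0067,
    "Rust (-O)": 0.0036,
}

fastest = min(medians.values())  # Rust on this row
for lang, t in sorted(medians.items(), key=lambda kv: kv[1]):
    print(f"{lang:<12} {t / fastest:7.1f}x")
```

Running this reproduces the headline ratios quoted in the text (Ilusm ≈ 877×, Python ≈ 169×, C ≈ 1.9×, Rust = 1.0×).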

Workload 2 - Iterative Fibonacci F(40) → 102334155

Language             Median (s)   Notes
Ilusm                0.1773       ./ilusm seed
Python 3.13          0.0492
Perl                 0.0060
JavaScript (Node)    0.1095
Bash                 0.0133
Awk                  0.0055
C (gcc -O2)          0.0034
Rust (-O)            0.0046
C++ (-O2)            0.0031       fastest on this row
Go                   0.0106
POSIX sh             0.0084
Tcl                  0.0168
Guile                0.0483
Java                 -            skipped

Startup dominates; Ilusm is still ~57× the fastest (C++) on this tiny body. Python ~16×, Perl ~2×.

Workload 3 - Triple nested loops (mod 1_000_000_007) → 499749986

Language             Median (s)   Notes
Ilusm                5.2483       ./ilusm seed
Python 3.13          0.5183
Perl                 0.1757
JavaScript (Node)    0.2940
Bash                 6.7757
Awk                  0.4127
C (gcc -O2)          0.0058       fastest on this row
Rust (-O)            0.0130
C++ (-O2)            0.0169
Go                   0.0146
POSIX sh             13.1620
Tcl                  1.3892
Guile                0.8584
Java                 -            skipped

Ilusm is ~905× vs C here; Python ~89×. It is still an interpreter-vs-compiled-loop story on this host.

Methodology

Summary:

  • Shipping vs timing - Ilusm is defined by lib/**/*.ilu, ilusm.ilbc, and a prebuilt seed. Cross-language tables are only fair as “Ilusm” when you document the same runner you ship (set ILUSM_BENCH_CMD).
  • Resolution order - ILUSM_BENCH_CMD / ILUSM_CMD if set; else ilusm-vm run on PATH; else ./ilusm (bootstrap).
  • Timing - Median of 3 runs per language per workload; Java gets 2 JVM warmup runs before the timed trio.
  • Correctness - Each snippet prints a single expected value; the driver checks it.
  • Before publishing - Run ./build.sh, record OS, CPU model, and the exact runner line the script printed.
export ILUSM_HOME="$(pwd)"
./build.sh
python3 release/benchmarks/run_multi_lang_bench.py 2>&1 | tee ilusm-bench-$(date -u +%Y%m%dT%H%MZ).txt
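The resolution order from the summary above can be sketched as follows. This is not the driver's actual code; the function name `resolve_ilusm_runner` is illustrative, but the precedence (env overrides, then ilusm-vm on PATH, then the bootstrap seed) is the one documented here.

```python
import os
import shutil

def resolve_ilusm_runner():
    """Pick the Ilusm command per the documented resolution order:
    ILUSM_BENCH_CMD / ILUSM_CMD if set, else ilusm-vm on PATH,
    else the ./ilusm bootstrap seed."""
    for var in ("ILUSM_BENCH_CMD", "ILUSM_CMD"):
        override = os.environ.get(var)
        if override:
            return override.split()
    if shutil.which("ilusm-vm"):
        return ["ilusm-vm", "run"]
    return ["./ilusm"]  # bootstrap seed fallback

print(resolve_ilusm_runner())
```

Because the env override wins unconditionally, setting ILUSM_BENCH_CMD='ilusm-vm run' before publishing is enough to make the Ilusm row match a VM-shaped install even on a checkout where only the seed is built.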

Language comparison

These microbenches are not representative of I/O-heavy or stdlib-rich programs. They stress tight loops and process startup.

  • Workload 1 - One million iterations; compiled languages win by orders of magnitude; Ilusm on the seed path is in the same ballpark as “slow shell” tiers but faster than POSIX sh on this capture.
  • Workload 2 - Only 39 iterations; startup and interpreter overhead dominate. Differences between Ilusm, Python, and Node mostly reflect launch cost, not loop quality.
  • Workload 3 - ~1M inner-body iterations with modulo; separates compiled throughput from interpreter overhead. Bash and POSIX sh remain very slow; Ilusm pays the seed + bytecode/eval path cost.
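For reference, Workloads 1 and 2 are simple enough to restate in Python; the shipped snippets live in release/benchmarks/snippets/, and the loop shapes below are assumptions that merely reproduce the expected values the driver checks. Workload 3's exact loop bounds and body are not reproduced here.

```python
def integer_sum(n=1_000_000):
    """Workload 1 shape: sum 1..n with a plain loop (no closed form)."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def fib_iter(n=40):
    """Workload 2 shape: iterative Fibonacci, F(0) = 0, F(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(integer_sum())  # 500000500000
print(fib_iter())     # 102334155
```

Each snippet prints exactly one value, which is what lets the driver verify correctness before recording a timing.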

Use the suite for regression tracking (same machine, same ILUSM_BENCH_CMD, same commit workflow) and for rough order-of-magnitude awareness, not for marketing a single “Ilusm vs Python” factor.

Benchmark tests

Multi-language driver - release/benchmarks/run_multi_lang_bench.py (sources in release/benchmarks/snippets/).

Test-ladder timing - Per-file wall times for correctness suites and stdlib import/target passes:

./scripts/run_all_tests_timed.sh

Stdlib bench module - In-language timers, warmup, suites: lib/stdlib/bench.ilu (uses tim, logs via obs). Import in your own .ilu when building custom microbenches.

There is no ./bench.sh or tests/bench.ilu in the current tree; older site copy that referenced them was removed.

Verification

Full correctness ladder:

./scripts/run_all_tests.sh

Expect === All tiers completed OK ===.

CI - This site does not run benchmarks on every deploy.