ilusm.dev

cgen

C code generation from an ilusm AST - emits standalone C source that compiles with gcc or clang, including a tagged-union Value type, lists, objects, constructors, name mangling, binary operators, if/while/function statements, forward declarations, and a main entry point.

Load with: use cgen

What this module does

cgen is the C backend of the ilusm compiler. Given an AST produced by the parser, it emits a complete, self-contained .c file that requires no external ilusm runtime - only the C standard library.

The emitted C defines a Value tagged union with eight types (nil, num, str, bool, list, obj, fun, builtin), heap-allocated list and object structures with realloc-based growth, and helper constructors (mk_num, mk_str, mk_bool, mk_list, mk_obj). ilusm identifiers are mangled to legal C names with an ilu_ prefix.

Quick example

use cgen
use prs

# Parse ilusm source to AST
src = fs.rd("myprogram.ilu")
ast = prs.prs_src_at(src, "myprogram.ilu")

# Emit C source
c_src = cgen.prog(ast.pr)

# Write to file
cgen.write("myprogram.c", ast.pr)

# Now compile externally:
# gcc -O2 -o myprogram myprogram.c

Functions

Runtime preamble

cgen.header()

Returns the C runtime preamble as a string (~80 lines). Includes:

  • Standard headers: stdio.h, stdlib.h, string.h, stdint.h
  • ValType enum: V_NIL, V_NUM, V_STR, V_BOOL, V_LIST, V_OBJ, V_FUN, V_BIF
  • Value tagged-union struct with a union of double num, char *str, int boolean, ListNode *list, ObjData *obj, BuiltinFn bif, and a {fn, env} closure pair
  • ListNode: items array with len/cap, growable via realloc
  • ObjData: key-value field array with len/cap
  • Constructors: mk_num, mk_str (strdup), mk_bool, mk_list(cap), mk_obj()
  • Helpers: list_push, obj_set, obj_get
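For orientation, here is a minimal sketch of the kind of runtime the bullets above describe. It is not the text returned by cgen.header(): the field names (tag, as, items, len, cap), the doubling growth policy, and the abort-on-allocation-failure behaviour are assumptions. Only the type, constructor, and helper names come from the list above. The object type and the closure and builtin members are omitted to keep it short, and mk_str copies with malloc/memcpy rather than strdup so the sketch stays within ISO C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Type tags, as listed above. */
typedef enum { V_NIL, V_NUM, V_STR, V_BOOL, V_LIST, V_OBJ, V_FUN, V_BIF } ValType;

typedef struct ListNode ListNode;

/* Tagged union: `tag` says which member of `as` is valid.
   Field names are assumptions, not cgen's actual layout. */
typedef struct Value {
    ValType tag;
    union {
        double    num;
        char     *str;
        int       boolean;
        ListNode *list;
    } as;
} Value;

struct ListNode {
    Value *items;
    size_t len, cap;
};

static Value mk_num(double n) { Value v; v.tag = V_NUM;  v.as.num = n; return v; }
static Value mk_bool(int b)   { Value v; v.tag = V_BOOL; v.as.boolean = (b != 0); return v; }

/* Copies the string so the Value owns its storage. */
static Value mk_str(const char *s) {
    Value v;
    size_t n = strlen(s) + 1;
    v.tag = V_STR;
    v.as.str = malloc(n);
    if (!v.as.str) abort();
    memcpy(v.as.str, s, n);
    return v;
}

/* Allocates an empty list with room for `cap` items (at least one). */
static Value mk_list(size_t cap) {
    Value v;
    ListNode *l = malloc(sizeof *l);
    if (!l) abort();
    l->len = 0;
    l->cap = cap ? cap : 1;
    l->items = malloc(l->cap * sizeof *l->items);
    if (!l->items) abort();
    v.tag = V_LIST;
    v.as.list = l;
    return v;
}

/* Appends an item, doubling the capacity with realloc when full. */
static void list_push(Value list, Value item) {
    assert(list.tag == V_LIST);
    ListNode *l = list.as.list;
    if (l->len == l->cap) {
        size_t ncap = l->cap * 2;
        Value *p = realloc(l->items, ncap * sizeof *p);
        if (!p) abort();
        l->items = p;
        l->cap = ncap;
    }
    l->items[l->len++] = item;
}
```

Because a Value holds only a pointer to its ListNode, passing the list by value to list_push still mutates the shared backing array.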

Name mangling

cgmangle(name)

Converts an ilusm identifier to a legal C name. Prepends ilu_, replaces . with _, and replaces - with __. For example: "trl.map" → "ilu_trl_map".
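The three rules can be sketched in C as follows. The function name mangle and its allocate-and-return interface are hypothetical stand-ins for illustration; cgmangle itself is an ilusm function and may be implemented differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated mangled name; the caller frees it.
   `mangle` is a hypothetical C stand-in for cgmangle. */
static char *mangle(const char *name) {
    assert(name != NULL);
    size_t n = strlen(name);
    /* Worst case: every character is '-' and becomes two characters. */
    char *out = malloc(4 + 2 * n + 1);
    if (!out) abort();
    char *p = out;
    memcpy(p, "ilu_", 4);            /* rule 1: prepend ilu_ */
    p += 4;
    for (size_t i = 0; i < n; i++) {
        if (name[i] == '.') {        /* rule 2: '.' becomes '_' */
            *p++ = '_';
        } else if (name[i] == '-') { /* rule 3: '-' becomes "__" */
            *p++ = '_';
            *p++ = '_';
        } else {
            *p++ = name[i];
        }
    }
    *p = '\0';
    return out;
}
```

Under these rules, distinct identifiers can map to the same C name: "a.b" and "a_b" both yield ilu_a_b in this sketch.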

Expression codegen

cgen.expr(node)

Recursively generates a C expression string from an AST node. Handles:

  • "nm" - number literal → mk_num(v)
  • "st" - string literal → mk_str("...") (with C escape sequences)
  • "id" - identifier → mangled C name
  • "bn" - binary operator: +, -, *, /, ==, !=, <, > → mk_num(...) or mk_bool(...)
  • "ca" - function call
  • "dt" - field access → obj_get(obj, "field")
  • "ls" - list literal → mk_list(n) + list_push calls in a GCC statement-expression
  • "ob" - object literal → mk_obj() + obj_set calls
  • "bl" - boolean literal → mk_bool(1) or mk_bool(0)

cgesc(s)

C-escapes a string: doubles backslashes, escapes double quotes and newlines.
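A C rendering of the same escaping, as a sketch. The name c_escape is hypothetical, and only the three cases named above are handled; tabs, carriage returns, and other control characters pass through unchanged here, which may or may not match cgesc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of `s`, safe to place between
   double quotes in C source. `c_escape` is a hypothetical name. */
static char *c_escape(const char *s) {
    assert(s != NULL);
    size_t n = strlen(s);
    /* Worst case: every character needs a backslash in front. */
    char *out = malloc(2 * n + 1);
    if (!out) abort();
    char *p = out;
    for (size_t i = 0; i < n; i++) {
        switch (s[i]) {
        case '\\': *p++ = '\\'; *p++ = '\\'; break; /* \  -> \\ */
        case '"':  *p++ = '\\'; *p++ = '"';  break; /* "  -> \" */
        case '\n': *p++ = '\\'; *p++ = 'n';  break; /* LF -> \n */
        default:   *p++ = s[i];
        }
    }
    *p = '\0';
    return out;
}
```

Backslashes are handled per character in the same pass, so a backslash introduced by escaping a quote or newline is never doubled a second time.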

cgbinop(op, left_c, right_c)

Generates a C binary-op expression from an ilusm operator and two already-generated C expression strings.
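The sketch below shows one way such a mapping could look in C. The exact text cgbinop emits is not documented here, so the output shape (operands accessed through a .num member, comparisons performed numerically) is an assumption, as are the names binop and BINOP_FMT. The split into mk_num for arithmetic, mk_bool for comparisons, and V_NIL_VAL for anything else follows the operator list on this page.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed output shape: ctor((left).num op (right).num) */
#define BINOP_FMT "%s((%s).num %s (%s).num)"

static char *dup_str(const char *s) {
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (!p) abort();
    memcpy(p, s, n);
    return p;
}

/* Builds a C expression string from an operator and two operand
   strings that are already valid C. Caller frees the result. */
static char *binop(const char *op, const char *left_c, const char *right_c) {
    static const char *const arith[] = { "+", "-", "*", "/" };
    static const char *const cmp[]   = { "==", "!=", "<", ">" };
    const char *ctor = NULL;
    assert(op && left_c && right_c);

    for (size_t i = 0; i < 4; i++) {
        if (strcmp(op, arith[i]) == 0) ctor = "mk_num";
        if (strcmp(op, cmp[i]) == 0)   ctor = "mk_bool";
    }
    if (!ctor) return dup_str("V_NIL_VAL"); /* unsupported operator */

    /* First call measures the output, second call writes it. */
    int n = snprintf(NULL, 0, BINOP_FMT, ctor, left_c, op, right_c);
    if (n < 0) abort();
    char *out = malloc((size_t)n + 1);
    if (!out) abort();
    snprintf(out, (size_t)n + 1, BINOP_FMT, ctor, left_c, op, right_c);
    return out;
}
```

Wrapping each operand in parentheses keeps the result correct when an operand is itself a compound expression.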

cgargs(args)

Generates a comma-separated C argument list from a list of AST argument nodes.

cglist(elements)

Generates a C statement-expression that constructs a list and pushes its elements. Uses the GCC ({ ... }) statement-expression extension, which clang also accepts.
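To show what a statement-expression buys here, the sketch below hand-writes the kind of C a three-element list literal could expand to. The temporary name _l, the flat Value struct, and the wrapper function example_list are assumptions for illustration; the real output of cglist may differ. It compiles with gcc or clang, since ({ ... }) is a GNU extension.

```c
#include <assert.h>
#include <stdlib.h>

/* Deliberately simplified runtime: a flat struct, not cgen's union. */
typedef struct Value Value;
typedef struct { Value *items; size_t len, cap; } ListNode;
enum { V_NUM, V_LIST };
struct Value { int tag; double num; ListNode *list; };

static Value mk_num(double n) { Value v = { V_NUM, n, NULL }; return v; }

static Value mk_list(size_t cap) {
    ListNode *l = malloc(sizeof *l);
    if (!l) abort();
    l->len = 0;
    l->cap = cap ? cap : 1;
    l->items = malloc(l->cap * sizeof *l->items);
    if (!l->items) abort();
    Value v = { V_LIST, 0.0, l };
    return v;
}

static void list_push(Value list, Value item) {
    assert(list.tag == V_LIST);
    ListNode *l = list.list;
    if (l->len == l->cap) {           /* grow by doubling */
        Value *p = realloc(l->items, 2 * l->cap * sizeof *p);
        if (!p) abort();
        l->items = p;
        l->cap *= 2;
    }
    l->items[l->len++] = item;
}

/* A list literal with elements 10, 20, 30 written as one expression.
   The value of ({ ... }) is the value of its last statement, _l. */
static Value example_list(void) {
    return ({
        Value _l = mk_list(3);
        list_push(_l, mk_num(10));
        list_push(_l, mk_num(20));
        list_push(_l, mk_num(30));
        _l;
    });
}
```

The point of the extension is that the whole construction remains an expression, so a list literal can appear anywhere a Value is expected, such as a function argument or an initializer.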

cgobj(pairs)

Generates a C statement-expression that creates an object and sets its fields.
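The following sketch pairs a minimal key-value object with the expression an object literal could expand to. Several details are assumptions rather than documented behaviour: the flat Value struct, linear key lookup, obj_set overwriting an existing key, obj_get returning a nil Value for a missing key, keys stored as borrowed pointers, and the names _o and make_point.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Value Value;
typedef struct Field Field;
typedef struct { Field *fields; size_t len, cap; } ObjData;
enum { V_NIL, V_NUM, V_OBJ };
struct Value { int tag; double num; ObjData *obj; };
struct Field { const char *key; Value val; };

static Value mk_num(double n) { Value v = { V_NUM, n, NULL }; return v; }

static Value mk_obj(void) {
    ObjData *o = malloc(sizeof *o);
    if (!o) abort();
    o->len = 0;
    o->cap = 4;
    o->fields = malloc(o->cap * sizeof *o->fields);
    if (!o->fields) abort();
    Value v = { V_OBJ, 0.0, o };
    return v;
}

/* Sets a field; an existing key is overwritten (assumed behaviour). */
static void obj_set(Value obj, const char *key, Value val) {
    assert(obj.tag == V_OBJ);
    ObjData *o = obj.obj;
    for (size_t i = 0; i < o->len; i++) {
        if (strcmp(o->fields[i].key, key) == 0) {
            o->fields[i].val = val;
            return;
        }
    }
    if (o->len == o->cap) {           /* grow by doubling */
        Field *p = realloc(o->fields, 2 * o->cap * sizeof *p);
        if (!p) abort();
        o->fields = p;
        o->cap *= 2;
    }
    o->fields[o->len].key = key;      /* key is borrowed, not copied */
    o->fields[o->len].val = val;
    o->len++;
}

/* Looks a field up; a missing key yields nil (assumed behaviour). */
static Value obj_get(Value obj, const char *key) {
    assert(obj.tag == V_OBJ);
    for (size_t i = 0; i < obj.obj->len; i++)
        if (strcmp(obj.obj->fields[i].key, key) == 0)
            return obj.obj->fields[i].val;
    Value nil = { V_NIL, 0.0, NULL };
    return nil;
}

/* An object literal with fields x = 1 and y = 2 as one expression. */
static Value make_point(void) {
    return ({
        Value _o = mk_obj();
        obj_set(_o, "x", mk_num(1));
        obj_set(_o, "y", mk_num(2));
        _o;
    });
}
```

Borrowing the key pointer is safe in this sketch because generated field names are string literals with static storage; a runtime that accepts computed keys would need to copy them.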

Statement codegen

cgen.stmt(node)

Generates a C statement string from an AST node. Handles:

  • "def" - variable definition: Value name = expr;
  • "ret" - return statement
  • "ifs" - if/else block (with optional else branch)
  • "whl" - while loop
  • "fun" - function definition (delegates to cgfun)
  • "use" - module import (emitted as empty string)
  • Any other node - treated as an expression statement

cgblock(stmts)

Generates a sequence of C statements from a list of AST nodes, each indented with two spaces.

cgfun(node)

Generates a complete C function definition from a "fun" AST node - prototype, parameter list, body, and implicit return V_NIL_VAL;.
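As an illustration of the implicit return, here is hand-written C in the shape such generated functions might take. The flat Value struct, the definition of V_NIL_VAL as a compound literal, the way the if condition is unwrapped through .boolean, and the example functions add and recip are all assumptions; only the ilu_ prefix, the constructors, and the trailing return V_NIL_VAL; come from this page.

```c
#include <assert.h>

/* Simplified runtime for the example; not cgen's actual layout. */
enum { V_NIL, V_NUM, V_BOOL };
typedef struct { int tag; double num; int boolean; } Value;
#define V_NIL_VAL ((Value){ V_NIL, 0.0, 0 })

static Value mk_num(double n) { Value v = { V_NUM, n, 0 }; return v; }
static Value mk_bool(int b)   { Value v = { V_BOOL, 0.0, b != 0 }; return v; }

/* A function whose body always returns: the appended
   return V_NIL_VAL; is unreachable but harmless. */
Value ilu_add(Value ilu_a, Value ilu_b) {
  return mk_num((ilu_a).num + (ilu_b).num);
  return V_NIL_VAL;
}

/* A function that returns only inside an if: when the condition is
   false, control falls through to the implicit nil return. */
Value ilu_recip(Value ilu_x) {
  if (mk_bool((ilu_x).num != (mk_num(0)).num).boolean) {
    return mk_num((mk_num(1)).num / (ilu_x).num);
  }
  return V_NIL_VAL;
}
```

The unconditional trailing return means every generated function has a defined result on all paths, so the C compiler never sees a non-void function falling off its end.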

Full program

cgfwd(stmts)

Generates forward declarations for all top-level function definitions. This ensures mutually recursive functions compile correctly.
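The sketch below shows why the prototypes matter, using two hypothetical ilusm functions is_even and is_odd that call each other. The bodies are hand-simplified and the flat Value struct is an assumption; the layout (prototypes first, then definitions) is the part that mirrors what this page describes.

```c
#include <assert.h>

/* Simplified runtime for the example; not cgen's actual layout. */
enum { V_NIL, V_NUM, V_BOOL };
typedef struct { int tag; double num; int boolean; } Value;
#define V_NIL_VAL ((Value){ V_NIL, 0.0, 0 })

static Value mk_num(double n) { Value v = { V_NUM, n, 0 }; return v; }
static Value mk_bool(int b)   { Value v = { V_BOOL, 0.0, b != 0 }; return v; }

/* Forward declarations: both names are known before either body. */
Value ilu_is_even(Value ilu_n);
Value ilu_is_odd(Value ilu_n);

Value ilu_is_even(Value ilu_n) {
  if ((ilu_n).num == 0) {
    return mk_bool(1);
  }
  /* ilu_is_odd is defined further down; the prototype above is
     what lets this call compile. */
  return ilu_is_odd(mk_num((ilu_n).num - 1));
  return V_NIL_VAL;
}

Value ilu_is_odd(Value ilu_n) {
  if ((ilu_n).num == 0) {
    return mk_bool(0);
  }
  return ilu_is_even(mk_num((ilu_n).num - 1));
  return V_NIL_VAL;
}
```

Without the two prototypes, the call to ilu_is_odd inside ilu_is_even would refer to an undeclared function, which C99 and later do not permit. Emitting a prototype for every top-level function also removes any dependence on definition order.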

cgen.prog(prog)

Generates a complete C source file from a program AST. Emits the runtime preamble, forward declarations, all function definitions, and a main() that runs all non-function top-level statements.

cgen.fn(fn_node)

Generates a single C function definition from a function AST node. Useful for incremental or partial code generation.

cgen.write(path, prog)

Generates the full C source for a program and writes it to path using __sys_write_file.

Notes

  • The emitted C uses GCC statement-expressions (({ ... })) for list and object literals - compile with gcc or clang, not MSVC.
  • Only the operators +, -, *, /, ==, !=, <, > are implemented; others emit V_NIL_VAL.
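  • No GC is emitted, and the preamble lists no helpers for freeing values - this is a simple arena-style model in which memory is reclaimed when the process exits, suitable for short-lived programs.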
  • Requires trl and txt.