Code Generation

Backends and lowering: Cranelift, native, LLVM, GPU, and debug info.

Code generation is where Sounio most clearly ships more than one real compiler profile. It is no longer accurate to say that “the default artifact exposes one path.” The accurate description is that “the repo ships separate checked JIT and GPU artifacts, and each one exposes a different backend contract.”

Current source map

  • self-hosted/native/ covers native lowering, encoding, ABI handling, register allocation, object formats, and test suites.
  • self-hosted/wasm/ covers lowering, encoding, module construction, and driver code for the WASM path.
  • self-hosted/gpu/ covers PTX, SPIR-V, Metal, tensor-oriented work, and GPU lowering paths.
  • self-hosted/llvm/ remains present as a separate backend-oriented subtree.

What the checked artifacts prove

  • JIT profile: Cranelift JIT enabled; LLVM and GPU codegen disabled.
  • GPU profile: GPU codegen enabled; JIT disabled; PTX emission available through build --backend gpu.
  • Documentation should therefore distinguish three things: default (JIT) profile behavior, GPU-profile behavior, and the wider set of backends implemented in the source tree.
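
One way to keep the two profiles apart in scripts is to resolve the binary by profile name instead of hard-coding a path. The sketch below is illustrative only: the function name souc_for_profile and its optional repo-root argument are hypothetical, and the two binary paths are the ones used in the commands under "Confirm the active backend set".

```shell
# Hypothetical helper: map a profile name to the checked artifact path.
# Usage: souc_for_profile <jit|gpu> [repo-root]   (repo-root defaults to $PWD)
souc_for_profile() {
  profile="$1"
  root="${2:-$(pwd)}"
  case "$profile" in
    jit) printf '%s\n' "$root/artifacts/omega/souc-bin/souc-linux-x86_64-jit" ;;
    gpu) printf '%s\n' "$root/artifacts/omega/souc-bin/souc-linux-x86_64-gpu" ;;
    *)
      # Fail closed on anything that is not a checked profile.
      printf 'unknown profile: %s\n' "$profile" >&2
      return 1
      ;;
  esac
}
```

With this in place, "$(souc_for_profile gpu)" info would run the info command against the GPU artifact, and an unrecognized profile name stops the script instead of silently falling back to the default binary.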

Native v2 preview lane

The repo also carries a public preview native-v2 lane for the backend sovereignty program. The lane is a preview, not a new stable end-user backend. The old native-v2-shadow compatibility alias has been retired.

  • It exists to pin RuntimeContext, target register policy, and Machine IR expectations in a machine-readable artifact.
  • The artifact now distinguishes the allocatable x86 callee-saved set (r12..r15) from the actual allocation order consumed by regalloc and lowering (r15, r14, r13, r12).
  • The current x86-64 preview lane now lowers through a real Machine IR + legality path and emits strict scalar-core native ELFs in the self-hosted shell. Unsupported opcode families fail closed.
  • The x86-64 lane publishes real stack-map/deopt metadata plus a concrete gc_state block and managed-object descriptor table through RuntimeContext, and it emits the v2 root-map/deopt-id/OSR-eligibility stack-map schema.
  • The x86-64 lane routes native-v2 heap objects through a fixed-capacity handle table for alloc/field/index access and compiles allocation overflow into a real runtime slow-path trap.
  • The x86-64 lane also carries an executable descriptor-driven mark/compact GC model with precise pointer-slot scanning and pin-aware relocation rules.
  • AArch64 now emits the same scalar-core preview contract as Mach-O output, but it is still a compile-only preview rather than a runtime-attested native lane.
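
The difference between the allocatable set and the allocation order is easy to blur, so here is a toy sketch of it. This is not the Sounio register allocator: the variable and function names are hypothetical, and only the register names and their order come from the list above. The set says which registers may be handed out; the order says which one is tried first.

```shell
# Toy illustration only; names are hypothetical, not Sounio's regalloc.
ALLOCATABLE="r12 r13 r14 r15"   # the allocatable callee-saved set
ALLOC_ORDER="r15 r14 r13 r12"   # the order in which the set is consumed

# Print the first register in ALLOC_ORDER that is not already in use.
# Arguments: the registers currently in use.
next_free_reg() {
  for reg in $ALLOC_ORDER; do
    case " $* " in
      *" $reg "*) ;;                 # already taken, try the next one
      *) printf '%s\n' "$reg"; return 0 ;;
    esac
  done
  return 1                           # set exhausted; a real allocator would spill
}
```

With nothing in use the sketch picks r15, and with r15 and r14 in use it picks r13, even though both orderings describe the same four registers.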

Confirm the active backend set

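# JIT profile: report the backend set of the checked JIT artifact.
export SOUC_BIN="$(pwd)/artifacts/omega/souc-bin/souc-linux-x86_64-jit"
"$SOUC_BIN" info

# GPU profile: report the backend set, then emit PTX for an example kernel.
export SOUC_GPU_BIN="$(pwd)/artifacts/omega/souc-bin/souc-linux-x86_64-gpu"
"$SOUC_GPU_BIN" info
"$SOUC_GPU_BIN" build examples/kernel_vec_add.sio --backend gpu -o /tmp/kernel_vec_add.ptx

# Native v2 preview: set the contract path, then run the compiler self-test.
export SOUNIO_NATIVE_V2_CONTRACT_PATH=/tmp/native_backend_v2_contract.v1.json
"$SOUC_BIN" run self-hosted/compiler/main.sio -- --self-test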

The canonical gate-backed artifact is artifacts/omega/native_backend_v2_contract.v1.json.
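
A local contract file can then be compared against the canonical one. The sketch below rests on two assumptions that this page does not confirm: that the self-test writes the contract to SOUNIO_NATIVE_V2_CONTRACT_PATH, and that a byte-for-byte comparison is a meaningful check (the real gate may normalize or compare fields instead). The function name check_contract is hypothetical.

```shell
# Hypothetical helper: compare a locally generated contract with the
# canonical artifact. Assumes byte-for-byte equality is the right check.
# Usage: check_contract <generated.json> <canonical.json>
check_contract() {
  generated="$1"
  canonical="$2"
  if [ ! -s "$generated" ]; then
    # Nothing was written, or the file is empty.
    echo "missing-or-empty"
    return 2
  fi
  if cmp -s "$generated" "$canonical"; then
    echo "match"
  else
    echo "differs"
    return 1
  fi
}
```

A typical call would be check_contract "$SOUNIO_NATIVE_V2_CONTRACT_PATH" artifacts/omega/native_backend_v2_contract.v1.json, run from the repo root after the self-test.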

How to write codegen docs that do not go stale

  • Describe the backend directories when discussing implementation architecture.
  • Describe the checked JIT artifact when discussing the default user experience.
  • Describe the checked GPU artifact when discussing public GPU codegen.
  • Only describe backend execution paths that you have validated with the exact artifact and command you are documenting.
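
To make "the exact artifact and command" concrete, a doc author could record a digest of the binary next to the command that was validated. This is a minimal sketch, assuming GNU coreutils sha256sum is available (reasonable for the linux-x86_64 artifacts named on this page). The function name record_validation and the output format are hypothetical, not part of the Sounio tooling.

```shell
# Hypothetical helper: print one provenance line for a validated command.
# Usage: record_validation <artifact-path> <command words...>
record_validation() {
  artifact="$1"
  shift
  if [ ! -f "$artifact" ]; then
    printf 'no such artifact: %s\n' "$artifact" >&2
    return 1
  fi
  # sha256sum prints "<digest>  <path>"; keep only the digest.
  digest="$(sha256sum "$artifact" | cut -d' ' -f1)"
  printf 'artifact=%s sha256=%s command=%s\n' \
    "$(basename "$artifact")" "$digest" "$*"
}
```

For example, record_validation "$SOUC_GPU_BIN" build examples/kernel_vec_add.sio --backend gpu prints a single line that can be pasted beside the documented command, so a later reader can tell which binary the claim was checked against.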