Code Generation

Backends and lowering: Cranelift, native, LLVM, GPU, and debug info.

Code generation is the area where Sounio most clearly ships multiple distinct compiler profiles. The default public workflow is the checked, self-hosted native path behind bin/souc. Separate artifacts under artifacts/omega/ expose optional Cranelift JIT and GPU profiles, each with its own backend contract.

Current source map

  • self-hosted/native/ covers native lowering, encoding, ABI handling, register allocation, object formats, and test suites.
  • self-hosted/wasm/ covers lowering, encoding, module construction, and driver code for the WASM path.
  • self-hosted/gpu/ covers PTX, SPIR-V, Metal, tensor-oriented work, and GPU lowering paths.
  • self-hosted/llvm/ remains present as a separate backend-oriented subtree.

What the checked artifacts prove

  • Default launcher (bin/souc): self-hosted native codegen to ELF/Mach-O; see artifacts/omega/selfhost_verification_report.v1.json and docs/guide/MINIMUM_VIABLE_SOUNIO.md.
  • Optional JIT profile: Cranelift JIT is enabled in a separate checked artifact; it is not the default onboarding path.
  • GPU profile: GPU codegen enabled; JIT disabled; PTX emission available through build --backend gpu.
  • Self-hosted GPU templates: the default compiler (bin/souc) compiles GPU syntax and runs CPU fallback. The kretikos CLI builds small in-tree emitter drivers for predefined PTX/CUBIN templates: kretikos emit-ptx (from self-hosted/gpu/ptx.sio) and kretikos emit-cubin (from self-hosted/gpu/nvidia_bare.sio). kretikos bundle groups those artifacts with hashes, structural checks, optional toolchain/runtime validation, and explicit boundaries.
  • Documentation should therefore distinguish default-profile behavior, GPU-profile behavior, self-hosted template behavior, and source-tree implementation breadth.
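
One way to keep these distinctions explicit is to write the profile capabilities down as data and check each documentation claim against them. The sketch below is illustrative only: the PROFILES table and the claim_is_supported helper are hypothetical names, not part of the Sounio tree, and the capability flags merely restate the bullets above.

```python
# Hypothetical capability table restating the checked-artifact bullets.
# None of these names exist in the Sounio tree; they are for illustration.
PROFILES = {
    "default": {"native_codegen": True, "gpu_syntax_cpu_fallback": True},
    "jit": {"cranelift_jit": True},
    "gpu": {"gpu_codegen": True, "cranelift_jit": False, "ptx_emission": True},
}


def claim_is_supported(profile: str, capability: str) -> bool:
    """Return True only if the table explicitly records the capability as enabled."""
    return PROFILES.get(profile, {}).get(capability, False) is True


if __name__ == "__main__":
    claims = [
        ("gpu", "ptx_emission"),
        ("gpu", "cranelift_jit"),
        ("default", "cranelift_jit"),
    ]
    for profile, capability in claims:
        ok = claim_is_supported(profile, capability)
        status = "supported" if ok else "not established"
        print(f"{profile}: {capability} -> {status}")
```

A claim that is missing from the table is treated as not established rather than false, which mirrors the rule that a profile's behavior should only be documented once a checked artifact shows it.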

Native v2 preview lane

The repo also carries a public preview native-v2 lane for the backend sovereignty program. The old native-v2-shadow compatibility alias has now been retired. This is not a new stable end-user backend.

  • It exists to pin RuntimeContext, target register policy, and Machine IR expectations in a machine-readable artifact.
  • The artifact now distinguishes the allocatable x86 callee-saved set (r12..r15) from the actual allocation order consumed by regalloc and lowering (r15, r14, r13, r12).
  • The current x86-64 preview lane lowers through a real Machine IR + legality path and emits strict scalar-core native ELFs in the self-hosted shell. Through RuntimeContext it publishes real stack-map/deopt metadata, a concrete gc_state block, and a managed-object descriptor table, and it emits the v2 root-map/deopt-id/OSR-eligibility stack-map schema. Native-v2 heap objects are routed through a fixed-capacity handle table for alloc/field/index access, and allocation overflow compiles into a real runtime slow-path trap. The lane also carries an executable descriptor-driven mark/compact GC model with precise pointer-slot scanning and pin-aware relocation rules. Unsupported opcode families fail closed.
  • AArch64 now emits the same scalar-core preview contract as Mach-O output, but it is still a compile-only preview rather than a runtime-attested native lane.
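
Two of the ideas in this list can be modeled in a few lines: the difference between an allocatable register set and the order in which it is consumed, and a fixed-capacity handle table whose overflow becomes a trap. The following is a minimal sketch under stated assumptions, not the native-v2 implementation. Every class and function name is hypothetical, and a Python exception stands in for the runtime slow-path trap.

```python
# Illustrative model only; all names below are hypothetical.

# The allocatable set (unordered) versus the order regalloc consumes.
ALLOCATABLE_CALLEE_SAVED = frozenset({"r12", "r13", "r14", "r15"})
ALLOCATION_ORDER = ("r15", "r14", "r13", "r12")


def check_register_policy(allocatable, order):
    """The order must be a duplicate-free permutation of the allocatable set."""
    return len(order) == len(set(order)) and set(order) == set(allocatable)


class HandleTableOverflow(RuntimeError):
    """Stands in for the runtime slow-path trap on allocation overflow."""


class HandleTable:
    """Fixed-capacity table: objects are reached through integer handles."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []

    def alloc(self, fields):
        # Overflow is an explicit failure, never a silent resize.
        if len(self.slots) >= self.capacity:
            raise HandleTableOverflow("handle table full")
        self.slots.append(dict(fields))
        return len(self.slots) - 1  # the handle is the slot index

    def get_field(self, handle, name):
        return self.slots[handle][name]


if __name__ == "__main__":
    print(check_register_policy(ALLOCATABLE_CALLEE_SAVED, ALLOCATION_ORDER))
    table = HandleTable(capacity=2)
    first = table.alloc({"x": 1})
    table.alloc({"x": 2})
    print(table.get_field(first, "x"))
    try:
        table.alloc({"x": 3})
    except HandleTableOverflow as exc:
        print("trap:", exc)
```

Keeping the set and the order as separate values makes it possible to change the allocation order without changing which registers are allocatable, which is the distinction the artifact records.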

Confirm the active backend set

export SOUC_BIN="$(pwd)/bin/souc"
"$SOUC_BIN" info

# Self-hosted GPU artifact templates
kretikos emit-ptx vec_add     -o /tmp/kretikos_vec_add.ptx
kretikos emit-ptx vec_sub     -o /tmp/kretikos_vec_sub.ptx
kretikos emit-ptx vec_mul     -o /tmp/kretikos_vec_mul.ptx
kretikos emit-ptx vec_div     -o /tmp/kretikos_vec_div.ptx
kretikos emit-ptx vec_add_f64 -o /tmp/kretikos_vec_add_f64.ptx
kretikos emit-ptx fma         -o /tmp/kretikos_fma.ptx
kretikos emit-ptx fma_f64     -o /tmp/kretikos_fma_f64.ptx
kretikos emit-ptx store_u32_const -o /tmp/kretikos_store_u32.ptx
kretikos emit-metal vec_add          -o /tmp/kretikos_vec_add.metal
kretikos emit-metal ossm_oct_step    -o /tmp/kretikos_ossm.metal
kretikos emit-metal sedenion_cd_step -o /tmp/kretikos_sedenion.metal
kretikos emit-cubin vec_add_f32 -o /tmp/kretikos_vec_add.cubin
kretikos bundle -o /tmp/kretikos-bundle
kretikos bundle -o /tmp/kretikos-validated-bundle --validate-toolchain --validate-runtime

# GPU artifact (broader pattern support)
export SOUC_GPU_BIN="$(pwd)/artifacts/omega/souc-bin/souc-linux-x86_64-gpu"
"$SOUC_GPU_BIN" info
"$SOUC_GPU_BIN" build examples/kernel_vec_add.sio --backend gpu -o /tmp/kernel_vec_add.ptx

# Native v2 preview contract (self-test with the contract path set)
export SOUNIO_NATIVE_V2_CONTRACT_PATH=/tmp/native_backend_v2_contract.v1.json
"$SOUC_BIN" run self-hosted/compiler/main.sio -- --self-test

The canonical gate-backed artifact is artifacts/omega/native_backend_v2_contract.v1.json.
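
Because the contract is a machine-readable artifact, a quick way to see what it pins is to list its top-level keys. The helper below is a small sketch that assumes only that the file is a JSON object. It makes no assumption about the contract's schema, and top_level_keys is a hypothetical name rather than a Sounio tool.

```python
import json
from pathlib import Path


def top_level_keys(contract_path):
    """Load a JSON artifact and return its sorted top-level keys.

    Assumes only that the file holds a JSON object; no schema is assumed.
    """
    data = json.loads(Path(contract_path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return sorted(data)


if __name__ == "__main__":
    path = Path("artifacts/omega/native_backend_v2_contract.v1.json")
    if path.is_file():
        print(top_level_keys(path))
    else:
        print(f"{path} not found; run from the repository root")
```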

How to write codegen docs that do not go stale

  • Describe the backend directories when discussing implementation architecture.
  • Describe the default launcher (bin/souc) when discussing the default user experience; describe the checked JIT artifact only when discussing the optional JIT profile.
  • Describe the checked GPU artifact when discussing public GPU codegen.
  • Only describe backend execution paths that you have validated with the exact artifact and command you are documenting.
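
The last rule is easier to follow when each documented command is stored next to a fingerprint of the artifact it was run against. The following is one possible sketch, not an existing Sounio workflow: validation_record is a hypothetical helper, and the record layout is an assumption chosen for illustration.

```python
import hashlib
from pathlib import Path


def validation_record(artifact_path, command):
    """Pair the exact command with a SHA-256 of the artifact it was run against."""
    digest = hashlib.sha256(Path(artifact_path).read_bytes()).hexdigest()
    return {"artifact": str(artifact_path), "sha256": digest, "command": command}


if __name__ == "__main__":
    artifact = Path("bin/souc")
    if artifact.is_file():
        print(validation_record(artifact, "bin/souc info"))
    else:
        print("bin/souc not found; run from the repository root")
```

If the recorded hash no longer matches the artifact on disk, the corresponding documentation should be treated as unvalidated until the command is rerun.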