Code Generation

Code generation is where Sounio's compiler profiles diverge most visibly. The default public workflow is the checked self-hosted native path behind bin/souc. Separate omega artifacts expose optional Cranelift JIT and GPU profiles, each with its own backend contract.

Current source map

self-hosted/native/ covers native lowering, encoding, ABI handling, register allocation, object formats, and test suites.
self-hosted/wasm/ covers lowering, encoding, module construction, and driver code for the WASM path.
self-hosted/gpu/ covers PTX, SPIR-V, Metal, tensor-oriented work, and GPU lowering paths.
self-hosted/llvm/ remains present as a separate backend-oriented subtree.

What the checked artifacts prove

Default launcher (bin/souc): self-hosted native codegen to ELF/Mach-O; see artifacts/omega/selfhost_verification_report.v1.json and docs/guide/MINIMUM_VIABLE_SOUNIO.md.
Optional JIT profile: Cranelift JIT enabled in a separate checked artifact; this is not the default onboarding path.
GPU profile: GPU codegen enabled; JIT disabled; PTX emission available through build --backend gpu.
Self-hosted GPU templates: the default compiler (bin/souc) compiles GPU syntax and runs a CPU fallback. The kretikos CLI builds small in-tree emitter drivers for predefined PTX/CUBIN templates: kretikos emit-ptx (from self-hosted/gpu/ptx.sio) and kretikos emit-cubin (from self-hosted/gpu/nvidia_bare.sio). kretikos bundle groups those artifacts with hashes, structural checks, optional toolchain/runtime validation, and explicit boundaries.
Documentation should therefore distinguish default-profile behavior, GPU-profile behavior, self-hosted template behavior, and source-tree implementation breadth.
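Before attributing behavior to a profile, it helps to know which profile binaries are actually present in the checkout. The following is a minimal sketch, assuming it is run from the repository root; profile_status is a hypothetical helper and not part of the Sounio tooling. The two paths are the ones named on this page.

```shell
#!/bin/sh
# Sketch: report which checked compiler profiles exist in this checkout.
# profile_status is a hypothetical helper, not a Sounio command.
profile_status() {
    # $1 = human-readable label, $2 = path to that profile's binary
    if [ -x "$2" ]; then
        printf '%s: present (%s)\n' "$1" "$2"
    else
        printf '%s: missing (%s)\n' "$1" "$2"
    fi
}

# Paths taken from this page; both are relative to the repository root.
profile_status default "bin/souc"
profile_status gpu "artifacts/omega/souc-bin/souc-linux-x86_64-gpu"
```

A "missing" line only says the binary is absent or not executable; it says nothing about which backends a present binary enables. For that, use the info subcommand of the binary in question.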

Native v2 preview lane

The repo also carries a public-preview native-v2 lane for the backend sovereignty program. The old native-v2-shadow compatibility alias has been retired. This lane is not a new stable end-user backend.

It exists to pin RuntimeContext, target register policy, and Machine IR expectations in a machine-readable artifact.
The artifact now distinguishes the allocatable x86 callee-saved set (r12..r15) from the actual allocation order consumed by regalloc and lowering (r15, r14, r13, r12).
The current x86-64 preview lane lowers through a real Machine IR + legality path and emits strict scalar-core native ELFs in the self-hosted shell. It publishes real stack-map/deopt metadata, a concrete gc_state block, and a managed-object descriptor table through RuntimeContext, and emits the v2 root-map/deopt-id/OSR-eligibility stack-map schema. It routes native-v2 heap objects through a fixed-capacity handle table for alloc/field/index access and compiles allocation overflow into a real runtime slow-path trap. It also carries an executable descriptor-driven mark/compact GC model with precise pointer-slot scanning and pin-aware relocation rules. Unsupported opcode families fail closed.
AArch64 now emits the same scalar-core preview contract as Mach-O output, but it is still a compile-only preview rather than a runtime-attested native lane.
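The difference between the allocatable set and the allocation order is easy to blur, so here is a small sketch of it. The register names are copied from the text above; the variable and function names are illustrative only and do not reflect the field names of the contract artifact, which this page does not specify.

```shell
#!/bin/sh
# Values copied from this page; names are illustrative, not the artifact schema.
allocatable_set="r12 r13 r14 r15"     # callee-saved registers that may be allocated
allocation_order="r15 r14 r13 r12"    # order in which regalloc and lowering consume them

# Print one register per line, sorted, so ordering no longer matters.
normalize() {
    printf '%s\n' $1 | sort    # $1 is left unquoted on purpose: split on spaces
}

# Succeeds when both lists contain exactly the same registers.
same_members() {
    [ "$(normalize "$1")" = "$(normalize "$2")" ]
}

if same_members "$allocatable_set" "$allocation_order"; then
    echo "allocation order is a permutation of the allocatable set"
else
    echo "allocation order and allocatable set disagree"
fi

# The first register handed out is the head of the order, not of the set.
first_pick="${allocation_order%% *}"
echo "first pick: $first_pick"
```

The set answers "which registers are eligible"; the order answers "which one is taken first". Under the values above, the first pick is r15 even though the set is usually written starting at r12.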

Confirm the active backend set

# Default self-hosted launcher
export SOUC_BIN="$(pwd)/bin/souc"
"$SOUC_BIN" info

# Self-hosted GPU artifact templates
kretikos emit-ptx vec_add     -o /tmp/kretikos_vec_add.ptx
kretikos emit-ptx vec_sub     -o /tmp/kretikos_vec_sub.ptx
kretikos emit-ptx vec_mul     -o /tmp/kretikos_vec_mul.ptx
kretikos emit-ptx vec_div     -o /tmp/kretikos_vec_div.ptx
kretikos emit-ptx vec_add_f64 -o /tmp/kretikos_vec_add_f64.ptx
kretikos emit-ptx fma         -o /tmp/kretikos_fma.ptx
kretikos emit-ptx fma_f64     -o /tmp/kretikos_fma_f64.ptx
kretikos emit-ptx store_u32_const -o /tmp/kretikos_store_u32.ptx
kretikos emit-metal vec_add          -o /tmp/kretikos_vec_add.metal
kretikos emit-metal ossm_oct_step    -o /tmp/kretikos_ossm.metal
kretikos emit-metal sedenion_cd_step -o /tmp/kretikos_sedenion.metal
kretikos emit-cubin vec_add_f32 -o /tmp/kretikos_vec_add.cubin
kretikos bundle -o /tmp/kretikos-bundle
kretikos bundle -o /tmp/kretikos-validated-bundle --validate-toolchain --validate-runtime

# GPU artifact (broader pattern support)
export SOUC_GPU_BIN="$(pwd)/artifacts/omega/souc-bin/souc-linux-x86_64-gpu"
"$SOUC_GPU_BIN" info
"$SOUC_GPU_BIN" build examples/kernel_vec_add.sio --backend gpu -o /tmp/kernel_vec_add.ptx

# Native v2 preview lane: set the contract path, then run the compiler self-test
export SOUNIO_NATIVE_V2_CONTRACT_PATH=/tmp/native_backend_v2_contract.v1.json
"$SOUC_BIN" run self-hosted/compiler/main.sio -- --self-test

The canonical gate-backed artifact is artifacts/omega/native_backend_v2_contract.v1.json.
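One way to use the canonical artifact is to compare it against a locally produced contract. The sketch below assumes the self-test run above leaves a contract file at the path in SOUNIO_NATIVE_V2_CONTRACT_PATH; this page implies that but does not state it. contract_check is a hypothetical helper, and a byte-for-byte comparison may be stricter than the real gate if the artifact embeds timestamps or host details.

```shell
#!/bin/sh
# contract_check is a hypothetical helper, not part of the Sounio tooling.
# Arguments: <canonical-path> <candidate-path>
contract_check() {
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "missing"
        return 2
    fi
    # cmp -s is silent and succeeds only when the files are byte-identical.
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "differs"
        return 1
    fi
}

# Assumption: the candidate lives where SOUNIO_NATIVE_V2_CONTRACT_PATH points.
contract_check \
    artifacts/omega/native_backend_v2_contract.v1.json \
    "${SOUNIO_NATIVE_V2_CONTRACT_PATH:-/tmp/native_backend_v2_contract.v1.json}" || true
```

A "differs" result is a prompt to inspect the two files, not proof of a regression.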

How to write codegen docs that do not go stale

Describe the backend directories when discussing implementation architecture.
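Describe the checked self-hosted native path behind bin/souc when discussing the default user experience; describe the checked JIT artifact only when discussing the optional JIT profile.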
Describe the checked GPU artifact when discussing public GPU codegen.
Only describe backend execution paths that you have validated with the exact artifact and command you are documenting.
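The last rule is easier to follow if each documented claim carries a record of the exact artifact and command behind it. The following is a minimal sketch of such a record; validation_stamp is a hypothetical helper, not a Sounio or kretikos command. It only prints the artifact's digest and the command text and does not execute the command.

```shell
#!/bin/sh
# validation_stamp is a hypothetical helper for doc authors.
# Usage: validation_stamp <artifact-path> <command> [args...]
validation_stamp() {
    artifact="$1"
    shift
    if [ ! -f "$artifact" ]; then
        echo "artifact not found: $artifact" >&2
        return 1
    fi
    # sha256sum ships with GNU coreutils; fall back to shasum where it is absent.
    if command -v sha256sum >/dev/null 2>&1; then
        digest="$(sha256sum "$artifact" | cut -d' ' -f1)"
    else
        digest="$(shasum -a 256 "$artifact" | cut -d' ' -f1)"
    fi
    # "$*" is the documented command line, recorded verbatim.
    printf 'artifact: %s\nsha256: %s\ncommand: %s\n' "$artifact" "$digest" "$*"
}

# Example: stamp the default launcher together with the command being documented.
validation_stamp bin/souc bin/souc info || true
```

Keeping the three printed lines next to a doc claim makes it obvious when the claim was validated against a different binary than the one a reader has.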