On this page

Language

What the language does.
What it doesn’t. Exactly.

Four things Sounio gives you that other languages don’t. Anything not on this page is either scaffolding or missing — /proof has the honest status.

Sounio is a statically typed, natively compiled systems language. Its contribution relative to Rust (ownership / linearity), Koka (algebraic effects), and Lean (verified computation) is the epistemic dimension: gradual confidence typing, GUM uncertainty propagation, and confidence gates — all enforced by the compiler, not by convention or runtime library.

Confidence survives every operation.

Every value in Sounio can carry the evidence behind it. The type system enforces this at compile time — not as a lint, not as a runtime check.

Knowledge<T>

The base epistemic type. Wraps any value with a confidence level ε ∈ [0, 1], an uncertainty interval, and a provenance string. GUM uncertainty propagates automatically through arithmetic.


Knowledge<f64>, Knowledge<mg>

Validated<T>

A distinct type that can only be produced by passing a formal confidence gate. Cannot be confused with raw measurements. Confidence-gated execution branches on Validated vs. raw.


fn prescribe(d: Validated<f64, ε >= 0.82>)

Unobserved<T>

Values remain quantum-indeterminate until a comparison or I/O boundary collapses them. Bayesian and quantum measurement semantics live in the type system, not in runtime libraries.


fn observe(x: Unobserved<f64>) -> bool with Observe

Intervention<T> & Counterfactual<T>

Compiler-level types that carry causal confidence. Intervention<T> corresponds to Pearl's do-operator; Counterfactual<T> encodes hypothetical reasoning under the same discipline as observational data.


do(X = x): Intervention<f64>

Knowledge<T> progression — bare numeric to fully epistemic

Tier 1
T

Raw scalar or structured value with no attached evidence metadata.
Tier 2
Quantity<T, Unit>

Value promoted to dimension-aware representation with unit constraints.
Tier 3
Knowledge<T>

Uncertainty, confidence, and source attribution become part of the value.
Tier 4
Knowledge<T, Provenance>

Executable value with traceability chain for audit and reproducibility.

Confidence-gated execution

confidence_gate.sio

let dose: Knowledge<f64> = Knowledge(
  500.0,
  ε=0.97,
  prov="lab_run_042"
)

// Confidence gate: only passes if ε >= 0.82
if dose.ε >= 0.82 {
  prescribe(dose)           // type: Knowledge<f64>
} else {
  request_remeasurement()
}

output

Algebra declarations

Declare algebraic laws on types. The e-graph optimizer uses only the rewrites your algebra permits — no unsound reordering of non-commutative ops.

algebra Octonion over f64 {
  add: commutative, associative
  mul: alternative, non_commutative
  reassociate: fano_selective
}

Units & dimensional analysis

Units are checked at compile time. A value declared as mg cannot be silently coerced to kg or dimensionless. Refinement predicates (e.g. x > 0) work today; SMT integration is planned.

let dose: mg = 500.0
let weight: kg = 78.5
// let wrong = dose + weight  ← compile error: mg + kg

GUM propagation — uncertainty flows automatically

Measure

Capture value + uncertainty + confidence from instrument or model.

Propagate

Composition rules carry uncertainty through transformations.

Evaluate

Confidence thresholds and effect policies decide admissibility.

Act

Execution continues only when evidence satisfies gate criteria.

Side effects live in the signature.

Side effects are visible in function signatures. Linear types enforce exactly-once resource consumption. Neither requires runtime overhead.

Built-in effects

with IO File, network, stdin/stdout — any real-world I/O.

with Panic Controlled abort path. Functions without Panic cannot panic.

with Observe Collapses Unobserved<T> to a concrete value — separates measurement from computation.

with Async Async/await, channels, spawn. Phase 1 wired (channel + send/recv + spawn); full scheduler pending.

with GPU Kernel launch, memory transfer. Syntax integrated; end-to-end path requires CUDA driver.

with Probabilistic Monte Carlo sampling, Bayesian updates. In stdlib; effect annotation enforces separation from deterministic code.

The subset rule

A function's effects must be a subset of its caller's declared effects. A function without with IO cannot call a function that has it. This is checked by the type checker, not at runtime.

Linear structs

A linear struct must be consumed exactly once — it cannot be copied, dropped silently, or moved into two branches. File handles, network connections, and cryptographic contexts are the canonical use case.

effects_and_linear.sio

// Effects are part of the signature — callee effects must
// be a subset of the caller's declared effects.
fn read_sensor() -> Knowledge<f64> with IO, Observe {
    let raw = read_port(0x3F8)          // requires IO
    return observe(raw)                  // requires Observe
}

// Linear struct: the handle must be consumed exactly once.
linear struct FileHandle { fd: i32 }

fn close(h: FileHandle) with IO {
    syscall_close(h.fd)                 // h is consumed here
}

output

The compiler compiles itself.

The compiler is self-hosted: written in Sounio, it compiles Sounio. The bootstrap chain is verified — stage 2 and stage 3 produce byte-for-byte identical binaries.

Compilation pipeline

Step 1

Source + Types

Parser, refinement checks, and uncertainty-aware typing.

Step 2

IR + Effects

Lowering and effect analysis with provenance retention.

Step 3

Backend Select

Native and GPU target selection with capability checks.

Step 4

Runtime Evidence

Execution outputs with confidence and traceability context.

Pipeline stages

Source → Lexer → Parser Recursive-descent parser. No external parser generator.

AST → Check Bidirectional type inference + effect analysis. Knowledge<T> subtyping, confidence gates, linear resource tracking.

Check → HIR → SIR High-level IR, then Static IR (SSA form). Epistemic metadata preserved through lowering.

SIR → HLIR (e-graph) 1,000+ algebraic rewrite rules. Algebra declarations constrain which rewrites apply.

HLIR → ELF (x86-64) Native code generation. No LLVM. Bootstrap chain: C → gen1 → gen2 → gen3 (fixed point).

Reproducing the self-hosting fixed point

# Stage 0: C compiler compiles the first Sounio compiler
./artifacts/self-hosted/souc-self-hosted-x86_64 \
    self-hosted/compiler/lean_single.sio gen1.elf

# Stage 1 → Stage 2: first self-hosted compile
./gen1.elf self-hosted/compiler/lean_single.sio gen2.elf

# Stage 2 → Stage 3: fixed-point check
./gen2.elf self-hosted/compiler/lean_single.sio gen3.elf

# Must be identical — this is the fixed point.
sha256sum gen2.elf gen3.elf

Achieved 2026-03-23. Recorded SHA-256 (CHANGELOG): 7b588877f870… (741 KB). See /proof §2 for the full verification.

E-graph optimizer

The optimizer implements equality saturation over an e-graph — the same approach used by egg/egglog but running inside a Sounio compiler written in Sounio. 1,000+ algebraic rewrite rules, all at FAIL=0 in the test gate. Algebra declarations constrain which rewrites apply per type.

Common commands

$ souc check file.sio Type-check; no output generated.

$ souc run file.sio Compile to temp ELF, execute, clean up.

$ souc compile file.sio -o out.elf Direct native compilation to ELF.

$ souc check --show-types file.sio Dump inferred types (JIT mode only).

What is wired. What is not.

What is and isn't wired up end-to-end. The anti-slop move is to say this explicitly before the reader discovers it.

Same contract as the homepage and README.md — regenerated from artifacts.

Stdlib 可靠性门控: 251 通过 / 0 失败 / 0 跳过 / 251 总计 · 来源: artifacts/stdlib/stdlib_reliability_status.v1.json

正常运行

Epistemic core — Knowledge[T] with GUM propagation, provenance tracking, and compile-time confidence gates (vancomycin fixtures)
Self-hosted compiler — Fixed-point verified (cycle parity=true); 172 self-hosted source files
Algebra — Clifford, Cayley-Dickson, octonions — 168 theorem verified computationally
Native codegen — Linux x86-64 ELF plus checked macOS artifact lanes via bin/souc; PE/COFF backend exists for cross-compile
Core stdlib gate — 251 / 251 stdlib reliability tests pass (2026-05-12)
Optimizer — 1,000+ e-graph rewrite rules with FAIL=0 in optimizer tests
Language server — LSP smoke gate pass; hover, defs, refs, rename, formatting, semantic tokens
Closure literals — Named function refs and closure tests in tests/run-pass/ (see README resolved list)

脚手架/草图

Theorem prover — Large arena and data structures; full inference logic not complete
Epistemic modules — Many modules are signatures with minimal bodies (~70% scaffolding)
Neural networks — Quaternion/octonion NN lanes exist but are not all end-to-end stable
Genomics — Several files are stubs disabled on parser limitations
Async runtime — Self-hosted async tests pass per KNOWN_LIMITATIONS.md; broader runtime integration still partial
Geometry engine — Extended geometry paths disabled; core engine partial

缺失

Epistemic ODE solver (general RHS) — Exponential decay works; general RHS needed for full PBPK epistemic integration
Ontology federation — Local ontology work exists; 15M-term federated query not implemented
GPU CLI entry point — PTX/GPU codegen exists in-tree; no complete default CLI path (gpu/lib.sio stub)
Windows pre-built binary — PE/COFF backend is production-grade; no checked .exe shipped in this checkout
AArch64 native-v2 parity — Apple Silicon support uses checked Mach-O artifact lane; native-v2 aarch64 still preview-grade
Checked launcher REPL — README Known Limitations: bin/souc does not expose repl; separate REPL beta exists outside default lane

GPU path

Kernel syntax is integrated into the language — a Sounio function annotated with GPU compiles to PTX via a backend bridge. The world's first GUM uncertainty propagation through a GPU WMMA kernel is verified in tests/stdlib/gpu/ (13/13 gate PASS on CUDA hardware). The end-to-end driver path requires a CUDA-capable host — gpu/lib.sio is currently an empty stub.

Test gate snapshot

Stdlib reliability gate: 251 pass / 0 fail / 0 skip across 251 tests. See /proof §4.

Getting started Honest status → Language guide (docs) →

What the language does.What it doesn’t. Exactly.

Confidence survives every operation.

Knowledge<T>

Validated<T>

Unobserved<T>

Intervention<T> & Counterfactual<T>

T

Quantity<T, Unit>

Knowledge<T>

Knowledge<T, Provenance>

Algebra declarations

Units & dimensional analysis

Measure

Propagate

Evaluate

Act

Side effects live in the signature.

Built-in effects

The subset rule

Linear structs

The compiler compiles itself.

Source + Types

IR + Effects

Backend Select

Runtime Evidence

Pipeline stages

E-graph optimizer

Common commands

What is wired. What is not.

Runtime honesty (synced)

正常运行

脚手架/草图

缺失

GPU path

Test gate snapshot

What the language does.
What it doesn’t. Exactly.