On this page
Language

What the language does.
What it doesn’t. Exactly.

Four things Sounio gives you that other languages don’t. Anything not on this page is either scaffolding or missing — /proof has the honest status.

Sounio is a statically typed, natively compiled systems language. Its contribution relative to Rust (ownership / linearity), Koka (algebraic effects), and Lean (verified computation) is the epistemic dimension: gradual confidence typing, GUM uncertainty propagation, and confidence gates — all enforced by the compiler, not by convention or runtime library.

A

Confidence survives every operation.

Every value in Sounio can carry the evidence behind it. The type system enforces this at compile time — not as a lint, not as a runtime check.

Knowledge<T>

The base epistemic type. Wraps any value with a confidence level ε ∈ [0, 1], an uncertainty interval, and a provenance string. GUM uncertainty propagates automatically through arithmetic.

Knowledge<f64>, Knowledge<mg>

Validated<T>

A distinct type that can only be produced by passing a formal confidence gate. Cannot be confused with raw measurements. Confidence-gated execution branches on Validated vs. raw.

fn prescribe(d: Validated<f64, ε >= 0.82>)

Unobserved<T>

Values remain quantum-indeterminate until a comparison or I/O boundary collapses them. Bayesian and quantum measurement semantics live in the type system, not in runtime libraries.

fn observe(x: Unobserved<f64>) -> bool with Observe

Intervention<T> & Counterfactual<T>

Compiler-level types that carry causal confidence. Intervention<T> corresponds to Pearl's do-operator; Counterfactual<T> encodes hypothetical reasoning under the same discipline as observational data.

do(X = x): Intervention<f64>

Knowledge<T> progression — bare numeric to fully epistemic

  1. Tier 1

    T

    Raw scalar or structured value with no attached evidence metadata.

  2. Tier 2

    Quantity<T, Unit>

    Value promoted to dimension-aware representation with unit constraints.

  3. Tier 3

    Knowledge<T>

    Uncertainty, confidence, and source attribution become part of the value.

  4. Tier 4

    Knowledge<T, Provenance>

    Executable value with traceability chain for audit and reproducibility.

Confidence-gated execution

confidence_gate.sio
let dose: Knowledge<f64> = Knowledge(
  500.0,
  ε=0.97,
  prov="lab_run_042"
)

// Confidence gate: only passes if ε >= 0.82
if dose.ε >= 0.82 {
  prescribe(dose)           // type: Knowledge<f64>
} else {
  request_remeasurement()
}

Algebra declarations

Declare algebraic laws on types. The e-graph optimizer uses only the rewrites your algebra permits — no unsound reordering of non-commutative ops.

algebra Octonion over f64 {
  add: commutative, associative
  mul: alternative, non_commutative
  reassociate: fano_selective
}

Units & dimensional analysis

Units are checked at compile time. A value declared as mg cannot be silently coerced to kg or dimensionless. Refinement predicates (e.g. x > 0) work today; SMT integration is planned.

let dose: mg = 500.0
let weight: kg = 78.5
// let wrong = dose + weight  ← compile error: mg + kg

GUM propagation — uncertainty flows automatically

1

Measure

Capture value + uncertainty + confidence from instrument or model.

2

Propagate

Composition rules carry uncertainty through transformations.

3

Evaluate

Confidence thresholds and effect policies decide admissibility.

4

Act

Execution continues only when evidence satisfies gate criteria.

B

Side effects live in the signature.

Side effects are visible in function signatures. Linear types enforce exactly-once resource consumption. Neither requires runtime overhead.

Built-in effects

with IO File, network, stdin/stdout — any real-world I/O.
with Panic Controlled abort path. Functions without Panic cannot panic.
with Observe Collapses Unobserved<T> to a concrete value — separates measurement from computation.
with Async Async/await, channels, spawn. Phase 1 wired (channel + send/recv + spawn); full scheduler pending.
with GPU Kernel launch, memory transfer. Syntax integrated; end-to-end path requires CUDA driver.
with Probabilistic Monte Carlo sampling, Bayesian updates. In stdlib; effect annotation enforces separation from deterministic code.

The subset rule

A function's effects must be a subset of its caller's declared effects. A function without with IO cannot call a function that has it. This is checked by the type checker, not at runtime.

Linear structs

A linear struct must be consumed exactly once — it cannot be copied, dropped silently, or moved into two branches. File handles, network connections, and cryptographic contexts are the canonical use case.

effects_and_linear.sio
// Effects are part of the signature — callee effects must
// be a subset of the caller's declared effects.
fn read_sensor() -> Knowledge<f64> with IO, Observe {
    let raw = read_port(0x3F8)          // requires IO
    return observe(raw)                  // requires Observe
}

// Linear struct: the handle must be consumed exactly once.
linear struct FileHandle { fd: i32 }

fn close(h: FileHandle) with IO {
    syscall_close(h.fd)                 // h is consumed here
}
C

The compiler compiles itself.

The compiler is self-hosted: written in Sounio, it compiles Sounio. The bootstrap chain is verified — stage 2 and stage 3 produce byte-for-byte identical binaries.

Compilation pipeline

Step 1

Source + Types

Parser, refinement checks, and uncertainty-aware typing.

Step 2

IR + Effects

Lowering and effect analysis with provenance retention.

Step 3

Backend Select

Native and GPU target selection with capability checks.

Step 4

Runtime Evidence

Execution outputs with confidence and traceability context.

Pipeline stages

Source → Lexer → Parser Recursive-descent parser. No external parser generator.
AST → Check Bidirectional type inference + effect analysis. Knowledge<T> subtyping, confidence gates, linear resource tracking.
Check → HIR → SIR High-level IR, then Static IR (SSA form). Epistemic metadata preserved through lowering.
SIR → HLIR (e-graph) 1,000+ algebraic rewrite rules. Algebra declarations constrain which rewrites apply.
HLIR → ELF (x86-64) Native code generation. No LLVM. Bootstrap chain: C → gen1 → gen2 → gen3 (fixed point).

Reproducing the self-hosting fixed point

# Stage 0: C compiler compiles the first Sounio compiler
./artifacts/self-hosted/souc-self-hosted-x86_64 \
    self-hosted/compiler/lean_single.sio gen1.elf

# Stage 1 → Stage 2: first self-hosted compile
./gen1.elf self-hosted/compiler/lean_single.sio gen2.elf

# Stage 2 → Stage 3: fixed-point check
./gen2.elf self-hosted/compiler/lean_single.sio gen3.elf

# Must be identical — this is the fixed point.
sha256sum gen2.elf gen3.elf

Achieved 2026-03-23. Recorded SHA-256 (CHANGELOG): 7b588877f870… (741 KB). See /proof §2 for the full verification.

E-graph optimizer

The optimizer implements equality saturation over an e-graph — the same approach used by egg/egglog but running inside a Sounio compiler written in Sounio. 1,000+ algebraic rewrite rules, all at FAIL=0 in the test gate. Algebra declarations constrain which rewrites apply per type.

Common commands

$ souc check file.sio Type-check; no output generated.
$ souc run file.sio Compile to temp ELF, execute, clean up.
$ souc compile file.sio -o out.elf Direct native compilation to ELF.
$ souc check --show-types file.sio Dump inferred types (JIT mode only).
D

What is wired. What is not.

What is and isn't wired up end-to-end. The anti-slop move is to say this explicitly before the reader discovers it.

Same contract as the homepage and README.md — regenerated from artifacts.

Stdlib 可靠性门控: 251 通过 / 0 失败 / 0 跳过 / 251 总计 · 来源: artifacts/stdlib/stdlib_reliability_status.v1.json

正常运行

  • Epistemic core — Knowledge[T] with GUM propagation, provenance tracking, and compile-time confidence gates (vancomycin fixtures)
  • Self-hosted compiler — Fixed-point verified (cycle parity=true); 172 self-hosted source files
  • Algebra — Clifford, Cayley-Dickson, octonions — 168 theorem verified computationally
  • Native codegen — Linux x86-64 ELF plus checked macOS artifact lanes via bin/souc; PE/COFF backend exists for cross-compile
  • Core stdlib gate — 251 / 251 stdlib reliability tests pass (2026-05-12)
  • Optimizer — 1,000+ e-graph rewrite rules with FAIL=0 in optimizer tests
  • Language server — LSP smoke gate pass; hover, defs, refs, rename, formatting, semantic tokens
  • Closure literals — Named function refs and closure tests in tests/run-pass/ (see README resolved list)

脚手架/草图

  • Theorem prover — Large arena and data structures; full inference logic not complete
  • Epistemic modules — Many modules are signatures with minimal bodies (~70% scaffolding)
  • Neural networks — Quaternion/octonion NN lanes exist but are not all end-to-end stable
  • Genomics — Several files are stubs disabled on parser limitations
  • Async runtime — Self-hosted async tests pass per KNOWN_LIMITATIONS.md; broader runtime integration still partial
  • Geometry engine — Extended geometry paths disabled; core engine partial

缺失

  • Epistemic ODE solver (general RHS) — Exponential decay works; general RHS needed for full PBPK epistemic integration
  • Ontology federation — Local ontology work exists; 15M-term federated query not implemented
  • GPU CLI entry point — PTX/GPU codegen exists in-tree; no complete default CLI path (gpu/lib.sio stub)
  • Windows pre-built binary — PE/COFF backend is production-grade; no checked .exe shipped in this checkout
  • AArch64 native-v2 parity — Apple Silicon support uses checked Mach-O artifact lane; native-v2 aarch64 still preview-grade
  • Checked launcher REPL — README Known Limitations: bin/souc does not expose repl; separate REPL beta exists outside default lane

GPU path

Kernel syntax is integrated into the language — a Sounio function annotated with GPU compiles to PTX via a backend bridge. The world's first GUM uncertainty propagation through a GPU WMMA kernel is verified in tests/stdlib/gpu/ (13/13 gate PASS on CUDA hardware). The end-to-end driver path requires a CUDA-capable host — gpu/lib.sio is currently an empty stub.

Test gate snapshot

Stdlib reliability gate: 251 pass / 0 fail / 0 skip across 251 tests. See /proof §4.