What this route is arguing
This route is here to prove that Sounio can expose a real acceleration lane, emit a real artifact, and survive the kind of runtime pressure that destroys weak systems claims.
A bounded GPU surface with PTX emission, attested lanes, and public commands a reviewer can inspect.
A skeptical reader should ask
A skeptical reader should be able to ask for the command, the emitted PTX, the runtime boundary, and the exact point where the public claim stops.
Boundary
The claim is not total GPU completeness. The claim is a bounded GPU surface that is stronger because it is inspectable.
This route is not here to imply that Sounio already owns every GPU backend or every runtime lane. It is here to make a much tighter argument: Sounio should be able to accelerate a lane without turning into mythology.
That matters because most language websites wave at GPU ambition. They rarely show the exact commands, the exact artifact, and the exact boundary in the same place.
The claim is not “all GPU programming is solved.” The claim is that a language serious about scientific software must be able to expose a bounded acceleration surface and keep its story coherent under scrutiny.
This is why the GPU route belongs on the storefront. It is one of the clearest places where Sounio either proves it can survive contact with hard systems work, or fails publicly.
If the answer to those questions is weak, this route should not be on the homepage at all.
kernel fn vector_add(n: i64) with GPU {
}

kernel fn scale_vector(factor: f64, n: i64) with GPU, Div {
}

fn main() with GPU, IO {
    let grid = (16, 1, 1)
    let block = (64, 1, 1)
    perform GPU.launch(vector_add, grid, block)(1024)
    perform GPU.launch(scale_vector, grid, block)(2.0, 1024)
    perform GPU.sync()
}
This surface is intentionally modest. The point is to show a lane that can be named, checked, lowered, and inspected, not to pretend the language has already collapsed every GPU concern into one elegant abstraction.
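The launch geometry above can be checked with plain arithmetic. The Python sketch below uses the conventional 1D global-index scheme (blockIdx * blockDim + threadIdx) to show that a grid of 16 blocks of 64 threads covers exactly the 1024 elements passed to the kernels; how Sounio's runtime actually maps threads is an assumption here, not something this page documents.

```python
# Illustrative check of the grid/block shapes used in the example above.
grid = (16, 1, 1)
block = (64, 1, 1)
n = 1024

total_threads = grid[0] * block[0]
assert total_threads == n  # one thread per element

def global_index(block_idx, thread_idx):
    # Conventional CUDA-style 1D indexing: blockIdx.x * blockDim.x + threadIdx.x
    return block_idx * block[0] + thread_idx

covered = {global_index(b, t) for b in range(grid[0]) for t in range(block[0])}
print(covered == set(range(n)))  # True: full coverage, no gaps or overlap
```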
export SOUC_GPU_BIN="$(pwd)/artifacts/omega/souc-bin/souc-linux-x86_64-gpu"
export SOUNIO_STDLIB_PATH="$(pwd)/stdlib"
"$SOUC_GPU_BIN" check examples/gpu.sio
"$SOUC_GPU_BIN" build examples/kernel_matmul.sio --backend gpu -o /tmp/kernel_matmul.ptx
That is the standard of proof this route should live under: commands you can run, artifacts you can read, and a bounded story about what the result means.
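"Artifacts you can read" is concrete for PTX: an emitted module declares each kernel with a `.visible .entry` directive, which a reviewer can scan for. The sketch below parses a hand-written PTX fragment; the fragment is illustrative only, not actual souc output, and the entry names merely mirror the example kernels above.

```python
import re

# A minimal, hand-written stand-in for an emitted PTX module (not souc output).
sample_ptx = """
.version 8.0
.target sm_80
.visible .entry vector_add(.param .u64 n) { ret; }
.visible .entry scale_vector(.param .f64 factor, .param .u64 n) { ret; }
"""

# PTX kernels are declared as ".visible .entry <name>(...)"; list the names.
entries = re.findall(r"\.visible\s+\.entry\s+(\w+)", sample_ptx)
print(entries)  # ['vector_add', 'scale_vector']
```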
GPU work is where a lot of language marketing goes to die.
It is easy to sketch intrinsics, mention tensor cores, or promise multiple backends. It is much harder to keep a self-hosted compiler, a runtime path, a public command surface, and a storefront narrative aligned while real bugs are being found in stack alignment, launch parameter layout, PTX module loading, or runtime dispatch.
That is exactly why this route is useful. It demonstrates pressure. The GPU surface is valuable not only because it accelerates compute, but because it forces the language to reveal whether its honesty survives code generation and runtime contact.
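The "launch parameter layout" failure mode mentioned above is easy to make concrete: if the host packs kernel arguments in the wrong order or width, the kernel reads well-formed bytes that mean nothing. This is a hedged Python sketch using `struct`; Sounio's actual argument ABI is not documented here, and the (f64, i64) layout is taken from the scale_vector signature in the example.

```python
import struct

# Pack (factor: f64, n: i64) as scale_vector's signature suggests...
good = struct.pack("<dq", 2.0, 1024)
# ...and the same values with the fields swapped: same byte count, wrong layout.
bad = struct.pack("<qd", 1024, 2.0)

factor, n = struct.unpack("<dq", good)
print(factor, n)  # 2.0 1024

# Reinterpreting the mis-packed buffer yields syntactically valid garbage.
wrong = struct.unpack("<dq", bad)
print(wrong != (2.0, 1024))  # True
```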
This route does not claim:
- Coverage of every backend target, such as cuda-sm80 or ROCm rocm-gfx942
- That build --backend gpu exercises everything under self-hosted/gpu/, including Metal, SPIR-V, tensor, and runtime layers
- That the gpu.* intrinsic sketches are the checked public surface; they are still implementation work
The honest claim is narrower and therefore stronger: there is a real acceleration surface here, and it is bounded by artifacts instead of wishful copy.