Lexer & Parser

How `.sio` source becomes an AST: Logos tokens, spans, and the recursive descent parser.

The compiler frontend turns .sio source text into an AST (Abstract Syntax Tree) in two stages:

  1. Lexing: UTF-8 text -> tokens (with spans + original text)
  2. Parsing: tokens -> AST (with node ids + span map)

Lexer (Tokenization)

Code: crates/souc/src/lexer/

The lexer is built with Logos and produces a Vec<Token>.

Key files:

  • crates/souc/src/lexer/mod.rs — lex(...)
  • crates/souc/src/lexer/tokens.rs — Token, TokenKind (Logos rules)

Output: Token + Span

Tokens preserve both kind and source location:

pub struct Token {
    pub kind: TokenKind,
    pub span: Span,
    pub text: String,
}

Token Kinds

TokenKind contains keywords, operators, literals, and doc comments. A few notable categories:

  • Keywords: fn, let, var, struct, enum, trait, impl, module, use, pub, effect, handler, with, …
  • Operators: &&, ||, ==, !=, <=, >=, <<, >>, …
  • Literals:
    • ints/floats (including exponent notation)
    • strings and C strings (c"...") for FFI
    • unit literals like 500_mg (lexed as a single token)
  • Doc comments: ///, //!, /** ... */, /*! ... */

Lexer Control Flow

lex(...) loops over Logos results and converts them to Token values, always appending an EOF token:

pub fn lex(source: &str) -> Result<Vec<Token>> {
    let mut tokens = Vec::new();
    let mut lexer = TokenKind::lexer(source);

    while let Some(result) = lexer.next() {
        let span = lexer.span();
        let kind = match result {
            Ok(kind) => kind,
            Err(_) => {
                return Err(miette::miette!(
                    "Unexpected character at position {}: {:?}",
                    span.start,
                    &source[span.clone()]
                ));
            }
        };

        tokens.push(Token {
            kind,
            span: Span::new(span.start, span.end),
            text: source[span].to_string(),
        });
    }

    tokens.push(Token {
        kind: TokenKind::Eof,
        span: Span::new(source.len(), source.len()),
        text: String::new(),
    });

    Ok(tokens)
}

Literal Rules (Examples)

Some notable literal/token rules from TokenKind:

// Decimal integers: 42, 1_000 (hex/binary/octal forms have their own rules)
#[regex(r"[0-9][0-9_]*", priority = 2)]
IntLit,

// Floats: 3.14, 1e10, 3.14e-10
#[regex(r"[0-9][0-9_]*\.[0-9][0-9_]*([eE][+-]?[0-9]+)?")]
FloatLit,

// Strings: "hello"
#[regex(r#""([^"\\]|\\.)*""#)]
StringLit,

// C strings: c"hello" (null-terminated for FFI)
#[regex(r#"c"([^"\\]|\\.)*""#, priority = 2)]
CStringLit,

// Integer unit literals: 500_mg (float forms like 10.5_mL use a separate rule)
#[regex(r"[0-9][0-9_]*_[a-zA-Z][a-zA-Z0-9_/]*", priority = 3)]
IntUnitLit,

Unit Literals

Unit literals are lexed as a single token (e.g. 500_mg), which makes later unit handling simpler:

let dose = 500_mg + 200_mg
let conc = 10.5_mg / 5.0_mL

Spans

Every token carries a span. Spans are used by diagnostics (miette and codespan-reporting) to provide precise error locations.

Parser (AST Construction)

Code: crates/souc/src/parser/

The parser is recursive descent for statements/items and uses Pratt parsing for expressions (precedence + associativity).

Key files:

  • crates/souc/src/parser/mod.rs — parser implementation
  • crates/souc/src/parser/errors.rs — parser diagnostics
  • crates/souc/src/parser/recovery.rs — recovery strategies

Key Features

  • Disambiguates blocks vs struct literals (context-sensitive parsing).
  • Handles nested generics by splitting >> when needed (so Option<Box<T>> can parse).
  • Attaches doc comments to items so they can flow through AST/HIR and tooling.
  • Performs limited error recovery to avoid diagnostic cascades.

The Parser Struct (Shape)

pub struct Parser<'a> {
    tokens: &'a [Token],
    pos: usize,
    id_gen: IdGenerator,
    allow_struct_literals: bool,
    node_spans: HashMap<NodeId, Span>,
    pending_gt: bool, // used to split `>>` for nested generics
    source: &'a str,
}

Entry Point: parse_program

At a high level, parsing:

  • collects file-level inner docs
  • optionally parses a module ...; declaration
  • parses items until EOF

pub fn parse(tokens: &[Token], source: &str) -> Result<Ast> {
    let mut parser = Parser::with_source(tokens, source);
    parser.parse_program()
}

Output: AST + Span Map

The root AST stores:

  • items: top-level declarations
  • module_name: optional module path
  • node_spans: NodeId -> Span for later passes

AST (Syntax-Level IR)

Code: crates/souc/src/ast/mod.rs

The root:

pub struct Ast {
    pub module_name: Option<Path>,
    pub items: Vec<Item>,
    pub inner_doc: Option<String>,
    pub node_spans: HashMap<NodeId, Span>,
}

Top-level items include functions/types/modules, but also domain constructs such as unit declarations and scientific DSL nodes:

pub enum Item {
    Function(FnDef),
    Struct(StructDef),
    Enum(EnumDef),
    Trait(TraitDef),
    Impl(ImplDef),
    TypeAlias(TypeAliasDef),
    Effect(EffectDef),
    Handler(HandlerDef),
    Import(ImportDef),
    Export(ExportDef),
    Extern(ExternBlock),
    Global(GlobalDef),
    // Domain and scientific DSL
    OntologyImport(OntologyImportDef),
    AlignDecl(AlignDef),
    OdeDef(OdeDef),
    PdeDef(PdeDef),
    CausalModel(CausalModelDef),
    Unit(UnitDef),
    Module(ModuleDef),
}
