Move language files into dedicated folder

This commit is contained in:
Yura Dupyn 2026-02-07 10:43:30 +01:00
parent 3d1cd89067
commit 1b406899e0
15 changed files with 7 additions and 343 deletions


@@ -1,336 +0,0 @@
## Goal
Implement a correct parser for the language described in `SYNTAX.md`, producing the existing AST types (`Expr`, `Pattern`, `ProductPattern`, etc.).
Code quality is **not** the primary concern.
Correctness, clarity, and reasonable error messages are.
---
## Overall architecture
The parser is split into **two stages**:
1. **Lexing (tokenization)**
Converts source text into a stream of tokens, each with precise source location info.
2. **Parsing**
Consumes the token stream and constructs the AST using recursive-descent parsing.
This split is deliberate and should be preserved.
---
## Stage 1: Lexer (Tokenizer)
### Purpose
The lexer exists to:
* Normalize the input into a small set of token types
* Track **line / column / offset** precisely
* Make parsing simpler and more reliable
* Enable good error messages later
The lexer is intentionally **simple and dumb**:
* No semantic decisions
* No AST construction
* Minimal lookahead
---
### Unicode handling
The input may contain arbitrary Unicode (including emoji) inside identifiers and strings.
**Important rule**:
* Iterate over Unicode *code points*, not UTF-16 code units.
In TypeScript:
* Use `for (const ch of input)` or equivalent
* Do **not** index into strings with `input[i]`
Column counting:
* Increment column by **1 per code point**
* Exact visual width is not required
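As a non-authoritative sketch, code-point iteration with this column-counting rule could look like the following (`positions` is an illustrative helper, not part of the codebase):

```ts
// Sketch: walk the input by Unicode code points and track 1-based
// line/column, incrementing the column once per code point.
type Step = { ch: string; line: number; column: number };

function positions(input: string): Step[] {
  const out: Step[] = [];
  let line = 1;
  let column = 1;
  for (const ch of input) { // code points: "😀" is a single step
    out.push({ ch, line, column });
    if (ch === "\n") {
      line += 1;
      column = 1;
    } else {
      column += 1; // one column per code point; visual width is ignored
    }
  }
  return out;
}
```

Indexing with `input[i]` would instead split an emoji into its two surrogate halves, which is exactly what the rule above forbids.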
---
### Source positions and spans
All tokens must carry precise location information.
Suggested types (can be adjusted):
```ts
type Position = {
  offset: number; // code-point index from start of input
  line: number;   // 1-based
  column: number; // 1-based
};

type Span = {
  start: Position;
  end: Position;
};
```
Each token has a `span`.
---
### Token types
Suggested minimal token set:
```ts
type Token =
  | { kind: "number"; value: number; span: Span }
  | { kind: "string"; value: string; span: Span }
  | { kind: "identifier"; value: string; span: Span }
  | { kind: "keyword"; value: Keyword; span: Span }
  | { kind: "symbol"; value: Symbol; span: Span }
  | { kind: "eof"; span: Span };
```
Where:
```ts
type Keyword = "let" | "fn" | "match" | "apply" | "=" | "!" | "|";
type Symbol = "#" | "$" | "(" | ")" | "{" | "}" | "," | ".";
```
Notes:
* Operators like `+`, `==`, `<=`, `*` are **identifiers**
* `=` is treated as a keyword (same for `|`)
* Identifiers are parsed first, then checked against keywords
---
### Lexer responsibilities
The lexer should:
* Skip whitespace (spaces, tabs, newlines)
* Track line and column numbers
* Emit tokens with correct spans
* Fail immediately on:
  * Unterminated string literals
  * Invalid characters
The lexer **should not**:
* Attempt error recovery
* Guess intent
* Validate grammar rules
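To make the shape concrete, here is a minimal loop in that spirit. It handles only whitespace, symbols, and identifier-like runs; `lexSketch` and its character classes are illustrative, and real code would also produce numbers, strings, keywords, and spans:

```ts
// Sketch of the lexer's main loop: skip whitespace, emit single-char
// symbols, treat any other run of non-space, non-symbol code points as
// an identifier/operator. Strings are deliberately out of scope here.
function lexSketch(input: string): string[] {
  const symbols = "#$(){},.";
  const isSpace = (ch: string) => ch === " " || ch === "\t" || ch === "\n";
  const cps = [...input]; // code points, not UTF-16 units
  const tokens: string[] = [];
  let i = 0;
  while (i < cps.length) {
    const ch = cps[i];
    if (isSpace(ch)) { i++; continue; }
    if (symbols.includes(ch)) { tokens.push(ch); i++; continue; }
    if (ch === '"') throw new Error("strings not handled in this sketch");
    const start = i;
    while (i < cps.length && !isSpace(cps[i]) && !symbols.includes(cps[i])) i++;
    tokens.push(cps.slice(start, i).join(""));
  }
  return tokens;
}
```

Note how operators like `==` come out as plain identifier-style tokens, matching the rule above.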
---
## Stage 2: Parser
### Parsing strategy
Use **recursive-descent parsing**.
The grammar is:
* Context-free
* Non-left-recursive
* No precedence rules
* No implicit associativity
This makes recursive descent ideal.
---
### Parser state
The parser operates over:
```ts
class Parser {
  tokens: Token[];
  pos: number;
}
```
Helper methods are encouraged:
```ts
peek(): Token
advance(): Token
matchKeyword(kw: Keyword): boolean
matchSymbol(sym: Symbol): boolean
expectKeyword(kw: Keyword): Token
expectSymbol(sym: Symbol): Token
error(message: string, span?: Span): never
```
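The helpers above might be implemented along these lines; the token shape is simplified for illustration and `MiniParser` is not the real class:

```ts
// Simplified token and parser-state sketch for the helper methods.
type Tok = { kind: string; value?: string };

class MiniParser {
  constructor(private tokens: Tok[], private pos = 0) {}

  peek(): Tok {
    // Never run past the end: report eof instead.
    return this.tokens[this.pos] ?? { kind: "eof" };
  }

  advance(): Tok {
    const t = this.peek();
    if (t.kind !== "eof") this.pos++;
    return t;
  }

  matchKeyword(kw: string): boolean {
    const t = this.peek();
    if (t.kind === "keyword" && t.value === kw) {
      this.advance();
      return true;
    }
    return false;
  }

  expectKeyword(kw: string): Tok {
    const t = this.peek();
    if (!this.matchKeyword(kw)) {
      throw new Error(`expected keyword '${kw}', found '${t.kind}'`);
    }
    return t;
  }
}
```

`match*` consumes on success and returns a boolean; `expect*` consumes or throws. Keeping both avoids scattering ad-hoc peeks through the grammar functions.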
---
### Error handling
Error recovery is **not required**.
On error:
* Throw a `ParseError`
* Include:
  * A clear message
  * A span pointing to the offending token (or best approximation)
The goal is:
* One good error
* Accurate location
* No cascading failures
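One possible `ParseError` shape, reusing the suggested `Position`/`Span` types; rendering the location into the message is an assumption, not a requirement:

```ts
// Sketch: an error type carrying a message plus an optional span,
// folded into the message text for quick diagnostics.
type Position = { offset: number; line: number; column: number };
type Span = { start: Position; end: Position };

class ParseError extends Error {
  constructor(message: string, readonly span?: Span) {
    super(
      span ? `${message} at ${span.start.line}:${span.start.column}` : message,
    );
    this.name = "ParseError";
  }
}
```

Since recovery is out of scope, the first `ParseError` simply propagates out of the top-level parse call.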
---
### Expression parsing
There is **no precedence hierarchy**.
`parseExpr()` should:
* Look at the next token
* Dispatch to the correct parse function based on:
  * keyword (e.g. `let`, `fn`, `match`, `apply`)
  * symbol (e.g. `$`, `#`, `(`, `{`)
  * identifier (e.g. top-level function call)
Order matters.
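The branching shape of `parseExpr` can be sketched as a single dispatch on the head token. Each branch is reduced here to the name of the sub-parser it would call; the names, and the mapping for `{`, are illustrative assumptions:

```ts
// Sketch: dispatch on the head token of an expression.
type Head = { kind: string; value?: string };

function dispatchExpr(t: Head): string {
  if (t.kind === "keyword") {
    switch (t.value) {
      case "let": return "parseLet";
      case "fn": return "parseFn";
      case "match": return "parseMatch";
      case "apply": return "parseApply";
    }
  }
  if (t.kind === "symbol") {
    switch (t.value) {
      case "$": return "parseVariable";
      case "#": return "parseTag";
      case "(": return "parseTuple";
      case "{": return "parseBrace"; // placeholder name
    }
  }
  if (t.kind === "identifier") return "parseCall";
  if (t.kind === "number" || t.kind === "string") return "parseLiteral";
  throw new Error(`expression cannot start with ${t.kind} '${t.value ?? ""}'`);
}
```

Tokens that cannot start an expression (e.g. `)` or a bare `=`) fall through every branch and raise the error immediately.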
---
### Important parsing rules
#### Variable use
```txt
$x
```
* `$` immediately followed by identifier
* No whitespace allowed
#### Tag expressions
```txt
#foo
#foo expr
```
Parsing rule:
* After `#tag`, look at the next token
* If the next token can start an expression **and is not a terminator** (`)`, `}`, `,`, `|`, `.`):
  * Parse a `tagged-expr`
* Otherwise:
  * Parse a `tag-expr`
This rule is intentional and should be implemented directly.
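A direct transcription of this rule might look as follows. The set of tokens that "can start an expression" is inferred from the token kinds above and should be checked against `SYNTAX.md`:

```ts
// Sketch: after consuming `#tag`, classify the next token.
type Next = { kind: string; value?: string };

// Terminators from the rule above; `|` is a keyword, the rest are symbols.
const TERMINATORS = new Set([")", "}", ",", "|", "."]);

function tagForm(next: Next): "tag-expr" | "tagged-expr" {
  if (next.value !== undefined && TERMINATORS.has(next.value)) {
    return "tag-expr";
  }
  const canStartExpr =
    next.kind === "number" ||
    next.kind === "string" ||
    next.kind === "identifier" ||
    (next.kind === "keyword" &&
      ["let", "fn", "match", "apply"].includes(next.value ?? "")) ||
    (next.kind === "symbol" &&
      ["$", "#", "(", "{"].includes(next.value ?? ""));
  return canStartExpr ? "tagged-expr" : "tag-expr";
}
```

Note that `eof` falls into neither category and correctly yields a bare `tag-expr`.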
---
#### Tuples vs grouping
Parentheses always construct **tuples**.
```txt
()
(123)
(1, 2, 3)
```
Parentheses are **not** used for grouping expressions, so `(123)` is a single-element tuple, NOT the same as `123`.
---
#### Lists with separators
Many constructs use:
```txt
list-sep-by(p, sep)
```
This allows:
* Empty lists
* Optional leading separator
* Optional trailing separator
Implement a reusable helper that:
* Stops at a known terminator token
* Does not allow repeated separators without elements
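One way to shape the helper, here over a flat array of token strings for illustration; the real version would operate on the parser state with token predicates instead:

```ts
// Sketch: parse `list-sep-by(p, sep)` up to a known terminator.
// Empty lists and one optional leading/trailing separator are allowed;
// two separators in a row with no element between them are rejected.
function listSepBy<T>(
  tokens: string[],
  parseItem: (tok: string) => T,
  sep: string,
  terminator: string,
): { items: T[]; rest: string[] } {
  const items: T[] = [];
  let i = 0;
  if (tokens[i] === sep) i++; // optional leading separator
  while (i < tokens.length && tokens[i] !== terminator) {
    items.push(parseItem(tokens[i]));
    i++;
    if (tokens[i] === sep) {
      i++; // consumed; a trailing separator is fine if terminator follows
      if (tokens[i] === sep) throw new Error("repeated separator");
    } else if (tokens[i] !== terminator) {
      throw new Error(`expected '${sep}' or '${terminator}'`);
    }
  }
  return { items, rest: tokens.slice(i) }; // rest starts at the terminator
}
```

Stopping at the terminator without consuming it lets the caller `expectSymbol` the closing token and attach a span to it.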
---
### Parsing patterns
Patterns are parsed only in specific contexts:
* `match` branches
* `let` bindings
* lambda parameters
There are **two distinct pattern parsers**:
* `parsePattern()` — full patterns (including tags)
* `parseProductPattern()` — no tags allowed
These should be separate functions.
---
### AST construction
Parser functions should construct AST nodes directly, matching the existing AST types exactly.
If necessary, spans may be:
* Stored directly on AST nodes, or
* Discarded after parsing
Either is acceptable.
---
## Division of responsibility
**Lexer**:
* Characters → tokens
* Unicode-safe
* Tracks positions
**Parser**:
* Tokens → AST
* Grammar enforcement
* Context-sensitive decisions
* Error reporting
Do **not** merge these stages.
---
## Final notes
* Favor clarity over cleverness
* Favor explicit structure over abstraction
* Assume the grammar in `SYNTAX.md` is authoritative
* It is acceptable to tweak helper types or utilities if needed
Correct parsing is the goal. Performance and elegance are not.


@@ -1,10 +1,8 @@
-import { CARRIAGE_RETURN, char, NEW_LINE, SPACE, TAB } from './source_text';
-import type { SourceText, Span, SourceLocation, CodePoint, StringIndex, CodePointIndex } from './source_text';
+import { CARRIAGE_RETURN, char, NEW_LINE } from './source_text';
+import type { Span, CodePoint } from './source_text';
 import { isDigit, isWhitespace, scanNumber, scanString } from './cursor';
-import type { Cursor, CursorState, GenericScanError, NumberError, StringError } from './cursor';
-import { Result } from '../result';
-import { Expr } from 'src/value';
+import type { Cursor, GenericScanError, NumberError, StringError } from './cursor';
 export function skipWhitespaceAndComments(cursor: Cursor): number {
   let totalConsumed = 0;


@@ -9,3 +9,5 @@ let {
}
}


@@ -15,7 +15,7 @@ npm install -D sass-embedded
 npx ts-node src/parser/cursor.test.ts
-npx ts-node src/debug/repl.ts
+npx ts-node src/lang/debug/repl.ts
-npx ts-node src/debug/repl.ts tmp_repl/test.flux
+npx ts-node src/lang/debug/repl.ts tmp_repl/test.flux