TypeScript library for handling source code strings without having to deal with the intricacies of JavaScript's UTF-16 encoding.
# SourceText

A sane, UTF-16-safe string wrapper designed for parsing source code, tracking line numbers, and generating CLI error messages.

Think of it as a fat wrapper around a string that also knows about the string's structure, such as where its lines begin.

- makes the original string easy to traverse without errors by introducing a character abstraction: the `CodePoint` type and its position within the `SourceText`, called `CodePointIndex`
- tracks where lines start (handling platform-specific line endings such as `\r\n`)
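The two points above can be illustrated with a minimal sketch. It is not the library's implementation: `toCodePoints` and `lineStarts` are hypothetical helpers, the type aliases only mirror the names used in this README, and treating a lone `\r` as a line break is an assumption. It shows how surrogate pairs collapse into single code points and how line starts can then be indexed by code point position rather than by UTF-16 offset.

```typescript
// Hypothetical names for illustration; not the library's actual API.
type CodePoint = number; // a Unicode code point, e.g. 0x1F600
type CodePointIndex = number; // index into the code point array, not into the JS string

// Decode a JS string into code points, combining UTF-16 surrogate pairs.
function toCodePoints(text: string): CodePoint[] {
  const out: CodePoint[] = [];
  for (let i = 0; i < text.length; i++) {
    const hi = text.charCodeAt(i);
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < text.length) {
      const lo = text.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        out.push(((hi - 0xd800) << 10) + (lo - 0xdc00) + 0x10000);
        i++; // skip the low surrogate that was just consumed
        continue;
      }
    }
    out.push(hi); // BMP character or lone surrogate, kept as-is
  }
  return out;
}

// Index of the first code point of every line; "\r\n" counts as one break.
// Assumption: a lone "\r" is also treated as a line break.
function lineStarts(cps: CodePoint[]): CodePointIndex[] {
  const starts: CodePointIndex[] = [0];
  for (let i = 0; i < cps.length; i++) {
    const cp = cps[i];
    if (cp === 0x0d) {
      if (cps[i + 1] === 0x0a) i++; // consume the "\n" of "\r\n"
      starts.push(i + 1);
    } else if (cp === 0x0a) {
      starts.push(i + 1);
    }
  }
  return starts;
}

// "a", an emoji written as a surrogate pair, "\r\n", "b", "\n", "c"
const cps = toCodePoints("a\uD83D\uDE00\r\nb\nc");
const starts = lineStarts(cps);
```

The input string has 8 UTF-16 code units but only 7 code points, so indexing by code point keeps positions stable regardless of which characters need surrogate pairs.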
# Core: SourceText vs SourceRegion

The most important thing to remember is the difference between `SourceText` and `SourceRegion`.

- `SourceText`: The heavy, immutable root object. Basically a fat wrapper for a JS string. It ingests the raw string, resolves JS's UTF-16 surrogate pairs into actual code points, and indexes all the line breaks. You create it only once per file.
- `SourceRegion`: A region of the source code (think of it as a string slice covering a large part of the original source). This is what parsers and lexers work with. Most of the time you'll have exactly one `SourceRegion` spanning the whole source, but for certain languages it is advantageous to partition the code into several such large regions.
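One way to picture the relationship is a region that is just a pair of indices into a shared, immutable root. The sketch below rests on assumptions: `SourceTextSketch`, `SourceRegionSketch`, `asciiSource`, `wholeRegion`, and `splitRegion` are hypothetical names, and the library's real types and constructors may look quite different.

```typescript
// Hypothetical shapes for illustration only; the library's real fields may differ.
interface SourceTextSketch {
  readonly codePoints: ReadonlyArray<number>; // decoded once per file, never mutated
}

interface SourceRegionSketch {
  readonly source: SourceTextSketch; // shared root object, not a copy
  readonly start: number; // inclusive code point index
  readonly end: number; // exclusive code point index
}

// ASCII-only shortcut to keep the example short: real decoding must also
// combine UTF-16 surrogate pairs into single code points.
function asciiSource(raw: string): SourceTextSketch {
  const codePoints: number[] = [];
  for (let i = 0; i < raw.length; i++) codePoints.push(raw.charCodeAt(i));
  return { codePoints };
}

// The common case: one region spanning the whole source.
function wholeRegion(source: SourceTextSketch): SourceRegionSketch {
  return { source, start: 0, end: source.codePoints.length };
}

// Partition a region in two without copying any text.
function splitRegion(
  region: SourceRegionSketch,
  at: number
): [SourceRegionSketch, SourceRegionSketch] {
  if (at < region.start || at > region.end) {
    throw new RangeError("split point outside region");
  }
  return [
    { source: region.source, start: region.start, end: at },
    { source: region.source, start: at, end: region.end },
  ];
}

const sourceText = asciiSource("header\nbody");
const whole = wholeRegion(sourceText);
const [headRegion, bodyRegion] = splitRegion(whole, 7);
```

Because both halves point at the same root, indices stay comparable across regions, which is what makes partitioning cheap in this sketch.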

The library also supports spatial tracking of sub-regions within the source. It introduces

- a point-like `SourceLocation` abstraction (basically where a cursor could be)
- an interval-like `Span` abstraction (basically what a mouse selection could cover)

# Locations and Spans

- `SourceLocation` is basically a smart 2D coordinate equivalent to `(line, col)` that also tracks the `CodePointIndex`
- `Span` is an interval determined by its `start` and `end` `SourceLocation`s
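As a rough illustration of how a location can carry both views of a position, the sketch below derives `(line, col)` from a code point index by binary search over precomputed line starts. Everything here is an assumption made for the example: the `SourceLocationSketch` and `SpanSketch` shapes, the `locate` helper, zero-based lines and columns, and an exclusive `end`.

```typescript
// Hypothetical shapes; zero-based lines/columns and an exclusive `end` are assumptions.
interface SourceLocationSketch {
  readonly index: number; // code point index into the source
  readonly line: number; // zero-based line number
  readonly col: number; // zero-based column, counted in code points
}

interface SpanSketch {
  readonly start: SourceLocationSketch;
  readonly end: SourceLocationSketch;
}

// Find the last line start that is <= index using binary search.
// `lineStarts` must be sorted ascending and begin with 0.
function locate(lineStarts: number[], index: number): SourceLocationSketch {
  let lo = 0;
  let hi = lineStarts.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1; // round up so the loop always makes progress
    if (lineStarts[mid] <= index) lo = mid;
    else hi = mid - 1;
  }
  return { index, line: lo, col: index - lineStarts[lo] };
}

// Line starts for the text "let x\nlet y\n" (each line is 6 code points long).
const exampleLineStarts = [0, 6, 12];

// A span covering the second "let".
const letSpan: SpanSketch = {
  start: locate(exampleLineStarts, 6),
  end: locate(exampleLineStarts, 9),
};
```

Storing the index next to the 2D coordinate means span length and ordering checks are plain integer arithmetic, while `line` and `col` are ready for display.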
# Rendering CLI Errors

A secondary feature is `function renderSpan(region: SourceRegion, span: Span, contextLines = 1): LineView[]`, which renders a span of source code together with its neighbouring lines, for example:
```
7 | ◊foo
8 | item1
9 | item2 (bad indent - nested text without constructor)
^^^^
10 |
```
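The general shape of such a renderer can be sketched in a few lines. This is not `renderSpan` itself: `renderSingleLineSpan`, `padLeft`, and `repeatChar` are hypothetical helpers, the function returns plain strings rather than `LineView` values (whose shape is not described here), it handles only spans within a single line, and it assumes zero-based positions where every column occupies one terminal cell.

```typescript
// Left-pad `s` with spaces to at least `width` characters.
function padLeft(s: string, width: number): string {
  let out = s;
  while (out.length < width) out = " " + out;
  return out;
}

// Repeat `ch` exactly `n` times (returns "" when n <= 0).
function repeatChar(ch: string, n: number): string {
  let out = "";
  for (let i = 0; i < n; i++) out += ch;
  return out;
}

// Hypothetical helper: render one line with a caret underline plus context lines.
// `line`, `colStart`, and `colEnd` are zero-based; `colEnd` is exclusive.
function renderSingleLineSpan(
  lines: string[],
  line: number,
  colStart: number,
  colEnd: number,
  contextLines = 1
): string[] {
  const first = Math.max(0, line - contextLines);
  const last = Math.min(lines.length - 1, line + contextLines);
  const width = String(last + 1).length; // widest 1-based line number shown
  const out: string[] = [];
  for (let i = first; i <= last; i++) {
    out.push(padLeft(String(i + 1), width) + " | " + lines[i]);
    if (i === line) {
      // Blank gutter, then carets aligned under the highlighted columns.
      out.push(
        repeatChar(" ", width) + " | " +
        repeatChar(" ", colStart) + repeatChar("^", colEnd - colStart)
      );
    }
  }
  return out;
}

const rendered = renderSingleLineSpan(["◊foo", "  item1", "    item2"], 2, 4, 8);
console.log(rendered.join("\n"));
```

A real renderer would also need to handle multi-line spans and wide or combining characters, which this sketch deliberately leaves out.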
# Warning

Performance is not currently a priority. However, the library is written so that its internal representation can be swapped out without affecting clients.