TypeScript library for handling source code strings without having to deal with the intricacies of JS's UTF-16 encoding.
SourceText
A sane, UTF-16-safe string wrapper designed specifically for parsing source code, tracking line numbers, and generating CLI error messages. Think of it as a fat wrapper around a string that also understands the string's structure, such as where its lines begin. It:
- makes the original string easy to traverse in an error-free way by introducing a character abstraction: the type CodePoint, and its position within the SourceText, called CodePointIndex
- tracks where lines start (handling various platform-specific weirdness like \r\n)
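To illustrate why a code point abstraction and a line-start index help, here is a minimal standalone sketch that uses only built-in JS string APIs. The helper names toCodePoints and findLineStarts are hypothetical and are not part of this library's API. The set of line breaks handled (\n, \r\n and a lone \r) is also an assumption.

```typescript
// Hypothetical helpers for illustration only; not this library's API.

/** Decode a JS string into an array of Unicode code points. */
function toCodePoints(text: string): number[] {
  // String iteration walks by code point, so surrogate pairs stay together.
  return Array.from(text, (ch) => ch.codePointAt(0)!);
}

/** Return the code point index at which each line starts. */
function findLineStarts(codePoints: number[]): number[] {
  const CR = 0x0d;
  const LF = 0x0a;
  const starts = [0];
  for (let i = 0; i < codePoints.length; i++) {
    const cp = codePoints[i];
    if (cp === CR) {
      // Treat \r\n as a single line break by skipping the LF.
      if (codePoints[i + 1] === LF) i++;
      starts.push(i + 1);
    } else if (cp === LF) {
      starts.push(i + 1);
    }
  }
  return starts;
}

const sample = "a\u{1F600}\r\nb\nc"; // U+1F600 needs two UTF-16 code units
const cps = toCodePoints(sample);
console.log(sample.length);       // 8 UTF-16 code units
console.log(cps.length);          // 7 code points
console.log(findLineStarts(cps)); // [ 0, 4, 6 ]
```

Indexing by code point means the emoji counts as one character, so the line starts are not shifted by the surrogate pair.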
Core: SourceText vs SourceRegion
The most important thing to remember is the difference between SourceText and SourceRegion.
- SourceText: The heavy, immutable root object. Basically a fat wrapper for a JS string. It ingests the raw string, normalizes JS's weird UTF-16 surrogate pairs into actual code points, and indexes all the line breaks. You only create this once per file.
- SourceRegion: A region of source code (think of it as a string slice over a large part of the original source code). This is what parsers/lexers work with. Most of the time you'll have exactly one SourceRegion spanning the whole source code, but for certain languages it is advantageous to partition the code into multiple such large regions.
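The split between a heavy root and a cheap slice could look roughly like the sketch below. All names here (SketchSourceText, SketchRegion, makeSourceText, wholeRegion, subRegion, regionToString) are hypothetical and chosen for illustration. The library's real types and constructors may well differ, and the half-open [start, end) convention is an assumption.

```typescript
// Hypothetical shapes for illustration; the library's real types may differ.

/** Root object: owns the decoded code points, built once per file. */
interface SketchSourceText {
  readonly codePoints: readonly number[];
}

/** A cheap view over the half-open range [start, end) of the root. */
interface SketchRegion {
  readonly source: SketchSourceText;
  readonly start: number;
  readonly end: number;
}

function makeSourceText(raw: string): SketchSourceText {
  // The expensive decoding step happens exactly once, here.
  return { codePoints: Array.from(raw, (ch) => ch.codePointAt(0)!) };
}

function wholeRegion(source: SketchSourceText): SketchRegion {
  return { source, start: 0, end: source.codePoints.length };
}

/** Narrow a region; offsets are relative to the parent region. */
function subRegion(region: SketchRegion, start: number, end: number): SketchRegion {
  return { source: region.source, start: region.start + start, end: region.start + end };
}

function regionToString(region: SketchRegion): string {
  // Spread is fine for a small demo; very large regions would need chunking.
  return String.fromCodePoint(...region.source.codePoints.slice(region.start, region.end));
}

const text = makeSourceText("x=\u{1F600};y=2");
const whole = wholeRegion(text);
const second = subRegion(whole, 4, 7); // indices count code points, not UTF-16 units
console.log(regionToString(second)); // y=2
```

Regions only store two indices and a reference to the root, so creating several of them does not copy or re-decode the source.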
The library also allows for spatial tracking of various sub-regions within the source. It introduces:
- a point-like SourceLocation abstraction (basically where a cursor could be)
- an interval-like Span abstraction (basically what a mouse selection could span)
Locations and Spans
- SourceLocation is basically a smart 2D coordinate equivalent to (line, col) (but it also tracks the CodePointIndex)
- Span is an interval determined by its start and end SourceLocations
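As a rough sketch of how a location can carry both the (line, col) pair and the code point index, the snippet below derives a location from an index with a binary search over precomputed line starts. The names SketchLocation, SketchSpan and locate are hypothetical, and the 0-based, end-exclusive conventions are assumptions rather than documented behavior of this library.

```typescript
// Hypothetical types for illustration; not the library's definitions.
interface SketchLocation {
  line: number;  // 0-based line number (assumption)
  col: number;   // 0-based column, counted in code points (assumption)
  index: number; // code point index within the whole source
}

interface SketchSpan {
  start: SketchLocation;
  end: SketchLocation; // treated as exclusive in this sketch
}

/** Find the last line start that is <= index via binary search. */
function locate(lineStarts: number[], index: number): SketchLocation {
  let lo = 0;
  let hi = lineStarts.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1; // round up so the loop always progresses
    if (lineStarts[mid] <= index) lo = mid;
    else hi = mid - 1;
  }
  return { line: lo, col: index - lineStarts[lo], index };
}

// Line starts for the text "ab\ncde\nf": lines begin at indices 0, 3 and 7.
const lineStarts = [0, 3, 7];
const span: SketchSpan = { start: locate(lineStarts, 3), end: locate(lineStarts, 6) };
console.log(span.start); // { line: 1, col: 0, index: 3 }
console.log(span.end);   // { line: 1, col: 3, index: 6 }
```

Here the span covers "cde" on the second line. Keeping the index alongside (line, col) lets a consumer slice the source directly without recomputing offsets.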
Rendering CLI Errors
A secondary feature is the function renderSpan(region: SourceRegion, span: Span, contextLines = 1): LineView[], which renders spans of source code as follows:
7 | ◊foo
8 | item1
9 | item2 (bad indent - nested text without constructor)
^^^^
10 |
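For intuition only, output of this general shape can be produced along the lines of the sketch below. This is not the library's renderSpan: the function renderSingleLineSpan is hypothetical, it takes plain strings instead of a SourceRegion and Span, it returns strings rather than LineView values, and it only handles spans that sit on a single line.

```typescript
// Hypothetical sketch; not the library's renderSpan implementation.
function renderSingleLineSpan(
  lines: string[],  // source already split into lines
  line: number,     // 0-based line containing the span
  colStart: number, // 0-based, inclusive, in code points
  colEnd: number,   // 0-based, exclusive
  contextLines = 1,
): string[] {
  const first = Math.max(0, line - contextLines);
  const last = Math.min(lines.length - 1, line + contextLines);
  const width = String(last + 1).length; // gutter width for 1-based line numbers
  const out: string[] = [];
  for (let i = first; i <= last; i++) {
    out.push(`${String(i + 1).padStart(width)} | ${lines[i]}`);
    if (i === line) {
      // Gutter is `width` digits plus " | " (3 chars), then the column offset.
      const pad = " ".repeat(width + 3 + colStart);
      out.push(pad + "^".repeat(Math.max(1, colEnd - colStart)));
    }
  }
  return out;
}

const demoLines = ["\u25CAfoo", "  item1", "    item2", ""];
const rendered = renderSingleLineSpan(demoLines, 2, 4, 9);
console.log(rendered.join("\n"));
```

The caret padding assumes every code point occupies one terminal column, which does not hold for wide or combining characters.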
Warning
Performance is currently not a priority, but the library is written so that the internal representation can be swapped out without affecting clients.