yglnk/docs/index.gmi
2023-01-09 20:07:18 +01:00

# Goals
"yglnk" is a document linking language. Unlike a classic linker script, it is also used to implement the glue for bidirectional hyperlinks.
* contains multiple sections, and supports pointers between sections
* sections are 16-byte aligned
* needs to support linking of content files to style files
* needs to itself support links to other such yglnk files
* binary for compactness
* links can contain additional metadata (e.g. name, and arbitrary key-value pairs)
* strings are prefixed by their length to avoid costly separator parsing
This format supersedes "gardglue".
# Serialization
All integers are encoded as big-endian.
## Header
```
header := magic[4b] generator[4b] type[4b] version[4b]
```
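The file magic at offset 0 is 0x5967'4c63, nominally "YgLn" (note that these bytes decode to ASCII "YgLc"; "n" would be 0x6e). The header itself forms a section and contains the types of the top-level sections. The primary linear table follows the header: in example 01 its list header sits at offset 0x10 and its entries start after the first 32 bytes, at offset 0x20.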
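A minimal Python sketch of reading the header, assuming the four fields are unsigned 32-bit integers as the `[4b]` widths suggest. `parse_header` and `MAGIC` are hypothetical names, and the magic is compared by its numeric value as written above.
```python
import struct

MAGIC = 0x59674C63  # numeric magic value as given in the text above

def parse_header(buf: bytes) -> dict:
    """Decode the 16-byte file header (four big-endian 32-bit fields)."""
    if len(buf) < 16:
        raise ValueError("header needs at least 16 bytes")
    magic, generator, ftype, version = struct.unpack_from(">4I", buf, 0)
    if magic != MAGIC:
        raise ValueError(f"bad magic: {magic:#010x}")
    return {"generator": generator, "type": ftype, "version": version}

# Header bytes taken from example 01.
example = bytes.fromhex("59674c63" "00000000" "00000001" "00000000")
header = parse_header(example)
```
In example 01 this yields generator 0, type 1 (FT_TEXT) and version 0.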
## File (sub)types
```
0x00000000 FT_NONE
0x00000001 FT_TEXT
```
## Types (in tables)
```
0x00000000 TT_PLAIN_TEXT (UTF-8)
0x00000001 TT_NESTED_TEXT
0x00000010 TT_STRING_TAB
0x00000012 TT_LINEAR_PLAIN_TAB
0x00000020 TT_HASH_PLAIN_TAB
0x00000021 TT_HASH_LINK_TAB
0x00000030 TT_2DHC_PLAIN_TAB
0x00000031 TT_2DHC_LINK_TAB
```
## Nested Text
Strings carry a length prefix so that no delimiter parsing is needed. Text should be easy to nest. A nesting contains only other nestings or nt strings.
```
nt string := typ[1b]=0x00 length[2b] data[1b * length] (utf-8)
nt nesting := typ[1b]!=0 subtype[2b] length[4b] elems[1b * length]
```
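The two rules above can be read recursively. The following Python sketch is one possible decoder, not a reference implementation: `parse_nt` is a hypothetical name, and combining `typ` and `subtype` into one 24-bit value (to match codes such as 0x010003) is an assumption.
```python
import struct

def parse_nt(buf: bytes, pos: int = 0):
    """Parse one nt element at pos; return (element, position after it)."""
    typ = buf[pos]
    if typ == 0x00:
        # nt string: typ[1b] length[2b] data[length]
        (length,) = struct.unpack_from(">H", buf, pos + 1)
        start = pos + 3
        end = start + length
        if end > len(buf):
            raise ValueError("nt string runs past the buffer")
        return buf[start:end].decode("utf-8"), end
    # nt nesting: typ[1b] subtype[2b] length[4b] elems[length]
    subtype, length = struct.unpack_from(">HI", buf, pos + 1)
    cur = pos + 7
    end = cur + length
    children = []
    while cur < end:
        child, cur = parse_nt(buf, cur)
        children.append(child)
    if cur != end:
        raise ValueError("child element overruns its nesting")
    # Assumption: report the type as the 24-bit value typ:subtype.
    return ((typ << 16) | subtype, children), end

# The nt string from example 01, then the same string wrapped in NTT_QUOTE.
hello = bytes.fromhex("00000d48656c6c6f20576f726c64210a")
quote = b"\x01" + struct.pack(">HI", 0x0003, len(hello)) + hello
parsed_string = parse_nt(hello)
parsed_quote = parse_nt(quote)
```
A string decodes to a Python `str`, a nesting to a `(type, children)` tuple.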
### Known nt nesting types
```
0x00---- NTT_STRING
0x010000 NTT_DIV
0x010001 NTT_GROUP
0x010002 NTT_HEADER
0x010003 NTT_QUOTE
0x010004 NTT_CODE
```
## External link table (TT_*_LINK_TAB)
Either a hash table or a 2D "hilbert curve" table (see the corresponding sections). It references external content and facilitates its lookup.
## Linear Tables
A simple list of entries. An all-zero entry marks the end of the list. The decoded location is `location << 4`, because names and sections are aligned to 16-byte boundaries. `rest` holds optional additional data, also aligned to 16 bytes. Each entry is `entsize << 4` bytes long, so the entries take `entsize * entcount << 4` bytes in total.
```
list header := strtab_link[4b] entcount[2b] entsize[2b] reserved[8b]
list entry := name[4b] type[4b] location[4b] meta[4b] rest[*]
```
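A Python sketch of walking a linear table. It assumes each entry occupies `entsize << 4` bytes (this matches example 01, where `entsize` is 1 and entries are 16 bytes) and that the entries directly follow the 16-byte list header. `parse_linear_table` is a hypothetical name.
```python
import struct

def parse_linear_table(buf: bytes, offset: int):
    """Return (strtab_link, entries) for the linear table at offset."""
    strtab_link, entcount, entsize = struct.unpack_from(">IHH", buf, offset)
    entry_len = entsize << 4  # assumption: per-entry length in bytes
    pos = offset + 16         # skip the list header incl. reserved[8b]
    entries = []
    for _ in range(entcount):
        raw = buf[pos:pos + entry_len]
        if len(raw) < 16:
            raise ValueError("truncated or undersized entry")
        if not any(raw):      # an all-zero entry ends the list
            break
        name, typ, location, meta = struct.unpack_from(">4I", raw)
        entries.append({
            "name": name,
            "type": typ,
            "offset": location << 4,  # sections are 16-byte aligned
            "meta": meta,
            "rest": raw[16:],
        })
        pos += entry_len
    return strtab_link, entries

# Bytes 0x10..0x3f of example 01.
table = bytes.fromhex(
    "0000000400020001" "0000000000000000"
    "00000001000000100000000400010001"
    "00000009000000010000000500010001"
)
strtab_link, entries = parse_linear_table(table, 0)
```
For example 01 this gives two entries, pointing at file offsets 0x40 (the string table) and 0x50 (the text).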
## Hash tables
A hash table using 64-bit xxHash. Chains are traversed in order. The last bit of the first 8 bytes of a chain entry indicates whether another entry follows (0 = last entry); the remaining bits contain the hash.
```
ht header := strtab_link[4b] nbuckets[4b] nchains[4b] entsize[2b] nblf[2b] blshift[1b] seed[3b]
ht body := bloom[8b * nblf] buckets[4b * nbuckets] chains[16b * (1 + entsize) * nchains]
chain entry := hash[8b] name[4b] type[4b] rest[16b * entsize]
```
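A Python sketch of traversing one chain. Several points are assumptions: the "last bit" is taken to be the least significant bit of the 8-byte field, the stored hash is taken to be the full hash with that bit masked off, and the start index of a chain is taken as given (how buckets map to chains is not covered here). Computing xxHash itself is left out because it is not in the Python standard library. `walk_chain` and `find` are hypothetical names.
```python
import struct

HASH_MASK = 0xFFFFFFFFFFFFFFFE  # every bit except the continuation flag

def walk_chain(chains: bytes, index: int, entsize: int):
    """Yield (hash, name, type, rest) for one chain, starting at entry index."""
    stride = 16 * (1 + entsize)  # chain entry size, from the `ht body` rule
    while True:
        pos = index * stride
        word, name, typ = struct.unpack_from(">QII", chains, pos)
        rest = chains[pos + 16:pos + stride]
        yield word & HASH_MASK, name, typ, rest
        if (word & 1) == 0:  # assumption: flag is the least significant bit
            return
        index += 1

def find(chains: bytes, index: int, entsize: int, key_hash: int):
    """Return (name, type, rest) of the entry matching key_hash, or None."""
    for h, name, typ, rest in walk_chain(chains, index, entsize):
        if h == key_hash & HASH_MASK:
            return name, typ, rest
    return None

# Two entries in one chain, then an unrelated entry (made-up hash values).
chains = (
    struct.pack(">QII", 0x1122334455667788 | 1, 1, 0x20)  # more follows
    + struct.pack(">QII", 0x0102030405060708, 9, 0x21)    # last in chain
    + struct.pack(">QII", 0x0A0B0C0D0E0F1010, 5, 0x00)    # belongs elsewhere
)
found = find(chains, 0, 0, 0x0102030405060708)
```
The traversal stops at the second entry, so the third entry is never visited from chain start 0.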
## 2D "hilbert curve" tables
An X-Y-indexable table. Like the previous tables it stores key-value pairs, but the key is a 2-byte value whose first byte is the x coordinate and whose second byte is the y coordinate. The purpose is to increase locality between adjacent x values and adjacent y values. Only the first `xybits` bits of each x and y value are honored. The size of the table is then `16 << (2 * xybits)`.
This table doesn't allow more data per entry because its performance hinges on entries being small.
```
2dt header := xybits[1b] reserved[15b]
2dt entry := meta[8b] location[8b]
```
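The exact curve orientation is not specified here, so the following Python sketch uses the common textbook xy-to-distance mapping for a Hilbert curve as a stand-in. It also assumes that "the first `xybits` bits" means the low bits of each coordinate, and it returns entry offsets relative to the first entry. `hilbert_index`, `entry_offset` and `table_size` are hypothetical names.
```python
def hilbert_index(xybits: int, x: int, y: int) -> int:
    """Map (x, y) to its position along a Hilbert curve of side 2**xybits."""
    n = 1 << xybits
    x &= n - 1  # assumption: only the low xybits bits are honored
    y &= n - 1
    d = 0
    s = n >> 1
    while s:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate/flip the quadrant so the sub-curve lines up
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def entry_offset(xybits: int, x: int, y: int) -> int:
    """Byte offset of the 16-byte entry for (x, y), relative to entry 0."""
    return 16 * hilbert_index(xybits, x, y)

def table_size(xybits: int) -> int:
    """Table size in bytes, as given in the text: 16 << (2 * xybits)."""
    return 16 << (2 * xybits)

# Visit order of a 4x4 table (xybits = 2).
order = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda p: hilbert_index(2, *p))
```
Consecutive positions in `order` are always neighbouring cells, which is the locality property the table relies on.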
# Examples
## 01
```
00000000: 5967 4c63 0000 0000 0000 0001 0000 0000
          magic     generator type      version
@ linear table:
00000010: 0000 0004 0002 0001 0000 0000 0000 0000
          strtab_ln ecnt esiz reserved
          name      type      location  esiz ecnt
00000020: 0000 0001 0000 0010 0000 0004 0001 0001
          .strtab   strtab
00000030: 0000 0009 0000 0001 0000 0005 0001 0001
          .text     nt text
@ 0004 string table:
00000040: 002e 7374 7274 6162 002e 7465 7874 0000
            .strtab             .text
@ 0005 text:
00000050: 0000 0d48 656c 6c6f 2057 6f72 6c64 210a
          t len  Hello World!
```
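The dump above can be decoded end to end with a few lines of Python. This is a sketch under assumptions read off the dump itself: names are byte offsets into the string table and are NUL-terminated there, and entries start right after the 16-byte list header. `read_name` and the variable names are hypothetical.
```python
import struct

DUMP = """
5967 4c63 0000 0000 0000 0001 0000 0000
0000 0004 0002 0001 0000 0000 0000 0000
0000 0001 0000 0010 0000 0004 0001 0001
0000 0009 0000 0001 0000 0005 0001 0001
002e 7374 7274 6162 002e 7465 7874 0000
0000 0d48 656c 6c6f 2057 6f72 6c64 210a
"""
data = bytes.fromhex("".join(DUMP.split()))

def read_name(data: bytes, strtab_off: int, name: int) -> str:
    """Read a NUL-terminated name at byte offset name in the string table."""
    start = strtab_off + name
    end = data.index(b"\x00", start)
    return data[start:end].decode("utf-8")

# List header of the primary linear table at 0x10.
strtab_link, entcount, entsize = struct.unpack_from(">IHH", data, 0x10)
strtab_off = strtab_link << 4

# Map section name -> (type, file offset).
sections = {}
for i in range(entcount):
    pos = 0x20 + i * (entsize << 4)
    name, typ, location, meta = struct.unpack_from(">4I", data, pos)
    sections[read_name(data, strtab_off, name)] = (typ, location << 4)

# Decode the nt string stored in the .text section.
text_off = sections[".text"][1]
typ, length = struct.unpack_from(">BH", data, text_off)
text = data[text_off + 3:text_off + 3 + length].decode("utf-8")
```
The result maps `.strtab` to offset 0x40 and `.text` to offset 0x50, and `text` holds the 13-byte string "Hello World!" followed by a newline.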