initial commit
commit
66d54b0f2a
87
docs/index.gmi
Normal file
@@ -0,0 +1,87 @@
# Goals

"yglnk" is a document linking language. Unlike a classic linker script, it is also used to implement the glue for bidirectional hyperlinks.

* contains multiple sections, and supports pointers between sections
* sections are 16-byte aligned
* needs to support linking of content files to style files
* needs to itself support links to other such yglnk files
* binary for compactness
* links can contain additional metadata (e.g. name, and arbitrary key-value pairs)
* strings are prefixed by their length to avoid costly separator parsing

This format supersedes "gardglue".

# Serialization

All integers are encoded as big-endian.

## Header

```
header := magic[4b] generator[4b] type[4b] version[4b] tstr_loc[4b] reserved[12b]
```

The file magic at offset 0 is "YgLn" = 0x5967'4c6e. The header itself forms a section, and contains the types of the top-level sections. The primary linear table follows the first 32 bytes.
The top-level linear table has `entsize = 2`.
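The header layout above can be sketched as a parser. This is a minimal sketch, not a reference implementation: it assumes that `generator`, `type`, `version` and `tstr_loc` are unsigned 32-bit integers, and the names `HEADER`, `Header` and `parse_header` are hypothetical.

```python
import struct
from typing import NamedTuple

MAGIC = b"YgLn"
# Big-endian: magic, generator, type, version, tstr_loc, reserved (32 bytes in total).
# Assumption: the four middle fields are unsigned 32-bit integers.
HEADER = struct.Struct(">4sIIII12s")


class Header(NamedTuple):
    generator: int
    type: int
    version: int
    tstr_loc: int


def parse_header(buf: bytes) -> Header:
    """Validate the magic and return the remaining header fields."""
    if len(buf) < HEADER.size:
        raise ValueError("file too short for a yglnk header")
    magic, generator, typ, version, tstr_loc, _reserved = HEADER.unpack_from(buf, 0)
    if magic != MAGIC:
        raise ValueError(f"bad magic: {magic!r}")
    return Header(generator, typ, version, tstr_loc)
```

With this layout the primary linear table would start at offset `HEADER.size`, i.e. 32.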

## Types

```
0x00000000 T_PLAIN_TEXT (UTF-8)
0x00000001 T_NESTED_TEXT
0x00000010 T_LINEAR_PLAIN_TAB
0x00000020 T_HASH_PLAIN_TAB
0x00000021 T_HASH_LINK_TAB
0x00000030 T_2DHC_PLAIN_TAB
0x00000031 T_2DHC_LINK_TAB
```
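For tooling, the type values listed above could be mirrored in an enumeration. The following is only a sketch: the class name `SectionType` and the helper `is_link_table` are hypothetical, and the helper encodes nothing beyond the two `*_LINK_TAB` entries of the list.

```python
from enum import IntEnum


class SectionType(IntEnum):
    T_PLAIN_TEXT = 0x00000000  # UTF-8
    T_NESTED_TEXT = 0x00000001
    T_LINEAR_PLAIN_TAB = 0x00000010
    T_HASH_PLAIN_TAB = 0x00000020
    T_HASH_LINK_TAB = 0x00000021
    T_2DHC_PLAIN_TAB = 0x00000030
    T_2DHC_LINK_TAB = 0x00000031


def is_link_table(type_value: int) -> bool:
    """True for the two table types that hold external links."""
    return type_value in (SectionType.T_HASH_LINK_TAB, SectionType.T_2DHC_LINK_TAB)
```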

## External link table (0x21, T_HASH_LINK_TAB; 0x31, T_2DHC_LINK_TAB)

A hash table or 2D "Hilbert curve" table (see the corresponding sections below). It is used to reference external content and to facilitate its lookup.

## Linear Tables

A simple list of entries. An all-zero entry marks the end of the list. The decoded location is `location << 4`, because names and sections are aligned to 16-byte boundaries. `rest` holds optional additional data, also aligned to 16 bytes. `entsize * entcount << 4` is the length of each entry.

```
list entry := name[4b] type[4b] location[4b] entsize[2b] entcount[2b] rest[*]
```
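Walking a linear table might look like the sketch below. It decodes only the fixed 16-byte part of each entry and stops at the all-zero terminator. How `rest` affects the distance between entries is not spelled out here, so the sketch takes the stride as a parameter; that parameter and the names `ENTRY`, `Entry` and `iter_entries` are assumptions made for illustration.

```python
import struct
from typing import Iterator, NamedTuple

# Fixed part of a list entry: name, type, location, entsize, entcount (big-endian).
ENTRY = struct.Struct(">IIIHH")


class Entry(NamedTuple):
    name: int
    type: int
    offset: int  # decoded byte offset, i.e. location << 4
    entsize: int
    entcount: int


def iter_entries(buf: bytes, start: int, stride: int = ENTRY.size) -> Iterator[Entry]:
    """Yield entries from `start` until an all-zero entry or the end of `buf`."""
    if stride < ENTRY.size:
        raise ValueError("stride must cover the fixed part of an entry")
    pos = start
    while pos + stride <= len(buf):
        if not any(buf[pos:pos + stride]):
            return  # all-zero entry: end of list
        name, typ, location, entsize, entcount = ENTRY.unpack_from(buf, pos)
        yield Entry(name, typ, location << 4, entsize, entcount)
        pos += stride
```

A caller would pass the offset of the table (32 for the primary table, if it directly follows the header) and, where entries carry `rest` data, the matching stride.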

## Hash tables

A hash table using 64-bit xxHash. Chains are traversed in order. The last bit of the first 8 bytes of a chain entry indicates whether another entry follows (0 = last entry); the remaining bits contain the hash.

```
ht header := strtab_link[8b] nbuckets[4b] nvals[4b] nblf[4b] blshift[4b]
ht body := bloom[4b * nblf] buckets[4b * nbuckets] chains[32b * nvals]
chain entry := hash[8b] name[4b] type[4b] value[16b]
```
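Decoding a chain could be sketched as follows. The sketch assumes that "last bit" means the least significant bit of the big-endian 64-bit word and that the hash is compared with that bit cleared; neither detail is confirmed by the text above. The names `CHAIN_ENTRY`, `ChainEntry`, `decode_chain_entry` and `walk_chain` are hypothetical.

```python
import struct
from typing import Iterator, NamedTuple

# chain entry := hash[8b] name[4b] type[4b] value[16b], big-endian, 32 bytes.
CHAIN_ENTRY = struct.Struct(">QII16s")


class ChainEntry(NamedTuple):
    hash_bits: int  # 64-bit word with the flag bit cleared
    more: bool      # True if another entry follows in this chain
    name: int
    type: int
    value: bytes


def decode_chain_entry(buf: bytes, offset: int = 0) -> ChainEntry:
    word, name, typ, value = CHAIN_ENTRY.unpack_from(buf, offset)
    # Assumption: the continuation flag is the least significant bit.
    return ChainEntry(word & 0xFFFFFFFFFFFFFFFE, bool(word & 1), name, typ, value)


def walk_chain(buf: bytes, offset: int) -> Iterator[ChainEntry]:
    """Yield consecutive chain entries until one has the flag bit unset."""
    while True:
        entry = decode_chain_entry(buf, offset)
        yield entry
        if not entry.more:
            return
        offset += CHAIN_ENTRY.size
```

A lookup would clear the same bit in the computed xxHash value before comparing it against `hash_bits`.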

## 2D "Hilbert curve" tables

An X-Y-indexable table. Like the previous tables, it stores key-value pairs, but the key is a 2-byte value whose first byte is an x coordinate and whose second byte is a y coordinate. The purpose is to increase locality between adjacent x values and adjacent y values. Only the first `xybits` bits of each x and y value are honored. The size of the table is then `16 << (2 * xybits)`.

```
2dt header := xybits[1b] reserved[15b]
2dt entry := meta[8b] location[8b]
```
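The size formula and a possible key-to-slot mapping can be sketched as below. The exact mapping from (x, y) to an entry index is not defined in this section; the sketch uses the textbook Hilbert curve index computation and assumes that the "first" `xybits` bits are the low-order bits of each coordinate. Both choices, and the names `table_size` and `hilbert_index`, are assumptions for illustration.

```python
ENTRY_SIZE = 16  # 2dt entry := meta[8b] location[8b]


def table_size(xybits: int) -> int:
    """Size in bytes of all entries: 16 << (2 * xybits)."""
    return ENTRY_SIZE << (2 * xybits)


def hilbert_index(xybits: int, x: int, y: int) -> int:
    """Map (x, y) to a position on a Hilbert curve over a 2^xybits square."""
    n = 1 << xybits
    # Assumption: only the low-order xybits bits of each coordinate count.
    x &= n - 1
    y &= n - 1
    d = 0
    s = n >> 1
    while s:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d
```

Under these assumptions, the entry for a key would sit `hilbert_index(xybits, x, y) * ENTRY_SIZE` bytes into the entry array, and neighbouring coordinates tend to land in nearby entries.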

## Nested Text

Strings are marked with their length to avoid delimiter parsing. Text should be easily nestable. A nesting contains only other nestings or nt strings.

```
nt string := typ[1b]=0x00 length[2b] data[1b * length] (utf-8)
nt nesting := typ[1b]!=0 subtype[2b] length[4b] elems[1b * length]
```
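The two rules above lend themselves to a small recursive parser. This is a sketch under the assumption that `length` counts the bytes of `data` or `elems` only, excluding the element's own prefix; the function name `parse_nt` and the tuple representation of a nesting are hypothetical.

```python
import struct


def parse_nt(buf: bytes, pos: int = 0, end=None) -> list:
    """Parse consecutive nt elements in buf[pos:end].

    Strings become `str`; nestings become (typ, subtype, children) tuples.
    """
    end = len(buf) if end is None else end
    out = []
    while pos < end:
        typ = buf[pos]
        if typ == 0x00:
            # nt string: typ[1b] length[2b] data
            (length,) = struct.unpack_from(">H", buf, pos + 1)
            start = pos + 3
        else:
            # nt nesting: typ[1b] subtype[2b] length[4b] elems
            subtype, length = struct.unpack_from(">HI", buf, pos + 1)
            start = pos + 7
        if start + length > end:
            raise ValueError("element runs past the end of its container")
        if typ == 0x00:
            out.append(buf[start:start + length].decode("utf-8"))
        else:
            out.append((typ, subtype, parse_nt(buf, start, start + length)))
        pos = start + length
    return out
```

Because every element carries its length, a reader can skip a nesting it does not understand by jumping `length` bytes ahead without inspecting its contents.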

### Known nt nesting types

```
0x00---- NTT_STRING
0x010000 NTT_DIV
0x010001 NTT_GROUP
0x010002 NTT_HEADER
```