spooler: improve queue metadata

Alain Zscheile 2022-11-30 18:30:45 +01:00
parent c54df77319
commit a494dd4517


@@ -11,27 +11,15 @@ the public keys have all the same length, and get encoded as urlsafe base64.
 ## per public keys ...
 ... there can be a bunch of associated files.
-- {pubkey}.{chunkid}.slcs contains slice listings (* =SLCS *)
-- {pubkey}.{chunkid}.data contains the actual data (* =DataS *)
+- {pubkey}.{start}.data contains the actual data (* =DataS; {start} is the urlsafe base64 encoded starting point *)
+- {pubkey}.{start}.meta contains the offsets of data blobs (* =MetaS *)
 - {pubkey}.lock is a lock file for the public key, which gets used to prevent overlapping writes and GCs...
 <proto>
-SLCS ::= [*]SlicePtr
-SlicePtr ::= (* 16 bytes per entry *)
-  (* slice boundaries *)
-  slice:Slice
-  (* data boundaries *)
-  dptr:Pointer
-Slice ::=
-  start:u64(big-endian)
-  length:u16(big-endian)
-Pointer ::=
-  start:u32(big-endian)
-  length:u16(big-endian)
+MetaS ::= [*]MetaEntry
+MetaEntry ::= offset:u32(big-endian)
 DataS ::=
   (* @siglen is inferred from the used SigAlgo *)
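The new metadata file is deliberately small: a {pubkey}.{start}.meta file is just a flat array of big-endian u32 offsets into the matching {pubkey}.{start}.data file, 4 bytes per MetaEntry instead of the 16-byte SlicePtr entries it replaces. As a rough illustration only, here is a minimal Rust sketch of reading such a file pair; the helper names and file paths are made up, and it assumes each offset marks where a blob starts, so consecutive entries (or end of file) give the blob boundaries.

```rust
use std::fs;
use std::io;

/// Parse a MetaS file: a flat array of big-endian u32 offsets (MetaEntry).
/// Sketch only; assumes each offset points at the start of a data blob
/// inside the corresponding `{pubkey}.{start}.data` file.
fn parse_meta(raw: &[u8]) -> io::Result<Vec<u32>> {
    if raw.len() % 4 != 0 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "truncated MetaEntry"));
    }
    Ok(raw
        .chunks_exact(4)
        .map(|c| u32::from_be_bytes([c[0], c[1], c[2], c[3]]))
        .collect())
}

/// Split the data file into blobs using consecutive offsets as boundaries;
/// the last blob runs to the end of the file. Hypothetical helper, not part
/// of the spooler code, and without any bounds checking.
fn split_blobs<'a>(data: &'a [u8], offsets: &[u32]) -> Vec<&'a [u8]> {
    let mut blobs = Vec::with_capacity(offsets.len());
    for (i, &start) in offsets.iter().enumerate() {
        let end = offsets.get(i + 1).map(|&o| o as usize).unwrap_or(data.len());
        blobs.push(&data[start as usize..end]);
    }
    blobs
}

fn main() -> io::Result<()> {
    // file names follow the layout described above; the concrete
    // pubkey/start values here are placeholders
    let meta = fs::read("PUBKEY.START.meta")?;
    let data = fs::read("PUBKEY.START.data")?;
    let offsets = parse_meta(&meta)?;
    for (i, blob) in split_blobs(&data, &offsets).iter().enumerate() {
        println!("blob {}: {} bytes", i, blob.len());
    }
    Ok(())
}
```

Compared to the old SLCS listing, slice boundaries and lengths are no longer stored explicitly; only the blob start offsets remain.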
@@ -41,15 +29,13 @@ DataS ::=
 ... for compaction, the corresponding pubkey gets locked,
 - new temporary files are created in the corresponding directory,
-- the slices get sorted, and for each blob the length gets calculated,
-- summed up starting from the newest blob, going reverse,
-- until we hit the maximum size per pubkey
+- the chunks get sorted, and starting from the newest blob, going reverse,
+- the size of the slices gets calculated,
+  until a slice is found which hits the maximum size per pubkey
   (usually available storage space * 0.8 divided by the amount of known pubkeys)
-- then we cut off the remaining, not yet processed blobs
+- then we cut off the remaining, not yet processed blobs/slice
 - and start now from the first kept blob going forward
-- write all of them to a new data file, and create a corresponding slice listing file
-- note that adjacent slices in the slice listing file also get merged
+- write all of them to a new data file, and create a corresponding metadata file
 the compaction algorithm should run roughly once every 15 minutes, but only visit
 pubkeys to which data has been appended in that timespan.
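To make the cutoff step concrete, here is a hedged Rust sketch of the size-based selection described above: walk the blobs from newest to oldest, add up their sizes until the per-pubkey budget would be exceeded, and keep only the tail that still fits. Function names, the budget arithmetic, and the surrounding file handling are assumptions for illustration, not the spooler's actual implementation.

```rust
/// Return the index of the first blob to keep: iterate newest -> oldest,
/// accumulating sizes until the per-pubkey budget would be exceeded.
/// Hypothetical helper, names are not from the spooler code.
fn first_kept_blob(blob_sizes: &[u64], max_size_per_pubkey: u64) -> usize {
    let mut total = 0u64;
    for (idx, &len) in blob_sizes.iter().enumerate().rev() {
        if total + len > max_size_per_pubkey {
            // this blob no longer fits; keep everything after it
            return idx + 1;
        }
        total += len;
    }
    0 // everything fits, keep all blobs
}

/// "usually available storage space * 0.8 divided by the amount of known pubkeys"
fn budget(available_space: u64, known_pubkeys: u64) -> u64 {
    (available_space as f64 * 0.8 / known_pubkeys as f64) as u64
}

fn main() {
    let sizes = [400u64, 300, 200, 100]; // oldest .. newest, placeholder values
    let keep_from = first_kept_blob(&sizes, budget(1_000, 1));
    // the kept blobs (keep_from..) would then be rewritten forward into new
    // temporary .data/.meta files and swapped in while the pubkey lock is held
    println!("keep blobs starting at index {}", keep_from);
}
```

The rewrite itself would presumably happen under the {pubkey}.lock file mentioned above, so appends and GC/compaction runs cannot interleave.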