
Reimplement world state restore as zero-copy operation

Adam Reichold requested to merge zero-copy-restore into main

This significantly reduces the overhead of restoring the world state from disk, as it avoids allocating and initialising an owned version of the world. Instead, it directly fixes up pointers in the persistently stored version. This has the additional benefit that memory mapping can be used instead of reading the data into a heap-allocated buffer, so that if only parts of the state are actually accessed for a particular measurement, the rest will not necessarily be brought into the page cache at all.
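As a rough illustration of the zero-copy idea (not the code in this MR), the sketch below reads records straight out of a serialised byte buffer by handing out borrowed slices instead of building an owned structure. The layout, `RecordsView`, `read_u32` and `encode` are all hypothetical and exist only for this example. It also resolves offsets on each access rather than fixing up pointers in place as the MR does. The same borrowed view would work over a memory-mapped file, since it only needs a `&[u8]`.

```rust
/// Hypothetical layout, assumed for this sketch only:
/// [count: u32 LE][offset_0 .. offset_count: u32 LE][payload bytes]
/// Record i occupies payload[offset_i..offset_{i+1}].
pub struct RecordsView<'a> {
    offsets: &'a [u8], // (count + 1) little-endian u32 values
    payload: &'a [u8], // concatenated record bytes
}

/// Reads the `index`-th little-endian u32 from `bytes`, if it is in bounds.
fn read_u32(bytes: &[u8], index: usize) -> Option<u32> {
    let start = index.checked_mul(4)?;
    let chunk = bytes.get(start..start.checked_add(4)?)?;
    Some(u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
}

impl<'a> RecordsView<'a> {
    /// Validates the header and borrows from `buf`; no record data is copied.
    pub fn new(buf: &'a [u8]) -> Option<Self> {
        let count = read_u32(buf, 0)? as usize;
        let table_len = count.checked_add(1)?.checked_mul(4)?;
        let rest = buf.get(4..)?;
        if rest.len() < table_len {
            return None;
        }
        let (offsets, payload) = rest.split_at(table_len);
        Some(Self { offsets, payload })
    }

    /// Number of records in the buffer.
    pub fn len(&self) -> usize {
        self.offsets.len() / 4 - 1
    }

    /// Returns record `index` as a slice borrowed from the original buffer.
    pub fn get(&self, index: usize) -> Option<&'a [u8]> {
        if index >= self.len() {
            return None;
        }
        let start = read_u32(self.offsets, index)? as usize;
        let end = read_u32(self.offsets, index + 1)? as usize;
        self.payload.get(start..end)
    }
}

/// Writes records in the hypothetical layout above (used to build test input).
pub fn encode(records: &[&[u8]]) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(&(records.len() as u32).to_le_bytes());
    let mut offset = 0u32;
    buf.extend_from_slice(&offset.to_le_bytes());
    for record in records {
        offset += record.len() as u32;
        buf.extend_from_slice(&offset.to_le_bytes());
    }
    for record in records {
        buf.extend_from_slice(record);
    }
    buf
}
```

Because `get` only touches the offset table and the requested record, untouched records are never read, which is what lets a memory-mapped backing store skip paging them in.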

For a standard diet breadth experiment run clocking in at 2.2 GB, measuring only the prevalence values yields the following results on my notebook:

> hyperfine "../target/release/owned . >/dev/null" "../target/release/zero_copy . >/dev/null" "../target/release/zero_copy_mmap . >/dev/null"
Benchmark #1: ../target/release/owned . >/dev/null
  Time (mean ± σ):     985.9 ms ±  13.1 ms    [User: 577.3 ms, System: 407.8 ms]
  Range (min … max):   974.8 ms … 1010.5 ms    10 runs
 
Benchmark #2: ../target/release/zero_copy . >/dev/null
  Time (mean ± σ):     610.0 ms ±  11.9 ms    [User: 267.3 ms, System: 341.9 ms]
  Range (min … max):   592.2 ms … 622.9 ms    10 runs
 
Benchmark #3: ../target/release/zero_copy_mmap . >/dev/null
  Time (mean ± σ):     339.9 ms ±  13.6 ms    [User: 286.4 ms, System: 53.4 ms]
  Range (min … max):   321.4 ms … 353.3 ms    10 runs
 
Summary
  '../target/release/zero_copy_mmap . >/dev/null' ran
    1.79 ± 0.08 times faster than '../target/release/zero_copy . >/dev/null'
    2.90 ± 0.12 times faster than '../target/release/owned . >/dev/null'

To realise zero-copy deserialisation without leaking memory, this MR also replaces the hash tables used for contacts and vectors with sorted key-value arrays, which might actually be a performance win as they typically have between 1 and 1000 entries.
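A minimal sketch of what a sorted key-value array lookup could look like, assuming entries are stored as a slice of `(key, value)` pairs sorted by key. `SortedMap` is a hypothetical name and not the type used in this MR. Lookups use binary search, and the map owns nothing, so it can point directly into a deserialised buffer.

```rust
/// Read-only map over a slice of (key, value) pairs sorted by key.
/// Hypothetical type for illustration only.
pub struct SortedMap<'a, K, V> {
    entries: &'a [(K, V)],
}

impl<'a, K: Ord, V> SortedMap<'a, K, V> {
    /// Wraps a slice that the caller guarantees is strictly sorted by key.
    pub fn new(entries: &'a [(K, V)]) -> Self {
        // Checked in debug builds only; release builds trust the caller.
        debug_assert!(entries.windows(2).all(|w| w[0].0 < w[1].0));
        Self { entries }
    }

    /// Looks up `key` by binary search, O(log n) with no hashing or allocation.
    pub fn get(&self, key: &K) -> Option<&'a V> {
        match self.entries.binary_search_by(|(k, _)| k.cmp(key)) {
            Ok(idx) => Some(&self.entries[idx].1),
            Err(_) => None,
        }
    }
}
```

With at most about 1000 entries, a lookup needs roughly ten comparisons over contiguous memory, which is why this may compete well with a hash table here.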

Edited by Adam Reichold
