perf(pack): single-pass O(N) delta resolution — fixes the next bottleneck on huge pushes #29
pack-reader-perf into main
Sub-issue under fangorn/anvil#153 umbrella. The streaming receive-pack fix in #28 made the receive path itself O(N), but exposed Pack.Reader as the next bottleneck — a live ovs push test ran for 27+ minutes pinned at 100% CPU during parse before I aborted.
Root cause
The offset-keyed cache in the previous resolver was a no-op: writes used {:sha, sha} keys and reads used raw offset keys, so a read never hit anything that had been written. Every OFS_DELTA therefore went through do_read_object, which re-parsed the pack header AND re-decompressed via zlib for every base in the chain. For a pack with N objects and mean chain depth D, that’s O(N · D) zlib decompressions.
ovs has 134k objects, 108k deltas, chain depth ~5–10. Half a million to a million zlib decompresses, single-threaded.
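The key mismatch can be reproduced in isolation. The following is a minimal sketch, not the actual Pack.Reader code: the module CacheKeySketch and its functions put_resolved/3 and fetch_by_offset/2 are hypothetical names, and the cache is assumed to be a plain map.

```elixir
defmodule CacheKeySketch do
  # Hypothetical stand-in for the old resolver's cache (assumption: a plain map).

  # Write path: resolved objects are stored under {:sha, sha} keys.
  def put_resolved(cache, sha, object), do: Map.put(cache, {:sha, sha}, object)

  # Read path: lookups use the raw pack offset as the key.
  def fetch_by_offset(cache, offset), do: Map.fetch(cache, offset)
end

cache = CacheKeySketch.put_resolved(%{}, "abc123", "blob contents")

# The object is in the cache, but a lookup by offset can never find it,
# because no entry was ever written under an integer key.
IO.inspect(CacheKeySketch.fetch_by_offset(cache, 12))

# Only a lookup with the write-side key shape would hit.
IO.inspect(Map.fetch(cache, {:sha, "abc123"}))
```

With every lookup missing, each of the roughly 108k deltas re-decompresses its whole base chain; at a depth of 5 to 10 that is about 540k to 1.08M decompressions, which matches the estimate above.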
Fix
Walk entries in pack order (already produced that way by parse_entries), maintaining a real offset-keyed by_offset map that is populated as we go. Pack format guarantees every OFS_DELTA’s base is at a smaller offset, so by the time we hit a delta, its base is already resolved: apply the delta and record the result. REF_DELTAs that reference a forward or external base are deferred to a fixed-point post-pass. That pass terminates either when every deferred delta has resolved or when a full pass makes no progress, in which case the genuinely unresolvable deltas are reported in a structured error.
Each object decompressed once (in parse_entries), each delta applied once. O(N) total.
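The shape of that walk can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the module SinglePassSketch, the entry maps, and the error tuple are hypothetical, real delta application is replaced by binary concatenation, object ids are MD5 hex digests rather than Git SHA-1s, and external (thin-pack) bases are not modelled, so they end up in the error case.

```elixir
defmodule SinglePassSketch do
  # Toy stand-ins (assumptions): concatenation instead of real delta
  # application, MD5 hex digest instead of a Git object id.
  def apply_delta(base, delta), do: base <> delta
  def object_id(content), do: Base.encode16(:erlang.md5(content), case: :lower)

  # Entries arrive in pack order, already decompressed, shaped like:
  #   %{offset: o, kind: :base, data: d}
  #   %{offset: o, kind: :ofs_delta, neg_offset: n, data: d}
  #   %{offset: o, kind: :ref_delta, base_sha: s, data: d}
  def resolve(entries) do
    # Single pass: resolve what we can, defer deltas whose base is not known yet.
    {by_offset, by_sha, deferred} = Enum.reduce(entries, {%{}, %{}, []}, &step/2)
    fixpoint(Enum.reverse(deferred), by_offset, by_sha)
  end

  # Resolve one entry against the maps built so far, or push it onto `pending`.
  defp step(entry, {by_offset, by_sha, pending}) do
    case try_resolve(entry, by_offset, by_sha) do
      {:ok, content} ->
        {Map.put(by_offset, entry.offset, content),
         Map.put(by_sha, object_id(content), content), pending}

      :missing ->
        {by_offset, by_sha, [entry | pending]}
    end
  end

  defp try_resolve(%{kind: :base, data: data}, _by_offset, _by_sha), do: {:ok, data}

  # The base of an OFS_DELTA sits at a smaller offset, so it is normally
  # already in by_offset when the delta is reached.
  defp try_resolve(%{kind: :ofs_delta} = entry, by_offset, _by_sha) do
    case Map.fetch(by_offset, entry.offset - entry.neg_offset) do
      {:ok, base} -> {:ok, apply_delta(base, entry.data)}
      :error -> :missing
    end
  end

  defp try_resolve(%{kind: :ref_delta} = entry, _by_offset, by_sha) do
    case Map.fetch(by_sha, entry.base_sha) do
      {:ok, base} -> {:ok, apply_delta(base, entry.data)}
      :error -> :missing
    end
  end

  # Post-pass: retry deferred deltas until none remain or a pass makes no progress.
  defp fixpoint([], by_offset, _by_sha), do: {:ok, by_offset}

  defp fixpoint(deferred, by_offset, by_sha) do
    {by_offset, by_sha, still} = Enum.reduce(deferred, {by_offset, by_sha, []}, &step/2)

    if length(still) == length(deferred) do
      {:error, {:unresolved_deltas, still |> Enum.map(& &1.offset) |> Enum.sort()}}
    else
      fixpoint(Enum.reverse(still), by_offset, by_sha)
    end
  end
end

entries = [
  # REF_DELTA whose base appears later in the pack: deferred, then resolved.
  %{offset: 12, kind: :ref_delta, base_sha: SinglePassSketch.object_id("hello"), data: "!"},
  %{offset: 20, kind: :base, data: "hello"},
  %{offset: 30, kind: :ofs_delta, neg_offset: 10, data: " world"}
]

IO.inspect(SinglePassSketch.resolve(entries))
```

In this sketch every entry is resolved at most once on the main pass, and only deferred REF_DELTAs are revisited, which is the property that keeps the total work linear when forward references are rare.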
Two preserved bits of context that were previously thrown away and re-derived on every chain walk:
- OFS_DELTA neg_offset is now stored on the entry struct
- REF_DELTA base_sha is now stored on the entry struct
What this does NOT fix
Tracked under #153 still:
- Memory: parse still collects the entire resolved entries list before store_entries/2 writes anything. Peak RSS is still O(pack size). Streaming parse-and-store rewrite is the next sub-issue.
- Point-lookup API (Reader.read_object/2,3,4, used by ObjectResolver) still has the broken cache. Different code path; left alone for this PR per scope discipline.
Test plan
- 928 tests / 0 failures across the existing suite (including 67 pack tests).
- mix format --check-formatted clean.
- mix dialyzer clean.
- Live ovs push test against prod once this and the anvil mix.lock bump deploy. Expectation: parse phase drops from 27+ min to single-digit seconds.