fix(diff): linear-space Myers (Myers 1986 §4b) + :atomics V table #23
myers-linear-space-counters
into main
Summary
The hand-rolled ExGitObjectstore.Diff.Myers had two structural problems that together OOM’d the BEAM on large diffs (root cause of Anvil chiron PR #68 crash, tracked at fangorn/anvil#81):
- O(D²) memory: find_d accumulated one V table per d iteration into a list and materialized it into a tuple for backtracking. After d steps V has 2d+1 entries; the sum across the trace is (D+1)² entries. For D=10k that’s ~8 GB of map nodes for ONE file.
- Map for V: diagonal index k is a contiguous integer range [-d, d], a perfect fit for an array. Using Map cost ~80 B per entry and O(log n) per access. Profiling showed Map.get/3 taking 38% of total CPU.
This PR replaces both with the actual linear-space variant from Myers’ 1986 paper §4b, with V stored in :atomics.
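The (D+1)² figure comes from summing the 2d+1 entries retained at each step d = 0..D. A quick check of that arithmetic, taking the ~80 B per Map entry quoted above as given (the module name is hypothetical and not part of the PR):

```elixir
defmodule VTableCost do
  # Entries retained when one V table of 2d+1 slots is kept per step d = 0..D.
  def retained_entries(max_d), do: Enum.sum(for d <- 0..max_d, do: 2 * d + 1)

  # Estimated bytes, using the ~80 B per entry figure from the description above.
  def estimated_bytes(max_d, bytes_per_entry \\ 80),
    do: retained_entries(max_d) * bytes_per_entry
end

IO.inspect(VTableCost.retained_entries(10_000)) # prints 100020001, i.e. (10_000 + 1)²
IO.inspect(VTableCost.estimated_bytes(10_000))  # prints 8001600080, roughly 8 GB
```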
Two commits
1. fix(diff): linear-space Myers (Myers 1986 §4b)
Divide-and-conquer at the middle snake: find the split point (x, y) where the optimal edit script crosses the middle of the edit graph, then recurse on a[0..x) vs b[0..y) and a[x..n) vs b[y..m). The snake’s equalities fall out of the recursion naturally because they appear in both halves.
Memory: O(N+M) total. Each bisect call holds two V tables for its lifetime, then frees them before recursing. Recursion depth is O(log(N+M)) on average.
Translation reference: Google diff_match_patch’s diff_bisect (a faithful port of Myers §4b), cross-checked against git xdiff’s xdl_split.
V stored as Map in this commit — correctness first.
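The recursion shape described above (pick a split point, then diff a[0..x) against b[0..y) and a[x..n) against b[y..m)) can be sketched independently of how the split is found. The sketch below is not the PR's code: for brevity it swaps in Hirschberg's row-based split (two LCS-length rows per call) where the PR uses Myers' middle snake, and the module and function names are hypothetical. It only illustrates why each call needs linear working memory and why both halves shrink.

```elixir
defmodule SplitRecursionSketch do
  # Edit script as a list of {:eq | :del | :ins, element} tuples.
  def diff([], b), do: Enum.map(b, &{:ins, &1})
  def diff(a, []), do: Enum.map(a, &{:del, &1})

  # Base case: a single element either matches somewhere in b or is deleted.
  def diff([x], b) do
    case Enum.split_while(b, &(&1 != x)) do
      {_all, []} -> [{:del, x} | Enum.map(b, &{:ins, &1})]
      {pre, [_ | post]} -> diff([], pre) ++ [{:eq, x}] ++ diff([], post)
    end
  end

  def diff(a, b) do
    # Split a in half, then pick the y maximizing LCS(a1, b[0..y)) + LCS(a2, b[y..m)).
    {a1, a2} = Enum.split(a, div(length(a), 2))
    fwd = lcs_row(a1, b)
    rev = a2 |> Enum.reverse() |> lcs_row(Enum.reverse(b)) |> Enum.reverse()

    {_pair, y} =
      Enum.zip(fwd, rev)
      |> Enum.with_index()
      |> Enum.max_by(fn {{f, r}, _y} -> f + r end)

    {b1, b2} = Enum.split(b, y)
    # The two rows are no longer referenced once the split point is known.
    diff(a1, b1) ++ diff(a2, b2)
  end

  # LCS lengths of a against every prefix of b, keeping one row at a time.
  defp lcs_row(a, b) do
    Enum.reduce(a, List.duplicate(0, length(b) + 1), fn x, prev ->
      {row, _left} =
        b
        |> Enum.zip(Enum.zip(prev, tl(prev)))
        |> Enum.reduce({[0], 0}, fn {y, {diag, up}}, {acc, left} ->
          cell = if x == y, do: diag + 1, else: max(up, left)
          {[cell | acc], cell}
        end)

      Enum.reverse(row)
    end)
  end
end

IO.inspect(SplitRecursionSketch.diff(~w(a b), ~w(b a)))
# prints [del: "a", eq: "b", ins: "a"]
```

Because a is always halved, both recursive calls receive strictly smaller inputs. Myers' middle snake needs the explicit fallback described later in this description to get the same guarantee.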
2. perf(diff): swap Myers V table from Map to :atomics
:atomics is the right primitive for V: fixed-size array of signed 64-bit ints, mutable in place, lives off the BEAM term heap, no GC pressure. Sentinel -1 for “not yet reached” still works (signed default).
Mutability simplifies the recursion: forward_sweep and reverse_sweep no longer thread updated v1/v2 through their return — they mutate in place and return only the bounds.
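A minimal sketch of a V table on :atomics, assuming diagonals in -max_d..max_d and the -1 sentinel. The VTable module and its functions are hypothetical names, not the PR's API. :atomics slots are 1-based and start at 0, so the sketch offsets the index and fills in the sentinel explicitly.

```elixir
defmodule VTable do
  # Allocate a table covering diagonals -max_d..max_d, pre-filled with -1.
  # :atomics slots start at 0, so the fill is what makes -1 mean "not yet reached".
  def new(max_d) do
    size = 2 * max_d + 1
    ref = :atomics.new(size, signed: true)
    Enum.each(1..size, fn ix -> :atomics.put(ref, ix, -1) end)
    {ref, max_d}
  end

  # :atomics indices are 1-based, so diagonal k maps to slot k + max_d + 1.
  def get({ref, max_d}, k), do: :atomics.get(ref, k + max_d + 1)

  # Mutates in place and returns :ok; no updated table has to be threaded back.
  def put({ref, max_d}, k, x), do: :atomics.put(ref, k + max_d + 1, x)
end

v = VTable.new(3)
VTable.put(v, -3, 5)
IO.inspect({VTable.get(v, -3), VTable.get(v, 0)}) # prints {5, -1}
```

The handle `{ref, max_d}` stays the same value across writes, which is why a sweep written against it only needs to return its bounds.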
Verified
- All existing diff tests pass byte-identical (10/10 myers_test.exs, 23/23 diff/, 903/903 full suite).
- New stress test: 10k-line × ~30%-diff input peaks at ~4.5 MB process heap (sampled every 5 ms). Old impl peaked in the GBs and OOM’d inside a 6 GB-capped container.
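One way a peak-heap measurement like this could be taken is to run the work in its own process and poll that process's heap size. This is a sketch under assumptions: HeapSampler is a hypothetical helper, not the stress test in this PR, and polling can miss spikes shorter than the interval.

```elixir
defmodule HeapSampler do
  # Runs fun in a fresh process and polls that process's total heap size
  # every interval_ms. Returns {result, peak_bytes}.
  def peak_heap(fun, interval_ms \\ 5) do
    task = Task.async(fun)
    word_size = :erlang.system_info(:wordsize)
    peak_words = sample(task.pid, interval_ms, 0)
    {Task.await(task, :infinity), peak_words * word_size}
  end

  # Process.info/2 returns nil once the process has exited, which ends sampling.
  defp sample(pid, interval_ms, peak) do
    case Process.info(pid, :total_heap_size) do
      {:total_heap_size, words} ->
        Process.sleep(interval_ms)
        sample(pid, interval_ms, max(peak, words))

      nil ->
        peak
    end
  end
end

{len, peak_bytes} =
  HeapSampler.peak_heap(fn ->
    list = Enum.to_list(1..100_000)
    Process.sleep(50)
    length(list)
  end)

IO.puts("result=#{len} peak=#{Float.round(peak_bytes / 1_048_576, 2)} MB")
```

The figure covers the process heap only. Memory held by :atomics lives outside it and would need a separate measurement, for example :atomics.info/1.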
Bench
10k lines, ~33% changed, single Myers.diff_lines/2 call:
| | Old hand-rolled | Linear-space + Map V | Linear-space + :atomics V |
|---|---|---|---|
| Wall | OOM | 12.2 s | 4.7 s |
| Peak heap | unbounded GBs | 9.1 MB | 4.5 MB |
| GCs | thrashing | ~21k | ~1k |
Real chiron PR #68 (216 files, 45k diff lines) inside a 6 GB cgroup’d container:
| | Before this PR | After this PR |
|---|---|---|
| Phase A (full diff compute) | OOM at 4 min | 13.6 s, completes |
| BEAM peak allocator | 16.2 GB | 0.70 GB |
| GC count | thrashing | 18,361 |
Critical correctness rules from the paper
Earlier hand-roll attempts in this branch’s history got these wrong; capturing them here for future reference:
- Δ = N − M parity drives WHICH sweep checks overlap (the forward sweep when Δ is odd, the reverse sweep when Δ is even). Doing both is wrong.
- Reverse-frame ↔ forward-frame mapping: k_other = delta − k_self (minus, not plus).
- When bisect runs out of d iterations without finding overlap (tiny inputs like n=m=1 with no match, or no commonality at all), fall back to splitting at the top-right corner so both halves are STRICTLY smaller; otherwise the recursion can re-call itself on the same range.
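The three rules can be written down as small pure functions. This is a sketch, not the PR's code: the BisectRules module and its function names are hypothetical, and it assumes x indexes a along the horizontal axis of the edit graph, so the top-right corner is (n, 0).

```elixir
defmodule BisectRules do
  # Rule 1: exactly one sweep checks for overlap, chosen by the parity of delta = n - m.
  # rem/2 yields -1 for negative odd numbers, so compare against 0 rather than 1.
  def overlap_checker(n, m) do
    if rem(n - m, 2) != 0, do: :forward, else: :reverse
  end

  # Rule 2: a diagonal k in one sweep's frame is delta - k in the other's.
  # With k = x - y forward and x' = n - x, y' = m - y in reverse,
  # k' = x' - y' = (n - m) - (x - y) = delta - k.
  def other_frame(k_self, delta), do: delta - k_self

  # Rule 3: with no overlap found, split at the top-right corner (x = n, y = 0).
  # The halves become a[0..n) vs b[0..0) and a[n..n) vs b[0..m): a pure deletion
  # and a pure insertion, each strictly smaller when n > 0 and m > 0.
  def fallback_split(n, _m), do: {n, 0}
end

IO.inspect(BisectRules.overlap_checker(7, 6)) # prints :forward (delta = 1, odd)
IO.inspect(BisectRules.other_frame(2, 1))     # prints -1
IO.inspect(BisectRules.fallback_split(1, 1))  # prints {1, 0}
```

Applying other_frame/2 twice with the same delta returns the original diagonal, which makes a cheap sanity check against the plus-instead-of-minus mistake.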
Requirements
- REQ-DIFF-001 (memory bounded): covered by new stress test (annotated)
- REQ-DIFF-002 (no Map.get in inner loop): satisfied by :atomics V table
- REQ-DIFF-003 (output unchanged): covered by existing test corpus passing byte-identical
Test plan
- mix test test/ex_git_objectstore/diff/ (24/24 pass)
- mix test (full suite 903/903 pass)
- After merge: bump ex_git_objectstore in Anvil’s mix.lock, re-run profile against chiron PR #68 to confirm Phase A completes inside the 6 GB cap
Closes #59. Unblocks fangorn/anvil#81.