perf(graph): batched ahead_behind_many — walk base ancestors once for N heads #25
`graph-ahead-behind-perf` into `main`
Why
`ahead_behind/3` walks `ancestors(base) ∪ ancestors(head)` from scratch on every call. When a caller asks the same question for many heads against one base (typical of a PR-list page, where every PR has `base = main`), `ancestors(base)` is re-walked N times.
For Anvil’s chiron PR-list (393-commit main, 70 PRs), this is a 70× redundancy: 27,510 node-visits where 393 should suffice for the base side.
What
New `ExGitObjectstore.ahead_behind_many(repo, base_sha, head_shas)` that walks `ancestors(base)` once into a set, then for each head:
- BFS from head, classifying each commit as in-base (merge point) or not (head-only → ahead).
- DOWN-BFS within base_ancestors from the merge points to size the intersection; `behind = |base_ancestors| - |intersection|`.
Cost goes from `O(N · |ancestors(base)|)` to `O(|ancestors(base)| + Σ head walks)`.
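The steps above can be sketched on an in-memory graph. This is a minimal illustration, not the code in this PR: it assumes the graph is a plain map of `sha => [parent shas]`, and the module name `AheadBehindSketch`, the helper names, and the `%{head => {ahead, behind}}` return shape are all hypothetical.

```elixir
defmodule AheadBehindSketch do
  # Illustrative only. `parents` is a map of sha => list of parent shas.

  # Every commit reachable from `start` (inclusive) via parent links.
  def ancestors(parents, start), do: walk(parents, [start], MapSet.new())

  # Hypothetical batched entry point: returns %{head => {ahead, behind}}.
  def ahead_behind_many(parents, base, heads) do
    # The base side is walked exactly once, however many heads are given.
    base_set = ancestors(parents, base)

    Map.new(heads, fn head ->
      {ahead, merge_points} =
        head_walk(parents, [head], base_set, MapSet.new(), 0, [])

      # Ancestors of a base ancestor are themselves base ancestors, so this
      # walk never leaves base_set; its result is the intersection.
      shared = walk(parents, merge_points, MapSet.new())

      {head, {ahead, MapSet.size(base_set) - MapSet.size(shared)}}
    end)
  end

  # Breadth-first reachability from every sha in the queue.
  defp walk(_parents, [], seen), do: seen

  defp walk(parents, [sha | rest], seen) do
    if MapSet.member?(seen, sha) do
      walk(parents, rest, seen)
    else
      walk(parents, rest ++ Map.get(parents, sha, []), MapSet.put(seen, sha))
    end
  end

  # BFS from the head. A commit already in base_set is a merge point and is
  # not expanded; any other commit is head-only and counts towards `ahead`.
  defp head_walk(_parents, [], _base_set, _seen, ahead, merges),
    do: {ahead, merges}

  defp head_walk(parents, [sha | rest], base_set, seen, ahead, merges) do
    cond do
      MapSet.member?(seen, sha) ->
        head_walk(parents, rest, base_set, seen, ahead, merges)

      MapSet.member?(base_set, sha) ->
        head_walk(parents, rest, base_set, MapSet.put(seen, sha), ahead, [sha | merges])

      true ->
        queue = rest ++ Map.get(parents, sha, [])
        head_walk(parents, queue, base_set, MapSet.put(seen, sha), ahead + 1, merges)
    end
  end
end
```

For a toy history `a ← b ← c ← d` (base `d`) with a branch `b ← e ← f`, calling `AheadBehindSketch.ahead_behind_many(parents, "d", ["f", "d", "c"])` gives `{2, 2}` for `f`, `{0, 0}` for `d`, and `{0, 1}` for `c`.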
Public API uses the standard graph/fallback routing:
- Graph available + base in graph → batched fast path. Heads not in graph (just-pushed branches) fall back per-head.
- No graph → every head goes through per-head `ahead_behind/3` (which already has its cat_object walker fallback).
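The routing above might look roughly like the following. This is a sketch under stated assumptions rather than the wrapper in this PR: `RoutingSketch.route/5` is a hypothetical name, the graph is modelled as `nil` or a map of `sha => [parent shas]`, and the batched and per-head implementations are injected as functions so the example stays self-contained. Sending a base that is missing from the graph down the per-head path is also an assumption; the description above does not specify that case.

```elixir
defmodule RoutingSketch do
  # Illustrative only. `graph` is nil when no commit graph has been built,
  # otherwise a map of sha => parent shas. `batched` and `per_head` stand in
  # for the batched fast path and the per-head ahead_behind/3 call.
  def route(graph, base, heads, batched, per_head) do
    if is_map(graph) and Map.has_key?(graph, base) do
      # Heads the graph already knows take the batched path; just-pushed
      # heads that are not in the graph yet fall back one by one.
      {in_graph, missing} = Enum.split_with(heads, &Map.has_key?(graph, &1))

      fast = batched.(graph, base, in_graph)
      slow = Map.new(missing, fn head -> {head, per_head.(base, head)} end)

      Map.merge(fast, slow)
    else
      # No graph (or, by assumption, base not in it): every head falls back.
      Map.new(heads, fn head -> {head, per_head.(base, head)} end)
    end
  end
end
```

Injecting the two implementations keeps the routing decision testable on its own, which mirrors the three cases the integration tests cover: full graph hit, partial fallback, and total fallback.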
Numbers
Measured against chiron (393-commit main, 70 PRs, graph built):
| Approach | Time |
|---|---|
| 70 × per-head `ahead_behind` | 33 ms |
| Batched `ahead_behind_many` | 6 ms (5.5×) |
Note: chiron’s graph wasn’t actually built in production, so today’s `ahead_behind` per-call cost is much worse (~50 ms via cat_object fallback). `mix anvil.graphs.rebuild --only fangorn/chiron` would already drop per-head from ~50 ms to ~0.4 ms; this batched API is on top of that.
Tests
- 6 new tests in `graph/queries_test.exs` cross-checking against unbatched `ahead_behind/3` for the chiron-shape workload, diverged heads, equal-to-base, missing head, missing base, empty list.
- 3 new tests in `graph_integration_test.exs` covering the wrapper’s three routing cases (full graph hit, partial fallback for just-pushed heads, no-graph total fallback).
- Full suite: 912 tests, 0 failures.
Test plan
- CI green
- Anvil PR (companion) bumps the dep + wires `compute_ahead_behind/2` to use `ahead_behind_many`
🤖 Generated with Claude Code