ref:1ad0b2b22aa4b08be0fdf4531570983f050f12a8

feat(graph): batched ahead_behind_many — walk base ancestors once for N heads

`ahead_behind/3` walks `ancestors(base) ∪ ancestors(head)` from scratch on every call. When a caller asks the same question for many heads against one base (typical PR-list page where every PR has `base = main`), that re-walks `ancestors(base)` N times.

Add a batched variant that walks `ancestors(base)` once into a set, then for each head:

- BFS from head, classifying each commit as in-base (merge point) or not (head-only → ahead);
- DOWN-BFS within base_ancestors from the merge points to size the intersection; behind = |base_ancestors| - |intersection|.

Cost goes from O(N · |ancestors(base)|) to O(|ancestors(base)| + Σ head walks).

Public API wraps with the standard graph/fallback routing: if the graph isn't loaded, every head goes through per-head `ahead_behind/3` (which already has its own cat_object fallback). If specific heads aren't in the graph (just-pushed branches), they fall back per-head while the rest take the fast path.

Measured against chiron (393-commit main, 70 PRs, graph built):

- per-head loop: 33 ms
- batched: 6 ms (5.5×)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
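The batched walk described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in this commit's `Graph` module: the module name `AheadBehindSketch` and the `parents` map (`sha => [parent_sha]`) are assumptions made for the example, and a plain recursive traversal stands in for the BFS and DOWN-BFS, since visiting order does not change the counts.

```elixir
defmodule AheadBehindSketch do
  # Hypothetical stand-in for the real graph: `parents` is a map of
  # sha => [parent_sha]. Not the library's actual data structure.

  def ahead_behind_many(parents, base, heads) do
    # Walk ancestors(base) exactly once; reused for every head below.
    base_anc = reachable(parents, [base], MapSet.new())

    Map.new(heads, fn head ->
      # Head walk: count head-only commits, stop at commits already in base_anc.
      {ahead, merge_points} = split(parents, [head], base_anc, MapSet.new(), 0, [])
      # Everything reachable from the merge points is shared with base.
      shared = reachable(parents, merge_points, MapSet.new())
      {head, %{ahead: ahead, behind: MapSet.size(base_anc) - MapSet.size(shared)}}
    end)
  end

  # All commits reachable from the frontier, including the frontier itself.
  defp reachable(_parents, [], seen), do: seen

  defp reachable(parents, [sha | rest], seen) do
    if MapSet.member?(seen, sha) do
      reachable(parents, rest, seen)
    else
      reachable(parents, Map.get(parents, sha, []) ++ rest, MapSet.put(seen, sha))
    end
  end

  # Classify each commit reached from head: in base_anc (merge point, do not
  # descend further) or head-only (counts toward `ahead`, keep walking).
  defp split(_parents, [], _base_anc, _seen, ahead, merge), do: {ahead, merge}

  defp split(parents, [sha | rest], base_anc, seen, ahead, merge) do
    cond do
      MapSet.member?(seen, sha) ->
        split(parents, rest, base_anc, seen, ahead, merge)

      MapSet.member?(base_anc, sha) ->
        split(parents, rest, base_anc, MapSet.put(seen, sha), ahead, [sha | merge])

      true ->
        next = Map.get(parents, sha, []) ++ rest
        split(parents, next, base_anc, MapSet.put(seen, sha), ahead + 1, merge)
    end
  end
end
```

With a toy history `a ← b ← c` as the base line and a branch `b ← d ← e`, calling `AheadBehindSketch.ahead_behind_many(parents, "c", ["e"])` gives `%{"e" => %{ahead: 2, behind: 1}}`: `d` and `e` are head-only, and only `c` is missing from the head's side.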
SHA: 1ad0b2b22aa4b08be0fdf4531570983f050f12a8
Author: Cole Christensen <cole.christensen@macmillan.com>
Date: 2026-04-30 04:45
Parents: 03224cd
4 files changed +332 -0
lib/ex_git_objectstore.ex +55 −0
@@ -730,5 +730,60 @@
  end

  @doc """
  Like `ahead_behind/3`, but for many heads against a single base.

  Walks `ancestors(base)` once and reuses it across every head, instead
  of re-walking it for each call. For workloads where `head_shas` are
  many small offsets from a common base (e.g. a PR-list page where
  every PR has `base = main`), this turns
  `O(N · |ancestors(base)|)` into `O(|ancestors(base)| + Σ head walks)`.

  Returns `{:ok, %{head_sha => %{ahead: N, behind: M}}}` with one entry
  per head. Heads not in the graph or whose ref couldn't be resolved
  fall back to per-head `ahead_behind/3` (which has its own
  cat_object walker fallback). If the graph itself isn't available,
  every head goes through the per-head fallback.

  Emits `[:ex_git_objectstore, :graph, :query]` telemetry with
  `operation: :ahead_behind_many`. `path` is `:graph` when the batched
  fast path was used, `:fallback` when nothing was in the graph and
  every head went through the per-head walker.
  """
  @spec ahead_behind_many(Repo.t(), sha(), [sha()]) ::
          {:ok, %{sha() => %{ahead: non_neg_integer(), behind: non_neg_integer()}}}
          | {:error, term()}
  def ahead_behind_many(%Repo{} = repo, base_sha, head_shas) when is_list(head_shas) do
    metadata = %{operation: :ahead_behind_many, repo_id: repo.id}

    Telemetry.span([:ex_git_objectstore, :graph, :query], metadata, fn ->
      case load_or_fetch_graph(repo) do
        {:ok, graph} ->
          if Graph.member?(graph, base_sha) do
            {:ok, fast} = Graph.ahead_behind_many(graph, base_sha, head_shas)
            missing = Enum.reject(head_shas, &Map.has_key?(fast, &1))
            merged = fill_per_head(repo, base_sha, missing, fast)
            {{:ok, merged}, Map.put(metadata, :path, :graph)}
          else
            merged = fill_per_head(repo, base_sha, head_shas, %{})
            {{:ok, merged}, Map.put(metadata, :path, :fallback)}
          end

        {:error, _} ->
          merged = fill_per_head(repo, base_sha, head_shas, %{})
          {{:ok, merged}, Map.put(metadata, :path, :fallback)}
      end
    end)
  end

  defp fill_per_head(repo, base_sha, head_shas, acc) do
    Enum.reduce(head_shas, acc, fn head_sha, acc ->
      case ahead_behind(repo, base_sha, head_sha) do
        {:ok, counts} -> Map.put(acc, head_sha, counts)
        {:error, _} -> acc
      end
    end)
  end

  @doc """
  Commits reachable from `head_sha` but not from `base_sha`, newest-first.
  Empty when `head_sha` is an ancestor of (or equal to) `base_sha`.