fangorn/ex_git_objectstore
ref:03224cd34075cd4a0bb08807f7d82f3e5a5a7dc2
perf(commit-walk): pack-first ordering eliminates loose-object miss cost
After the per-process pack cache landed, profiling showed ~400ms per
ahead_behind call (down from 17s), and the dominant remaining cost
was loose-object existence checks. Object.read(repo, sha) was hitting
:prim_file via File.read for EVERY commit in the walk, even when
those SHAs were in a pack. On chiron's mostly-packed history that's
90k synchronous round-trips to a singleton GenServer for files that
don't exist.
Inverts the lookup order in ObjectResolver.read/2:
- was: loose first, packs as fallback
- now: packs first, loose as fallback
Pack lookup after the cache warmup is a cached Index.lookup/2 (binary
search in memory) — no syscall, no GenServer. Falling through to the
loose check only happens when a SHA isn't in any pack, which is the
recently-written-not-yet-packed case.
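
A minimal sketch of the pack-first ordering described above, not the actual
ObjectResolver code. The module name, `build_index/1`, `index_lookup/2`, and
the `loose_reader` function are hypothetical stand-ins: the index is modeled
as a sorted in-memory tuple, and the loose read as a function passed in by
the caller.

```elixir
defmodule PackFirstSketch do
  @moduledoc "Illustrative only; names here are assumptions, not the real API."

  # Hypothetical in-memory pack index: a tuple of {sha, object} pairs
  # sorted by sha, so a lookup can binary-search without any I/O.
  def build_index(entries) do
    entries |> Enum.sort_by(&elem(&1, 0)) |> List.to_tuple()
  end

  def index_lookup(index, sha), do: bsearch(index, sha, 0, tuple_size(index) - 1)

  # Empty search range: the SHA is not in this index.
  defp bsearch(_index, _sha, lo, hi) when lo > hi, do: :error

  defp bsearch(index, sha, lo, hi) do
    mid = div(lo + hi, 2)

    case elem(index, mid) do
      {^sha, object} -> {:ok, object}
      {other, _} when other < sha -> bsearch(index, sha, mid + 1, hi)
      _ -> bsearch(index, sha, lo, mid - 1)
    end
  end

  # Packs first. The loose reader (standing in for the filesystem read)
  # runs only when no pack index contains the SHA.
  def read(pack_indexes, loose_reader, sha) do
    Enum.find_value(pack_indexes, fn index ->
      case index_lookup(index, sha) do
        {:ok, object} -> {:ok, object}
        :error -> nil
      end
    end) || loose_reader.(sha)
  end
end
```

With this shape, a SHA found in any index returns without the loose reader
ever being called, which is the cost the reordering is meant to avoid.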
Safe by content addressing: if a SHA exists both loose AND packed,
both must contain byte-identical data (otherwise SHAs wouldn't match).
Git itself uses pack-first ordering for the same performance reason.
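
The content-addressing argument can be illustrated with how Git derives an
object ID in a SHA-1 repository: the hash covers a "<type> <size>\0" header
plus the content, so the same ID implies the same bytes wherever the object
is stored. The module and function names below are hypothetical and are not
part of this codebase.

```elixir
defmodule ContentAddressSketch do
  # Git object ID (SHA-1 repositories): hash of "<type> <byte size>\0"
  # followed by the uncompressed content, rendered as lowercase hex.
  def object_id(type, content) when is_binary(content) do
    header = "#{type} #{byte_size(content)}" <> <<0>>
    :crypto.hash(:sha, header <> content) |> Base.encode16(case: :lower)
  end
end
```

Because the ID is a function of the content alone, a loose copy and a packed
copy stored under the same SHA decode to the same object.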
Verified: full suite 903/903 green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SHA: 03224cd34075cd4a0bb08807f7d82f3e5a5a7dc2
Author: Cole Christensen <cole.christensen@macmillan.com>
Date: 2026-04-29 13:09
Parents: d12a4cf
1 file changed (+16 −3)
| File | Changes |
|---|---|
| lib/ex_git_objectstore/object_resolver.ex | +16 −3 |