Parallelize S3 backend hot paths: list_refs GETs and put_pack uploads #25

closed Opened by cole.christensen@gmail.com

Problem

The S3 backend has two sequential-I/O patterns that dominate latency for MinIO/S3-backed deployments:

1. list_refs/3 fetches each ref value with a sequential GET

lib/ex_git_objectstore/storage/s3.ex:200-227

def list_refs(config, prefix, ref_prefix) do
  full_prefix = "#{prefix}/#{ref_prefix}"

  case s3_list(config, full_prefix) do
    {:ok, keys} ->
      refs =
        keys
        |> Enum.map(fn key ->
          fetch_ref_from_key(config, key, prefix) # Sequential GET per ref
        end)
        |> Enum.reject(&is_nil/1)
        |> Enum.sort()
      # ... (remainder of the function omitted from this excerpt)

Each ref requires a separate GET because the ref value lives in the object body, not in the LIST response. A repo with 50 branches and 200 tags therefore issues 250 sequential GETs; at roughly 100 ms each, that is about 25 seconds of latency.
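To make the arithmetic concrete, here is a rough back-of-the-envelope estimate. It assumes a flat 100 ms per GET and that bounded concurrency behaves like fixed-size waves, which is an idealization; real S3/MinIO latency varies per request. The module and function names are hypothetical and not part of the codebase.

```elixir
defmodule LatencyEstimate do
  # Sequential: every GET waits for the previous one to finish.
  def sequential_ms(ref_count, get_ms), do: ref_count * get_ms

  # Idealized bounded concurrency: requests complete in "waves" of
  # max_concurrency, each wave taking one GET's worth of time.
  def parallel_ms(ref_count, get_ms, max_concurrency) do
    waves = div(ref_count + max_concurrency - 1, max_concurrency)
    waves * get_ms
  end
end

refs = 50 + 200
IO.puts("sequential: #{LatencyEstimate.sequential_ms(refs, 100)} ms")
IO.puts("parallel (32): #{LatencyEstimate.parallel_ms(refs, 100, 32)} ms")
```

Under these assumptions, 250 refs drop from 25,000 ms to 8 waves of 100 ms, or roughly 800 ms. Treat that as a best-case lower bound, not a measured result.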

Every git clone, git fetch, and git push pays this cost because UploadPack.list_all_refs_with_head and ReceivePack.list_all_refs_with_head both call Ref.list(repo, "refs/heads/") and Ref.list(repo, "refs/tags/") during protocol advertisement.

2. put_pack/5 uploads pack and idx sequentially

lib/ex_git_objectstore/storage/s3.ex:135-139

def put_pack(config, prefix, pack_sha, pack_data, idx_data) do
  with :ok <- s3_put(config, pack_key(prefix, pack_sha, "pack"), pack_data) do
    s3_put(config, pack_key(prefix, pack_sha, "idx"), idx_data)
  end
end

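The two large PUTs are serialized, so a push waits for the sum of both upload times rather than the longer of the two. For big packs, this needlessly inflates write latency (up to 2x when the pack and idx are similar in size).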
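One possible shape for a concurrent upload is sketched below using Task.async/1 and Task.await_many/2 (Elixir 1.11+). This is a minimal sketch, not the project's implementation: put_fun is a hypothetical stand-in for s3_put/3, the key layout is invented for illustration (the real keys come from pack_key/3), and the 60-second timeout is an assumed value.

```elixir
defmodule ConcurrentPutSketch do
  # put_fun is a hypothetical stand-in for s3_put/3:
  # (key, data) -> :ok | {:error, reason}
  def put_pack(put_fun, pack_sha, pack_data, idx_data, timeout \\ 60_000) do
    # Key names here are made up; the real code would call pack_key/3.
    uploads = [
      {"pack-#{pack_sha}.pack", pack_data},
      {"pack-#{pack_sha}.idx", idx_data}
    ]

    uploads
    |> Enum.map(fn {key, data} -> Task.async(fn -> put_fun.(key, data) end) end)
    |> Task.await_many(timeout)
    # Return the first non-:ok result, or :ok if both uploads succeeded.
    |> Enum.find(:ok, &(&1 != :ok))
  end
end
```

One design difference from the sequential `with` version is worth weighing: here the idx upload is attempted even when the pack upload fails, so a failed push may leave an orphaned idx object that needs cleanup or must be tolerated by readers.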

Impact

  • Problem 1 (list_refs/3) is the dominant cost of every git protocol operation against S3/MinIO backends
  • Clone UX on ref-heavy repos feels broken (20+ seconds before any data transfers)
  • Problem 2 (put_pack/5) adds anywhere from about 100 ms to several seconds to every git push, depending on pack size

Acceptance Criteria

  • S3.list_refs/3 parallelizes the per-ref GETs with Task.async_stream (pattern already used in ExGitObjectstore.blob_sizes/3)
  • S3.put_pack/5 uploads pack and idx concurrently
  • Both use bounded concurrency (default 32, configurable)
  • Preserves existing return types and ordering (refs still sorted)
  • Filesystem and Memory backends unchanged (no parallelism needed)
  • Benchmark or test demonstrating the speedup on a repo with >50 refs
  • CHANGELOG entry under [Unreleased]
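The list_refs/3 criteria above could be met along these lines. This is a hedged sketch under stated assumptions: fake_fetch/1 is a hypothetical stand-in for fetch_ref_from_key/3 that sleeps 20 ms to simulate one GET, the "skip/" prefix is an invented way to exercise the nil-filtering path, and the 30-second per-task timeout is an assumed value. Only Task.async_stream/3 and its max_concurrency and timeout options are real library API.

```elixir
defmodule ParallelRefsSketch do
  # Hypothetical stand-in for fetch_ref_from_key/3. Keys under "skip/"
  # return nil to mimic refs that cannot be resolved.
  def fake_fetch("skip/" <> _rest), do: nil

  def fake_fetch(key) do
    Process.sleep(20) # simulated S3 GET latency
    {key, "sha-for-" <> key}
  end

  def list_refs(keys, max_concurrency \\ 32) do
    keys
    |> Task.async_stream(&fake_fetch/1,
      max_concurrency: max_concurrency,
      timeout: 30_000
    )
    # Each successful task yields {:ok, value}; drop nil values.
    |> Enum.flat_map(fn
      {:ok, nil} -> []
      {:ok, ref} -> [ref]
    end)
    # Keep the existing contract: refs are returned sorted.
    |> Enum.sort()
  end
end

keys = for i <- 1..64, do: "refs/heads/branch-#{i}"
{micros, refs} = :timer.tc(fn -> ParallelRefsSketch.list_refs(keys) end)
IO.puts("fetched #{length(refs)} refs in #{div(micros, 1000)} ms")
```

A timing harness like the :timer.tc/1 call at the end could serve as a starting point for the benchmark criterion, comparing max_concurrency values of 1 and 32 against a real or mocked backend. Note that with the default options a task that exceeds the timeout exits the caller, so a production version would need to decide how to map timeouts onto the existing error return.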

Notes

  • Reference implementation: commit 6a8dd64 (blob_sizes/3) uses the same pattern successfully
  • max_concurrency default of 32 matches typical hackney pool size; document bumping the hackney pool if deployers need higher concurrency
  • The underlying architectural question — should refs be stored in a packed-refs-style single blob? — is deferred to a separate issue. This change works within the current one-key-per-ref layout.