fangorn/ex_git_objectstore
Parallelize S3 list_refs GETs and put_pack uploads (#13)
## Summary
- `S3.list_refs/3` now issues per-ref GETs concurrently via `Task.async_stream` (default `max_concurrency: 32`)
- `S3.put_pack/5` uploads `.pack` and `.idx` concurrently via `Task.async` + `Task.await_many`
- Both wrapped in `:telemetry.span/3` under the new `[:ex_git_objectstore, :storage, _]` namespace so the improvement is observable in production
- Tuning knobs (`list_refs_concurrency`, `list_refs_timeout`, `put_pack_timeout`) live in the S3 config map. The defaults are sensible, so no Anvil-side changes are required.
Closes #25
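
The bounded-concurrency pattern described in the summary can be sketched as follows. This is a minimal illustration, not the code from this PR: `ListRefsSketch` and the `fetch_ref` callback (a stand-in for one S3 GET) are hypothetical, the 30-second timeout is an assumed value, and sorting the result is inferred from the sort assertion in the test plan. Only the option names and the default of 32 come from the description above.

```elixir
defmodule ListRefsSketch do
  # `fetch_ref` is a hypothetical stand-in for a single S3 GET. It takes a ref
  # key and returns {:ok, sha} or {:error, reason}.
  def list_refs(keys, fetch_ref, config \\ %{}) do
    # Knob names mirror the PR. 32 is the default stated in the PR; the
    # 30_000 ms timeout is an assumption made for this sketch.
    max_concurrency = Map.get(config, :list_refs_concurrency, 32)
    timeout = Map.get(config, :list_refs_timeout, 30_000)

    keys
    |> Task.async_stream(fn key -> {key, fetch_ref.(key)} end,
      max_concurrency: max_concurrency,
      timeout: timeout,
      ordered: false
    )
    |> Enum.reduce_while({:ok, []}, fn
      # Collect successful lookups.
      {:ok, {key, {:ok, sha}}}, {:ok, acc} -> {:cont, {:ok, [{key, sha} | acc]}}
      # Stop at the first failed GET and surface its reason.
      {:ok, {_key, {:error, reason}}}, _acc -> {:halt, {:error, reason}}
    end)
    |> case do
      # Results arrive out of order, so sort before returning.
      {:ok, refs} -> {:ok, Enum.sort(refs)}
      error -> error
    end
  end
end
```

Passing `%{list_refs_concurrency: 1}` degrades this to sequential behaviour, which is the kind of path the low-concurrency smoke test in the test plan exercises.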
## Impact
- **Protocol advertisement on ref-heavy repos: ~25s → ~1s.** A repo with 50 branches + 200 tags used to trigger 250 sequential 100ms GETs on every clone/fetch/push. Now parallel, bounded by `max_concurrency: 32`.
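- **Pack upload latency cut by up to half** on every push that produces a pack: the `.pack` and `.idx` PUTs now overlap, so the push waits only for the slower of the two.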
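
The ~25s → ~1s figure can be checked with a back-of-the-envelope model. The sketch below assumes every GET takes exactly 100ms and that requests complete in waves of `max_concurrency`, which is an idealization; real latencies vary.

```elixir
refs = 50 + 200          # branches + tags from the example above
get_ms = 100             # per-GET latency quoted above
max_concurrency = 32     # default stated in the summary

# Before: one GET after another.
sequential_ms = refs * get_ms

# After (idealized): GETs finish in waves of `max_concurrency`.
waves = div(refs + max_concurrency - 1, max_concurrency)
parallel_ms = waves * get_ms

IO.puts("sequential: #{sequential_ms} ms, parallel: #{parallel_ms} ms (#{waves} waves)")
```

Under these assumptions the sequential path costs 25,000 ms and the parallel path 8 waves × 100 ms = 800 ms, consistent with the rounded figures above.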
## Partial-write semantics (put_pack)
If one of the two concurrent PUTs succeeds and the other fails, the successful object is left in place and the function returns `{:error, reason}`. A `.pack` without a matching `.idx` is unreachable through any lookup path, so GC/fsck reclaims it. Retries with the same `pack_sha` overwrite the orphan. Documented in the moduledoc alongside the existing CAS note.
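
The partial-write behaviour can be illustrated with a small sketch. It is not the library's implementation: `PutPackSketch`, the `put_object` callback (a stand-in for one S3 PUT), the `pack-<sha>` key layout, and the 60-second default timeout are all assumptions made for illustration.

```elixir
defmodule PutPackSketch do
  # `put_object` is a hypothetical stand-in for one S3 PUT. It takes a key and
  # a binary and returns :ok or {:error, reason}.
  def put_pack(put_object, pack_sha, pack_data, idx_data, timeout \\ 60_000) do
    # Start both uploads at once. The key naming here is assumed.
    tasks = [
      Task.async(fn -> put_object.("pack-#{pack_sha}.pack", pack_data) end),
      Task.async(fn -> put_object.("pack-#{pack_sha}.idx", idx_data) end)
    ]

    # Wait for both; results come back in the same order as `tasks`.
    case Task.await_many(tasks, timeout) do
      [:ok, :ok] ->
        :ok

      results ->
        # No rollback: whichever PUT succeeded stays in the bucket, and the
        # first error is returned to the caller.
        Enum.find(results, &match?({:error, _}, &1))
    end
  end
end
```

Because the object keys are derived from `pack_sha`, calling the function again with the same arguments writes to the same keys, which is how a retry overwrites the orphan.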
## Telemetry
Two new events (S3 backend only for now; Filesystem/Memory may adopt in a follow-up):
| Event | Measurements (stop) | Metadata |
|---|---|---|
| `[:ex_git_objectstore, :storage, :list_refs, _]` | `:duration`, `:ref_count` | `:ref_prefix`, `:backend` |
| `[:ex_git_objectstore, :storage, :put_pack, _]` | `:duration`, `:pack_size`, `:idx_size` | `:pack_sha`, `:backend` |
Full event list is documented in the `ExGitObjectstore.Telemetry` moduledoc.
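
A consumer could subscribe to these events roughly as follows. `:telemetry.span/3` emits the event prefix followed by `:start`, `:stop`, or `:exception`, which is what the `_` in the table stands for, and it reports `:duration` in native time units. The module name and handler id below are hypothetical, and the sketch only attaches when the `:telemetry` dependency is loaded.

```elixir
defmodule StorageTelemetrySketch do
  # Formats a :stop event into a log line. :duration is in native time units,
  # so convert it to milliseconds before printing.
  def format_stop(event, measurements, metadata) do
    ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

    # Everything besides the timing fields, e.g. :ref_count or :pack_size.
    extras =
      measurements
      |> Map.drop([:duration, :monotonic_time])
      |> Enum.sort()
      |> inspect()

    "#{inspect(event)} backend=#{inspect(metadata[:backend])} took #{ms}ms #{extras}"
  end

  # Handler with the four-argument shape that :telemetry expects.
  def handle_event(event, measurements, metadata, _config) do
    IO.puts(format_stop(event, measurements, metadata))
  end
end

# Attach only when the :telemetry dependency is available.
if Code.ensure_loaded?(:telemetry) do
  :telemetry.attach_many(
    "storage-sketch-logger",
    [
      [:ex_git_objectstore, :storage, :list_refs, :stop],
      [:ex_git_objectstore, :storage, :put_pack, :stop]
    ],
    &StorageTelemetrySketch.handle_event/4,
    nil
  )
end
```

Passing the handler as an external capture (`&Mod.fun/4`) rather than an anonymous function follows the recommendation in the `:telemetry` documentation.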
## Test plan
- [x] Existing 1001-ref pagination test extended with sort assertion
- [x] Low-concurrency smoke test (`list_refs_concurrency: 1`) verifies the config path
- [x] Concurrent `put_pack` round-trip with 128KB pack + 32KB idx
- [x] Error propagation test using a nonexistent bucket
- [x] Telemetry assertions for both events (backend metadata, size measurements, ref_count)
- [x] 595 total tests pass (566 non-S3 + 29 S3), `mix dialyzer` clean, `mix format --check-formatted` clean
## Out of scope (follow-ups)
- Filesystem/Memory backend telemetry (uniform coverage)
- Retry/backoff for transient S3 errors
- Range-based pack reads (`NOTE(C7)` in `object_resolver.ex`)
- hackney connection pool tuning docs
SHA: 0b4172664b8de16f00a04798d46c8cff942cac87
Author: Anvil <noreply@anvil.fangorn.io>
Date: 2026-04-17 18:36
Parents: 7b8145f
4 files changed, +205 −18
| File | Changes |
|---|---|
| CHANGELOG.md | +8 −0 |