feat(upload-pack): streaming v2 fetch response — no full-pack buffering #32
stream-pack-response
into main
Summary
Add a streaming variant for protocol v2 fetch responses so the entire packfile no longer materializes in BEAM process heap before being sent.
Tracking: Anvil #214. Bundled with Anvil #213 / Anvil PR #135: the SSH window fix and this change are each ineffective without the other.
Why
Building a 161 MB pack via the current `UploadPackV2.feed/2` holds three full-pack-sized binaries transiently:
- `pack_data` returned from `Writer.generate/1`
- `sideband_data` after `PktLine.encode_sideband(1, pack_data)`
- `response` after `IO.iodata_to_binary([ack, shallow, hdr, sideband_data, flush])`
The object list is held on top of these. After Erlang refc-binary fragmentation, peak heap reaches ~10× the pack size. The 2026-05-11 02:50 prod BEAM OOM-kill (verified via dmesg + Anvil.Perf.MemoryMonitor logs) is the consequence: a single fetch pushes a 3.8 GiB no-swap host past available RAM.
Per the project performance protocol, this is a per-unit cost in a shared primitive — every fetch on every repo pays it.
What’s in this PR
All additive — no behaviour changes to `feed/2` or `Writer.generate/1`.
`Writer.generate_stream/2` and `/3`
Callback-based pack writer. Invokes `write_fn` with each piece of the pack as it’s produced: 12-byte header → one zlib-compressed entry per object → trailing 20-byte SHA-1 checksum. SHA-1 is computed incrementally via `:crypto.hash_update/2` so no full-pack binary ever exists.
`/3` threads an accumulator through each write — needed by the sideband chunker downstream.
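To make the shape of the callback-based writer concrete, here is a minimal sketch under stated assumptions. The module name `PackStreamSketch`, the `{type, binary}` input shape, and the `(iodata, acc) -> acc` callback contract are illustrative and not taken from this PR; the sketch handles undeltified objects only. It shows each piece being fed to both the running SHA-1 context and the caller's sink, so the full pack is never assembled.

```elixir
defmodule PackStreamSketch do
  # Illustrative sketch only; not the library's Writer module.
  import Bitwise

  # Git pack object type codes for undeltified objects.
  @type_codes %{commit: 1, tree: 2, blob: 3, tag: 4}

  # objects:  list of {type, binary} tuples (hypothetical input shape)
  # write_fn: (iodata, acc) -> acc, called once per piece
  def generate_stream(objects, write_fn, acc) do
    # 12-byte header: magic, version 2, object count.
    header = <<"PACK", 2::32, length(objects)::32>>
    state = emit(header, {:crypto.hash_init(:sha), acc}, write_fn)

    {ctx, acc} =
      Enum.reduce(objects, state, fn {type, data}, st ->
        code = Map.fetch!(@type_codes, type)
        entry = [entry_header(code, byte_size(data)), :zlib.compress(data)]
        emit(entry, st, write_fn)
      end)

    # The trailer is the SHA-1 of every byte emitted so far.
    write_fn.(:crypto.hash_final(ctx), acc)
  end

  # Feed one piece to both the running hash and the caller's sink.
  defp emit(piece, {ctx, acc}, write_fn) do
    {:crypto.hash_update(ctx, piece), write_fn.(piece, acc)}
  end

  # First byte: continuation bit, 3-bit type, low 4 bits of the size.
  # Remaining size bits follow 7 per byte, least significant group first.
  defp entry_header(code, size) do
    first = (code <<< 4) ||| (size &&& 0x0F)

    case size >>> 4 do
      0 -> <<first>>
      rest -> <<(first ||| 0x80), varint(rest)::binary>>
    end
  end

  defp varint(n) when n < 0x80, do: <<n>>
  defp varint(n), do: <<((n &&& 0x7F) ||| 0x80), varint(n >>> 7)::binary>>
end
```

A caller that writes to a socket would pass the socket as `acc` and return it unchanged from `write_fn`; a test can instead accumulate the pieces in a list and compare the joined result against the non-streaming output.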
`PktLine.encode_sideband_frame/2` + `max_sideband_data/0`
Single-frame encoder for streaming callers; existing `encode_sideband/2` (which materializes everything) is unchanged.
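For reference, a single sideband frame is a pkt-line whose payload starts with a one-byte band ID. The sketch below illustrates that framing; the module name `SidebandFrameSketch` and the choice to raise on oversized input are assumptions, not necessarily how `PktLine.encode_sideband_frame/2` behaves.

```elixir
defmodule SidebandFrameSketch do
  # Illustrative sketch only; not the library's PktLine module.

  # 65520-byte pkt-line limit minus 4-byte length prefix and 1-byte band ID.
  @max_data 65_515

  def max_sideband_data, do: @max_data

  # Returns iodata: 4 lowercase hex digits (total length), band byte, data.
  def encode_sideband_frame(band, data) when band in 1..3 do
    size = IO.iodata_length(data)

    if size > @max_data do
      raise(ArgumentError, "sideband data exceeds #{@max_data} bytes")
    end

    [hex4(size + 5), band, data]
  end

  defp hex4(n) do
    n |> Integer.to_string(16) |> String.downcase() |> String.pad_leading(4, "0")
  end
end
```

Returning iodata rather than a binary avoids copying the payload just to prepend five bytes.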
`ExGitObjectstore.Protocol.SidebandWriter` (new)
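Re-chunks arbitrary-sized writes into spec-compliant sideband-1 frames (≤ 65515 data bytes per frame, so each pkt-line stays within the 65520-byte limit once the 4-byte length prefix and 1-byte band ID are added). Buffer held as iodata for O(1) amortized append; only materializes to a binary when draining a full-size frame.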
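The re-chunking behaviour described above can be sketched as follows. This is a simplified illustration, not the `SidebandWriter` implementation: the module name, the struct fields, and the `write/2` and `flush/1` return shape of `{frames, state}` are all assumptions.

```elixir
defmodule SidebandChunkerSketch do
  # Illustrative sketch only; not the library's SidebandWriter.

  # Maximum data bytes per sideband frame (65520 - 4 - 1).
  @max_data 65_515

  defstruct buf: [], size: 0

  def new, do: %__MODULE__{}

  # Appends data to the iodata buffer and returns any full frames it completes.
  def write(%__MODULE__{buf: buf, size: size}, data) do
    drain([buf, data], size + IO.iodata_length(data), [])
  end

  # Emits whatever is left as one final, possibly short, frame.
  def flush(%__MODULE__{size: 0} = state), do: {[], state}
  def flush(%__MODULE__{buf: buf}), do: {[frame(IO.iodata_to_binary(buf))], new()}

  # Materialize to a binary only when at least one full frame is available.
  defp drain(buf, size, frames) when size >= @max_data do
    bin = IO.iodata_to_binary(buf)
    chunk = binary_part(bin, 0, @max_data)
    rest = binary_part(bin, @max_data, byte_size(bin) - @max_data)
    drain(rest, size - @max_data, [frame(chunk) | frames])
  end

  defp drain(buf, size, frames) do
    {Enum.reverse(frames), %__MODULE__{buf: buf, size: size}}
  end

  # Band 1 carries pack data.
  defp frame(chunk), do: [hex4(byte_size(chunk) + 5), 1, chunk]

  defp hex4(n) do
    n |> Integer.to_string(16) |> String.downcase() |> String.pad_leading(4, "0")
  end
end
```

In this sketch a write that lands exactly on the frame boundary leaves an empty buffer, so a later `flush/1` emits nothing.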
`UploadPackV2.feed/3`
Streaming counterpart to `feed/2`. ls-refs and multi-round ack responses still arrive as a single `write_fn` call (they’re small); packfile responses stream the prefix once, then push pack bytes through the sideband chunker, then write the trailing flush.
Tests
42 new assertions across three suites. All pass; full repo suite (942 tests) green.
- Writer streaming: streamed bytes are byte-identical to `Writer.generate/1` output for single-blob, many-mixed-object, and reader-roundtrip cases. `/3` accumulator threading verified.
- SidebandWriter: small writes coalesce into one frame; oversized writes split into multiple spec-compliant frames; exact-boundary writes drain cleanly without a partial frame.
- `UploadPackV2.feed/3`: ls-refs, multi-round acks, and full clone fetch responses are byte-identical to `feed/2`. A 70 KiB-blob fetch emits multiple chunks (not one giant binary), every chunk ≤ 65520 bytes.
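One way to check the frame-size property on captured output is to split the byte stream back into pkt-lines and inspect their declared lengths. The helper below is a hypothetical sketch of such a check (the module and function names are not from this PR's test suites), and it assumes the input is a well-formed sequence of pkt-lines.

```elixir
defmodule PktLineCheckSketch do
  # Illustrative sketch only; assumes well-formed pkt-line input.

  # Splits concatenated pkt-lines into :flush | :delim | :response_end
  # markers and {:data, declared_length, payload} tuples.
  def parse(<<>>), do: []
  def parse(<<"0000", rest::binary>>), do: [:flush | parse(rest)]
  def parse(<<"0001", rest::binary>>), do: [:delim | parse(rest)]
  def parse(<<"0002", rest::binary>>), do: [:response_end | parse(rest)]

  def parse(<<hex::binary-size(4), rest::binary>>) do
    len = String.to_integer(hex, 16)
    # The declared length includes the 4-byte prefix itself.
    body_len = len - 4
    <<body::binary-size(body_len), tail::binary>> = rest
    [{:data, len, body} | parse(tail)]
  end

  # Largest declared pkt-line length among data lines.
  def max_len(lines) do
    Enum.max(for {:data, len, _} <- lines, do: len)
  end

  # Concatenated payload of every frame on the given sideband.
  def band_payload(lines, band) do
    for {:data, _, <<^band, payload::binary>>} <- lines, into: <<>>, do: payload
  end
end
```

A test could then assert that `max_len/1` is at most 65520 and that `band_payload/2` for band 1 equals the pack produced by the non-streaming path.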
Memory impact
Before (per the existing `build_packfile_response/6`): peak heap ≈ pack size × 3 + object list, fragmented by refc binaries.
After (`feed/3` path): peak heap is bounded by the largest single zlib-compressed object plus a single sideband frame (≤ 65515 bytes). For Anvil’s prod fetch of fangorn/hephaestus (161 MB pack, 1445 objects, largest object well under 1 MB), this should drop the ~1.95 GB transient peak to well under 100 MB.
Consumer migration
Anvil’s SSH CLI will adopt `feed/3` in fangorn/anvil PR #135. Other consumers continue using `feed/2` until they want to opt in.
Acceptance
- Streamed pack output is byte-identical to non-streaming for matched inputs.
- Sideband-1 frames stay within spec (≤ 65515 data bytes per frame).
- Existing tests + new tests all pass (942 tests, 0 failures).
- No new `mix credo --strict` findings on the touched files.
Out of scope
- Caching to avoid rebuilding packs from scratch (mentioned in Anvil #214; tracked as a separate issue).
- Pack-build CPU cost (~16 s for the 161 MB pack); the algorithmic improvement is a separate PR.