Telemetry outcome :rolled_back conflates validation-phase reject with mid-commit rollback #35

open Opened by cole.christensen@gmail.com

Problem

The atomic-receive-pack telemetry span emits `%{outcome: :rolled_back}` in two different situations: when `atomic_reject_all/3` runs (validation failed, no storage writes happened), and when `atomic_commit_phase` actually performs a rollback after a mid-batch storage failure. Operators alerting on `:rolled_back` can't distinguish the two, even though one is a normal client error and the other is a storage-inconsistency alarm.

Fix plan

Split into three outcomes:

  • `:committed` — all refs applied, no failures
  • `:rejected_pre_commit` — validation rejected the batch, no writes were made
  • `:rolled_back` — Phase 2 partially wrote, rollback restored pre-flight snapshots

Also add a fourth, `:rollback_failed`, for the case where `rollback_refs` itself gets a `{:error, _}` back from `Ref.put` / `Ref.delete`. Currently that case fires a `[:receive_pack, :rollback_failed]` event; folding it into the same `atomic` span as an outcome value keeps alerting in one place.
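
A minimal sketch of the four-way split, to make the mapping concrete. The module name and the result tuple shapes are hypothetical (this issue does not show what the validation, commit, and rollback steps actually return); only the four outcome atoms come from the plan above.

```elixir
defmodule AtomicOutcomeSketch do
  @moduledoc false

  # Hypothetical result shapes, assumed for illustration only:
  #   :ok                                     every ref was applied
  #   {:error, {:validation, reason}}         rejected before any write
  #   {:error, {:commit, reason, rollback}}   a write failed mid-batch;
  #                                           `rollback` is the result of the
  #                                           rollback attempt that followed
  @type rollback_result :: :ok | {:error, term()}
  @type result ::
          :ok
          | {:error, {:validation, term()}}
          | {:error, {:commit, term(), rollback_result()}}

  @type outcome :: :committed | :rejected_pre_commit | :rolled_back | :rollback_failed

  # Maps a batch result to the outcome atom for the span's stop metadata.
  @spec outcome(result()) :: outcome()
  def outcome(:ok), do: :committed
  def outcome({:error, {:validation, _reason}}), do: :rejected_pre_commit
  def outcome({:error, {:commit, _reason, :ok}}), do: :rolled_back
  def outcome({:error, {:commit, _reason, {:error, _}}}), do: :rollback_failed
end
```

If the span is emitted with `:telemetry.span/3`, the atom returned here could be placed in the stop metadata, e.g. `{result, %{outcome: AtomicOutcomeSketch.outcome(result)}}`. Deriving the outcome in one function keeps the four paths from drifting apart.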

Acceptance criteria

  • Telemetry span metadata has an `outcome` field whose value is one of the four atoms above.
  • `telemetry_test.exs` asserts each outcome fires on its respective path.
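
One possible shape for the per-outcome assertions, written as a standalone script. Several things here are assumptions, not taken from the codebase: the event prefix `[:receive_pack, :atomic]`, the test module name, and `run_with_outcome/1`, which is a stand-in that emits the span directly. The real `telemetry_test.exs` would drive each receive-pack path instead. The script uses `Mix.install/1` (Elixir 1.12+) to fetch the `:telemetry` package.

```elixir
Mix.install([{:telemetry, "~> 1.0"}])
ExUnit.start()

defmodule OutcomeTelemetrySketchTest do
  use ExUnit.Case, async: false

  # Assumed event prefix; the issue does not state the real one.
  @prefix [:receive_pack, :atomic]

  # Stand-in for the code under test: emits a span whose stop metadata
  # carries the given outcome. Replace with a call into the real path.
  defp run_with_outcome(outcome) do
    :telemetry.span(@prefix, %{}, fn -> {:ok, %{outcome: outcome}} end)
  end

  setup do
    test_pid = self()
    handler_id = {__MODULE__, make_ref()}

    # Forward the outcome from every stop event to the test process.
    :ok =
      :telemetry.attach(
        handler_id,
        @prefix ++ [:stop],
        fn _event, _measurements, metadata, _config ->
          send(test_pid, {:outcome, metadata.outcome})
        end,
        nil
      )

    on_exit(fn -> :telemetry.detach(handler_id) end)
    :ok
  end

  # One generated test per outcome atom from the fix plan.
  for outcome <- [:committed, :rejected_pre_commit, :rolled_back, :rollback_failed] do
    test "stop event carries #{inspect(outcome)}" do
      run_with_outcome(unquote(outcome))
      assert_receive {:outcome, unquote(outcome)}
    end
  end
end
```

Telemetry handlers run in the process that emits the event, so forwarding the metadata with `send/2` lets each test assert on it with `assert_receive`.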

Context

Flagged in the audit of PR #20.