Elixir library for Git object storage and pack file operations

ExGitObjectstore

Pure Elixir git object store with pluggable storage backends.

Read, write, and manipulate git objects (blobs, trees, commits, tags), refs, and packfiles without requiring libgit2, the git CLI, or any NIF. All git data is stored through a pluggable storage backend — use the local filesystem, S3, or in-memory storage.

Features

  • Git objects — encode, decode, hash, read, and write blobs, trees, commits, and tags
  • Refs — branches, tags, HEAD, compare-and-swap updates
  • Packfiles — read and write .pack and .idx v2 files, delta resolution
  • Three-way merge — recursive tree merge with conflict detection
  • Diff engine — Myers diff algorithm with unified diff output and context hunks
  • Graph traversal — commit log, merge base (LCA) finding
  • Git wire protocol — pkt-line framing, upload-pack, receive-pack
  • Pluggable storage — filesystem, S3 (any S3-compatible service), and in-memory backends
  • ETS caching — LRU object cache with configurable size limits
  • Integrity verification — fsck with full (SHA re-hash) and quick (refs-only) modes
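
To make the "hash" part of the first bullet concrete, here is a minimal sketch of how a git object ID is derived, using only the Erlang standard library. The module name `GitHashSketch` is hypothetical and is not part of ExGitObjectstore; the library's own encoding functions may be organised differently.

```elixir
defmodule GitHashSketch do
  @moduledoc "Illustrative only; not part of ExGitObjectstore."

  @types ["blob", "tree", "commit", "tag"]

  # Git hashes the header "<type> <byte size>\0" followed by the raw content.
  def object_id(type, content) when type in @types and is_binary(content) do
    header = "#{type} #{byte_size(content)}\0"

    :sha
    |> :crypto.hash([header, content])
    |> Base.encode16(case: :lower)
  end
end

# The empty blob has a well-known ID:
IO.puts(GitHashSketch.object_id("blob", ""))
# → e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

A full fsck re-hash, as described in the last bullet, amounts to recomputing this value for each stored object and comparing it with the ID the object is stored under.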

Installation

Add ex_git_objectstore to your list of dependencies in mix.exs:

def deps do
  [
    {:ex_git_objectstore, "~> 0.1.0"}
  ]
end

Quick Start

alias ExGitObjectstore.{Repo, Object}
alias ExGitObjectstore.Object.{Blob, Commit, Tree}

# Create a repo with in-memory storage
repo = %Repo{
  id: "my-repo",
  storage: ExGitObjectstore.Storage.Memory,
  storage_config: %{}
}

# Initialize and write objects
:ok = ExGitObjectstore.init(repo)

{:ok, blob_sha} = Object.write(repo, Blob.from_content("Hello, world!\n"))

{:ok, tree_sha} =
  Object.write(repo, Tree.new([
    %{mode: "100644", name: "README.md", sha: blob_sha}
  ]))

{:ok, commit_sha} =
  Object.write(repo, %Commit{
    tree: tree_sha,
    parents: [],
    author: "Alice <alice@example.com> 1700000000 +0000",
    committer: "Alice <alice@example.com> 1700000000 +0000",
    message: "Initial commit"
  })

:ok = ExGitObjectstore.create_branch(repo, "main", commit_sha)

# Read it back
{:ok, {^commit_sha, commit}} = ExGitObjectstore.commit(repo, "main")
{:ok, tree} = ExGitObjectstore.tree(repo, "main")
{:ok, content} = ExGitObjectstore.blob(repo, blob_sha)

Storage Backends

Filesystem

repo = %Repo{
  id: "my-repo",
  storage: ExGitObjectstore.Storage.Filesystem,
  storage_config: %{root: "/var/git/repos/my-repo"}
}

Uses the standard git loose object layout with atomic writes (temp file + rename) and lock-file compare-and-swap for ref updates.
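
The loose object layout and the temp-file-plus-rename write can be sketched with the standard library alone. This is an illustration of the general technique, not the backend's actual code: `LooseWriteSketch`, the `objects/` directory under the root, and the temp-file naming are assumptions.

```elixir
defmodule LooseWriteSketch do
  @moduledoc "Illustrative only; not part of ExGitObjectstore."

  # Standard loose layout: objects/<first 2 hex chars>/<remaining 38 hex chars>.
  def loose_path(root, <<dir::binary-size(2), rest::binary-size(38)>>) do
    Path.join([root, "objects", dir, rest])
  end

  # Write to a temp file in the destination directory, then rename it into
  # place, so readers never observe a partially written object.
  def atomic_write(path, data) do
    dir = Path.dirname(path)
    File.mkdir_p!(dir)

    tmp = Path.join(dir, ".tmp-#{System.unique_integer([:positive])}")
    File.write!(tmp, data)
    File.rename!(tmp, path)
    :ok
  end
end

# Usage: store the empty blob under a scratch directory.
root = Path.join(System.tmp_dir!(), "loose-sketch-#{System.unique_integer([:positive])}")
sha = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
path = LooseWriteSketch.loose_path(root, sha)

# Loose objects are the zlib-compressed "<type> <size>\0<content>" bytes.
:ok = LooseWriteSketch.atomic_write(path, :zlib.compress("blob 0\0"))
```

Keeping the temp file in the same directory as the final path matters because a rename is only atomic within a single filesystem.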

S3

repo = %Repo{
  id: "my-repo",
  storage: ExGitObjectstore.Storage.S3,
  storage_config: %{
    bucket: "my-git-bucket",
    prefix: "repos/my-repo",
    ex_aws_config: [
      access_key_id: "...",
      secret_access_key: "...",
      region: "us-east-1"
    ]
  }
}

Works with AWS S3, MinIO, or any S3-compatible service. Handles pagination for repositories with many objects.

Memory

repo = %Repo{
  id: "my-repo",
  storage: ExGitObjectstore.Storage.Memory,
  storage_config: %{}
}

Stores everything in an ETS table. Useful for testing and ephemeral operations.
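
As a rough picture of how an ETS table can back compare-and-swap ref updates, the sketch below uses `:ets.insert_new/2` and `:ets.select_replace/2`. The module `EtsRefSketch`, the `{name, sha}` row shape, and the return values are assumptions made for illustration; the Memory backend's real table layout is not documented here.

```elixir
defmodule EtsRefSketch do
  @moduledoc "Illustrative only; not part of ExGitObjectstore."

  def new, do: :ets.new(:refs_sketch, [:set, :public])

  # Create a ref only if it does not exist yet.
  def create(table, name, sha) do
    if :ets.insert_new(table, {name, sha}), do: :ok, else: {:error, :exists}
  end

  # Replace the ref only if it still holds the expected value.
  # select_replace/2 performs the match and the replacement atomically.
  def compare_and_swap(table, name, expected, new) do
    spec = [{{name, expected}, [], [{:const, {name, new}}]}]

    case :ets.select_replace(table, spec) do
      1 -> :ok
      0 -> {:error, :stale}
    end
  end

  def get(table, name) do
    case :ets.lookup(table, name) do
      [{^name, sha}] -> {:ok, sha}
      [] -> :error
    end
  end
end
```

A caller that loses the race receives `{:error, :stale}` and can re-read the ref before retrying.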

Git Protocol v2 Capability Matrix

UploadPackV2 and ReceivePack are validated against a real git CLI via the integration test suite. The tables below list every capability and sub-argument the protocol v2 spec defines, along with the current implementation status.

UploadPack

| Capability / argument | Status | Notes |
| --- | --- | --- |
| version 2 | ✅ supported | Capability advert emitted on connect. |
| ls-refs | ✅ supported | ref-prefix, symrefs, peel, unborn all honoured. HEAD advertised as a symref when requested. Annotated tags peel to a recursive target (depth capped at 10). |
| fetch=shallow | ✅ supported | deepen <n>, deepen-since <ts>, deepen-not <ref>, deepen-relative, shallow <sha> all honoured. shallow-info section emitted when the walker surfaces any new / unshallow boundary. |
| fetch=wait-for-done | ✅ supported | Client --negotiate-only works end-to-end. |
| fetch=filter (partial) | ✅ supported | blob:none, blob:limit=<n>[k\|m\|g], tree:<n>, object:type=<t>, sparse:oid=<oid>, combine:a+b+…. Invalid specs are rejected with a band-3 ERR reply. Lazy promisor fetches (blob/tree wants) bypass the filter. |
| server-option | ✅ advertised | Server accepts server-option <value> lines in fetch requests; currently a no-op (parsed but not acted on). |
| object-format=sha1 | ✅ supported | Only SHA-1 is supported. SHA-256 is out of scope. |
| Multi-round negotiation | ✅ supported | Client may send haves across multiple rounds without done; stateful session persists until packfile is sent. |
| packfile-uris | ❌ not supported | No CDN-offloaded pack URIs. |
| wanted-refs | ❌ not supported | Server never emits a wanted-refs section. |
| Protocol v0 / v1 | ❌ not supported | Clients must request v2 (git 2.26+ does by default). |
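
To illustrate the shape of the filter specs in the fetch=filter row, here is a small parser sketch covering three of the listed forms. It is not the library's parser: the module `FilterSpecSketch`, the returned tuples, and the `:invalid_filter` error are hypothetical, and the k/m/g suffixes are assumed to be case-insensitive powers of 1024, as in git itself.

```elixir
defmodule FilterSpecSketch do
  @moduledoc "Illustrative only; not part of ExGitObjectstore."

  def parse("blob:none"), do: {:ok, :blob_none}
  def parse("blob:limit=" <> size), do: parse_limit(size)

  def parse("tree:" <> depth) do
    case Integer.parse(depth) do
      {n, ""} when n >= 0 -> {:ok, {:tree_depth, n}}
      _ -> {:error, :invalid_filter}
    end
  end

  # Anything else is rejected; a server would turn this into an ERR reply.
  def parse(_other), do: {:error, :invalid_filter}

  # Accepts "<n>" optionally followed by a single k, m, or g suffix.
  defp parse_limit(text) do
    with {n, suffix} when n >= 0 <- Integer.parse(text),
         {:ok, factor} <- factor(String.downcase(suffix)) do
      {:ok, {:blob_limit, n * factor}}
    else
      _ -> {:error, :invalid_filter}
    end
  end

  defp factor(""), do: {:ok, 1}
  defp factor("k"), do: {:ok, 1024}
  defp factor("m"), do: {:ok, 1024 * 1024}
  defp factor("g"), do: {:ok, 1024 * 1024 * 1024}
  defp factor(_), do: :error
end

IO.inspect(FilterSpecSketch.parse("blob:limit=1k"))
# → {:ok, {:blob_limit, 1024}}
```

The remaining forms (object:type, sparse:oid, combine) would follow the same pattern, with combine splitting on "+" and parsing each part recursively.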

ReceivePack

| Capability | Status | Notes |
| --- | --- | --- |
| report-status | ✅ supported | Per-ref ok / ng lines returned. |
| delete-refs | ✅ supported | Pushing an all-zero SHA as the new value deletes the ref. |
| atomic | ✅ supported | Two-phase validate-then-commit with rollback on mid-batch failure. Atomicity is best-effort at the storage layer; see Documentation/atomic.md. |
| report-status-v2 | ❌ not supported | Only v1 status lines are emitted. |
| side-band-64k | ❌ not supported | Status goes in-band. |
| quiet | ❌ not supported | Ignored; hooks still fire. |
| ofs-delta (write-side) | ❌ not supported | Pack writer stores full objects only; thin-pack reading via REF_DELTA is fully supported. |
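
Both services exchange data as pkt-lines, and receive-pack commands distinguish creates, updates, and deletes by the all-zero SHA. The sketch below shows the framing and that classification under the rules of the git protocol; the module `PktLineSketch` and its return shapes are hypothetical and do not mirror the library's API.

```elixir
defmodule PktLineSketch do
  @moduledoc "Illustrative only; not part of ExGitObjectstore."

  @max_payload 65_516
  @zero_sha String.duplicate("0", 40)

  # A pkt-line is a 4-digit hex length (which counts itself) plus the payload.
  def encode(payload) when is_binary(payload) and byte_size(payload) <= @max_payload do
    len = byte_size(payload) + 4

    hex =
      len
      |> Integer.to_string(16)
      |> String.downcase()
      |> String.pad_leading(4, "0")

    hex <> payload
  end

  # "0000" is the flush packet that ends a section.
  def flush, do: "0000"

  def decode(<<"0000", rest::binary>>), do: {:flush, rest}

  def decode(<<hex::binary-size(4), rest::binary>>) do
    with {len, ""} when len >= 4 <- Integer.parse(hex, 16),
         payload_len = len - 4,
         <<payload::binary-size(payload_len), tail::binary>> <- rest do
      {:ok, payload, tail}
    else
      _ -> {:error, :malformed}
    end
  end

  def decode(_), do: {:error, :malformed}

  # A receive-pack command is "<old-sha> <new-sha> <refname>".
  def classify(@zero_sha, _new), do: :create
  def classify(_old, @zero_sha), do: :delete
  def classify(_old, _new), do: :update
end

IO.inspect(PktLineSketch.encode("a\n"))
# → "0006a\n"
```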

Known Limitations

  • ofs-delta generation — pack writer stores full objects only, no delta compression on the write side.
  • Atomic ref updates are validation-based with rollback from a pre-flight snapshot, not a native multi-key storage transaction. A VM crash mid-rollback can leave refs partially applied. Failures are logged via Logger.error and emitted as a [:ex_git_objectstore, :protocol, :receive_pack, :rollback_failed] telemetry event.
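
The validate-then-commit approach with snapshot rollback can be sketched as follows. This is a simplified model, not the library's implementation: an `Agent` holding a map stands in for the storage backend, and `AtomicRefsSketch`, the `{name, expected_old, new}` update shape, and the injectable `write_fun` are all hypothetical.

```elixir
defmodule AtomicRefsSketch do
  @moduledoc "Illustrative only; not part of ExGitObjectstore."

  # updates is a list of {ref_name, expected_old_sha, new_sha} tuples.
  def apply_all(store, updates, write_fun \\ &write/3) do
    # Pre-flight snapshot used both for validation and for rollback.
    snapshot = Agent.get(store, & &1)

    with :ok <- validate(snapshot, updates) do
      commit(store, updates, snapshot, write_fun)
    end
  end

  # Phase 1: every ref must still hold the value the client expects.
  defp validate(snapshot, updates) do
    Enum.find_value(updates, :ok, fn {name, expected, _new} ->
      if Map.get(snapshot, name) == expected, do: nil, else: {:error, {:stale, name}}
    end)
  end

  # Phase 2: apply writes one by one; on failure, restore the snapshot.
  defp commit(store, updates, snapshot, write_fun) do
    Enum.reduce_while(updates, :ok, fn {name, _old, new}, :ok ->
      case write_fun.(store, name, new) do
        :ok ->
          {:cont, :ok}

        {:error, reason} ->
          Agent.update(store, fn _ -> snapshot end)
          {:halt, {:error, {:rolled_back, name, reason}}}
      end
    end)
  end

  def write(store, name, sha), do: Agent.update(store, &Map.put(&1, name, sha))
end
```

In this model the rollback is a single in-memory assignment, so it cannot fail halfway. Against real storage the rollback is itself a series of writes, which is why a crash in the middle of it can leave refs partially applied.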

License

Copyright 2026 Cole Christensen

Licensed under the Apache License, Version 2.0. See LICENSE for details.