Git Is Not Just a Backup Tool. Here Is What It Actually Is.
by Eric Hanson, Backend Developer at Clean Systems Consulting
The Save Button Misconception
Your team pushes code at the end of the day so it doesn't get lost. Someone asks "did you push?" and they mean "is your work safe somewhere?" That is a perfectly reasonable use of Git — and also a sign that the team is leaving most of Git's value on the table.
Git is not a backup tool. Backups store the current state of something. Git stores a complete, navigable history of how something became what it is. That distinction sounds academic until you're three weeks into a regression that nobody can explain, or you need to understand why a critical business rule was written the way it was.
What Git Actually Models
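Conceptually, Git stores snapshots, not diffs. Every commit is a complete picture of the repository at that point in time, stored as a tree of objects. The efficiency comes from the fact that unchanged files are represented as pointers to existing objects, not re-copied data. (Packfiles do delta-compress objects on disk, but that is a storage optimization underneath the snapshot model, not the model itself.) The working model looks like this: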
Blob objects → individual file contents
Tree objects → directory structures pointing to blobs
Commit objects → a tree + parent commit(s) + metadata
Those commit objects form a directed acyclic graph (DAG). Each commit points to its parent or parents: a merge commit has more than one, and the root commit has none. Branches are just named pointers to commits. Tags are pointers too, but by convention they stay put once created. HEAD is a pointer to the currently checked-out commit or branch.
# This is what a commit object actually contains
git cat-file -p HEAD
# tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
# parent a1b2c3d4e5f6...
# author Eric Hanson <eric@example.com> 1714041600 +0700
# committer Eric Hanson <eric@example.com> 1714041600 +0700
#
# Add payment idempotency key validation
This graph structure is why git log --graph shows branching lines. It is literally drawing the shape of the DAG.
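To see the three object types for yourself, here is a minimal sketch that builds a throwaway repository and asks Git what kind of object sits at each level. The file name `src/app.py` and the demo identity are placeholders invented for this example.

```shell
set -eu
repo=$(mktemp -d)                    # throwaway repository; the path is arbitrary
cd "$repo"
git init -q
git config user.name "Demo User"     # hypothetical identity for the demo commit
git config user.email "demo@example.com"

mkdir src
echo 'print("hello")' > src/app.py   # hypothetical file, only here to have content
git add src/app.py
git commit -q -m "Add app entry point"

# Ask Git for the type of one object at each level of the model
commit_type=$(git cat-file -t HEAD)            # the commit itself
tree_type=$(git cat-file -t 'HEAD^{tree}')     # the root directory snapshot
blob_type=$(git cat-file -t HEAD:src/app.py)   # the contents of a single file

echo "$commit_type / $tree_type / $blob_type"

# The root tree lists its entries; the src directory is itself a tree object
git ls-tree HEAD
```

`git ls-tree` prints one line per entry with its mode, type, object ID, and name, which is the tree-points-to-blobs structure described above.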
Why This Changes How You Use Git
Once you understand Git as a graph database of snapshots, several things click into place.
Branching is cheap because branches are just pointers. Creating a branch in Git is effectively instant regardless of repository size. In older VCS tools like SVN, a branch was a directory copy on the server, and merging it back was painful enough that people avoided branching altogether. That shaped team behavior. In Git, the cost of a branch is a 41-byte file: a 40-character SHA-1 commit ID plus a newline. There is no reason not to branch.
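You can check the pointer claim directly. This sketch assumes the default on-disk ref storage, where a new branch is written as a loose file under `.git/refs/heads/`; repositories using packed refs or a different ref backend will not show that file. The branch name `experiment` is made up for the demo.

```shell
set -eu
repo=$(mktemp -d)                    # throwaway repository; the path is arbitrary
cd "$repo"
git init -q
git config user.name "Demo User"     # hypothetical identity for the demo commit
git config user.email "demo@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

git branch experiment                # create a branch: no files are copied

ref_file=.git/refs/heads/experiment  # assumes the default loose-ref layout
ref_content=$(cat "$ref_file")
ref_bytes=$(wc -c < "$ref_file" | tr -d ' ')
head_sha=$(git rev-parse HEAD)

echo "ref content: $ref_content"
echo "HEAD:        $head_sha"
echo "ref size:    $ref_bytes bytes"
```

With the SHA-1 object format the size should come out as the 40-character ID plus a trailing newline. A repository created with SHA-256 would show a longer ID.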
Merging is about graph reconciliation, not file copying. When you merge two branches, Git finds their common ancestor commit (called the merge base), then combines the changes each branch made relative to that ancestor. This is a three-way merge. Merge conflicts happen when both branches changed the same region of a file in different ways relative to that ancestor. Git is not confused: it found two valid but contradictory sets of instructions and is asking you to decide.
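Here is a small sketch of the merge-base idea, using a throwaway repository in which `main` and `feature` each add one commit after forking. The file names are placeholders.

```shell
set -eu
repo=$(mktemp -d)                    # throwaway repository; the path is arbitrary
cd "$repo"
git init -q
git config user.name "Demo User"     # hypothetical identity for the demo commits
git config user.email "demo@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Base commit"
git branch -M main                   # normalize the branch name for the demo
fork_point=$(git rev-parse HEAD)

git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Add feature file"

git checkout -q main
echo "main work" > main.txt
git add main.txt
git commit -q -m "Add main file"

# The common ancestor Git will use as the reference point for the merge
merge_base=$(git merge-base main feature)
echo "merge base: $merge_base"

# What each side changed relative to that ancestor
git diff --name-only "$merge_base" main
git diff --name-only "$merge_base" feature

# The two sides touched different files, so the merge completes without conflicts
git merge -q --no-edit feature
parent_count=$(git show -s --format=%P HEAD | wc -w | tr -d ' ')
echo "merge commit parents: $parent_count"
```

The resulting merge commit records both branch tips as parents, which is how the graph remembers that two lines of work were reconciled.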
Rebasing rewrites graph history. This is why rebasing a shared branch is dangerous. You are not moving commits — you are creating new commits with new SHAs that contain the same changes but different parents. Anyone else with the old commits in their local history now has a diverged graph.
# Before rebase: your branch has commits D and E on top of B
A - B - C            (main)
     \
      D - E          (feature)
# After rebase onto main: D and E are replaced with D' and E' on top of C
A - B - C            (main)
         \
          D' - E'    (feature)
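D and E still exist in the object store until garbage collection prunes them. The reflog keeps pointing at them in the meantime, which is how you recover from a rebase that went wrong. By default, reflog entries for commits that are no longer reachable from the branch expire after 30 days.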
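The following sketch reproduces a one-commit version of that rebase in a throwaway repository and checks both claims: the rebased commit gets a new ID, and the old one is still reachable through the reflog. File names and commit messages are placeholders.

```shell
set -eu
repo=$(mktemp -d)                    # throwaway repository; the path is arbitrary
cd "$repo"
git init -q
git config user.name "Demo User"     # hypothetical identity for the demo commits
git config user.email "demo@example.com"

echo "a" > a.txt
git add a.txt
git commit -q -m "A"
git branch -M main                   # normalize the branch name for the demo

git checkout -q -b feature
echo "d" > d.txt
git add d.txt
git commit -q -m "D"
old_tip=$(git rev-parse HEAD)        # commit D, before the rebase

git checkout -q main
echo "c" > c.txt
git add c.txt
git commit -q -m "C"

git checkout -q feature
git rebase -q main
new_tip=$(git rev-parse HEAD)        # commit D', after the rebase

echo "before: $old_tip"
echo "after:  $new_tip"

# Same change (same file touched), different commit object
old_files=$(git diff-tree --no-commit-id --name-only -r "$old_tip")
new_files=$(git diff-tree --no-commit-id --name-only -r "$new_tip")

# The old commit is still in the object store, and the reflog remembers it
if git cat-file -e "$old_tip"; then
  echo "old commit still exists"
fi
recovered=$(git rev-parse 'feature@{1}')
echo "previous feature tip per reflog: $recovered"
```

If the rebase had been a mistake, `git reset --hard 'feature@{1}'` would put the branch back on the original commit.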
The Audit Trail Nobody Thinks About Until They Need It
Because Git stores the full graph with author metadata, timestamps, and messages, it functions as an audit trail for every change ever made to the codebase. The question is whether that audit trail is useful or noise.
A repository where every commit message is "fix" or "WIP" has the mechanical structure of an audit trail but none of the value. You cannot answer "why does this code check for null here?" by looking at commit a3f9d2: fix.
A repository with disciplined commits can answer questions like:
- Which change introduced this performance regression? (git bisect)
- What was the intended behavior of this function before it was changed? (git log -p -- path/to/file)
- Who made this decision and when? (git blame, then git show <sha>)
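The first of those questions can be automated. Below is a minimal sketch of `git bisect run` against a throwaway repository with five commits, where the third one introduces a stand-in "regression" (a file whose content flips from `fast` to `slow`). The file names and the check are invented for the demo; in a real repository the check would be your test or benchmark command.

```shell
set -eu
repo=$(mktemp -d)                    # throwaway repository; the path is arbitrary
cd "$repo"
git init -q
git config user.name "Demo User"     # hypothetical identity for the demo commits
git config user.email "demo@example.com"

# Linear history of five commits; commit 3 introduces the stand-in regression
for i in 1 2 3 4 5; do
  if [ "$i" -ge 3 ]; then
    echo "slow" > mode.txt
  else
    echo "fast" > mode.txt
  fi
  echo "change $i" >> log.txt
  git add mode.txt log.txt
  git commit -q -m "Change $i"
  if [ "$i" -eq 1 ]; then good=$(git rev-parse HEAD); fi
  if [ "$i" -eq 3 ]; then culprit=$(git rev-parse HEAD); fi
done

# Mark the current commit as bad and the first commit as good
git bisect start HEAD "$good" >/dev/null

# The check exits 0 on good commits and 1 on bad ones
git bisect run sh -c '! grep -q slow mode.txt'

first_bad=$(git rev-parse refs/bisect/bad)   # where bisect ended up
git bisect reset >/dev/null 2>&1

echo "first bad commit: $first_bad"
git show -s --format='%h %s' "$first_bad"
```

Bisect only works this cleanly when each commit is a coherent, buildable unit, which is another argument for disciplined commits.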
These are not hypothetical. They are the questions you ask during incidents, during onboarding, and during security audits.
What Git Is Not Good At
Being precise about Git's model means being honest about its limits.
Git does not handle large binary files well. The object model stores a complete snapshot of every version of every file, and binaries such as PSDs compress and delta poorly. A 50 MB PSD file committed ten times can add close to 500 MB to your object store. Git LFS (Large File Storage) addresses this by storing a small pointer file in the repo and the binary itself separately, but it adds operational overhead.
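A scaled-down sketch of the problem: commit two versions of a 1 MiB file of random bytes (standing in for a PSD, since random data is about as incompressible as a binary gets) and look at what the object store holds. The file name `asset.bin` is a placeholder.

```shell
set -eu
repo=$(mktemp -d)                    # throwaway repository; the path is arbitrary
cd "$repo"
git init -q
git config user.name "Demo User"     # hypothetical identity for the demo commits
git config user.email "demo@example.com"

# Version 1 of the stand-in binary asset
head -c 1048576 /dev/urandom > asset.bin
git add asset.bin
git commit -q -m "Add asset v1"

# Version 2: entirely new content, as a re-exported binary often is
head -c 1048576 /dev/urandom > asset.bin
git add asset.bin
git commit -q -m "Update asset v2"

blob_v1=$(git rev-parse HEAD~1:asset.bin)
blob_v2=$(git rev-parse HEAD:asset.bin)

# Disk usage of loose objects, in KiB
loose_kib=$(git count-objects -v | awk '/^size:/ {print $2}')

echo "v1 blob: $blob_v1"
echo "v2 blob: $blob_v2"
echo "loose object size: ${loose_kib} KiB"
```

Both versions are kept as separate blobs, so the store holds roughly the sum of their sizes even though the working tree only ever contains one file.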
Git is not a deployment tool. The fact that you can check out any commit does not mean that commit is deployable. That's what CI/CD pipelines are for.
Git is not a database. Using it as one — storing generated artifacts, runtime data, or secrets — creates problems that compound over time.
The Practical Shift
Stop thinking of git push as saving your work. Think of it as publishing a set of documented decisions to a shared graph that your entire team can navigate, reason about, and build on.
That shift changes what a commit means. It is not "the state of my files at 5pm." It is a unit of reasoning — a logical change with a documented intent, permanently recorded in a structure that enables anyone who comes after you to understand what happened and why.
Start there: next time you commit, ask whether someone reading that commit in two years would understand not just what changed, but why.