Layer Caching in Docker Is a Big Deal and Most Devs Ignore It

by Arif Ikhsanudin, Backend Developer

Ten developers, eighty build minutes a day

Multiply ten developers by four Docker builds each per day at two minutes a build and you get 80 developer-minutes daily on image builds. Multiply that by 250 working days and you're at 333 developer-hours annually — roughly eight developer-weeks — sitting at a terminal watching progress bars. This is a common situation on teams that have never deliberately optimized their Dockerfile layer order, and the fix takes an afternoon.

The mechanism at the center of this is Docker's layer cache. Understanding it properly changes how you write Dockerfiles.

What a layer is and when Docker caches it

Every instruction in a Dockerfile that modifies the filesystem creates a new layer: RUN, COPY, ADD. Before executing an instruction, Docker checks its cache for a layer built from the same parent layer and the same inputs. For RUN, the input is the instruction text itself; for COPY and ADD, it also includes a checksum of the files being copied. On a hit, Docker reuses the cached layer: no execution, no I/O, near-instant. On a miss, Docker executes the instruction and creates a new layer, and every subsequent instruction misses too, because downstream layers may depend on what this one produced.

This cascading invalidation is the behavior most developers don't internalize. A single cache miss early in the Dockerfile forces every subsequent instruction to re-execute, even if nothing those instructions depend on has changed.

FROM python:3.12-slim
WORKDIR /app
# Cache miss here on any source change
COPY . .
# Always re-runs, even if requirements.txt is unchanged
RUN pip install -r requirements.txt
# Always re-runs (--noinput avoids the interactive prompt during a build)
RUN python manage.py collectstatic --noinput

Every time any source file changes, whether a comment in a view or a blank line in a model, pip reinstalls all dependencies from scratch. On a project with 150 Python packages, that can add around 90 seconds to every build.

Ordering layers by change frequency

The fix is simple to state: put instructions that change rarely near the top, instructions that change often near the bottom.

FROM python:3.12-slim
WORKDIR /app
# Only changes when dependencies change
COPY requirements.txt .
# Cached until requirements.txt changes
RUN pip install --no-cache-dir -r requirements.txt
# Changes with every source edit
COPY . .
RUN python manage.py collectstatic --noinput

Now pip install is cached as long as requirements.txt doesn't change. A source edit only replays the last two instructions. The savings are proportional to how expensive the cached layer is — package installation, compilation, asset processing.

A taxonomy of layers by cache stability

Not all layers are equal. In practice, layers fall into rough stability tiers:

Highly stable (cache for weeks or months):

  • Base image pulls — FROM node:20-alpine
  • Global tool installation — system packages that your app runtime depends on
  • Build tool installation — compilers, build utilities

Moderately stable (cache for days, invalidated by dep changes):

  • Dependency manifest copy — COPY package.json package-lock.json ./
  • Dependency installation — npm ci, pip install, mvn dependency:go-offline

Volatile (invalidated on nearly every build):

  • Application source — COPY src/ ./src/
  • Generated artifacts — compiled output, static assets

The ideal Dockerfile mirrors this ordering: stable at top, volatile at bottom.
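
As a rough sketch, the three tiers above could map onto a Node.js Dockerfile like the one below. The native build tools, the src/ directory, and the "build" script in package.json are assumptions for illustration; adjust or drop them to match the actual project.

```dockerfile
# Highly stable: base image
FROM node:20-alpine
WORKDIR /app

# Highly stable: build tools for native modules
# (assumption: the project needs them; drop this step otherwise)
RUN apk add --no-cache python3 make g++

# Moderately stable: dependency manifests, then installation
COPY package.json package-lock.json ./
RUN npm ci

# Volatile: application source, then generated artifacts
COPY src/ ./src/
# Assumption: package.json defines a "build" script
RUN npm run build
```

With this ordering, an edit under src/ replays only the last two instructions, while a change to package-lock.json replays everything from the manifest copy down.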

The gotcha with multi-module projects

In a Maven multi-module project, copying only pom.xml from the root isn't enough — child module POMs also need to be present before mvn dependency:go-offline can resolve the full dependency graph.

FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app

# Copy all POMs first to allow dependency resolution
COPY pom.xml .
COPY module-api/pom.xml module-api/
COPY module-core/pom.xml module-core/
COPY module-web/pom.xml module-web/

RUN mvn dependency:go-offline -q

# Now copy source
COPY module-api/src module-api/src
COPY module-core/src module-core/src
COPY module-web/src module-web/src

RUN mvn package -DskipTests -q

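Verbose, but each module's source is its own layer, and the dependency layer stays cached as long as no POM changes. If only module-web changes, only its source layer and the final package step replay. A change in module-api also replays the COPY layers after it, but those are cheap; the expensive dependency download is still skipped.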

BuildKit's cache mounts: persistent caches across builds

BuildKit adds a more powerful tool: --mount=type=cache. This mounts a persistent cache directory that survives between builds on the same machine, separate from the layer cache.

FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app
COPY pom.xml .
RUN --mount=type=cache,target=/root/.m2 \
    mvn dependency:go-offline -q
COPY src ./src
RUN --mount=type=cache,target=/root/.m2 \
    mvn package -DskipTests -q

The Maven local repository at /root/.m2 is preserved between builds. Even when the layer cache misses (because pom.xml changed), Maven finds its previously downloaded JARs in the mounted cache and only downloads what's new. On a large project this can cut dependency resolution from about 3 minutes to under 30 seconds, even on a full layer-cache miss.

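The tradeoff: this cache is local to the builder on one machine, so ephemeral CI runners that start from a clean state won't benefit. For CI, use registry-based caching instead: docker buildx build accepts --cache-to and --cache-from with type=registry, which stores the layer cache in an image registry that every runner can pull from.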
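
The same technique carries over to the Python example from earlier. The following is a minimal sketch, assuming the build runs as root so that pip's default cache directory is /root/.cache/pip; if the image runs pip as another user, the target path would need to change.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
# Assumption: pip runs as root, so its default cache lives in /root/.cache/pip.
# --no-cache-dir is intentionally omitted so pip reads and writes the mounted cache.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
COPY . .
RUN python manage.py collectstatic --noinput
```

The cache mount is not part of the resulting image, so keeping pip's cache this way does not increase image size.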

Cache keys and COPY precision

Docker computes the cache key for a COPY instruction using the checksum of all copied files. COPY . . checksums your entire build context, meaning everything in the directory that .dockerignore doesn't exclude. If any of those files changes, even a README.md, the cache is invalidated.

Be precise with COPY:

# Broad — invalidated by README, .gitignore, test files, anything
COPY . .

# Precise — only invalidated when business logic changes
COPY src/main/ ./src/main/

This is particularly impactful when you have test directories, documentation, or tooling config that has no bearing on the runtime image. Copy only the source that the RUN instruction actually needs.
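
Applied to the Python example from earlier, a precise COPY might look like the sketch below. The layout is hypothetical: a manage.py at the root plus a single project package named myproject. A real project would list its own directories.

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Hypothetical layout: manage.py plus one project package named "myproject"
COPY manage.py .
COPY myproject/ ./myproject/
# Edits to README.md, docs/, or tests/ no longer invalidate this step
RUN python manage.py collectstatic --noinput
```

A .dockerignore that excludes documentation and tests has a similar effect on COPY . ., since excluded files never enter the build context and so never contribute to the checksum.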

How to verify your cache is working

Run the build twice and observe the output. With the classic (pre-BuildKit) builder, which newer Docker versions only use when DOCKER_BUILDKIT=0 is set:

DOCKER_BUILDKIT=0 docker build . 2>&1 | grep -E "Step|Using cache"

A step followed by "Using cache" was served from cache. If you're changing only a source file and see dependency installation steps without it, your ordering is wrong.

With BuildKit's --progress=plain:

docker buildx build --progress=plain . 2>&1 | grep -E "CACHED|[0-9]+\.[0-9]+s"

Each step shows either CACHED or its execution time. The slow steps that aren't marked CACHED are your optimization targets.

The one thing to act on today

Open your team's most-used Dockerfile. Look at where the source code copy (COPY . . or similar) falls relative to the dependency installation step. If source is copied before dependencies are installed, swap them. Add a .dockerignore if it's missing. That's the change — it takes 10 minutes and the benefits accumulate on every build from that point forward.
