Multi-Stage Builds: The Dockerfile Trick That Shrinks Your Image
by Eric Hanson, Backend Developer at Clean Systems Consulting
The image that ships your compiler
Your Go service is a single statically linked binary. It's 18MB. Your Docker image is 800MB, because the Dockerfile uses golang:1.22 as its only base image: the full Go toolchain, standard library sources, and a Debian userland all ship in the final image alongside your 18MB binary.
This is the problem multi-stage builds solve. They've been available since Docker 17.05 (released in 2017) and are well documented, yet a large share of production Dockerfiles still don't use them. If you're shipping build tools in your runtime image, this is the fix.
How multi-stage builds work
A Dockerfile can declare multiple FROM instructions. Each FROM starts a new stage. Stages can copy artifacts from each other with COPY --from=<stage>. Only the final stage becomes the image that gets tagged and pushed.
# Stage 1: build
FROM golang:1.22-alpine AS build
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o server ./cmd/server
# Stage 2: runtime
FROM scratch
COPY --from=build /app/server /server
ENTRYPOINT ["/server"]
The build stage pulls the full Go toolchain, downloads modules, and compiles the binary. The final stage starts from scratch, an empty base with no operating system at all. We copy only the compiled binary from the build stage; the Go toolchain never appears in the final image.
Result: a runtime image containing exactly one file. For a typical Go service, that's 15–25MB versus 700–900MB for the equivalent single-stage build.
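One caveat with scratch: it ships no CA certificates or timezone data, so a binary that makes outbound HTTPS calls will fail TLS verification. A minimal sketch of the fix, assuming the Alpine build stage above (the apk package name is Alpine's):

```dockerfile
FROM golang:1.22-alpine AS build
RUN apk add --no-cache ca-certificates
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o server ./cmd/server

FROM scratch
# Go's TLS stack reads this bundle for outbound certificate verification
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /app/server /server
ENTRYPOINT ["/server"]
```

The same COPY --from trick that moves the binary also moves any system file the runtime needs, one explicit path at a time.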
The Java equivalent
Java can't use scratch because the application needs a JVM at runtime. But we can still separate the build environment (full JDK + Maven/Gradle) from the runtime (JRE only):
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline -q
COPY src ./src
RUN mvn package -DskipTests -q
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
COPY --from=build /app/target/app.jar app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
eclipse-temurin:17-jre-alpine is roughly 180MB; maven:3.9-eclipse-temurin-17 is roughly 540MB. The difference is the JDK compiler toolchain, Maven itself, and Maven's local repository cache — none of which the runtime needs.
For a Spring Boot fat JAR, this is the canonical setup. You can push it further using Spring Boot's layered JAR feature, replacing the runtime stage above (the extract stage still copies from the Maven build stage):
FROM eclipse-temurin:17-jre-alpine AS extract
WORKDIR /app
COPY --from=build /app/target/app.jar app.jar
RUN java -Djarmode=layertools -jar app.jar extract
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
COPY --from=extract /app/dependencies/ ./
COPY --from=extract /app/spring-boot-loader/ ./
COPY --from=extract /app/snapshot-dependencies/ ./
COPY --from=extract /app/application/ ./
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]
This unpacks the JAR into layers ordered by change frequency — third-party dependencies change rarely, your application classes change often. Docker caches each layer independently, which dramatically improves cache hit rate for iterative builds.
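If you're unsure which layers a particular JAR defines before writing the COPY lines, Spring Boot's layertools mode can list them; for a default layered Boot JAR the output is typically the four directories used above:

```shell
java -Djarmode=layertools -jar app.jar list
# dependencies
# spring-boot-loader
# snapshot-dependencies
# application
```

Custom layer configurations (via layers.xml) change this list, so it's worth checking before hardcoding the COPY order.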
Python and Node: the pattern shifts
For interpreted languages, "compilation" often means dependency installation rather than a compile step. Multi-stage builds are still useful for keeping dev dependencies out of production images.
Python:
FROM python:3.12-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
FROM python:3.12-slim
COPY --from=build /install /usr/local
COPY src/ ./src/
CMD ["python", "src/main.py"]
The --prefix=/install flag installs packages to a specific directory, making it easy to copy just the installed packages to the runtime image. This keeps pip itself and any build-time tools out of the final layer structure.
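A common variant of the same idea uses a virtual environment instead of --prefix: the venv directory is self-contained, so copying it moves exactly the installed packages and nothing else. A sketch, assuming the same requirements.txt layout:

```dockerfile
FROM python:3.12-slim AS build
WORKDIR /app
RUN python -m venv /venv
COPY requirements.txt .
RUN /venv/bin/pip install --no-cache-dir -r requirements.txt

FROM python:3.12-slim
COPY --from=build /venv /venv
COPY src/ ./src/
# Use the venv's interpreter so its site-packages are on sys.path
CMD ["/venv/bin/python", "src/main.py"]
```

The trade-off is a slightly larger copy (the venv includes its own pip), in exchange for an unambiguous interpreter path.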
Node.js:
FROM node:20-alpine AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY src/ ./src/
RUN npm run build
FROM node:20-alpine
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
CMD ["node", "dist/index.js"]
The build stage installs all dependencies (including devDependencies like TypeScript, bundlers, test tools) and compiles. The runtime stage reinstalls only production dependencies. The alternative — copying node_modules from the build stage — risks including dev tools in the runtime image if your pruning logic has gaps.
Named stages and selective building
Stages can be targeted directly:
docker build --target build -t myapp:build .
This builds only up to and including the build stage, which is useful for:
- Running tests in CI without producing a runtime image
- Debugging build issues without waiting for the full pipeline
- Producing different images from the same Dockerfile for different environments
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
FROM deps AS test
COPY . .
RUN npm test
FROM deps AS build
COPY src/ ./src/
RUN npm run build
FROM node:20-alpine AS runtime
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
CMD ["node", "dist/index.js"]
In CI: docker build --target test . to run tests. If tests pass: docker build --target runtime . to produce the deployment artifact. Same Dockerfile, different targets for different pipeline stages.
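In script form, that pipeline is just two sequential builds (a sketch; the image names are placeholders). Note that with BuildKit, the runtime build skips the test stage entirely because runtime doesn't depend on it — which is exactly why CI must target test explicitly:

```shell
#!/bin/sh
set -e  # abort the pipeline if the test build fails
docker build --target test -t myapp:test .
docker build --target runtime -t myapp:latest .
docker push myapp:latest
```

Because both commands share the same build cache, the deps stage is built once and reused by both targets.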
What you give up
Multi-stage Dockerfiles are harder to debug when something goes wrong in an intermediate stage. You can't just exec into the final container and inspect build artifacts because they were never copied in. If you need to inspect what the build stage produced, run:
docker build --target build -t debug-build . && docker run -it --rm debug-build sh
Also: COPY --from only copies files, not environment variables or working directory settings from the source stage. If your runtime depends on environment configuration set in the build stage, you need to re-declare it.
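For example, an ENV set during the build stage is invisible after the final FROM; anything the runtime needs has to be declared again (the variable name here is a placeholder):

```dockerfile
FROM golang:1.22-alpine AS build
# Visible only within this stage
ENV APP_MODE=production
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o server ./cmd/server

FROM scratch
# The build stage's ENV does not carry over; re-declare it here
ENV APP_MODE=production
COPY --from=build /app/server /server
ENTRYPOINT ["/server"]
```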
The size comparison you can run right now
If you have a single-stage Dockerfile, add a second FROM with a minimal base and a COPY --from for your artifact. Build both and compare:
docker images | grep your-image
For compiled languages, the difference is typically 4x–20x. For JVM apps, 3x–5x is typical. For interpreted languages, 1.5x–2x depending on dev dependency weight.
Start with the single COPY --from for your compiled artifact. The rest is refinement.