Stop Copying Everything Into Your Docker Image

by Eric Hanson, Backend Developer at Clean Systems Consulting

The COPY instruction that ships your secrets

You've seen this Dockerfile pattern hundreds of times:

FROM node:20-alpine
WORKDIR /app
COPY . .
RUN npm ci
CMD ["node", "src/index.js"]

It looks clean. It's five lines. It works locally. It also routinely copies .env files with production credentials, node_modules directories that double build time, .git folders with full commit history, test fixtures with sanitized-but-still-real-looking data, and editor configs that have no business being in a production artifact.

This isn't a hypothetical. Teams have pushed images to public registries with AWS keys in .env files. Others ship 800MB images because node_modules was present in the build context when it could have been installed fresh inside the container. The fix isn't complicated, but it requires being deliberate about what you actually want in your image.

What COPY . . actually does

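When Docker executes COPY . ., it takes everything from the build context (the directory you pass to docker build, usually .) and adds it to an image layer. Before that can happen, the Docker client has to send the build context to the daemon. The classic builder packages and sends the entire directory before a single layer is built. BuildKit transfers files more selectively, but COPY . . still pulls in everything that isn't ignored.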

If your project directory looks like this:

my-app/
├── src/
├── tests/
├── node_modules/    # 400MB
├── .git/            # 150MB with history
├── .env             # DB_PASSWORD=prod-secret
├── coverage/        # 50MB of test output
└── package.json

Then docker build . sends roughly 600MB to the daemon before it does anything else. If your CI build "takes forever" before even reaching the RUN npm ci line, that's the build context transfer.
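To see where that weight comes from in your own project, a rough sketch like the following can help. It only measures top-level directory entries on disk, so treat it as an approximation of the un-ignored build context rather than what Docker actually transfers. The name context_report is a hypothetical helper, not a Docker command.

```shell
#!/bin/sh
# context_report DIR: print the size in KB of each top-level entry in DIR,
# largest first. Hypothetical helper; approximates an un-ignored build context.
context_report() {
    dir="${1:-.}"
    # -mindepth/-maxdepth restrict find to direct children, dotfiles included
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -rn
}
```

Running context_report . from the directory you build in lists entries such as node_modules and .git at the top if they dominate the context.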

.dockerignore is mandatory, not optional

The .dockerignore file works much like .gitignore: paths matching its patterns are excluded from the build context entirely, so COPY never sees them. Most basic tutorials skip it, which is why so many production Dockerfiles don't have one.

A baseline .dockerignore for a Node.js project:

node_modules/
.git/
.gitignore
.env
.env.*
coverage/
.nyc_output/
*.log
dist/
build/
.DS_Store
.vscode/
.idea/
*.md
tests/
__tests__/

For Java/Maven:

.git/
target/
.mvn/wrapper/maven-wrapper.jar
*.log
.env
.idea/
*.iml
src/test/

After adding this file to a real project, I've seen build context sizes drop from 700MB to 12MB. That can be the difference between a 90-second CI build and a 10-second one, from context transfer alone.
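A small check can keep an existing .dockerignore from quietly losing its most important entries. The sketch below assumes exact, whole-line matches, so a file that lists node_modules without the trailing slash would be reported as missing node_modules/ even though Docker treats the two similarly. The function name missing_ignores is hypothetical.

```shell
#!/bin/sh
# missing_ignores FILE PATTERN...: print each PATTERN that has no exact
# matching line in FILE. Returns non-zero if anything is missing.
# Hypothetical helper; it compares text and does not evaluate Docker's
# pattern-matching rules.
missing_ignores() {
    file="$1"; shift
    rc=0
    for pattern in "$@"; do
        # -F: literal string, -x: match the whole line, -q: no output
        if ! grep -Fxq -- "$pattern" "$file" 2>/dev/null; then
            echo "$pattern"
            rc=1
        fi
    done
    return "$rc"
}
```

In CI, something like missing_ignores .dockerignore node_modules/ .git/ .env could fail the job when one of those lines disappears.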

Strategic COPY: only bring in what you need

Even with .dockerignore, being deliberate about what you copy — and when — matters for layer caching and correctness.

The pattern of copying dependency manifests first, installing, then copying source is not just style:

FROM node:20-alpine
WORKDIR /app

# Copy only what's needed for dependency installation
COPY package.json package-lock.json ./
# --omit=dev skips devDependencies (replaces the deprecated --production flag)
RUN npm ci --omit=dev

# Now copy source
COPY src/ ./src/

This separates the slow layer (dependency installation) from the fast layer (your source code). When you change a file in src/, Docker replays from the COPY src/ line — npm ci is cached and skipped. Change package.json and both layers rebuild, which is correct. This ordering is the single most impactful caching optimization available in most Dockerfiles.

Compare to the naive version where COPY . . precedes npm ci: every source file change invalidates the dependency install cache, so every build reinstalls packages from scratch. On a project with 800 transitive dependencies, that's a 2–3 minute penalty on every code change.
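One way to see why the ordering helps: Docker decides whether a COPY layer can be reused by checksumming the files that instruction copies. The sketch below is a simplified model of that idea, not Docker's actual cache-key algorithm. It assumes sha256sum is installed, and deps_key is a hypothetical name.

```shell
#!/bin/sh
# deps_key DIR: checksum over the dependency manifests only.
# Simplified stand-in (an assumption, not Docker's real algorithm) for the
# cache key of the `COPY package.json package-lock.json ./` layer.
deps_key() {
    cat "$1/package.json" "$1/package-lock.json" | sha256sum | cut -d' ' -f1
}
```

Editing anything under src/ leaves deps_key unchanged, which corresponds to the npm ci layer staying cached. Editing package.json changes it, which corresponds to a reinstall. With COPY . . first, the equivalent key would cover every file in the project.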

Secrets deserve extra attention

.dockerignore keeps .env out of the build context, but it's worth being explicit about what "secret" means in this context.

Files that commonly contain secrets and should never enter an image:

  • .env, .env.local, .env.production
  • *.pem, *.key, *.p12 — TLS certificates and private keys
  • credentials.json, service-account.json — cloud provider credentials
  • .aws/credentials, .kube/config — tool-specific credential files
  • application-prod.yml or application-prod.properties — Spring Boot production configs

Even in a private registry, baking secrets into an image layer is dangerous. Layers are immutable and persist. If that image ever gets pushed to the wrong place, or someone gets registry access, those credentials are exposed. Supply secrets at runtime via environment variables, Kubernetes secrets, or a secrets manager — not at build time via COPY.
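The file names listed above can be turned into a quick pre-build scan. This is a minimal sketch: find_secret_files is a hypothetical helper, it matches on file names only (it does not inspect contents), and it skips node_modules and .git on the assumption that those are already ignored.

```shell
#!/bin/sh
# find_secret_files DIR: list files under DIR whose names match common
# secret-bearing patterns. Hypothetical helper; name-based matching only.
find_secret_files() {
    find "${1:-.}" \
        \( -name node_modules -o -name .git \) -prune -o \
        -type f \( \
            -name '.env' -o -name '.env.*' \
            -o -name '*.pem' -o -name '*.key' -o -name '*.p12' \
            -o -name 'credentials.json' -o -name 'service-account.json' \
            -o -path '*/.aws/credentials' -o -path '*/.kube/config' \
            -o -name 'application-prod.yml' \
            -o -name 'application-prod.properties' \
        \) -print
}
```

Any path this prints is a candidate for .dockerignore, or for moving out of the repository entirely.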

What about COPY vs ADD?

Docker has two instructions for copying files into an image: COPY and ADD. Use COPY by default.

ADD does two extra things: it unpacks local tar archives automatically, and it can fetch files from URLs. Both behaviors are footguns. Fetching from a URL in an ADD instruction means the layer content depends on an external resource that can change between builds without any change to the Dockerfile, and by default nothing verifies what was downloaded. Remote archives are also not unpacked, so the first example below would leave a .tar.gz file sitting in /usr/local/bin/. Use RUN curl with an explicit checksum if you need to fetch something, so the behavior is visible and verifiable.

# Don't do this
ADD https://example.com/some-tool.tar.gz /usr/local/bin/

# Do this — explicit, auditable
RUN curl -Lo /tmp/some-tool.tar.gz https://example.com/some-tool.tar.gz \
    && echo "expectedsha256  /tmp/some-tool.tar.gz" | sha256sum -c \
    && tar -xzf /tmp/some-tool.tar.gz -C /usr/local/bin/ \
    && rm /tmp/some-tool.tar.gz

The tradeoff: more verbose, but the checksum verification means a compromised upstream artifact gets caught at build time rather than silently deployed.
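The expectedsha256 placeholder has to come from somewhere. A minimal sketch of the workflow, assuming sha256sum is available and using the hypothetical helper names sha256_of and verify_sha256: compute the hash once from a download you trust, pin it in the Dockerfile, and let every later build compare against it.

```shell
#!/bin/sh
# sha256_of FILE: print only the hex digest, suitable for pinning.
sha256_of() {
    sha256sum "$1" | cut -d' ' -f1
}

# verify_sha256 FILE EXPECTED: succeed only if FILE hashes to EXPECTED.
# sha256sum -c reads "<hash><two spaces><path>" lines from stdin.
verify_sha256() {
    echo "$2  $1" | sha256sum -c >/dev/null 2>&1
}
```

If the upstream file changes after the hash was pinned, verify_sha256 returns non-zero, which is the same failure the RUN instruction above would hit at build time.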

Audit what's actually in your image

Run this against your current image to list the largest files in its final filesystem (with GNU tar, the size is the third column):

cid=$(docker create your-image:tag)
docker export "$cid" | tar -tv | sort -k3 -rn | head -50; docker rm "$cid"

Or use dive for an interactive view. Either way — look at your image before shipping it to staging. What's in there is what runs in production.

The immediate action

Create a .dockerignore file in every repository that has a Dockerfile. Do it before your next build. Check docker history your-image:tag to see how big the layer created by your COPY instruction is. If it's larger than your compiled application, something is wrong.
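Finding the repositories that need attention can be scripted. The sketch below assumes each Dockerfile's own directory is its build context, which is common but not universal, and dockerfiles_without_ignore is a hypothetical helper name.

```shell
#!/bin/sh
# dockerfiles_without_ignore ROOT: print each directory under ROOT that
# contains a Dockerfile but no .dockerignore next to it.
# Assumes the Dockerfile's directory is also the build context.
dockerfiles_without_ignore() {
    find "${1:-.}" -name .git -prune -o -type f -name Dockerfile -print |
    while IFS= read -r dockerfile; do
        dir=$(dirname "$dockerfile")
        [ -f "$dir/.dockerignore" ] || echo "$dir"
    done
}
```

Pointing it at a directory of checked-out repositories gives a to-do list of build contexts with nothing excluded.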

Explicit is better than implicit here. Know exactly what's in your image — because your security team will eventually ask.
