Your Docker Image Has More Inside It Than You Think

by Eric Hanson, Backend Developer at Clean Systems Consulting

The image nobody has actually opened

Your production image has been running for eight months. It's built by CI, pushed to a registry, pulled by Kubernetes. Hundreds of deploys. Nobody has ever opened it to see what's inside.

This is normal. It's also a problem. Images accumulate content from base layers, build artifacts, and COPY instructions in ways that aren't obvious from reading the Dockerfile. Sensitive configuration files, debug utilities, package manager caches, test fixtures — any of these can end up in a layer and stay there until someone specifically looks.

Here's how to actually inspect your image, and what to do when you find something that shouldn't be there.

Layer anatomy: what you're actually looking at

A Docker image is a stack of read-only layers. Each layer is a tarball of filesystem changes — files added, modified, or deleted relative to the previous layer. The final image filesystem is the union of all layers.

The important implication: deleting a file in a later layer doesn't remove it from the image. The file is present in the earlier layer's tarball. If you add a file in RUN step 3 and delete it in RUN step 7, the file is still in the layer created by step 3, and therefore still in the image. Anyone who unpacks the image or inspects the layers can retrieve it.

This is why cache cleanup must happen in the same RUN instruction that creates the cache:

# Wrong — cache is in one layer, cleanup is in another
RUN apt-get update && apt-get install -y build-essential
RUN rm -rf /var/lib/apt/lists/*   # too late — the lists are in the previous layer

# Right — cleanup in same RUN, same layer
RUN apt-get update && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*
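
The same behavior can be reproduced without Docker. The following is a minimal sketch that simulates two layers with plain tar; the paths and the API_KEY value are made up for illustration, and the .wh. prefix is the whiteout convention used by OCI image layers to record a deletion.

```shell
#!/bin/sh
# Simulate two image layers with tar to show that a later "delete"
# does not remove the file from the earlier layer. All paths are hypothetical.
set -eu
work=$(mktemp -d)
mkdir -p "$work/layer1/app" "$work/layer2/app"

# Layer 1: a build step adds a file containing a secret.
echo "API_KEY=example" > "$work/layer1/app/.env"
tar -cf "$work/layer1.tar" -C "$work/layer1" .

# Layer 2: a later step deletes it. The layer records the deletion as an
# empty whiteout marker named .wh.<filename>, not as a removal from layer 1.
: > "$work/layer2/app/.wh..env"
tar -cf "$work/layer2.tar" -C "$work/layer2" .

echo "layer 2 contents:"
tar -tf "$work/layer2.tar"

# The earlier layer still carries the full file content.
recovered=$(tar -xOf "$work/layer1.tar" ./app/.env)
echo "recovered from layer 1: $recovered"
rm -rf "$work"
```

Layer 2 only hides the file from the merged filesystem. Anyone holding the layer 1 tarball can still read it.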

How to inspect what's actually in your image

docker history: start here

docker history --no-trunc your-image:tag

This shows each layer, the instruction that created it, and the layer size. Large layers are your first investigation target. A 200MB layer created by COPY . . usually means the build context included something heavy, such as build output or a .git directory.
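
To pick out the large layers automatically, the history output can be filtered. This is a sketch under assumptions: large_layers is a hypothetical helper, the 50MB threshold is arbitrary, and it expects sizes in bytes, which docker history should print when given --human=false together with a --format template.

```shell
# large_layers THRESHOLD_BYTES
# Reads "size<TAB>instruction" lines on stdin and prints the ones at or
# above the threshold, largest first.
large_layers() {
  awk -F '\t' -v min="$1" '$1 + 0 >= min + 0' | sort -rn
}

# Hypothetical usage against a real image (50MB threshold):
# docker history --no-trunc --human=false \
#   --format '{{.Size}}\t{{.CreatedBy}}' your-image:tag | large_layers 52428800
```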

Exploring the filesystem with docker run

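docker run --rm -it --entrypoint sh your-image:tag

If the image has a shell, you can browse interactively. The --entrypoint override matters when the image defines an ENTRYPOINT: without it, sh is passed to the entrypoint as an argument instead of being run. Check: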

ls -la /app           # what's in your working directory?
ls -la /tmp           # temp files that shouldn't be there?
find / -name "*.env" 2>/dev/null   # env files anywhere in the image?
find / -name "*.pem" 2>/dev/null   # certificates or private keys?
find / -name ".git" -type d 2>/dev/null  # git history?

Exporting and unpacking layers

For images without a shell (distroless, scratch-based), or for a systematic audit:

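mkdir -p /tmp/image-audit
docker save your-image:tag | tar -x -C /tmp/image-audit/

This extracts the image to disk as a set of layer tarballs plus metadata. The layout depends on the Docker version: older releases write one directory per layer containing layer.tar, while newer releases write an OCI layout with the layers under blobs/sha256/. You can then list the contents of each layer (the loop below assumes the older layout):

for layer in /tmp/image-audit/*/layer.tar; do
  echo "=== $layer ==="
  # GNU tar prints the file size in column 3; bsdtar (macOS) prints it in column 5
  tar -tv -f "$layer" | sort -k3 -rn | head -20  # largest files first
done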
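
Size is only one signal. A layer can also be scanned for names that look sensitive. The sketch below assumes a hypothetical helper called scan_layer and an illustrative pattern list; .pem matches will include the system CA bundle on many base images, so treat hits as leads to review, not as failures.

```shell
# scan_layer TARBALL
# Prints archive entries whose names look sensitive: .env files,
# .git or .aws directories (and their contents), and .pem files.
scan_layer() {
  tar -tf "$1" 2>/dev/null \
    | grep -E '(^|/)(\.env|\.git|\.aws)(/|$)|\.pem$' || true
}

# Hypothetical usage over the extracted image (older layout shown):
# for layer in /tmp/image-audit/*/layer.tar; do
#   echo "=== $layer ==="
#   scan_layer "$layer"
# done
```

Because this reads each layer separately, it also finds files that a later layer deleted.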

dive: the practical tool

dive (github.com/wagoodman/dive) provides an interactive TUI for browsing image layers and seeing exactly which files changed in each layer. Install it and run:

dive your-image:tag

The left panel shows layers with their size. The right panel shows the filesystem diff for the selected layer — green for added, yellow for modified, red for removed. Files "removed" in a layer show as red but are still present in earlier layers — that's the problem the UI helps you visualize.

dive also has a CI mode that exits non-zero when the image's efficiency score drops below a threshold or its wasted space exceeds one:

dive --ci your-image:tag
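
The thresholds can be set in a .dive-ci file. The sketch below writes one using the rule names I know from the dive README (lowestEfficiency, highestWastedBytes, highestUserWastedPercent). The values are illustrative, not recommendations, and it is worth checking the rule names against the dive version you run.

```shell
# Write a .dive-ci rules file in the current directory.
# Values are examples only; tune them to your image.
cat > .dive-ci <<'EOF'
rules:
  lowestEfficiency: 0.95
  highestWastedBytes: 20MB
  highestUserWastedPercent: 0.20
EOF

# In CI mode dive should pick the file up from the working directory:
# dive --ci your-image:tag
```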

Common things that shouldn't be there

Build artifacts and intermediate files

Source code, test directories, compiled intermediate objects that didn't make it into the final artifact. Common in single-stage builds that didn't clean up.

Package manager caches

/root/.m2 (Maven), /root/.cache/pip (pip), /root/.npm (npm cache), /var/cache/apt/ (apt). These are left by dependency installation and don't serve any purpose at runtime.

Development credentials and configuration

.env files, application-local.yml, AWS credential files in ~/.aws/, private keys copied in during build. These end up in images when .dockerignore is absent or incomplete.

Version control history

.git/ directories containing your full commit history. This is surprisingly common when COPY . . is used without a .dockerignore. A .git directory in an image means anyone who can pull the image can extract your full repository history.
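
A starting .dockerignore that covers the categories above might look like the following sketch. The entries are assumptions about a typical project layout and should be adapted; note that the command overwrites any existing .dockerignore in the current directory.

```shell
# Create a baseline .dockerignore (overwrites an existing file).
# Entries are illustrative and should be adjusted per project.
cat > .dockerignore <<'EOF'
.git
**/.env
**/*.pem
.aws
application-local.yml
EOF
```

The `**/` prefix matches at any depth in the build context, including the root.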

Build tools that weren't removed

gcc, make, curl, wget installed for build-time use and never cleaned up. These give an attacker who gains code execution in the container ready-made tools for downloading payloads, compiling exploits, and moving laterally.

Fixing what you find

If you find something that shouldn't be in the image, the fix depends on where it entered:

Via COPY: add it to .dockerignore.

Via RUN: ensure cleanup happens in the same RUN instruction, or switch to a multi-stage build so the intermediate layer never enters the final stage.

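From the base image: switch to a more minimal base (alpine, slim, or distroless variants). Deleting the file in your own Dockerfile does not help here: the base image's layers are already built, so a later RUN rm only hides the file while its content stays in the base layer. If you cannot change the base, use a multi-stage build and copy only what you need into a clean final stage.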

Credentials baked in: this is a multi-part fix. Remove them from the image, rotate them immediately (assume they've been compromised if the image was ever pushed to a registry), and implement runtime secret injection.
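
One common shape for runtime secret injection is to mount secrets as files and read them at startup. The sketch below assumes a hypothetical load_secret helper in the container's entrypoint script and the /run/secrets mount point used by Docker Compose and Swarm secrets; the secret name is made up.

```shell
# load_secret NAME
# Prints the secret stored in $SECRETS_DIR/NAME, or fails with a message.
# SECRETS_DIR defaults to /run/secrets.
load_secret() {
  file="${SECRETS_DIR:-/run/secrets}/$1"
  if [ ! -r "$file" ]; then
    echo "missing secret: $1" >&2
    return 1
  fi
  cat "$file"
}

# Hypothetical entrypoint usage:
# DB_PASSWORD=$(load_secret db_password) || exit 1
# export DB_PASSWORD
```

The secret never appears in a layer because it only exists in the running container.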

Making image auditing part of the process

Ad hoc audits find problems after the fact. Better to integrate inspection into CI:

  1. docker scout or trivy for known vulnerabilities (covered separately)
  2. dive --ci for efficiency and unexpected large files
  3. A simple script that checks for known-bad patterns:
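# Fail if .env or .git appear in the image (requires sh and find in the image)
# --entrypoint ensures sh runs even if the image defines its own ENTRYPOINT
docker run --rm --entrypoint sh your-image:tag -c '
  (find / -name ".env" 2>/dev/null | grep -q .) && exit 1
  (find / -name ".git" -type d 2>/dev/null | grep -q .) && exit 1
  exit 0
'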
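
The check above needs a shell inside the image. For distroless or scratch-based images, a variant can run on the CI host against an exported file listing. This is a sketch: check_listing is a hypothetical helper, the patterns are illustrative, and docker create may need an explicit dummy command if the image defines neither CMD nor ENTRYPOINT.

```shell
# check_listing
# Reads one path per line on stdin and returns non-zero if any path is a
# .env file or sits under a .git or .aws directory.
check_listing() {
  bad=$(grep -E '(^|/)(\.env|\.git|\.aws)(/|$)' || true)
  if [ -n "$bad" ]; then
    echo "unexpected files in image:" >&2
    echo "$bad" >&2
    return 1
  fi
}

# Hypothetical usage for an image without a shell:
# cid=$(docker create your-image:tag)
# docker export "$cid" | tar -t | check_listing; status=$?
# docker rm "$cid" >/dev/null
# exit "$status"
```

docker export flattens the image, so this only sees the final filesystem. Files deleted in a later layer still need a per-layer audit.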

The first time you run this on an existing image, you will find something. Almost every project does. The goal is to find these things yourself before a security audit, or an attacker, finds them for you.
