Why Your Docker Image Works Locally But Breaks in Production
by Eric Hanson, Backend Developer at Clean Systems Consulting
The container that passes CI and fails in ECS
Your service works in local Docker Compose. It builds clean in CI. The image gets pushed to ECR. It deploys to ECS Fargate, and the health check fails. The logs show the application started but the health endpoint never responds. You spend two hours ssh-ing into nothing (Fargate doesn't have ssh), reading CloudWatch logs, and eventually discover the issue is a read-only filesystem mount the app was trying to write to.
This is the category of Docker problem that doesn't show up until production: the image is fine, the Dockerfile is fine, but the environment the image runs in is different in ways you didn't account for.
Here's a map of the most common mismatches, and how to close them.
Architecture: the ARM/AMD64 gap
If you develop on an Apple M-series Mac, you build ARM64 images by default. If production runs on x86-64 (most cloud instances, most CI runners), the image you built locally won't run there, typically dying with an exec format error; or, where emulation happens to be available, it will run slowly and can behave differently than it does on your machine.
Verify your image architecture:
docker inspect your-image:tag | grep Architecture
Build for the production platform explicitly:
docker build --platform linux/amd64 -t your-image:tag .
Or use docker buildx for multi-platform builds that produce manifests supporting both:
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t your-registry/your-image:tag \
  --push .
In CI, always set --platform linux/amd64 (or whatever your production target is) explicitly. Don't let the runner's native architecture determine the output architecture.
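One way to enforce that is a small guard in the CI build script. This is a sketch; the variable names are illustrative rather than tied to any particular CI system:
TARGET_ARCH=amd64
docker build --platform "linux/${TARGET_ARCH}" -t your-image:tag .
BUILT_ARCH=$(docker inspect --format '{{.Architecture}}' your-image:tag)
# Fail the job if the image was built for the wrong architecture
if [ "$BUILT_ARCH" != "$TARGET_ARCH" ]; then
  echo "Built for ${BUILT_ARCH}, expected ${TARGET_ARCH}" >&2
  exit 1
fi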
File permissions and user mismatch
Locally, Docker often runs as root or with a user that matches your laptop's UID. In production environments — Kubernetes with runAsNonRoot: true, ECS task definitions with a user field, Fargate with restricted execution — the container user may differ.
If your application writes to a directory inside the container that was created by root during the build, a non-root runtime user will get permission denied errors.
# Copy application files owned by the non-root node user, then run as that user
FROM node:20-alpine
WORKDIR /app
COPY --chown=node:node . .
USER node
The --chown flag on COPY sets ownership at copy time. Do this for all COPY instructions when you intend to run as non-root. Also:
RUN mkdir -p /app/logs /app/tmp \
    && chown -R node:node /app/logs /app/tmp
Create any directories your application writes to during build, set ownership explicitly, then switch to the non-root user.
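You can surface ownership mistakes locally by forcing an arbitrary non-root user at run time, which is roughly what a restricted production environment will do (1001:1001 is just an example UID):
# Run as a non-root user the image knows nothing about; permission
# errors on root-owned directories show up immediately
docker run --rm --user 1001:1001 your-image:tag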
Environment variables: present locally, absent in production
In local development, environment variables come from a .env file loaded by Docker Compose or a local shell profile. In production, they come from Kubernetes secrets, ECS task definition environment fields, or a secrets manager at startup.
The failure mode: a variable is set in your local .env but missing from the production environment config. The application starts, reaches the code path that uses the variable, and either crashes or behaves unexpectedly.
Fail fast at startup for required variables:
// Node.js
const required = ['DATABASE_URL', 'JWT_SECRET', 'PORT'];
for (const key of required) {
  if (!process.env[key]) {
    console.error(`Missing required environment variable: ${key}`);
    process.exit(1);
  }
}
// Spring Boot — fail fast with @Value
@Value("${database.url:#{null}}")
private String databaseUrl;
@PostConstruct
public void validate() {
if (databaseUrl == null) {
throw new IllegalStateException("database.url must be configured");
}
}
An application that crashes at startup with a clear error message (Missing required environment variable: DATABASE_URL) is infinitely easier to diagnose than one that starts, fails silently, and reports a 500 response three requests later.
Resource limits: unlimited locally, constrained in production
Local Docker runs don't have memory or CPU limits unless you explicitly set them. Production environments almost always do — Kubernetes resource limits, ECS task definition memory, Fargate task size.
The failure mode: your JVM application uses up to 4GB of heap locally, but in production the container is limited to 1GB. The JVM sizes its default heap from the memory it believes is available, and in older JVMs (before JDK 10, or 8u191 on the JDK 8 line) that meant the host's total RAM rather than the container's limit, so the heap grew past the limit and the container was OOMKilled.
For JVM applications in containers, always set explicit heap options:
ENV JAVA_OPTS="-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0 -XX:InitialRAMPercentage=50.0"
ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar app.jar"]
-XX:+UseContainerSupport (default since JDK 8u191) makes the JVM respect container memory limits. -XX:MaxRAMPercentage=75.0 sets heap to 75% of the container's memory limit, leaving headroom for the JVM's off-heap memory and the OS.
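If you want to confirm what heap the JVM will actually choose under a given limit, you can ask it directly; eclipse-temurin:21-jre here is just an example JRE image:
# Print the computed max heap inside a 512MB container
docker run --rm --memory=512m eclipse-temurin:21-jre \
  java -XX:MaxRAMPercentage=75.0 -XX:+PrintFlagsFinal -version | grep -i maxheapsize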
Test locally with the same limits as production:
docker run --memory=512m --cpus=0.5 your-image:tag
If the app fails under these constraints, you want to know before production does.
Filesystem: writable locally, read-only in production
Kubernetes securityContext.readOnlyRootFilesystem: true mounts the container root filesystem as read-only. If your application writes anywhere inside the container filesystem (temp files, log files, PID files, JVM crash dumps), it will fail.
Common offenders:
- Log files written to a path like /app/logs/
- The JVM's -XX:+HeapDumpOnOutOfMemoryError writing to the working directory
- Applications writing temp files to /tmp
Solutions:
- Write logs to stdout/stderr, not files (let the orchestrator handle log collection)
- Mount a writable volume for any path that needs writes: /tmp, /app/logs, etc.
- Configure JVM heap dumps to a mounted volume path
In your Kubernetes deployment:
securityContext:
  readOnlyRootFilesystem: true
volumeMounts:
  - name: tmp
    mountPath: /tmp
  - name: logs
    mountPath: /app/logs
volumes:
  - name: tmp
    emptyDir: {}
  - name: logs
    emptyDir: {}
Test this locally:
docker run --read-only --tmpfs /tmp your-image:tag
If the application starts cleanly under --read-only, it is very likely to run with readOnlyRootFilesystem: true in Kubernetes as well.
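To mirror the volume mounts in the deployment snippet above, add a writable tmpfs for each path the application writes to (the paths here just match that example):
# Read-only root filesystem plus a writable tmpfs per writable path,
# matching the emptyDir mounts in the Kubernetes example above
docker run --rm --read-only --tmpfs /tmp --tmpfs /app/logs your-image:tag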
Networking: localhost means something different
In Docker Compose, services reach each other by service name. postgresql://postgres:5432/mydb works because Compose creates a network and registers DNS for service names. Your application assumes the same pattern in production.
In production (Kubernetes, ECS), the networking model is different: services are reached via cluster DNS (myservice.namespace.svc.cluster.local) or environment-injected service endpoints, not Compose service names.
The fix is ensuring your application's service endpoints are fully configurable via environment variables and that local defaults don't leak into production configs. Never hardcode localhost or Compose service names in application code. Everything that varies between environments goes into environment variables.
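A minimal sketch of the idea, with DATABASE_URL and the hostnames as illustrative placeholders: the image stays identical, and only the injected value changes between environments:
# Local development: the value points at the Compose service name
docker run --rm -e DATABASE_URL="postgresql://postgres:5432/mydb" your-image:tag
# Production: the orchestrator injects the real endpoint instead,
# for example a cluster DNS name
docker run --rm -e DATABASE_URL="postgresql://mydb.prod.svc.cluster.local:5432/mydb" your-image:tag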
Close the gap intentionally
Add a docker run test to your CI pipeline that mimics production constraints before the image is pushed:
docker run \
  --read-only \
  --tmpfs /tmp \
  --memory=512m \
  --cpus=0.5 \
  --user 1001:1001 \
  --env-file .env.test \
  --platform linux/amd64 \
  your-image:tag \
  /bin/sh -c "echo 'startup check passed'"
This catches the most common environment mismatches before the image reaches a real environment, and swapping the echo for your application's real startup command (plus a quick probe of its health endpoint) catches even more. Not everything, but enough to stop the "works locally, fails in production" class of incidents before they happen.