Spring Boot Logging in Production — Structured Logs, Correlation IDs, and What to Alert On

by Eric Hanson, Backend Developer at Clean Systems Consulting

Why unstructured logs fail at scale

A log line like 2026-04-17 14:30:45 ERROR OrderService - Failed to process order 123 for user alice@example.com contains useful information but is practically unqueryable at scale. Extracting it requires regex parsing that's fragile to any change in the log format. Alerting on error rate requires counting lines that match a pattern — slow and brittle.

Structured logging emits logs as JSON objects where every field is a named key-value pair:

{
  "timestamp": "2026-04-17T14:30:45.123Z",
  "level": "ERROR",
  "logger": "com.example.OrderService",
  "message": "Failed to process order",
  "orderId": "123",
  "userEmail": "alice@example.com",
  "errorType": "PaymentDeclinedException",
  "traceId": "abc123def456",
  "spanId": "789xyz",
  "environment": "production",
  "service": "order-service"
}

Every field is indexable. Querying "ERROR logs for order 123 in the last hour" is a log aggregator query, not a regex. Alerting on error rate is count(level=ERROR) / count(*) — precise and fast.

Logback with JSON output

Spring Boot uses Logback by default. Configure structured JSON output for production:

<!-- src/main/resources/logback-spring.xml -->
<configuration>

  <!-- Development profile: human-readable -->
  <springProfile name="!production">
    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
      <encoder>
        <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
      </encoder>
    </appender>
    <root level="DEBUG">
      <appender-ref ref="CONSOLE"/>
    </root>
  </springProfile>

  <!-- Production profile: structured JSON -->
  <springProfile name="production">
    <appender name="JSON" class="ch.qos.logback.core.ConsoleAppender">
      <encoder class="net.logstash.logback.encoder.LogstashEncoder">
        <includeCallerData>false</includeCallerData>
        <includeMdcKeyName>traceId</includeMdcKeyName>
        <includeMdcKeyName>spanId</includeMdcKeyName>
        <includeMdcKeyName>requestId</includeMdcKeyName>
        <includeMdcKeyName>userId</includeMdcKeyName>
        <includeMdcKeyName>tenantId</includeMdcKeyName>
        <includeMdcKeyName>method</includeMdcKeyName>
        <includeMdcKeyName>path</includeMdcKeyName>
        <customFields>{"service":"order-service"}</customFields>
      </encoder>
    </appender>
    <root level="INFO">
      <appender-ref ref="JSON"/>
    </root>
    <!-- Reduce noisy framework logs -->
    <logger name="org.hibernate.SQL" level="WARN"/>
    <logger name="com.zaxxer.hikari" level="WARN"/>
    <logger name="org.springframework.web" level="WARN"/>
  </springProfile>

</configuration>

The dependency:

<dependency>
    <groupId>net.logstash.logback</groupId>
    <artifactId>logstash-logback-encoder</artifactId>
    <version>7.4</version>
</dependency>

LogstashEncoder outputs each log event as a single-line JSON object — one line per event, newline-delimited. Log aggregators (Datadog, Splunk, ELK, CloudWatch Logs Insights) parse these directly without regex.

<includeCallerData>false</includeCallerData> disables caller class and method name resolution, which is expensive for high-throughput services. Enable it only when actively debugging.
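MDC covers request-scoped context; for per-event fields like orderId in the JSON example at the top, logstash-logback-encoder's StructuredArguments can attach fields at the call site. A sketch (order and ex are assumed local variables):

```java
import static net.logstash.logback.argument.StructuredArguments.kv;

// Each kv() pair becomes a top-level field in the encoded JSON event;
// no {} placeholder is needed in the message for the field to appear.
log.error("Failed to process order",
    kv("orderId", order.getId()),
    kv("errorType", ex.getClass().getSimpleName()),
    ex);  // a trailing Throwable is serialized by the encoder as a stack trace field
```

This keeps the message text stable for grouping while the variable parts land in queryable fields.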

MDC — the context that travels with every log line

MDC (Mapped Diagnostic Context) is a per-thread map of key-value pairs automatically included in every log line. Set contextual values at the request boundary; they appear in all downstream log lines:

@Component
public class RequestLoggingFilter extends OncePerRequestFilter {

    @Override
    protected void doFilterInternal(HttpServletRequest request,
            HttpServletResponse response, FilterChain chain)
            throws ServletException, IOException {

        String requestId = Optional.ofNullable(request.getHeader("X-Request-ID"))
            .orElseGet(() -> UUID.randomUUID().toString());

        try {
            MDC.put("requestId", requestId);
            MDC.put("method", request.getMethod());
            MDC.put("path", request.getRequestURI());

            // User context is present only if this filter is ordered after the Spring Security filter chain
            Authentication auth = SecurityContextHolder.getContext().getAuthentication();
            if (auth != null && auth.isAuthenticated() &&
                    !(auth instanceof AnonymousAuthenticationToken)) {
                MDC.put("userId", auth.getName());
            }

            response.addHeader("X-Request-ID", requestId);
            chain.doFilter(request, response);

        } finally {
            MDC.clear();  // mandatory — threads are reused
        }
    }
}

With MDC set, every log line from any class in the request thread includes requestId, userId, and path without any of those classes needing to pass them explicitly. A log line from PaymentGatewayClient deep in the call stack includes the same requestId as the controller log line — allowing reconstruction of the full request flow.
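For illustration (all values invented), a line emitted by PaymentGatewayClient through the JSON encoder above carries the MDC fields alongside its own:

```json
{
  "timestamp": "2026-04-17T14:30:46.102Z",
  "level": "INFO",
  "logger": "com.example.PaymentGatewayClient",
  "message": "Payment authorized",
  "requestId": "d3f1c2a4-8b6e-4a2f-9c0d-1e2f3a4b5c6d",
  "userId": "alice",
  "path": "/api/orders",
  "service": "order-service"
}
```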

Correlation IDs across services

MDC is thread-local. When work crosses service boundaries (HTTP calls, message queue processing), the correlation ID must be propagated explicitly.

Outgoing HTTP calls — propagate the trace ID:

@Bean
public WebClient orderServiceClient() {
    return WebClient.builder()
        .baseUrl(orderServiceUrl)
        .filter(propagateCorrelationId())
        .build();
}

private ExchangeFilterFunction propagateCorrelationId() {
    return ExchangeFilterFunction.ofRequestProcessor(request -> {
        String traceId = MDC.get("traceId");
        String requestId = MDC.get("requestId");

        ClientRequest.Builder builder = ClientRequest.from(request);
        if (traceId != null) builder.header("X-B3-TraceId", traceId);
        if (requestId != null) builder.header("X-Request-ID", requestId);

        return Mono.just(builder.build());
    });
}

Incoming HTTP calls — extract and restore the trace ID:

// In RequestLoggingFilter, check for incoming correlation headers
String incomingTraceId = request.getHeader("X-B3-TraceId");
if (incomingTraceId != null) {
    MDC.put("traceId", incomingTraceId);
} else {
    MDC.put("traceId", UUID.randomUUID().toString());
}

Micrometer Tracing (Spring Boot 3.x) automates this. With spring-boot-starter-actuator and a Micrometer Tracing bridge configured, trace IDs propagate automatically through WebClient, RestTemplate, and message listeners. The MDC is populated from the current span automatically — traceId and spanId appear in logs without manual propagation.

management:
  tracing:
    sampling:
      probability: 1.0  # sample 100% in development, 0.1 in production

With Micrometer Tracing, the manual propagation above is replaced by the framework. The tracing filter, MDC population, and header propagation all happen automatically.
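If you also want the trace context visible in the human-readable development profile, MDC values can be referenced with %X in the Logback pattern. A variant of the console pattern from the configuration above (the :- suffix supplies an empty default when the key is absent):

```xml
<pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} [%X{traceId:-},%X{spanId:-}] - %msg%n</pattern>
```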

Propagating MDC through async boundaries

MDC is thread-local — when work moves to a different thread, the MDC is not automatically copied. With @Async, Kafka consumers, and virtual threads, this causes MDC to be empty in log lines from the new thread:

@Async
public void processAsync(String orderId) {
    // MDC is empty here — different thread
    log.info("Processing order {}", orderId); // no traceId in this log line
}

Fix: copy MDC values at the point of thread handoff:

@Configuration
@EnableAsync
public class AsyncConfig {

    @Bean
    public TaskDecorator mdcTaskDecorator() {
        return runnable -> {
            Map<String, String> mdcCopy = MDC.getCopyOfContextMap();
            return () -> {
                try {
                    if (mdcCopy != null) MDC.setContextMap(mdcCopy);
                    runnable.run();
                } finally {
                    MDC.clear();
                }
            };
        };
    }

    @Bean(name = "taskExecutor")
    public ThreadPoolTaskExecutor taskExecutor(TaskDecorator mdcTaskDecorator) {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setTaskDecorator(mdcTaskDecorator);
        executor.setCorePoolSize(5);
        executor.setMaxPoolSize(20);
        executor.initialize();
        return executor;  // @Async resolves the bean named "taskExecutor" by default
    }
}

MDC.getCopyOfContextMap() captures the current MDC at submission time. The TaskDecorator restores it before the runnable executes on the pool thread.
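The capture-and-restore mechanics are easy to verify without Spring. This stdlib-only sketch stands in for MDC with a plain ThreadLocal map (ContextCopyDemo and all names in it are invented for the demo): an undecorated task sees no context on the pool thread, while a decorated one sees the submitter's copy.

```java
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ContextCopyDemo {

    // Stand-in for SLF4J's MDC: a per-thread map of context values.
    static final ThreadLocal<Map<String, String>> CONTEXT = new ThreadLocal<>();

    // Same capture-at-submit / restore-before-run / clear-after shape as the TaskDecorator.
    static Runnable decorated(Runnable task) {
        Map<String, String> copy = CONTEXT.get();   // captured on the submitting thread
        return () -> {
            try {
                CONTEXT.set(copy);                  // restored on the pool thread
                task.run();
            } finally {
                CONTEXT.remove();                   // pool threads are reused
            }
        };
    }

    static String run() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            CONTEXT.set(Map.of("requestId", "req-1"));

            // Undecorated task: the pool thread has its own, empty ThreadLocal.
            Future<String> plain = pool.submit(() -> {
                Map<String, String> ctx = CONTEXT.get();
                return ctx == null ? "missing" : ctx.get("requestId");
            });

            // Decorated task: the submitter's context travels with the work.
            String[] seen = new String[1];
            pool.submit(decorated(() -> seen[0] = CONTEXT.get().get("requestId"))).get();

            return plain.get() + "," + seen[0];
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());  // prints "missing,req-1"
    }
}
```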

For Kafka consumers, restore MDC from the message headers:

@KafkaListener(topics = "orders.placed")
public void handleOrderPlaced(ConsumerRecord<String, OrderPlacedEvent> record) {
    String traceId = extractHeader(record, "X-Trace-ID");
    try {
        MDC.put("traceId", traceId != null ? traceId : UUID.randomUUID().toString());
        MDC.put("topic", record.topic());
        MDC.put("partition", String.valueOf(record.partition()));
        MDC.put("offset", String.valueOf(record.offset()));

        processEvent(record.value());
    } finally {
        MDC.clear();
    }
}

private String extractHeader(ConsumerRecord<?, ?> record, String name) {
    Header header = record.headers().lastHeader(name);
    return header != null ? new String(header.value(), StandardCharsets.UTF_8) : null;
}
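The listener only finds a trace ID if the producer set one. A producer-side sketch (assumes a spring-kafka KafkaTemplate field and an OrderPlacedEvent with an orderId() accessor; the X-Trace-ID header name matches the listener above):

```java
public void publishOrderPlaced(OrderPlacedEvent event) {
    ProducerRecord<String, OrderPlacedEvent> record =
        new ProducerRecord<>("orders.placed", event.orderId(), event);

    // Carry the current MDC trace ID in a message header so the
    // consumer can restore it on its own thread.
    String traceId = MDC.get("traceId");
    if (traceId != null) {
        record.headers().add("X-Trace-ID", traceId.getBytes(StandardCharsets.UTF_8));
    }
    kafkaTemplate.send(record);
}
```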

Log levels — the discipline that reduces noise

ERROR: a condition that requires immediate human attention. A database is down. An uncaught exception reached the top of the call stack. A critical business operation failed permanently.

WARN: a condition that is unexpected but handled. A retry succeeded after initial failure. A configuration value is missing but a default was used. A deprecated code path was called.

INFO: key business events and state transitions. Order created. Payment processed. User logged in. Service started successfully. These should be auditable.

DEBUG: detailed technical information useful during development. SQL queries, HTTP request/response bodies, cache hit/miss decisions. Should be disabled in production.

TRACE: extremely verbose — method entry/exit, loop iterations. Almost never appropriate in production.
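Applied to the order examples in this article, the discipline looks like this (illustrative; the variables are assumed to be in scope):

```java
// ERROR: permanent failure that needs a human
log.error("Payment permanently failed for order {}", orderId, ex);

// WARN: unexpected but handled
log.warn("Payment gateway timed out; retry {} of {} succeeded", attempt, maxAttempts);

// INFO: auditable business event
log.info("Order {} created for customer {}", orderId, customerId);

// DEBUG: technical detail, off in production
log.debug("Cache miss for key {}", cacheKey);
```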

The common failure: everything at DEBUG in production because "we want to see what's happening." The result is gigabytes of log volume per hour, thousands of dollars in log storage costs, and real errors buried in noise. Use DEBUG only when actively diagnosing a specific issue; switch it off immediately after.

// Expensive logging — construct string only if DEBUG is enabled
if (log.isDebugEnabled()) {
    log.debug("Cache miss for key {}, loading from database", generateKey(params));
}

// SLF4J parameterized logging — constructs string only if level is enabled
log.debug("Request completed: method={}, path={}, status={}, duration={}ms",
    method, path, status, duration);

// WRONG — always constructs the string regardless of log level
log.debug("Request: " + method + " " + path + " " + status);

What to alert on

Logs generate too much data to alert on every event. Structure alerts around signals, not log lines:

Immediately page:

  • ERROR log rate above baseline (e.g., > 1% of requests produce ERROR logs)
  • Specific error types that indicate security incidents: authentication failures exceeding threshold, authorization failures for admin endpoints
  • Any OutOfMemoryError or StackOverflowError in logs

Alert but don't page immediately:

  • WARN log rate increase (2x baseline sustained for 5 minutes)
  • Specific known degradation signals: circuit breaker WARN logs, retry WARN logs
  • Slow query WARN logs from Hibernate/HikariCP

Don't alert on:

  • Individual ERROR log lines — too noisy, alert on rates
  • DEBUG or INFO level — not alert-worthy by definition
  • Known expected errors (404 not found, 401 unauthorized) without rate anomaly

Configure your log aggregator to create metrics from logs, then alert on metrics:

# Datadog log-based metric example
count of events where level=ERROR and service=order-service
# Alert condition
sum(last 5m):count:log_lines{level:error,service:order-service} > 50

Alerting on a count rather than individual lines means a single ERROR during an otherwise quiet period doesn't page anyone — only sustained elevated error rates do. This is the distinction between signal and noise.

The logging checklist for a new service

Before going to production:

  • JSON structured output configured for the production profile
  • MDC populated at request entry with requestId, userId, and trace context
  • MDC cleared in finally at every thread boundary
  • Micrometer Tracing configured if using distributed tracing
  • TaskDecorator wrapping async executors to propagate MDC
  • Log level at INFO for application code, WARN for framework code
  • ERROR logs trigger an alert in monitoring
  • X-Request-ID returned in response headers for client-side debugging
  • No DEBUG logging left permanently enabled from a debugging session
