Why observability is a foundation — not a feature — for Spring Boot in production.

Operations from localhost to production without panic

Modern applications live in complex, distributed environments. When a request fails, it rarely fails in isolation. It’s usually a cascade involving a few microservices, a slow database, and a saturated connection pool.

If you aren’t using distributed tracing and metrics from the first deployment, you’re flying blind. No timing. No trace ID. No context.

This is not an edge case. It’s what happens when observability is treated as “something to add later.”

⚡ TL;DR (Quick Recap)

  • Use Spring Boot Actuator with /health, /info, /metrics, /prometheus
  • Log in structured JSON with trace/correlation IDs
  • Observe, don’t just measure: use @Observed for automatic metrics and tracing
  • Collect metrics with Prometheus and visualize them with Grafana
  • All examples assume Spring Boot 4.x

Why This Matters

Modern applications are distributed systems by default. A single request might traverse:

  • your API
  • a database
  • multiple internal services
  • third-party APIs

Without tracing and metrics, failures become invisible chains. With tools like OpenTelemetry and Micrometer, observability is now standardized — and expected.

Spring Boot supports it out of the box.

Actuator: Your Production Interface

Spring Boot Actuator is how your infrastructure interacts with your application.

Start with dependencies:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-aspectj</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <scope>runtime</scope>
</dependency>

Note: in Spring Boot 3.x this starter was called spring-boot-starter-aop; in Spring Boot 4.x it is spring-boot-starter-aspectj.

Then add the configuration:

management:
  observations:
    annotations:
      enabled: true
    key-values:
      app: service-name
  metrics:
    tags:
      app: service-name
  server:
    port: 8888
  tracing:
    sampling:
      probability: 1.0 # Use 1.0 only for local/dev. In production, start with 0.1 or adaptive sampling.
  info:
    git:
      mode: full
    build:
      enabled: true
  endpoints:
    web:
      exposure:
        include: health, info, metrics, prometheus
  endpoint:
    health:
      probes:
        enabled: true
      show-details: when_authorized

Serving Actuator on a separate management port (8888 here) instead of the application port (8080) is a common security practice: it keeps the endpoints off the public listener, as long as the management port itself is not exposed externally.

Also avoid exposing sensitive endpoints like:

  • /env
  • /heapdump
  • /shutdown

Think of Actuator as your production API — not a debugging tool.

Health Checks That Reflect Reality

Default health checks answer:

“Is the database reachable?”

But production needs:

“Can this service actually function?”

Example:

@Component
public class DownstreamHealthIndicator implements HealthIndicator {

    @Override
    public Health health() {
        try {
            boolean healthy = checkExternalService();
            return healthy
                    ? Health.up().build()
                    : Health.down()
                            .withDetail("reason", "Dependency returned unhealthy status")
                            .build();
        } catch (Exception ex) {
            // Health.down(ex) already records the exception itself.
            // getMessage() can be null, so convert it defensively.
            return Health.down(ex)
                    .withDetail("reason", String.valueOf(ex.getMessage()))
                    .build();
        }
    }

    private boolean checkExternalService() {
        // Always enforce a hard timeout on health checks:
        // a slow downstream must not block /health indefinitely
        // and stall your liveness probe.
        ...
    }
}

This ensures:

  • dependencies respond
  • resources are available
  • the app can fulfill real requests
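
The hard timeout mentioned in checkExternalService can be sketched in plain Java. This is a minimal illustration, assuming the dependency check can be wrapped in a Supplier; TimedCheck and runWithTimeout are hypothetical names, not Spring Boot APIs, and the timeout value is yours to choose.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper (not a Spring Boot API): runs a dependency check
// with a hard upper bound so a slow downstream cannot stall /health.
final class TimedCheck {

    private TimedCheck() {
    }

    // Returns true only if the check reports healthy within the timeout.
    // A timeout, an exception, or a null result all count as unhealthy.
    static boolean runWithTimeout(Supplier<Boolean> check, Duration timeout) {
        try {
            Boolean result = CompletableFuture.supplyAsync(check)
                    // Stop waiting after the timeout. The underlying task is
                    // not cancelled; it keeps running in the background.
                    .completeOnTimeout(false, timeout.toMillis(), TimeUnit.MILLISECONDS)
                    .exceptionally(ex -> false)
                    .get();
            return Boolean.TRUE.equals(result);
        } catch (InterruptedException ex) {
            // Preserve the interrupt flag for the calling thread.
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException ex) {
            return false;
        }
    }
}
```

Inside a health indicator, checkExternalService could then delegate to something like TimedCheck.runWithTimeout(this::pingDependency, Duration.ofSeconds(2)), where pingDependency stands in for your own check.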

Structured Logging: From Text to Data

Plain logs don’t scale.

Enable structured JSON logging:

logging:
  structured:
    format:
      console: logstash

Then log with structure:

log.atInfo()
    .setMessage("Product created")
    .addKeyValue("productId", id)
    .addKeyValue("operation", "createProduct")
    .log();

Now logs become:

  • filterable
  • searchable
  • correlated

Instead of noise, you get signal.
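
To picture why this is filterable, it helps to see the shape of one structured line. The sketch below renders a message and its key-value pairs as a single JSON object in plain Java. JsonLine is a hypothetical helper for illustration only: Spring Boot's structured logging does this for you, and its real output also carries a timestamp, level, and other fields that are omitted here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative only, not a full JSON encoder and not Spring Boot's formatter.
final class JsonLine {

    private JsonLine() {
    }

    // Escapes backslashes, double quotes and newlines. Other control
    // characters are not handled in this simplified sketch.
    static String escape(String raw) {
        return raw.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }

    // Renders the message plus key-value pairs as one single-line JSON object.
    static String render(String message, Map<String, String> keyValues) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("message", message);
        fields.putAll(keyValues);
        return fields.entrySet().stream()
                .map(e -> "\"" + escape(e.getKey()) + "\":\"" + escape(e.getValue()) + "\"")
                .collect(Collectors.joining(",", "{", "}"));
    }
}
```

Because every field has a name, a log backend can query on productId or operation directly instead of pattern-matching free text.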

Trace IDs: Connecting the Dots

Logs without trace IDs are isolated.

With tracing:

  • every request gets a traceId
  • every operation gets a spanId

Configure tracing:

management:
  tracing:
    sampling:
      probability: 1.0 # Use 1.0 only for local/dev. In production, start with 0.1 or adaptive sampling.

With a tracing bridge and exporter on the classpath, you can then follow requests across systems in tools like Zipkin or Grafana Tempo.
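
To make the two identifiers concrete: when W3C Trace Context propagation is in use (an assumption here, so check which propagation format your tracer is configured for), the IDs travel between services in a traceparent HTTP header of the form version-traceId-spanId-flags. The sketch below parses such a header in plain Java. TraceParent is a hypothetical type for illustration, not a Micrometer class, and it only handles version 00.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value type for illustration; tracing libraries parse this header for you.
record TraceParent(String traceId, String spanId, boolean sampled) {

    // Version 00: trace ID (32 hex chars), parent span ID (16 hex chars), flags (2 hex chars).
    // Simplification: all-zero IDs, which the spec treats as invalid, are not rejected.
    private static final Pattern FORMAT =
            Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    static TraceParent parse(String header) {
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a valid traceparent header: " + header);
        }
        // The lowest bit of the flags byte marks the trace as sampled.
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) == 1;
        return new TraceParent(m.group(1), m.group(2), sampled);
    }
}
```

The trace ID stays the same across every service a request touches, while each service contributes its own span IDs. That shared trace ID is what lets a backend stitch the hops together.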

Observed: Metrics and Tracing in One

Manual timing creates logs. @Observed creates insight.

import io.micrometer.observation.annotation.Observed;

// @Observed requires spring-boot-starter-aspectj on the classpath
// AND management.observations.annotations.enabled=true in application.yml
@Observed(name = "order.processing")
public void processOrder() {
    // business logic
}

This automatically produces:

  • execution time metrics
  • trace spans
  • correlated observability data

No boilerplate required.
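
For contrast, here is a rough plain-Java sketch of the manual version: timing a call, counting failures, and recording the result under a name. ManualObservation and its methods are hypothetical and deliberately simplified. This is not how Micrometer implements @Observed, and it produces no trace spans.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical stand-in for the bookkeeping an observation does around a call.
final class ManualObservation {

    private final Map<String, LongAdder> callCounts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> errorCounts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> elapsedNanos = new ConcurrentHashMap<>();

    // Runs the action and records duration, call count and failures under the given name.
    <T> T observe(String name, Supplier<T> action) {
        long start = System.nanoTime();
        try {
            return action.get();
        } catch (RuntimeException ex) {
            errorCounts.computeIfAbsent(name, k -> new LongAdder()).increment();
            throw ex;
        } finally {
            // Recorded for successful and failed calls alike.
            callCounts.computeIfAbsent(name, k -> new LongAdder()).increment();
            elapsedNanos.computeIfAbsent(name, k -> new LongAdder()).add(System.nanoTime() - start);
        }
    }

    long calls(String name) {
        return sum(callCounts, name);
    }

    long errors(String name) {
        return sum(errorCounts, name);
    }

    long totalNanos(String name) {
        return sum(elapsedNanos, name);
    }

    private static long sum(Map<String, LongAdder> map, String name) {
        LongAdder adder = map.get(name);
        return adder == null ? 0 : adder.sum();
    }
}
```

Every instrumented method would need to be wrapped in a call like observe("order.processing", ...), which is exactly the repetition the annotation removes.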

Metrics That Don’t Disappear: Prometheus

Metrics inside your app are useless if they’re not stored. This is where Prometheus comes in.

Expose the endpoint

management:
  observations:
    annotations:
      enabled: true
  endpoints:
    web:
      exposure:
        include: health, info, metrics, prometheus

Configure scraping

scrape_configs:
  - job_name: 'spring-boot-app'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 15s
    static_configs:
      - targets: ['app:8888']
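
What the scraper reads from /actuator/prometheus is plain text, one sample per line. As a rough sketch, assuming the usual Micrometer-to-Prometheus convention that dots become underscores and timers are reported in seconds, the following renders one such line. PrometheusLine is hypothetical and much simpler than the real naming convention and exposition format (for example, it does not escape label values).

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: approximates the shape of one line in the scrape output.
final class PrometheusLine {

    private PrometheusLine() {
    }

    // Replaces characters outside [a-zA-Z0-9_], e.g. "order.processing" becomes "order_processing".
    static String sanitize(String name) {
        return name.replaceAll("[^a-zA-Z0-9_]", "_");
    }

    // Builds "<name><suffix>{label="value",...} <value>" with labels sorted by key.
    static String sample(String meterName, String suffix, Map<String, String> tags, double value) {
        String labels = new TreeMap<>(tags).entrySet().stream()
                .map(e -> sanitize(e.getKey()) + "=\"" + e.getValue() + "\"")
                .collect(Collectors.joining(",", "{", "}"));
        return sanitize(meterName) + suffix + labels + " " + value;
    }
}
```

Under these assumptions, a timer named order.processing with the app tag from the configuration would appear as something like order_processing_seconds_count{app="service-name"} 3.0, alongside matching sum and max series.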

What You Get Instantly

With Actuator, @Observed, and Prometheus:

  • Request latency metrics
  • Error rates
  • Throughput
  • JVM performance data

And most importantly — history.

Without it:

“Something was slow earlier…”

With it:

“Latency spiked at 10:03, peaked at p99 1.7s.”

The Info Endpoint: Know Your Deployment

Expose build metadata:

management:
  info:
    git:
      mode: full
    build:
      enabled: true

Then add the build plugin (with spring-boot-starter-parent, its execution is preconfigured):

<plugin>
    <groupId>io.github.git-commit-id</groupId>
    <artifactId>git-commit-id-maven-plugin</artifactId>
</plugin>

Now /actuator/info shows:

  • version
  • commit hash
  • build time

No more:

“Which version is running?”

Final Takeaways

Observability is not a feature you add before launch. It is the operational floor your application runs on from the first request. Spring Boot gives you the stack to build it: Actuator, Micrometer Observation, structured logging, and distributed tracing.

  • Observability is the foundation, not an enhancement
  • Spring Boot Actuator is your production interface
  • Structured logs make debugging scalable
  • Micrometer Observation replaces manual instrumentation
  • Metrics without Prometheus are temporary — store them or lose them

You can find the example code on GitHub.

Originally posted on marconak-matej.medium.com.