Why Your API Returns 200 Even When Something Goes Wrong
by Eric Hanson, Backend Developer at Clean Systems Consulting
“It worked” — except it didn’t
You call an API. You get a 200 OK. Everything should be fine.
Except the payload says:
{
"success": false,
"error": "Invalid input"
}
This pattern shows up everywhere. It feels harmless—clients can just check the success flag, right?
In practice, it causes subtle, expensive problems:
- retries don’t trigger when they should
- monitoring misses real failures
- client libraries behave unpredictably
- debugging becomes guesswork
Returning 200 for failures is not just a stylistic issue. It breaks the contract HTTP is designed to provide.
Why teams end up here
This pattern usually comes from one of three places:
1. Treating HTTP as a dumb transport
If you see HTTP as just a pipe for JSON, you end up encoding everything in the body:
POST /orders
HTTP/1.1 200 OK
{
"status": "error",
"code": 4001
}
The status code becomes meaningless. All semantics move into the payload.
2. Fear of breaking clients
Teams worry that changing status codes will break consumers:
- “What if someone expects 200?”
- “What if retries behave differently?”
So they stick with 200 for everything and push complexity into the response body.
3. Misunderstanding error categories
Not all errors are the same:
- client errors (bad input)
- server errors (unexpected failure)
- transient issues (timeouts, rate limits)
If you don’t distinguish these, everything collapses into a generic “error” response.
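One way to keep these categories apart is to classify each failure once and derive the status code from that classification. The sketch below is illustrative only: the category names and the `statusForCategory` helper are assumptions made for this example, not part of any framework.

```javascript
// Hypothetical error categories; the names are illustrative.
const ErrorCategory = Object.freeze({
  CLIENT: "client",       // bad input: the caller must change the request
  SERVER: "server",       // unexpected failure on the server side
  TRANSIENT: "transient", // temporary: the same request may succeed later
});

// Map a category to a default HTTP status code.
// A real service would usually be more specific, for example
// 429 for rate limiting or 504 for an upstream timeout.
function statusForCategory(category) {
  switch (category) {
    case ErrorCategory.CLIENT:
      return 400;
    case ErrorCategory.TRANSIENT:
      return 503;
    case ErrorCategory.SERVER:
      return 500;
    default:
      throw new Error(`Unknown error category: ${category}`);
  }
}
```

The point is less the exact codes than the fact that the category is decided in one place, so it can no longer collapse into a generic “error”.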
HTTP status codes already solve this problem
You don’t need a custom error protocol. HTTP gives you one.
Use it.
Client errors (4xx)
The request is invalid. The client needs to fix something.
HTTP/1.1 400 Bad Request
{
"error": {
"code": "INVALID_STATUS",
"message": "Status must be one of: pending, shipped"
}
}
Other common cases:
- 401 Unauthorized → missing or invalid auth
- 403 Forbidden → valid auth, insufficient permissions
- 404 Not Found → resource doesn’t exist
- 409 Conflict → state conflict (e.g., duplicate resource)
Server errors (5xx)
The request was valid. The server failed.
HTTP/1.1 500 Internal Server Error
{
"error": {
"code": "INTERNAL_ERROR",
"message": "Unexpected failure"
}
}
These should be rare and alert-worthy.
Success (2xx)
Only use 200 (or 201, 204, etc.) when the operation actually succeeded.
HTTP/1.1 201 Created
{
"id": "123",
"status": "created"
}
No ambiguity.
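Put together, a handler can pick one of the three classes at the point where it knows what happened. This is a minimal, framework-free sketch: `handleCreateOrder`, `saveOrder`, and the `{ status, body }` return shape are hypothetical names chosen for illustration.

```javascript
const ALLOWED_STATUSES = ["pending", "shipped"];

// `saveOrder` is a hypothetical persistence function that may throw.
function handleCreateOrder(input, saveOrder) {
  // The client sent something invalid: 4xx.
  if (!ALLOWED_STATUSES.includes(input.status)) {
    return {
      status: 400,
      body: {
        error: {
          code: "INVALID_STATUS",
          message: "Status must be one of: pending, shipped",
        },
      },
    };
  }
  try {
    // The operation actually succeeded: 2xx.
    const order = saveOrder(input);
    return { status: 201, body: { id: order.id, status: "created" } };
  } catch (err) {
    // The request was valid but the server failed: 5xx.
    // In real code, log `err` server-side and keep internals out of the body.
    return {
      status: 500,
      body: { error: { code: "INTERNAL_ERROR", message: "Unexpected failure" } },
    };
  }
}
```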
What breaks when you misuse 200
Retries stop working correctly
Retry logic in HTTP clients, SDKs, and proxies is usually keyed on status codes such as 429, 502, 503, and 504.
If a request fails but returns 200, those automatic retries won’t happen, even for transient issues.
You’ve just made your system less resilient.
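A small sketch makes the effect visible. The set of retryable statuses is an assumption (a common choice, not a universal rule), `send` is a hypothetical function returning `{ status, body }`, and a real client would wait with backoff between attempts.

```javascript
// Statuses this sketch treats as transient.
const RETRYABLE_STATUSES = new Set([429, 502, 503, 504]);

function isRetryable(status) {
  return RETRYABLE_STATUSES.has(status);
}

// Calls `send` up to `maxAttempts` times, retrying only on retryable statuses.
// Synchronous and without delays to keep the example short.
function sendWithRetry(send, maxAttempts = 3) {
  let response;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    response = send(attempt);
    if (!isRetryable(response.status)) {
      return { response, attempts: attempt };
    }
  }
  return { response, attempts: maxAttempts };
}

// A transient failure reported honestly gets a second chance.
const honest = sendWithRetry((attempt) =>
  attempt === 1
    ? { status: 503, body: {} }
    : { status: 200, body: { ok: true } }
);

// The same kind of failure hidden behind 200 is accepted on the first try.
const hidden = sendWithRetry(() => ({
  status: 200,
  body: { success: false, error: "Temporarily unavailable" },
}));

console.log(honest.attempts, hidden.attempts); // 2 1
```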
Monitoring lies to you
Metrics often track error rates using status codes:
- % of 5xx responses
- % of non-2xx responses
If everything is 200, your dashboards show “healthy” while users are failing.
You’ve lost one of the most important signals in your system.
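To see how the signal disappears, consider a rough version of the error-rate calculation a dashboard might perform. The `serverErrorRate` helper and the sample traffic are made up for illustration.

```javascript
// Share of responses whose status code is in the 5xx range.
function serverErrorRate(statuses) {
  if (statuses.length === 0) return 0;
  const failures = statuses.filter((s) => s >= 500 && s <= 599).length;
  return failures / statuses.length;
}

// Ten requests, three of which failed on the server.
const honestStatuses = [200, 200, 500, 200, 503, 200, 200, 500, 200, 200];
// The same traffic when every failure is wrapped in a 200.
const maskedStatuses = honestStatuses.map(() => 200);

console.log(serverErrorRate(honestStatuses)); // 0.3
console.log(serverErrorRate(maskedStatuses)); // 0
```

Same traffic, same failures, but the second series would never trip an alert.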
Caching becomes dangerous
Caches (CDNs, proxies) treat a 200 response as a normal, storable result when the caching headers allow it.
If a cache stores a 200 response that actually carries an error, clients may keep receiving that failure even after the issue is fixed.
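The following toy cache illustrates the failure mode. It is deliberately naive: real caches honor `Cache-Control`, expiry, and validation, and `NaiveCache` and `maskedOrigin` are hypothetical names invented for this sketch.

```javascript
// A deliberately naive cache: stores any 200 response by URL, forever.
class NaiveCache {
  constructor() {
    this.entries = new Map();
  }

  fetch(url, origin) {
    if (this.entries.has(url)) return this.entries.get(url);
    const response = origin(url);
    if (response.status === 200) this.entries.set(url, response);
    return response;
  }
}

let healthy = false;

// Hypothetical origin that wraps failures in a 200.
const maskedOrigin = () =>
  healthy
    ? { status: 200, body: { success: true } }
    : { status: 200, body: { success: false, error: "Upstream down" } };

const cache = new NaiveCache();
cache.fetch("/orders/1", maskedOrigin); // the failure is stored
healthy = true;                         // the issue is fixed at the origin
const stale = cache.fetch("/orders/1", maskedOrigin);

console.log(stale.body.success); // false: still serving the cached failure
```

Had the origin answered with a 5xx, this cache would not have stored the response at all.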
Client code becomes inconsistent
Instead of relying on HTTP semantics:
if (response.status >= 200 && response.status < 300) {
// success (covers 200, 201, 204, ...)
}
clients now have to do:
if (response.status === 200 && response.data.success === true) {
// success
}
Every client reimplements this logic slightly differently. Bugs follow.
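When status codes are trustworthy, that logic can live in one small helper instead of being re-invented per client. This is a sketch under assumptions: the `{ status, data }` response shape mirrors the snippets above, and `HttpError` and `unwrap` are hypothetical names.

```javascript
// Error type carrying the HTTP status and the parsed error body.
class HttpError extends Error {
  constructor(status, body) {
    super(`HTTP ${status}`);
    this.name = "HttpError";
    this.status = status;
    this.body = body;
  }
}

// Returns the payload for any 2xx response and throws for everything else.
function unwrap(response) {
  if (response.status >= 200 && response.status < 300) {
    return response.data;
  }
  throw new HttpError(response.status, response.data);
}
```

Callers then handle failures in a single `catch` and branch on `err.status` when they need to distinguish, say, a 404 from a 409.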
You can still return structured errors
Using proper status codes doesn’t mean giving up detailed error information.
Combine both:
HTTP/1.1 422 Unprocessable Entity
{
"error": {
"code": "INVALID_EMAIL",
"message": "Email format is invalid",
"details": {
"field": "email"
}
}
}
This gives:
- machines → correct status for control flow
- humans → clear message for debugging
If you want a standard format, align with RFC 9457 (Problem Details for HTTP APIs), which obsoletes the earlier RFC 7807.
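As a rough sketch of what a Problem Details response could look like for the example above: `type`, `title`, `status`, and `detail` are standard members of that format, `application/problem+json` is its media type, and `field` is an extension member added here for illustration. The `problemResponse` helper itself is hypothetical.

```javascript
// Builds a Problem Details style response.
// `type` defaults to "about:blank", the default defined by the format;
// any additional properties are passed through as extension members.
function problemResponse({ status, title, detail, type = "about:blank", ...extensions }) {
  return {
    status,
    headers: { "Content-Type": "application/problem+json" },
    body: { type, title, status, detail, ...extensions },
  };
}

const response = problemResponse({
  status: 422,
  title: "Invalid email",
  detail: "Email format is invalid",
  field: "email", // extension member, mirrors "details.field" above
});
```

Note that the status appears twice, in the HTTP status line and in the body. The two should always agree.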
The migration problem
If your API already returns 200 for errors, you can’t flip a switch overnight.
A safer approach:
Step 1: introduce correct status codes for new endpoints
Don’t propagate the mistake.
Step 2: add dual signaling temporarily
HTTP/1.1 400 Bad Request
{
"success": false,
"error": { ... }
}
Clients can transition gradually.
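On the server side, dual signaling can sit behind a single switch so it is easy to remove later. A minimal sketch, where `buildErrorResponse` and the `includeLegacyFlag` option are hypothetical names:

```javascript
// Builds an error response with the correct status code.
// While `includeLegacyFlag` is true, the body also keeps the old
// `success: false` field for clients that have not migrated yet.
function buildErrorResponse(status, error, { includeLegacyFlag = true } = {}) {
  const body = includeLegacyFlag ? { success: false, error } : { error };
  return { status, body };
}

// During the transition: status code and legacy flag agree.
const transitional = buildErrorResponse(400, { code: "INVALID_INPUT" });

// After deprecation: the status code is the only success signal.
const finalShape = buildErrorResponse(
  400,
  { code: "INVALID_INPUT" },
  { includeLegacyFlag: false }
);
```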
Step 3: deprecate the success flag
Once consumers rely on status codes, remove redundant fields.
This reduces ambiguity over time.
The tradeoff: strictness vs convenience
Using proper status codes introduces some friction:
- clients must handle more status cases
- tests need to assert on both status and body
- some legacy tooling may assume 200
But the alternative is worse:
- hidden failures
- unreliable retries
- misleading observability
You’re trading a bit of upfront discipline for long-term correctness.
What to do differently this week
Find one endpoint that returns 200 on failure.
Change it to return the appropriate 4xx or 5xx status code—without changing the response body yet.
Update one client to rely on the status code instead of a success flag.
That small shift will immediately surface hidden assumptions—and make your system more predictable.
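For the client side of that experiment, something like the following could work as a starting point. It is a sketch, not a prescription: the `{ status, data }` response shape and the `classify` helper are assumptions. The status code decides success, and a leftover `success: false` inside a 2xx response is reported separately so endpoints that still mask failures become visible.

```javascript
// Classifies a response as "success", "failure", or "masked-failure".
function classify(response) {
  const ok = response.status >= 200 && response.status < 300;
  if (!ok) return "failure";

  // Temporary safety net for endpoints that have not been migrated yet.
  const body = response.data;
  if (body !== null && typeof body === "object" && body.success === false) {
    return "masked-failure";
  }
  return "success";
}
```

Counting the `masked-failure` results for a few days gives a concrete list of endpoints to fix next.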