The Difference Between Code That Works and Code That Lasts
by Eric Hanson, Backend Developer at Clean Systems Consulting
The Two-Year Test
The test is simple: will this code still be doing its job two years from now, after the assumptions it was written under have shifted? Most code that "works" was written under a specific set of assumptions: this team, this load profile, this set of requirements, these upstream systems. When those assumptions hold, the code is fine. When they change — and they will change — you find out what the code was actually built for.
The difference between code that works and code that lasts is how many assumptions are baked in silently, how expensive those assumptions are to change, and whether the code communicates its dependencies on those assumptions to future readers.
This is not an argument for writing code that anticipates every possible future. That's overengineering and has its own costs. It's an argument for being deliberate about which assumptions you're making and making them visible.
The Categories of Silent Assumptions
When code fails to last, the failure usually traces back to one of a few assumption categories:
Data shape assumptions: Your service receives a JSON payload and accesses payload.user.address.city directly. Works perfectly until the upstream team restructures their payload — at which point your service throws NPEs and nobody immediately understands why. The assumption was that this path always exists. A defensive deserializer with clear error messages when required fields are absent would have made the assumption explicit and the failure legible.
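Sketched concretely (the payload shape, the helper name, and the use of Jackson here are illustrative, not taken from the original service), the defensive version might look like this:

// A minimal sketch of a defensive read, assuming Jackson and a payload shaped
// like { "user": { "address": { "city": ... } } }. The helper name is invented.
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public final class PayloadReader {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    static String requireCity(String json) throws JsonProcessingException {
        JsonNode city = MAPPER.readTree(json).path("user").path("address").path("city");
        if (city.isMissingNode() || city.isNull()) {
            // The assumption is now explicit, and the failure names the field
            // instead of surfacing as a NullPointerException three calls later.
            throw new IllegalArgumentException(
                    "Payload is missing required field user.address.city; "
                            + "the upstream contract may have changed");
        }
        return city.asText();
    }
}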
Load assumptions: A background job that runs fine at 1,000 records takes 45 minutes at 500,000. The algorithm that was good enough at the initial data volume — an O(n²) nested loop that compares every record to every other — becomes a production incident two years later when the table grows. Load assumptions are particularly dangerous because they're invisible at development time.
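To make the shape of that failure concrete, here is a sketch of the pattern: a quadratic duplicate scan next to its linear replacement. The Entry type and the notion of a duplicate key are illustrative, not from a real incident.

// O(n²): compares every record to every other. Roughly 500,000 × 499,999 / 2,
// about 1.25 × 10^11 comparisons at the larger table size. That is the silent
// load assumption.
import java.util.HashSet;
import java.util.List;
import java.util.Set;

record Entry(String key, String payload) {}

static Set<String> duplicateKeysQuadratic(List<Entry> entries) {
    Set<String> dupes = new HashSet<>();
    for (int i = 0; i < entries.size(); i++) {
        for (int j = i + 1; j < entries.size(); j++) {
            if (entries.get(i).key().equals(entries.get(j).key())) {
                dupes.add(entries.get(i).key());
            }
        }
    }
    return dupes;
}

// O(n): one pass with a set. Same result at 1,000 records, very different cost at 500,000.
static Set<String> duplicateKeysLinear(List<Entry> entries) {
    Set<String> seen = new HashSet<>();
    Set<String> dupes = new HashSet<>();
    for (Entry e : entries) {
        if (!seen.add(e.key())) { // add() returns false when the key was already seen
            dupes.add(e.key());
        }
    }
    return dupes;
}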
Dependency behavior assumptions: Your service assumes the payment API responds in under 500ms, so your connection timeout is 1 second. When the payment provider has a slow day and responses take 2 seconds, your requests time out, your retry logic kicks in, and you've now doubled your traffic to an already-struggling upstream service. The assumption was normal behavior. The 99th percentile was never modeled.
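Here is a sketch of how that assumption can be made explicit, using the JDK's java.net.http client. The endpoint, the latency figures, and the retry budget are illustrative, and the example assumes the charge call is safe to retry.

// A sketch of naming the latency assumption instead of burying it in a constant.
// Assumes the charge endpoint is idempotent (e.g., guarded by an idempotency key).
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

public final class PaymentClient {
    // Assumption made explicit: provider p99 was roughly 1.5s when last measured,
    // so the timeout sits above the p99 rather than above the median. A slow but
    // healthy upstream should not trigger retries that double its traffic.
    private static final Duration REQUEST_TIMEOUT = Duration.ofSeconds(3);
    private static final int MAX_ATTEMPTS = 2; // one retry with backoff, not an open loop

    private final HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(1))
            .build();

    public HttpResponse<String> charge(String body) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://payments.example.com/charge"))
                .timeout(REQUEST_TIMEOUT)
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpTimeoutException last = null;
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                return client.send(request, HttpResponse.BodyHandlers.ofString());
            } catch (HttpTimeoutException e) {
                last = e;
                if (attempt < MAX_ATTEMPTS) {
                    Thread.sleep(500L * attempt); // simple backoff; real code would add jitter
                }
            }
        }
        throw last;
    }
}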
Team knowledge assumptions: Code that requires tribal knowledge to operate safely. The deploy script that works only if you run it from the main branch. The migration that requires manually disabling a constraint first. The cron job that must not run concurrently. None of this was documented because the author knew it — and would always be there to explain it. Until they weren't.
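One way to turn that last kind of tribal knowledge into something the machine enforces is to write the constraint down as code. Here is a sketch of a concurrency guard for a job that must not run twice at once; the lock-file path and job name are invented for illustration.

// A sketch of turning "this job must not run concurrently" from tribal knowledge
// into an enforced, explained constraint.
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public final class NightlyRebuildJob {
    public static void main(String[] args) throws Exception {
        Path lockFile = Path.of("/var/run/nightly-rebuild.lock");
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.tryLock()) {
            if (lock == null) {
                // The constraint is now visible to whoever triggers a second run by hand.
                System.err.println("nightly-rebuild is already running; this job is not safe "
                        + "to run concurrently because it rewrites the same staging tables.");
                System.exit(1);
            }
            runRebuild();
        }
    }

    private static void runRebuild() {
        // ... the actual job ...
    }
}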
What "Lasting" Looks Like Structurally
Code built to last is not necessarily more complex than code that merely works. Often it's simpler, because it makes explicit decisions about what to support rather than implicitly supporting everything.
// Works: assumes the config key exists and is a valid integer
int timeout = Integer.parseInt(config.get("timeout_ms"));

// Lasts: names the assumption, validates the allowed range,
// and fails fast with a clear message when the config is wrong
int timeout = config.getInt("timeout_ms")
        .filter(t -> t > 0 && t <= 30_000)
        .orElseThrow(() -> new ConfigurationException(
                "timeout_ms must be a positive integer ≤ 30000; got: "
                        + Objects.toString(config.get("timeout_ms"), "(missing)")));
The second version is more lines. It is also far easier to diagnose when something goes wrong, and it communicates its constraints to anyone who reads or deploys it.
The Role of Tests in Longevity
Tests are the most common mechanism engineers cite for making code last. The argument is correct but incomplete. Tests that cover the happy path and a few obvious edge cases document that the code works — they don't protect against the assumption failures above.
Tests that genuinely extend code longevity are:
- Contract tests that verify upstream and downstream integration behavior (Pact being the standard tool for microservice contracts)
- Property-based tests (QuickCheck for Haskell, Hypothesis for Python, jqwik for Java) that explore the space of inputs rather than just the ones you thought of
- Load tests that establish performance baselines and catch regressions before production does (k6 and Gatling are practical choices here)
Unit tests on individual functions are necessary but not sufficient. The assumptions that fail in production are usually at the boundaries.
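As a concrete example of the property-based style listed above, here is a jqwik sketch against the timeout validation from the earlier snippet. ConfigTimeout.parse is a hypothetical wrapper around that parsing and range check, and ConfigurationException is assumed to be unchecked.

// Property-based tests explore the input space rather than the handful of
// values the author thought of.
import net.jqwik.api.ForAll;
import net.jqwik.api.Property;
import net.jqwik.api.constraints.IntRange;

class TimeoutConfigProperties {

    // Every value inside the documented range is accepted and returned unchanged.
    @Property
    boolean acceptsAnyValueInRange(@ForAll @IntRange(min = 1, max = 30_000) int ms) {
        return ConfigTimeout.parse(String.valueOf(ms)) == ms;
    }

    // Anything outside the range is rejected loudly rather than silently clamped.
    @Property
    boolean rejectsValuesOutsideRange(@ForAll int ms) {
        if (ms >= 1 && ms <= 30_000) return true; // in-range values are covered above
        try {
            ConfigTimeout.parse(String.valueOf(ms));
            return false;
        } catch (ConfigurationException expected) {
            return true;
        }
    }
}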
The Documentation Debt
Code that lasts is code where the non-obvious decisions are documented close to the code. Not in a Confluence page that nobody reads. In a comment, a test name, or an error message.
// We use optimistic locking here rather than SELECT FOR UPDATE because
// the payment service holds locks for 50-200ms while validating with
// the card network. At our concurrency levels, pessimistic locking
// caused deadlocks. See incident INC-2847 for the original failure.
@Version
private Long version;
That comment is worth more than a design document. It travels with the code, it explains the constraint that wasn't obvious, and it points at the incident that taught you this lesson.
The Practical Takeaway
For your next significant feature, identify three assumptions your code makes that would cause it to fail if violated. Write them down explicitly — either as comments, as validation code that fails fast with clear messages, or as test cases that document the boundary. If you can't name three, you haven't looked carefully enough.