The Difference Between Code That Works and Code That Lasts

by Arif Ikhsanudin, Backend Developer

The Two-Year Test

Most code that "works" was written under a specific set of assumptions: this team, this load profile, this set of requirements, these upstream systems. When those assumptions hold, the code is fine. When they change — and they will change — you find out what the code was actually built for.

The difference between code that works and code that lasts is how many assumptions are baked in silently, how expensive those assumptions are to change, and whether the code communicates its dependencies on those assumptions to future readers.

This is not an argument for writing code that anticipates every possible future. That's overengineering and has its own costs. It's an argument for being deliberate about which assumptions you're making and making them visible.

The Categories of Silent Assumptions

When code fails to last, the failure usually traces back to one of a few assumption categories:

Data shape assumptions: Your service receives a JSON payload and accesses payload.user.address.city directly. Works perfectly until the upstream team restructures their payload — at which point your service throws NPEs and nobody immediately understands why. The assumption was that this path always exists. A defensive deserializer with clear error messages when required fields are absent would have made the assumption explicit and the failure legible.
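A minimal sketch of that kind of defensive read, assuming the payload has already been parsed into nested maps. The class name PayloadReader and the method requireString are hypothetical and do not come from any particular JSON library.

```java
import java.util.Map;

/** Hypothetical helper: reads a required nested string and names the exact segment that is missing. */
final class PayloadReader {
    private PayloadReader() {}

    static String requireString(Map<String, Object> payload, String... path) {
        Object current = payload;
        StringBuilder walked = new StringBuilder("payload");
        for (String key : path) {
            // The parent must be an object before we can look anything up in it.
            if (!(current instanceof Map)) {
                throw new IllegalArgumentException(
                    walked + " is not an object; cannot read '" + key + "'");
            }
            Object next = ((Map<?, ?>) current).get(key);
            walked.append('.').append(key);
            if (next == null) {
                throw new IllegalArgumentException("Required field missing: " + walked);
            }
            current = next;
        }
        if (!(current instanceof String)) {
            throw new IllegalArgumentException(
                "Expected a string at " + walked + " but found "
                    + current.getClass().getSimpleName());
        }
        return (String) current;
    }
}
```

Calling PayloadReader.requireString(payload, "user", "address", "city") on a payload that has no address fails with "Required field missing: payload.user.address", which tells the reader which part of the contract changed instead of surfacing a bare NullPointerException.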

Load assumptions: A background job that runs fine at 1,000 records takes 45 minutes at 500,000. The algorithm that was good enough at the initial data volume — an O(n²) nested loop that compares every record to every other — becomes a production incident two years later when the table grows. Load assumptions are particularly dangerous because they're invisible at development time.
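To make the growth concrete: comparing every record to every other costs n(n-1)/2 comparisons, which is 499,500 at 1,000 records and roughly 1.25 × 10^11 at 500,000. The sketch below assumes the job is looking for duplicate keys (the scenario above does not say what it compares), and DuplicateFinder is a hypothetical name.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical example: two ways to find keys that occur more than once. */
final class DuplicateFinder {
    private DuplicateFinder() {}

    /** O(n^2): compares every record to every later record. */
    static Set<String> duplicatesQuadratic(List<String> keys) {
        Set<String> duplicates = new HashSet<>();
        for (int i = 0; i < keys.size(); i++) {
            for (int j = i + 1; j < keys.size(); j++) {
                if (keys.get(i).equals(keys.get(j))) {
                    duplicates.add(keys.get(i));
                }
            }
        }
        return duplicates;
    }

    /** Expected O(n): a single pass that remembers which keys were already seen. */
    static Set<String> duplicatesLinear(List<String> keys) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new HashSet<>();
        for (String key : keys) {
            // Set.add returns false when the key was already present.
            if (!seen.add(key)) {
                duplicates.add(key);
            }
        }
        return duplicates;
    }
}
```

Both methods return the same set. The difference only shows up at a data volume that a development database rarely reaches, which is why a comment naming the expected row count is worth writing next to the quadratic version if it is kept.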

Dependency behavior assumptions: Your service assumes the payment API responds in under 500ms, so your request timeout is 1 second. When the payment provider has a slow day and responses take 2 seconds, every request times out, your retry logic kicks in, and you've now at least doubled your traffic to an already-struggling upstream service. The assumption was normal behavior. The 99th percentile was never modeled.
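One common way to keep retries from amplifying an upstream slowdown is to cap the number of attempts and spread them out with exponential backoff and random jitter. The sketch below is one possible shape for that, not the retry logic of any specific client library; BoundedRetry, call, and backoffCeiling are hypothetical names.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper: bounded retries with exponential backoff and full jitter. */
final class BoundedRetry {
    private BoundedRetry() {}

    /** Longest wait after failed attempt number `attempt` (1-based): base * 2^(attempt - 1), capped. */
    static Duration backoffCeiling(int attempt, Duration base, Duration cap) {
        int shift = Math.min(attempt - 1, 20); // bound the shift so the multiplication cannot overflow
        long millis = base.toMillis() << shift;
        return Duration.ofMillis(Math.min(millis, cap.toMillis()));
    }

    /** Runs the task up to maxAttempts times and rethrows the last failure if all attempts fail. */
    static <T> T call(Callable<T> task, int maxAttempts, Duration base, Duration cap)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1; got: " + maxAttempts);
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // retry budget spent: stop adding load upstream
                }
                long ceiling = backoffCeiling(attempt, base, cap).toMillis();
                // Full jitter: sleep a random time in [0, ceiling] so clients do not retry in lockstep.
                Thread.sleep(ThreadLocalRandom.current().nextLong(ceiling + 1));
            }
        }
        throw last;
    }
}
```

The attempt limit and the cap are the assumptions made visible: with maxAttempts set to 3, a total outage costs the upstream service at most three requests per caller, and that number is written in the code instead of being implied by a retry loop.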

Team knowledge assumptions: Code that requires tribal knowledge to operate safely. The deploy script that works only if you run it from the main branch. The migration that requires manually disabling a constraint first. The cron job that must not run concurrently. None of this was documented because the author knew it — and would always be there to explain it. Until they weren't.
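Some of this tribal knowledge can be moved into the code itself. As an illustration, the "must not run concurrently" rule can be enforced with a lock file instead of being remembered. This is a sketch using the standard java.nio file-locking API; SingleRunGuard and the lock file location are hypothetical, and file locking behavior varies by operating system and filesystem.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Optional;

/** Hypothetical guard: lets at most one run of a job hold the lock file at a time. */
final class SingleRunGuard {
    private SingleRunGuard() {}

    /** Returns the lock if this run may proceed, or empty if another run already holds it. */
    static Optional<FileLock> tryAcquire(Path lockFile) throws IOException {
        FileChannel channel = FileChannel.open(
            lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock = null;
        try {
            // tryLock returns null when another process holds the lock.
            lock = channel.tryLock();
        } catch (OverlappingFileLockException heldInThisJvm) {
            // The lock is already held inside this JVM; treat it the same way.
        } finally {
            if (lock == null) {
                channel.close();
            }
        }
        return Optional.ofNullable(lock);
    }

    /** Releases the lock and closes the channel it was acquired on. */
    static void release(FileLock lock) throws IOException {
        lock.release();
        lock.acquiredBy().close();
    }
}
```

A job would call tryAcquire at startup and exit with a message such as "another run holds the lock" when the result is empty. The guard is meant to be used once per process: the FileLock documentation notes that on some systems closing any channel on the file releases the locks the JVM holds on it.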

What "Lasting" Looks Like Structurally

Code built to last is not necessarily more complex than code that merely works. Often it's simpler, because it's made explicit decisions about what to support rather than implicitly supporting everything.

// Works: assumes the config key exists and is a valid integer
int timeout = Integer.parseInt(config.get("timeout_ms"));

// Lasts: names the assumption, validates the range, and fails fast
// with a clear message when the config is missing or out of range
int timeout = config.getInt("timeout_ms")
    .filter(t -> t > 0 && t <= 30_000)
    .orElseThrow(() -> new ConfigurationException(
        "timeout_ms must be a positive integer ≤ 30000; got: " +
        config.get("timeout_ms").orElse("(missing)")
    ));

The second version takes more lines. It is also far easier to diagnose when something goes wrong, and it communicates its constraints to anyone who reads or deploys it.
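The second snippet relies on a config object whose lookups return Optional values. One way such a wrapper could look is sketched below. Config, ConfigurationException, and timeoutMs are assumptions made for illustration, not a specific configuration library, and the error message is worded slightly differently from the snippet above.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical exception type for invalid configuration. */
final class ConfigurationException extends RuntimeException {
    ConfigurationException(String message) {
        super(message);
    }
}

/** Hypothetical config wrapper whose lookups return Optional instead of null. */
final class Config {
    private final Map<String, String> values;

    Config(Map<String, String> values) {
        this.values = Map.copyOf(values);
    }

    Optional<String> get(String key) {
        return Optional.ofNullable(values.get(key));
    }

    /** Empty when the key is absent or the value is not a valid integer. */
    Optional<Integer> getInt(String key) {
        return get(key).flatMap(raw -> {
            try {
                return Optional.of(Integer.parseInt(raw.trim()));
            } catch (NumberFormatException e) {
                return Optional.empty();
            }
        });
    }

    /** Reads timeout_ms, failing fast with the offending raw value in the message. */
    static int timeoutMs(Config config) {
        return config.getInt("timeout_ms")
            .filter(t -> t > 0 && t <= 30_000)
            .orElseThrow(() -> new ConfigurationException(
                "timeout_ms must be an integer between 1 and 30000; got: "
                    + config.get("timeout_ms").orElse("(missing)")));
    }
}
```

With this shape, a missing key, a non-numeric value, and an out-of-range value all end in the same exception, and the message always includes what was actually configured.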

The Role of Tests in Longevity

Tests are the most common mechanism engineers cite for making code last. The argument is correct but incomplete. Tests that cover the happy path and a few obvious edge cases document that the code works — they don't protect against the assumption failures above.

Tests that genuinely extend code longevity are:

  • Contract tests that verify upstream and downstream integration behavior (Pact is a widely used tool for microservice contracts)
  • Property-based tests (QuickCheck for Haskell, Hypothesis for Python, jqwik for Java) that explore the space of inputs rather than just the ones you thought of
  • Load tests that establish performance baselines and catch regressions before production does (k6 and Gatling are practical choices here)

Unit tests on individual functions are necessary but not sufficient. The assumptions that fail in production are usually at the boundaries.
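To show the idea behind property-based testing without pulling in a library, here is a hand-rolled sketch that checks one property over many random inputs. The function under test, chunk, and the class name ChunkProperty are hypothetical. A real tool such as jqwik would add input generators and shrinking of failing cases, which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Random;

/** Hypothetical example: a batching function plus a randomized check of its properties. */
final class ChunkProperty {
    private ChunkProperty() {}

    /** Splits items into consecutive chunks of at most `size` elements. */
    static <T> List<List<T>> chunk(List<T> items, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1; got: " + size);
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < items.size(); i += size) {
            chunks.add(new ArrayList<>(items.subList(i, Math.min(i + size, items.size()))));
        }
        return chunks;
    }

    /** Runs `trials` random cases and returns the first counterexample, or empty if none is found. */
    static Optional<String> check(long seed, int trials) {
        Random random = new Random(seed); // fixed seed keeps any failure reproducible
        for (int t = 0; t < trials; t++) {
            int n = random.nextInt(50);        // includes the empty list
            int size = 1 + random.nextInt(10); // chunk sizes 1..10
            List<Integer> input = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                input.add(random.nextInt());
            }
            List<Integer> rejoined = new ArrayList<>();
            for (List<Integer> c : chunk(input, size)) {
                // Property 1: no chunk is empty or larger than the requested size.
                if (c.isEmpty() || c.size() > size) {
                    return Optional.of("bad chunk length for n=" + n + ", size=" + size);
                }
                rejoined.addAll(c);
            }
            // Property 2: concatenating the chunks gives back the original list.
            if (!rejoined.equals(input)) {
                return Optional.of("elements lost or reordered for n=" + n + ", size=" + size);
            }
        }
        return Optional.empty();
    }
}
```

The properties say nothing about specific inputs. They state what must hold at every boundary, including the empty list and a list shorter than one chunk, which are the cases a hand-picked example tends to miss.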

The Documentation Debt

Code that lasts is code where the non-obvious decisions are documented close to the code. Not in a Confluence page that nobody reads. In a comment, a test name, or an error message.

// We use optimistic locking here rather than SELECT FOR UPDATE because
// the payment service holds locks for 50-200ms while validating with
// the card network. At our concurrency levels, pessimistic locking
// caused deadlocks. See incident INC-2847 for the original failure.
@Version
private Long version;

That comment is worth more than a design document. It travels with the code, it explains the constraint that wasn't obvious, and it points at the incident that taught you this lesson.

The Practical Takeaway

For your next significant feature, identify three assumptions your code makes that would cause it to fail if violated. Write them down explicitly — either as comments, as validation code that fails fast with clear messages, or as test cases that document the boundary. If you can't name three, you haven't looked carefully enough.
