Your API Gateway Should Be Doing More Than Just Routing

January 17, 2026

by Arif Ikhsanudin, Backend Developer

What most teams use their gateway for

Route /api/orders/* to the Order Service. Route /api/users/* to the User Service. Forward the request. Return the response. That is path-based reverse proxying, not an API gateway. If that's all your gateway does, you've added a network hop without capturing the value a gateway layer can provide.

The API gateway is the single entry point for all external traffic. That position in your architecture makes it the right place to enforce policies that apply to every request, regardless of which downstream service handles it. Moving those concerns into the gateway means they're enforced consistently and service teams don't re-implement them.

Authentication and authorization at the gateway

Validating JWTs at the gateway (verifying the signature, checking expiry, validating issuer and audience) means each token is validated once, at the edge. Downstream services receive the verified identity in request headers and don't need to re-validate tokens, provided they only accept traffic that has passed through the gateway.

Kong (a widely used open-source gateway) makes this declarative:

# Kong plugin: JWT validation on all routes
plugins:
- name: jwt
  config:
    secret_is_base64: false
    claims_to_verify:
    - exp
    - nbf
    key_claim_name: kid
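    # Note: the open-source jwt plugin has no jwks_uri option. Public keys are
    # registered as JWT credentials on Kong consumers and matched via the
    # key claim above. Fetching keys from a JWKS endpoint needs a different
    # plugin (for example the Enterprise openid-connect plugin) or a custom one.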

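After validation, the gateway can forward verified claims as headers. Kong's bundled jwt plugin adds consumer headers such as X-Consumer-ID and X-Consumer-Username on its own; mapping individual claims to headers takes a small custom plugin or an equivalent transformation step. Services then receive X-User-Id, X-User-Roles, and X-Tenant-Id without having touched the JWT. When your token format changes, you update the gateway plugin configuration, not eight individual service auth implementations.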
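To make the validate-once idea concrete, here is a minimal sketch in Python of what the gateway does conceptually: check the signature, expiry, issuer, and audience, then turn claims into headers. It assumes an HS256 (shared-secret) token so it can run on the standard library alone; production setups more often use asymmetric keys and a maintained JWT library. The claim names sub, roles, and tenant are assumptions for illustration, as are the function names.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a token for the demo. In practice the identity provider does this."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes, issuer: str, audience: str,
                 now: Optional[float] = None) -> dict:
    """Return the claims if signature, exp, iss and aud check out; raise ValueError otherwise."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
        if not isinstance(header, dict) or not isinstance(claims, dict):
            raise ValueError("header and payload must be JSON objects")
    except (ValueError, TypeError) as exc:
        raise ValueError("malformed token") from exc
    # Pin the algorithm instead of trusting whatever the token header asks for.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("bad signature")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("token expired or missing exp")
    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")
    aud = claims.get("aud")
    if audience not in (aud if isinstance(aud, list) else [aud]):
        raise ValueError("wrong audience")
    return claims


def identity_headers(claims: dict) -> dict:
    """Map verified claims onto the headers downstream services read (claim names are assumed)."""
    return {
        "X-User-Id": str(claims["sub"]),
        "X-User-Roles": ",".join(claims.get("roles", [])),
        "X-Tenant-Id": str(claims.get("tenant", "")),
    }
```

Whatever does the mapping should also strip any X-User-Id, X-User-Roles, or X-Tenant-Id headers sent by the client before adding its own, otherwise a caller could simply claim an identity.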

Coarse-grained authorization (is this route accessible to unauthenticated users? does this endpoint require admin role?) also belongs at the gateway. Fine-grained authorization (can this user modify this specific order?) belongs in the service.
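The coarse-grained half of that split fits in a lookup table. The sketch below is one way to picture it, not how any particular gateway implements it: the route prefixes, role names, and the authorize function are all hypothetical. It answers only "may this caller reach this route at all?" and leaves every per-resource decision to the service.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical policy table: (path prefix, required role). None means public.
# Ordered from most to least specific because the first match wins.
ROUTE_POLICY: List[Tuple[str, Optional[str]]] = [
    ("/api/health", None),
    ("/api/admin/", "admin"),
    ("/api/", "user"),
]


def authorize(path: str, roles: Optional[Set[str]]) -> int:
    """Return the HTTP status the gateway would answer with; 200 means forward upstream.

    roles is None when the request carried no verified identity.
    """
    for prefix, needed in ROUTE_POLICY:
        if path.startswith(prefix):
            if needed is None:
                return 200  # public route
            if roles is None:
                return 401  # authentication required
            return 200 if needed in roles else 403
    return 404  # no route configured for this path
```

Note what is absent: nothing here knows what an order is or who owns it. That check stays in the Order Service.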

Rate limiting

Without rate limiting at the gateway, a single misbehaving client (or an application-layer flood) can exhaust your downstream services. Rate limiting in each service is wasteful: each service re-implements the same concern, and limits are applied per service rather than per client across your API.

Gateway-level rate limiting is applied before requests reach your services:

# Kong rate limiting: 1000 requests per hour per authenticated user
plugins:
- name: rate-limiting
  config:
    hour: 1000
    policy: local          # or redis for distributed rate limiting
    limit_by: consumer     # per authenticated user
    error_code: 429
    error_message: "Rate limit exceeded"

For authenticated APIs, rate limit by user identity. For public endpoints, rate limit by IP with more generous limits. For internal service-to-service calls that bypass the gateway, rate limiting is handled at the service mesh layer or not at all (internal services are trusted not to abuse each other).
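As an illustration of the keying rule above, here is a small fixed-window limiter in Python. It is a sketch of one common algorithm, written under the assumption of a single gateway process; the class and function names are made up for the example. Authenticated requests are counted per consumer, anonymous ones per IP.

```python
import time
from typing import Callable, Dict, Optional


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each window of `window_seconds`."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self._window = -1
        self._counts: Dict[str, int] = {}

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        if window != self._window:
            # A new window started: forget all counters from the previous one.
            self._window = window
            self._counts.clear()
        count = self._counts.get(key, 0)
        if count >= self.limit:
            return False  # the gateway would answer 429 here
        self._counts[key] = count + 1
        return True


def limit_key(consumer_id: Optional[str], client_ip: str) -> str:
    """Count authenticated traffic per consumer and anonymous traffic per IP."""
    return f"consumer:{consumer_id}" if consumer_id else f"ip:{client_ip}"
```

Counters held in process memory are per gateway node, so with several nodes behind a load balancer a client can get through up to the limit on each of them. That is the trade-off behind choosing a shared store such as Redis when the limit has to be exact.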

Request and response transformation

Gateways can adapt request and response shapes without service changes. Common patterns:

Header injection: add headers downstream services need (request ID for tracing, user context, timestamp).

API versioning routing: route /v1/orders and /v2/orders to different backend services, or to different versions of the same service, without exposing how versions are implemented to clients.

Response filtering: strip internal fields from responses before they reach external clients (internal database IDs, implementation details, debugging fields that should not be in the public API).

Protocol translation: accept REST externally, translate to gRPC for internal service calls. Kong's grpc-gateway plugin, or a custom transformer, handles this translation at the gateway layer rather than in every client.
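Two of these patterns, response filtering and version routing, are simple enough to sketch in Python. The field names, upstream URLs, and function names below are hypothetical; the point is only the shape of the transformation, which a gateway plugin would apply to every response and request.

```python
from typing import Any, Dict, Tuple

# Hypothetical names of fields that must never leave the internal network.
INTERNAL_FIELDS = {"_internal_id", "debug", "db_shard"}

# Hypothetical mapping from public version prefix to backend.
VERSION_UPSTREAMS: Dict[str, str] = {
    "v1": "http://orders-v1.internal",
    "v2": "http://orders-v2.internal",
}


def strip_internal(value: Any) -> Any:
    """Recursively drop internal fields from a decoded JSON response body."""
    if isinstance(value, dict):
        return {k: strip_internal(v) for k, v in value.items() if k not in INTERNAL_FIELDS}
    if isinstance(value, list):
        return [strip_internal(item) for item in value]
    return value


def resolve_upstream(path: str) -> Tuple[str, str]:
    """Split '/v2/orders/7' into (upstream base URL, path without the version prefix)."""
    version, _, rest = path.lstrip("/").partition("/")
    if version not in VERSION_UPSTREAMS:
        raise LookupError(f"no upstream configured for version {version!r}")
    return VERSION_UPSTREAMS[version], "/" + rest
```

A deny-list like INTERNAL_FIELDS is the quickest thing to retrofit, but an allow-list of public fields fails safer when a service starts returning something new.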

Observability: where gateway instrumentation pays off

The gateway has visibility into every external request your system receives. That position makes it uniquely valuable for observability:

Access logs with client identity, route, response code, latency, and response size — the baseline for API usage analysis and abuse detection
Correlation ID injection: generate an X-Correlation-Id header on every incoming request if not already present, and propagate it downstream. All downstream services log with this ID. Tracing a user complaint becomes a log search rather than archaeology.
Latency histograms by route: gateway-level latency metrics show the end-to-end server-side response time for every API endpoint, which is much closer to what clients experience, and to what your SLA measures, than internal service latency (which excludes gateway processing and the network time between gateway and service)

# Kong Prometheus plugin: gateway-level metrics
plugins:
- name: prometheus
  config:
    status_code_metrics: true
    latency_metrics: true
    bandwidth_metrics: true
    upstream_health_metrics: true
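Correlation ID injection and per-route latency histograms are both small pieces of logic. The Python sketch below shows the idea under simplifying assumptions: headers are a plain dict, the bucket bounds are arbitrary, and the buckets are non-cumulative (Prometheus histograms report cumulative counts instead). All names are illustrative.

```python
import uuid
from bisect import bisect_left
from typing import Dict, List, Sequence


def ensure_correlation_id(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of headers that is guaranteed to carry a non-empty X-Correlation-Id."""
    out = dict(headers)
    # HTTP header names are case-insensitive, so accept any casing from the client.
    existing = next((name for name in out if name.lower() == "x-correlation-id"), None)
    if existing is None or not out[existing].strip():
        out[existing or "X-Correlation-Id"] = str(uuid.uuid4())
    return out


class LatencyHistogram:
    """Count request latencies per route in fixed buckets (upper bounds in milliseconds)."""

    def __init__(self, bounds_ms: Sequence[float]) -> None:
        self.bounds: List[float] = sorted(bounds_ms)
        self.counts: Dict[str, List[int]] = {}

    def observe(self, route: str, latency_ms: float) -> None:
        # One bucket per bound plus a final overflow bucket for anything slower.
        buckets = self.counts.setdefault(route, [0] * (len(self.bounds) + 1))
        # bisect_left finds the first bound >= latency, so a value equal to a
        # bound lands in that bound's bucket.
        buckets[bisect_left(self.bounds, latency_ms)] += 1
```

Keeping an ID the client already sent, rather than overwriting it, lets a trace that started upstream of the gateway continue unbroken.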

What the gateway should not do

Business logic: the gateway is infrastructure. If you find yourself writing business rules in gateway plugins (validate that this field has this value, apply this discount logic), that logic belongs in a service.

Service orchestration: the gateway should route to one upstream per request. If you're building a Backend for Frontend (BFF) pattern that aggregates multiple service calls, that belongs in a dedicated BFF service, not in gateway plugin code.

All authorization: coarse-grained role checks belong at the gateway. Whether user 123 can delete order 456 — that requires domain knowledge and belongs in the Order Service.

The gateway is most valuable when it's thin, consistent, and limited to genuinely cross-cutting concerns. When it accumulates business logic, it becomes a deployment bottleneck for concerns that belong to individual service teams.
