Your API Gateway Should Be Doing More Than Just Routing
by Arif Ikhsanudin, Backend Developer
What most teams use their gateway for
Route /api/orders/* to the Order Service. Route /api/users/* to the User Service. Forward the request. Return the response. That is a reverse proxy, not an API gateway. If that's all your gateway does, you've added a network hop without capturing the value a gateway layer can provide.
The API gateway is the single entry point for all external traffic. That position in your architecture makes it the right place to enforce policies that apply to every request, regardless of which downstream service handles it. Moving those concerns into the gateway means they're enforced consistently and service teams don't re-implement them.
Authentication and authorization at the gateway
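Validating JWTs at the gateway (verifying the signature, checking expiry, validating issuer and audience) means each token is validated once, at the edge. Downstream services receive the verified identity in request headers and don't need to re-validate tokens. This only holds if services accept traffic exclusively from the gateway and the gateway strips any client-supplied identity headers; otherwise a caller could forge them.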
Kong (a widely used open-source gateway) makes this declarative:
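# Kong plugin: JWT validation on all routes
plugins:
  - name: jwt
    config:
      secret_is_base64: false
      claims_to_verify:
        - exp
        - nbf
      key_claim_name: kid
      # Note: the bundled open-source jwt plugin has no jwks_uri option. It looks up
      # the verification key from the consumer credential whose key matches kid.
      # Fetching public keys from a JWKS endpoint such as
      # https://auth.internal/.well-known/jwks.json needs a plugin with JWKS
      # support (for example an OpenID Connect plugin).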
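After validation, the gateway can forward verified claims as headers. In Kong this typically takes a transformer or a small custom plugin, since the bundled jwt plugin only adds consumer and credential headers. Services receive X-User-Id, X-User-Roles, X-Tenant-Id without having touched the JWT. When your token format changes, you update the gateway plugin configuration rather than eight individual service auth implementations.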
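To make the validate-once idea concrete, here is a minimal sketch of what the gateway does with a token, written with the Python standard library only. It assumes HS256 tokens and hypothetical claim names (`roles`, `tenant`); the function names are illustrative, and a production gateway would use a vetted JWT library and most likely asymmetric keys.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build an HS256 token. Only here so the sketch can produce its own input."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_and_build_headers(token, secret, issuer, audience, now=None):
    """Return identity headers for a valid token, or raise ValueError."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")

    # Pin the algorithm instead of trusting whatever the token header asks for.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("token expired")
    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")
    aud = claims.get("aud")
    if audience not in (aud if isinstance(aud, list) else [aud]):
        raise ValueError("wrong audience")
    if "sub" not in claims:
        raise ValueError("missing subject")

    # Hypothetical claim-to-header mapping; "roles" and "tenant" are assumed names.
    return {
        "X-User-Id": str(claims["sub"]),
        "X-User-Roles": ",".join(claims.get("roles", [])),
        "X-Tenant-Id": str(claims.get("tenant", "")),
    }
```

Downstream services then read the three headers and never parse the token themselves, which is why a change of token format stays a gateway-only change.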
Coarse-grained authorization (is this route accessible to unauthenticated users? does this endpoint require admin role?) also belongs at the gateway. Fine-grained authorization (can this user modify this specific order?) belongs in the service.
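The coarse-grained half of that split can be pictured as a lookup from route prefix to required role. The sketch below is illustrative only: the policy table, the role names, and the `gateway_decision` function are assumptions, not Kong configuration.

```python
from typing import Optional, Set

# Hypothetical policy table: path prefix -> required role.
# None means the route is open to unauthenticated callers.
ROUTE_POLICY = {
    "/api/health": None,
    "/api/orders": "user",
    "/api/admin": "admin",
}


def gateway_decision(path: str, roles: Optional[Set[str]]) -> int:
    """Return the HTTP status the gateway would answer with; 200 means forward.

    roles is None when the request carries no verified identity.
    """
    matches = [p for p in ROUTE_POLICY if path == p or path.startswith(p + "/")]
    if not matches:
        return 404
    # Longest matching prefix wins, so a specific rule beats a general one.
    required = ROUTE_POLICY[max(matches, key=len)]
    if required is None:
        return 200
    if roles is None:
        return 401
    return 200 if required in roles else 403
```

Note what the table cannot express: nothing in it knows who owns order 456. That check needs the order record, so it stays in the service.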
Rate limiting
Without rate limiting at the gateway, a single misbehaving client (or a DDoS) can exhaust your downstream services. Rate limiting in each service is wasteful — each service re-implements the same concern, and limits are applied per-service rather than per-client across your API.
Gateway-level rate limiting is applied before requests reach your services:
# Kong rate limiting: 1000 requests per hour per authenticated user
plugins:
  - name: rate-limiting
    config:
      hour: 1000
      policy: local        # or redis for distributed rate limiting
      limit_by: consumer   # per authenticated user
      error_code: 429
      error_message: "Rate limit exceeded"
For authenticated APIs, rate limit by user identity. For public endpoints, rate limit by IP with more generous limits. For internal service-to-service calls that bypass the gateway, rate limiting is handled at the service mesh layer or not at all (internal services are trusted not to abuse each other).
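As a rough picture of what a per-client limit does, here is a fixed-window counter kept in memory, with the key chosen by user identity when present and by IP otherwise. The class and function names are hypothetical, and this is a simplified sketch, not Kong's implementation; a multi-node gateway would keep the counters in a shared store such as Redis.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each fixed time window."""

    def __init__(self, limit, window_seconds=3600, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        # (key, window index) -> request count. Old windows are never pruned
        # here; a real limiter would expire them.
        self._counts = defaultdict(int)

    def allow(self, key):
        window = int(self.clock() // self.window_seconds)
        bucket = (key, window)
        if self._counts[bucket] >= self.limit:
            return False  # the gateway would answer 429
        self._counts[bucket] += 1
        return True


def limit_key(user_id, client_ip):
    """Rate limit by verified user when there is one, otherwise by client IP."""
    return f"user:{user_id}" if user_id else f"ip:{client_ip}"
```

Because the key is the client and not the service, one client's budget is shared across every route behind the gateway, which per-service limiters cannot offer.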
Request and response transformation
Gateways can adapt request and response shapes without service changes. Common patterns:
Header injection: add headers downstream services need (request ID for tracing, user context, timestamp).
API versioning routing: route /v1/orders and /v2/orders to different backend services, or to different versions of the same service, without exposing how versions are implemented to clients.
Response filtering: strip internal fields from responses before they reach external clients (internal database IDs, implementation details, debugging fields that should not be in the public API).
Protocol translation: accept REST externally, translate to gRPC for internal service calls. Kong's grpc-gateway plugin, or a custom transformer, handles this translation at the gateway layer rather than in every client.
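Of the patterns above, response filtering is the easiest to show in a few lines. The sketch below removes a set of internal field names from a JSON-like response body at any nesting depth; the field names and the `strip_internal` function are made up for illustration.

```python
# Hypothetical names of fields that must never leave the internal network.
INTERNAL_FIELDS = {"_db_id", "_debug", "_shard"}


def strip_internal(value):
    """Return a copy of a decoded JSON value with internal fields removed."""
    if isinstance(value, dict):
        return {
            key: strip_internal(item)
            for key, item in value.items()
            if key not in INTERNAL_FIELDS
        }
    if isinstance(value, list):
        return [strip_internal(item) for item in value]
    return value  # strings, numbers, booleans, None pass through unchanged
```

The function returns a new structure and leaves its input untouched, so the unfiltered body can still be written to internal logs.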
Observability: where gateway instrumentation pays off
The gateway has visibility into every external request your system receives. That position makes it uniquely valuable for observability:
- Access logs with client identity, route, response code, latency, and response size — the baseline for API usage analysis and abuse detection
- Correlation ID injection: generate an X-Correlation-Id header on every incoming request if not already present, and propagate it downstream. All downstream services log with this ID. Tracing a user complaint becomes a log search rather than archaeology.
- Latency histograms by route: gateway-level latency metrics are the closest server-side measure of client-perceived response time for every API endpoint, which is the number your SLA actually cares about (not internal service latency, which excludes gateway processing and network time)
# Kong Prometheus plugin: gateway-level metrics
plugins:
  - name: prometheus
    config:
      status_code_metrics: true
      latency_metrics: true
      bandwidth_metrics: true
      upstream_health_metrics: true
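The correlation ID rule from the list above (keep the caller's ID if there is one, otherwise mint one) fits in a short function. This is a sketch with a hypothetical function name, assuming headers arrive as a plain dict; Kong also ships a correlation-id plugin that covers the same ground declaratively.

```python
import uuid

CORRELATION_HEADER = "X-Correlation-Id"


def ensure_correlation_id(headers: dict) -> dict:
    """Return a copy of headers that always carries a correlation ID."""
    forwarded = {}
    existing = None
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so match on lowercase.
        if name.lower() == CORRELATION_HEADER.lower():
            existing = existing or value
        else:
            forwarded[name] = value
    # Reuse a non-empty incoming ID; otherwise generate a random UUID.
    forwarded[CORRELATION_HEADER] = existing or str(uuid.uuid4())
    return forwarded
```

Every service that logs the value of this header makes the request traceable end to end with a single search.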
What the gateway should not do
Business logic: the gateway is infrastructure. If you find yourself writing business rules in gateway plugins (validate that this field has this value, apply this discount logic), that logic belongs in a service.
Service orchestration: the gateway should route to one upstream per request. If you're building a Backend for Frontend (BFF) pattern that aggregates multiple service calls, that belongs in a dedicated BFF service, not in gateway plugin code.
All authorization: coarse-grained role checks belong at the gateway. Whether user 123 can delete order 456 — that requires domain knowledge and belongs in the Order Service.
The gateway is most valuable when it's thin, consistent, and limited to genuinely cross-cutting concerns. When it accumulates business logic, it becomes a deployment bottleneck for concerns that belong to individual service teams.