What is an API gateway?
An API gateway is the managed entry point to your microservices, enforcing cross‑cutting concerns: authentication/authorization, quotas, schema/contract enforcement, transformations, and observability.
Why it matters
- Centralizes policy: one place to enforce auth, TLS, rate limits, and schema validation.
- Developer experience: unified address space, docs, and versioning.
- Reliability: retries, circuit breaking, and backoff close to the client edge.
Layering and scope
- Edge gateway (public): authN, DDoS/WAF, global routing, TLS termination.
- Internal gateway (east‑west): service‑to‑service identity and authZ (e.g., SPIFFE/SPIRE workload identities), traffic shaping, zero‑trust.
Core capabilities
- Authentication/Authorization (JWT/OAuth2/mTLS).
- Request/response transformation (REST↔gRPC, GraphQL composition).
- Rate limiting, quotas, usage plans, monetization.
- Observability: access logs, distributed tracing, metrics per route.
- Canary, blue/green, header‑based routing, version routing.
Design principles
- Keep business logic out of the gateway; use it for cross‑cutting concerns.
- Make configs declarative and versioned (GitOps). Changes should be reviewed and rolled out gradually.
- Retry only idempotent requests, and bound retries with a budget so they cannot amplify an outage.
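The retry principle above can be sketched roughly as follows. This is a minimal illustration, not any particular gateway's implementation: `RetryBudget`, `backoff_delay`, `call_with_retries`, the 20% ratio, and the delay values are all hypothetical names and defaults chosen for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# HTTP methods defined as idempotent; POST and PATCH are deliberately excluded.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})


class RetryBudget:
    """Permits retries only while they stay under a fraction of all requests."""

    def __init__(self, ratio: float = 0.2, min_retries: int = 3) -> None:
        self.ratio = ratio              # assumed default: retries <= 20% of requests
        self.min_retries = min_retries  # small floor so low-traffic routes can retry
        self.requests = 0
        self.retries = 0

    def record_request(self) -> None:
        self.requests += 1

    def try_spend(self) -> bool:
        allowed = max(self.min_retries, int(self.requests * self.ratio))
        if self.retries >= allowed:
            return False
        self.retries += 1
        return True


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 2.0) -> float:
    """Full jitter: a uniform random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retries(
    method: str,
    send: Callable[[], T],
    budget: RetryBudget,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(); retry on ConnectionError only if idempotent and within budget."""
    budget.record_request()
    attempt = 1
    while True:
        try:
            return send()
        except ConnectionError:
            retryable = (
                method.upper() in IDEMPOTENT_METHODS
                and attempt < max_attempts
                and budget.try_spend()
            )
            if not retryable:
                raise
            sleep(backoff_delay(attempt - 1))
            attempt += 1
```

The budget is shared across requests, so a struggling backend sees at most a bounded fraction of extra traffic instead of every client retrying at once. The sketch is not thread-safe; a real gateway would also need a sliding window so old requests age out of the budget.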
Versioning strategies
- URI versioning: /v1, /v2; simple but coarse.
- Header versioning: Accept/Content‑Type (application/vnd.company.v2+json).
- GraphQL: carefully evolve schema; avoid breaking field removals.
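To make the URI and header strategies concrete, here is a small sketch of how a gateway plugin might resolve the requested version. The function name `resolve_version`, the precedence (URI prefix first, then Accept header), and the default of version 1 are assumptions for illustration; `company` is the placeholder vendor name from the list above.

```python
import re

# Vendor media type such as application/vnd.company.v2+json.
_VENDOR_TYPE = re.compile(r"application/vnd\.company\.v(\d+)\+json", re.IGNORECASE)
# URI prefix such as /v2 or /v2/orders (but not /version).
_URI_PREFIX = re.compile(r"^/v(\d+)(?:/|$)")


def resolve_version(path: str, accept: str = "", default: int = 1) -> int:
    """Return the API version from the URI prefix, else the Accept header, else default."""
    uri_match = _URI_PREFIX.match(path)
    if uri_match:
        return int(uri_match.group(1))
    # An Accept header may list several media ranges; take the first vendor match.
    for media_range in accept.split(","):
        media_type = media_range.split(";")[0].strip()
        header_match = _VENDOR_TYPE.fullmatch(media_type)
        if header_match:
            return int(header_match.group(1))
    return default
```

For example, `resolve_version("/orders", "application/vnd.company.v2+json")` selects version 2, while a plain `application/json` request falls back to the default. The sketch ignores `q` weights in the Accept header.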
Security considerations
- Validate the JWT signature and claims: issuer, audience, expiration, and scopes.
- mTLS between gateway and backends; rotate certificates.
- WAF rules for common injection vectors and body size limits per route.
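The claim checks can be sketched as below. This assumes the token's signature has already been verified by a JWT library and that `claims` is the decoded payload; signature verification is deliberately out of scope here. The function name `validate_claims`, the 30-second leeway, and the space-delimited `scope` claim are assumptions for illustration.

```python
import time
from typing import Any, Iterable, List, Mapping, Optional


def validate_claims(
    claims: Mapping[str, Any],
    issuer: str,
    audience: str,
    required_scopes: Iterable[str],
    now: Optional[float] = None,
    leeway: float = 30.0,
) -> List[str]:
    """Return a list of validation errors; an empty list means the claims pass."""
    errors: List[str] = []
    current = time.time() if now is None else now

    if claims.get("iss") != issuer:
        errors.append("issuer mismatch")

    # "aud" may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        errors.append("audience mismatch")

    # Reject tokens without a numeric "exp" as well as expired ones.
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or current > exp + leeway:
        errors.append("token expired or exp missing")

    # Assumes scopes arrive as a space-delimited string in the "scope" claim.
    granted = set(str(claims.get("scope", "")).split())
    missing = set(required_scopes) - granted
    if missing:
        errors.append("missing scopes: " + ", ".join(sorted(missing)))

    return errors
```

Returning every failure rather than stopping at the first makes gateway logs more useful when debugging a misconfigured client.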
Reliability patterns
- Set timeouts per backend and per route; never wait indefinitely.
- Circuit breakers with half‑open probes; retry with jitter and budgets.
- Rate limits per consumer (API key, client ID) and per IP.
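A circuit breaker with a half‑open probe can be sketched as a small state machine. This is an illustrative, single-threaded sketch; the class name `CircuitBreaker`, the consecutive-failure threshold, and the default values are assumptions, and production gateways typically use rolling error rates instead.

```python
import time
from typing import Callable


class CircuitBreaker:
    CLOSED, OPEN, HALF_OPEN = "closed", "open", "half_open"

    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before probing
        self.clock = clock
        self.state = self.CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        """Return True if the caller may send a request to the backend."""
        if self.state == self.OPEN:
            if self.clock() - self.opened_at >= self.reset_timeout:
                self.state = self.HALF_OPEN  # let exactly one probe through
                return True
            return False
        if self.state == self.HALF_OPEN:
            return False  # a probe is already in flight
        return True

    def record_success(self) -> None:
        """A success (including a successful probe) closes the circuit."""
        self.state = self.CLOSED
        self.failures = 0

    def record_failure(self) -> None:
        """A failed probe, or too many consecutive failures, opens the circuit."""
        self.failures += 1
        if self.state == self.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = self.OPEN
            self.opened_at = self.clock()
```

The caller checks `allow_request()` before each backend call and reports the outcome with `record_success()` or `record_failure()`. Injecting the clock keeps the state transitions testable without sleeping.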
Observability
- Propagate correlation IDs to backends (for example, the W3C traceparent header). Add per‑route RED metrics (rate, errors, duration) and SLIs.
- Structured access logs with request/response samples for debugging.
Multi‑tenant concerns
- Separate keys/quotas per tenant; hard limits to protect shared infrastructure.
- Per‑tenant routing/isolated backends if noisy neighbors are a risk.
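Per‑tenant quotas are often implemented with one token bucket per tenant, so that one tenant exhausting its allowance does not affect the others. The sketch below is a minimal in-memory version; `TokenBucket`, `TenantLimiter`, and the parameters are hypothetical, and a multi-node gateway would need shared state (for example, a central store) rather than a local dictionary.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Refills at rate_per_sec up to burst; each request consumes one token."""

    def __init__(
        self,
        rate_per_sec: float,
        burst: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Add tokens earned since the last call, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class TenantLimiter:
    """Keeps an independent bucket per tenant ID."""

    def __init__(
        self,
        rate_per_sec: float,
        burst: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate_per_sec = rate_per_sec
        self.burst = burst
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, tenant_id: str) -> bool:
        bucket = self.buckets.get(tenant_id)
        if bucket is None:
            bucket = TokenBucket(self.rate_per_sec, self.burst, self.clock)
            self.buckets[tenant_id] = bucket
        return bucket.allow()
```

A request rejected by `allow()` would typically be answered with HTTP 429. The sketch is not thread-safe and never evicts idle tenants.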
Capabilities
- AuthN/AuthZ, mTLS, WAF, rate limiting, quotas.
- Transformation (REST↔gRPC/GraphQL), backend aggregation.
- Caching, circuit breaking, retries, timeouts.
Risks
- Central point of failure → multi‑AZ HA, canary/blue‑green, config as code.
Code: Kong route with rate limiting (declarative)
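# language-yaml
# Declarative (DB-less / decK) config; _format_version is required by Kong.
_format_version: "3.0"
services:
  - name: users
    url: http://users:8080
    routes:
      - name: users-route
        paths: ["/users"]
plugins:
  # Scoped to the route by name; allows 120 requests per minute.
  - name: rate-limiting
    route: users-route
    config:
      minute: 120
      # "local" keeps counters in memory on each node, so limits are per node.
      policy: local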
Code: Envoy JWT + routing by header
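# language-yaml
# Fragment of an HttpConnectionManager config; listeners and clusters are omitted.
http_filters:
  - name: envoy.filters.http.jwt_authn
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
      providers:
        auth0:
          issuer: https://auth.example.com/
          remote_jwks:
            http_uri:
              uri: https://auth.example.com/.well-known/jwks.json
              cluster: auth_cluster
              timeout: 5s
      rules:
        - match: { prefix: "/" }
          requires: { provider_name: auth0 }
  # The router filter must be the last filter in the chain.
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
route_config:
  virtual_hosts:
    - name: api
      domains: ["*"]
      routes:
        # Routes are evaluated in order, so the header-specific match comes first.
        - match:
            prefix: "/orders"
            headers:
              # string_match replaces the deprecated exact_match field in the v3 API.
              - name: "x-api-version"
                string_match: { exact: "v2" }
          route: { cluster: orders_v2 }
        - match: { prefix: "/orders" }
          route: { cluster: orders_v1 }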
Runbook
- Roll out changes via canary: 5% → 25% → 50% → 100% of users.
- Monitor errors, p95 latency, and saturation per route; auto‑rollback if error budgets are exhausted.
- Keep a documented manual bypass for emergency backend access.
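The staged canary in the runbook can be sketched with deterministic hashing, so that a given user stays on the same side of the split as the percentage grows. The names `canary_bucket` and `use_canary` are hypothetical, and hashing on a user ID is an assumption; a gateway might hash on an API key or client ID instead.

```python
import hashlib

# Rollout stages from the runbook, as percentages of users.
CANARY_STAGES = [5, 25, 50, 100]


def canary_bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def use_canary(user_id: str, stage_percent: int) -> bool:
    """Return True if this user should be routed to the new version."""
    return canary_bucket(user_id) < stage_percent
```

Because the bucket depends only on the user ID, anyone routed to the canary at 5% is still routed there at 25% and beyond, which avoids users flipping between versions mid-rollout. Rolling back is a matter of setting the stage percentage to 0.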