API Gateway

What is an API gateway?

An API gateway is the managed entry point to your microservices, enforcing cross‑cutting concerns: authentication/authorization, quotas, schema/contract enforcement, transformations, and observability.

Why it matters

Centralizes policy: one place to enforce auth, TLS, rate limits, and schema validation.
Developer experience: unified address space, docs, and versioning.
Reliability: retries, circuit breaking, and backoff close to the client edge.

Layering and scope

Edge gateway (public): authN, DDoS/WAF, global routing, TLS termination.
Internal gateway (east‑west): service‑to‑service identity and authZ (for example, SPIFFE/SPIRE workload identities), traffic shaping, zero‑trust.

Core capabilities

Authentication/Authorization (JWT/OAuth2/mTLS).
Request/response transformation (REST↔gRPC, GraphQL composition).
Rate limiting, quotas, usage plans, monetization.
Observability: access logs, distributed tracing, metrics per route.
Canary, blue/green, header‑based routing, version routing.

Design principles

Keep business logic out of the gateway; use it for cross‑cutting concerns.
Make configs declarative and versioned (GitOps). Changes should be reviewed and rolled out gradually.
Retry only idempotent, safe operations, and cap retries with a bounded budget.
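
A minimal Python sketch of bounded retries, assuming the backend call raises ConnectionError on a transient failure. RetryBudget and call_with_retries are hypothetical names, not the API of any gateway product. The budget caps retries at a fraction of observed requests, and the delay uses full‑jitter exponential backoff.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RetryBudget:
    """Caps retries at a fraction of observed requests (hypothetical helper)."""

    def __init__(self, ratio: float = 0.2, min_retries: int = 3) -> None:
        self.ratio = ratio
        self.min_retries = min_retries
        self.requests = 0
        self.retries = 0

    def record_request(self) -> None:
        self.requests += 1

    def try_spend(self) -> bool:
        # Allow a retry only while retries stay under ratio * requests,
        # with a small floor so low-traffic routes can still retry.
        allowed = max(self.min_retries, int(self.requests * self.ratio))
        if self.retries >= allowed:
            return False
        self.retries += 1
        return True


def call_with_retries(
    op: Callable[[], T],
    budget: RetryBudget,
    max_attempts: int = 3,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Retry an idempotent operation with full-jitter exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    budget.record_request()
    for attempt in range(max_attempts):
        try:
            return op()
        except ConnectionError:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not budget.try_spend():
                raise
            # Full jitter: sleep a random time in [0, min(cap, base * 2^attempt)].
            sleep(rng.uniform(0.0, min(max_delay, base_delay * 2 ** attempt)))
    raise RuntimeError("unreachable")
```

Sharing one RetryBudget per route or backend keeps a widespread outage from turning every request into several.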

Versioning strategies

URI versioning: /v1, /v2; simple but coarse.
Header versioning: Accept/Content‑Type (application/vnd.company.v2+json).
GraphQL: evolve the schema additively; deprecate fields before removing them to avoid breaking clients.
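
One way a gateway could resolve the requested version from the first two strategies, sketched in Python. The function name resolve_version, the default of 1, and the rule that the URI wins over the header are illustrative assumptions. "company" is the placeholder vendor name from the media type above.

```python
import re
from typing import Mapping

# Matches media types such as application/vnd.company.v2+json.
_VENDOR_VERSION = re.compile(r"application/vnd\.company\.v(\d+)\+json")
# Matches a leading /v<N> path segment such as /v2 or /v2/users.
_URI_VERSION = re.compile(r"^/v(\d+)(/|$)")


def resolve_version(path: str, headers: Mapping[str, str], default: int = 1) -> int:
    """Return the requested API major version.

    Assumption: URI versioning wins; otherwise the Accept header is used.
    Header names are assumed to be normalized to the "Accept" spelling.
    """
    uri_match = _URI_VERSION.match(path)
    if uri_match:
        return int(uri_match.group(1))
    header_match = _VENDOR_VERSION.search(headers.get("Accept", ""))
    if header_match:
        return int(header_match.group(1))
    return default
```

A request for /users with Accept: application/vnd.company.v2+json resolves to version 2, while /users with a plain application/json falls back to the default.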

Security considerations

Validate the JWT signature and claims: issuer, audience, expiration, and scopes.
mTLS between gateway and backends; rotate certificates.
WAF rules for common injection vectors and body size limits per route.
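
The JWT checks can be sketched with the Python standard library alone. This is a simplification, not a production validator. It assumes HS256 (a shared secret), whereas gateways more commonly verify RS256/ES256 tokens against a JWKS endpoint. It also assumes scopes arrive as a space‑separated "scope" claim. validate_jwt is a hypothetical name.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Iterable, Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def validate_jwt(
    token: str,
    secret: bytes,
    issuer: str,
    audience: str,
    required_scopes: Iterable[str],
    now: Optional[float] = None,
) -> Dict[str, Any]:
    """Verify an HS256 token and its claims; raise ValueError on any failure."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError as exc:
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    # Pin the algorithm; never trust the token to choose it.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        raise ValueError("wrong audience")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("token expired")
    granted = set(str(claims.get("scope", "")).split())
    missing = set(required_scopes) - granted
    if missing:
        raise ValueError(f"missing scopes: {sorted(missing)}")
    return claims
```

The signature is checked before any claim is read, and the comparison is constant‑time.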

Reliability patterns

Timeout per backend and per route; no infinite waits.
Circuit breakers with half‑open probes; retry with jitter and budgets.
Rate limits per consumer (API key, client ID) and per IP.
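
A circuit breaker with a half‑open probe can be sketched as a small state machine. The class name, thresholds, and injected clock below are illustrative assumptions. The sketch is single‑threaded, so a real gateway would need locking or per‑worker state.

```python
import time
from typing import Callable


class CircuitBreaker:
    """Closed -> open after repeated failures -> half-open probe -> closed."""

    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state == "open":
            if self.clock() - self.opened_at >= self.reset_timeout:
                # Cool-down elapsed: let exactly one probe request through.
                self.state = "half_open"
                return True
            return False
        if self.state == "half_open":
            # A probe is already in flight; keep shedding other requests.
            return False
        return True

    def record_success(self) -> None:
        self.state = "closed"
        self.failures = 0

    def record_failure(self) -> None:
        self.failures += 1
        # A failed probe reopens immediately; otherwise wait for the threshold.
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()
```

The caller checks allow_request() before forwarding to the backend and reports the outcome with record_success() or record_failure().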

Observability

Propagate correlation IDs to backends (for example, the W3C traceparent header). Add per‑route RED metrics (rate, errors, duration) and SLIs.
Structured access logs with request/response samples for debugging.

Multi‑tenant concerns

Separate keys/quotas per tenant; hard limits to protect shared infrastructure.
Per‑tenant routing/isolated backends if noisy neighbors are a risk.
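
Per‑tenant quotas can be illustrated with one token bucket per tenant, as in the Python sketch below. Token bucket is one common choice, not necessarily what a given gateway uses (some use fixed or sliding windows). The class names, rates, and in‑memory dictionary are assumptions, and a multi‑node gateway would need shared state.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Credit tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantLimiter:
    """Keeps an independent bucket per tenant so one tenant cannot starve another."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, tenant_id: str) -> bool:
        bucket = self.buckets.get(tenant_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self.clock)
            self.buckets[tenant_id] = bucket
        return bucket.allow()
```

A request that is denied would typically be answered with HTTP 429 at the gateway without reaching the backend.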

Capabilities

AuthN/AuthZ, mTLS, WAF, rate limiting, quotas.
Transformation (REST↔gRPC/GraphQL), backend aggregation.
Caching, circuit breaking, retries, timeouts.

Risks

The gateway is a central point of failure. Mitigate with multi‑AZ high availability, canary or blue/green rollouts, and config as code.

Code: Kong route with rate limiting (declarative)

# language-yaml
_format_version: "3.0"  # assumption: Kong 3.x; Kong 2.x uses "2.1"
services:
  - name: users
    url: http://users:8080
    routes:
      - name: users-route
        paths: ["/users"]
plugins:
  - name: rate-limiting
    route: users-route
    config:
      minute: 120
      policy: local

Code: Envoy JWT + routing by header

# language-yaml
http_filters:
  - name: envoy.filters.http.jwt_authn
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
      providers:
        auth0:
          issuer: https://auth.example.com/
          remote_jwks:
            http_uri:
              uri: https://auth.example.com/.well-known/jwks.json
              cluster: auth_cluster
              timeout: 5s
      rules:
        - match: { prefix: "/" }
          requires: { provider_name: auth0 }
  # The router filter must come last in the HTTP filter chain.
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
route_config:
  virtual_hosts:
    - name: api
      domains: ["*"]
      routes:
        - match:
            prefix: "/orders"
            # string_match replaces the deprecated exact_match field in the v3 API.
            headers: [{ name: "x-api-version", string_match: { exact: "v2" } }]
          route: { cluster: orders_v2 }
        - match: { prefix: "/orders" }
          route: { cluster: orders_v1 }

Runbook

Roll out changes via canary: 5% → 25% → 50% → 100% of users.
Monitor errors, p95 latency, and saturation per route; roll back automatically if error or latency budgets are exhausted.
Keep a documented manual bypass for emergency backend access.
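
The canary steps can be sketched in Python as a deterministic user bucketing function plus a promotion check. The thresholds (1% errors, 500 ms p95) and the function names are illustrative assumptions, and returning 0 is this sketch's way of signaling a rollback.

```python
import hashlib

# Traffic percentages from the runbook above.
STAGES = (5, 25, 50, 100)


def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically place a user in the first `percent` of 100 buckets.

    Hashing keeps a user in the canary as the percentage grows.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


def next_stage(
    current: int,
    error_rate: float,
    p95_ms: float,
    max_error_rate: float = 0.01,  # hypothetical error budget
    max_p95_ms: float = 500.0,     # hypothetical latency budget
) -> int:
    """Return the next traffic percentage, or 0 to signal a rollback."""
    if error_rate > max_error_rate or p95_ms > max_p95_ms:
        return 0
    later = [stage for stage in STAGES if stage > current]
    return later[0] if later else current
```

Because bucketing depends only on the user ID, a user who saw the new version at 5% keeps seeing it at 25% and beyond.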
