API Gateway

September 15, 2025

What is an API gateway?

An API gateway is the managed entry point to your microservices, enforcing cross‑cutting concerns: authentication/authorization, quotas, schema/contract enforcement, transformations, and observability.

Why it matters

  • Centralizes policy: one place to enforce auth, TLS, rate limits, and schema validation.
  • Developer experience: unified address space, docs, and versioning.
  • Reliability: retries, circuit breaking, and backoff close to the client edge.

Layering and scope

  • Edge gateway (public): authN, DDoS/WAF, global routing, TLS termination.
  • Internal gateway (east‑west): service‑to‑service identity and authZ (for example, SPIFFE/SPIRE workload identities), traffic shaping, zero‑trust.

Core capabilities

  • Authentication/Authorization (JWT/OAuth2/mTLS).
  • Request/response transformation (REST↔gRPC, GraphQL composition).
  • Rate limiting, quotas, usage plans, monetization.
  • Observability: access logs, distributed tracing, metrics per route.
  • Canary, blue/green, header‑based routing, version routing.

Design principles

  • Keep business logic out of the gateway; use it for cross‑cutting concerns.
  • Make configs declarative and versioned (GitOps). Changes should be reviewed and rolled out gradually.
  • Retry only idempotent, safe operations, and cap retries with a bounded budget.
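
The retry principle above can be sketched in a few lines. This is a minimal illustration, not any particular gateway's implementation: `RetryBudget`, `backoff_delay`, `call_with_retries`, the 20% ratio, and the delay constants are all hypothetical names and values chosen for the example.

```python
import random
import time

# HTTP methods that are safe to replay without side effects piling up.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}


class RetryBudget:
    """Allows retries only while they stay under a fraction of total requests."""

    def __init__(self, ratio=0.2, min_retries=3):
        self.ratio = ratio              # assumed: retries may add at most 20% load
        self.min_retries = min_retries  # small floor so low-traffic routes can retry
        self.requests = 0
        self.retries = 0

    def record_request(self):
        self.requests += 1

    def try_spend(self):
        allowed = max(self.min_retries, int(self.requests * self.ratio))
        if self.retries < allowed:
            self.retries += 1
            return True
        return False


def backoff_delay(attempt, base=0.1, cap=2.0, rng=random):
    """Full jitter: a uniform delay in [0, min(cap, base * 2**attempt)] seconds."""
    return rng.uniform(0, min(cap, base * (2 ** attempt)))


def call_with_retries(method, send, budget, max_attempts=3, sleep=time.sleep):
    """Call send(); retry connection failures only for idempotent methods within budget."""
    budget.record_request()
    for attempt in range(max_attempts):
        try:
            return send()
        except ConnectionError:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or method not in IDEMPOTENT_METHODS or not budget.try_spend():
                raise
            sleep(backoff_delay(attempt))
```

The budget is shared across calls on purpose: when a backend is failing broadly, the budget runs out and the gateway stops amplifying load with retries.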

Versioning strategies

  • URI versioning: /v1, /v2; simple but coarse.
  • Header versioning: Accept/Content‑Type (application/vnd.company.v2+json).
  • GraphQL: carefully evolve schema; avoid breaking field removals.
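
As a rough sketch of how the URI and header strategies above might be resolved at the gateway, the function below extracts a version from either source. The function name, the rule that the URI prefix wins over the header, and the default version are assumptions for illustration. The media type follows the `application/vnd.company.v2+json` pattern shown above.

```python
import re

# Matches the vendor media type pattern from the text, e.g. application/vnd.company.v2+json
VENDOR_RE = re.compile(r"application/vnd\.company\.v(\d+)\+json")
# Matches a leading /v<N> path segment, e.g. /v2/orders or /v2
PATH_RE = re.compile(r"^/v(\d+)(/|$)")


def resolve_version(path, headers, default=1):
    """Return (version, path_without_version_prefix).

    Assumed precedence: URI prefix first, then the Accept header, then the default.
    `headers` is a plain dict with a canonical "Accept" key.
    """
    m = PATH_RE.match(path)
    if m:
        rest = path[m.end(1):] or "/"
        return int(m.group(1)), rest
    m = VENDOR_RE.search(headers.get("Accept", ""))
    if m:
        return int(m.group(1)), path
    return default, path
```

For example, `resolve_version("/v2/orders", {})` yields `(2, "/orders")`, and a request to `/orders` with `Accept: application/vnd.company.v2+json` also resolves to version 2.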

Security considerations

  • Validate the JWT signature, issuer, audience, expiration, and scopes.
  • mTLS between gateway and backends; rotate certificates.
  • WAF rules for common injection vectors and body size limits per route.
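
The JWT checks listed above can be illustrated with the standard library alone. This sketch assumes HS256 (a shared secret) to stay self‑contained. Gateways more commonly verify RS256 or ES256 signatures against a JWKS endpoint, which needs a crypto library. The function names and the space‑separated `scope` claim are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(part):
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_hs256(claims, secret):
    """Demo helper only: mint a token so the validator has something to check."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def validate_jwt(token, secret, issuer, audience, required_scopes, now=None):
    """Return the claims if every check passes; raise ValueError otherwise."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
        if not isinstance(header, dict) or not isinstance(claims, dict):
            raise ValueError("header and payload must be JSON objects")
    except Exception as exc:
        raise ValueError("malformed token") from exc

    # Pin the algorithm; never trust the token to choose it.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")

    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("expired or missing exp")
    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")
    aud = claims.get("aud")
    if audience not in (aud if isinstance(aud, list) else [aud]):
        raise ValueError("wrong audience")
    granted = set(str(claims.get("scope", "")).split())
    if not set(required_scopes) <= granted:
        raise ValueError("missing scope")
    return claims
```

The signature is checked before any claim is read, so unauthenticated input never influences the authorization decision.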

Reliability patterns

  • Timeout per backend and per route; no infinite waits.
  • Circuit breakers with half‑open probes; retry with jitter and budgets.
  • Rate limits per consumer (API key, client ID) and per IP.
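
To make the circuit breaker bullet concrete, here is a minimal single‑threaded sketch of the closed → open → half‑open cycle. The class name, thresholds, and the injectable `clock` are assumptions. A production gateway would also need locking and per‑backend instances.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before probing
        self.clock = clock
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, fn):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("backend circuit is open")
            self.state = "half_open"  # timeout elapsed: let one probe through
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed probe, or too many consecutive failures, (re)opens the circuit.
            if self.state == "half_open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.state = "closed"
        return result
```

While the circuit is open, callers fail fast instead of waiting on the timeout, which frees gateway resources for healthy routes.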

Observability

  • Propagate correlation IDs to backends (W3C traceparent header). Add per‑route RED metrics (rate, errors, duration) and SLIs.
  • Structured access logs with request/response samples for debugging.
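
A sketch of the propagation step, assuming the W3C Trace Context `traceparent` format (`version-traceid-parentid-flags`). The function name is hypothetical, and marking new traces as sampled (`01`) is an assumption, since a real gateway would apply its own sampling policy.

```python
import re
import secrets

# version 00, 32 hex chars of trace ID, 16 hex chars of parent (span) ID, 2 hex chars of flags
TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def outbound_traceparent(inbound=None):
    """Build the header to send to the backend.

    Keeps the caller's trace ID when the inbound header is valid, starts a new
    trace otherwise, and always mints a fresh span ID for the gateway hop.
    """
    m = TRACEPARENT_RE.fullmatch(inbound or "")
    # All-zero trace or parent IDs are invalid per the spec.
    if m and m.group(1) != "0" * 32 and m.group(2) != "0" * 16:
        trace_id, flags = m.group(1), m.group(3)
    else:
        trace_id, flags = secrets.token_hex(16), "01"  # assumed: new traces are sampled
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"
```

Logging the trace ID in the structured access log lets a single request be followed from the edge to every backend it touched.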

Multi‑tenant concerns

  • Separate keys/quotas per tenant; hard limits to protect shared infrastructure.
  • Per‑tenant routing/isolated backends if noisy neighbors are a risk.
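
One way to picture separate quotas per tenant is a token bucket per tenant key. This is an in‑memory sketch under assumed names (`TokenBucket`, `TenantLimiter`) and an assumed plan format. A shared store would be needed to enforce limits across gateway nodes.

```python
import time


class TokenBucket:
    """Refills at rate_per_sec up to burst; each request spends one token."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Add tokens for the time elapsed, never exceeding the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantLimiter:
    """Keeps one bucket per tenant so a noisy tenant cannot drain another's quota."""

    def __init__(self, plans, clock=time.monotonic):
        self.plans = plans  # hypothetical format: tenant -> (rate_per_sec, burst)
        self.clock = clock
        self.buckets = {}

    def allow(self, tenant):
        if tenant not in self.plans:
            return False  # hard limit: unknown tenants get no capacity
        if tenant not in self.buckets:
            rate, burst = self.plans[tenant]
            self.buckets[tenant] = TokenBucket(rate, burst, self.clock)
        return self.buckets[tenant].allow()
```

A plan of `(2.0, 2)` corresponds to roughly 120 requests per minute with a burst of two. When `allow` returns false, the gateway would typically answer with HTTP 429.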

Capabilities

  • AuthN/AuthZ, mTLS, WAF, rate limiting, quotas.
  • Transformation (REST↔gRPC/GraphQL), backend aggregation.
  • Caching, circuit breaking, retries, timeouts.

Risks

  • The gateway is a central point of failure; mitigate with multi‑AZ HA, canary/blue‑green rollouts, and config as code.

Code: Kong route with rate limiting (declarative)

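# language-yaml
# Kong declarative (DB-less) config. The loader requires _format_version
# ("3.0" for Kong 3.x).
_format_version: "3.0"
services:
  - name: users
    url: http://users:8080
    routes:
      - name: users-route
        paths: ["/users"]
plugins:
  # Attach the plugin to the route by name.
  - name: rate-limiting
    route: users-route
    config:
      minute: 120     # requests allowed per minute
      policy: local   # counters are kept per node, not shared across the cluster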

Code: Envoy JWT + routing by header

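# language-yaml
# Fragment of an HTTP connection manager config. auth_cluster, orders_v1, and
# orders_v2 must be defined as clusters elsewhere, and the router filter must
# follow jwt_authn in http_filters.
http_filters: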
  - name: envoy.filters.http.jwt_authn
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
      providers:
        auth0:
          issuer: https://auth.example.com/
          remote_jwks:
            http_uri:
              uri: https://auth.example.com/.well-known/jwks.json
              cluster: auth_cluster
              timeout: 5s
      rules:
        - match: { prefix: "/" }
          requires: { provider_name: auth0 }
route_config:
  virtual_hosts:
    - name: api
      domains: ["*"]
      routes:
        - match:
            prefix: "/orders"
            # exact_match is deprecated in the v3 API; use string_match instead.
            headers: [{ name: "x-api-version", string_match: { exact: "v2" } }]
          route: { cluster: orders_v2 }
        - match: { prefix: "/orders" }
          route: { cluster: orders_v1 }

Runbook

  • Roll out changes via canary: 5% → 25% → 50% → 100% of users.
  • Monitor errors, p95 latency, and saturation per route; roll back automatically if error budgets are exhausted.
  • Keep a documented manual bypass for emergency backend access.
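
The canary steps above could be driven by logic along these lines. It is a sketch only: the hashing scheme, the salt, the function names, and the error‑rate and latency thresholds are assumptions, not values from any specific gateway.

```python
import hashlib

STAGES = [5, 25, 50, 100]  # rollout percentages from the runbook


def canary_bucket(user_id, salt="gateway-rollout"):
    """Map a user to a stable bucket in [0, 100) so assignment survives restarts."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100


def pick_cluster(user_id, canary_percent):
    """Users whose bucket falls below the current percentage go to the canary."""
    return "canary" if canary_bucket(user_id) < canary_percent else "stable"


def next_stage(current_percent, error_rate, p95_ms, max_error_rate=0.01, max_p95_ms=500):
    """Return the next rollout percentage, or 0 to signal a rollback.

    The thresholds are placeholders for whatever the route's SLOs define.
    """
    if error_rate > max_error_rate or p95_ms > max_p95_ms:
        return 0
    for stage in STAGES:
        if stage > current_percent:
            return stage
    return 100
```

Because buckets are fixed per user, anyone in the 5% cohort stays on the canary at 25% and beyond, which keeps individual user experience consistent during the rollout.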