
Web Services & APIs — Operations

Practical guide to deploying, documenting, securing, versioning, testing, and monitoring web service APIs.


API Specification Formats

Specifications are machine-readable contracts for APIs — enabling codegen, mock servers, linting, and documentation.

OpenAPI 3.1 (REST)

The industry standard for describing RESTful HTTP APIs. Version 3.1 aligns with JSON Schema draft 2020-12.

openapi: 3.1.0
info:
  title: Orders API
  version: 2.4.0
  contact:
    email: api@example.com
  license:
    name: Apache 2.0
servers:
  - url: https://api.example.com/v2
    description: Production
  - url: https://sandbox.api.example.com/v2
    description: Sandbox

paths:
  /orders/{orderId}:
    get:
      operationId: getOrder
      summary: Retrieve a single order
      tags: [Orders]
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
            format: uuid
      responses:
        "200":
          description: Order found
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Order"
        "404":
          $ref: "#/components/responses/NotFound"
      security:
        - bearerAuth: []

components:
  schemas:
    Order:
      type: object
      required: [id, status, createdAt]
      properties:
        id:
          type: string
          format: uuid
        status:
          type: string
          enum: [pending, confirmed, shipped, delivered, cancelled]
        createdAt:
          type: string
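          format: date-time
    ProblemDetail:
      type: object
      properties:
        type:
          type: string
          format: uri
        title:
          type: string
        status:
          type: integer
        detail:
          type: string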
  responses:
    NotFound:
      description: Resource not found
      content:
        application/json:
          schema:
            $ref: "#/components/schemas/ProblemDetail"
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT

Key OpenAPI 3.1 improvements over 3.0:
  • Full JSON Schema 2020-12 alignment (replaces OpenAPI's extended subset)
  • webhooks top-level field for inbound webhooks
  • discriminator improvements, const, $schema per-schema
  • exclusiveMinimum/exclusiveMaximum now numeric (not boolean)

AsyncAPI 3.0 (Event-Driven APIs)

The OpenAPI counterpart for message-driven APIs over WebSocket, MQTT, Kafka, AMQP, and SNS/SQS.

asyncapi: 3.0.0
info:
  title: Order Events API
  version: 1.0.0
channels:
  orderCreated:
    address: orders.created
    messages:
      OrderCreated:
        payload:
          type: object
          properties:
            orderId:
              type: string
            customerId:
              type: string
operations:
  onOrderCreated:
    action: receive
    channel:
      $ref: "#/channels/orderCreated"

Protocol Buffers IDL (gRPC)

See architecture#protocol-buffers for the full .proto format. The .proto file IS the API spec for gRPC services.

Tooling comparison:

| Format | Ecosystem | Codegen | Mock Server | Linting |
|---|---|---|---|---|
| OpenAPI 3.1 | REST | Any language | Prism, WireMock | Spectral, Vacuum |
| AsyncAPI 3.0 | Event-driven | Node.js, Java | Microcks | AsyncAPI Studio |
| Protobuf | gRPC | Any language | grpc-go test server | buf lint |
| WSDL | SOAP | Java, .NET, Python | SoapUI | SoapUI |

API Gateways

An API gateway is the single entry point for all client traffic — handling routing, auth enforcement, rate limiting, observability, and protocol translation.

flowchart LR
    C1[Mobile Client] --> GW[API Gateway]
    C2[Browser] --> GW
    C3[Partner API] --> GW
    GW -->|/orders| OS[Orders Service]
    GW -->|/users| US[User Service]
    GW -->|/products| PS[Product Service]
    GW --> Auth[Auth Service]
    GW --> RL[Rate Limiter\nRedis]
    GW --> Log[Observability\nDatadog / Grafana]

Kong Gateway

Open-source gateway built on NGINX + OpenResty (Lua). Enterprise tier adds RBAC, Dev Portal, and Vitals analytics.

# Kong declarative config (deck format)
_format_version: "3.0"
services:
  - name: orders-service
    url: http://orders-service:8080
    plugins:
      - name: rate-limiting
        config:
          minute: 1000
          policy: redis
          redis_host: redis
      - name: jwt
        config:
          claims_to_verify: [exp]
    routes:
      - name: orders-route
        paths: [/v2/orders]
        strip_path: false
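        methods: [GET, POST, PUT, PATCH, DELETE]

# Kong Admin API: add a plugin to the route.
# The correlation-id plugin generates a fresh UUID for each request. A $(uuidgen)
# inside the curl command is expanded once by the shell, so every request
# would carry the same X-Request-ID.
curl -X POST http://kong:8001/routes/orders-route/plugins \
  --data name=correlation-id \
  --data config.header_name=X-Request-ID \
  --data config.generator=uuid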

Envoy Proxy

High-performance C++ proxy developed at Lyft. It operates as the data plane in the Istio service mesh and is configured via xDS APIs (dynamic) or static YAML.

# Envoy static config — HTTP rate limit filter
http_filters:
  - name: envoy.filters.http.ratelimit
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
      domain: orders_api
      rate_limit_service:
        grpc_service:
          envoy_grpc:
            cluster_name: rate_limit_service
        transport_api_version: V3

AWS API Gateway

Managed gateway for REST, HTTP, and WebSocket APIs. Integrates natively with Lambda, ALB, and VPC Link.

# Create HTTP API (simpler, lower cost than REST API)
aws apigatewayv2 create-api \
  --name orders-api \
  --protocol-type HTTP \
  --target arn:aws:lambda:us-east-1:123456789012:function:orders-handler

# Add JWT authorizer
aws apigatewayv2 create-authorizer \
  --api-id abc123 \
  --authorizer-type JWT \
  --identity-source '$request.header.Authorization' \
  --jwt-configuration Audience=orders-api,Issuer=https://auth.example.com \
  --name JwtAuthorizer

Gateway comparison:

| Gateway | Deployment | Config Model | Best For |
|---|---|---|---|
| Kong | Self-hosted / Cloud | Declarative YAML / Admin API | Large teams, plugin ecosystem |
| Envoy | Self-hosted (sidecar) | xDS (dynamic) / YAML | Service mesh, Kubernetes |
| AWS API Gateway | Managed | Console / CDK / SAM | AWS-native serverless |
| Nginx | Self-hosted | Imperative config | Simple reverse proxy |
| Traefik | Self-hosted | Auto-discover (Kubernetes) | Kubernetes ingress |
| Azure API Management | Managed | Portal / ARM / Bicep | Azure-native |

Authentication and Authorization

API Keys

Simplest scheme. Suitable for server-to-server or developer access where OAuth overhead is unneeded.

GET /v2/orders HTTP/1.1
X-API-Key: sk_live_a1b2c3d4e5f6

Best practices:
  • Prefix keys by environment: sk_live_, sk_test_
  • Store only the hash (SHA-256) in the database, never the plaintext key
  • Rotate on compromise; provide a 30-day grace period during planned rotations
  • Associate keys with scopes: orders:read, orders:write
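The hash-only storage rule above can be sketched with the Python standard library. This is a minimal illustration, not a complete key service: the helper names (`generate_api_key`, `hash_api_key`, `verify_api_key`) are hypothetical, and a real system would also store scopes and rotation metadata next to the hash.

```python
import hashlib
import hmac
import secrets


def generate_api_key(environment: str = "live") -> str:
    """Create a random key with an environment prefix such as sk_live_."""
    return f"sk_{environment}_{secrets.token_urlsafe(32)}"


def hash_api_key(api_key: str) -> str:
    """Return the SHA-256 hex digest; this is the only value persisted."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify_api_key(presented_key: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored hash in constant time."""
    presented_hash = hash_api_key(presented_key)
    return hmac.compare_digest(presented_hash.encode(), stored_hash.encode())
```

The plaintext key is shown to the developer once at creation time; afterwards only `hash_api_key(key)` is kept, so a database leak does not expose usable credentials.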

JWT (JSON Web Tokens)

Stateless bearer tokens. Three base64url-encoded parts: header, payload, signature.

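# Decode the JWT payload without verifying it (debugging only).
# The payload is base64url without padding, so map the URL-safe alphabet back first;
# base64 may still warn about the missing padding, which is safe to ignore here.
echo "eyJhbGci..." | cut -d. -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | jq

// Payload claims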
{
  "sub": "user_01HXYZ",
  "iss": "https://auth.example.com",
  "aud": "orders-api",
  "exp": 1745600000,
  "iat": 1745596400,
  "scope": "orders:read orders:write",
  "jti": "01HXYZ-unique-token-id"
}

JWT security checklist:
  • Use RS256 (asymmetric) for public key distribution, not HS256 (shared secret)
  • Short expiry: 15 minutes for access tokens; refresh tokens via httpOnly cookies
  • Validate iss, aud, exp, nbf on every request
  • Include jti (JWT ID) for revocation lookup in a Redis blocklist
  • Never store sensitive data in the payload: JWTs are encoded, not encrypted (use JWE for confidentiality)

OAuth 2.0 / OAuth 2.1

Authorization Code + PKCE (browser and mobile clients):

sequenceDiagram
    participant U as User
    participant C as Client App
    participant AS as Auth Server
    participant RS as Resource Server

    C->>C: Generate code_verifier, code_challenge = BASE64URL(SHA256(code_verifier))
    C->>AS: GET /authorize?response_type=code&client_id=...&code_challenge=...&code_challenge_method=S256
    AS->>U: Login + Consent screen
    U->>AS: Approve
    AS->>C: Redirect with ?code=AUTH_CODE
    C->>AS: POST /token {code, code_verifier, client_id}
    AS->>C: {access_token, refresh_token, expires_in}
    C->>RS: GET /orders Authorization: Bearer ACCESS_TOKEN
    RS->>C: 200 {orders: [...]}
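The first step of the flow above, creating the verifier and its challenge, can be sketched as follows. It assumes the S256 method from RFC 7636; the function names are illustrative only.

```python
import base64
import hashlib
import secrets


def derive_code_challenge(code_verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) with the padding stripped."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for one authorization request."""
    # 32 random bytes encode to a 43-character verifier (RFC 7636 allows 43 to 128).
    random_bytes = secrets.token_bytes(32)
    code_verifier = base64.urlsafe_b64encode(random_bytes).rstrip(b"=").decode("ascii")
    return code_verifier, derive_code_challenge(code_verifier)
```

The client sends only the challenge in the `/authorize` request and keeps the verifier private until the `/token` call, so an intercepted authorization code is useless on its own.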

Client Credentials (machine-to-machine):

curl -X POST https://auth.example.com/oauth/token \
  -d grant_type=client_credentials \
  -d client_id=service-account \
  -d client_secret=secret \
  -d scope="orders:read inventory:write"

OAuth 2.1 key changes (draft consolidation):
  • PKCE required for every client using the authorization code flow, confidential clients included
  • Implicit flow removed
  • Resource Owner Password Credentials (ROPC) flow removed
  • Refresh tokens for public clients must be sender-constrained or rotated on each use

mTLS (Mutual TLS)

Both client and server present certificates — eliminates shared secrets for service-to-service auth.

# Generate a client key, then a certificate signed by your CA
openssl genrsa -out client.key 2048
openssl req -new -key client.key -out client.csr \
  -subj "/CN=orders-service/O=internal"
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out client.crt -days 365

# Call API with client cert
curl --cert client.crt --key client.key \
  --cacert ca.crt \
  https://internal-api.example.com/v2/orders

In Kubernetes: use SPIFFE/SPIRE for automatic workload identity, or let Istio inject mTLS transparently via sidecar.


API Versioning

Versioning Strategies

| Strategy | Example | Pros | Cons |
|---|---|---|---|
| URI path | /v2/orders | Most visible, easy routing | Breaks resource identity |
| Query param | /orders?version=2 | Non-breaking URL | Easily forgotten, cache unfriendly |
| Header | API-Version: 2024-01-01 | Clean URLs | Less discoverable |
| Content negotiation | Accept: application/vnd.api+json;version=2 | RFC-compliant | Complex client setup |

URI versioning is the most common choice for public APIs (used by Stripe and Twilio, for example). Header versioning with calendar dates (such as Stripe-Version: 2023-10-16) is used by Stripe alongside URI versioning for fine-grained migrations, and GitHub's REST API selects versions with a dated X-GitHub-Api-Version header.

Calendar-Based Versioning (Stripe Pattern)

Instead of major version bumps, every breaking change gets a calendar date:

GET /v1/charges HTTP/1.1
Stripe-Version: 2023-10-16

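Each account is pinned to the API version that was current when it first called the API. Customers opt into newer versions explicitly, either per request with the header or by upgrading the account default.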

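Deprecation Headers (RFC 9745 and RFC 8594)

HTTP/1.1 200 OK
Deprecation: @1767225600
Sunset: Fri, 01 Jan 2027 00:00:00 GMT
Link: <https://docs.example.com/migration/v3>; rel="deprecation"
  • Deprecation: when the endpoint was (or will be) deprecated, as a Unix timestamp prefixed with @ (RFC 9745); the value above is 2026-01-01T00:00:00Z
  • Sunset: when it will stop working, as an HTTP-date (RFC 8594)
  • Link: migration guide, using the deprecation link relation defined in RFC 9745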

Non-Breaking vs Breaking Changes

Non-breaking (safe to ship):
  • Adding optional request fields
  • Adding new response fields
  • Adding new endpoints
  • New enum values (unless clients use exhaustive matching)

Breaking (require new version):
  • Removing or renaming fields
  • Changing field types
  • Changing HTTP method for an operation
  • Altering authentication requirements
  • Removing enum values


Rate Limiting

Rate limiting protects services from abuse, ensures fair usage, and enables monetization tiers.

Algorithms

Token Bucket (allow bursting):

capacity = 100 tokens
refill_rate = 10 tokens/second

on request:
  tokens = min(capacity, tokens + refill_rate × seconds_since_last_request)
  if tokens >= cost:
    tokens -= cost
    return ALLOW
  else:
    return 429 Too Many Requests

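AWS API Gateway throttling uses a token bucket. Kong's open-source rate-limiting plugin counts requests in fixed windows instead, and its enterprise Rate Limiting Advanced plugin uses a sliding window.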
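The token bucket pseudocode above can be sketched as a small in-memory class. The `capacity`, `refill_rate`, and `cost` names follow the pseudocode; the `clock` parameter is an assumption added so the refill can be tested without sleeping. Tokens are refilled lazily on each call instead of by a background timer.

```python
import time


class TokenBucket:
    """Single-process token bucket; not thread-safe and not shared across nodes."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock
        self._tokens = capacity         # start full so an initial burst is allowed
        self._last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend tokens if the request fits, else False (HTTP 429)."""
        now = self._clock()
        elapsed = now - self._last_refill
        self._last_refill = now
        # Credit the tokens earned since the last call, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

With `TokenBucket(capacity=100, refill_rate=10)`, a client can burst 100 requests at once and then sustain 10 requests per second.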

Sliding Window Log (most precise):

Stores timestamp of each request. Counts requests within [now - window, now]. High memory cost at scale.

Sliding Window Counter (approximation, low memory):

rate = (prev_count × (1 - elapsed/window)) + curr_count

Redis-based implementation: two counters (current window, previous window) per key.
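A minimal in-memory sketch of the sliding window counter formula above. It keeps the two counters in process memory rather than Redis, which is a simplification for illustration; the class name and the injectable `clock` are assumptions.

```python
import time


class SlidingWindowCounter:
    """Approximates a sliding window from the current and previous fixed windows."""

    def __init__(self, limit: int, window: float, clock=time.time):
        self.limit = limit
        self.window = window  # window length in seconds
        self._clock = clock
        self._window_index = int(clock() // window)
        self._curr_count = 0
        self._prev_count = 0

    def allow(self) -> bool:
        now = self._clock()
        index = int(now // self.window)
        if index != self._window_index:
            # Roll over: the old current window becomes "previous" only if it is
            # directly adjacent; after a longer gap there is no recent traffic.
            adjacent = index == self._window_index + 1
            self._prev_count = self._curr_count if adjacent else 0
            self._curr_count = 0
            self._window_index = index
        elapsed = now - index * self.window
        # rate = prev_count * (1 - elapsed/window) + curr_count
        estimated = self._prev_count * (1 - elapsed / self.window) + self._curr_count
        if estimated + 1 > self.limit:
            return False
        self._curr_count += 1
        return True
```

Halfway through a window, half of the previous window's count still weighs on the estimate, which smooths out the boundary spike that a plain fixed window allows.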

Fixed Window (simplest, boundary spike risk):

Resets counter at fixed intervals. A burst at 11:59:59 and 12:00:01 yields 2× the allowed rate.

Response Headers

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1745600000

On 429:

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1745600000
Content-Type: application/problem+json

{
  "type": "https://api.example.com/errors/rate-limit-exceeded",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "You have exceeded 1000 requests per minute."
}

Rate Limit Keys

Choose the right granularity:

| Key | Use Case |
|---|---|
| IP address | Unauthenticated public APIs, DDoS protection |
| API key | Developer tier enforcement |
| User ID | Per-account limits after auth |
| Endpoint | Expensive operations (e.g., /search) |
| Tenant ID | SaaS multi-tenant isolation |

CORS (Cross-Origin Resource Sharing)

CORS restricts which browser origins can call your API. It does NOT protect server-to-server calls.

# Preflight request (browser auto-sends for non-simple requests)
OPTIONS /v2/orders HTTP/1.1
Origin: https://app.example.com
Access-Control-Request-Method: POST
Access-Control-Request-Headers: Authorization, Content-Type

# Server response
HTTP/1.1 204 No Content
Access-Control-Allow-Origin: https://app.example.com
Access-Control-Allow-Methods: GET, POST, PUT, PATCH, DELETE, OPTIONS
Access-Control-Allow-Headers: Authorization, Content-Type, X-Request-ID
Access-Control-Max-Age: 86400
Access-Control-Allow-Credentials: true

Critical rules:
  • Never set Access-Control-Allow-Origin: * with Access-Control-Allow-Credentials: true; browsers block it
  • Maintain an allowlist of trusted origins; validate dynamically against it
  • Cache preflight with Access-Control-Max-Age to reduce OPTIONS overhead
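The allowlist rule above might look like the following framework-independent helper. The origins in `ALLOWED_ORIGINS` and the function name are hypothetical; a real service would load the list from configuration and attach the returned headers in its web framework's middleware.

```python
from typing import Optional

# Hypothetical allowlist; exact-match strings only, no wildcards or substrings.
ALLOWED_ORIGINS = frozenset({
    "https://app.example.com",
    "https://admin.example.com",
})


def cors_headers(origin: Optional[str]) -> dict:
    """Return CORS response headers for an allowed origin, or {} to deny."""
    if origin is None or origin not in ALLOWED_ORIGINS:
        return {}
    return {
        # Echo the specific origin: "*" cannot be combined with credentials.
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        # The response differs per origin, so shared caches must key on it.
        "Vary": "Origin",
    }
```

Exact string matching matters here: prefix or substring checks would let an origin such as `https://app.example.com.evil.test` through.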


API Design Best Practices

Resource Naming

# Good — noun-based, plural, lowercase
GET    /v2/orders
POST   /v2/orders
GET    /v2/orders/{orderId}
PUT    /v2/orders/{orderId}
PATCH  /v2/orders/{orderId}
DELETE /v2/orders/{orderId}

# Nested resources — use sparingly; max 2 levels deep
GET /v2/orders/{orderId}/items
POST /v2/orders/{orderId}/items

# Actions (verbs) — use only for operations that don't map to CRUD
POST /v2/orders/{orderId}/cancel
POST /v2/orders/{orderId}/refund
POST /v2/payments/{paymentId}/capture

Idempotency Keys

Prevent duplicate processing when clients retry on network failure.

POST /v2/orders HTTP/1.1
Idempotency-Key: 01HXYZ-unique-request-id
Content-Type: application/json

{"productId": "prod_123", "quantity": 2}

Server logic:
1. Hash Idempotency-Key → look up in idempotency store (Redis/DB)
2. If found and result cached → return cached response immediately
3. If found and in-flight → return 409 Conflict or wait
4. If not found → process, store result keyed to hash, return result

TTL: 24–48 hours (per Stripe: 24h)
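The four server-side steps above can be sketched with an in-memory dictionary standing in for Redis or a database. The class name, the `handler` callable, and the `(status, body)` return shape are assumptions for illustration; TTL expiry and locking across processes are left out.

```python
import hashlib

_IN_FLIGHT = object()  # sentinel marking a request that is still being processed


class IdempotencyStore:
    """Single-process sketch; a real store needs atomic set-if-absent and a TTL."""

    def __init__(self):
        self._results = {}

    def execute(self, idempotency_key: str, handler):
        """Run handler() at most once per key and replay its (status, body) result."""
        key_hash = hashlib.sha256(idempotency_key.encode("utf-8")).hexdigest()
        cached = self._results.get(key_hash)
        if cached is _IN_FLIGHT:
            return 409, {"detail": "A request with this Idempotency-Key is in progress."}
        if cached is not None:
            return cached  # replay the stored response without reprocessing
        self._results[key_hash] = _IN_FLIGHT
        try:
            result = handler()
        except Exception:
            # Forget the key so the client can retry after a server-side failure.
            del self._results[key_hash]
            raise
        self._results[key_hash] = result
        return result
```

A retry that carries the same key gets the stored response back, so a network timeout between the client and server cannot create a second order.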

Pagination

Cursor-based (recommended for large/real-time datasets):

// Request: GET /v2/orders?limit=20&after=01HXYZ
{
  "data": [...],
  "pagination": {
    "limit": 20,
    "hasNextPage": true,
    "nextCursor": "01HABC",
    "hasPrevPage": true,
    "prevCursor": "01HWXY"
  }
}
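A sketch of how a server could build the cursor response above from a list already sorted by `id`. The `paginate` function and the use of the last returned `id` as the cursor are assumptions; production APIs usually run the equivalent `WHERE id > :after ORDER BY id LIMIT :limit` query and often encode the cursor opaquely.

```python
import bisect
from typing import Optional


def paginate(items: list, limit: int = 20, after: Optional[str] = None) -> dict:
    """Return one page of items (dicts sorted by ascending "id") after a cursor."""
    ids = [item["id"] for item in items]
    # Start just past the cursor; this stays correct even if that row was deleted.
    start = bisect.bisect_right(ids, after) if after is not None else 0
    page = items[start:start + limit]
    has_next = start + limit < len(items)
    has_prev = start > 0
    return {
        "data": page,
        "pagination": {
            "limit": limit,
            "hasNextPage": has_next,
            "nextCursor": page[-1]["id"] if page and has_next else None,
            "hasPrevPage": has_prev,
            "prevCursor": page[0]["id"] if page and has_prev else None,
        },
    }
```

Because each page is anchored to an `id` rather than a row count, inserts ahead of the cursor do not shift or duplicate results the way they do with offsets.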

Offset-based (simpler, avoid for real-time data — page drift on inserts):

// Request: GET /v2/orders?limit=20&offset=40
{
  "data": [...],
  "pagination": {
    "total": 1847,
    "limit": 20,
    "offset": 40,
    "pages": 93
  }
}

Standardized Error Responses (RFC 9457 / Problem Details)

{
  "type": "https://api.example.com/errors/validation-error",
  "title": "Validation Error",
  "status": 422,
  "detail": "Request body contains invalid fields.",
  "instance": "/v2/orders/01HXYZ",
  "errors": [
    {
      "field": "quantity",
      "message": "Must be a positive integer",
      "code": "INVALID_VALUE"
    },
    {
      "field": "productId",
      "message": "Product not found",
      "code": "RESOURCE_NOT_FOUND"
    }
  ],
  "traceId": "4bf92f3577b34da6a3ce929d0e0e4736"
}

Always include traceId or requestId for support/debugging correlation.

Long-Running Operations (202 Async Pattern)

# 1. Client submits job
POST /v2/reports HTTP/1.1
{"type": "monthly-revenue", "month": "2026-03"}

# 2. Server accepts immediately
HTTP/1.1 202 Accepted
Location: /v2/reports/jobs/job_01HXYZ
Retry-After: 30

# 3. Client polls
GET /v2/reports/jobs/job_01HXYZ

# 4a. Still processing
HTTP/1.1 200 OK
{"status": "processing", "progress": 42, "estimatedCompletion": "2026-04-25T10:15:00Z"}

# 4b. Complete
HTTP/1.1 200 OK
{"status": "complete", "resultUrl": "/v2/reports/rpt_01HABC", "expiresAt": "2026-04-26T10:00:00Z"}

# 5. Retrieve result
GET /v2/reports/rpt_01HABC

Alternative: use webhook callback instead of polling — POST /v2/reports body includes callbackUrl.

Filtering, Sorting, Searching

# Filtering — use query params
GET /v2/orders?status=pending&customerId=cust_123&createdAfter=2026-01-01

# Sorting — field and direction
GET /v2/orders?sort=-createdAt,status   # minus = descending, no prefix = ascending (a raw + in a query string decodes as a space)

# Sparse fieldsets — reduce payload size
GET /v2/orders?fields=id,status,total

# Full-text search
GET /v2/products?q=wireless+headphones&category=electronics

API First Design

Design the API contract before writing implementation code.

Workflow:
1. Write the OpenAPI spec in YAML (use Spectral to lint against rules)
2. Generate a mock server with Prism: prism mock openapi.yaml
3. Share the mock URL with the frontend team so both sides develop in parallel
4. Generate server stubs with oapi-codegen (Go) or openapi-generator (Java/Python/etc.)
5. Write the implementation against the generated interfaces
6. Run contract tests against the live server to verify spec compliance

# Prism mock server (read OpenAPI spec, serve mock responses)
npx @stoplight/prism-cli mock openapi.yaml --port 4010

# Call mock
curl http://localhost:4010/v2/orders/01HXYZ \
  -H "Authorization: Bearer test-token"

# Prism validation proxy (forward to real server, validate request/response against spec)
npx @stoplight/prism-cli proxy openapi.yaml http://localhost:8080

Testing

REST API Testing (curl)

# GET with auth header and pretty JSON
curl -s -X GET https://api.example.com/v2/orders/01HXYZ \
  -H "Authorization: Bearer $TOKEN" \
  -H "Accept: application/json" | jq

# POST with JSON body
curl -s -X POST https://api.example.com/v2/orders \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{"productId": "prod_123", "quantity": 2}' | jq

# Test rate limiting — fire 10 requests rapidly
for i in {1..10}; do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -H "Authorization: Bearer $TOKEN" \
    https://api.example.com/v2/orders
done

# Inspect headers only
curl -sI https://api.example.com/v2/orders

# Follow redirects, show timing
curl -v -w "@curl-format.txt" -L https://api.example.com/v2/orders

gRPC Testing (grpcurl)

# Install
brew install grpcurl

# List services (server reflection must be enabled)
grpcurl -plaintext localhost:50051 list

# Describe a service
grpcurl -plaintext localhost:50051 describe orders.OrderService

# Unary call
grpcurl -plaintext \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"order_id": "01HXYZ"}' \
  localhost:50051 orders.OrderService/GetOrder

# Server streaming call
grpcurl -plaintext \
  -d '{"customer_id": "cust_123"}' \
  localhost:50051 orders.OrderService/WatchOrders

# Call with TLS and a client certificate (flags must come before the address)
grpcurl \
  -cert client.crt -key client.key -cacert ca.crt \
  -d '{"order_id": "01HXYZ"}' \
  api.example.com:443 orders.OrderService/GetOrder

WebSocket Testing (wscat)

# Install
npm install -g wscat

# Connect to WebSocket server
wscat -c wss://api.example.com/ws \
  --header "Authorization: Bearer $TOKEN"

# Send a message (after connecting)
> {"type": "subscribe", "channel": "orders", "customerId": "cust_123"}
< {"type": "subscribed", "channel": "orders"}
< {"type": "order.updated", "orderId": "01HXYZ", "status": "shipped"}

# Connect with subprotocol
wscat -c wss://api.example.com/ws --subprotocol "v2.orders"

GraphQL Testing (curl + jq)

# Introspection query
curl -s -X POST https://api.example.com/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ __schema { types { name } } }"}' | jq '.data.__schema.types[].name'

# Query with variables
curl -s -X POST https://api.example.com/graphql \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "query GetOrder($id: ID!) { order(id: $id) { status total } }",
    "variables": {"id": "01HXYZ"}
  }' | jq

# Mutation
curl -s -X POST https://api.example.com/graphql \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "mutation CancelOrder($id: ID!) { cancelOrder(id: $id) { success } }",
    "variables": {"id": "01HXYZ"}
  }' | jq

Load Testing (k6)

// k6 load test script — orders API
import http from "k6/http";
import { check, sleep } from "k6";
import { Rate } from "k6/metrics";

const errorRate = new Rate("errors");

export const options = {
  stages: [
    { duration: "30s", target: 50 },   // ramp up to 50 VUs
    { duration: "2m", target: 50 },    // hold
    { duration: "30s", target: 200 },  // spike to 200 VUs
    { duration: "1m", target: 200 },   // hold spike
    { duration: "30s", target: 0 },    // ramp down
  ],
  thresholds: {
    http_req_duration: ["p(95)<500"],  // 95th percentile < 500ms
    errors: ["rate<0.01"],             // error rate < 1%
  },
};

export default function () {
  const res = http.get("https://api.example.com/v2/orders", {
    headers: { Authorization: `Bearer ${__ENV.API_TOKEN}` },
  });
  const ok = check(res, {
    "status is 200": (r) => r.status === 200,
    "response time < 500ms": (r) => r.timings.duration < 500,
  });
  errorRate.add(!ok);
  sleep(1);
}

k6 run --env API_TOKEN=$TOKEN load-test.js

Contract Testing (Pact)

Consumer-driven contract tests verify that API providers honour contracts expected by consumers.

# Consumer writes expectations → generates pact file
# Provider verifies pact file against running service

# Publish to Pact Broker
npx pact-broker publish ./pacts \
  --broker-base-url https://your-pact-broker.example.com \
  --consumer-app-version $(git rev-parse HEAD)

# Provider verifies
npx pact-provider-verifier \
  --provider-base-url http://localhost:8080 \
  --pact-broker-base-url https://your-pact-broker.example.com \
  --provider orders-service

Monitoring and Observability

Key Metrics (RED Method)

| Metric | Description | Alert Threshold (example) |
|---|---|---|
| Rate | Requests per second | Traffic drop > 50% vs baseline |
| Errors | 5xx error rate | > 1% over 5 minutes |
| Duration | p50, p95, p99 latency | p99 > 1000ms |

Additional API-specific metrics:
  • 4xx rate (client errors): a spike may indicate a breaking change or a client bug
  • Auth failure rate: a spike indicates a credential attack or misconfiguration
  • Rate limit hit rate (429 responses): sustained growth signals a capacity-planning need
  • Payload size distribution: detects runaway requests

Distributed Tracing (OpenTelemetry)

# Node.js — auto-instrumentation with OTLP export
npm install @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node

# Inject trace context headers
GET /v2/orders HTTP/1.1
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
tracestate: rend=congo

Propagate traceparent across all service boundaries. Every response should include X-Request-ID or X-Trace-ID tied to the trace.
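To make the propagation rule concrete, here is a hand-rolled sketch of parsing an incoming `traceparent` and building the header for the next hop. In practice the OpenTelemetry SDK does this for you; the function names are illustrative, only version `00` is handled, and starting an unsampled trace when the header is missing is an assumption, not a requirement.

```python
import re
import secrets
from typing import Optional

# version 00 - 32 hex trace-id - 16 hex parent-id - 2 hex flags
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header: str) -> Optional[dict]:
    """Return the header's fields, or None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None  # all-zero IDs are invalid per the W3C Trace Context spec
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": (int(flags, 16) & 1) == 1,
    }


def outgoing_traceparent(incoming: Optional[str]) -> str:
    """Keep the trace-id, mint a new span-id, and preserve the sampled flag."""
    parsed = parse_traceparent(incoming) if incoming else None
    span_id = secrets.token_hex(8)
    if parsed is None:
        # No usable context: start a new, unsampled trace (assumed default).
        return f"00-{secrets.token_hex(16)}-{span_id}-00"
    flags = "01" if parsed["sampled"] else "00"
    return f"00-{parsed['trace_id']}-{span_id}-{flags}"
```

The trace-id stays constant across every hop while each service contributes its own span-id, which is what lets a tracing backend stitch the spans into one request tree.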

Structured Logging

{
  "level": "info",
  "timestamp": "2026-04-25T10:00:00.123Z",
  "service": "orders-api",
  "traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
  "spanId": "00f067aa0ba902b7",
  "method": "GET",
  "path": "/v2/orders/01HXYZ",
  "statusCode": 200,
  "durationMs": 47,
  "customerId": "cust_123",
  "region": "us-east-1"
}

Health Endpoints

# Liveness — is the process alive?
GET /health/live
HTTP/1.1 200 OK
{"status": "ok"}

# Readiness — is the service ready to receive traffic?
GET /health/ready
HTTP/1.1 200 OK
{
  "status": "ok",
  "checks": {
    "database": "ok",
    "cache": "ok",
    "dependencyServiceA": "ok"
  }
}

# Degraded state
HTTP/1.1 503 Service Unavailable
{
  "status": "degraded",
  "checks": {
    "database": "ok",
    "cache": "error",
    "dependencyServiceA": "ok"
  }
}

Circuit Breaker Pattern

Prevents cascading failures when a downstream dependency is degraded.

States:
  CLOSED → normal operation, requests pass through
  OPEN   → dependency is failing; requests fail fast with 503
  HALF_OPEN → test probe requests sent; if success → CLOSED, if fail → OPEN

Transition triggers:
  CLOSED → OPEN:     failure rate > 50% over last 10 requests (or time window)
  OPEN → HALF_OPEN:  after cooldown period (e.g. 30 seconds)
  HALF_OPEN → CLOSED: 3 consecutive successes
  HALF_OPEN → OPEN:   1 failure

Libraries: Resilience4j (Java), Polly (.NET), opossum (Node.js), gobreaker (Go).
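The state machine above can be sketched in a few dozen lines. This is an illustrative single-threaded version using the example thresholds from the transition list (10-request window, 50% failure rate, 30-second cooldown, 3 probe successes); the class and exception names are hypothetical, and the libraries listed above are the better choice for production use.

```python
import time
from collections import deque


class CircuitOpenError(Exception):
    """Raised instead of calling the dependency while the circuit is OPEN."""


class CircuitBreaker:
    def __init__(self, window=10, failure_threshold=0.5, cooldown=30.0,
                 successes_to_close=3, clock=time.monotonic):
        self.state = "CLOSED"
        self._results = deque(maxlen=window)  # True = success, False = failure
        self._failure_threshold = failure_threshold
        self._cooldown = cooldown
        self._successes_to_close = successes_to_close
        self._clock = clock
        self._opened_at = 0.0
        self._probe_successes = 0

    def call(self, func):
        if self.state == "OPEN":
            if self._clock() - self._opened_at < self._cooldown:
                raise CircuitOpenError("dependency unavailable; failing fast")
            self.state = "HALF_OPEN"  # cooldown elapsed: let probe requests through
            self._probe_successes = 0
        try:
            result = func()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        if self.state == "HALF_OPEN":
            self._probe_successes += 1
            if self._probe_successes >= self._successes_to_close:
                self.state = "CLOSED"
                self._results.clear()
        else:
            self._results.append(True)

    def _on_failure(self):
        if self.state == "HALF_OPEN":
            self._open()  # a single failed probe reopens the circuit
            return
        self._results.append(False)
        window_full = len(self._results) == self._results.maxlen
        failure_rate = self._results.count(False) / len(self._results)
        if window_full and failure_rate > self._failure_threshold:
            self._open()

    def _open(self):
        self.state = "OPEN"
        self._opened_at = self._clock()
        self._results.clear()
```

A caller wraps each downstream request, for example `breaker.call(lambda: fetch_inventory(sku))` with a hypothetical `fetch_inventory`, and maps `CircuitOpenError` to a 503 response.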


Webhooks as a Product

For APIs that offer webhooks, treat the delivery system as a first-class product.

Delivery Architecture

sequenceDiagram
    participant ES as Event Source
    participant Q as Message Queue
    participant WD as Webhook Dispatcher
    participant C as Customer Server

    ES->>Q: Publish event
    Q->>WD: Consume event
    WD->>C: POST /webhook (signed payload)
    alt Success (2xx)
        C->>WD: 200 OK (within 5s)
        WD->>Q: Ack message
    else Failure / Timeout
        WD->>Q: Nack / retry
        WD->>WD: Exponential backoff\n(5s, 25s, 125s, ...)
        WD->>WD: After 72h: mark dead, alert
    end

Payload Signing (HMAC-SHA256)

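import hashlib
import hmac
import time

def sign_payload(secret: str, payload: bytes) -> str:
    """Return a "t=<unix seconds>,v1=<hex HMAC>" signature header value."""
    timestamp = str(int(time.time()))
    # Sign the raw bytes so payloads that are not valid UTF-8 still work.
    message = timestamp.encode() + b"." + payload
    signature = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    return f"t={timestamp},v1={signature}"

def verify_signature(secret: str, payload: bytes, header: str, tolerance: int = 300) -> bool:
    try:
        parts = dict(part.split("=", 1) for part in header.split(","))
        timestamp = int(parts["t"])
        received = parts["v1"]
    except (KeyError, ValueError):
        return False  # malformed header
    if abs(time.time() - timestamp) > tolerance:
        return False  # outside the tolerance window: possible replay attack
    message = str(timestamp).encode() + b"." + payload
    expected = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking the signature through timing.
    return hmac.compare_digest(expected.encode(), received.encode())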

Reliability Patterns

| Pattern | Implementation |
|---|---|
| Idempotency keys | Include webhookId in payload; consumer deduplicates |
| Immediate 200 | Return 200 before processing; use queue for async work |
| Retry with backoff | 5s → 25s → 125s → 625s; max 72h delivery window |
| Dead letter queue | After max retries, route to DLQ; alert operator |
| Event ordering | Include sequence counter; consumer handles out-of-order |
| CloudEvents format | Standardize payload envelope (specversion, type, source, id) |
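As a worked example of the retry-with-backoff row, the sketch below computes how many retries fit inside the 72-hour delivery window when each delay is five times the previous one. The function name and parameters are illustrative; real dispatchers often cap the maximum delay and add jitter, which this sketch omits.

```python
def delivery_schedule(first_delay: int = 5, factor: int = 5,
                      window_seconds: int = 72 * 3600) -> list:
    """Return retry delays in seconds whose cumulative sum stays within the window."""
    delays = []
    delay = first_delay
    elapsed = 0
    while elapsed + delay <= window_seconds:
        delays.append(delay)
        elapsed += delay
        delay *= factor
    return delays


print(delivery_schedule())
# [5, 25, 125, 625, 3125, 15625, 78125]
```

Seven retries span about 27 hours in total; an eighth delay of 390,625 seconds would overshoot the 72-hour window, so at that point the event goes to the dead letter queue.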

Webhook Management Portal (product features)

  • Endpoint registration with per-event-type subscription
  • Delivery attempt log with request/response bodies (last 30 days)
  • Manual replay of failed deliveries
  • HMAC secret rotation (grace period supporting both old + new key)
  • 200 OK webhook test endpoint for validation

API Caching Strategies

Caching is often the single most impactful API performance optimization. Multiple layers can cache independently.

Caching Layers

flowchart LR
    C[Client] -->|1| BC[Browser Cache\nCache-Control]
    BC -->|2| CDN[CDN Edge\nCloudflare / CloudFront]
    CDN -->|3| GW[API Gateway\nVarnish / nginx]
    GW -->|4| APP[Application\nRedis / Memcached]
    APP -->|5| DB[(Database\nQuery Cache)]

Cache-Control Patterns

# Immutable asset (hashed filename — never changes)
Cache-Control: public, max-age=31536000, immutable

# Frequently changing API resource
Cache-Control: private, max-age=0, must-revalidate
ETag: "a1b2c3"

# Shared resource (CDN-cacheable)
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=600
Vary: Accept-Encoding, Authorization

# No caching (sensitive data)
Cache-Control: no-store

stale-while-revalidate — the CDN/proxy serves the stale cached response immediately while fetching a fresh copy in the background. The client gets a fast response; the cache updates asynchronously. Critical for APIs where slight staleness is acceptable (product catalogs, search results).

stale-if-error — serve stale content if the origin returns a 5xx error. Provides graceful degradation when the backend is down.

Cache Invalidation Patterns

| Pattern | How It Works | Best For |
|---|---|---|
| TTL expiry | Cache expires after max-age seconds | Simple, predictable; acceptable staleness |
| Event-driven purge | Backend publishes event → CDN/cache purge API called | Real-time consistency; more infrastructure |
| Surrogate keys (tags) | Tag cached responses; purge all responses with a tag | Purge all /products/* when inventory changes |
| Conditional revalidation | If-None-Match / If-Modified-Since → 304 or fresh | Bandwidth savings; origin still hit |

# Fastly: purge by surrogate key
curl -X POST https://api.fastly.com/service/SERVICE_ID/purge/product-42 \
  -H "Fastly-Key: $FASTLY_TOKEN"

# CloudFront — invalidation
aws cloudfront create-invalidation \
  --distribution-id E1234 \
  --paths "/v2/products/42" "/v2/products?category=electronics"

GraphQL Caching

GraphQL's POST /graphql endpoint breaks traditional HTTP caching — see architecture#caching for normalized client cache, APQ, and @cacheControl directive approaches.


Retry Patterns

Retries are essential for resilient API consumers, but naive retries cause retry storms that amplify failures.

Exponential Backoff with Jitter

Attempt 1: wait 0ms (immediate)
Attempt 2: wait random(0, 1000ms)          → e.g., 487ms
Attempt 3: wait random(0, 2000ms)          → e.g., 1,342ms
Attempt 4: wait random(0, 4000ms)          → e.g., 2,891ms
Attempt 5: wait random(0, 8000ms)          → e.g., 5,203ms
Give up after attempt 5

Full jitter (recommended by AWS) prevents thundering herd — all retrying clients spread randomly across the backoff window instead of hitting the server at the same instant.

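import random
import time

class RetryableError(Exception):
    """Raised by func for transient failures that are safe to retry."""

def retry_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=30.0):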
    for attempt in range(max_retries):
        try:
            return func()
        except RetryableError:
            if attempt == max_retries - 1:
                raise
            delay = min(base_delay * (2 ** attempt), max_delay)
            jittered = random.uniform(0, delay)  # full jitter
            time.sleep(jittered)

Retry Budgets

Instead of per-request retry limits, set a budget: "retry at most 10% of total requests." This prevents cascading retry storms where every client retries simultaneously during an outage.

If 1000 req/s normally, allow at most 100 retries/s total
When budget exhausted → fail fast instead of retrying

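Envoy supports retry budgets natively through the retry_budget setting in its circuit-breaker thresholds, and Linkerd exposes a retryBudget field on its ServiceProfile. Whether Istio surfaces Envoy's setting depends on the Istio version.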
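A client-side retry budget can be sketched as two sliding logs, one for requests and one for retries. The 10% ratio follows the example above; the class name, the 10-second window, and the injectable `clock` are assumptions made for illustration.

```python
import time
from collections import deque


class RetryBudget:
    """Allows retries only while they stay under ratio * recent request count."""

    def __init__(self, ratio: float = 0.1, window: float = 10.0, clock=time.monotonic):
        self._ratio = ratio
        self._window = window  # seconds of history to consider
        self._clock = clock
        self._requests = deque()
        self._retries = deque()

    def _evict(self, now: float) -> None:
        # Drop timestamps that have fallen out of the window.
        for events in (self._requests, self._retries):
            while events and now - events[0] > self._window:
                events.popleft()

    def record_request(self) -> None:
        """Call once for every original (non-retry) request."""
        self._requests.append(self._clock())

    def try_retry(self) -> bool:
        """Return True and spend budget if a retry is allowed, else fail fast."""
        now = self._clock()
        self._evict(now)
        if len(self._retries) + 1 > self._ratio * len(self._requests):
            return False
        self._retries.append(now)
        return True
```

During an outage every request may want a retry, but only the first 10% are granted; the rest fail fast, so the retry traffic cannot grow beyond a fixed fraction of normal load.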

Which Errors to Retry

| Status Code | Retry? | Reason |
|---|---|---|
| 408 Request Timeout | ✅ | Transient timeout |
| 429 Too Many Requests | ✅ (respect Retry-After) | Rate limited; wait and retry |
| 500 Internal Server Error | ✅ (cautiously) | May be transient; limit retries |
| 502 Bad Gateway | ✅ | Upstream briefly unavailable |
| 503 Service Unavailable | ✅ (respect Retry-After) | Server overloaded; back off |
| 504 Gateway Timeout | ✅ | Upstream timeout |
| 400 Bad Request | ❌ | Client error; retry won't help |
| 401/403 | ❌ | Auth issue; retry with same creds won't help |
| 404 | ❌ | Resource doesn't exist |
| 409 Conflict | ⚠️ | Re-read state, then maybe retry with updated data |
| 422 | ❌ | Validation error; fix input first |
Idempotency Requirement

Only retry non-idempotent operations (POST) if the API supports idempotency keys. Otherwise, retrying a POST may create duplicate resources.


SDK and Client Code Generation

Generating typed client SDKs from API specifications eliminates hand-written HTTP calls and catches breaking changes at compile time.

REST — openapi-generator

# Install
npm install -g @openapitools/openapi-generator-cli

# Generate TypeScript client from OpenAPI spec
openapi-generator-cli generate \
  -i https://api.example.com/v2/openapi.yaml \
  -g typescript-fetch \
  -o ./generated/api-client \
  --additional-properties=supportsES6=true,npmName=@example/api-client

# Generate Go server stubs
openapi-generator-cli generate \
  -i openapi.yaml \
  -g go-server \
  -o ./internal/api

oapi-codegen (Go-specific, lighter weight):

# Generate Go types + Echo server from spec
oapi-codegen -package api -generate types,server openapi.yaml > api/api.gen.go

Generated client usage (TypeScript):

import { OrdersApi, Configuration } from '@example/api-client';

const api = new OrdersApi(new Configuration({
  basePath: 'https://api.example.com/v2',
  accessToken: token,
}));

// Fully typed — input and output types from OpenAPI spec
const order = await api.getOrder({ orderId: '01HXYZ' });
// order is typed as Order, not `any`

GraphQL — graphql-codegen

npm install -D @graphql-codegen/cli @graphql-codegen/typescript \
  @graphql-codegen/typescript-operations @graphql-codegen/typed-document-node

// codegen.ts
import type { CodegenConfig } from '@graphql-codegen/cli';

const config: CodegenConfig = {
  schema: 'https://api.example.com/graphql',
  documents: 'src/**/*.graphql',
  generates: {
    './src/generated/graphql.ts': {
      plugins: [
        'typescript',
        'typescript-operations',
        'typed-document-node',
      ],
    },
  },
};
export default config;

# Generate types from the schema and operation documents
npx graphql-codegen

Result: every .graphql query/mutation file produces a fully typed TypedDocumentNode — input variables and response shape are both type-checked at compile time.
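
How a typed document carries its types can be shown without the generated file. The sketch below uses a simplified stand-in for TypedDocumentNode and a stub transport; `TypedDoc`, `executeTyped`, the `GetOrder` query, and its result shape are all hypothetical names for illustration, not output of graphql-codegen.

```typescript
// Simplified stand-in for TypedDocumentNode: the type parameters record the
// result and variables types next to the query text. This is NOT the real type
// from @graphql-typed-document-node/core, only an illustration of the idea.
interface TypedDoc<TResult, TVariables> {
  readonly query: string;
  readonly __types?: { result: TResult; variables: TVariables }; // never set at runtime
}

// Hypothetical shapes that codegen might emit for a GetOrder query.
interface GetOrderQuery {
  order: { id: string; status: string } | null;
}
interface GetOrderQueryVariables {
  id: string;
}

const GetOrderDocument: TypedDoc<GetOrderQuery, GetOrderQueryVariables> = {
  query: "query GetOrder($id: ID!) { order(id: $id) { id status } }",
};

// Any function that sends a query and returns the parsed data payload.
type QueryTransport = (query: string, variables: unknown) => unknown;

// The compiler infers both type arguments from the document that is passed in.
function executeTyped<TResult, TVariables>(
  doc: TypedDoc<TResult, TVariables>,
  variables: TVariables,
  transport: QueryTransport,
): TResult {
  return transport(doc.query, variables) as TResult;
}

// Stub transport so the sketch runs without a server.
const stubTransport: QueryTransport = (_query, variables) => {
  const vars = variables as GetOrderQueryVariables;
  return { order: { id: vars.id, status: "shipped" } };
};

const orderData = executeTyped(GetOrderDocument, { id: "42" }, stubTransport);
const orderStatus = orderData.order ? orderData.order.status : "missing"; // typed as string

// executeTyped(GetOrderDocument, { orderId: "42" }, stubTransport);
// ^ rejected at compile time: the variables must match GetOrderQueryVariables
```

The cast inside `executeTyped` is the one place where the types are trusted rather than checked, which is why the generated types need to be regenerated whenever the schema or the operation files change.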

gRPC — buf generate

# Install buf CLI
brew install bufbuild/buf/buf

# buf.gen.yaml — code generation config
version: v2
plugins:
  - remote: buf.build/protocolbuffers/go
    out: gen/go
    opt: paths=source_relative
  - remote: buf.build/grpc/go
    out: gen/go
    opt: paths=source_relative
  - remote: buf.build/connectrpc/go
    out: gen/go
    opt: paths=source_relative

# Generate
buf generate proto/

buf advantages over raw protoc:

  • Dependency management through the BSR (Buf Schema Registry)
  • buf lint: enforces the proto style guide
  • buf breaking: detects breaking changes between proto versions in CI
  • buf generate: replaces complex protoc plugin chains


API Documentation Generation

Swagger UI

Interactive documentation from an OpenAPI spec — users can try API calls directly in the browser.

# Docker — serve Swagger UI for your spec
docker run -p 8080:8080 \
  -e SWAGGER_JSON=/spec/openapi.yaml \
  -v $(pwd):/spec \
  swaggerapi/swagger-ui

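# Or embed in Express:
npm install express swagger-ui-express

const express = require('express');
const swaggerUi = require('swagger-ui-express');
const spec = require('./openapi.json'); // require() needs the spec as JSON

const app = express();
app.use('/docs', swaggerUi.serve, swaggerUi.setup(spec));
app.listen(3000); // port is an arbitrary choice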

Redoc

Static, clean, three-panel documentation. Often preferred over Swagger UI for public-facing API docs because it renders to a single static HTML file.

# CLI rendering
npx @redocly/cli build-docs openapi.yaml -o docs/index.html

# Or CDN-hosted single HTML
# <script src="https://cdn.redoc.ly/redoc/latest/bundles/redoc.standalone.js"></script>
# <redoc spec-url="openapi.yaml"></redoc>

Scalar

Modern, customizable API reference with a built-in API client. Growing alternative to Swagger UI.

npm install @scalar/express-api-reference

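// Express integration (app is an existing Express instance)
import { apiReference } from '@scalar/express-api-reference';
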
app.use('/reference', apiReference({
  spec: { url: '/openapi.yaml' },
  theme: 'kepler',
}));

GraphQL Documentation

  • GraphiQL: official in-browser IDE with docs explorer, query autocomplete, variable pane
  • Apollo Studio / Apollo Sandbox: schema explorer, operation history, field usage analytics
  • Postman: supports GraphQL schema import and introspection

# Serve GraphiQL standalone
npx graphiql-explorer --endpoint https://api.example.com/graphql

API Governance and Linting

Spectral (OpenAPI / AsyncAPI Linting)

Spectral enforces API design standards via configurable rules. Run in CI to prevent non-compliant changes.

# Install
npm install -g @stoplight/spectral-cli

# Lint against built-in OpenAPI rules
spectral lint openapi.yaml

# Lint against custom ruleset
spectral lint openapi.yaml --ruleset .spectral.yaml

# .spectral.yaml: custom API governance rules
extends:
  - spectral:oas

rules:
  # Require operationId on every endpoint
  operation-operationId:
    severity: error
    given: "$.paths[*][get,put,post,delete,options,head,patch,trace]"
    then:
      field: operationId
      function: truthy

  # Enforce kebab-case paths
  paths-kebab-case:
    severity: error
    given: "$.paths[*]~"
    then:
      function: pattern
      functionOptions:
        match: "^(/([a-z0-9-]+|{[a-zA-Z0-9_]+}))+$"

  # Require description on all parameters
  parameter-description:
    severity: warn
    given: "$.paths[*][*].parameters[*]"
    then:
      field: description
      function: truthy

  # Ban query string versioning
  no-query-version:
    severity: error
    given: "$.paths[*][*].parameters[?(@.name == 'version' && @.in == 'query')]"
    then:
      function: falsy

  # Require error response schemas
  require-error-responses:
    severity: warn
    given: "$.paths[*][*].responses"
    then:
      - field: "400"
        function: truthy
      - field: "500"
        function: truthy
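
Before enforcing a path-naming rule at error severity, it can help to exercise the regular expression outside Spectral. The sketch below compares two illustrative patterns: a strict one that allows only lowercase letters, digits, hyphens, and braces, and one that also accepts camelCase or underscore names inside {braces}. The constant names and sample paths are assumptions made for the example.

```typescript
// Strict pattern: lowercase letters, digits, hyphens, and braces only.
const STRICT_KEBAB = /^(\/[a-z0-9{}-]+)+$/;

// Parameter-friendly pattern: static segments stay kebab-case, while
// {parameter} names may use camelCase or underscores.
const KEBAB_WITH_PARAMS = /^(\/([a-z0-9-]+|\{[a-zA-Z0-9_]+\}))+$/;

// Return the paths a pattern rejects, which is what a linter would report.
function rejectedPaths(pattern: RegExp, paths: string[]): string[] {
  return paths.filter((path) => !pattern.test(path));
}

const samplePaths = ["/orders/{orderId}", "/order-items/{id}", "/order_items", "/Orders"];

const strictRejects = rejectedPaths(STRICT_KEBAB, samplePaths);
// -> ["/orders/{orderId}", "/order_items", "/Orders"]

const relaxedRejects = rejectedPaths(KEBAB_WITH_PARAMS, samplePaths);
// -> ["/order_items", "/Orders"]
```

The strict pattern rejects /orders/{orderId} because of the uppercase letter in the parameter name, so a spec that uses camelCase path parameters needs either the second pattern or renamed parameters.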

Breaking Change Detection

# openapi-diff — detect breaking changes between spec versions
npx openapi-diff old-spec.yaml new-spec.yaml

# buf breaking — detect protobuf breaking changes
buf breaking proto/ --against .git#branch=main

# optic — track API changes in CI
npx @useoptic/optic diff openapi.yaml --base main --check

CI integration example (GitHub Actions):

- name: Lint API spec
  run: spectral lint openapi.yaml --fail-severity warn

- name: Check for breaking changes
  run: |
    git fetch --depth=1 origin main   # a default CI checkout has no local main branch
    git show FETCH_HEAD:openapi.yaml > /tmp/old-spec.yaml
    # The npm openapi-diff CLI exits non-zero when it finds breaking changes
    npx openapi-diff /tmp/old-spec.yaml openapi.yaml

Proto Linting with buf

# buf.yaml — proto lint configuration
version: v2
lint:
  use:
    - STANDARD         # Buf's recommended baseline lint rules
    - COMMENTS         # Require comments on all public types
  except:
    - PACKAGE_VERSION_SUFFIX

# Run lint
buf lint proto/

# Check for breaking changes against main branch
buf breaking proto/ --against '.git#branch=main'

buf breaking catches: field number reuse, type changes, field removal, service method signature changes — all before merge.
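
The kinds of checks listed above can be illustrated with a toy comparison of two versions of a single message. This is a sketch of the idea only, not how buf is implemented; `ProtoField`, `findBreakingChanges`, and the sample fields are hypothetical.

```typescript
interface ProtoField {
  name: string;
  number: number;
  type: string;
}

// Compare two versions of one message and describe potentially breaking edits.
function findBreakingChanges(before: ProtoField[], after: ProtoField[]): string[] {
  const afterByNumber = new Map<number, ProtoField>();
  for (const field of after) afterByNumber.set(field.number, field);

  const problems: string[] = [];
  for (const old of before) {
    const current = afterByNumber.get(old.number);
    if (current === undefined) {
      problems.push(`field ${old.number} (${old.name}) was removed`);
    } else if (current.type !== old.type) {
      problems.push(`field ${old.number} (${old.name}) changed type: ${old.type} -> ${current.type}`);
    } else if (current.name !== old.name) {
      problems.push(`field ${old.number} was renamed or reused: ${old.name} -> ${current.name}`);
    }
  }
  return problems;
}

const orderV1: ProtoField[] = [
  { name: "id", number: 1, type: "string" },
  { name: "status", number: 2, type: "string" },
  { name: "total", number: 3, type: "int64" },
  { name: "notes", number: 4, type: "string" },
];

const orderV2: ProtoField[] = [
  { name: "id", number: 1, type: "string" },
  { name: "customer_id", number: 2, type: "string" }, // number 2 reused for a new meaning
  { name: "total", number: 3, type: "double" },       // type changed
];                                                    // field 4 removed

const breakingChanges = findBreakingChanges(orderV1, orderV2);
```

Here `breakingChanges` holds three entries, one each for the reused field number, the type change, and the removal. Whether a rename alone counts as breaking depends on which rule category buf breaking is configured to use.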


API Tooling Ecosystem

| Category | Tools |
|---|---|
| API spec editors | Stoplight Studio, Swagger Editor, Redocly |
| Linting | Spectral (OpenAPI/AsyncAPI), buf lint (Protobuf) |
| Mock servers | Prism, WireMock, Microcks |
| Client testing | Postman, Insomnia, Bruno, HTTPie |
| CLI testing | curl, httpie, grpcurl, wscat, mqtt-cli |
| Load testing | k6, Gatling, Locust, Apache JMeter |
| Contract testing | Pact, Dredd, Schemathesis |
| Documentation | Redoc, Swagger UI, Scalar, Mintlify |
| API gateways | Kong, Envoy, AWS API Gateway, Traefik |
| Service mesh | Istio, Linkerd, Consul Connect |
| Code generation | openapi-generator, oapi-codegen, buf generate |
| Monitoring | Datadog APM, Grafana + Prometheus, New Relic |
