Web Services & APIs — Operations¶
Practical guide to deploying, documenting, securing, versioning, testing, and monitoring web service APIs.
API Specification Formats¶
Specifications are machine-readable contracts for APIs — enabling codegen, mock servers, linting, and documentation.
OpenAPI 3.1 (REST)¶
The industry standard for describing RESTful HTTP APIs. Version 3.1 aligns with JSON Schema draft 2020-12.
openapi: 3.1.0
info:
  title: Orders API
  version: 2.4.0
  contact:
    email: api@example.com
  license:
    name: Apache 2.0
servers:
  - url: https://api.example.com/v2
    description: Production
  - url: https://sandbox.api.example.com/v2
    description: Sandbox
paths:
  /orders/{orderId}:
    get:
      operationId: getOrder
      summary: Retrieve a single order
      tags: [Orders]
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
            format: uuid
      responses:
        "200":
          description: Order found
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Order"
        "404":
          $ref: "#/components/responses/NotFound"
      security:
        - bearerAuth: []
components:
  schemas:
    Order:
      type: object
      required: [id, status, createdAt]
      properties:
        id:
          type: string
          format: uuid
        status:
          type: string
          enum: [pending, confirmed, shipped, delivered, cancelled]
        createdAt:
          type: string
          format: date-time
    # Minimal problem-details shape so the NotFound $ref below resolves
    ProblemDetail:
      type: object
      properties:
        type:
          type: string
        title:
          type: string
        status:
          type: integer
        detail:
          type: string
  responses:
    NotFound:
      description: Resource not found
      content:
        application/json:
          schema:
            $ref: "#/components/schemas/ProblemDetail"
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
Key OpenAPI 3.1 improvements over 3.0:
- Full JSON Schema 2020-12 alignment (replaces OpenAPI's extended subset)
- webhooks top-level field for describing requests the API provider sends to consumers
- discriminator improvements, const, $schema per-schema
- exclusiveMinimum/exclusiveMaximum now numeric (not boolean)
AsyncAPI 3.0 (Event-Driven APIs)¶
The OpenAPI counterpart for event-driven APIs over WebSocket, MQTT, Kafka, AMQP, and SNS/SQS.
asyncapi: 3.0.0
info:
  title: Order Events API
  version: 1.0.0
channels:
  orderCreated:
    address: orders.created
    messages:
      OrderCreated:
        payload:
          type: object
          properties:
            orderId:
              type: string
            customerId:
              type: string
operations:
  onOrderCreated:
    action: receive
    channel:
      $ref: "#/channels/orderCreated"
Protocol Buffers IDL (gRPC)¶
See architecture#protocol-buffers for the full .proto format. The .proto file IS the API spec for gRPC services.
Tooling comparison:
| Format | Ecosystem | Codegen | Mock Server | Linting |
|---|---|---|---|---|
| OpenAPI 3.1 | REST | Any language | Prism, WireMock | Spectral, Vacuum |
| AsyncAPI 3.0 | Event-driven | Node.js, Java | Microcks | Spectral, AsyncAPI Studio |
| Protobuf | gRPC | Any language | grpc-go test server | buf lint |
| WSDL | SOAP | Java, .NET, Python | SoapUI | SoapUI |
API Gateways¶
An API gateway is the single entry point for all client traffic — handling routing, auth enforcement, rate limiting, observability, and protocol translation.
flowchart LR
C1[Mobile Client] --> GW[API Gateway]
C2[Browser] --> GW
C3[Partner API] --> GW
GW -->|/orders| OS[Orders Service]
GW -->|/users| US[User Service]
GW -->|/products| PS[Product Service]
GW --> Auth[Auth Service]
GW --> RL[Rate Limiter\nRedis]
GW --> Log[Observability\nDatadog / Grafana]
Kong Gateway¶
Open-source gateway built on NGINX + OpenResty (Lua). Enterprise tier adds RBAC, Dev Portal, and Vitals analytics.
# Kong declarative config (deck format)
services:
  - name: orders-service
    url: http://orders-service:8080
    plugins:
      - name: rate-limiting
        config:
          minute: 1000
          policy: redis
          redis_host: redis
      - name: jwt
        config:
          claims_to_verify: [exp]
    routes:
      - name: orders-route
        paths: [/v2/orders]
        strip_path: false
        methods: [GET, POST, PUT, PATCH, DELETE]
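# Kong Admin API: add a plugin to a route.
# correlation-id generates a fresh UUID per request; a $(uuidgen) inside the
# curl command would be expanded once, when the plugin is configured.
curl -X POST http://kong:8001/routes/orders-route/plugins \
  --data name=correlation-id \
  --data config.header_name=X-Request-ID \
  --data config.generator=uuid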
Envoy Proxy¶
High-performance C++ proxy developed at Lyft. Operates as data plane in Istio service mesh. Configured via xDS APIs (dynamic) or static YAML.
# Envoy static config: HTTP rate limit filter
http_filters:
  - name: envoy.filters.http.ratelimit
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
      domain: orders_api
      rate_limit_service:
        grpc_service:
          envoy_grpc:
            cluster_name: rate_limit_service
        transport_api_version: V3
AWS API Gateway¶
Managed gateway for REST, HTTP, and WebSocket APIs. Integrates natively with Lambda, ALB, and VPC Link.
# Create HTTP API (simpler, lower cost than REST API)
aws apigatewayv2 create-api \
--name orders-api \
--protocol-type HTTP \
--target arn:aws:lambda:us-east-1:123456789012:function:orders-handler
# Add JWT authorizer
aws apigatewayv2 create-authorizer \
--api-id abc123 \
--authorizer-type JWT \
--identity-source '$request.header.Authorization' \
--jwt-configuration Audience=orders-api,Issuer=https://auth.example.com \
--name JwtAuthorizer
Gateway comparison:
| Gateway | Deployment | Config Model | Best For |
|---|---|---|---|
| Kong | Self-hosted / Cloud | Declarative YAML / Admin API | Large teams, plugin ecosystem |
| Envoy | Self-hosted (sidecar) | xDS (dynamic) / YAML | Service mesh, Kubernetes |
| AWS API Gateway | Managed | Console / CDK / SAM | AWS-native serverless |
| Nginx | Self-hosted | Static config files (nginx.conf) | Simple reverse proxy |
| Traefik | Self-hosted | Auto-discover (Kubernetes) | Kubernetes ingress |
| Azure API Management | Managed | Portal / ARM / Bicep | Azure-native |
Authentication and Authorization¶
API Keys¶
Simplest scheme. Suitable for server-to-server or developer access where the overhead of OAuth is unnecessary.
Best practices:
- Prefix keys by environment: sk_live_, sk_test_
- Store only the hash (SHA-256) in database — never plaintext
- Rotate on compromise; provide 30-day grace period during planned rotations
- Associate keys with scopes: orders:read, orders:write
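The practices above can be sketched in a few lines of Python. This is a minimal illustration rather than a production design: the function names are hypothetical, and the prefix format simply follows the sk_live_ / sk_test_ convention from the list.

```python
import hashlib
import hmac
import secrets


def hash_api_key(plaintext: str) -> str:
    """SHA-256 hex digest of the key; this is the only value persisted."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def generate_api_key(environment: str = "live") -> tuple:
    """Return (plaintext_key, stored_hash).

    The plaintext is shown to the developer once; only the hash is stored.
    """
    plaintext = f"sk_{environment}_" + secrets.token_urlsafe(32)
    return plaintext, hash_api_key(plaintext)


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare in constant time."""
    return hmac.compare_digest(hash_api_key(presented), stored_hash)
```

A plain SHA-256 is reasonable here only because the key is long and random; a low-entropy secret such as a password would need a slow, salted hash instead.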
JWT (JSON Web Tokens)¶
Stateless bearer tokens. Three base64url-encoded parts: header, payload, signature.
// Payload claims
{
"sub": "user_01HXYZ",
"iss": "https://auth.example.com",
"aud": "orders-api",
"exp": 1745600000,
"iat": 1745596400,
"scope": "orders:read orders:write",
"jti": "01HXYZ-unique-token-id"
}
JWT security checklist:
- Use RS256 (asymmetric) for public key distribution, not HS256 (shared secret)
- Short expiry: 15 minutes for access tokens; refresh tokens via httpOnly cookies
- Validate iss, aud, exp, nbf on every request
- Include jti (JWT ID) for revocation lookup in Redis blocklist
- Never store sensitive data in payload — JWTs are encoded, not encrypted (use JWE for confidentiality)
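As an illustration of the claim checks in this checklist, the sketch below validates an already-decoded payload. It assumes the signature has been verified beforehand by a JWT library; the function name, the leeway default, and the problem strings are hypothetical.

```python
import time
from typing import Optional


def validate_claims(claims: dict, *, issuer: str, audience: str,
                    leeway: int = 30, now: Optional[float] = None) -> list:
    """Return a list of problems; an empty list means the claims are acceptable.

    Assumes the token signature was verified before this function is called.
    """
    now = time.time() if now is None else now
    problems = []
    if claims.get("iss") != issuer:
        problems.append("iss mismatch")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # aud may be a string or a list
    if audience not in audiences:
        problems.append("aud mismatch")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp + leeway:
        problems.append("expired or missing exp")
    nbf = claims.get("nbf")
    if nbf is not None and now < nbf - leeway:
        problems.append("not yet valid (nbf)")
    return problems
```

The small leeway absorbs clock skew between the issuer and the resource server; keep it well below the token lifetime.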
OAuth 2.0 / OAuth 2.1¶
Authorization Code + PKCE (browser and mobile clients):
sequenceDiagram
participant U as User
participant C as Client App
participant AS as Auth Server
participant RS as Resource Server
C->>C: Generate code_verifier, code_challenge = BASE64URL(SHA256(verifier))
C->>AS: GET /authorize?response_type=code&client_id=...&code_challenge=...
AS->>U: Login + Consent screen
U->>AS: Approve
AS->>C: Redirect with ?code=AUTH_CODE
C->>AS: POST /token {code, code_verifier, client_id}
AS->>C: {access_token, refresh_token, expires_in}
C->>RS: GET /orders Authorization: Bearer ACCESS_TOKEN
RS->>C: 200 {orders: [...]}
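The first step of the flow, generating the verifier and its S256 challenge, might look like this in Python. The function names are illustrative; the transformation itself (SHA-256, then unpadded base64url) is the S256 method from RFC 7636.

```python
import base64
import hashlib
import secrets


def challenge_for(verifier: str) -> str:
    """S256 code challenge: base64url(SHA-256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple:
    """Return (code_verifier, code_challenge) for one authorization request."""
    # 64 random bytes encode to 86 URL-safe characters, inside the allowed 43-128 range
    verifier = secrets.token_urlsafe(64)
    return verifier, challenge_for(verifier)
```

The client keeps the verifier in memory (or session storage) and sends only the challenge with the /authorize request, together with code_challenge_method=S256.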
Client Credentials (machine-to-machine):
curl -X POST https://auth.example.com/oauth/token \
-d grant_type=client_credentials \
-d client_id=service-account \
-d client_secret=secret \
-d scope="orders:read inventory:write"
OAuth 2.1 key changes (draft consolidation):
- PKCE mandatory for every client using the authorization code flow, confidential clients included
- Implicit flow removed
- Resource Owner Password Credentials (ROPC) flow removed
- Refresh tokens issued to public clients must be sender-constrained or one-time use (rotation)
mTLS (Mutual TLS)¶
Both client and server present certificates — eliminates shared secrets for service-to-service auth.
# Generate a client key and a certificate signed by your CA
openssl genrsa -out client.key 2048
openssl req -new -key client.key -out client.csr \
  -subj "/CN=orders-service/O=internal"
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out client.crt -days 365
# Call API with client cert
curl --cert client.crt --key client.key \
--cacert ca.crt \
https://internal-api.example.com/v2/orders
In Kubernetes: use SPIFFE/SPIRE for automatic workload identity, or let Istio inject mTLS transparently via sidecar.
API Versioning¶
Versioning Strategies¶
| Strategy | Example | Pros | Cons |
|---|---|---|---|
| URI path | /v2/orders | Most visible, easy routing | Breaks resource identity |
| Query param | /orders?version=2 | Non-breaking URL | Easily forgotten, cache unfriendly |
| Header | API-Version: 2024-01-01 | Clean URLs | Less discoverable |
| Content negotiation | Accept: application/vnd.api+json;version=2 | RFC-compliant | Complex client setup |
URI versioning is the most common choice for public APIs (used by Stripe and Twilio, for example). Stripe also uses calendar-based header versioning (Stripe-Version: 2023-10-16) alongside its URI version for fine-grained migrations, and GitHub's REST API selects versions with a dated X-GitHub-Api-Version header.
Calendar-Based Versioning (Stripe Pattern)¶
Instead of major version bumps, every breaking change is released under a dated version (for example, Stripe-Version: 2023-10-16).
Each API key locks to a version at creation. Customers opt into new versions explicitly.
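Deprecation and Sunset Headers (RFC 9745, RFC 8594)¶
HTTP/1.1 200 OK
Deprecation: @1767225600
Sunset: Fri, 01 Jan 2027 00:00:00 GMT
Link: <https://docs.example.com/migration/v3>; rel="successor-version"
- Deprecation: when the endpoint was (or will be) deprecated, written as @ followed by Unix seconds (RFC 9745); @1767225600 is 2026-01-01T00:00:00Z
- Sunset: when it will stop working, written as an HTTP-date (RFC 8594)
- Link: migration guide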
Non-Breaking vs Breaking Changes¶
Non-breaking (safe to ship):
- Adding optional request fields
- Adding new response fields
- Adding new endpoints
- New enum values (unless clients use exhaustive matching)
Breaking (require new version):
- Removing or renaming fields
- Changing field types
- Changing HTTP method for an operation
- Altering authentication requirements
- Removing enum values
Rate Limiting¶
Rate limiting protects services from abuse, ensures fair usage, and enables monetization tiers.
Algorithms¶
Token Bucket (allow bursting):
capacity = 100 tokens
refill_rate = 10 tokens/second
on request:
    tokens = min(capacity, tokens + refill_rate * seconds_since_last_request)
    if tokens >= cost:
        tokens -= cost
        return ALLOW
    else:
        return 429 Too Many Requests
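AWS API Gateway throttling uses a token bucket. Kong's bundled rate-limiting plugin counts requests in fixed time windows instead; its Enterprise rate-limiting-advanced plugin adds sliding windows.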
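A minimal in-process token bucket, as a sketch of the pseudocode above. The class name and the injectable clock are illustrative choices; a shared limiter would keep this state in Redis or a similar store rather than in memory.

```python
import time


class TokenBucket:
    """Single-process token bucket: refills continuously, allows bursts up to capacity."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock
        self._tokens = capacity         # start full so an initial burst is allowed
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill lazily, based on the time since the previous call
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False  # caller should answer 429 Too Many Requests
```

Usage: create one bucket per rate-limit key, for example `TokenBucket(capacity=100, refill_rate=10)`, and call `allow()` once per request.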
Sliding Window Log (most precise):
Stores timestamp of each request. Counts requests within [now - window, now]. High memory cost at scale.
Sliding Window Counter (approximation, low memory):
Redis-based implementation: two counters (current window, previous window) per key.
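The two-counter approximation can be sketched as follows. This version keeps the counters in memory for one key, which is an assumption made for brevity; the weighting formula is the part that carries over to a Redis-backed implementation.

```python
import time


class SlidingWindowCounter:
    """Approximate sliding window using the current and previous fixed-window counts."""

    def __init__(self, limit: int, window_seconds: float, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._index = None   # index of the fixed window currently being counted
        self._current = 0
        self._previous = 0

    def allow(self) -> bool:
        now = self._clock()
        index = int(now // self.window)
        if self._index is None:
            self._index = index
        elif index != self._index:
            # Carry the count over only if we moved to the directly following window
            self._previous = self._current if index == self._index + 1 else 0
            self._current = 0
            self._index = index
        elapsed_fraction = (now % self.window) / self.window
        # Weight the previous window by the share of it still inside the sliding window
        estimated = self._previous * (1.0 - elapsed_fraction) + self._current
        if estimated < self.limit:
            self._current += 1
            return True
        return False
```

The estimate assumes requests in the previous window were evenly spread, which is why this is an approximation rather than an exact count.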
Fixed Window (simplest, boundary spike risk):
Resets counter at fixed intervals. A burst at 11:59:59 and 12:00:01 yields 2× the allowed rate.
Response Headers¶
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1745600000
On 429:
HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1745600000
Content-Type: application/problem+json
{
"type": "https://api.example.com/errors/rate-limit-exceeded",
"title": "Too Many Requests",
"status": 429,
"detail": "You have exceeded 1000 requests per minute."
}
Rate Limit Keys¶
Choose the right granularity:
| Key | Use Case |
|---|---|
| IP address | Unauthenticated public APIs, DDoS protection |
| API key | Developer tier enforcement |
| User ID | Per-account limits after auth |
| Endpoint | Expensive operations (e.g., /search) |
| Tenant ID | SaaS multi-tenant isolation |
CORS (Cross-Origin Resource Sharing)¶
CORS controls which browser origins may read responses from your API. It is enforced by browsers only and does NOT protect against server-to-server calls.
# Preflight request (browser auto-sends for non-simple requests)
OPTIONS /v2/orders HTTP/1.1
Origin: https://app.example.com
Access-Control-Request-Method: POST
Access-Control-Request-Headers: Authorization, Content-Type
# Server response
HTTP/1.1 204 No Content
Access-Control-Allow-Origin: https://app.example.com
Access-Control-Allow-Methods: GET, POST, PUT, PATCH, DELETE, OPTIONS
Access-Control-Allow-Headers: Authorization, Content-Type, X-Request-ID
Access-Control-Max-Age: 86400
Access-Control-Allow-Credentials: true
Critical rules:
- Never set Access-Control-Allow-Origin: * with Access-Control-Allow-Credentials: true — browsers block it
- Maintain an allowlist of trusted origins; validate dynamically against it, and send Vary: Origin whenever the allowed origin is echoed per request
- Cache preflight with Access-Control-Max-Age to reduce OPTIONS overhead
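A small sketch of the allowlist rule, independent of any web framework. The origins in ALLOWED_ORIGINS and the function name are hypothetical.

```python
# Hypothetical allowlist; in practice this would come from configuration
ALLOWED_ORIGINS = frozenset({
    "https://app.example.com",
    "https://admin.example.com",
})


def cors_headers(origin, allowed=ALLOWED_ORIGINS) -> dict:
    """Build CORS response headers for a request's Origin header value.

    The origin is echoed back only when it is on the allowlist; a wildcard is
    never combined with credentials.
    """
    headers = {"Vary": "Origin"}  # responses differ per origin, so caches must key on it
    if origin in allowed:
        headers["Access-Control-Allow-Origin"] = origin
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers
```

For an origin that is not on the list the function returns no Access-Control-Allow-Origin header at all, so the browser blocks the page from reading the response.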
API Design Best Practices¶
Resource Naming¶
# Good — noun-based, plural, lowercase
GET /v2/orders
POST /v2/orders
GET /v2/orders/{orderId}
PUT /v2/orders/{orderId}
PATCH /v2/orders/{orderId}
DELETE /v2/orders/{orderId}
# Nested resources — use sparingly; max 2 levels deep
GET /v2/orders/{orderId}/items
POST /v2/orders/{orderId}/items
# Actions (verbs) — use only for operations that don't map to CRUD
POST /v2/orders/{orderId}/cancel
POST /v2/orders/{orderId}/refund
POST /v2/payments/{paymentId}/capture
Idempotency Keys¶
Prevent duplicate processing when clients retry on network failure.
POST /v2/orders HTTP/1.1
Idempotency-Key: 01HXYZ-unique-request-id
Content-Type: application/json
{"productId": "prod_123", "quantity": 2}
Server logic:
1. Hash Idempotency-Key → look up in idempotency store (Redis/DB)
2. If found and result cached → return cached response immediately
3. If found and in-flight → return 409 Conflict or wait
4. If not found → process, store result keyed to hash, return result
TTL: 24–48 hours (per Stripe: 24h)
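The server logic above can be sketched with an in-memory store. This is an illustration under simplifying assumptions: a single process, no locking, and a hypothetical handler callable that returns a (status, body) tuple. A real deployment would use Redis or a database with an atomic insert for the in-flight marker.

```python
import hashlib
import time

_IN_FLIGHT = object()  # sentinel: the request has started but not finished


class IdempotencyStore:
    def __init__(self, ttl_seconds: float = 24 * 3600, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key hash -> (expires_at, response or _IN_FLIGHT)

    def execute(self, idempotency_key: str, handler):
        """Run handler() at most once per key within the TTL; replay the stored response after that."""
        digest = hashlib.sha256(idempotency_key.encode("utf-8")).hexdigest()
        now = self._clock()
        entry = self._entries.get(digest)
        if entry is not None and entry[0] > now:
            if entry[1] is _IN_FLIGHT:
                return 409, {"title": "A request with this key is still in progress"}
            return entry[1]  # cached response from the first attempt
        self._entries[digest] = (now + self._ttl, _IN_FLIGHT)
        try:
            response = handler()
        except Exception:
            del self._entries[digest]  # let the client retry after a failure
            raise
        self._entries[digest] = (now + self._ttl, response)
        return response
```

Many implementations also store a hash of the request body and reject a reused key whose body differs; that check is left out here.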
Pagination¶
Cursor-based (recommended for large/real-time datasets):
// Request: GET /v2/orders?limit=20&after=01HXYZ
{
"data": [...],
"pagination": {
"limit": 20,
"hasNextPage": true,
"nextCursor": "01HABC",
"hasPrevPage": true,
"prevCursor": "01HWXY"
}
}
Offset-based (simpler, avoid for real-time data — page drift on inserts):
// Request: GET /v2/orders?limit=20&offset=40
{
"data": [...],
"pagination": {
"total": 1847,
"limit": 20,
"offset": 40,
"pages": 93
}
}
Standardized Error Responses (RFC 9457 / Problem Details)¶
{
"type": "https://api.example.com/errors/validation-error",
"title": "Validation Error",
"status": 422,
"detail": "Request body contains invalid fields.",
"instance": "/v2/orders/01HXYZ",
"errors": [
{
"field": "quantity",
"message": "Must be a positive integer",
"code": "INVALID_VALUE"
},
{
"field": "productId",
"message": "Product not found",
"code": "RESOURCE_NOT_FOUND"
}
],
"traceId": "4bf92f3577b34da6a3ce929d0e0e4736"
}
Always include traceId or requestId for support/debugging correlation.
Long-Running Operations (202 Async Pattern)¶
# 1. Client submits job
POST /v2/reports HTTP/1.1
{"type": "monthly-revenue", "month": "2026-03"}
# 2. Server accepts immediately
HTTP/1.1 202 Accepted
Location: /v2/reports/jobs/job_01HXYZ
Retry-After: 30
# 3. Client polls
GET /v2/reports/jobs/job_01HXYZ
# 4a. Still processing
HTTP/1.1 200 OK
{"status": "processing", "progress": 42, "estimatedCompletion": "2026-04-25T10:15:00Z"}
# 4b. Complete
HTTP/1.1 200 OK
{"status": "complete", "resultUrl": "/v2/reports/rpt_01HABC", "expiresAt": "2026-04-26T10:00:00Z"}
# 5. Retrieve result
GET /v2/reports/rpt_01HABC
Alternative: use webhook callback instead of polling — POST /v2/reports body includes callbackUrl.
Filtering, Sorting, Searching¶
# Filtering — use query params
GET /v2/orders?status=pending&customerId=cust_123&createdAfter=2026-01-01
# Sorting — field and direction
GET /v2/orders?sort=-createdAt,status # minus = desc; no prefix (or %2B) = asc, because a literal "+" in a query string decodes to a space
# Sparse fieldsets — reduce payload size
GET /v2/orders?fields=id,status,total
# Full-text search
GET /v2/products?q=wireless+headphones&category=electronics
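On the server side, the sort parameter needs parsing and validation against an allowlist of sortable fields. A minimal sketch, with a hypothetical function name; it accepts "-field", "+field", and bare "field", and also tolerates the leading space that a decoded "+" turns into.

```python
def parse_sort(value: str, allowed_fields) -> list:
    """Parse 'sort=-createdAt,status' into [(field, direction), ...].

    Raises ValueError for fields that are not explicitly sortable, so clients
    cannot sort on unindexed or internal columns.
    """
    order = []
    for token in value.split(","):
        direction = "desc" if token.startswith("-") else "asc"
        field = token.lstrip("-+ ").strip()
        if field not in allowed_fields:
            raise ValueError(f"cannot sort by {field!r}")
        order.append((field, direction))
    return order
```

Example: `parse_sort("-createdAt,status", {"createdAt", "status"})` yields a descending sort on createdAt followed by an ascending sort on status.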
API First Design¶
Design the API contract before writing implementation code.
Workflow:
1. Write OpenAPI spec in YAML (use Spectral to lint against rules)
2. Generate mock server with Prism: prism mock openapi.yaml
3. Share mock URL with frontend team — both sides develop in parallel
4. Generate server stubs with oapi-codegen (Go), openapi-generator (Java/Python/etc.)
5. Write implementation against generated interfaces
6. Run contract tests against live server to verify spec compliance
# Prism mock server (read OpenAPI spec, serve mock responses)
npx @stoplight/prism-cli mock openapi.yaml --port 4010
# Call mock
curl http://localhost:4010/v2/orders/01HXYZ \
-H "Authorization: Bearer test-token"
# Prism validation proxy (forward to real server, validate request/response against spec)
npx @stoplight/prism-cli proxy openapi.yaml http://localhost:8080
Testing¶
REST API Testing (curl)¶
# GET with auth header and pretty JSON
curl -s -X GET https://api.example.com/v2/orders/01HXYZ \
-H "Authorization: Bearer $TOKEN" \
-H "Accept: application/json" | jq
# POST with JSON body
curl -s -X POST https://api.example.com/v2/orders \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: $(uuidgen)" \
-d '{"productId": "prod_123", "quantity": 2}' | jq
# Test rate limiting — fire 10 requests rapidly
for i in {1..10}; do
curl -s -o /dev/null -w "%{http_code}\n" \
-H "Authorization: Bearer $TOKEN" \
https://api.example.com/v2/orders
done
# Inspect headers only
curl -sI https://api.example.com/v2/orders
# Follow redirects, show timing
curl -v -w "@curl-format.txt" -L https://api.example.com/v2/orders
gRPC Testing (grpcurl)¶
# Install
brew install grpcurl
# List services (server reflection must be enabled)
grpcurl -plaintext localhost:50051 list
# Describe a service
grpcurl -plaintext localhost:50051 describe orders.OrderService
# Unary call
grpcurl -plaintext \
-H "Authorization: Bearer $TOKEN" \
-d '{"order_id": "01HXYZ"}' \
localhost:50051 orders.OrderService/GetOrder
# Server streaming call
grpcurl -plaintext \
-d '{"customer_id": "cust_123"}' \
localhost:50051 orders.OrderService/WatchOrders
# Call with TLS (flags must come before the address and method)
grpcurl \
  -cert client.crt -key client.key -cacert ca.crt \
  -d '{"order_id": "01HXYZ"}' \
  api.example.com:443 orders.OrderService/GetOrder
WebSocket Testing (wscat)¶
# Install
npm install -g wscat
# Connect to WebSocket server
wscat -c wss://api.example.com/ws \
--header "Authorization: Bearer $TOKEN"
# Send a message (after connecting)
> {"type": "subscribe", "channel": "orders", "customerId": "cust_123"}
< {"type": "subscribed", "channel": "orders"}
< {"type": "order.updated", "orderId": "01HXYZ", "status": "shipped"}
# Connect with subprotocol
wscat -c wss://api.example.com/ws --subprotocol "v2.orders"
GraphQL Testing (curl + jq)¶
# Introspection query
curl -s -X POST https://api.example.com/graphql \
-H "Content-Type: application/json" \
-d '{"query": "{ __schema { types { name } } }"}' | jq '.data.__schema.types[].name'
# Query with variables
curl -s -X POST https://api.example.com/graphql \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"query": "query GetOrder($id: ID!) { order(id: $id) { status total } }",
"variables": {"id": "01HXYZ"}
}' | jq
# Mutation
curl -s -X POST https://api.example.com/graphql \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"query": "mutation CancelOrder($id: ID!) { cancelOrder(id: $id) { success } }",
"variables": {"id": "01HXYZ"}
}' | jq
Load Testing (k6)¶
// k6 load test script — orders API
import http from "k6/http";
import { check, sleep } from "k6";
import { Rate } from "k6/metrics";
const errorRate = new Rate("errors");
export const options = {
stages: [
{ duration: "30s", target: 50 }, // ramp up to 50 VUs
{ duration: "2m", target: 50 }, // hold
{ duration: "30s", target: 200 }, // spike to 200 VUs
{ duration: "1m", target: 200 }, // hold spike
{ duration: "30s", target: 0 }, // ramp down
],
thresholds: {
http_req_duration: ["p(95)<500"], // 95th percentile < 500ms
errors: ["rate<0.01"], // error rate < 1%
},
};
export default function () {
const res = http.get("https://api.example.com/v2/orders", {
headers: { Authorization: `Bearer ${__ENV.API_TOKEN}` },
});
const ok = check(res, {
"status is 200": (r) => r.status === 200,
"response time < 500ms": (r) => r.timings.duration < 500,
});
errorRate.add(!ok);
sleep(1);
}
Contract Testing (Pact)¶
Consumer-driven contract tests verify that API providers honour contracts expected by consumers.
# Consumer writes expectations → generates pact file
# Provider verifies pact file against running service
# Publish to Pact Broker
npx pact-broker publish ./pacts \
--broker-base-url https://your-pact-broker.example.com \
--consumer-app-version $(git rev-parse HEAD)
# Provider verifies
npx pact-provider-verifier \
--provider-base-url http://localhost:8080 \
--pact-broker-base-url https://your-pact-broker.example.com \
--provider orders-service
Monitoring and Observability¶
Key Metrics (RED Method)¶
| Metric | Description | Alert Threshold (example) |
|---|---|---|
| Rate | Requests per second | Traffic drop > 50% vs baseline |
| Errors | 5xx error rate | > 1% over 5 minutes |
| Duration | p50, p95, p99 latency | p99 > 1000ms |
Additional API-specific metrics:
- 4xx rate (client errors) — spike may indicate breaking change or client bug
- Auth failure rate — spike indicates credential attack or misconfiguration
- Rate limit hit rate (429 responses): a sustained rise indicates capacity planning needs
- Payload size distribution — detect runaway requests
Distributed Tracing (OpenTelemetry)¶
# Node.js — auto-instrumentation with OTLP export
npm install @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node
# Inject trace context headers
GET /v2/orders HTTP/1.1
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
tracestate: rend=congo
Propagate traceparent across all service boundaries. Every response should include X-Request-ID or X-Trace-ID tied to the trace.
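To tie logs and response headers to a trace, a service first has to read the incoming traceparent value. The sketch below parses the version-00 format shown above; the function name and the returned dictionary keys are illustrative, and real services would normally rely on the OpenTelemetry SDK's propagators instead.

```python
import re

# version-traceid-parentid-flags, all lowercase hex (W3C Trace Context, version 00)
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header: str):
    """Return the parsed fields, or None if the header is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the specification
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }
```

When parsing fails, the usual behavior is to start a new trace rather than reject the request.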
Structured Logging¶
{
"level": "info",
"timestamp": "2026-04-25T10:00:00.123Z",
"service": "orders-api",
"traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
"spanId": "00f067aa0ba902b7",
"method": "GET",
"path": "/v2/orders/01HXYZ",
"statusCode": 200,
"durationMs": 47,
"customerId": "cust_123",
"region": "us-east-1"
}
Health Endpoints¶
# Liveness — is the process alive?
GET /health/live
HTTP/1.1 200 OK
{"status": "ok"}
# Readiness — is the service ready to receive traffic?
GET /health/ready
HTTP/1.1 200 OK
{
"status": "ok",
"checks": {
"database": "ok",
"cache": "ok",
"dependencyServiceA": "ok"
}
}
# Degraded state
HTTP/1.1 503 Service Unavailable
{
"status": "degraded",
"checks": {
"database": "ok",
"cache": "error",
"dependencyServiceA": "ok"
}
}
Circuit Breaker Pattern¶
Prevents cascading failures when a downstream dependency is degraded.
States:
CLOSED → normal operation, requests pass through
OPEN → dependency is failing; requests fail fast with 503
HALF_OPEN → test probe requests sent; if success → CLOSED, if fail → OPEN
Transition triggers:
CLOSED → OPEN: failure rate > 50% over last 10 requests (or time window)
OPEN → HALF_OPEN: after cooldown period (e.g. 30 seconds)
HALF_OPEN → CLOSED: 3 consecutive successes
HALF_OPEN → OPEN: 1 failure
Libraries: Resilience4j (Java), Polly (.NET), opossum (Node.js), gobreaker (Go).
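The state machine above is small enough to sketch directly. This is an illustrative single-threaded version, not a substitute for the libraries listed; the class and parameter names are assumptions, and the defaults mirror the example thresholds (50% failures over 10 calls, 30 s cooldown, 3 successes to close).

```python
import time
from collections import deque


class CircuitOpenError(Exception):
    """Raised instead of calling the dependency while the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=0.5, window_size=10,
                 cooldown_seconds=30.0, successes_to_close=3, clock=time.monotonic):
        self.state = "CLOSED"
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown_seconds
        self.successes_to_close = successes_to_close
        self._clock = clock
        self._results = deque(maxlen=window_size)  # True = success, False = failure
        self._opened_at = 0.0
        self._half_open_successes = 0

    def call(self, func):
        if self.state == "OPEN":
            if self._clock() - self._opened_at < self.cooldown:
                raise CircuitOpenError("dependency unavailable, failing fast")
            self.state = "HALF_OPEN"  # cooldown elapsed: let probe requests through
            self._half_open_successes = 0
        try:
            result = func()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        if self.state == "HALF_OPEN":
            self._half_open_successes += 1
            if self._half_open_successes >= self.successes_to_close:
                self.state = "CLOSED"
                self._results.clear()
        else:
            self._results.append(True)

    def _on_failure(self):
        if self.state == "HALF_OPEN":
            self._trip()  # a single failed probe reopens the circuit
            return
        self._results.append(False)
        window_full = len(self._results) == self._results.maxlen
        failure_rate = self._results.count(False) / len(self._results)
        if window_full and failure_rate > self.failure_threshold:
            self._trip()

    def _trip(self):
        self.state = "OPEN"
        self._opened_at = self._clock()
```

Callers typically translate CircuitOpenError into a 503 response or a fallback value.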
Webhooks as a Product¶
For APIs that offer webhooks, treat the delivery system as a first-class product.
Delivery Architecture¶
sequenceDiagram
participant ES as Event Source
participant Q as Message Queue
participant WD as Webhook Dispatcher
participant C as Customer Server
ES->>Q: Publish event
Q->>WD: Consume event
WD->>C: POST /webhook (signed payload)
alt Success (2xx)
C->>WD: 200 OK (within 5s)
WD->>Q: Ack message
else Failure / Timeout
WD->>Q: Nack / retry
WD->>WD: Exponential backoff\n(5s, 25s, 125s, ...)
WD->>WD: After 72h: mark dead, alert
end
Payload Signing (HMAC-SHA256)¶
import hashlib, hmac, time

def sign_payload(secret: str, payload: bytes) -> str:
    timestamp = str(int(time.time()))
    # Sign "<timestamp>.<body>" so the timestamp cannot be swapped on a captured payload
    message = timestamp.encode() + b"." + payload
    signature = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    return f"t={timestamp},v1={signature}"

def verify_signature(secret: str, payload: bytes, header: str, tolerance: int = 300) -> bool:
    try:
        parts = dict(part.split("=", 1) for part in header.split(","))
        timestamp = int(parts["t"])
        received = parts["v1"]
    except (KeyError, ValueError):
        return False  # malformed signature header
    if abs(time.time() - timestamp) > tolerance:
        return False  # outside the tolerance window: possible replay
    message = str(timestamp).encode() + b"." + payload
    expected = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched
    return hmac.compare_digest(expected, received)
Reliability Patterns¶
| Pattern | Implementation |
|---|---|
| Idempotency keys | Include webhookId in payload; consumer deduplicates |
| Immediate 200 | Return 200 before processing; use queue for async work |
| Retry with backoff | 5s → 25s → 125s → 625s; max 72h delivery window |
| Dead letter queue | After max retries, route to DLQ; alert operator |
| Event ordering | Include sequence counter; consumer handles out-of-order |
| CloudEvents format | Standardize payload envelope (specversion, type, source, id) |
Webhook Management Portal (product features)¶
- Endpoint registration with per-event-type subscription
- Delivery attempt log with request/response bodies (last 30 days)
- Manual replay of failed deliveries
- HMAC secret rotation (grace period supporting both old + new key)
- 200 OK webhook test endpoint for validation
API Caching Strategies¶
Caching is often one of the most impactful API performance optimizations. Multiple layers can cache independently.
Caching Layers¶
flowchart LR
C[Client] -->|1| BC[Browser Cache\nCache-Control]
BC -->|2| CDN[CDN Edge\nCloudflare / CloudFront]
CDN -->|3| GW[API Gateway\nVarnish / nginx]
GW -->|4| APP[Application\nRedis / Memcached]
APP -->|5| DB[(Database\nQuery Cache)]
Cache-Control Patterns¶
# Immutable asset (hashed filename — never changes)
Cache-Control: public, max-age=31536000, immutable
# Frequently changing API resource
Cache-Control: private, max-age=0, must-revalidate
ETag: "a1b2c3"
# Shared resource (CDN-cacheable)
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=600
Vary: Accept-Encoding
# No caching (sensitive data)
Cache-Control: no-store
stale-while-revalidate — the CDN/proxy serves the stale cached response immediately while fetching a fresh copy in the background. The client gets a fast response; the cache updates asynchronously. Critical for APIs where slight staleness is acceptable (product catalogs, search results).
stale-if-error — serve stale content if the origin returns a 5xx error. Provides graceful degradation when the backend is down.
Cache Invalidation Patterns¶
| Pattern | How It Works | Best For |
|---|---|---|
| TTL expiry | Cache expires after max-age seconds | Simple, predictable; acceptable staleness |
| Event-driven purge | Backend publishes event → CDN/cache purge API called | Real-time consistency; more infrastructure |
| Surrogate keys (tags) | Tag cached responses; purge all responses with a tag | Purge all /products/* when inventory changes |
| Conditional revalidation | If-None-Match / If-Modified-Since → 304 or fresh | Bandwidth savings; origin still hit |
# Fastly — purge by surrogate key
curl -X POST https://api.fastly.com/service/SERVICE_ID/purge/product-42 \
-H "Fastly-Key: $FASTLY_TOKEN"
# CloudFront — invalidation
aws cloudfront create-invalidation \
--distribution-id E1234 \
--paths "/v2/products/42" "/v2/products?category=electronics"
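Conditional revalidation from the table above can be sketched without a framework. The helper names are hypothetical, and deriving the ETag from a hash of the body is one common choice among several (a version number or update timestamp works as well).

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Strong ETag derived from the response body (quoted, as HTTP requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_response(body: bytes, if_none_match):
    """Return (status, headers, body), answering 304 when the client copy is current."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "private, max-age=0, must-revalidate"}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored
        normalized = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in candidates or etag in normalized:
            return 304, headers, b""
    return 200, headers, body
```

The origin still renders or loads the resource to compute the tag, so the saving is bandwidth rather than origin work, as the table notes.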
GraphQL Caching¶
GraphQL's POST /graphql endpoint breaks traditional HTTP caching — see architecture#caching for normalized client cache, APQ, and @cacheControl directive approaches.
Retry Patterns¶
Retries are essential for resilient API consumers, but naive retries cause retry storms that amplify failures.
Exponential Backoff with Jitter¶
Attempt 1: wait 0ms (immediate)
Attempt 2: wait random(0, 1000ms) → e.g., 487ms
Attempt 3: wait random(0, 2000ms) → e.g., 1,342ms
Attempt 4: wait random(0, 4000ms) → e.g., 2,891ms
Attempt 5: wait random(0, 8000ms) → e.g., 5,203ms
Give up after attempt 5
Full jitter (recommended by AWS) prevents thundering herd — all retrying clients spread randomly across the backoff window instead of hitting the server at the same instant.
import random, time

class RetryableError(Exception):
    """Raised by func for failures worth retrying (for example 429 or 503)."""

def retry_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=30.0):
    for attempt in range(max_retries):
        try:
            return func()
        except RetryableError:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            delay = min(base_delay * (2 ** attempt), max_delay)
            jittered = random.uniform(0, delay)  # full jitter
            time.sleep(jittered)
Retry Budgets¶
Instead of per-request retry limits, set a budget: "retry at most 10% of total requests." This prevents cascading retry storms where every client retries simultaneously during an outage.
If 1000 req/s normally, allow at most 100 retries/s total
When budget exhausted → fail fast instead of retrying
Istio and Envoy support retry budgets natively via retryBudget configuration.
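On the client side, a retry budget can be approximated with two counters. The sketch below is deliberately simplified: the class name is hypothetical, and the counters never decay, whereas production implementations track requests and retries over a rolling time window.

```python
class RetryBudget:
    """Allow retries only while they stay under a fixed share of all requests."""

    def __init__(self, ratio: float = 0.1):
        self.ratio = ratio   # 0.1 means at most 10% of requests may be retries
        self._requests = 0
        self._retries = 0

    def record_request(self) -> None:
        """Call once for every original (non-retry) request."""
        self._requests += 1

    def try_acquire_retry(self) -> bool:
        """Return True and consume budget if a retry is allowed; otherwise fail fast."""
        if self._retries < self._requests * self.ratio:
            self._retries += 1
            return True
        return False
```

A caller would check `try_acquire_retry()` before each retry attempt and give up immediately when it returns False, combining the budget with the backoff logic rather than replacing it.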
Which Errors to Retry¶
| Status Code | Retry? | Reason |
|---|---|---|
| 408 Request Timeout | ✅ | Transient timeout |
| 429 Too Many Requests | ✅ (respect Retry-After) | Rate limited; wait and retry |
| 500 Internal Server Error | ✅ (cautiously) | May be transient; limit retries |
| 502 Bad Gateway | ✅ | Upstream briefly unavailable |
| 503 Service Unavailable | ✅ (respect Retry-After) | Server overloaded; back off |
| 504 Gateway Timeout | ✅ | Upstream timeout |
| 400 Bad Request | ❌ | Client error; retry won't help |
| 401/403 | ❌ | Auth issue; retry with same creds won't help |
| 404 | ❌ | Resource doesn't exist |
| 409 Conflict | ⚠️ | Re-read state, then maybe retry with updated data |
| 422 | ❌ | Validation error; fix input first |
Idempotency Requirement
Only retry non-idempotent operations (POST) if the API supports idempotency keys. Otherwise, retrying a POST may create duplicate resources.
SDK and Client Code Generation¶
Generating typed client SDKs from API specifications eliminates hand-written HTTP calls and catches breaking changes at compile time.
REST — openapi-generator¶
# Install
npm install -g @openapitools/openapi-generator-cli
# Generate TypeScript client from OpenAPI spec
openapi-generator-cli generate \
-i https://api.example.com/v2/openapi.yaml \
-g typescript-fetch \
-o ./generated/api-client \
--additional-properties=supportsES6=true,npmName=@example/api-client
# Generate Go server stubs
openapi-generator-cli generate \
-i openapi.yaml \
-g go-server \
-o ./internal/api
oapi-codegen (Go-specific, lighter weight):
# Generate Go types + Echo server from spec
oapi-codegen -package api -generate types,server openapi.yaml > api/api.gen.go
Generated client usage (TypeScript):
import { OrdersApi, Configuration } from '@example/api-client';
const api = new OrdersApi(new Configuration({
basePath: 'https://api.example.com/v2',
accessToken: token,
}));
// Fully typed — input and output types from OpenAPI spec
const order = await api.getOrder({ orderId: '01HXYZ' });
// order is typed as Order, not `any`
GraphQL — graphql-codegen¶
npm install -D @graphql-codegen/cli @graphql-codegen/typescript \
@graphql-codegen/typescript-operations @graphql-codegen/typed-document-node
// codegen.ts
import type { CodegenConfig } from '@graphql-codegen/cli';
const config: CodegenConfig = {
schema: 'https://api.example.com/graphql',
documents: 'src/**/*.graphql',
generates: {
'./src/generated/graphql.ts': {
plugins: [
'typescript',
'typescript-operations',
'typed-document-node',
],
},
},
};
export default config;
Result: every .graphql query/mutation file produces a fully typed TypedDocumentNode — input variables and response shape are both type-checked at compile time.
gRPC — buf generate¶
# Install buf CLI
brew install bufbuild/buf/buf
# buf.gen.yaml — code generation config
version: v2
plugins:
  - remote: buf.build/protocolbuffers/go
    out: gen/go
    opt: paths=source_relative
  - remote: buf.build/grpc/go
    out: gen/go
    opt: paths=source_relative
  - remote: buf.build/connectrpc/go
    out: gen/go
    opt: paths=source_relative
# Generate
buf generate proto/
buf advantages over raw protoc:
- Dependency management (BSR — Buf Schema Registry)
- buf lint — enforces proto style guide
- buf breaking — detects breaking changes between proto versions in CI
- buf generate — replaces complex protoc plugin chains
API Documentation Generation¶
Swagger UI¶
Interactive documentation from an OpenAPI spec — users can try API calls directly in the browser.
# Docker — serve Swagger UI for your spec
docker run -p 8080:8080 \
-e SWAGGER_JSON=/spec/openapi.yaml \
-v "$(pwd)":/spec \
swaggerapi/swagger-ui
# Or embed in Express:
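npm install express swagger-ui-express
const express = require('express');
const swaggerUi = require('swagger-ui-express');
// Expects a JSON export of the spec; a YAML spec must be parsed first
const spec = require('./openapi.json');
const app = express();
app.use('/docs', swaggerUi.serve, swaggerUi.setup(spec));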
Redoc¶
Static, clean, three-panel documentation. Often preferred over Swagger UI for public-facing API docs, where readability matters more than an in-browser request console.
# CLI rendering
npx @redocly/cli build-docs openapi.yaml -o docs/index.html
# Or CDN-hosted single HTML
# <script src="https://cdn.redoc.ly/redoc/latest/bundles/redoc.standalone.js"></script>
# <redoc spec-url="openapi.yaml"></redoc>
Scalar¶
Modern, customizable API reference with a built-in API client. Growing alternative to Swagger UI.
npm install @scalar/express-api-reference
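// Express integration
import { apiReference } from '@scalar/express-api-reference';
app.use('/reference', apiReference({
  // Option names vary between releases: check whether your version
  // expects `spec.url` (shown here) or a top-level `url`
  spec: { url: '/openapi.yaml' },
  theme: 'kepler',
}));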
GraphQL Documentation¶
- GraphiQL — official in-browser IDE with docs explorer, query autocomplete, variable pane
- Apollo Studio / Apollo Sandbox — schema explorer, operation history, field usage analytics
- Postman — supports GraphQL schema import and introspection
API Governance and Linting¶
Spectral (OpenAPI / AsyncAPI Linting)¶
Spectral enforces API design standards via configurable rules. Run in CI to prevent non-compliant changes.
# Install
npm install -g @stoplight/spectral-cli
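# Lint using the ruleset found in the working directory (.spectral.yaml).
# Spectral needs a ruleset to run, e.g. one that extends spectral:oas
spectral lint openapi.yaml
# Lint against an explicitly named ruleset file
spectral lint openapi.yaml --ruleset .spectral.yaml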
# .spectral.yaml — custom API governance rules
extends:
  - spectral:oas
rules:
  # Require operationId on every operation
  # (built-in spectral:oas rule, raised to error severity)
  operation-operationId: error
  # Enforce kebab-case path segments; {parameters} in braces may use any case
  paths-kebab-case:
    severity: error
    # The trailing ~ selects the path keys rather than the path item objects
    given: "$.paths[*]~"
    then:
      function: pattern
      functionOptions:
        match: '^(/([a-z0-9-]+|\{[A-Za-z0-9_]+\}))+$'
  # Require description on all operation-level parameters
  parameter-description:
    severity: warn
    given: "$.paths[*][*].parameters[*]"
    then:
      field: description
      function: truthy
  # Ban query string versioning
  no-query-version:
    severity: error
    given: "$.paths[*][*].parameters[?(@.name == 'version' && @.in == 'query')]"
    then:
      function: falsy
  # Require every operation to declare 400 and 500 responses
  require-error-responses:
    severity: warn
    given: "$.paths[*][*].responses"
    then:
      - field: "400"
        function: truthy
      - field: "500"
        function: truthy
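A regex mistake in a path rule can reject every path in a spec, or none, so it can help to try the pattern against a few sample paths before committing the ruleset. The sketch below is a plain TypeScript check, not Spectral itself. The `PATH_PATTERN` regex and the sample paths are assumptions chosen for illustration: kebab-case literal segments, plus brace-delimited parameters in any case.

```typescript
// Assumed pattern: each segment is either a lowercase kebab-case literal
// or a {parameter} whose name may use letters, digits, or underscores.
const PATH_PATTERN = /^(\/([a-z0-9-]+|\{[A-Za-z0-9_]+\}))+$/;

// Returns the paths that do not satisfy the pattern.
function findNonCompliantPaths(paths: string[], pattern: RegExp): string[] {
  return paths.filter((path) => !pattern.test(path));
}

const samplePaths = [
  "/orders",
  "/orders/{orderId}",
  "/order-items/{itemId}/refunds",
  "/orderItems", // camelCase literal segment
  "/order_items", // snake_case literal segment
  "/orders/", // trailing slash leaves an empty segment
];

const violations = findNonCompliantPaths(samplePaths, PATH_PATTERN);
console.log(violations);
// prints [ '/orderItems', '/order_items', '/orders/' ]
```

Note that a character class without uppercase letters also rejects camelCase parameter names such as `{orderId}` unless parameters are matched separately, as they are here.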
Breaking Change Detection¶
# openapi-diff — detect breaking changes between spec versions
npx openapi-diff old-spec.yaml new-spec.yaml
# buf breaking — detect protobuf breaking changes
buf breaking proto/ --against '.git#branch=main'
# optic — track API changes in CI
npx @useoptic/optic diff openapi.yaml --base main --check
CI integration example (GitHub Actions):
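- name: Lint API spec
  run: npx @stoplight/spectral-cli lint openapi.yaml --fail-severity warn
- name: Check for breaking changes
  # Assumes actions/checkout ran with fetch-depth: 0 so origin/main is available
  run: |
    git show origin/main:openapi.yaml > /tmp/old-spec.yaml
    # The npm openapi-diff CLI exits non-zero when it finds breaking changes
    npx openapi-diff /tmp/old-spec.yaml openapi.yaml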
Proto Linting with buf¶
# buf.yaml — proto lint configuration
version: v2
lint:
  use:
    - STANDARD # Buf's recommended baseline rule set
    - COMMENTS # Require comments on messages, enums, fields, services, and RPCs
  except:
    - PACKAGE_VERSION_SUFFIX
# Run lint
buf lint proto/
# Check for breaking changes against main branch
buf breaking proto/ --against '.git#branch=main'
buf breaking catches: field number reuse, type changes, field removal, service method signature changes — all before merge.
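To illustrate the kinds of checks listed above, here is a deliberately simplified sketch that compares two versions of a message's field list. The `Field` type and the `findBreakingChanges` function are hypothetical and cover only three cases. This is not how buf is implemented; its real rule set is far broader and works on compiled descriptors.

```typescript
// Simplified model of a protobuf message field (hypothetical).
interface Field {
  name: string;
  number: number;
  type: string;
}

// Compares fields by field number and reports changes that can break clients.
function findBreakingChanges(oldFields: Field[], newFields: Field[]): string[] {
  const problems: string[] = [];
  const newByNumber: Record<number, Field | undefined> = {};
  for (const field of newFields) {
    newByNumber[field.number] = field;
  }

  for (const oldField of oldFields) {
    const current = newByNumber[oldField.number];
    if (current === undefined) {
      problems.push(`field ${oldField.number} ("${oldField.name}") was removed`);
    } else if (current.type !== oldField.type) {
      problems.push(
        `field ${oldField.number} changed type from ${oldField.type} to ${current.type}`,
      );
    } else if (current.name !== oldField.name) {
      // Same number under a different name: the number was reused or the field renamed.
      problems.push(
        `field ${oldField.number} is now "${current.name}" (was "${oldField.name}")`,
      );
    }
  }
  return problems;
}

const v1: Field[] = [
  { name: "id", number: 1, type: "string" },
  { name: "status", number: 2, type: "string" },
  { name: "total_cents", number: 3, type: "int64" },
];
const v2: Field[] = [
  { name: "id", number: 1, type: "string" },
  { name: "status", number: 2, type: "int32" }, // type change
  // field 3 was dropped
  { name: "note", number: 4, type: "string" }, // additive, not reported
];

const breaking = findBreakingChanges(v1, v2);
console.log(breaking);
```

For this input the function reports the type change on field 2 and the removal of field 3, while the newly added field 4 passes.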
API Tooling Ecosystem¶
| Category | Tools |
|---|---|
| API spec editors | Stoplight Studio, Swagger Editor, Redocly |
| Linting | Spectral (OpenAPI/AsyncAPI), buf lint (Protobuf) |
| Mock servers | Prism, WireMock, Microcks |
| Client testing | Postman, Insomnia, Bruno, HTTPie |
| CLI testing | curl, HTTPie, grpcurl, wscat, mqtt-cli |
| Load testing | k6, Gatling, Locust, Apache JMeter |
| Contract testing | Pact, Dredd, Schemathesis |
| Documentation | Redoc, Swagger UI, Scalar, Mintlify |
| API gateways | Kong, Envoy, AWS API Gateway, Traefik |
| Service mesh | Istio, Linkerd, Consul Connect |
| Code generation | openapi-generator, oapi-codegen, buf generate |
| Monitoring | Datadog APM, Grafana + Prometheus, New Relic |