# Architecture

## Grafana Server Components
The Grafana server itself is a stateless web application with the following internal layers:
```mermaid
flowchart TB
    subgraph Frontend["Frontend (TypeScript / React)"]
        direction LR
        DashUI["Dashboard UI"]
        ExploreUI["Explore"]
        AlertUI["Alerting UI"]
        PluginUI["Panel & App Plugins"]
    end
    subgraph Backend["Backend (Go)"]
        direction LR
        API["HTTP API Server"]
        Auth["Auth & RBAC"]
        QEngine["Query Engine"]
        AlertEng["Alert Rule Evaluator"]
        Prov["Provisioning Engine"]
        PluginMgr["Plugin Manager<br/>(gRPC subprocess host)"]
    end
    subgraph State["State Layer"]
        DB["Database<br/>(PostgreSQL / MySQL / SQLite)"]
        Cache["Session Cache<br/>(Redis / Memcached)"]
    end
    subgraph External["External Data Sources"]
        Prom["Prometheus / Mimir"]
        LokiDS["Loki"]
        TempoDS["Tempo"]
        SQL["MySQL / PostgreSQL"]
        ES["Elasticsearch"]
        CW["CloudWatch"]
    end
    Frontend --> API
    API --> Auth
    API --> QEngine
    API --> AlertEng
    API --> Prov
    QEngine --> PluginMgr
    PluginMgr -->|gRPC| External
    Auth --> DB
    AlertEng --> DB
    Prov --> DB
    Auth --> Cache
    style Frontend fill:#ff6600,color:#fff
    style Backend fill:#2a2d3e,color:#fff
    style State fill:#1a1d2e,color:#fff
    style External fill:#0d7377,color:#fff
```
### Key Architectural Properties
| Property | Detail |
|---|---|
| Stateless server | All state lives in the external database and cache |
| Plugin isolation | Backend plugins run as gRPC subprocesses |
| Provisioning | Dashboards, data sources, alerts loaded from YAML/JSON at startup |
| Multi-org | Single Grafana instance, multiple isolated organizations |
| API-first | All UI operations have corresponding REST API endpoints |
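The "API-first" property means anything the UI does can also be done programmatically. A minimal sketch of building a dashboard-creation request against the real `POST /api/dashboards/db` endpoint; the URL, token, and dashboard fields are placeholder values, not a working configuration:

```python
import json
import urllib.request

# Placeholders: point these at a real instance and service-account token.
GRAFANA_URL = "http://localhost:3000"
API_TOKEN = "<service-account-token>"

payload = {
    "dashboard": {
        "uid": None,        # None lets Grafana assign a UID
        "title": "Service A Overview",
        "panels": [],       # panel JSON omitted for brevity
    },
    "overwrite": False,     # fail instead of replacing an existing dashboard
}

request = urllib.request.Request(
    url=f"{GRAFANA_URL}/api/dashboards/db",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_TOKEN}",
    },
    method="POST",
)
# urllib.request.urlopen(request)  # uncomment against a live instance
```

The same request shape is what the provisioning engine effectively replays from YAML/JSON files at startup.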
## LGTM Stack — Full Production Architecture
```mermaid
flowchart TB
    subgraph Apps["Instrumented Applications"]
        App1["Service A<br/>(OTel SDK)"]
        App2["Service B<br/>(OTel SDK)"]
        App3["Service C<br/>(Prometheus client)"]
    end
    subgraph Infra["Infrastructure"]
        K8s["Kubernetes"]
        Nodes["VM / Bare Metal"]
    end
    subgraph Collection["Grafana Alloy (DaemonSet / Sidecar)"]
        Recv["Receivers<br/>OTLP, Prometheus, Syslog"]
        Proc["Processors<br/>Batch, MemoryLimiter, ResourceDetection"]
        Exp["Exporters"]
    end
    subgraph Mimir["Grafana Mimir"]
        MD["Distributor"]
        MI["Ingester"]
        MQ["Querier"]
        MSg["Store-Gateway"]
        MC["Compactor"]
    end
    subgraph Loki["Grafana Loki"]
        LD["Distributor"]
        LI["Ingester"]
        LQ["Querier"]
        LQF["Query Frontend"]
        LC["Compactor"]
    end
    subgraph Tempo["Grafana Tempo"]
        TD["Distributor"]
        TI["Ingester"]
        TQ["Querier"]
        TQF["Query Frontend"]
        TMG["Metrics Generator"]
    end
    subgraph ObjStore["Object Storage (S3 / GCS / Azure)"]
        Blocks["Metric Blocks"]
        Chunks["Log Chunks + Index"]
        Traces["Trace Blocks (Parquet)"]
    end
    subgraph Grafana["Grafana Server (HA)"]
        GF1["Grafana Pod 1"]
        GF2["Grafana Pod 2"]
        GFn["Grafana Pod N"]
    end
    subgraph Supporting
        PG["PostgreSQL<br/>(Grafana metadata DB)"]
        Redis["Redis<br/>(Session cache)"]
        LB["Load Balancer / Ingress"]
    end
    Apps --> Collection
    Infra --> Collection
    Collection -->|remote_write| MD
    Collection -->|push| LD
    Collection -->|OTLP gRPC| TD
    MD --> MI
    MI --> Blocks
    MQ --> MI
    MQ --> MSg
    MSg --> Blocks
    MC --> Blocks
    LD --> LI
    LI --> Chunks
    LQF --> LQ
    LQ --> LI
    LQ --> Chunks
    LC --> Chunks
    TD --> TI
    TI --> Traces
    TQF --> TQ
    TQ --> TI
    TQ --> Traces
    TMG --> MD
    GF1 --> PG
    GF2 --> PG
    GFn --> PG
    GF1 --> Redis
    LB --> GF1
    LB --> GF2
    LB --> GFn
    Grafana -.->|PromQL| MQ
    Grafana -.->|LogQL| LQF
    Grafana -.->|TraceQL| TQF
    style Apps fill:#0d7377,color:#fff
    style Infra fill:#0d7377,color:#fff
    style Collection fill:#ff6600,color:#fff
    style Mimir fill:#7b42bc,color:#fff
    style Loki fill:#2a7de1,color:#fff
    style Tempo fill:#e65100,color:#fff
    style ObjStore fill:#0d1117,color:#fff
    style Grafana fill:#ff6600,color:#fff
    style Supporting fill:#1a1d2e,color:#fff
```
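The Batch stage in the Alloy collection layer can be sketched as a buffer-and-flush loop. This is an illustrative model only; real batch processors also flush on a timer and enforce byte limits:

```python
# Toy batch processor: accumulate telemetry items, hand fixed-size
# batches to a downstream exporter callback.
class BatchProcessor:
    def __init__(self, export, max_batch=512):
        self.export = export          # downstream exporter callback
        self.max_batch = max_batch
        self.buffer = []

    def receive(self, item):
        self.buffer.append(item)
        if len(self.buffer) >= self.max_batch:
            self.flush()

    def flush(self):
        # Emit whatever is buffered (real processors also do this on a timer).
        if self.buffer:
            self.export(list(self.buffer))
            self.buffer.clear()

exported = []
proc = BatchProcessor(exported.append, max_batch=3)
for i in range(7):
    proc.receive(i)
proc.flush()   # drain the remainder
# exported → [[0, 1, 2], [3, 4, 5], [6]]
```

Batching is what turns millions of tiny per-event writes into the bulk `remote_write`/push calls the backends are designed for.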
## Mimir Architecture (Metrics)
```mermaid
flowchart LR
    subgraph Write["Write Path"]
        D["Distributor<br/>(validates, shards, replicates)"]
        I["Ingester<br/>(in-memory TSDB + WAL)"]
    end
    subgraph Read["Read Path"]
        QF["Query Frontend<br/>(splits, caches, queues)"]
        Q["Querier<br/>(executes PromQL)"]
        SG["Store-Gateway<br/>(indexes object storage)"]
    end
    subgraph Background["Background"]
        C["Compactor<br/>(vertical + horizontal compaction)"]
    end
    subgraph Storage["Object Storage"]
        OS["S3 / GCS / Azure<br/>(TSDB Blocks)"]
    end
    Prom["Prometheus / Alloy"] -->|remote_write| D
    D -->|hash ring| I
    I -->|flush every 2h| OS
    QF --> Q
    Q -->|recent data| I
    Q -->|historical data| SG
    SG --> OS
    C --> OS
    style Write fill:#7b42bc,color:#fff
    style Read fill:#2a7de1,color:#fff
    style Background fill:#1a1d2e,color:#fff
    style Storage fill:#0d1117,color:#fff
```
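The distributor's "hash ring" step can be sketched as follows. This is a deliberately simplified model: real Mimir maintains a gossiped token ring shared across components, and replicates each series to multiple ingesters (replication factor 3 by default). All names here are illustrative:

```python
import bisect
import hashlib

REPLICATION_FACTOR = 3

class HashRing:
    def __init__(self, ingesters, tokens_per_ingester=128):
        # Each ingester owns many pseudo-random tokens on a 32-bit ring,
        # which smooths out load when ingesters join or leave.
        self.ring = []  # sorted list of (token, ingester)
        for ing in ingesters:
            for i in range(tokens_per_ingester):
                h = hashlib.sha256(f"{ing}-{i}".encode()).hexdigest()
                self.ring.append((int(h, 16) % 2**32, ing))
        self.ring.sort()

    def shard(self, series_labels):
        """Return the replica set for a series, keyed by its label hash."""
        h = hashlib.sha256(series_labels.encode()).hexdigest()
        key = int(h, 16) % 2**32
        idx = bisect.bisect(self.ring, (key,))
        replicas, seen = [], set()
        # Walk clockwise, collecting the first N distinct ingesters.
        while len(replicas) < REPLICATION_FACTOR:
            _, ing = self.ring[idx % len(self.ring)]
            if ing not in seen:
                seen.add(ing)
                replicas.append(ing)
            idx += 1
        return replicas

ring = HashRing(["ingester-0", "ingester-1", "ingester-2", "ingester-3"])
replicas = ring.shard('{__name__="http_requests_total",job="service-a"}')
```

Because the hash is computed from the series labels, every sample of a given series always lands on the same replica set, which is what lets ingesters build a coherent in-memory TSDB per series.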
### Deployment Modes
| Mode | Description | Use Case |
|---|---|---|
| Monolithic | All components in a single process/pod | Dev, testing, small scale |
| Read-Write | Separate read and write paths | Medium scale |
| Microservices | Each component as independent pods | Production, hyperscale |
## Loki Architecture (Logs)
```mermaid
flowchart LR
    subgraph Write["Write Path"]
        LD["Distributor<br/>(validates, routes by label hash)"]
        LI["Ingester<br/>(compresses into chunks, indexes labels)"]
    end
    subgraph Read["Read Path"]
        LQF["Query Frontend<br/>(splits time ranges, queues)"]
        LQ["Querier<br/>(executes LogQL)"]
        LIG["Index Gateway<br/>(metadata lookups)"]
    end
    subgraph Background["Background"]
        LC["Compactor<br/>(merges index files, retention)"]
    end
    subgraph Storage["Object Storage"]
        LOS["S3 / GCS / Azure<br/>(Chunks + Index)"]
    end
    Alloy["Alloy / Promtail"] -->|push| LD
    LD --> LI
    LI -->|flush| LOS
    LQF --> LQ
    LQ -->|recent| LI
    LQ -->|historical| LIG
    LIG --> LOS
    LC --> LOS
    style Write fill:#2a7de1,color:#fff
    style Read fill:#0d7377,color:#fff
    style Background fill:#1a1d2e,color:#fff
    style Storage fill:#0d1117,color:#fff
```
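The query frontend's "splits time ranges" step can be sketched as follows; the split interval is configurable in real deployments, and 24 hours here is illustrative:

```python
from datetime import datetime, timedelta

def split_by_interval(start, end, interval=timedelta(hours=24)):
    """Split [start, end) into sub-ranges of at most `interval` each.

    Each sub-range becomes an independent sub-query that can be queued
    and fanned out to a different querier, then merged on the way back.
    """
    subranges = []
    cursor = start
    while cursor < end:
        sub_end = min(cursor + interval, end)
        subranges.append((cursor, sub_end))
        cursor = sub_end
    return subranges

parts = split_by_interval(datetime(2024, 1, 1), datetime(2024, 1, 3, 12))
# 2.5 days → three sub-queries: two full days plus a 12h remainder.
```

Splitting is also what makes results cacheable: a repeated query over a sliding window only recomputes the newest sub-range.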
**Key Design Choice:** Loki indexes only labels, never log content. This makes it 10–100x cheaper to operate than full-text-indexing alternatives (e.g., Elasticsearch), but it makes effective label design essential.
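The label-only index can be illustrated with a toy store: label pairs go into an inverted index, while log content stays opaque until query time (chunk compression is omitted, and all names here are illustrative):

```python
from collections import defaultdict

class LabelIndexedStore:
    def __init__(self):
        self.index = defaultdict(list)   # (label, value) -> chunk ids
        self.chunks = {}                 # chunk id -> raw log lines
        self._next_id = 0

    def push(self, labels, lines):
        chunk_id = self._next_id
        self._next_id += 1
        self.chunks[chunk_id] = lines    # content itself is NOT indexed
        for pair in labels.items():
            self.index[pair].append(chunk_id)

    def query(self, labels, needle):
        # Step 1: the label match narrows to a few chunks (cheap, indexed).
        ids = set.intersection(*(set(self.index[p]) for p in labels.items()))
        # Step 2: brute-force scan ("grep") only those chunks.
        return [line for i in sorted(ids)
                for line in self.chunks[i] if needle in line]

store = LabelIndexedStore()
store.push({"app": "checkout", "env": "prod"},
           ["GET /cart 200", "POST /pay 500"])
store.push({"app": "search", "env": "prod"}, ["GET /q 200"])
hits = store.query({"app": "checkout"}, "500")   # → ["POST /pay 500"]
```

The trade-off is visible in `query`: good labels keep step 1 selective so step 2 scans little, while over-broad labels (or high-cardinality ones) push the cost into the scan.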
## Tempo Architecture (Traces)
```mermaid
flowchart LR
    subgraph Write["Write Path"]
        TD["Distributor<br/>(OTLP, Jaeger, Zipkin)"]
        TI["Ingester<br/>(Parquet columns + bloom filters)"]
    end
    subgraph Read["Read Path"]
        TQF["Query Frontend<br/>(splits, shards)"]
        TQ["Querier<br/>(TraceQL engine)"]
    end
    subgraph SideEffects["Side Effects"]
        TMG["Metrics Generator<br/>(RED metrics → Mimir)"]
    end
    subgraph Storage["Object Storage"]
        TOS["S3 / GCS / Azure<br/>(Parquet Trace Blocks)"]
    end
    OTel["Apps (OTel SDK)"] -->|OTLP| TD
    TD --> TI
    TD --> TMG
    TI -->|flush blocks| TOS
    TQF --> TQ
    TQ --> TI
    TQ --> TOS
    TMG -->|remote_write| Mimir["Mimir"]
    style Write fill:#e65100,color:#fff
    style Read fill:#ff6600,color:#fff
    style SideEffects fill:#7b42bc,color:#fff
    style Storage fill:#0d1117,color:#fff
```
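The metrics generator's span-to-RED aggregation can be sketched as follows. This is a toy model; real Tempo emits proper Prometheus series (including latency histograms) over fixed intervals and remote_writes them to Mimir:

```python
from collections import defaultdict

def red_metrics(spans):
    """Aggregate spans into per-service Rate / Errors / Duration counters.

    Each span is a dict with 'service', 'duration_ms', and 'error' keys
    (illustrative shape, not the OTLP span schema).
    """
    out = defaultdict(lambda: {"requests": 0, "errors": 0, "duration_ms": 0.0})
    for span in spans:
        m = out[span["service"]]
        m["requests"] += 1                       # Rate numerator
        m["errors"] += 1 if span["error"] else 0  # Errors
        m["duration_ms"] += span["duration_ms"]   # Duration sum
    return dict(out)

metrics = red_metrics([
    {"service": "checkout", "duration_ms": 120.0, "error": False},
    {"service": "checkout", "duration_ms": 450.0, "error": True},
    {"service": "search", "duration_ms": 35.0, "error": False},
])
# metrics["checkout"] → {"requests": 2, "errors": 1, "duration_ms": 570.0}
```

This side channel is why service-level dashboards can be driven from traces alone: the RED series land in Mimir like any other metric.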
**Key Design Choice:** Tempo has no traditional index; it uses Parquet columnar storage guarded by bloom filters. TraceQL queries load only the columns they need, which keeps large-scale trace search performant.
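The bloom-filter gate can be sketched as follows (sizes and hash counts are illustrative, not Tempo's actual block format):

```python
import hashlib

class BloomFilter:
    """One filter per trace block, built at flush time."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0                # bit array packed into one int

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "definitely absent": skip fetching this block.
        # True means "possibly present": fetch it and check the columns.
        return all(self.bits >> pos & 1 for pos in self._positions(item))

block_filter = BloomFilter()
block_filter.add("trace-abc123")
assert block_filter.might_contain("trace-abc123")  # never a false negative
```

The filter is tiny compared to the block it guards, so a trace-ID lookup can rule out almost all blocks in object storage before issuing a single expensive fetch; the rare false positive only costs one wasted read.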
## Kubernetes Deployment Topology
A typical production Grafana + LGTM deployment on Kubernetes uses these Helm charts:
| Component | Helm Chart | Min Replicas | Scaling |
|---|---|---|---|
| Grafana | `grafana/grafana` | 2+ (HA) | HPA on CPU/memory |
| Mimir | `grafana/mimir-distributed` | 3+ ingesters | Per-component HPA |
| Loki | `grafana/loki` | 3+ ingesters | Per-component HPA |
| Tempo | `grafana/tempo-distributed` | 3+ ingesters | Per-component HPA |
| Alloy | `grafana/alloy` (DaemonSet) | 1 per node | DaemonSet auto-scales |
| PostgreSQL | External managed (RDS/CloudSQL) | HA pair | Managed service |
| Redis | External managed (ElastiCache) | HA pair | Managed service |