LGTM Stack

Summary

LGTM is the Grafana Labs open-source observability stack, named after its four core components: Loki (Logs), Grafana (visualization), Tempo (Traces), and Mimir (Metrics). A fifth pillar, Pyroscope (Profiles), is frequently included, sometimes expanding the acronym to LGTMP or referring to it as "big tent" observability.

The stack is purpose-built so each backend is independently scalable, uses object storage (S3/GCS/Azure Blob) as its primary persistence layer, and speaks OpenTelemetry natively. Grafana sits at the center as the single pane of glass, correlating across all signals.

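| Component | Signal | Query Language | GitHub Stars | Latest Version |
|---|---|---|---|---|
| Grafana Mimir | Metrics | PromQL | ~5k ⭐ | 3.0.5 |
| Grafana Loki | Logs | LogQL | ~27.9k ⭐ | 3.7.1 |
| Grafana Tempo | Traces | TraceQL | ~4k ⭐ | 2.10.1 (3.0 in dev) |
| Grafana Pyroscope | Profiles | FlameQL | ~10k ⭐ | 1.20.2 |
| Grafana | Visualization | n/a | ~73.1k ⭐ | 12.4.2 |
| Grafana Alloy | Collection | n/a (configured in Alloy syntax, formerly River, which is HCL-inspired) | n/a | 1.15.0 |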

Evaluation

Why it's better: One of the few fully open-source stacks that cover all four observability pillars (metrics, logs, traces, profiles) with cross-signal correlation in a single UI. Each backend is optimized for its signal type and uses cheap object storage, which can make the stack substantially cheaper than Datadog at scale (this note's estimate is 3–10x; actual savings depend on data volume, retention, and the cost of operating the stack).
When it fits (Applicability):
Organizations with platform engineering capacity to operate multiple backends
Teams standardizing on OpenTelemetry who want no vendor lock-in
Cloud-native (Kubernetes) environments needing horizontal scalability
Mixed environments with heterogeneous data sources
Budget-conscious organizations needing enterprise-grade observability at open-source cost
Pros and Cons:

| Pros | Cons |
|---|---|
| Each component best-of-breed for its signal type | Operational complexity: 4+ backends to manage |
| Object-storage-first = dramatically reduced cost | Requires solid Kubernetes & DevOps expertise |
| OpenTelemetry-native, no vendor lock-in | Signal correlation requires careful config |
| Massive community, battle-tested at scale | Query languages differ per signal (PromQL, LogQL, TraceQL, FlameQL) |
| Independent horizontal scaling per component | Multi-tenancy requires auth proxy setup |
| All-in-one Docker image for dev (`grafana/otel-lgtm`) | Production setup requires 6+ Helm charts |
| Cross-signal correlation (exemplars, derived fields) | Label cardinality is the #1 operational pitfall |
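
The label-cardinality pitfall listed above comes down to multiplication: every distinct combination of label values becomes its own series (Mimir) or stream (Loki). A minimal sketch of that arithmetic follows. The function name, label names, and counts are hypothetical, chosen only to show what one unbounded label does to the total.

```python
from math import prod


def series_upper_bound(label_value_counts: dict[str, int]) -> int:
    """Upper bound on series/streams for one metric or log source:
    the product of the number of distinct values per label."""
    return prod(label_value_counts.values())


# Hypothetical label sets, for illustration only.
bounded = {"namespace": 20, "service": 50, "status_code": 5}
with_user_id = {**bounded, "user_id": 10_000}

print(series_upper_bound(bounded))       # 5000
print(series_upper_bound(with_user_id))  # 50000000
```

The real count is usually below this bound because not every combination occurs, but the bound shows why identifiers such as user or request IDs are better kept in the log line or span attributes than in labels.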

Common Use Cases:
Full-stack Kubernetes observability — metrics, logs, traces, and profiles from all workloads in one view
Centralized enterprise observability platform — multi-tenant, shared infrastructure for multiple teams (Maersk, DHL, Salesforce pattern)
Cost-effective log aggregation — replacing Elasticsearch with Loki for 10–100x cost reduction
Distributed tracing at scale — Tempo handles 100M+ spans/day on object storage alone
AI/ML pipeline observability — tracking model inference latency, GPU utilization, and training metrics
IoT and industrial telemetry — high-volume metric ingestion via Mimir
Licensing & Commercial Use:
Grafana, Loki, Tempo: AGPL-3.0
Mimir: AGPL-3.0
Pyroscope: AGPL-3.0
Alloy: Apache 2.0
All components are free to self-host. If you modify the source and offer it as SaaS, you must release modifications under AGPL-3.0.
Grafana Cloud provides fully managed LGTM: Free ($0), Pro ($19/mo + usage), Enterprise ($25k+/yr)
Ecosystem & Data Connections:
Ingestion protocols: OTLP (gRPC/HTTP), Prometheus remote_write, Jaeger, Zipkin, Syslog, Fluent Bit
Collection: Grafana Alloy (primary), OpenTelemetry Collector, Prometheus, Promtail (legacy)
Storage: S3, GCS, Azure Blob Storage, MinIO (self-hosted)
IaC: Helm charts, Terraform provider, Jsonnet/Tanka, Ansible
Instrumentation: OpenTelemetry SDKs (Go, Java, Python, Node.js, .NET, Rust), auto-instrumentation agents, eBPF
Compatibility & Requirements:
Runs on Kubernetes (recommended), Docker, or bare metal Linux
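Min dev setup: `docker run -p 3000:3000 -p 4317:4317 -p 4318:4318 grafana/otel-lgtm` (single container with all components; publishes Grafana on port 3000 and OTLP on 4317 for gRPC and 4318 for HTTP)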
Typical production setup: Kubernetes cluster, object storage, PostgreSQL (for Grafana metadata), Redis (for sessions)
Object storage is mandatory for Mimir, Loki, and Tempo in production
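
To try the dev container end to end, a single span can be pushed over OTLP/HTTP using only the standard library. This is a sketch under assumptions, not the recommended instrumentation path (the OpenTelemetry SDKs are). It assumes the container's OTLP/HTTP port is published on `localhost:4318`, and the function names, service name, and scope name are hypothetical.

```python
import json
import secrets
import time
import urllib.request

# Assumption: the dev container publishes OTLP/HTTP on localhost:4318.
OTLP_TRACES_URL = "http://localhost:4318/v1/traces"


def build_span_payload(service: str, span_name: str, duration_ms: int = 50) -> dict:
    """Build a one-span export request in the OTLP JSON encoding."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": service}},
            ]},
            "scopeSpans": [{
                "scope": {"name": "manual-sketch"},  # hypothetical scope name
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 32 hex characters
                    "spanId": secrets.token_hex(8),    # 16 hex characters
                    "name": span_name,
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    # OTLP JSON encodes 64-bit integers as strings.
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }]
    }


def send_span(payload: dict, url: str = OTLP_TRACES_URL) -> int:
    """POST the payload and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

With the container running, `send_span(build_span_payload("demo-service", "hello"))` should return a 2xx status, after which the trace can be looked up in Grafana through the Tempo data source.
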
Alternatives:
Datadog — All-in-one SaaS, highest cost, lowest ops burden
SigNoz — Open-source, OTel-native, ClickHouse-backed, unified single-binary
ELK Stack — Mature for logs, weaker for metrics/traces
New Relic — SaaS, generous free tier, proprietary
Splunk Observability — Enterprise, very expensive
OpenObserve — Open-source, Rust-based, single binary
Migration & Lock-in Risks:
Low lock-in on individual components — each backend uses open storage formats
Moderate lock-in on query languages — PromQL is universal, but LogQL, TraceQL, and FlameQL are Grafana-specific (well-documented, but not portable)
Gradual migration is supported — run old and new stacks in parallel, move one signal at a time
Migration from ELK: KQL/Lucene → LogQL requires query rewriting; Elasticsearch → Loki is a fundamental architecture shift (full-text index → label-only index)
Migration from Prometheus + Jaeger: Mimir accepts remote_write directly; Tempo accepts Jaeger protocol directly — both are near-drop-in replacements
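
The query rewriting needed when moving from ELK can be illustrated with a toy translator. Because Loki indexes only labels, a Lucene `field:value` term maps to a label matcher only when the field exists as a Loki label; everything else becomes a line filter over the log text. The helper below is hypothetical and handles only `field:value` terms and quoted phrases joined by `AND`. Real migrations also involve ranges, wildcards, and boolean logic, which this sketch ignores.

```python
import re

# Matches either field:value or a "quoted phrase".
_TERM = re.compile(r'(\w+):(\w+)|"([^"]+)"')


def lucene_to_logql(query: str, label_fields: set[str]) -> str:
    """Translate a simple AND-only Lucene query into a LogQL log query.

    label_fields: field names assumed to exist as Loki labels.
    """
    selectors, filters = [], []
    for field, value, phrase in _TERM.findall(query):
        if phrase:
            filters.append(f'|= "{phrase}"')
        elif field in label_fields:
            selectors.append(f'{field}="{value}"')
        else:
            # Not an indexed label in Loki: fall back to a line filter.
            filters.append(f'|= "{value}"')
    if not selectors:
        # LogQL requires a stream selector with at least one matcher.
        raise ValueError("query needs at least one field that is a Loki label")
    return " ".join(["{" + ", ".join(selectors) + "}", *filters])


print(lucene_to_logql(
    'app:checkout AND level:error AND "connection timeout"',
    label_fields={"app", "level"},
))
# {app="checkout", level="error"} |= "connection timeout"
```

The `ValueError` branch reflects the architectural shift: a query with no label to select on cannot be expressed as a pure text search in Loki the way it can in Elasticsearch.
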
Community Health & Support:
Combined GitHub stars across components: 120k+ (Grafana 73k, Loki 28k, Mimir 5k, Tempo 4k, Pyroscope 10k)
Battle-tested at: Maersk, DHL Express, Dutch Tax Office, Salesforce, and thousands of organizations
Enterprise SLAs via Grafana Labs
Active community forums, Slack, regular GrafanaCON conferences

Notes In This Folder

Grafana — the visualization layer and hub of the LGTM stack
Victoria Stack — competing full-stack (VictoriaMetrics + VictoriaLogs + VictoriaTraces), Apache 2.0, lower resource footprint
LGTM vs Victoria Stack — canonical comparison note
OpenTelemetry — the industry-standard telemetry collection framework used to feed the LGTM stack
Observability Stacks Comparison — 6-way comparison including Coroot, SigNoz, SkyWalking, OpenObserve

Assets

Store local images, diagrams, and PDFs in the _assets/ subfolder. Prefer Mermaid for inline diagrams.

Next Actions

Deep dive into Grafana Adaptive Metrics and Adaptive Logs (cost optimization features)
~~Research LGTM vs SigNoz comparison note~~ → covered in Observability Stacks Comparison
Benchmark object storage costs across S3, GCS, and Azure Blob for LGTM workloads