Victoria Stack
Summary
The Victoria Stack is a suite of open-source observability databases built by VictoriaMetrics, Inc. Instead of a monolithic all-in-one platform such as Datadog, it ships specialized, loosely coupled binaries, one per signal type. Each component is designed for resource efficiency: the project claims 5–10x less RAM and roughly 50% less disk than competing solutions.
Core Databases
| Component | Signal | Query Language | API Compatibility | License |
|---|---|---|---|---|
| VictoriaMetrics | Metrics | MetricsQL (PromQL superset) | Prometheus, InfluxDB, Graphite, Datadog, NewRelic, OpenTelemetry | Apache 2.0 |
| VictoriaLogs | Logs | LogsQL | Elasticsearch Bulk, Loki Push, Syslog, OTLP, Fluentbit JSON | Apache 2.0 |
| VictoriaTraces | Traces | Jaeger Query API + experimental Tempo API (v0.8+) | OTLP (gRPC + HTTP), Jaeger, Zipkin, Tempo DS (experimental) | Apache 2.0 |
Edge Daemons
| Tool | Purpose | Key Feature |
|---|---|---|
| vmagent | Drop-in Prometheus scraper + metric router | Supports 50+ service discovery mechanisms |
| vmalert | Alerting & recording rule evaluator | Evaluates rules against any backend (metrics + logs) |
| vmauth | Smart HTTP proxy, auth, routing, load balancing | Routes traffic across all 3 databases by URL path |
| vmbackup / vmrestore | Incremental snapshot to S3/GCS/Azure | Point-in-time consistent backups without lock |
| vmoperator | Kubernetes operator (CRDs) | GitOps-native management of the full stack |
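The path-based routing attributed to vmauth above can be pictured with a small sketch. This is not vmauth's implementation or its configuration format; the `ROUTES` table, the hostnames, the ports, and the `pick_backend` helper are all assumptions made up for illustration, and real deployments are configured through vmauth's own YAML file.

```python
from typing import Optional

# Hypothetical routing table: (URL path prefix, backend base URL).
# Hostnames, ports, and prefixes are assumptions for this sketch.
ROUTES = [
    ("/select/logsql/", "http://victorialogs:9428"),
    ("/insert/loki/", "http://victorialogs:9428"),
    ("/select/jaeger/", "http://victoriatraces:10428"),
    ("/api/v1/", "http://victoriametrics:8428"),
]


def pick_backend(path: str) -> Optional[str]:
    """Return the backend whose prefix matches the request path, or None."""
    for prefix, backend in ROUTES:
        if path.startswith(prefix):
            return backend
    return None


print(pick_backend("/api/v1/query"))
print(pick_backend("/select/logsql/query"))
```

The first matching prefix wins, so more specific prefixes would need to be listed before broader ones.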
Repository & Community
| Attribute | Detail |
|---|---|
| Repository | github.com/VictoriaMetrics/VictoriaMetrics |
| Stars | 16.7k+ ⭐ |
| Latest Version | v1.139.0 (stable), v1.122.x (LTS) |
| Language | Go |
| License | Apache 2.0 (core); Enterprise features require paid license |
| Company | VictoriaMetrics, Inc. |
| Founded | ~2018 by Aliaksandr Valialkin |
Evaluation
- Why it's better: The components share a design philosophy of resource efficiency and zero-tuning operability. VictoriaMetrics is reported to use 5–10x less RAM and ~50% less disk space than Prometheus, helped by ZSTD-based compression. Native OTLP and Loki ingestion APIs mean no translation layers are needed. Decoupled read and write paths allow the cluster version to scale linearly as nodes are added.
- When it fits (Applicability):
    - High-ephemerality Kubernetes clusters with massive label churn
    - IoT telemetry streams with millions of active series
    - Environments processing terabytes of logs/traces daily
    - When cloud observability costs (Datadog/NewRelic) have become prohibitive
    - When local Prometheus/Loki instances are frequently OOM-killed due to high cardinality
    - Teams that want a single-binary, zero-dependency deployment
- Pros and Cons:
| Pros | Cons |
|---|---|
| 5–10x less RAM than Prometheus, 50% less disk | Physically fragmented backends (3 separate storage engines) |
| Apache 2.0 license (more permissive than AGPL) | Lacks built-in correlation UI — relies on Grafana |
| Single-binary deployment, near-zero config | VictoriaTraces is newer and less battle-tested than Tempo |
| Drop-in API compatibility (Prometheus, Loki, OTLP) | Enterprise features (downsampling, SSO, retention filters) require paid license |
| MetricsQL fixes PromQL quirks (no extrapolation) | Smaller community than Grafana ecosystem |
| Cluster mode with shared-nothing architecture | No native multi-tenancy in single-node mode |
| vmoperator for GitOps Kubernetes management | Cross-signal correlation requires manual Grafana config |
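The "no extrapolation" point in the table can be made concrete with a deliberately simplified sketch. Neither function below is the real Prometheus or VictoriaMetrics algorithm: counter resets, staleness handling, and Prometheus' extrapolation limits are ignored, and the function names and sample data are invented for illustration.

```python
def increase_extrapolated(samples, start, end):
    """Simplified Prometheus-style increase(): scale the delta between the
    first and last sample inside the window up to the full window length."""
    inside = [(t, v) for t, v in samples if start <= t <= end]
    if len(inside) < 2:
        return None
    (t0, v0), (t1, v1) = inside[0], inside[-1]
    return (v1 - v0) * (end - start) / (t1 - t0)


def increase_no_extrapolation(samples, start, end):
    """Simplified MetricsQL-style increase(): last value inside the window
    minus the last value seen before the window."""
    inside = [v for t, v in samples if start <= t <= end]
    before = [v for t, v in samples if t < start]
    if not inside:
        return None
    baseline = before[-1] if before else inside[0]
    return inside[-1] - baseline


# (timestamp in seconds, counter value); the counter increments once, at t=80.
samples = [(5, 0), (20, 0), (35, 0), (50, 0), (65, 0), (80, 1), (95, 1), (110, 1)]

print(increase_extrapolated(samples, 60, 120))      # fractional result
print(increase_no_extrapolation(samples, 60, 120))  # integer result
```

With these samples the extrapolating variant reports a fractional increase for a counter that only ever moved by one, which is the kind of quirk the table refers to.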
- Common Use Cases:
    - Roblox: Billions of active time series, 100% uptime across multiple quarters
    - Spotify R&D: Replaced internal "Heroic" system for better scale
    - CERN: Real-time monitoring of CMS detector system
    - Grammarly: 10x cost reduction vs previous solution
    - DreamHost: 80% memory reduction, 76M active time series
    - Other: Adidas, Wix, Brandwatch, DSV, Dig Security, Sensedia
- Licensing & Commercial Use:
    - Core databases: Apache 2.0 (no copyleft restrictions)
    - Enterprise features: separate paid license (downsampling, retention filters, SSO in vmauth, advanced alerting)
    - VictoriaMetrics Cloud: managed SaaS; Single-node from $225/mo, Cluster from $1,300/mo
    - No per-host, per-user, or per-GB licensing
- Ecosystem & Data Connections:
    - Ingestion: Prometheus remote_write, InfluxDB line protocol, Graphite, Datadog, NewRelic, OpenTelemetry (OTLP), Elasticsearch Bulk, Loki Push, Syslog, Fluentbit JSON
    - Querying: MetricsQL/PromQL, LogsQL, Jaeger Query API
    - Visualization: Grafana (primary), built-in VMUI dashboard
    - Collection: vmagent, OpenTelemetry Collector, Prometheus, Fluentbit, Logstash, Promtail
    - IaC: vmoperator (K8s CRDs), Helm charts, Terraform provider
    - Backup: vmbackup/vmrestore to S3/GCS/Azure
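As a rough illustration of the querying interfaces listed above, the sketch below only builds request URLs and sends nothing. It assumes single-node instances on localhost with ports 8428 (metrics) and 9428 (logs); the helper names are hypothetical, and the endpoint paths should be checked against the documentation for the version in use.

```python
from urllib.parse import urlencode


def metrics_query_url(base: str, query: str) -> str:
    """Build a Prometheus-compatible instant-query URL for MetricsQL/PromQL."""
    return f"{base}/api/v1/query?{urlencode({'query': query})}"


def logs_query_url(base: str, query: str, limit: int = 10) -> str:
    """Build a LogsQL query URL; `limit` caps the number of returned entries."""
    return f"{base}/select/logsql/query?{urlencode({'query': query, 'limit': limit})}"


# Assumed local single-node addresses, for illustration only.
print(metrics_query_url("http://localhost:8428", "up"))
print(logs_query_url("http://localhost:9428", "error", limit=5))
```

Because the metrics endpoint follows the Prometheus HTTP API shape, existing Prometheus clients and Grafana data sources can usually be pointed at it without code changes.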
- Compatibility & Requirements:
    - Runs on Linux, macOS, Docker, Kubernetes
    - Zero external dependencies: no PostgreSQL, no Redis, no object storage required
    - Single-node: 1 CPU, 256 MB RAM minimum (handles millions of series)
    - Cluster: scales linearly with added vmstorage nodes
    - Storage: local SSDs recommended (not object storage)
- Alternatives:
    - Grafana Mimir: horizontally scalable Prometheus, microservices-based, AGPL
    - Thanos: sidecar pattern for existing Prometheus, object storage
    - Prometheus: the standard, single-node only
    - InfluxDB: dedicated TSDB with its own query languages (InfluxQL, Flux, or SQL, depending on version)
    - Grafana Loki: log aggregation, label-only indexing
    - SigNoz: OpenTelemetry-native, ClickHouse-backed
    - Elasticsearch/OpenSearch: full-text log search, heavier resource footprint
- Migration & Lock-in Risks:
    - Very low lock-in on metrics: MetricsQL is backwards-compatible with PromQL apart from a few intentional differences (such as no extrapolation), and its extensions are optional
    - Low lock-in on logs: accepts the Loki push and Elasticsearch Bulk ingestion APIs; LogsQL is specific to VictoriaLogs, but the data is portable
    - Low lock-in on traces: accepts OTLP natively; data can be re-ingested elsewhere
    - Migration from Prometheus: add a remote_write URL pointing at VictoriaMetrics; zero downtime
    - Migration from Loki: switch the Promtail/Fluentbit destination URL
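The Prometheus migration step above amounts to one extra section in `prometheus.yml`. The sketch below renders that section as text; the `remote_write_fragment` helper and the hostname are assumptions for illustration, and the `/api/v1/write` path applies to a single-node VictoriaMetrics instance (cluster deployments use a different, tenant-scoped path).

```python
def remote_write_fragment(vm_base_url: str) -> str:
    """Render the prometheus.yml section that forwards samples to VictoriaMetrics."""
    # Single-node VictoriaMetrics accepts Prometheus remote write on /api/v1/write.
    url = vm_base_url.rstrip("/") + "/api/v1/write"
    return f"remote_write:\n  - url: {url}\n"


# Hypothetical hostname; 8428 is the default single-node port.
print(remote_write_fragment("http://victoriametrics:8428"))
```

Prometheus keeps its local storage while forwarding, so both systems can run side by side until dashboards and alerts have been switched over.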
- Community Health & Support:
    - 16.7k+ GitHub stars, active development, responsive maintainers
    - Used by major companies (Roblox, Spotify, CERN, Grammarly, Adidas)
    - Enterprise SLAs available
    - Active Slack community, regular blog posts and conference talks
Notes In This Folder
Related Topics
- Grafana — visualization layer used with the Victoria Stack
- LGTM Stack — Grafana's competing full-stack observability solution
- LGTM vs Victoria Stack — canonical comparison note
- Observability Stacks Comparison — 6-way comparison including Coroot, SigNoz, SkyWalking, OpenObserve
- Prometheus — the metrics standard that VictoriaMetrics is wire-compatible with
- OpenTelemetry — the standard telemetry framework accepted natively by all Victoria components
Assets
Store local images, diagrams, and PDFs in the _assets/ subfolder. Prefer Mermaid for inline diagrams.
Next Actions
- ~~Create comparison note: LGTM vs Victoria~~ → completed: LGTM vs Victoria Stack
- Create comparison note: VictoriaLogs vs Loki (standalone deep dive)
- Benchmark VictoriaLogs vs Loki at 100 GB/day log volume
- Research VictoriaMetrics anomaly detection features (vmanomaly)