Victoria Stack

Summary

The Victoria Stack is an interconnected suite of open-source observability databases built by VictoriaMetrics, Inc. Instead of a monolithic all-in-one platform (the Datadog model), it ships specialized, highly optimized, loosely coupled binaries, one per signal type. Each component is designed for resource efficiency, with reported figures of 5–10x less RAM and roughly 50% less disk than competing solutions such as Prometheus; actual savings depend on the workload.

Core Databases

| Component | Signal | Query Language | API Compatibility | License |
|---|---|---|---|---|
| VictoriaMetrics | Metrics | MetricsQL (PromQL superset) | Prometheus, InfluxDB, Graphite, Datadog, NewRelic, OpenTelemetry | Apache 2.0 |
| VictoriaLogs | Logs | LogsQL | Elasticsearch Bulk, Loki Push, Syslog, OTLP, Fluentbit JSON | Apache 2.0 |
| VictoriaTraces | Traces | Jaeger Query API + experimental Tempo API (v0.8+) | OTLP (gRPC + HTTP), Jaeger, Zipkin, Tempo DS (experimental) | Apache 2.0 |
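Because each database exposes its own HTTP API, querying the stack mostly comes down to picking the right base URL and path. The sketch below only builds request URLs and sends nothing. The hostnames are hypothetical, and the ports and paths (8428 with `/api/v1/query`, 9428 with `/select/logsql/query`, 10428 with `/select/jaeger/api/services`) are assumed defaults that should be checked against the documentation for your versions.

```python
from urllib.parse import urlencode

# Hypothetical hostnames; the ports are assumed defaults, not verified here.
BASES = {
    "metrics": "http://victoriametrics:8428",
    "logs": "http://victorialogs:9428",
    "traces": "http://victoriatraces:10428",
}


def metrics_query_url(expr: str) -> str:
    """Prometheus-compatible instant query URL (MetricsQL/PromQL expression)."""
    return f"{BASES['metrics']}/api/v1/query?{urlencode({'query': expr})}"


def logs_query_url(logsql: str, limit: int = 10) -> str:
    """LogsQL query URL (path is an assumption to verify)."""
    params = urlencode({"query": logsql, "limit": limit})
    return f"{BASES['logs']}/select/logsql/query?{params}"


def trace_services_url() -> str:
    """Jaeger Query API 'list services' URL (path prefix is an assumption)."""
    return f"{BASES['traces']}/select/jaeger/api/services"


if __name__ == "__main__":
    print(metrics_query_url("up"))
    print(logs_query_url("error", limit=5))
    print(trace_services_url())
```

Keeping URL construction separate from the HTTP call makes it easy to point the same helpers at vmauth instead of the individual databases.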

Edge Daemons

| Tool | Purpose | Key Feature |
|---|---|---|
| vmagent | Drop-in Prometheus scraper + metric router | Supports 50+ service discovery mechanisms |
| vmalert | Alerting & recording rule evaluator | Evaluates rules against any backend (metrics + logs) |
| vmauth | Smart HTTP proxy, auth, routing, load balancing | Routes traffic across all 3 databases by URL path |
| vmbackup / vmrestore | Incremental snapshot to S3/GCS/Azure | Point-in-time consistent backups without locking |
| vmoperator | Kubernetes operator (CRDs) | GitOps-native management of the full stack |
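The idea of routing traffic across the three databases by URL path can be sketched as a longest-prefix match. This is an illustration of the concept only: it is not vmauth's configuration format or implementation, and the prefixes and backend addresses are hypothetical.

```python
from typing import Optional

# Hypothetical routing table: public path prefix -> backend base URL.
ROUTES = [
    ("/metrics/", "http://victoriametrics:8428"),
    ("/logs/", "http://victorialogs:9428"),
    ("/traces/", "http://victoriatraces:10428"),
]


def route(path: str) -> Optional[str]:
    """Return the backend URL for a request path, or None if nothing matches.

    The longest matching prefix wins, and the prefix is stripped before
    the remainder of the path is appended to the backend base URL.
    """
    matches = [(prefix, backend) for prefix, backend in ROUTES if path.startswith(prefix)]
    if not matches:
        return None
    prefix, backend = max(matches, key=lambda m: len(m[0]))
    return backend + "/" + path[len(prefix):]


if __name__ == "__main__":
    print(route("/metrics/api/v1/query"))
    print(route("/logs/select/logsql/query"))
    print(route("/unknown"))
```

A single entry point like this is what lets clients and Grafana data sources use one hostname while each signal still lands in its own database.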

Repository & Community

| Attribute | Detail |
|---|---|
| Repository | github.com/VictoriaMetrics/VictoriaMetrics |
| Stars | 16.7k+ ⭐ |
| Latest Version | v1.139.0 (stable), v1.122.x (LTS) |
| Language | Go |
| License | Apache 2.0 (core); Enterprise features require paid license |
| Company | VictoriaMetrics, Inc. |
| Founded | ~2018 by Aliaksandr Valialkin |

Evaluation

  • Why it's better: The components share a design philosophy of resource efficiency and zero-tuning operability. The stack is reported to use roughly 5–10x less RAM and ~50% less disk space than Prometheus, helped by ZSTD-based compression. Native OTLP and Loki ingestion APIs mean no translation layers are needed, and decoupled read and write paths let a cluster scale out by adding nodes.
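To make the headline ratios concrete, the arithmetic can be sketched as below. The default factors are simply the figures quoted above (about 10x less RAM, about 50% less disk) treated as assumptions; the baseline numbers in the usage example are invented for illustration and are not measurements.

```python
def estimate_footprint(prom_ram_gb: float, prom_disk_gb: float,
                       ram_factor: float = 10.0, disk_saving: float = 0.5):
    """Scale a Prometheus baseline by the claimed savings.

    ram_factor:  how many times less RAM is assumed (10.0 means 10x less).
    disk_saving: fraction of disk assumed to be saved (0.5 means 50% less).
    Returns (estimated_ram_gb, estimated_disk_gb).
    """
    if ram_factor <= 0:
        raise ValueError("ram_factor must be positive")
    if not 0 <= disk_saving < 1:
        raise ValueError("disk_saving must be in [0, 1)")
    return prom_ram_gb / ram_factor, prom_disk_gb * (1 - disk_saving)


if __name__ == "__main__":
    # Hypothetical baseline: 64 GB RAM and 2000 GB disk on Prometheus.
    optimistic = estimate_footprint(64, 2000, ram_factor=10.0)
    conservative = estimate_footprint(64, 2000, ram_factor=5.0)
    print("optimistic:", optimistic)
    print("conservative:", conservative)
```

Running both ends of the 5–10x range gives a bracket rather than a single number, which is the more honest way to size hardware before benchmarking a real workload.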

  • When it fits (Applicability):

      • High-ephemerality Kubernetes clusters with massive label churn
      • IoT telemetry streams with millions of active series
      • Environments processing terabytes of logs/traces daily
      • Cloud observability costs (Datadog/NewRelic) that have become prohibitive
      • Local Prometheus/Loki instances that are frequently OOM-killed due to high cardinality
      • Teams that want a single-binary, zero-dependency deployment

  • Pros and Cons:

| Pros | Cons |
|---|---|
| 5–10x less RAM than Prometheus, 50% less disk | Physically fragmented backends (3 separate storage engines) |
| Apache 2.0 license (more permissive than AGPL) | Lacks a built-in correlation UI; relies on Grafana |
| Single-binary deployment, near-zero config | VictoriaTraces is newer and less battle-tested than Tempo |
| Drop-in API compatibility (Prometheus, Loki, OTLP) | Enterprise features (downsampling, SSO, retention filters) require paid license |
| MetricsQL fixes PromQL quirks (no extrapolation) | Smaller community than Grafana ecosystem |
| Cluster mode with shared-nothing architecture | No native multi-tenancy in single-node mode |
| vmoperator for GitOps Kubernetes management | Cross-signal correlation requires manual Grafana config |
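The "no extrapolation" point in the table is easier to see with numbers. The sketch below contrasts a plain last-minus-first increase with a simplified model of Prometheus-style boundary extrapolation. It is an assumption-laden illustration, not either engine's actual implementation: it ignores counter resets and other edge cases, and the sample data is invented.

```python
def raw_increase(samples):
    """Last value minus first value inside the window; no extrapolation."""
    return samples[-1][1] - samples[0][1]


def extrapolated_increase(samples, start, end):
    """Simplified Prometheus-style increase over the window [start, end].

    samples: list of (timestamp_seconds, counter_value), at least two points.
    The raw delta is scaled up to cover the gaps between the window edges
    and the first/last sample when those gaps are small enough.
    """
    first_t, last_t = samples[0][0], samples[-1][0]
    sampled = last_t - first_t                 # time actually covered by samples
    avg_step = sampled / (len(samples) - 1)    # average scrape interval
    threshold = avg_step * 1.1
    to_start = first_t - start
    to_end = end - last_t
    covered = sampled
    covered += to_start if to_start < threshold else avg_step / 2
    covered += to_end if to_end < threshold else avg_step / 2
    return raw_increase(samples) * covered / sampled


if __name__ == "__main__":
    # Invented integer counter scraped every 15s within a 60s window.
    samples = [(5, 1), (20, 2), (35, 3), (50, 4)]
    print("raw:", raw_increase(samples))
    print("extrapolated:", extrapolated_increase(samples, start=0, end=60))
```

With this data the raw delta is 3 while the extrapolated value is 4.0, which shows how extrapolation can report increases that an integer counter never actually made. Fractional results on integer counters are the usual symptom of the same effect.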
  • Common Use Cases:

      • Roblox: Billions of active time series, 100% uptime across multiple quarters
      • Spotify R&D: Replaced internal "Heroic" system for better scale
      • CERN: Real-time monitoring of CMS detector system
      • Grammarly: 10x cost reduction vs previous solution
      • DreamHost: 80% memory reduction, 76M active time series
      • Other: Adidas, Wix, Brandwatch, DSV, Dig Security, Sensedia

  • Licensing & Commercial Use:

      • Core databases: Apache 2.0 (no copyleft restrictions)
      • Enterprise features: separate paid license (downsampling, retention filters, SSO in vmauth, advanced alerting)
      • VictoriaMetrics Cloud: managed SaaS; single-node from $225/mo, cluster from $1,300/mo
      • No per-host, per-user, or per-GB licensing

  • Ecosystem & Data Connections:

      • Ingestion: Prometheus remote_write, InfluxDB line protocol, Graphite, Datadog, NewRelic, OpenTelemetry (OTLP), Elasticsearch Bulk, Loki Push, Syslog, Fluentbit JSON
      • Querying: MetricsQL/PromQL, LogsQL, Jaeger Query API
      • Visualization: Grafana (primary), built-in VMUI dashboard
      • Collection: vmagent, OpenTelemetry Collector, Prometheus, Fluentbit, Logstash, Promtail
      • IaC: vmoperator (K8s CRDs), Helm charts, Terraform provider
      • Backup: vmbackup/vmrestore to S3/GCS/Azure
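As one example of the JSON-based log ingestion paths listed above, the sketch below builds a newline-delimited JSON payload and a target URL for VictoriaLogs. The `/insert/jsonline` path, the `_stream_fields` query argument, and the `_msg`/`_time` field names are assumptions to check against the VictoriaLogs documentation; the hostname and log records are hypothetical. Nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def to_jsonlines(records):
    """Serialize records as newline-delimited JSON, one object per line."""
    return "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in records)


def jsonline_url(base: str, stream_fields) -> str:
    """Build the ingestion URL; stream_fields names the fields that identify a log stream."""
    query = urlencode({"_stream_fields": ",".join(stream_fields)})
    return f"{base}/insert/jsonline?{query}"


if __name__ == "__main__":
    # Hypothetical records; _msg and _time are the assumed message/time fields.
    records = [
        {"_msg": "login failed", "_time": "2024-01-01T00:00:00Z", "host": "web-1", "app": "auth"},
        {"_msg": "login ok", "_time": "2024-01-01T00:00:05Z", "host": "web-1", "app": "auth"},
    ]
    print(jsonline_url("http://victorialogs:9428", ["host", "app"]))
    print(to_jsonlines(records), end="")
```

The payload would then be sent as the body of an HTTP POST to the printed URL, either directly or through vmauth.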

  • Compatibility & Requirements:

      • Runs on Linux, macOS, Docker, Kubernetes
      • Zero external dependencies: no PostgreSQL, no Redis, no object storage required
      • Single-node: 1 CPU and 256 MB RAM is the minimum to start; a single node can handle millions of series, but only with proportionally more memory
      • Cluster: scales linearly with added vmstorage nodes
      • Storage: local SSDs recommended (not object storage)

  • Alternatives:

      • Grafana Mimir: horizontally scalable Prometheus, microservices-based, AGPL
      • Thanos: sidecar pattern for existing Prometheus, object storage
      • Prometheus: the standard, single-node only
      • InfluxDB: dedicated TSDB with its own query languages (InfluxQL, Flux, or SQL, depending on version)
      • Grafana Loki: log aggregation, label-only indexing
      • SigNoz: OpenTelemetry-native, ClickHouse-backed
      • Elasticsearch/OpenSearch: full-text log search, heavier resource footprint

  • Migration & Lock-in Risks:

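      • Very low lock-in on metrics: MetricsQL is backwards-compatible with PromQL apart from a few documented behavioral differences (such as rate calculations without extrapolation), and its extensions are optional
      • Low lock-in on logs: accepts the Loki and Elasticsearch Bulk ingestion APIs; LogsQL is specific to VictoriaLogs, but the data is portable
      • Low lock-in on traces: accepts OTLP natively; data can be re-ingested elsewhere
      • Migration from Prometheus: add a remote_write URL pointing at VictoriaMetrics; Prometheus keeps running, so there is no downtime
      • Migration from Loki: switch the Promtail/Fluentbit destination URL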
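During a dual-write migration from Prometheus, it can help to run the same instant query against both systems and compare the answers before cutting over. The sketch below compares two responses in the Prometheus HTTP API instant-vector format. The 1% tolerance and the sample payloads are hypothetical, and small differences are expected for rate-style queries because of the behavioral differences noted above.

```python
import math


def vector_to_map(resp):
    """Map each series' label set to its float value for an instant-vector response."""
    if resp.get("status") != "success":
        raise ValueError("query did not succeed")
    out = {}
    for item in resp["data"]["result"]:
        key = tuple(sorted(item["metric"].items()))
        out[key] = float(item["value"][1])  # value is [timestamp, "string number"]
    return out


def diff_results(old, new, rel_tol=0.01):
    """Return (labels, old_value, new_value) for series that are missing or differ."""
    a, b = vector_to_map(old), vector_to_map(new)
    mismatches = []
    for key in sorted(set(a) | set(b)):
        if key not in a or key not in b:
            mismatches.append((dict(key), a.get(key), b.get(key)))
        elif not math.isclose(a[key], b[key], rel_tol=rel_tol):
            mismatches.append((dict(key), a[key], b[key]))
    return mismatches


if __name__ == "__main__":
    # Hypothetical responses from the old and new backends for the same query.
    prom = {"status": "success", "data": {"resultType": "vector", "result": [
        {"metric": {"job": "api"}, "value": [1700000000, "100"]},
        {"metric": {"job": "db"}, "value": [1700000000, "5"]},
    ]}}
    vm = {"status": "success", "data": {"resultType": "vector", "result": [
        {"metric": {"job": "api"}, "value": [1700000000, "100.5"]},
        {"metric": {"job": "db"}, "value": [1700000000, "7"]},
    ]}}
    print(diff_results(prom, vm))
```

An empty result across a representative set of dashboard and alert queries is a reasonable signal that dashboards and alerts can be repointed.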

  • Community Health & Support:

      • 16.7k+ GitHub stars, active development, responsive maintainers
      • Used by major companies (Roblox, Spotify, CERN, Grammarly, Adidas)
      • Enterprise SLAs available
      • Active Slack community, regular blog posts and conference talks

Notes In This Folder

  • Grafana — visualization layer used with the Victoria Stack
  • LGTM Stack — Grafana's competing full-stack observability solution
  • LGTM vs Victoria Stack — canonical comparison note
  • Observability Stacks Comparison — 6-way comparison including Coroot, SigNoz, SkyWalking, OpenObserve
  • Prometheus — the metrics standard that VictoriaMetrics is wire-compatible with
  • OpenTelemetry — the standard telemetry framework accepted natively by all Victoria components

Assets

Store local images, diagrams, and PDFs in the _assets/ subfolder. Prefer Mermaid for inline diagrams.

Next Actions

  • ~~Create comparison note: LGTM vs Victoria~~ → completed: LGTM vs Victoria Stack
  • Create comparison note: VictoriaLogs vs Loki (standalone deep dive)
  • Benchmark VictoriaLogs vs Loki at 100 GB/day log volume
  • Research VictoriaMetrics anomaly detection features (vmanomaly)