Architecture¶
Deployment Modes¶
Single-Node vs Cluster¶
| Feature | Single-Node | Cluster |
|---|---|---|
| Scalability | Vertical only | Horizontal & Vertical |
| Operational Complexity | Very Low (1 binary) | Moderate (3 component types) |
| Multi-tenancy | No | Yes (via account IDs) |
| Replication | No (relies on durable disk) | Yes (-replicationFactor=N) |
| Target Workload | Up to ~1M samples/sec | Billions of series, 100M+ samples/sec |
| External Dependencies | None | None |
Recommendation: Start with single-node. Only move to cluster when you need multi-tenancy, horizontal scaling beyond a single machine, or application-level replication.
Component Roles¶
In cluster mode, each signal type follows the same three-component pattern:
| Component Role | Metrics | Logs | Traces |
|---|---|---|---|
| Ingestion | vminsert | vlinsert | vtinsert |
| Querying | vmselect | vlselect | vtselect |
| Storage | vmstorage | vlstorage | vtstorage |
The insert and select components are stateless, while the storage components are stateful. Each component type can be scaled independently of the others.
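The naming pattern in the table is regular enough to express in a few lines. The following is a minimal sketch, not part of any VictoriaMetrics tooling; the helper names `component_name` and `is_stateful` are hypothetical and exist only for illustration.

```python
# Hypothetical helpers that encode the component-role table above.
SIGNAL_PREFIX = {"metrics": "vm", "logs": "vl", "traces": "vt"}
ROLE_SUFFIX = {"ingestion": "insert", "querying": "select", "storage": "storage"}

# Only the storage role keeps data on disk; insert and select are stateless.
STATEFUL_ROLES = {"storage"}


def component_name(signal: str, role: str) -> str:
    """Return the cluster component name for a signal type and role."""
    return SIGNAL_PREFIX[signal] + ROLE_SUFFIX[role]


def is_stateful(role: str) -> bool:
    """Return True if the role needs persistent storage."""
    return role in STATEFUL_ROLES
```

For example, `component_name("traces", "querying")` yields `vtselect`, and `is_stateful("storage")` is the only role check that returns `True`.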
Full Stack Architecture¶
```mermaid
flowchart TB
    subgraph Sources["Data Sources"]
        K8s["Kubernetes<br/>Pods & Services"]
        Apps["Applications<br/>(OTel SDK)"]
        Infra["Infrastructure<br/>(node_exporter, etc.)"]
        Logs["Log Sources<br/>(Fluentbit, Logstash)"]
    end
    subgraph Collection["Collection Layer"]
        Agent["vmagent<br/>(DaemonSet)<br/>Scrape + Push"]
        OTel["OTel Collector<br/>(optional)"]
    end
    subgraph Proxy["Routing Layer"]
        Auth["vmauth<br/>Auth · Route · LB"]
    end
    subgraph MetricsCluster["VictoriaMetrics (Metrics)"]
        MI["vminsert ×2"]
        MS["vmstorage ×3<br/>(StatefulSet, SSD)"]
        MSel["vmselect ×2"]
        MI --> MS
        MSel --> MS
    end
    subgraph LogsCluster["VictoriaLogs (Logs)"]
        LI["vlinsert ×2"]
        LS["vlstorage ×3<br/>(StatefulSet, SSD)"]
        LSel["vlselect ×2"]
        LI --> LS
        LSel --> LS
    end
    subgraph TracesCluster["VictoriaTraces (Traces)"]
        TI["vtinsert ×2"]
        TS["vtstorage ×3<br/>(StatefulSet, SSD)"]
        TSel["vtselect ×2"]
        TI --> TS
        TSel --> TS
    end
    subgraph Alerting["Alerting"]
        Alert["vmalert"]
        AM["Alertmanager"]
    end
    subgraph Viz["Visualization"]
        Grafana["Grafana"]
        VMUI["VMUI<br/>(built-in)"]
    end
    Sources --> Collection
    Collection --> Auth
    Logs --> Auth
    Auth -->|"Metrics: /api/v1/write"| MI
    Auth -->|"Logs: /insert/jsonline"| LI
    Auth -->|"Traces: /insert/opentelemetry"| TI
    Auth -->|"PromQL query"| MSel
    Auth -->|"LogsQL query"| LSel
    Auth -->|"Jaeger query"| TSel
    Grafana --> Auth
    VMUI --> Auth
    Alert --> Auth
    Alert -->|"Fire alerts"| AM
    style Sources fill:#0d7377,color:#fff
    style Collection fill:#ff6600,color:#fff
    style Proxy fill:#7b42bc,color:#fff
    style MetricsCluster fill:#2a2d3e,color:#fff
    style LogsCluster fill:#2a7de1,color:#fff
    style TracesCluster fill:#e65100,color:#fff
    style Alerting fill:#c62828,color:#fff
    style Viz fill:#ff6600,color:#fff
```
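The routing layer in the diagram does two things: it maps a request path to a backend component, then balances across that component's replicas. The sketch below illustrates the idea with prefix matching and round-robin selection. It is not how vmauth is implemented or configured; the replica names (`vminsert-0`, etc.) and both function names are hypothetical, and only the path prefixes are taken from this page.

```python
import itertools
from typing import Dict, Iterator, List, Optional, Tuple

# Path prefixes from the diagrams on this page, mapped to backend components.
ROUTES: List[Tuple[str, str]] = [
    ("/api/v1/write", "vminsert"),
    ("/api/v1/query", "vmselect"),
    ("/insert/jsonline", "vlinsert"),
    ("/insert/opentelemetry", "vtinsert"),
]

# Hypothetical replica names; two replicas each, as drawn in the diagram.
REPLICAS: Dict[str, List[str]] = {
    "vminsert": ["vminsert-0", "vminsert-1"],
    "vmselect": ["vmselect-0", "vmselect-1"],
    "vlinsert": ["vlinsert-0", "vlinsert-1"],
    "vtinsert": ["vtinsert-0", "vtinsert-1"],
}

# One endless round-robin iterator per backend component.
_round_robin: Dict[str, Iterator[str]] = {
    name: itertools.cycle(hosts) for name, hosts in REPLICAS.items()
}


def match_backend(path: str) -> Optional[str]:
    """Return the backend whose prefix matches the path, or None."""
    for prefix, backend in ROUTES:
        if path.startswith(prefix):
            return backend
    return None


def pick_replica(path: str) -> Optional[str]:
    """Route a path to a backend, then pick its next replica in turn."""
    backend = match_backend(path)
    if backend is None:
        return None
    return next(_round_robin[backend])
```

Because the insert and select tiers are stateless, any replica can serve any request, which is what makes plain round-robin a reasonable choice in this sketch.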
vmalert Evaluation Flow¶
```mermaid
sequenceDiagram
    participant A as vmalert
    participant P as vmauth (Proxy)
    participant VM as VictoriaMetrics / Logs
    participant AM as Alertmanager
    Note over A: Evaluate Rules (periodic)
    A->>P: POST /api/v1/query (Query Request)
    P->>VM: Inspect path & Forward to backend
    VM-->>P: Return Query Results
    P-->>A: Return Query Results
    alt Alert Triggered
        A->>AM: Send Alert Notification
    else Recording Rule
        A->>VM: Remote Write Results
    end
```
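The branch at the end of the sequence can be sketched as a single evaluation step. This is an illustrative model only, assuming an alerting rule fires whenever its query returns at least one result; the `Rule` class, the `evaluate` function, and the three callbacks are hypothetical stand-ins for the query, Alertmanager, and remote-write calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    name: str
    expr: str
    kind: str  # "alerting" or "recording"


def evaluate(
    rule: Rule,
    query: Callable[[str], List[dict]],
    notify: Callable[[str, List[dict]], None],
    remote_write: Callable[[str, List[dict]], None],
) -> str:
    """Run one evaluation cycle for a rule and report what happened."""
    results = query(rule.expr)  # goes through the proxy to the backend
    if rule.kind == "alerting":
        if results:  # assumption: any returned series triggers the alert
            notify(rule.name, results)
            return "fired"
        return "inactive"
    # Recording rule: persist the computed results as a new series.
    remote_write(rule.name, results)
    return "recorded"
```

In a real deployment this step would run periodically for every rule; the sketch covers a single pass so that the alerting and recording branches are easy to compare.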
Multi-Source Log Ingestion¶
VictoriaLogs accepts several common log-shipping protocols directly, so existing agents can push to it without a translation layer:
```mermaid
flowchart LR
    A["Promtail"] -->|"Loki Push API"| B{"vmauth"}
    C["Fluent Bit"] -->|"JSON Lines"| B
    D["Logstash"] -->|"ES Bulk API"| B
    E["OTel Collector"] -->|"OTLP"| B
    F["rsyslog"] -->|"Syslog"| B
    B -->|"Route & Auth"| G["vlinsert"]
    G --> H[("vlstorage")]
    style B fill:#7b42bc,color:#fff
    style H fill:#2a7de1,color:#fff
```
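Of the protocols above, JSON Lines is the simplest to produce by hand: one JSON object per line, separated by newlines. The sketch below builds such a request body. The function name `to_json_lines` is hypothetical, and the `_msg` and `_time` field names in the usage note are given as an assumption about VictoriaLogs conventions that should be checked against its ingestion documentation.

```python
import json
from typing import Iterable, Mapping


def to_json_lines(records: Iterable[Mapping[str, str]]) -> str:
    """Serialize log records as newline-delimited JSON, one object per line."""
    return "".join(json.dumps(record, sort_keys=True) + "\n" for record in records)
```

A body built with `to_json_lines([{"_msg": "disk full", "_time": "2024-01-01T00:00:00Z", "host": "db-1"}])` could then be sent in a POST request to the `/insert/jsonline` path shown in the full stack diagram.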
Storage Layout¶
VictoriaMetrics (Metrics)¶
```text
/path/to/vmstorage/data/
├── big/                 # Large, compacted data blocks
│   ├── YYYY_MM/         # Monthly partitions
│   │   ├── parts/       # Compressed TSDB blocks
│   │   └── tmp/         # Temporary merge workspace
├── small/               # Recently ingested, small blocks
│   └── YYYY_MM/
├── indexdb/             # Inverted index (label → series ID)
└── snapshots/           # Point-in-time snapshots (for vmbackup)
```
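To make the monthly partitioning concrete, the sketch below maps a sample timestamp to the `YYYY_MM` directory name it would fall under. It assumes partition boundaries are computed in UTC, which is an assumption of this example and not something stated on this page; `monthly_partition` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def monthly_partition(ts_seconds: float) -> str:
    """Return the YYYY_MM partition name for a Unix timestamp (UTC assumed)."""
    return datetime.fromtimestamp(ts_seconds, tz=timezone.utc).strftime("%Y_%m")
```

One consequence of this layout is that dropping old data can work at partition granularity: an expired month corresponds to a single directory.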
VictoriaLogs (Logs)¶
```text
/path/to/vlstorage/data/
├── YYYYMMDD/            # Daily partitions
│   ├── bloom_filters/   # Bloom filters for word matching
│   ├── columns/         # Columnar storage (msg, timestamp, labels)
│   └── metadata/
```
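The bloom filters in this layout let a query skip data blocks that cannot contain a searched word. The following is a textbook bloom filter sketch, not VictoriaLogs' actual implementation; the bit-array size, the number of hashes, and the SHA-256-based hashing are all arbitrary choices made for illustration.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """A minimal bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, word: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting the word with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{word}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, word: str) -> None:
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, word: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(word)
        )
```

A block whose filter returns `False` for a query word can be skipped without reading its column data. A `True` result still requires scanning the block to confirm the match.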
VictoriaTraces (Traces)¶
VictoriaTraces uses the same storage engine as VictoriaLogs (daily partitions, bloom filters, columnar format) but organizes data by trace ID and span attributes.
Kubernetes Deployment Matrix¶
| Component | Kind | Replicas (Min HA) | Key Resource | Helm Chart |
|---|---|---|---|---|
| vmagent | DaemonSet or Deployment | 1 per node (DS) or 2+ | CPU, Memory | victoria-metrics-agent |
| vmauth | Deployment | 2+ | CPU | victoria-metrics-auth |
| vminsert | Deployment | 2+ | CPU | victoria-metrics-cluster |
| vmselect | Deployment | 2+ | CPU, Memory | victoria-metrics-cluster |
| vmstorage | StatefulSet | 3+ | Disk IOPS, Memory | victoria-metrics-cluster |
| VictoriaLogs | StatefulSet (single-node) or cluster | 1–3 | Disk, Memory | victoria-logs-single |
| VictoriaTraces | StatefulSet (single-node) | 1 | Disk, Memory | — |
| vmalert | Deployment | 1–2 | CPU | victoria-metrics-alert |
| vmoperator | Deployment | 1 | CPU | victoria-metrics-operator |
Key Design Decisions¶
| Decision | Rationale |
|---|---|
| No external dependencies | No PostgreSQL, Redis, ZooKeeper, or object storage required — reduces operational surface |
| Local disk > Object storage | SSDs provide lower latency than S3; compression compensates for limited capacity |
| Shared-nothing cluster | vmstorage nodes don't communicate — each owns its shard, simplifying scaling |
| Consistent hashing | vminsert distributes data deterministically without consensus protocol overhead |
| Bloom filters (VictoriaLogs) | Dramatically less RAM than inverted indexes at the cost of slightly higher scan overhead |
| Apache 2.0 license | More permissive than AGPL — no copyleft obligations for SaaS usage |
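The shared-nothing and consistent-hashing rows can be illustrated together: if every insert node computes the same placement from the series key alone, no coordination between storage nodes is needed. The sketch below uses rendezvous (highest-random-weight) hashing as one way to get that property. It is not presented as the algorithm vminsert uses; `pick_nodes` and the node names are hypothetical, and `replication_factor` merely mirrors the idea behind the `-replicationFactor=N` flag.

```python
import hashlib
from typing import List


def pick_nodes(series_key: str, nodes: List[str], replication_factor: int = 1) -> List[str]:
    """Choose storage nodes for a series using rendezvous hashing.

    Each node gets a score derived from (node, series_key); the highest
    scoring nodes win. Every caller computes the same answer independently.
    """

    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}|{series_key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(nodes, key=score, reverse=True)[:replication_factor]
```

With this scheme, removing a node that was not chosen for a given series leaves that series' placement unchanged, so adding or removing storage nodes only moves the data that actually has to move.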