VictoriaMetrics & VictoriaLogs: Observability for NIS-2 and DORA
TL;DR Modern compliance requirements like NIS-2, DORA, and GDPR demand robust, verifiable …
This series systematically explains how modern software is developed and operated in a compliant way, from EU regulations to technical implementation.
Modern applications rarely consist of a single monolith. More typical are dozens to hundreds of services distributed across containers, Kubernetes clusters, databases, and external APIs. Flawlessness in such environments is an illusion; what matters is how quickly and reliably you can detect, categorize, and resolve issues.
Observability is more than “a bit of monitoring.” It’s about reconstructing the internal state of your systems from externally observable behavior. This includes:
When properly implemented, observability becomes a stable component of your platform governance. It not only aids in incident handling but also in capacity planning, cost optimization, and demonstrating to auditors that your systems are controllable and traceable.
VictoriaMetrics, VictoriaLogs, and Grafana form a stack that addresses these requirements without vendor lock-in and integrates well with European data protection and compliance models.
Metrics are numerical time series: requests per second, error rates, latencies, CPU, and memory usage. Their advantage is efficiency: they can be collected at high frequency and stored for a very long time.
Prometheus has established itself as the de facto standard for this, and VictoriaMetrics as a performant backend that accepts Prometheus-compatible data and can be queried with PromQL (and with MetricsQL, its PromQL-compatible superset). For capacity planning and Golden Signals monitoring, metrics are the central tool.
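To make "Prometheus-compatible data" concrete, here is a minimal sketch that renders one sample in the Prometheus text exposition format, the line-based format a Prometheus-style scraper reads. The metric name and labels are hypothetical, and a real service would normally use a client library instead of formatting lines by hand.

```python
def format_sample(name: str, labels: dict, value: float) -> str:
    """Render one sample in the Prometheus text exposition format."""

    def escape(label_value: str) -> str:
        # Label values escape backslash, double quote, and newline.
        return (
            label_value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )

    if labels:
        # Sort labels so the output is deterministic.
        pairs = ",".join(f'{k}="{escape(v)}"' for k, v in sorted(labels.items()))
        return f"{name}{{{pairs}}} {value}"
    return f"{name} {value}"


# Hypothetical counter for a checkout service.
line = format_sample("http_requests_total", {"service": "checkout", "status": "500"}, 42)
print(line)
```

Each distinct label combination becomes its own time series, which is why label sets should stay small and bounded.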
Logs provide the story behind the numbers. They contain context: user IDs, request IDs, exception stacks, business events. Especially from a compliance perspective, logs are central: they enable forensics, traceability of accesses, and reconstruction of incidents.
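VictoriaLogs is designed to store this log data in a structured and searchable way and, combined with appropriate access controls and retention settings, in a tamper-resistant one. This is an important prerequisite for regulatory requirements such as those of NIS-2, which member states had to transpose into national law by October 17, 2024, and DORA, which has applied since January 17, 2025.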
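Structured logs are easiest to search when every entry is a single JSON object with consistent field names. The following is a minimal sketch using only the Python standard library; the field names follow the examples used in this article (request_id, user_id, tenant), and shipping the output to VictoriaLogs is deliberately left out.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as one JSON object per line."""

    # Context fields copied from the record when present (assumed names).
    CONTEXT_FIELDS = ("request_id", "user_id", "tenant")

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in self.CONTEXT_FIELDS:
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)


def build_logger(name: str = "checkout") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


logger = build_logger()
# Context is attached via the standard `extra` argument.
logger.info("payment declined", extra={"request_id": "req-123", "tenant": "acme"})
```

Keeping personal data such as user identifiers in dedicated fields also makes it easier to mask or drop them later for data protection reasons.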
Traces link events across service boundaries. They show how a single request traverses multiple services, queues, and databases. In highly distributed architectures, this helps to make performance bottlenecks and unexpected dependencies visible.
Even if tracing is not mandatory for every system, traces round out the observability perspective in complex platforms – especially in conjunction with the Golden Signals.
The four Golden Signals – Latency, Traffic, Errors, Saturation – form a practical bridge between technology and operations. They help to understand observability not as a collection of arbitrary metrics but as a focused set of key figures with a clear purpose.
Latency describes the time a system takes to process a request. Important aspects:
With VictoriaMetrics, latency metrics can be captured in detail, and Grafana visualizes them in time series and heatmaps. Logs from VictoriaLogs complement the perspective: they show which specific requests became slower and which business operations are affected.
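Latency is commonly summarized with percentiles rather than averages, because a few slow requests can hide behind a healthy-looking mean. As a minimal sketch, the nearest-rank percentile below works on raw samples; the sample values are invented, and metric backends typically estimate percentiles from histogram buckets instead.

```python
import math


def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds, with two slow outliers.
latencies_ms = [12, 15, 11, 14, 13, 250, 16, 12, 13, 900]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
mean = sum(latencies_ms) / len(latencies_ms)
print(p50, p95, mean)
```

In this example the median stays at 13 ms while the mean is pulled up to 125.6 ms by two outliers, which is why p95 or p99 are usually tracked alongside the median.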
Traffic measures how much “work” your system performs:
Traffic metrics are essential for contextualizing latency and errors: rising latencies with constant traffic indicate internal problems, while rising latencies with massively increasing traffic suggest capacity limits.
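The rule of thumb above can be written down as a small decision function. This is only a sketch: the 20% threshold and the returned labels are assumptions for illustration, not recommended values.

```python
def interpret(latency_change: float, traffic_change: float, threshold: float = 0.2) -> str:
    """Classify relative changes (0.2 means +20%) using an assumed threshold."""
    if latency_change < threshold:
        return "latency stable"
    if traffic_change >= threshold:
        # Latency and traffic rose together: the system may be hitting its limits.
        return "possible capacity limit"
    # Latency rose while traffic stayed roughly constant.
    return "possible internal problem"


print(interpret(latency_change=0.5, traffic_change=0.0))
print(interpret(latency_change=0.5, traffic_change=3.0))
```

In practice such a classification is a starting point for investigation, not a diagnosis.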
VictoriaMetrics scales very efficiently here, even when storing millions of time series over long periods. This greatly facilitates trend analysis and capacity planning.
Error signals show how reliably your system operates:
Metrics provide aggregated error rates per service or endpoint, while logs provide details on causes and context. With VictoriaLogs and its query language LogsQL, you can quickly filter, for example by error type, tenant, or feature flag.
From this data, service-level objectives (SLOs) can be derived, such as: “99.5% of requests to the checkout service are successful over a rolling 30-day window.” Grafana helps you make these SLOs visible and verifiable.
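An SLO such as 99.5% implies an error budget: the share of requests that may fail before the objective is missed. The sketch below works through that arithmetic; the request counts are invented for illustration.

```python
def error_budget(total_requests: int, failed_requests: int, slo_target: float = 0.995) -> dict:
    """Compute SLO attainment and error budget consumption for one window."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = total_requests * (1 - slo_target)
    success_ratio = (total_requests - failed_requests) / total_requests
    return {
        "success_ratio": success_ratio,
        "allowed_failures": allowed_failures,
        # Fraction of the budget already used; above 1.0 the SLO is violated.
        "budget_consumed": failed_requests / allowed_failures,
        "slo_met": success_ratio >= slo_target,
    }


# Hypothetical 30-day window: 2,000,000 requests, 4,000 of them failed.
report = error_budget(total_requests=2_000_000, failed_requests=4_000)
print(report)
```

With these numbers, about 10,000 failures are allowed, so roughly 40% of the budget is consumed and the SLO is still met.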
Saturation describes how much your resources are utilized:
For operations teams, saturation is an early warning signal. As saturation rises, latency and errors often follow. With VictoriaMetrics, you can consistently capture these metrics per node, pod, and service; logs point to specific situations where resources were exhausted.
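One way to use saturation as an early warning is to extrapolate its trend. The sketch below assumes linear growth between the first and last sample and estimates the time until a resource is exhausted; the sample data and the capacity of 100 (percent) are assumptions.

```python
from typing import Optional


def seconds_until_full(samples: list, capacity: float = 100.0) -> Optional[float]:
    """Estimate seconds until `capacity` is reached.

    `samples` is a time-ordered list of (unix_seconds, utilization) pairs.
    Returns None when utilization is flat or falling.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, u0), (t1, u1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be ordered by time")
    slope = (u1 - u0) / (t1 - t0)  # utilization units per second
    if slope <= 0:
        return None
    return max(capacity - u1, 0.0) / slope


# Hypothetical disk utilization in percent over one hour.
disk = [(0, 60.0), (1800, 64.0), (3600, 70.0)]
eta = seconds_until_full(disk)
print(eta)
```

Here utilization grows by 10 percentage points per hour, so the remaining 30 points would last about three hours. Real growth is rarely linear, so treat the estimate as a trigger for a closer look.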
VictoriaMetrics is a high-performance time-series database that accepts Prometheus-compatible metrics. For those responsible for larger environments, several characteristics are particularly relevant:
This makes VictoriaMetrics a reliable foundation for Golden Signals monitoring on production platforms.
VictoriaLogs addresses the second core area of observability: logs. For those responsible for security and compliance, several points are particularly interesting:
user_id, tenant, request_id, or feature_flag.
The result is a log backend that provides operations teams as well as data protection and compliance officers with actionable, structured information, without relying on proprietary SaaS solutions.
Grafana is the visible part of the observability stack. Those with technical responsibility need a tool that:
Key features:
To make these concepts tangible, consider a typical Django application that runs in Kubernetes and is reachable from outside via an Ingress.
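For such an application, latency, traffic, and errors can be observed at the request boundary. The sketch below follows Django's middleware convention (a callable that receives get_response) but imports nothing from Django; the class name and the in-memory sink are hypothetical stand-ins for a real metrics client.

```python
import time


class GoldenSignalsMiddleware:
    """Django-style middleware that records duration and status per request."""

    def __init__(self, get_response, sink=None):
        self.get_response = get_response
        # Hypothetical in-memory sink; a real setup would update metrics instead.
        self.sink = sink if sink is not None else []

    def __call__(self, request):
        start = time.perf_counter()
        response = self.get_response(request)
        self.sink.append(
            {
                "method": request.method,
                "path": request.path,
                "status": response.status_code,
                "duration_s": time.perf_counter() - start,
            }
        )
        return response
```

Each record carries what is needed for three of the four signals: the count of records is traffic, the status is the error signal, and the duration is latency. Saturation comes from node and pod metrics instead. Note that this sketch does not record requests whose view raises an exception.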
A Golden Signals dashboard in Grafana k