Alert on vulnerabilities

This guide shows you how to set up alerts when critical, priority-based, or high-risk vulnerability findings are present in your workloads.

Nais exposes Prometheus metrics for vulnerability counts, priority, and risk per workload. You can use these to alert your team via Slack or Grafana.

Prerequisites

Your workload has an SBOM generated via nais/docker-build-push
You are familiar with basic Prometheus alerting

Alert on critical vulnerabilities

Create a PrometheusRule in your namespace that triggers an alert when the number of critical vulnerabilities exceeds zero.

yaml (.nais/alert-vulnerabilities.yaml)

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: vulnerability-alerts
  namespace: <MY-TEAM>
  labels:
    team: <MY-TEAM>
spec:
  groups:
    - name: vulnerability-alerts
      rules:
        - alert: CriticalVulnerability
          expr: nais_workload_vulnerabilities{severity="CRITICAL", workload_namespace="<MY-TEAM>"} > 0
          for: 10m
          annotations:
            summary: "Critical vulnerability detected in {{ $labels.workload_name }}"
            consequence: "The workload is running with one or more critical vulnerabilities."
            action: "Go to Nais Console and handle the vulnerability for {{ $labels.workload_name }}."
          labels:
            namespace: <MY-TEAM>
            severity: critical

Alert on priority (ACT_NOW and HIGH)

Nais also exposes a priority-based metric for the two highest-priority classes: nais_workload_vulnerabilities_priority.

Use this when you want exploitability-driven urgency, not only CVSS severity.

ACT_NOW: has_kev_entry=true, based on CISA KEV. This means known exploitation in the wild and should be treated as highest remediation priority.
HIGH: has_kev_entry=false and (known_ransomware_use=true or EPSS percentile >= 0.90). This means elevated likelihood of exploitation and should be handled quickly.

A CVE is counted in only one of these classes. If it has a KEV entry, it is counted as ACT_NOW, not HIGH.
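The classification rules above can be sketched as a small function. This is an illustrative sketch, not Nais's actual implementation: the `Cve` dataclass and `classify_priority` are hypothetical names, the fields simply mirror the attributes named above, and returning `None` for CVEs that fall in neither class is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cve:
    # Hypothetical container; fields mirror the attributes described above.
    has_kev_entry: bool
    known_ransomware_use: bool
    epss_percentile: float  # expressed as a fraction between 0.0 and 1.0


def classify_priority(cve: Cve) -> Optional[str]:
    """Return "ACT_NOW", "HIGH", or None when the CVE is in neither class."""
    # A KEV entry always wins, so a CVE is never counted in both classes.
    if cve.has_kev_entry:
        return "ACT_NOW"
    if cve.known_ransomware_use or cve.epss_percentile >= 0.90:
        return "HIGH"
    return None


print(classify_priority(Cve(True, True, 0.99)))    # ACT_NOW
print(classify_priority(Cve(False, False, 0.95)))  # HIGH
print(classify_priority(Cve(False, False, 0.50)))  # None
```

Checking the KEV entry first is what makes the two classes mutually exclusive.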

yaml (.nais/alert-priority-vulnerabilities.yaml)

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: priority-vulnerability-alerts
  namespace: <MY-TEAM>
  labels:
    team: <MY-TEAM>
spec:
  groups:
    - name: priority-vulnerability-alerts
      rules:
        - alert: ActNowVulnerability
          expr: nais_workload_vulnerabilities_priority{priority="ACT_NOW", workload_namespace="<MY-TEAM>"} > 0
          for: 10m
          annotations:
            summary: "ACT_NOW vulnerability detected in {{ $labels.workload_name }}"
            consequence: "The workload has one or more ACT_NOW vulnerabilities."
            action: "Open Nais Console and handle the vulnerability for {{ $labels.workload_name }}."
          labels:
            namespace: <MY-TEAM>
            severity: critical
        - alert: HighPriorityVulnerability
          expr: nais_workload_vulnerabilities_priority{priority="HIGH", workload_namespace="<MY-TEAM>"} > 0
          for: 10m
          annotations:
            summary: "High-priority vulnerability detected in {{ $labels.workload_name }}"
            consequence: "The workload has one or more HIGH priority vulnerabilities."
            action: "Open Nais Console and handle the vulnerability for {{ $labels.workload_name }}."
          labels:
            namespace: <MY-TEAM>
            severity: warning

Alert on high risk score

You can also alert based on nais_workload_risk_score if you prefer a single aggregated alert per workload instead of one per severity level.

The risk score is calculated as:

Plaintext

(CRITICAL × 10) + (HIGH × 5) + (MEDIUM × 3) + (LOW × 1) + (UNASSIGNED × 5)

A workload with 1 critical vulnerability scores 10, while a workload with 20 critical vulnerabilities scores 200. Choose a threshold that matches your team's risk tolerance. A threshold of 200 is a reasonable starting point: it equals the score of 20 critical or 40 high-severity vulnerabilities, so the rule below (> 200) fires once a workload goes beyond that.
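The formula and the threshold arithmetic can be checked with a short script. This is a minimal sketch, assuming the severity counts are available as a plain mapping; `SEVERITY_WEIGHTS`, `risk_score`, and `exceeds_threshold` are illustrative names and not part of Nais.

```python
from typing import Mapping

# Weights copied from the risk score formula above.
SEVERITY_WEIGHTS = {
    "CRITICAL": 10,
    "HIGH": 5,
    "MEDIUM": 3,
    "LOW": 1,
    "UNASSIGNED": 5,
}


def risk_score(counts: Mapping[str, int]) -> int:
    """Sum each severity count multiplied by its weight.

    Raises KeyError for a severity that is not in SEVERITY_WEIGHTS.
    """
    return sum(SEVERITY_WEIGHTS[severity] * count for severity, count in counts.items())


def exceeds_threshold(counts: Mapping[str, int], threshold: int = 200) -> bool:
    """Mirror a strict `> threshold` comparison on the score."""
    return risk_score(counts) > threshold


print(risk_score({"CRITICAL": 1}))          # 10
print(risk_score({"CRITICAL": 20}))         # 200
print(risk_score({"HIGH": 40}))             # 200
print(exceeds_threshold({"CRITICAL": 20}))  # False
```

For a mixed case, a workload with 2 critical, 3 high, 4 medium, 5 low, and 1 unassigned finding scores 57 under these weights.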

yaml (.nais/alert-risk-score.yaml)

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: risk-score-alerts
  namespace: <MY-TEAM>
  labels:
    team: <MY-TEAM>
spec:
  groups:
    - name: risk-score-alerts
      rules:
        - alert: HighRiskScore
          expr: nais_workload_risk_score{workload_namespace="<MY-TEAM>"} > 200
          for: 10m
          annotations:
            summary: "High risk score for {{ $labels.workload_name }}"
            consequence: "The workload has accumulated a high vulnerability score."
            action: "Go to Nais Console and handle vulnerabilities for {{ $labels.workload_name }}."
          labels:
            namespace: <MY-TEAM>
            severity: warning

Activate alerts

Automatically

Add the rule file to your application repository and deploy it with the Nais GitHub Action.

Manually

Apply the rule file you created, for example:

bash

kubectl apply -f .nais/alert-priority-vulnerabilities.yaml

Send alerts to a dedicated Slack channel

By default, alerts are sent to your team's standard Slack channel. If you want to send vulnerability alerts to a dedicated channel, create an AlertmanagerConfig.

See Advanced Prometheus alerting for a complete example with a custom Slack channel and webhook.

Create alerts in Grafana

Alternatively, you can create alerts directly in Grafana without deploying Kubernetes resources:

Open Grafana and go to Alerting → Alert rules
Click New alert rule
Select the Prometheus/Mimir data source that contains the nais_workload_* metrics and enter a query, for example:
```promql
nais_workload_vulnerabilities_priority{priority="ACT_NOW", workload_namespace="<MY-TEAM>"}
```
Set the threshold and connect to a Slack contact point

See Create alert in Grafana for a complete step-by-step guide.

Metric update delay after suppression

After you suppress a vulnerability, the database is updated immediately. The workload metrics (nais_workload_vulnerabilities, nais_workload_vulnerabilities_priority, nais_workload_risk_score) are recomputed periodically (default: every 5 minutes), and alerts clear only after the updated metrics have been pushed and scraped by Prometheus. Expect a delay of roughly 5 minutes, plus normal metrics propagation time.
