Monitoring PVE 8 via Prometheus on Kubernetes

Revision as of 17:06, 29 August 2025 by Gyurci08 (talk | contribs)

Monitor Proxmox with Prometheus Exporter and Per-Host Tokens

This guide describes how to deploy `prometheus-pve-exporter` to a Kubernetes cluster as a single exporter instance whose configuration is rendered at startup with a unique API token for each Proxmox host. This keeps the operational simplicity of one deployment while limiting each credential to a single host.

1. Create a Unique Read-Only API Token on Each Proxmox Host

This setup must be performed on each Proxmox host you wish to monitor (e.g., `ahsoka`, `thrawn`). We will use a consistent user and token name across all hosts for simplicity.

Connect to each Proxmox host via SSH and run the following commands.

# On your FIRST Proxmox host (e.g., 'ahsoka'), create the user and first token:
pveum useradd pve-exporter@pve
pveum aclmod / -user pve-exporter@pve -role PVEAuditor
pveum user token add pve-exporter@pve exporter-token
pveum aclmod / -token 'pve-exporter@pve!exporter-token' -role PVEAuditor

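# On ALL SUBSEQUENT hosts (e.g., 'thrawn'), the user is synced by the cluster,
# so you only need to create a new token with the same name.
# (A standalone host that is not part of that cluster has no synced user;
# run the useradd and first aclmod commands there as well.)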
pveum user token add pve-exporter@pve exporter-token
pveum aclmod / -token 'pve-exporter@pve!exporter-token' -role PVEAuditor

Important: The `pveum user token add` command will generate a unique secret value on each host. You must copy the secret value for each host immediately, as you will not be able to see it again.
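
Because each secret is shown only once, it can help to sanity-check a pasted value before it goes into the Kubernetes Secret. The sketch below assumes the secret is printed in the usual UUID form (8-4-4-4-12 hexadecimal characters); `is_pve_token_secret` is a hypothetical helper name introduced here, not part of `pveum`, and the value shown is made up.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the argument looks like a UUID-style
# token secret (assumed format: 8-4-4-4-12 hexadecimal characters).
is_pve_token_secret() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Made-up example value; a real secret comes from `pveum user token add`.
if is_pve_token_secret "01234567-89ab-cdef-0123-456789abcdef"; then
  echo "looks valid"
else
  echo "unexpected format"
fi
```

A check like this only catches copy-and-paste damage such as truncation or stray characters; it says nothing about whether the host will accept the token.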

(Optional) Cleanup Script

If you need to re-run the setup on a host, first delete the old token.

pveum aclmod / -delete 1 -token 'pve-exporter@pve!exporter-token' -role PVEAuditor
pveum user token remove pve-exporter@pve exporter-token
# Only run userdel after removing all tokens for that user from all hosts.
# pveum userdel pve-exporter@pve

2. Create the Kubernetes Manifests

On your local machine, create a single YAML file (e.g., `pve-exporter-full.yaml`). This file contains all the necessary Kubernetes resources.

Important: Before saving, populate the `Secret` with the unique token values you generated on each host. The keys in the secret (`ahsoka-token`, `thrawn-token`) must match the `secretKeyRef` keys referenced by the `initContainer`, which exposes them as the environment variables `PVE_AHSOKA_TOKEN` and `PVE_THRAWN_TOKEN`.

# pve-exporter-full.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: jgy-pve-exporter-secrets
  namespace: monitoring
type: Opaque
stringData:
  # Populate with the UNIQUE secret values generated on each Proxmox host
  ahsoka-token: "UNIQUE_SECRET_VALUE_FOR_AHSOKA"
  thrawn-token: "UNIQUE_SECRET_VALUE_FOR_THRAWN"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: jgy-pve-exporter-config-template
  namespace: monitoring
data:
  pve.yml: |
    # token_name is the same in every module; only token_value differs per host.
    # --- Module for ahsoka ---
    ahsoka:
      user: pve-exporter@pve
      token_name: exporter-token
      token_value: "${PVE_AHSOKA_TOKEN}"
      verify_ssl: false
    # --- Module for thrawn ---
    thrawn:
      user: pve-exporter@pve
      token_name: exporter-token
      token_value: "${PVE_THRAWN_TOKEN}"
      verify_ssl: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jgy-pve-exporter
  namespace: monitoring
  labels:
    app: jgy-pve-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jgy-pve-exporter
  template:
    metadata:
      labels:
        app: jgy-pve-exporter
    spec:
      volumes:
      - name: config-template-volume
        configMap:
          name: jgy-pve-exporter-config-template
      - name: processed-config-volume
        emptyDir: {}
      - name: tmp
        emptyDir: {}
      initContainers:
      - name: init-config-secrets
        image: busybox:1.36
        command: ['/bin/sh', '-c']
        args:
        - |
          sed -e "s|\${PVE_AHSOKA_TOKEN}|${PVE_AHSOKA_TOKEN}|g" \
              -e "s|\${PVE_THRAWN_TOKEN}|${PVE_THRAWN_TOKEN}|g" \
              /etc/config-template/pve.yml > /etc/processed-config/pve.yml
        env:
        - name: PVE_AHSOKA_TOKEN
          valueFrom:
            secretKeyRef:
              name: jgy-pve-exporter-secrets
              key: ahsoka-token
        - name: PVE_THRAWN_TOKEN
          valueFrom:
            secretKeyRef:
              name: jgy-pve-exporter-secrets
              key: thrawn-token
        volumeMounts:
        - name: config-template-volume
          mountPath: /etc/config-template
          readOnly: true
        - name: processed-config-volume
          mountPath: /etc/processed-config
      containers:
      - name: pve-exporter
        image: prompve/prometheus-pve-exporter:3.5.5
        args:
        - "--config.file=/etc/prometheus/pve.yml"
        - "--web.listen-address=:9106"
        ports:
        - name: http-metrics
          containerPort: 9106
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /
            port: http-metrics
          initialDelaySeconds: 10
          periodSeconds: 15
        readinessProbe:
          httpGet:
            path: /
            port: http-metrics
          initialDelaySeconds: 5
          periodSeconds: 5
        securityContext:
          runAsNonRoot: true
          runAsUser: 1000
          readOnlyRootFilesystem: true
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
        volumeMounts:
        - name: processed-config-volume
          mountPath: /etc/prometheus
          readOnly: true
        - name: tmp
          mountPath: /tmp
        resources:
          requests:
            cpu: '0'
            memory: 128Mi
          limits:
            cpu: '0'
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: jgy-pve-exporter
  namespace: monitoring
  labels:
    app: jgy-pve-exporter
spec:
  selector:
    app: jgy-pve-exporter
  ports:
  - name: http-metrics
    port: 9106
    targetPort: http-metrics

3. Create the Prometheus Scrape Configuration

Create a second YAML file (e.g., `pve-scrape-config.yaml`) for the `ScrapeConfig`, a Prometheus Operator custom resource. It tells Prometheus to scrape the single exporter once per Proxmox host, using relabeling to set the `target` and `module` URL parameters for each one.

# pve-scrape-config.yaml
---
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: jgy-proxmoxes
  namespace: monitoring
  labels:
    prometheus: jgy-prometheus
spec:
  staticConfigs:
    - targets:
        - ahsoka.tatooine.jgy.local
        - thrawn.tatooine.jgy.local
        # Add other hosts here
  metricsPath: /pve
  relabelings:
    # Rule 1: Take the target address and use it as the 'target' URL parameter.
    - sourceLabels: [__address__]
      targetLabel: __param_target
      
    # Rule 2: Extract the hostname (e.g., "ahsoka") and use it as the 'module' URL parameter.
    - sourceLabels: [__address__]
      regex: '([^.]+)\..*' # Captures the part before the first dot
      targetLabel: __param_module
      
    # Rule 3: Set the 'instance' label to the Proxmox host's address.
    - sourceLabels: [__param_target]
      targetLabel: instance
      
    # Rule 4: Rewrite the scrape address to point to our single exporter service.
    - targetLabel: __address__
      replacement: jgy-pve-exporter.monitoring.svc:9106
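
To see what the relabeling rules amount to, the following sketch approximates Rules 1, 2 and 4 in plain shell and prints the URL Prometheus would request for each static target. It is an illustration only: `scrape_url` is a hypothetical helper, and real relabeling happens inside Prometheus, not in a script.

```shell
#!/bin/sh
# Rule 4: every scrape is sent to the single exporter Service.
exporter="jgy-pve-exporter.monitoring.svc:9106"

scrape_url() {
  target=$1   # Rule 1: the original address becomes the 'target' parameter.
  # Rule 2: keep the part before the first dot as the 'module' parameter.
  # Prometheus anchors the regex, so a name without a dot would leave the
  # module unset there; this sketch passes such a name through unchanged.
  module=$(printf '%s\n' "$target" | sed -E 's/^([^.]+)\..*$/\1/')
  printf 'http://%s/pve?module=%s&target=%s\n' "$exporter" "$module" "$target"
}

for host in ahsoka.tatooine.jgy.local thrawn.tatooine.jgy.local; do
  scrape_url "$host"
done
# prints:
# http://jgy-pve-exporter.monitoring.svc:9106/pve?module=ahsoka&target=ahsoka.tatooine.jgy.local
# http://jgy-pve-exporter.monitoring.svc:9106/pve?module=thrawn&target=thrawn.tatooine.jgy.local
```

The derived module name must exist as a top-level key in `pve.yml`, so the short hostname of every target has to match a module defined in the ConfigMap.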

4. Apply and Verify

Apply the two Kubernetes manifests to your cluster.

kubectl apply -f pve-exporter-full.yaml
kubectl apply -f pve-scrape-config.yaml

Check that the pod is running and that Prometheus is successfully scraping the targets.

# Check pod status
kubectl get pods -n monitoring -l app=jgy-pve-exporter

After a minute, navigate to your Prometheus UI, go to Status -> Targets, and verify that a target for each of your Proxmox hosts is present and has a state of UP.
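
If a target is UP but the data looks wrong, the exporter's response can be inspected directly. The exporter publishes a `pve_up` series for each node and guest; the sketch below counts the entries that report 0 in a saved response. The sample text is illustrative rather than captured from a real host, and `count_pve_down` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads exporter output on stdin and prints how many
# pve_up samples report 0 (node offline or guest stopped).
count_pve_down() {
  awk '/^pve_up\{/ && ($NF + 0) == 0 { n++ } END { print n + 0 }'
}

# Illustrative sample in Prometheus text format (not real output).
sample='# HELP pve_up Node/VM/CT-Status is online/running
# TYPE pve_up gauge
pve_up{id="node/ahsoka"} 1.0
pve_up{id="qemu/100"} 1.0
pve_up{id="lxc/101"} 0.0'

printf '%s\n' "$sample" | count_pve_down   # prints 1

# Against the live exporter, the input could come from something like:
#   kubectl -n monitoring port-forward svc/jgy-pve-exporter 9106:9106 &
#   curl -s 'http://localhost:9106/pve?module=ahsoka&target=ahsoka.tatooine.jgy.local' | count_pve_down
```

A non-zero count is not necessarily a fault, since stopped guests also report 0; an empty or error response from `curl` more often points to a wrong token or module name.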