Monitoring PVE 8 via Prometheus on Kubernetes
== Monitor Proxmox with Prometheus Exporter and Per-Host Tokens ==
This guide outlines a robust and secure method for deploying the `prometheus-pve-exporter` to a Kubernetes cluster. This architecture uses a single exporter instance that is dynamically configured at startup to use unique, per-host API tokens. This provides the operational simplicity of a single deployment with the enhanced security of per-host credentials.


=== 1. Create a Unique Read-Only API Token on Each Proxmox Host ===
This setup must be performed on '''each''' Proxmox host you wish to monitor (e.g., `ahsoka`, `thrawn`). We will use a consistent user and token name across all hosts for simplicity.


Connect to each Proxmox host via SSH and run the following commands.


<syntaxhighlight lang="bash">
# On your FIRST Proxmox host (e.g., 'ahsoka'), create the user and first token:
pveum useradd pve-exporter@pve
pveum aclmod / -user pve-exporter@pve -role PVEAuditor
pveum user token add pve-exporter@pve exporter-token
pveum aclmod / -token 'pve-exporter@pve!exporter-token' -role PVEAuditor

# On ALL SUBSEQUENT hosts (e.g., 'thrawn'), the user is synced by the cluster.
# You only need to create a new token with the same name.
pveum user token add pve-exporter@pve exporter-token
pveum aclmod / -token 'pve-exporter@pve!exporter-token' -role PVEAuditor
</syntaxhighlight>


'''Important:''' The `pveum user token add` command will generate a '''unique secret value''' on each host. You must copy the secret value for '''each''' host immediately, as you will not be able to see it again.
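Before wiring a secret into Kubernetes, you can smoke-test it against the Proxmox API. This is a minimal sketch: the host name and the placeholder secret below are assumptions, so substitute your own values.

<syntaxhighlight lang="bash">
# Hypothetical values -- replace with your host and the secret printed above
PVE_HOST="ahsoka.tatooine.jgy.local"
TOKEN_SECRET="00000000-0000-0000-0000-000000000000"
AUTH_HEADER="Authorization: PVEAPIToken=pve-exporter@pve!exporter-token=${TOKEN_SECRET}"

# A token holding PVEAuditor on / can read /nodes; -k accepts the default
# self-signed certificate. A valid token returns a JSON 'data' array.
curl -sk --connect-timeout 5 -H "$AUTH_HEADER" \
  "https://${PVE_HOST}:8006/api2/json/nodes" || echo "host unreachable"
</syntaxhighlight>

A `401` response usually means the secret is wrong; an empty `data` array usually means the ACL step for the token was skipped.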


==== (Optional) Cleanup Script ====
If you need to re-run the setup on a host, first delete the old token.
<syntaxhighlight lang="bash">
pveum aclmod / -delete 1 -token 'pve-exporter@pve!exporter-token'
pveum user token remove pve-exporter@pve exporter-token
# Only run userdel after removing all tokens for that user from all hosts.
# pveum userdel pve-exporter@pve
</syntaxhighlight>


=== 2. Create the Kubernetes Manifests ===
On your local machine, create a single YAML file (e.g., `pve-exporter-full.yaml`). This file contains all the necessary Kubernetes resources.


'''Important:''' Before saving, populate the `Secret` with the unique token values you generated on each host. The keys in the secret (`ahsoka-token`, `thrawn-token`) must match the keys referenced by the `secretKeyRef` entries in the `initContainer`.
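As an aside, `stringData` accepts plaintext; Kubernetes stores it base64-encoded under `data`. If you ever inspect the live Secret, decode it the same way (the value below is a hypothetical placeholder):

<syntaxhighlight lang="bash">
# Round-trip a placeholder token secret the way Kubernetes stores it
ENCODED=$(printf '%s' 'UNIQUE_SECRET_VALUE_FOR_AHSOKA' | base64)
printf '%s' "$ENCODED" | base64 -d   # prints the original plaintext
</syntaxhighlight>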


<syntaxhighlight lang="yaml">
# pve-exporter-full.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: jgy-pve-exporter-secrets
  namespace: monitoring
type: Opaque
stringData:
  # Populate with the UNIQUE secret values generated on each Proxmox host
  ahsoka-token: "UNIQUE_SECRET_VALUE_FOR_AHSOKA"
  thrawn-token: "UNIQUE_SECRET_VALUE_FOR_THRAWN"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: jgy-pve-exporter-config-template
  namespace: monitoring
data:
  pve.yml: |
    # The token_name is consistent across all modules.
    # --- Module for ahsoka ---
    ahsoka:
      user: pve-exporter@pve
      token_name: exporter-token
      token_value: "${PVE_AHSOKA_TOKEN}"
      verify_ssl: false
    # --- Module for thrawn ---
    thrawn:
      user: pve-exporter@pve
      token_name: exporter-token
      token_value: "${PVE_THRAWN_TOKEN}"
      verify_ssl: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jgy-pve-exporter
  namespace: monitoring
  labels:
    app: jgy-pve-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jgy-pve-exporter
  template:
    metadata:
      labels:
        app: jgy-pve-exporter
    spec:
      volumes:
      - name: config-template-volume
        configMap:
          name: jgy-pve-exporter-config-template
      - name: processed-config-volume
        emptyDir: {}
      - name: tmp
        emptyDir: {}
      initContainers:
      - name: init-config-secrets
        image: busybox:1.36
        command: ['/bin/sh', '-c']
        args:
        - |
          sed -e "s|\${PVE_AHSOKA_TOKEN}|${PVE_AHSOKA_TOKEN}|g" \
              -e "s|\${PVE_THRAWN_TOKEN}|${PVE_THRAWN_TOKEN}|g" \
              /etc/config-template/pve.yml > /etc/processed-config/pve.yml
        env:
        - name: PVE_AHSOKA_TOKEN
          valueFrom:
            secretKeyRef:
              name: jgy-pve-exporter-secrets
              key: ahsoka-token
        - name: PVE_THRAWN_TOKEN
          valueFrom:
            secretKeyRef:
              name: jgy-pve-exporter-secrets
              key: thrawn-token
        volumeMounts:
        - name: config-template-volume
          mountPath: /etc/config-template
          readOnly: true
        - name: processed-config-volume
          mountPath: /etc/processed-config
      containers:
      - name: pve-exporter
        image: prompve/prometheus-pve-exporter:3.5.5
        args:
        - "--config.file=/etc/prometheus/pve.yml"
        - "--web.listen-address=:9106"
        ports:
        - name: http-metrics
          containerPort: 9106
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /
            port: http-metrics
          initialDelaySeconds: 10
          periodSeconds: 15
        readinessProbe:
          httpGet:
            path: /
            port: http-metrics
          initialDelaySeconds: 5
          periodSeconds: 5
        securityContext:
          runAsNonRoot: true
          runAsUser: 1000
          readOnlyRootFilesystem: true
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
        volumeMounts:
        - name: processed-config-volume
          mountPath: /etc/prometheus
          readOnly: true
        - name: tmp
          mountPath: /tmp
        resources:
          requests:
            cpu: '0'
            memory: 128Mi
          limits:
            cpu: '0'
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: jgy-pve-exporter
  namespace: monitoring
  labels:
    app: jgy-pve-exporter
spec:
  selector:
    app: jgy-pve-exporter
  ports:
  - name: http-metrics
    port: 9106
    targetPort: http-metrics
</syntaxhighlight>
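The `initContainer` renders the config template with `sed` before the exporter starts. You can simulate that substitution locally to sanity-check the template syntax; the `/tmp` paths and dummy secrets below are just for the demo.

<syntaxhighlight lang="bash">
# Dummy secrets standing in for the real per-host token values
export PVE_AHSOKA_TOKEN="secret-a" PVE_THRAWN_TOKEN="secret-b"

# A trimmed copy of the pve.yml template
cat > /tmp/pve-template.yml <<'EOF'
ahsoka:
  user: pve-exporter@pve
  token_name: exporter-token
  token_value: "${PVE_AHSOKA_TOKEN}"
thrawn:
  user: pve-exporter@pve
  token_name: exporter-token
  token_value: "${PVE_THRAWN_TOKEN}"
EOF

# The same sed invocation the initContainer runs
sed -e "s|\${PVE_AHSOKA_TOKEN}|${PVE_AHSOKA_TOKEN}|g" \
    -e "s|\${PVE_THRAWN_TOKEN}|${PVE_THRAWN_TOKEN}|g" \
    /tmp/pve-template.yml > /tmp/pve-processed.yml

cat /tmp/pve-processed.yml   # token_value lines now carry the dummy secrets
</syntaxhighlight>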


 
=== 3. Create the Prometheus Scrape Configuration ===
Create a final YAML file for the `ScrapeConfig`. This tells Prometheus how to scrape the single exporter for all your Proxmox hosts, dynamically setting the `target` and `module` parameters for each one.
<syntaxhighlight lang="yaml">
# pve-scrape-config.yaml
---
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: jgy-proxmoxes
  namespace: monitoring
  labels:
    prometheus: jgy-prometheus
spec:
  staticConfigs:
    - targets:
        - ahsoka.tatooine.jgy.local
        - thrawn.tatooine.jgy.local
        # Add other hosts here
  metricsPath: /pve
  relabelings:
    # Rule 1: Take the target address and use it as the 'target' URL parameter.
    - sourceLabels: [__address__]
      targetLabel: __param_target

    # Rule 2: Extract the hostname (e.g., "ahsoka") and use it as the 'module' URL parameter.
    - sourceLabels: [__address__]
      regex: '([^.]+)\..*' # Captures the part before the first dot
      targetLabel: __param_module

    # Rule 3: Set the 'instance' label to the Proxmox host's address.
    - sourceLabels: [__param_target]
      targetLabel: instance

    # Rule 4: Rewrite the scrape address to point to our single exporter service.
    - targetLabel: __address__
      replacement: jgy-pve-exporter.monitoring.svc:9106
</syntaxhighlight>
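The net effect of the four rules: for every listed host, Prometheus calls the single exporter with the host as `target` and its short name as `module`. A pure-shell equivalent of the hostname extraction in rule 2, for one hypothetical target (parameter order in the real scrape URL may vary):

<syntaxhighlight lang="bash">
TARGET="ahsoka.tatooine.jgy.local"
MODULE="${TARGET%%.*}"   # everything before the first dot, like '([^.]+)\..*'

# Roughly the URL Prometheus ends up requesting for this target:
echo "http://jgy-pve-exporter.monitoring.svc:9106/pve?module=${MODULE}&target=${TARGET}"
</syntaxhighlight>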


=== 4. Apply and Verify ===
Apply the two Kubernetes manifests to your cluster.
<syntaxhighlight lang="bash">
kubectl apply -f pve-exporter-full.yaml
kubectl apply -f pve-scrape-config.yaml
</syntaxhighlight>


Check that the pod is running and that Prometheus is successfully scraping the targets.
<syntaxhighlight lang="bash">
# Check pod status
kubectl get pods -n monitoring -l app=jgy-pve-exporter
</syntaxhighlight>


After a minute, navigate to your Prometheus UI, go to '''Status -> Targets''', and verify that a target for each of your Proxmox hosts is present and has a state of '''UP'''.
'''Notes:'''
*  The `verify_ssl: false` setting in each module is used because Proxmox VE defaults to a self-signed SSL certificate. Set it to `true` if you use a valid, trusted certificate.
*  The `ScrapeConfig` resource requires the Prometheus Operator. If you are not using it, add an equivalent scrape job directly to your `prometheus.yml` file.
*  All Kubernetes resources are deployed to the `monitoring` namespace. Adjust if you use a different one.


[[Category:Proxmox VE]]
[[Category:Kubernetes]]
[[Category:Monitoring]]

Revision as of 17:06, 29 August 2025
