Monitoring PVE 8 via Prometheus on Kubernetes

== Monitor Proxmox with Prometheus Exporter (PVE 8) ==

This guide outlines a robust, secure method for deploying the `prometheus-pve-exporter` to a Kubernetes cluster. A single exporter instance is dynamically configured at startup with unique, per-host API tokens, combining the operational simplicity of one deployment with the security of per-host credentials.

=== 1. Create a Unique Read-Only API Token on Each Proxmox Host ===

This setup must be performed on '''each''' Proxmox host you wish to monitor (e.g., `pve-node-1`, `pve-node-2`). We will use a consistent user and token name across all hosts for simplicity.

Connect to each Proxmox host via SSH and run the following commands.

<syntaxhighlight lang="bash">
# On your FIRST Proxmox host (e.g., 'pve-node-1'), create the user and first token:
pveum useradd pve-exporter@pve
pveum aclmod / -user pve-exporter@pve -role PVEAuditor
pveum user token add pve-exporter@pve exporter-token
pveum aclmod / -token 'pve-exporter@pve!exporter-token' -role PVEAuditor

# On ALL SUBSEQUENT hosts (e.g., 'pve-node-2'), the user is synced by the cluster.
# You only need to create a new token with the same name.
pveum user token add pve-exporter@pve exporter-token
pveum aclmod / -token 'pve-exporter@pve!exporter-token' -role PVEAuditor
</syntaxhighlight>

'''Important:''' The `pveum user token add` command will generate a '''unique secret value''' on each host. You must copy the secret value for '''each''' host immediately, as you will not be able to see it again.
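
To sanity-check a host before moving on, list the token and its ACL entries, then exercise the token against the API directly. A quick sketch, run on the node itself; the `Authorization` header format is `PVEAPIToken=<user>@<realm>!<token name>=<secret>`, so paste the secret you just copied:
<syntaxhighlight lang="bash">
# List the exporter user's tokens (secret values are never shown here)
pveum user token list pve-exporter@pve

# Confirm the ACL entries exist for both the user and the token
pveum acl list | grep pve-exporter

# Call the API with the token; -k skips the self-signed certificate check
curl -k -H 'Authorization: PVEAPIToken=pve-exporter@pve!exporter-token=PASTE_SECRET_HERE' \
  https://localhost:8006/api2/json/nodes
</syntaxhighlight>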

==== (Optional) Cleanup Script ====

If you need to re-run the setup on a host, first delete the old token.

<syntaxhighlight lang="bash">
pveum aclmod / -delete 1 -token 'pve-exporter@pve!exporter-token' -role PVEAuditor
pveum user token remove pve-exporter@pve exporter-token
# Only run userdel after removing all tokens for that user from all hosts.
# pveum userdel pve-exporter@pve
</syntaxhighlight>
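
After cleanup, both lookups below should come back empty before you re-run the setup (same commands as the sanity check above):
<syntaxhighlight lang="bash">
# Should list no tokens and no ACL entries for the exporter user
pveum user token list pve-exporter@pve
pveum acl list | grep pve-exporter || echo "no exporter ACL entries left"
</syntaxhighlight>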

=== 2. Create the Kubernetes Manifests ===

On your local machine, create a single YAML file (e.g., `pve-exporter-full.yaml`). This file contains all the necessary Kubernetes resources.

'''Important:''' Before saving, populate the `Secret` with the unique token values you generated on each host. The keys in the `Secret` (`pve-node-1-token`, `pve-node-2-token`) must match the `secretKeyRef` keys that feed the `initContainer`'s environment variables.
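
If you prefer to keep secret values out of the manifest file, an equivalent `Secret` can be generated imperatively with `kubectl` (a sketch; drop `--dry-run=client -o yaml` to create the object directly, and remove the `Secret` document from the combined manifest so the two definitions do not conflict):
<syntaxhighlight lang="bash">
kubectl -n monitoring create secret generic pve-exporter-secrets \
  --from-literal=pve-node-1-token='UNIQUE_SECRET_VALUE_FOR_PVE_NODE_1' \
  --from-literal=pve-node-2-token='UNIQUE_SECRET_VALUE_FOR_PVE_NODE_2' \
  --dry-run=client -o yaml
</syntaxhighlight>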

<syntaxhighlight lang="yaml">
apiVersion: v1
kind: Secret
metadata:
  name: pve-exporter-secrets
  namespace: monitoring
type: Opaque
stringData:
  # Populate with the UNIQUE secret values generated on each Proxmox host
  pve-node-1-token: "UNIQUE_SECRET_VALUE_FOR_PVE_NODE_1"
  pve-node-2-token: "UNIQUE_SECRET_VALUE_FOR_PVE_NODE_2"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: pve-exporter-config-template
  namespace: monitoring
data:
  pve.yml: |
    # --- Module for pve-node-1 ---
    pve-node-1:
      user: pve-exporter@pve
      token_name: exporter-token
      token_value: "${PVE_NODE_1_TOKEN}"
      verify_ssl: false
    # --- Module for pve-node-2 ---
    pve-node-2:
      user: pve-exporter@pve
      token_name: exporter-token
      token_value: "${PVE_NODE_2_TOKEN}"
      verify_ssl: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pve-exporter
  namespace: monitoring
  labels:
    app: pve-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pve-exporter
  template:
    metadata:
      labels:
        app: pve-exporter
    spec:
      volumes:
      - name: config-template-volume
        configMap:
          name: pve-exporter-config-template
      - name: processed-config-volume
        emptyDir: {}
      - name: tmp
        emptyDir: {}
      initContainers:
      - name: init-config-secrets
        image: busybox:1.36
        command: ['/bin/sh', '-c']
        args:
        - |
          sed -e "s|\${PVE_NODE_1_TOKEN}|${PVE_NODE_1_TOKEN}|g" \
              -e "s|\${PVE_NODE_2_TOKEN}|${PVE_NODE_2_TOKEN}|g" \
              /etc/config-template/pve.yml > /etc/processed-config/pve.yml
        env:
        - name: PVE_NODE_1_TOKEN
          valueFrom:
            secretKeyRef:
              name: pve-exporter-secrets
              key: pve-node-1-token
        - name: PVE_NODE_2_TOKEN
          valueFrom:
            secretKeyRef:
              name: pve-exporter-secrets
              key: pve-node-2-token
        volumeMounts:
        - name: config-template-volume
          mountPath: /etc/config-template
          readOnly: true
        - name: processed-config-volume
          mountPath: /etc/processed-config
      containers:
      - name: pve-exporter
        image: prompve/prometheus-pve-exporter:3.5.5
        args:
        - "--config.file=/etc/prometheus/pve.yml"
        - "--web.listen-address=:9106"
        ports:
        - name: http-metrics
          containerPort: 9106
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /
            port: http-metrics
          initialDelaySeconds: 10
          periodSeconds: 15
        readinessProbe:
          httpGet:
            path: /
            port: http-metrics
          initialDelaySeconds: 5
          periodSeconds: 5
        securityContext:
          runAsNonRoot: true
          runAsUser: 1000
          readOnlyRootFilesystem: true
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
        volumeMounts:
        - name: processed-config-volume
          mountPath: /etc/prometheus
          readOnly: true
        - name: tmp
          mountPath: /tmp
        resources:
          requests:
            cpu: '0'
            memory: 128Mi
          limits:
            cpu: '0'
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: pve-exporter
  namespace: monitoring
  labels:
    app: pve-exporter
spec:
  selector:
    app: pve-exporter
  ports:
  - name: http-metrics
    port: 9106
    targetPort: http-metrics
</syntaxhighlight>
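
After you apply this manifest in step 4, you can confirm the `initContainer` substituted the real token values into the config the exporter reads. A sketch, assuming the exporter image ships the usual Alpine userland:
<syntaxhighlight lang="bash">
# The init container is silent on success; errors show up here
kubectl -n monitoring logs deploy/pve-exporter -c init-config-secrets

# Inspect the processed config mounted into the exporter container
kubectl -n monitoring exec deploy/pve-exporter -c pve-exporter -- \
  cat /etc/prometheus/pve.yml
</syntaxhighlight>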

=== 3. Create the Prometheus Scrape Configuration ===

Create a second YAML file (e.g., `pve-scrape-config.yaml`) for the `ScrapeConfig`. This tells Prometheus how to scrape the single exporter for all your Proxmox hosts, dynamically setting the `target` and `module` parameters for each one.

<syntaxhighlight lang="yaml">
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: pve-nodes
  namespace: monitoring
  labels:
    # This label must match your Prometheus Operator's discovery selector
    prometheus: my-prometheus 
spec:
  staticConfigs:
    - targets:
        - pve-node-1.your-domain.com
        - pve-node-2.your-domain.com
        # Add other hosts here
  metricsPath: /pve
  relabelings:
    # Rule 1: Take the target address and use it as the 'target' URL parameter.
    - sourceLabels: [__address__]
      targetLabel: __param_target
      
    # Rule 2: Extract the hostname (e.g., "pve-node-1") and use it as the 'module' URL parameter.
    - sourceLabels: [__address__]
      regex: '([^.]+)\..*' # Captures the part before the first dot
      targetLabel: __param_module
      
    # Rule 3: Set the 'instance' label to the Proxmox host's address.
    - sourceLabels: [__param_target]
      targetLabel: instance
      
    # Rule 4: Rewrite the scrape address to point to our single exporter service.
    - targetLabel: __address__
      replacement: pve-exporter.monitoring.svc:9106
</syntaxhighlight>
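
To confirm the operator picked up the `ScrapeConfig`, check the resource, then ask Prometheus itself which targets it is scraping. A sketch: `svc/prometheus-operated` is the Service the operator creates by default, so adjust the name if your installation differs:
<syntaxhighlight lang="bash">
kubectl -n monitoring get scrapeconfigs.monitoring.coreos.com pve-nodes

# List active targets and their rewritten scrape URLs via the Prometheus HTTP API
kubectl -n monitoring port-forward svc/prometheus-operated 9090:9090 &
curl -s 'http://localhost:9090/api/v1/targets?state=active' \
  | grep -o '"scrapeUrl":"[^"]*"'
</syntaxhighlight>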

=== 4. Apply and Verify ===

Apply the two Kubernetes manifests to your cluster.

<syntaxhighlight lang="bash">
kubectl apply -f pve-exporter-full.yaml
kubectl apply -f pve-scrape-config.yaml
</syntaxhighlight>

Check that the pod is running and that Prometheus is successfully scraping the targets.

<syntaxhighlight lang="bash">
# Check pod status
kubectl get pods -n monitoring -l app=pve-exporter
</syntaxhighlight>
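
You can also exercise the exporter directly, bypassing Prometheus, which is the quickest way to debug token or connectivity problems. The `target` and `module` query parameters below mirror exactly what the relabeling rules in the `ScrapeConfig` send:
<syntaxhighlight lang="bash">
# Forward the exporter's Service locally, then request metrics for one host
kubectl -n monitoring port-forward svc/pve-exporter 9106:9106 &
curl -s 'http://localhost:9106/pve?target=pve-node-1.your-domain.com&module=pve-node-1' | head -n 20
</syntaxhighlight>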

After a minute, navigate to your Prometheus UI, go to '''Status -> Targets''', and verify that a target for each of your Proxmox hosts is present and has a state of '''UP'''.

[[Category:Proxmox VE]]
[[Category:Kubernetes]]
[[Category:Prometheus]]