Monitoring PVE 8 via Prometheus on Kubernetes

== Monitor Proxmox with Prometheus Exporter on Kubernetes ==
This guide outlines a robust and secure method for deploying the <code>prometheus-pve-exporter</code> to a Kubernetes cluster. Since it is a security best practice to use a unique API token for each host, this guide details how to deploy a dedicated exporter instance for each host you wish to monitor.
=== 1. Create a Read-Only User and API Token on Each Proxmox Host ===
This one-time setup must be performed on '''each''' Proxmox host you want to monitor. This process ensures a clean permission set and avoids common access control list (ACL) conflicts. Connect to your host via SSH and run the following commands.
 
The script below will:
*  Create a user named <code>pve-exporter@pve</code> for monitoring (it does not require a password).
*  Assign the built-in <code>PVEAuditor</code> role to the new user.
*  Create an API token named <code>exporter-token</code>.
*  Assign the <code>PVEAuditor</code> role '''directly to the API token''' to override any potential ACL inheritance issues.
 
<syntaxhighlight lang="bash">
# 1. Create the user (password login is not needed for token auth)
pveum useradd pve-exporter@pve

# 2. Assign the standard PVEAuditor role to the USER
pveum aclmod / -user pve-exporter@pve -role PVEAuditor

# 3. Create the API token for the user
pveum user token add pve-exporter@pve exporter-token

# 4. THE CRITICAL STEP: Grant the PVEAuditor role DIRECTLY to the API TOKEN
pveum aclmod / -token 'pve-exporter@pve!exporter-token' -role PVEAuditor
</syntaxhighlight>
'''Important:''' The <code>pveum user token add</code> command will output the '''Token ID''' (e.g., <code>pve-exporter@pve!exporter-token</code>) and the '''Secret Value'''. Copy the full secret value immediately, as '''you will not be able to see it again.'''
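Before wiring the token into Kubernetes, you can sanity-check it from any machine that can reach the Proxmox API. A minimal sketch, assuming the example host name below and with the placeholder secret substituted for your own:

```shell
# Hypothetical values -- replace with your host, Token ID, and secret.
PVE_HOST="ahsoka.tatooine.jgy.local"
TOKEN_ID="pve-exporter@pve!exporter-token"
TOKEN_SECRET="YOUR_API_TOKEN_SECRET"

# Proxmox API tokens authenticate via a single header of the form:
#   Authorization: PVEAPIToken=<token-id>=<secret>
AUTH_HEADER="Authorization: PVEAPIToken=${TOKEN_ID}=${TOKEN_SECRET}"

# -k skips certificate verification (Proxmox defaults to a self-signed cert).
# A working token returns a JSON list of nodes; a 401 means a bad secret.
curl -ks -H "$AUTH_HEADER" "https://${PVE_HOST}:8006/api2/json/nodes" || true
```

A 200 response whose <code>data</code> array is empty is the classic symptom of the token-ACL issue that step 4 above addresses.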
 
==== (Optional) Cleanup Script ====
If you need to re-run the setup on a host, first delete the old resources to ensure a clean state.
<syntaxhighlight lang="bash">
pveum aclmod / -delete 1 -token 'pve-exporter@pve!exporter-token' -role PVEAuditor
pveum user token remove pve-exporter@pve exporter-token
pveum user delete pve-exporter@pve
</syntaxhighlight>
=== 2. Create the Kubernetes Manifest for Each Host ===
On your local machine, create a single YAML file (e.g., <code>pve-exporters.yaml</code>). This file will contain a separate set of Kubernetes resources for each Proxmox host. Below is the complete template for a host named <code>ahsoka</code>.

'''Important:''' Before saving, replace the placeholder values:
*  <code>jgy-pve-exporter-ahsoka-auth</code>: Ensure secret names are unique per host.
*  <code>YOUR_API_TOKEN_ID</code>: Use the full Token ID from the previous step (e.g., <code>pve-exporter@pve!exporter-token</code>).
*  <code>YOUR_API_TOKEN_SECRET</code>: Use the secret value you just generated.
*  <code>ahsoka.tatooine.jgy.local</code>: Update with your Proxmox host's fully qualified domain name or IP address.
 
<syntaxhighlight lang="yaml">
# ===================================================================
# ==        CONFIGURATION FOR PROXMOX HOST: ahsoka                 ==
# ===================================================================
# 1. Secret for "ahsoka" - This holds the unique token for this host.
apiVersion: v1
kind: Secret
metadata:
  name: jgy-pve-exporter-ahsoka-auth
  namespace: monitoring # Or your preferred namespace
stringData:
  # The PVE_USER for token auth is the full Token ID
  PVE_USER: "YOUR_API_TOKEN_ID" # e.g., pve-exporter@pve!exporter-token
  # The PVE_PASSWORD for token auth is the Token Secret
  PVE_PASSWORD: "YOUR_API_TOKEN_SECRET"
---
# 2. Deployment for the "jgy-pve-exporter-ahsoka" instance
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jgy-pve-exporter-ahsoka
  namespace: monitoring
  labels:
    app: jgy-pve-exporter-ahsoka
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jgy-pve-exporter-ahsoka
  template:
    metadata:
      labels:
        app: jgy-pve-exporter-ahsoka
    spec:
      containers:
      - name: pve-exporter
        image: prompve/prometheus-pve-exporter:v3.5.5
        args:
        - "--web.listen-address=:9106"
        ports:
        - name: http-metrics
          containerPort: 9106
        env:
        - name: PVE_USER
          valueFrom:
            secretKeyRef:
              name: jgy-pve-exporter-ahsoka-auth
              key: PVE_USER
        - name: PVE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: jgy-pve-exporter-ahsoka-auth
              key: PVE_PASSWORD
        - name: PVE_VERIFY_SSL
          value: "false"
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
          limits:
            cpu: 100m
            memory: 128Mi
---
# 3. Service for the "jgy-pve-exporter-ahsoka" instance
apiVersion: v1
kind: Service
metadata:
  name: jgy-pve-exporter-ahsoka
  namespace: monitoring
  labels:
    app: jgy-pve-exporter-ahsoka
spec:
  selector:
    app: jgy-pve-exporter-ahsoka
  ports:
  - name: http-metrics
    port: 9106
    targetPort: http-metrics
---
# 4. ServiceMonitor for the "jgy-pve-exporter-ahsoka" instance
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: jgy-pve-exporter-ahsoka
  namespace: monitoring
  labels:
    release: prometheus # Label must match your Prometheus Operator's discovery selector
spec:
  selector:
    matchLabels:
      app: jgy-pve-exporter-ahsoka
  endpoints:
  - port: http-metrics
    path: /pve
    params:
      target:
      - "ahsoka.tatooine.jgy.local" # <-- Replace with your Proxmox host's FQDN or IP
    relabelings:
    - sourceLabels: [__param_target]
      targetLabel: instance
</syntaxhighlight>
To monitor additional hosts, copy and paste this entire four-document block into the same file, then perform a find-and-replace for <code>ahsoka</code> with your new host's name (e.g., <code>thrawn</code>) and update the new host's unique credentials and target address.
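The find-and-replace step can be scripted. A minimal sketch, assuming a hypothetical template file <code>pve-exporter-template.yaml</code> that holds the four-document block above with <code>ahsoka</code> as the placeholder host name:

```shell
# Build pve-exporters.yaml by stamping out one copy of the template per host.
# The tiny template written here is illustrative; in practice the file
# contains the full four-document block above.
cat > pve-exporter-template.yaml <<'EOF'
# == CONFIGURATION FOR PROXMOX HOST: ahsoka ==
name: jgy-pve-exporter-ahsoka
EOF

: > pve-exporters.yaml  # start with an empty output file
for HOST in ahsoka thrawn; do
  sed "s/ahsoka/${HOST}/g" pve-exporter-template.yaml >> pve-exporters.yaml
  echo "---" >> pve-exporters.yaml
done
```

Note that this only renames resources; each host's copy of the Secret still needs its own Token ID and secret value filled in.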
=== 3. Apply the Kubernetes Manifest ===
Apply the single YAML file to your cluster to deploy all resources.
<syntaxhighlight lang="bash">
kubectl apply -f pve-exporters.yaml
</syntaxhighlight>
=== 4. Verify the Deployment ===
Check that the pods are running and that Prometheus is successfully scraping the targets.
<syntaxhighlight lang="bash">
# Check pod status for all exporters, replacing with your host names
kubectl get pods -n monitoring -l 'app in (jgy-pve-exporter-ahsoka, jgy-pve-exporter-thrawn)'
</syntaxhighlight>
After a minute, navigate to your Prometheus UI, go to '''Status -> Targets''', and verify that targets for <code>jgy-pve-exporter-ahsoka</code> (and any others you deployed) are present and have a state of '''UP'''.
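You can also scrape an exporter directly to confirm it can reach the Proxmox API before involving Prometheus. A minimal sketch, assuming the <code>ahsoka</code> deployment and target address from the manifest above:

```shell
# Hypothetical names from the manifest above; adjust to your deployment.
URL="http://localhost:9106/pve?target=ahsoka.tatooine.jgy.local"

# Forward the exporter's port to localhost in the background.
kubectl -n monitoring port-forward deploy/jgy-pve-exporter-ahsoka 9106:9106 >/dev/null 2>&1 &
PF_PID=$!
sleep 2

# A healthy exporter returns Prometheus metrics; look for pve_up 1.
echo "Scraping ${URL}"
curl -s "$URL" | grep "^pve_up" || echo "scrape failed (is the pod running?)"

kill "$PF_PID" 2>/dev/null || true
```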
 
'''Notes:'''
*  The <code>PVE_VERIFY_SSL: "false"</code> setting is used because Proxmox VE defaults to a self-signed SSL certificate. Set it to <code>"true"</code> if you use a valid, trusted certificate.
*  The <code>ServiceMonitor</code> resource is intended for clusters running the Prometheus Operator. If you are not using it, you will need to add the scrape configuration directly to your <code>prometheus.yml</code> file.
*  All Kubernetes resources are deployed to the <code>monitoring</code> namespace. Adjust if you use a different one.
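For clusters without the Prometheus Operator, an equivalent static scrape job in <code>prometheus.yml</code> looks roughly like this (a sketch; the job name and in-cluster Service DNS name follow the <code>ahsoka</code> example above):

```yaml
# Hypothetical static scrape job equivalent to the ServiceMonitor above.
scrape_configs:
  - job_name: "pve-ahsoka"
    metrics_path: /pve
    params:
      target: ["ahsoka.tatooine.jgy.local"]
    static_configs:
      # In-cluster Service DNS name from the manifest above
      - targets: ["jgy-pve-exporter-ahsoka.monitoring.svc:9106"]
    relabel_configs:
      - source_labels: [__param_target]
        target_label: instance
```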


[[Category:Proxmox VE]]
[[Category:Kubernetes]]
[[Category:Monitoring]]

Revision as of 16:05, 29 August 2025
