
[[Category:MacOS]]
[[Category:Prometheus]]
[[Category:Monitoring]]
[[Category:Observability]]


= Monitoring a macOS Host with Prometheus Node Exporter =

Revision as of 10:44, 29 August 2025



This guide details the process of installing and configuring the Prometheus Node Exporter on a macOS machine, with a focus on filtering out irrelevant filesystems to ensure clean, actionable alerts. The primary challenge when monitoring macOS is handling the numerous OS-managed, virtual, and temporary filesystems that can trigger false positive alerts for high disk usage.

== Initial Setup and Problem Diagnosis ==

=== Installation with Homebrew ===

The standard method for installing Node Exporter on macOS is via the Homebrew package manager.

brew install node_exporter

=== The Problem: False Positive Alerts ===

After a default installation, Node Exporter exports metrics for every mounted filesystem. On macOS, this includes many volumes that are nearly full by design, leading to persistent, non-actionable alerts.

Common sources of these false positives include:

  • Xcode Simulator Runtimes: Mounted under /Library/Developer/CoreSimulator/.
  • Virtual Filesystems: Such as /dev (devfs).
  • Automounter Filesystems: Such as /System/Volumes/Data/home (autofs).
  • Read-only System Snapshots: The main / volume is a sealed, read-only snapshot of the OS.

The goal is to filter these out and only monitor user-managed volumes where disk space is a real concern, primarily /System/Volumes/Data.
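
To see which mounts would trip a naive "above 90% used" rule on a given machine, something like the following sketch can help. It is not part of the original setup: the function name `high_usage_mounts` is hypothetical, and it assumes the POSIX output format of `df -P`. It looks for the capacity column by its trailing `%` rather than by position, because a filesystem name can itself contain a space.

```shell
# Filter `df -P` output (read from stdin) down to mounts at or above a usage
# threshold. Prints "<capacity>% <mount point>"; the threshold defaults to 90.
# The function name is illustrative only.
high_usage_mounts() {
  awk -v t="${1:-90}" '
    NR > 1 {
      # Locate the capacity column by its trailing percent sign.
      for (i = 1; i <= NF; i++) if ($i ~ /^[0-9]+%$/) break
      if (i > NF) next
      pct = $i; sub(/%/, "", pct)
      # Everything after the capacity column is the mount point,
      # which may itself contain spaces.
      mp = $(i + 1)
      for (j = i + 2; j <= NF; j++) mp = mp " " $j
      if (pct + 0 >= t) print pct "% " mp
    }'
}

# Usage on the host being monitored:
#   df -P | high_usage_mounts 90
```

Mounts that show up here but are not user-managed are the candidates for exclusion.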

== Configuration and Troubleshooting ==

The solution involves a two-part strategy: configuring Node Exporter to exclude noisy filesystems at the source, and writing a precise PromQL alert that only targets the user-managed data volume.

=== Step 1: Configure Node Exporter Exclusions ===

Node Exporter can be configured to ignore specific mount points using a command-line flag. When installed with Homebrew, these flags should be placed in a dedicated arguments file.

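  1. Create or Edit the Arguments File: This file is read when the service is started through `brew services`. It should contain one argument per line, without quotes. The path below assumes Apple Silicon, where the Homebrew prefix is /opt/homebrew; on Intel Macs the prefix is /usr/local.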
nano /opt/homebrew/etc/node_exporter.args
  2. Add the Exclusion Rule: To filter out the common OS-managed, virtual, and temporary filesystems, add the following single line to the file.
--collector.filesystem.mount-points-exclude=^/(dev|private/var/folders|System/Volumes/(Preboot|Update|VM|Recovery|Hardware|xarts|iSCPreboot)|Library/Developer/CoreSimulator/.*)$

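This regular expression tells Node Exporter to ignore any filesystem whose mount point matches it. Because the pattern is anchored with ^ and $, /dev, /private/var/folders, and the listed /System/Volumes entries match only as exact mount points; only the CoreSimulator branch ends in .* and therefore also covers everything mounted beneath that directory. The pattern does not cover the autofs mount at /System/Volumes/Data/home or the read-only / snapshot; those are kept out of alerting by the rule in Step 3, which targets a single mount point.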

  3. Restart the Service: Apply the new configuration by restarting the Node Exporter service.
brew services restart node_exporter
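
The exclusion pattern from the arguments file can be sanity-checked against sample mount points without touching the service. The sketch below uses `grep -E` as a stand-in for the regular-expression engine inside Node Exporter; that substitution is an assumption, though for a pattern made only of anchors, groups, alternation, and `.*` the two should agree. The helper name `is_excluded` and the CoreSimulator sample path are made up for illustration.

```shell
# The exclusion pattern, copied verbatim from the arguments file.
EXCLUDE='^/(dev|private/var/folders|System/Volumes/(Preboot|Update|VM|Recovery|Hardware|xarts|iSCPreboot)|Library/Developer/CoreSimulator/.*)$'

# Succeeds (exit status 0) when the given mount point matches the pattern.
# Hypothetical helper, not part of Node Exporter.
is_excluded() {
  printf '%s\n' "$1" | grep -Eq "$EXCLUDE"
}

# Sample mount points; the CoreSimulator path is an invented example.
for mp in /dev /System/Volumes/VM /System/Volumes/Data \
          /Library/Developer/CoreSimulator/Volumes/iOS_21A328; do
  if is_excluded "$mp"; then
    echo "excluded: $mp"
  else
    echo "kept:     $mp"
  fi
done
```

The important case is /System/Volumes/Data: it must be reported as kept, otherwise the alert in Step 3 would have no data to evaluate.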

=== Step 2: Validate the Configuration ===

After restarting, verify that the noisy filesystems are no longer being exported. The following command queries the metrics endpoint and greps for the excluded patterns.

curl -s http://localhost:9100/metrics | grep -E 'mountpoint="/(dev|private/var/folders|System/Volumes/(Preboot|Update|VM|Recovery|Hardware|xarts|iSCPreboot)|Library/Developer/CoreSimulator/.*)"'

A successful configuration will result in no output from this command.
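
An empty result only shows that the excluded mounts are gone. It does not show that the Data volume is still exported, and an overly broad pattern would silently remove it so that the alert never fires. A minimal sketch of a check covering both directions follows; the function name `check_filesystem_metrics` is hypothetical, and it reads metrics in the Prometheus text format from stdin.

```shell
# Reads Prometheus text-format metrics on stdin. Fails if an excluded mount
# point is still exported, or if the Data volume is missing from the output.
# Hypothetical helper for illustration.
check_filesystem_metrics() {
  metrics=$(cat)
  leaked=$(printf '%s\n' "$metrics" \
    | grep -E 'mountpoint="/(dev|private/var/folders|System/Volumes/(Preboot|Update|VM|Recovery|Hardware|xarts|iSCPreboot)|Library/Developer/CoreSimulator/.*)"' \
    || true)
  if [ -n "$leaked" ]; then
    echo "excluded mount points are still exported:" >&2
    printf '%s\n' "$leaked" >&2
    return 1
  fi
  if ! printf '%s\n' "$metrics" \
      | grep -q 'node_filesystem_size_bytes{.*mountpoint="/System/Volumes/Data"'; then
    echo "the Data volume is not being exported" >&2
    return 1
  fi
  echo "ok"
}

# Usage against the local exporter:
#   curl -s http://localhost:9100/metrics | check_filesystem_metrics
```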

=== Step 3: Create a Precise PromQL Alert ===

With the metrics now clean, the final step is to create an alerting rule in Prometheus that is simple and not exposed to the false positives described earlier. Instead of excluding filesystems in the query, explicitly include only the volume you care about.

The most critical user-managed filesystem on macOS is /System/Volumes/Data.

Recommended PromQL Alerting Rule: This query calculates the percentage of used space for the Data volume and will fire only if it exceeds 90%.

100 - (node_filesystem_avail_bytes{mountpoint="/System/Volumes/Data"} / node_filesystem_size_bytes{mountpoint="/System/Volumes/Data"}) * 100 > 90
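
As a sketch of how this expression might be wrapped in a Prometheus rule file: the file name, group name, alert name, `for` duration, severity label, and summary text below are all illustrative choices, not values taken from this guide. The heredoc is quoted so that the shell leaves `$labels` untouched.

```shell
# Write a rule file wrapping the expression above. Every name and duration
# in this file is an assumption chosen for illustration.
cat > macos_disk.rules.yml <<'EOF'
groups:
  - name: macos_disk
    rules:
      - alert: MacOSDataVolumeAlmostFull
        expr: |
          100 - (node_filesystem_avail_bytes{mountpoint="/System/Volumes/Data"}
            / node_filesystem_size_bytes{mountpoint="/System/Volumes/Data"}) * 100 > 90
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Data volume on {{ $labels.instance }} is over 90% full"
EOF

# Validate the file if promtool (shipped with Prometheus) is on the PATH.
if command -v promtool >/dev/null 2>&1; then
  promtool check rules macos_disk.rules.yml
fi
```

The file would then be listed under `rule_files` in the Prometheus configuration. The `for` clause makes the alert wait until the condition has held for the stated duration, which avoids firing on short spikes.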

By following this guide, you get a macOS monitoring setup that exports clean filesystem metrics and alerts only on the volume where disk space is a real concern.