n0derunner

    Viewing Nutanix cluster metrics in prometheus/grafana

    Published in Telemetry, Nutanix.

    Using Nutanix API with prometheus push-gateway.

    Many customers would like to view their cluster metrics alongside existing performance data using Prometheus/Grafana.

    Currently Nutanix does not provide a native exporter for Prometheus to use as a datasource. However, we can use the Prometheus push-gateway and a simple script that pulls from the native APIs to get data into Prometheus. From there we can use Grafana or anything else that can connect to Prometheus.

    The goal is to be able to view cluster metrics alongside other Grafana dashboards. For example, we can show the current read/write IOPS that the cluster is delivering on a per-container basis. I’m hard-coding IPs, usernames and passwords in the script, which is obviously not production grade, so don’t do that.
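
    One way to avoid hard-coded credentials is to read them from the environment and fail early when one is missing. This is only a sketch: the variable names NUTANIX_VIP, NUTANIX_USER and NUTANIX_PASSWORD and the helper require_env are made up for illustration and are not part of any Nutanix tooling.

```shell
# Hypothetical helper: check that every named environment variable is set
# and non-empty. Prints a message and returns 1 for the first one missing.
require_env() {
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required environment variable: $name" >&2
            return 1
        fi
    done
}

# Example usage at the top of a collection script (names are assumptions):
#   require_env NUTANIX_VIP NUTANIX_USER NUTANIX_PASSWORD || exit 1
#   VIP="$NUTANIX_VIP"; username="$NUTANIX_USER"; password="$NUTANIX_PASSWORD"
```

    The values can then be exported in the shell session or injected by whatever secret store you already use, rather than living in the script file.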

    How to do it

    Since my Mac is connected to the corporate VPN, I am going to set up the Prometheus push-gateway, Prometheus and Grafana on my Mac just as a demo.

    Step 1 Create a Prometheus gateway on the Mac

    The Prometheus push-gateway comes pre-compiled for a variety of operating systems, including macOS. We can find the latest release on GitHub. As of July 2023 the latest version is 1.6.0, so head to GitHub and pull the version for your OS. I’m using an older Mac, so I went with pushgateway-1.6.0.darwin-amd64.tar.gz

    Download, gunzip and untar the file, then simply run it:

    cd /Users/gary/Downloads/pushgateway-1.6.0.darwin-amd64
    ./pushgateway

    The push-gateway listens on port 9091 by default. The same port is used both to receive pushed metrics and to serve them to the Prometheus scraper.
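
    Before writing any cluster script, it can help to confirm that the gateway accepts a push and serves it back on the same port. The sketch below assumes the gateway is running locally on the default port; the job name smoke_test and the metric demo_metric are arbitrary names chosen for the check.

```shell
# Address of the push-gateway; override by exporting PUSHGATEWAY_URL.
PUSHGATEWAY_URL="${PUSHGATEWAY_URL:-http://localhost:9091}"

# Build the push endpoint for a given job name.
push_url() {
    printf '%s/metrics/job/%s\n' "$PUSHGATEWAY_URL" "$1"
}

# Push a single sample: push_sample <job> <metric> <value>
push_sample() {
    printf '%s %s\n' "$2" "$3" |
        curl --silent --show-error --fail --data-binary @- "$(push_url "$1")"
}

# Example (requires the gateway to be running):
#   push_sample smoke_test demo_metric 42
#   curl --silent "$PUSHGATEWAY_URL/metrics" | grep demo_metric
```

    If the second command prints a demo_metric line, the gateway is both accepting pushes and exposing them for scraping.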

    Step 2 Point prometheus at the gateway

    The easiest thing to do for a demo is to hard-code the address of the push-gateway in the prometheus.yml file. I have Prometheus downloaded on my Mac in /Users/gary/Downloads/prometheus/prometheus-2.39.0.darwin-amd64. In my prometheus.yml I add the address of the push-gateway to the targets list at the end of the file. For me the gateway is running on the same machine as Prometheus, but on port 9091.

      static_configs:
          - targets: ["localhost:9090","localhost:9091"]

    Then start prometheus

    ./prometheus
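
    As a variation on the edit above (not what this post uses), the gateway can be given its own scrape job with honor_labels enabled, so that the job label supplied at push time is kept rather than being renamed to exported_job. The sketch below writes such a config to a file of your choosing; the function name write_demo_config and the 15s interval are assumptions for illustration.

```shell
# Write a minimal Prometheus config to the path given as $1.
# It scrapes Prometheus itself and, as a separate job, the push-gateway.
write_demo_config() {
    cat > "$1" <<'EOF'
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "pushgateway"
    honor_labels: true
    static_configs:
      - targets: ["localhost:9091"]
EOF
}

# Example:
#   write_demo_config /tmp/prometheus-demo.yml
#   ./promtool check config /tmp/prometheus-demo.yml
#   ./prometheus --config.file=/tmp/prometheus-demo.yml
```

    promtool ships in the same tarball as prometheus, so the check can be run from the same directory before starting the server.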

    Step 4 Write a script to gather data from the cluster and send it to the push-gateway

    Now that we have a push-gateway connected to a Prometheus server, we need to put some cluster metrics into it. For simplicity I am going to use a simple bash script with the values hard-coded.

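    There are two main things happening here. First, the script looks up the UUID of each container by name and uses it to query the Prism REST API for the read and write IOPS counters. Second, it pushes those values to the push-gateway with curl, repeating every 10 seconds.

    That’s it. Here is the script. You will need to supply the cluster VIP, a username and a password to connect to Prism Element and grab the counters we want.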

    #!/usr/bin/env bash
    
    VIP="<your-cluster-virtual-IP-here>"
    username="<prism-login-name>"
    password="<prism-login-password>"
    container_list="ctr1"
    
    function main {
       # Poll every container in the list, then wait 10 seconds and repeat.
       while true ; do
           for container in $container_list ; do
               CTR_UUID=$(get_uuid_for_container "$container")
               READ_IOPS=$(get_metric "$CTR_UUID" "controller_num_read_iops")
               WRITE_IOPS=$(get_metric "$CTR_UUID" "controller_num_write_iops")
               # The container name is used as the job label on the push-gateway.
               echo "cluster_read_iops $READ_IOPS" | curl --data-binary @- "http://localhost:9091/metrics/job/$container"
               echo "cluster_write_iops $WRITE_IOPS" | curl --data-binary @- "http://localhost:9091/metrics/job/$container"
           done
           sleep 10
       done
    }
    
    function get_metric {
        CTR_UUID=$1
        metric_name=$2
        URL="https://$VIP:9440/PrismGateway/services/rest/v2.0/storage_containers/$CTR_UUID/stats/?metrics=$metric_name" 
        # Quote the URL so the shell does not treat the '?' as a glob character.
        JSON=$(curl -S -u "$username:$password" -k -X GET --header 'Accept: application/json' "$URL" 2>/dev/null)
        RES=$(echo "$JSON" | jq '.["stats_specific_responses"][0]["values"][0]')
        echo "$RES"
    }
    
    function get_uuid_for_container {
        CTR_NAME=$1
        URL="https://$VIP:9440/PrismGateway/services/rest/v2.0/storage_containers/?search_string=$CTR_NAME"
        JSON=$(curl -S -u "$username:$password" -k -X GET --header 'Accept: application/json' "$URL" 2>/dev/null)
        # jq -r prints the raw string, so there are no quotes to strip with sed.
        RES=$(echo "$JSON" | jq -r '.["entities"][0]["storage_container_uuid"]')
        echo "$RES"
    }
    # Call main function.
    main
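
    The script pushes whatever jq prints. If the API call fails, or the response lacks the expected fields, that will be an empty string or the literal null, neither of which is a valid sample value. A minimal guard, assuming the counters are plain non-negative numbers, might look like the following; is_number and push_if_valid are illustrative names, not part of the original script.

```shell
# Succeed only for a plain non-negative integer or decimal such as 1234 or 12.5.
is_number() {
    case "$1" in
        ''|*[!0-9.]*|*.*.*|.) return 1 ;;
        *) return 0 ;;
    esac
}

# Push "<metric> <value>" only when the value is numeric.
# Usage: push_if_valid <job> <metric> <value>
push_if_valid() {
    if is_number "$3"; then
        printf '%s %s\n' "$2" "$3" |
            curl --silent --show-error --data-binary @- \
                "http://localhost:9091/metrics/job/$1"
    else
        # Nothing is sent; report the skipped sample on stderr.
        echo "skipping $2 for $1: non-numeric value '$3'" >&2
        return 1
    fi
}

# Example, in place of the echo | curl lines in main:
#   push_if_valid "$container" cluster_read_iops "$READ_IOPS"
```

    With this in place a failed lookup is reported on stderr and skipped, and the last good value stays on the gateway until the next successful push.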

    Step 5 View data in prometheus

    We told prometheus about the push-gateway in step-2 by adding the address of the push-gateway in the prometheus.yml file. Now we can start the prometheus server on the mac and take a look at the data we are gathering.

    If it is not running already, start the Prometheus server:
    cd /Users/gary/Downloads/prometheus/prometheus-2.39.0.darwin-amd64
    ./prometheus

    Point your browser at the Prometheus server, e.g. http://localhost:9090/graph. In Prometheus we can graph the stats we are pushing. We get to name them whatever we like in the script; for simplicity we are sending just two, cluster_read_iops and cluster_write_iops. The Prometheus web page shows the values. The stats we get from the API are already rates, so there’s no need to transform them further.
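
    The same values can be checked from the command line through the Prometheus HTTP API, which is handy when no browser is available. This sketch assumes Prometheus is on localhost:9090 as in this demo; query_url is an illustrative helper and only handles simple metric names that need no URL encoding.

```shell
# Address of the Prometheus server; override by exporting PROMETHEUS_URL.
PROMETHEUS_URL="${PROMETHEUS_URL:-http://localhost:9090}"

# Build an instant-query URL for a bare metric name.
query_url() {
    printf '%s/api/v1/query?query=%s\n' "$PROMETHEUS_URL" "$1"
}

# Example (requires Prometheus to be running):
#   curl --silent "$(query_url cluster_read_iops)"
#
# For expressions containing spaces or braces, let curl do the encoding:
#   curl --silent -G "$PROMETHEUS_URL/api/v1/query" \
#        --data-urlencode 'query=cluster_read_iops + cluster_write_iops'
```

    The response is JSON, so it can be piped through jq in the same way as the Prism responses in the script.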

    These metrics reflect the workload generated on the cluster by a single fio instance.

    Once that’s up and running, you can add the metrics that Prometheus now knows about to a Grafana dashboard.

    Now you have metrics from your Nutanix cluster that you can display alongside everything else.
