Monitoring YugabyteDB with Prometheus and Grafana in Kubernetes

Andrew Nelson

Prometheus has matured into a robust time-series metrics monitoring solution since it was first open-sourced in 2012. The CNCF accepted it as its second incubated project, after Kubernetes, in 2016, and it graduated in 2018. Today it is arguably the most popular option for monitoring Kubernetes cluster metrics as well as container-based applications. Combined with Grafana for visualization, it makes a potent combination for dashboarding application performance. YugabyteDB nodes have exposed a Prometheus metrics endpoint for easy collection since the inception of the open source project. Yugabyte Platform and Yugabyte Cloud, Yugabyte’s two commercial DBaaS offerings, have extensive monitoring features built on Prometheus.

In this blog post, we will review how you can include Prometheus and Grafana capabilities in your deployed environment. All of the components we will discuss are 100% open source and can be used either in an on-prem or public cloud-based environment.

What’s YugabyteDB? It is an open source, high-performance distributed SQL database built on a scalable and fault-tolerant design inspired by Google Spanner. The Yugabyte SQL API, YSQL, is wire-protocol compatible with PostgreSQL so that developers can easily build new cloud native applications without learning a new database language.

Prometheus can be stood up in a Kubernetes cluster using either the prometheus-operator or a straightforward yaml manifest. For example, consider this yaml excerpt:

spec:
    containers:
    - name: prometheus
      image: prom/prometheus:v2.15.2
      imagePullPolicy: Always
      ports:
        - containerPort: 9090
          protocol: TCP
      volumeMounts:
        - mountPath: "/etc/prometheus"
          name: config-prometheus
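
The volumeMounts entry above expects a matching pod-level volume. A minimal sketch, assuming the configuration is supplied by the prometheus-config ConfigMap shown below:

volumes:
- name: config-prometheus
  configMap:
    name: prometheus-config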

The above launches a single Prometheus container that loads a simple Prometheus configuration. The full Prometheus yaml is provided for reference here. The Prometheus configuration needs to identify scrape targets and their metrics endpoints. For example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: yb-demo
data:
  prometheus.yml: |
    global:
    scrape_configs:
    - job_name: 'yugabytedb'
      metrics_path: /prometheus-metrics
      static_configs:
        - targets: ['yb-master-0.yb-masters.yb-demo.svc.cluster.local:7000', 'yb-master-1.yb-masters.yb-demo.svc.cluster.local:7000', 'yb-master-2.yb-masters.yb-demo.svc.cluster.local:7000']
          labels:
            group: 'yb-master'

The above Prometheus config identifies the source nodes, port, and path for obtaining metrics from the YugabyteDB master processes. The entire yaml example is here.
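
The config above scrapes only the master processes. TServer metrics can be collected with a second scrape job; the following is a sketch, assuming the standard yb-tservers headless service and the default TServer web port 9000:

- job_name: 'yugabytedb-tserver'
  metrics_path: /prometheus-metrics
  static_configs:
    - targets: ['yb-tserver-0.yb-tservers.yb-demo.svc.cluster.local:9000', 'yb-tserver-1.yb-tservers.yb-demo.svc.cluster.local:9000', 'yb-tserver-2.yb-tservers.yb-demo.svc.cluster.local:9000']
      labels:
        group: 'yb-tserver'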

The Grafana deployment is similarly straightforward: Grafana runs as another single container specified in a yaml manifest:

spec:
      containers:
      - name: grafana
        image: grafana/grafana
        ports:
        - containerPort: 3000
          protocol: TCP
        volumeMounts:
        - mountPath: /var/lib/grafana
          name: grafana-storage
        env:
        - name: GF_SERVER_HTTP_PORT
          value: "3000"
        - name: GF_SECURITY_ADMIN_PASSWORD
          value: "admin"
        - name: GF_INSTALL_PLUGINS
          value: "grafana-kubernetes-app"
      volumes:
      - name: grafana-storage
        emptyDir: {}

The source file is here. You could supplement this with a PersistentVolumeClaim like the following:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    component: grafana
  name: grafana-claim
  namespace: yb-demo
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
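
To put the claim to use, the emptyDir volume in the Grafana manifest above would be swapped for a persistentVolumeClaim reference; a minimal sketch:

volumes:
- name: grafana-storage
  persistentVolumeClaim:
    claimName: grafana-claim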

With Grafana as the visualization component, you can expose it via a LoadBalancer service:

apiVersion: v1
kind: Service
metadata:
  name: grafana-service
  namespace: yb-demo
spec:
  type: LoadBalancer
  selector:
    app: grafana
  ports:
  - protocol: TCP
    port: 3000
    targetPort: 3000
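
If your environment does not provide a load balancer, kubectl port-forwarding is a workable alternative for testing (assuming the yb-demo namespace used throughout):

kubectl port-forward svc/grafana-service 3000:3000 -n yb-demo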

Getting Started with Prometheus, Grafana, and YugabyteDB

An application that you can monitor across the entire stack is the YugabyteDB-powered microservices demo, Hipster Shop. Proceed with standing up YugabyteDB and the Hipster Shop demo as documented here.

From there, deploying both Prometheus and Grafana from the included monitoring directory takes a single step:

kubectl create -f monitoring/
kubectl get svc
grafana-service         LoadBalancer   10.0.7.82     35.239.65.174   3000:32386/TCP                                 15m

You should be able to view the Grafana login page via a web browser on port 3000 of the indicated IP.

The default credentials are admin/admin; you will be prompted to change the admin password after the initial login. Next, identify the Prometheus data source by its in-cluster service name in Kubernetes:

kubectl get svc |grep prom
prometheus              ClusterIP      10.0.5.150              9090/TCP                                       19m
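
In the Grafana data source settings, the URL follows from the service name and namespace; assuming the yb-demo namespace used in these manifests, it would be:

http://prometheus.yb-demo.svc.cluster.local:9090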

[Screenshot: configuring the Prometheus data source in Grafana]

Now we can add a dashboard to view our metrics:

[Screenshot: creating a new dashboard in Grafana]

Setting up the static scrape targets gives you access to the YugabyteDB metrics from both the Master and TServer processes. For example:

[Screenshot: YugabyteDB Master and TServer metrics available in Grafana]

Now if we want to generate traffic to our YugabyteDB instance, we can leverage the loadgenerator service that is part of the Hipster Shop deployment. You can edit the number of simulated users in kubernetes-manifests/loadgenerator.yaml:

[Screenshot: editing the user count in kubernetes-manifests/loadgenerator.yaml]
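
For reference, the relevant excerpt looks roughly like the following; the USERS variable comes from the upstream Hipster Shop manifest, so treat the exact names as an assumption:

env:
- name: USERS            # number of concurrent simulated users
  value: "10"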

Now when we deploy Hipster Shop, we’ll see the activity reflected in our dashboard. For example, if we are looking at select statements, we can use handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_sum.

Selecting this metric, we can see select statement activity across the 3 TServers in our deployment.

[Screenshot: select statement activity across the three TServers]

You can also look at Write IOPS with:

sum(irate(handler_latency_yb_ysqlserver_SQLProcessor_InsertStmt_count[1m]))
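
Following the same pattern, Read IOPS can be charted with:

sum(irate(handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count[1m]))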

Or read latency:

avg(irate(handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_sum[1m])) /
avg(irate(handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count[1m]))

Or write latency:

avg(irate(handler_latency_yb_ysqlserver_SQLProcessor_InsertStmt_sum[1m])) /
avg(irate(handler_latency_yb_ysqlserver_SQLProcessor_InsertStmt_count[1m]))

Another way to generate a synthetic workload is with our sample apps repo, available here. For example:

kubectl run --image=yugabytedb/yb-sample-apps --restart=Never java-client -- --workload SqlInserts --nodes $YSQL_IP --num_threads_write 1 --num_threads_read 4 --value_size 4096
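
Here $YSQL_IP needs to resolve to a YSQL endpoint in host:port form. One hedged way to populate it, assuming the Helm chart’s yb-tserver-service LoadBalancer in the yb-demo namespace (YSQL listens on port 5433):

YSQL_IP=$(kubectl get svc yb-tserver-service -n yb-demo -o jsonpath='{.status.loadBalancer.ingress[0].ip}'):5433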

For other examples, please check out our observability page here.

See It in Action

Check out the demo video on getting started with monitoring YugabyteDB with Prometheus and Grafana to see it in action.

Future Work

Building on the above effort, we would like to monitor the kube-state-metrics of the Kubernetes cluster as well as the metrics of other Prometheus-compatible services. This would leverage the published Grafana dashboards available here. We would also like to centralize our monitoring of multiple Prometheus instances via CNCF’s Cortex project.
