Announcing the Rook Operator for YugabyteDB

We are excited to congratulate the Rook community on the release of 1.1! We are also pleased to announce that the Rook operator for YugabyteDB is now available from rook.io and on GitHub. This release extends the Rook storage operator with a YugabyteDB custom resource and provides an additional way to easily create, natively view and manage YugabyteDB within a Kubernetes cluster. In this blog we’ll summarize how to get started with the operator.

What’s YugabyteDB? It is an open source, high-performance distributed SQL database built on a scalable and fault-tolerant design inspired by Google Spanner. YugabyteDB’s SQL API (YSQL) and drivers are PostgreSQL wire-protocol compatible.

YugabyteDB + Rook = Awesome!

YugabyteDB is a high-performance, distributed SQL database that runs natively in Kubernetes. It supports most PostgreSQL features, including advanced features such as window functions, stored procedures, triggers and extensions. YugabyteDB can horizontally scale reads and writes, meaning you simply add more pods when you need to handle more queries. YugabyteDB can also be deployed in geo-distributed configurations (multi-zone, multi-region, multi-cloud), including across multiple Kubernetes clusters.

The Rook framework simplifies the deployment and management of a YugabyteDB cluster. Rook makes it very easy to build a dedicated Kubernetes CRD and an operator for YugabyteDB. In turn, this YugabyteDB operator simplifies both the deployment and management of the underlying YugabyteDB clusters running in any Kubernetes environment, using nothing more than kubectl.

In addition to the initial deployment of storage resources, operators can automate complex and tedious management tasks that would otherwise have to be performed by the person responsible for running the cluster. Examples of such tasks include periodic backups of data, responding to health check failures, and scaling out the cluster when the number of queries spikes. Operators codify and automate such operational expertise, enabling reliable management of the YugabyteDB cluster while cutting down the number of on-call incidents. Better service with less work: what’s not to like!

Creating a cluster

As a custom resource definition (CRD), ‘ybclusters’ are now first-class objects within the Kubernetes API. This is enabled by adding the operator.yaml to your cluster of choice. The YugabyteDB deployment here will in turn create a ‘hello-ybdb-cluster’ with a replication factor of 3 (RF=3). To create this cluster, simply run the following:
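A minimal sketch of these steps, assuming operator.yaml and cluster.yaml are in the current directory. The two namespace names are assumptions based on the Rook example manifests, and the `create_yb_cluster` helper is a hypothetical name used only for illustration:

```shell
# Namespace names are assumptions taken from the Rook example manifests;
# adjust them if your operator.yaml or cluster.yaml uses different ones.
OPERATOR_NS="rook-yugabytedb-system"
CLUSTER_NS="rook-yugabytedb"

# Hypothetical helper: nothing runs until the function is called.
create_yb_cluster() {
  # Register the ybclusters CRD and start the operator.
  kubectl create -f operator.yaml
  # Create the hello-ybdb-cluster resource.
  kubectl create -f cluster.yaml
  # Check that the operator pod and the database pods come up.
  kubectl -n "$OPERATOR_NS" get pods
  kubectl -n "$CLUSTER_NS" get pods
}
```

After sourcing the snippet, run `create_yb_cluster` and repeat the two `get pods` commands until every pod reports Running.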

And that’s it! Once you have created your cluster and ensured that the pods are running, you can natively query the ‘ybclusters’ object, as shown below:

You can also see more details of the cluster as shown below:

In addition, you can view port settings and log in to the database after deployment:
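The three inspection steps above can be sketched as one helper. The namespace is an assumption based on the Rook example manifests, and `inspect_yb_cluster` is a hypothetical name; the kubectl subcommands themselves are standard:

```shell
CLUSTER_NS="rook-yugabytedb"       # assumed namespace
CLUSTER_NAME="hello-ybdb-cluster"  # cluster name used in this post

# Hypothetical helper: nothing runs until the function is called.
inspect_yb_cluster() {
  # List ybclusters objects like any other Kubernetes resource.
  kubectl -n "$CLUSTER_NS" get ybclusters
  # Show the detailed description of one cluster.
  kubectl -n "$CLUSTER_NS" describe ybclusters "$CLUSTER_NAME"
  # List the services, including the ports they expose.
  kubectl -n "$CLUSTER_NS" get services
}
```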

Connecting to the cluster

To connect to the database cluster using the command line shell, you can run the following:
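As a hedged sketch, the connection can be made by running ysqlsh inside one of the tserver pods. The namespace is an assumption, `connect_ysqlsh` is a hypothetical helper, and the pod name is deliberately left as an argument: look it up with `kubectl get pods` rather than relying on a guessed name.

```shell
CLUSTER_NS="rook-yugabytedb"  # assumed namespace

# Hypothetical helper. Usage: connect_ysqlsh <tserver-pod-name>
connect_ysqlsh() {
  pod="${1:?usage: connect_ysqlsh <tserver-pod-name>}"
  # ysqlsh ships in the YugabyteDB container image under /home/yugabyte/bin.
  kubectl -n "$CLUSTER_NS" exec -it "$pod" -- \
    /home/yugabyte/bin/ysqlsh -h "$pod" --echo-queries
}
```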

Next, create a table:

INSERT a row into the table:

SELECT the row from the table:
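A minimal sketch of the three statements above, written to a local file. The `users` table, its columns and the sample row are hypothetical and chosen only for illustration; any PostgreSQL-compatible schema would do:

```shell
# Table, column names and values are hypothetical examples.
cat > hello.sql <<'EOF'
CREATE TABLE users (id int PRIMARY KEY, name text, age int);
INSERT INTO users (id, name, age) VALUES (1, 'John', 35);
SELECT * FROM users WHERE id = 1;
EOF
```

The statements can be typed at the ysqlsh prompt one at a time, or the file can be fed to ysqlsh on standard input inside a tserver pod using `kubectl exec -i`.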

You can try out more queries on this cluster as well.

Customizing the Cluster

Note that a number of parameters can be configured by overriding the defaults. The sections below point out some of these customizations.

Aggregate Cluster Resources

YugabyteDB pools together the resources across all the pods in a cluster. You can customize the number of pods from 3 to the desired value in the YugabyteDB cluster you just created by editing the following attribute in the cluster.yaml file:

The attribute tserver refers to the number of YugabyteDB Tablet Server processes in the cluster, which are responsible for hosting and serving user data (e.g., tables). In a nutshell, they deal with all user queries.

Configure Dynamic Volumes

You can also customize the dynamic volume size and storageclass for both master and tserver services:
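To make the replica count and the volume settings concrete, here is an abridged sketch of a cluster manifest. The field names follow our recollection of the Rook example cluster.yaml and should be treated as assumptions to check against the file that ships with the operator. Network port settings are omitted, and the file name cluster-sketch.yaml is hypothetical so that the real cluster.yaml is not overwritten:

```shell
# Abridged, assumption-laden sketch; verify every field against the
# cluster.yaml shipped with the Rook operator before applying anything.
cat > cluster-sketch.yaml <<'EOF'
apiVersion: yugabytedb.rook.io/v1alpha1
kind: YBCluster
metadata:
  name: hello-ybdb-cluster
  namespace: rook-yugabytedb
spec:
  master:
    replicas: 3
    volumeClaimTemplate:
      metadata:
        name: datadir
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi            # dynamic volume size per master pod
        storageClassName: standard  # storageclass for master volumes
  tserver:
    replicas: 3                     # number of tablet server pods
    volumeClaimTemplate:
      metadata:
        name: datadir
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi            # dynamic volume size per tserver pod
        storageClassName: standard  # storageclass for tserver volumes
EOF
```

Raising `replicas` under `tserver` adds tablet server pods, while `storage` and `storageClassName` control the size and class of the volume each pod claims.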

The YB-Master (aka the YugabyteDB Master Server) processes are responsible for keeping system metadata, coordinating system-wide operations such as create/alter/drop tables, and initiating maintenance operations such as load-balancing.

SSDs are the best option for running YugabyteDB, so ensure your storageclass is mapped appropriately for your cluster implementation. For example, on GKE the ‘standard’ (default) storageclass is backed by HDD. To add an SSD storageclass, you could introduce the following to your GKE cluster in an ssd-storageclass.yaml:
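A sketch of such a file, using the in-tree GCE persistent disk provisioner and the `pd-ssd` disk type. The name `faster` is simply the label this post uses. Newer GKE versions provision disks through the CSI driver instead, so treat the provisioner value as an assumption to adapt:

```shell
# Write an SSD-backed storageclass definition for GKE.
cat > ssd-storageclass.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: faster
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
EOF
```

Create it in the cluster with `kubectl apply -f ssd-storageclass.yaml`.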

After creating the storageclass, change the storageClassName in the cluster.yaml to “faster” or the appropriate name you added to your ssd-storageclass.yaml.

What’s Next?

  1. Read more about running YugabyteDB using Rook in our documentation.
  2. Explore the core features of YugabyteDB.

For any issues, contributions, or questions, please let us know via GitHub. As YugabyteDB is built for multi-cloud deployments, future work with projects such as Crossplane.io is a natural fit. We look forward to continuing to extend and integrate YugabyteDB across the Kubernetes ecosystem!
