The Distributed SQL Blog

Thoughts on distributed databases, open source and cloud native

Announcing the Rook Operator for YugabyteDB

We are excited to congratulate the Rook community on the release of 1.1! We are also pleased to announce that the Rook operator for YugabyteDB is now available from rook.io and on GitHub. This release extends the Rook storage operator with a custom resource for YugabyteDB and provides an additional way to easily create, natively view, and manage YugabyteDB within a Kubernetes cluster. In this blog we’ll summarize how to get started with the operator.

What’s YugabyteDB? It is an open source, high-performance distributed SQL database built on a scalable and fault-tolerant design inspired by Google Spanner. YugabyteDB’s SQL API (YSQL) and drivers are PostgreSQL wire-protocol compatible.

YugabyteDB + Rook = Awesome!

YugabyteDB is a high-performance, distributed SQL database that runs natively in Kubernetes. It supports most PostgreSQL features, including advanced features such as window functions, stored procedures, triggers and extensions. YugabyteDB can horizontally scale reads and writes, meaning you simply add more pods when you need to handle more queries. YugabyteDB can also be deployed in geo-distributed configurations (multi-zone, multi-region, multi-cloud), including across multiple Kubernetes clusters.

The Rook framework simplifies the deployment and management of a YugabyteDB cluster. Rook makes it very easy to build a dedicated Kubernetes CRD and an operator for YugabyteDB. In turn, this YugabyteDB operator simplifies both the deployment and management of the underlying YugabyteDB clusters running on any Kubernetes environment, simply by using kubectl.

In addition to the initial deployment of storage resources, operators can automate complex and tedious management tasks that would otherwise have to be performed by a person responsible for running the cluster. Examples of such tasks include taking periodic backups of data, responding to health check failures, and scaling out the cluster when the number of queries spikes. Operators codify and automate such operational expertise, enabling reliable management of the YugabyteDB cluster while cutting down the number of on-call incidents. Better service with less work: what’s not to like!

Creating a cluster

As a custom resource definition (CRD), ‘ybclusters’ are now first-class objects within the Kubernetes API. This is enabled by adding the operator.yaml to your cluster of choice. The YugabyteDB deployment here will in turn create a ‘hello-ybdb-cluster’ with a replication factor (RF) of 3. To create this cluster, simply run the following:

kubectl create -f operator.yaml
kubectl create -f cluster.yaml
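
Pods can take a little while to start after the two create commands. The following is a minimal sketch of one way to wait for them, using `kubectl wait` against the rook-yugabytedb namespace. The function name, the `KUBECTL` override variable, and the 300s default timeout are illustrative assumptions, not part of the operator.

```shell
# Sketch only: KUBECTL and wait_for_yb_pods are illustrative names.
# KUBECTL can be overridden to point at a different kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

# Block until every pod in the namespace reports Ready, or the timeout expires.
wait_for_yb_pods() {
  ns="${1:-rook-yugabytedb}"
  timeout="${2:-300s}"
  "$KUBECTL" wait -n "$ns" --for=condition=Ready pod --all --timeout="$timeout"
}

# Usage (requires a reachable cluster):
#   wait_for_yb_pods rook-yugabytedb 300s
```

If the pods have not been created yet, `kubectl wait` may report that no matching resources were found, in which case simply retry after a few seconds.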

And that’s it! Once you have created your cluster and ensured that the pods are running, you can natively query the ‘ybclusters’ object, as shown below:

$ kubectl -n rook-yugabytedb get ybclusters
NAME                 AGE
hello-ybdb-cluster   3h

You can also see more details of the cluster as shown below:

$ kubectl -n rook-yugabytedb describe ybclusters
Name:         hello-ybdb-cluster
Namespace:    rook-yugabytedb
Labels:       <none>
Annotations:  <none>
API Version:  yugabytedb.rook.io/v1alpha1
Kind:         YBCluster
...

In addition, you can view port settings and log in to the database after deployment:

$ kubectl get services -n rook-yugabytedb
NAME                               TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                                        AGE
yb-master-ui-hello-ybdb-cluster    ClusterIP   10.84.5.159    <none>        7000/TCP                                       3h54m
yb-masters-hello-ybdb-cluster      ClusterIP   None           <none>        7000/TCP,7100/TCP                              3h54m
yb-tserver-ui-hello-ybdb-cluster   ClusterIP   10.84.12.151   <none>        9000/TCP                                       3h54m
yb-tservers-hello-ybdb-cluster     ClusterIP   None           <none>        9000/TCP,9100/TCP,9042/TCP,6379/TCP,5433/TCP   3h54m
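
Because the UI services in this listing are of type ClusterIP, they are not reachable from outside the cluster by default. A minimal sketch of one way to reach the YB-Master UI from a workstation is `kubectl port-forward`. The function name and the `KUBECTL` override are assumptions made for illustration; the service name and port 7000 come from the listing above.

```shell
# Sketch only: KUBECTL and forward_master_ui are illustrative names.
KUBECTL="${KUBECTL:-kubectl}"

# Forward a local port to port 7000 of the YB-Master UI service.
forward_master_ui() {
  ns="${1:-rook-yugabytedb}"
  local_port="${2:-7000}"
  "$KUBECTL" port-forward -n "$ns" svc/yb-master-ui-hello-ybdb-cluster "${local_port}:7000"
}

# Usage (requires a reachable cluster; runs until interrupted):
#   forward_master_ui rook-yugabytedb 7000
# and then open http://localhost:7000 in a browser.
```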

Connecting to the cluster

To connect to the database cluster using the command line shell, you can run the following:

$ kubectl exec -n rook-yugabytedb -it yb-tserver-hello-ybdb-cluster-0 -- /home/yugabyte/bin/ysqlsh -h yb-tserver-hello-ybdb-cluster-0 --echo-queries

ysqlsh (11.2)
Type "help" for help.

postgres=#

Next, create a table:

postgres=# CREATE TABLE test (coltest varchar(20));
CREATE TABLE test (coltest varchar(20));
CREATE TABLE

INSERT a row into the table:

postgres=# insert into test (coltest) values ('It works!');
insert into test (coltest) values ('It works!');
INSERT 0 1

SELECT the row from the table:

postgres=# SELECT * from test;
SELECT * from test;
coltest
-----------
It works!
(1 row)

You can try out more queries on this cluster as well.
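
For scripted checks it can be handy to run a single statement without opening an interactive shell. Here is a minimal sketch, assuming that ysqlsh accepts the `-c` option in the same way as psql, from which it is derived. The function name and the `KUBECTL` override are illustrative; the pod name and binary path are the ones used above.

```shell
# Sketch only: KUBECTL and run_ysql are illustrative names.
KUBECTL="${KUBECTL:-kubectl}"

# Run one SQL statement through ysqlsh inside the first tserver pod.
run_ysql() {
  pod="yb-tserver-hello-ybdb-cluster-0"
  "$KUBECTL" exec -n rook-yugabytedb "$pod" -- \
    /home/yugabyte/bin/ysqlsh -h "$pod" -c "$1"
}

# Usage (requires a reachable cluster):
#   run_ysql "SELECT count(*) FROM test;"
```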

Customizing the Cluster

Note that a number of parameters can be configured by overriding the defaults. The sections below point out some of these customizations.

Aggregate Cluster Resources

YugabyteDB pools together the resources across all the pods in a cluster. You can change the number of pods in the YugabyteDB cluster you just created from the default of 3 to the desired value by editing the following attribute in the cluster.yaml file:

  tserver:
    replicas: 3

The tserver attribute configures the YugabyteDB Tablet Server (YB-TServer) processes in the cluster, and its replicas value sets how many of them run. These processes are responsible for hosting and serving user data (e.g., tables). In a nutshell, they deal with all user queries.
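
If you prefer to script this change instead of editing the file by hand, the following is a minimal sketch that rewrites the replicas value under the tserver key using sed. It assumes that a replicas line follows the tserver key, as in the fragment above; the function name is hypothetical. Whether the operator reconciles a changed replica count on an already-running cluster depends on the operator version, so treat re-applying the file with kubectl as an assumption to verify.

```shell
# Sketch only: set_tserver_replicas is an illustrative helper.
# Rewrites the first "replicas:" value that follows the "tserver:" key,
# leaving the master section untouched.
set_tserver_replicas() {
  file="$1"
  count="$2"
  sed "/^[[:space:]]*tserver:/,/replicas:/ s/replicas:[[:space:]]*[0-9][0-9]*/replicas: ${count}/" \
    "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
}

# Usage:
#   set_tserver_replicas cluster.yaml 5
```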

Configure Dynamic Volumes

You can also customize the dynamic volume size and storageclass for both master and tserver services:

spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard

The YB-Master (aka the YugabyteDB Master Server) processes are responsible for keeping system metadata, coordinating system-wide operations such as create/alter/drop tables, and initiating maintenance operations such as load balancing.

SSDs are the best storage option for YugabyteDB, so make sure that your storageclass maps to SSD-backed volumes where your cluster supports them. For example, on GKE the ‘standard’ (default) storageclass provisions HDD-backed disks only. To add an SSD storageclass, you could apply the following ssd-storageclass.yaml to your GKE cluster:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: faster
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd

After creating the storageclass, change the storageClassName in the cluster.yaml to “faster” or the appropriate name you added to your ssd-storageclass.yaml.
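
The storageClassName edit can also be scripted. Below is a minimal sketch, assuming that every storageClassName line in cluster.yaml (for both master and tserver) should point at the same class; the function name is hypothetical.

```shell
# Sketch only: set_storage_class is an illustrative helper.
# Replaces the value of every "storageClassName:" line, keeping indentation.
set_storage_class() {
  file="$1"
  class="$2"
  sed "s/^\([[:space:]]*storageClassName:\).*/\1 ${class}/" \
    "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
}

# Usage (requires a reachable cluster for the kubectl step):
#   kubectl apply -f ssd-storageclass.yaml
#   set_storage_class cluster.yaml faster
```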

What’s Next?

  1. Read more about running YugabyteDB using Rook in our documentation.
  2. Explore the core features of YugabyteDB.

For any issues, contributions, or questions, please let us know on GitHub. As YugabyteDB is built for multi-cloud deployments, future work with projects such as Crossplane.io is a natural fit. We look forward to continuing to extend and integrate YugabyteDB across the Kubernetes ecosystem!
