The Distributed SQL Blog

Thoughts on distributed databases, open source and cloud native

Introducing yugabyted, the Simplest Way to Get Started with YugabyteDB

Users familiar with YugabyteDB’s architecture know that it is based on two types of servers: cluster metadata storage and administrative operations are managed by the YB-Master server, which is independent of the YB-TServer database server. The result is that the YB-Master server can be configured and tuned, in a manner completely isolated from the YB-TServer, to achieve higher performance and faster failure recovery than a single-server architecture. However, the need to bring these two servers together in a cluster increases the mental burden on a new user who wants to experience the database’s benefits in the shortest time possible. Utilities such as yb-ctl have gone a long way toward addressing this challenge for a local cluster running on a single host machine. The two-server challenge comes to the forefront, however, when multi-node clusters need to be deployed across multiple host machines, especially in manual deployments that do not use a cluster orchestrator such as Kubernetes Helm or Terraform.

We are excited to announce the availability of yugabyted, a native server that acts as a parent to the YB-TServer and YB-Master servers. yugabyted’s immediate goal is to remove the day-1 learning curve that can be daunting for new users. Over the next few releases, we intend to make yugabyted the standard way of interacting with YugabyteDB, even for complex day-2 scenarios. Similar to mysqld, the `d` in yugabyted stands for daemon: a single point of entry into the database cluster that can easily be managed as an always-on service.

Diagram: yugabyted acting as the parent server for the YB-Master and YB-TServer servers

Create a single-node cluster

Download and untar the YugabyteDB tar.gz

wget https://downloads.yugabyte.com/yugabyte-2.2.0.0-darwin.tar.gz
tar xvfz yugabyte-2.2.0.0-darwin.tar.gz && cd yugabyte-2.2.0.0/

You can download YugabyteDB for other OS platforms from https://download.yugabyte.com/

Additionally, add a few loopback IP addresses that will allow us to simulate a multi-node YugabyteDB cluster on our laptop.

sudo ifconfig lo0 alias 127.0.0.2
sudo ifconfig lo0 alias 127.0.0.3
sudo ifconfig lo0 alias 127.0.0.4
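The ifconfig alias syntax above is specific to macOS, where only 127.0.0.1 is configured on the loopback interface by default. On Linux, the entire 127.0.0.0/8 range is already routed to the loopback interface, so this step can usually be skipped; if explicit addresses are still wanted, a sketch of the iproute2 equivalent would be:

```shell
# Linux equivalent (assumption: iproute2 is installed; usually unnecessary,
# since Linux routes all of 127.0.0.0/8 to lo by default)
sudo ip addr add 127.0.0.2/8 dev lo
sudo ip addr add 127.0.0.3/8 dev lo
sudo ip addr add 127.0.0.4/8 dev lo
```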

Start node 1

Starting a single-node cluster is simple. Given that there is only one node, the Replication Factor (RF) of the cluster is set to 1. In other words, every tablet (aka shard) has only one copy stored in the cluster.

./bin/yugabyted start

The output of a successful run of the above command should look like the following. The cluster has been started with a single node listening at 127.0.0.1. The data directory for this node is located at <current-dir>/var/data and the logs directory at <current-dir>/var/logs.

Screenshot: output of the yugabyted start command

Review cluster status

You can review the status of the 1-node RF1 cluster using the YB-Master admin UI at http://127.0.0.1:7000/ and YB-TServer admin UI at http://127.0.0.1:9000/. Sample screenshots are available in YugabyteDB Docs. Following is the list of YB-TServers available as shown on the YB-Master admin UI.

Screenshot: list of YB-TServers available, as shown on the YB-Master admin UI
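Besides the web UIs, the node can also be checked from the command line. A minimal sketch, assuming the status subcommand is available in your yugabyted build:

```shell
# Prints the state of the local yugabyted node (data and log directories, ports).
# Assumption: pass --base_dir if the node was started with a non-default base directory.
./bin/yugabyted status
```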

Connect with ysqlsh and load sample data set

Now we can connect to the cluster using ysqlsh and load a sample data set, as highlighted in the Explore YSQL docs.

./bin/ysqlsh
CREATE DATABASE yb_demo;
\c yb_demo;
\i share/schema.sql
\i share/products.sql
\i share/users.sql
\i share/orders.sql
\i share/reviews.sql

Run queries

We can run a basic JOIN query to ensure our data was loaded as expected.

SELECT users.id, users.name, users.email, orders.id, orders.total
          FROM orders INNER JOIN users ON orders.user_id=users.id
          LIMIT 10;

At this point, this single-node cluster makes a good development environment; you can even run it in a Vagrant VM on macOS, with the various ports exposed for access from outside the VM.

From single-node to multi-node with the --join option (BETA)

Now that we have created a single-node cluster, we can use yugabyted’s --join option (currently in BETA) to add 2 more nodes to the same cluster and, in the process, change the Replication Factor of the cluster from 1 to 3. Prior to yugabyted, a single-node RF1 cluster could not be expanded to a 3-node RF3 cluster, so this is an exciting addition to the YugabyteDB feature list.

Start node 2

We start a second node with a listen address of 127.0.0.2 and a directive to join the existing cluster at 127.0.0.1.

./bin/yugabyted start --base_dir=/home/yugabyte-2.2.0.0/data2 --listen=127.0.0.2 --join=127.0.0.1

Now we can see that our original 1-node cluster has expanded to a 2-node cluster while remaining at RF1.

Screenshot: the original 1-node cluster expanded to a 2-node cluster while remaining at RF1

Start node 3

We start a third node with a listen address of 127.0.0.3 and a directive to join the existing cluster at 127.0.0.1. Specifying 127.0.0.2 as the join value would work just as well, since it is the listen address of the second node already in the cluster.

./bin/yugabyted start --base_dir=/home/yugabyte-2.2.0.0/data3 --listen=127.0.0.3 --join=127.0.0.1

Our 2-node cluster has now become a 3-node cluster and the Replication Factor has increased automatically from 1 to 3.

Screenshot: the 2-node cluster has become a 3-node cluster, with the Replication Factor automatically increased from 1 to 3

Connect to node 3 and run queries

We can confirm that this third node is a fully functional member of the cluster by executing some queries for existing data.

./bin/ysqlsh -h 127.0.0.3
\c yb_demo;
-- Run the same JOIN query as in the previous section

Review cluster status

Screenshot: cluster status after adding node 3

Native failover/repair in action

In this section, we will see how YugabyteDB handles failovers and repairs natively without using any external replication library.

Stop node 1

./bin/yugabyted stop

Review cluster status

Screenshot: cluster status after stopping node 1

Connect to node 3 and run queries

./bin/ysqlsh -h 127.0.0.3
\c yb_demo;
-- Run the same JOIN query as in the previous section

The data is preserved even though it was inserted via a node that is no longer available. This is because of YugabyteDB’s self-healing architecture, which elects new shard leaders on the available nodes whenever shard leaders are lost to node failures or network partitions. Even though the cluster is temporarily under-replicated for a few shards, reads and writes continue on the cluster for those shards.
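This fault tolerance follows from simple majority-quorum arithmetic: YugabyteDB replicates each shard with Raft consensus, which keeps the shard available as long as a majority of its replicas are up. A rough sketch of that arithmetic:

```shell
# Majority-quorum arithmetic: a shard with RF replicas needs floor(RF/2)+1
# of them available, so it tolerates RF minus that majority in failures.
majority() { echo $(( $1 / 2 + 1 )); }
failures_tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

failures_tolerated 1   # 0: an RF1 cluster cannot tolerate any node loss
failures_tolerated 3   # 1: an RF3 cluster keeps serving through one node loss
```

This is why expanding from RF1 to RF3 earlier in this tutorial mattered: only at RF3 can the cluster survive stopping node 1.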

Bring back node 1

./bin/yugabyted start --join=127.0.0.2

Screenshot: node 1 rejoining the cluster

We can see that node 1 has rejoined the cluster, which is back to a fully replicated state.

Horizontal write scalability in action

Now we are ready to see horizontal write scalability in action where we add a fourth node and insert new data through that new node.

Start node 4

./bin/yugabyted start --base_dir=/home/yugabyte-2.2.0.0/data4 --listen=127.0.0.4 --join=127.0.0.1

Review cluster status

As we can see below, the Num Nodes (TServers) count changed to 4 but the YB-Masters count remained at 3. This is because the fourth node does not require a new YB-Master server; the existing 3 YB-Master servers already provide resilience against failures in this RF3 cluster.

Screenshot: the Num Nodes (TServers) count changed to 4 while the YB-Masters count remained at 3

Connect to node 4 and execute distributed transactions

./bin/ysqlsh -h 127.0.0.4
\c yb_demo;

Now that you are in the yb_demo database context, run a distributed transaction that simulates a real-world order, with ACID guarantees spanning the new-order creation and the inventory-decrement operations.

BEGIN TRANSACTION;

/* First insert a new order into the orders table. */
INSERT INTO orders
  (id, created_at, user_id, product_id, discount, quantity, subtotal, tax, total)
VALUES (
  (SELECT max(id)+1 FROM orders)                 /* id */,
  now()                                          /* created_at */,
  1                                              /* user_id */,
  2                                              /* product_id */, 
  0                                              /* discount */,
  10                                             /* quantity */,
  (10 * (SELECT price FROM products WHERE id=2)) /* subtotal */,
  0                                              /* tax */,
  (10 * (SELECT price FROM products WHERE id=2)) /* total */
) RETURNING id;

/* Next decrement the total quantity from the products table. */
UPDATE products SET quantity = quantity - 10 WHERE id = 2;

COMMIT;

We can verify that the order got inserted by running the following.

SELECT * from orders WHERE id = (SELECT max(id) FROM orders);

We can also verify that the total quantity of product id 2 in the inventory is 4990 (which is 5000 - 10) by running the following query.

SELECT id, category, price, quantity FROM products WHERE id=2;

Destroy node 4

To make our lives a bit more challenging, let us completely destroy node 4, which we used in the previous step. This simulates a cluster-shrink scenario at an online retailer after the rush of Black Friday and Cyber Monday.

./bin/yugabyted destroy --base_dir=/home/yugabyte-2.2.0.0/data4

Connect to node 3 and run queries

We can now verify that the transaction performed on node 4 prior to its removal is still available in the cluster.

./bin/ysqlsh -h 127.0.0.3
\c yb_demo;
SELECT * from orders WHERE id = (SELECT max(id) FROM orders);
SELECT id, category, price, quantity FROM products WHERE id=2;

Summary

We believe yugabyted is an important first step towards simplifying the new user experience for YugabyteDB. Over the next few releases, we will continue to advance its feature set, including hardening and expanding yugabyted’s multi-node capabilities. We welcome all users to give yugabyted a try today and provide feedback via GitHub and Slack.
