The Distributed SQL Blog

Thoughts on distributed databases, open source and cloud native

Getting Started with Hasura GraphQL Remote JOINs on Multi-Cloud Distributed SQL

VP Developer Relations

Remote Joins in Hasura GraphQL extend the concept of joining data across tables, to being able to join data across tables and remote data sources. In this blog post we are going to demonstrate this capability by configuring the following set up.

  • A 3 node YugabyteDB cluster running on GKE with a Hasura GraphQL Engine attached
  • A 3 node YugabyteDB cluster running on AKS with a Hasura GraphQL Engine attached
  • A Remote Schema and Remote Relationship configured
  • The ability to issue GraphQL queries that join data from two different databases, hosted in two different clouds, into a single result set

how to get started with hasura graphql remote joins on multicloud distributed sql with yugabytedb

First things first, for those of you who might be new to either distributed SQL or YugabyteDB…

What Is Distributed SQL?

Distributed SQL databases are becoming popular with organizations interested in moving data infrastructure to the cloud or to cloud native environments. This is often motivated by the desire to reduce TCO or move away from the scaling limitations of monolithic RDBMS like Oracle, MySQL, and SQL Server. The basic characteristics of Distributed SQL are:

  • A SQL API for querying and modeling data, with support for traditional RDBMS features like primary keys, foreign keys, indexes, stored procedures, and triggers.
  • Automatic distributed query execution so that no single node becomes a bottleneck.
  • A distributed SQL database should support automatically distributed data storage. This includes indexes which should be automatically distributed (aka sharded) across multiple nodes of the cluster so that no single node becomes a bottleneck for ensuring high performance and high availability.
  • Distributed SQL systems should also provide for strongly consistent replication and distributed ACID transactions.

For a deeper discussion about what Distributed SQL is, check out, “What is Distributed SQL?”

What Is YugabyteDB?

YugabyteDB is an open source, high-performance distributed SQL database built on a scalable and fault-tolerant design inspired by Google Spanner. YugabyteDB is PostgreSQL wire compatible and supports GraphQL along with advanced RDBMS features like stored procedures, triggers, and UDFs.

Got questions? Make sure to ask them in our YugabyteDB Slack channel. Ok, let’s dive in…

Step 1: Install YugabyteDB on GKE

Here are the complete steps to install a 3 node YugabyteDB cluster on Google Kubernetes Engine. Once you have a YugabyteDB cluster running on GKE, check the status of the cluster by executing the command below:

$ kubectl get services --namespace yb-demo

NAME					EXTERNAL-IP		PORT(S)
yb-tserver-service   	35.224.XX.XX    	5433:3067/TCP

Note the external IP for yb-tserver-service which we are going to use to establish a connection between YugabyteDB and Hasura GraphQL Engine. From the output above we can see that the IP is 35.224.XX.XX and the YSQL port is 5433.

Step 2: Create the Northwind Sample Database

In our documentation, you can find a variety of sample databases that are compatible with YugabyteDB. For the purposes of this tutorial we are going to use the Northwind sample database. The Northwind database contains the sales data for a fictitious company called “Northwind Traders,” which imports and exports specialty foods from around the world. The Northwind database is an excellent tutorial schema for a small-business ERP, with customers, orders, inventory, purchasing, suppliers, shipping, employees, and single-entry accounting.

Connect to the yb-tserver-pod by running the following command:

$ kubectl exec -n yb-demo -it yb-tserver-0 /bin/bash

To download the schema and data files, run the following commands:

$ wget https://raw.githubusercontent.com/yugabyte/yugabyte-db/master/sample/northwind_ddl.sql
$ wget https://raw.githubusercontent.com/yugabyte/yugabyte-db/master/sample/northwind_data.sql

Note: If the Google Cloud Shell tells you that the wget command does not exist, you can execute:

$ yum install wget -y

To connect to the YSQL service, exit out of the pod shell and run the following command:

$ exit 

$ kubectl exec -n yb-demo -it yb-tserver-0 -- ysqlsh -h yb-tserver-0.yb-tservers.yb-demo

Create a database and connect to it using the following commands:

yugabyte=# CREATE DATABASE northwind;
northwind=# \c northwind;

We can now create the database objects and load them with data using the files we downloaded to yb-tserver-pod using the following commands:

northwind=# \i 'northwind_ddl.sql';
northwind=# \i 'northwind_data.sql';

By default, a YugabyteDB installation doesn’t have a password setup for the default yugabyte user. Specifying one is done the same way you’d do it in PostgreSQL.

northwind=# ALTER ROLE yugabyte WITH PASSWORD 'password';

Step 3: Set Up Hasura on GKE to Use YugabyteDB

We are now ready to install the Hasura GraphQL Engine on GKE. Exit the YSQL shell and get the Hasura Kubernetes deployment and service files by executing the commands below.

$ wget https://raw.githubusercontent.com/hasura/graphql-engine/master/install-manifests/kubernetes/deployment.yaml
$ wget https://raw.githubusercontent.com/hasura/graphql-engine/master/install-manifests/kubernetes/svc.yaml

Modify the database URL in deployment.yaml file to include the IP of YugabyteDB. For the purposes of this tutorial, the modification should look like this:

value: postgres://yugabyte:[email protected]:5433/northwind

Note: If you’d like everything to run in the yb-demo namespace, make sure to modify the namespace value in both the deployment.yaml and svc.yaml files. This is the setup I have chosen for this demo.

After saving the file, use kubectl to create a Hasura deployment using the commands below:

$ kubectl create -f deployment.yaml

deployment.apps/hasura created
$ kubectl create -f svc.yaml

service/hasura created

To find the external IP and open the Hasura console execute the command below:

$ kubectl get services --namespace yb-demo 

NAME       EXTERNAL-IP     PORT(S)                                    hasura     35.192.XX.XX    80:30546/TCP

Now use http://<EXTERNAL-IP>/console to access the Hasura console. In this case it is 35.192.XX.XX. You should now see the Hasura console.

access the hasura console distributed sql multicloud tutorial blog post yugabytedb

Step 4: Track the Northwind Objects in Hasura

Click on Data in top nav, click on Track All for tables, views, and foreign key relations. Refresh your browser and you should now see the northwind tables in the Explorer tree on the left-hand side when you click on GraphQL in the top nav.

blog post tutorial yugabytedb distributed sql hasura console with northwind sample database

Step 5: Install YugabyteDB on AKS

Here are the complete steps to install a 3 node YugabyteDB cluster on Azure Kubernetes Service. Once you have a YugabyteDB cluster running on AKS, check the status of the cluster by executing the command below:

$ kubectl get services --namespace yb-demo

NAME					EXTERNAL-IP		PORT(S)
yb-tserver-service   	20.190.XX.XX   	5433:3067/TCP

Note the external IP for yb-tserver-service which we are going to use to establish a connection between YugabyteDB and Hasura GraphQL Engine. From the output above we can see that the IP is 20.190.XX.XX and the YSQL port is 5433.

Step 6: Install the Promotions Database

In this step we are going to create a database called promotions, as well as a table called promotions. We’ll then load it with data about what types of promotions there are, what company is sponsoring them, and the duration of the promotion.

Log into the YSQL shell:

$ kubectl exec -n yb-demo -it yb-tserver-0 -- ysqlsh -h yb-tserver-0.yb-tservers.yb-demo

Create a database and connect to it using the following commands:

yugabyte=# CREATE DATABASE promotions;
yugabyte=# \c promotions;

By default, a YugabyteDB installation doesn’t have a password setup for the default yugabyte user. Specifying one is done the same way you’d do it in PostgreSQL.

promotions=# ALTER ROLE yugabyte WITH PASSWORD 'password';

Create the following table in the promotions database.

CREATE TABLE promotions (
	promotion_id serial PRIMARY KEY,
	supplier_id int2 NOT NULL,
	promotion_name VARCHAR ( 50 ) NOT NULL,
	company_name VARCHAR ( 40 ) NOT NULL,
	start_date DATE NOT NULL,
	stop_date DATE NOT NULL,
	created_on DATE NOT NULL DEFAULT CURRENT_DATE
);

Populate the promotions table with the following data.

INSERT INTO promotions (supplier_id, promotion_name, company_name, start_date, stop_date, created_on)
values 
('28','Buy One Get One Free', 'Gai pâturage', '1996-08-15', '1996-08-30', CURRENT_DATE),
('19', 'Free Shipping', 'New England Seafood Cannery', '1996-10-15', '1996-10-30', CURRENT_DATE),
('16', 'Free T-Shirt With Every Purchase', 'Bigfoot Breweries', '1996-12-15', '1996-12-30', CURRENT_DATE),
('20','10% Off', 'Gula Malacca', '1996-11-15', '1996-11-30', CURRENT_DATE),
('8', '20% Off', 'Specialty Biscuits, Ltd.', '1996-09-15', '1996-09-30', CURRENT_DATE);

Step 7: Set Up Hasura on AKS to Use YugabyteDB

We are now ready to install the Hasura GraphQL Engine on Azure Kubernetes Service. Fortunately, it is very easy to get Hasura up and running on AKS and pointed to our YugabyteDB cluster with a simple wizard. Use this “one-click deployment” to set up the particulars. After selecting the option to connect to an existing database, you should see the following.

install hasura graphql on aks distributed sql yugabytedb tutorial

Note that we are specifying the external IP and port of our YugabyteDB cluster. In this case 20.189.XX.XX and 5433. We are also specifying the yugabyte user and the promotions database in the settings.

After Hasura is successfully deployed on AKS, locate the external IP for the console. In my case I looked in the Azure Portal under All Services > Resource groups > /Resource Group/ > hasura-promotions-container-group and saw that 52.137.XX.XX was the public IP assigned to my Hasura GraphQL Engine. With this information you should now be able to open up the Hasura console and track the promotions table. Once that’s accomplished, you should now see the promotions table in your Explorer as shown below.

open up the Hasura console and track the promotions table in the Explorer AKS GKE yugabytedb distributed sql tutorial

Step 8: Set Up a Remote Schema and Remote Relationship

On the Hasura console running on GKE, navigate to the Remote Schemas tab, click on Add and provide the necessary information for the Hasura server running on AKS.

add hasura on the Remote Schemas tab

Next, let’s add a Remote Relationship on the supplier_id table between the suppliers and promotions table that we can leverage in our remote JOIN. You can accomplish this by navigating to Data > suppliers > Relationships > Remote Schema Relationships and specify the relationship.

Remote Schema Relationships and specify the relationship yugabytedb hasura gke aks tutorial example

Step 9: Issue a Remote Join Across Two Databases

We are now finished with setting up a Remote Schema to the promotions database and a Remote Relationship on supplier_id. We can now issue a query that processes data located in two different YugabyteDB databases, hosted on two different clouds. For the purposes of this demo, we want to issue a Remote Join GraphQL query that will allow us to determine if a supplier in the “primary” Northwind database has a promotion scheduled in the “remote” Promotions database.

For example, in the query below we want to know if supplier “16” is running any promotions, and if so, what is the name of the promotion and for what dates is the promotion active.

distributed sql with yugabytedb and graphql with hasura on GKE and AKS example of query - is supplier “16” running any promotions

The GraphQL query should return the following results.

GraphQL query results example yugabytedb distributed sql hasura graphql gke and aks

If we break down the GraphQL query, here’s what is happening.

From the suppliers table, which is hosted on a YugabyteDB cluster running on GKE, we are returning:

  • supplier_id = "16"
  • company_name = "Bigfoot Breweries"
  • contact _name = "Cheryl Saylor"
  • phone = "(503) 555-9931"

We then make use of the Remote Schema promotions_remote that has a relationship on supplier_id to allow us to return the following data from the promotions table, which is hosted on a YugabyteDB cluster running on AKS.

  • promotion_name = "Free T-Shirt With Every Purchase"
  • start_date = "1996-12-15"
  • stop_date = "1996-12-30"

Conclusion

That’s it, if you followed all the instructions in this blog post, you now have the following configured.

  • 3 node YugabyteDB cluster running on GKE with a Hasura GraphQL Engine attached
  • 3 node YugabyteDB cluster running on AKS with a Hasura GraphQL Engine attached
  • A Remote Schema and Remote Relationship configured
  • The ability to issue GraphQL queries that join data from two different databases into a single result set

Related Posts

VP Developer Relations