The Distributed SQL Blog

Thoughts on distributed databases, open source and cloud native

PostgreSQL How-to: SportsDB Running on a Distributed SQL Database

VP Developer Relations

SportsDB is a sample dataset compiled from multiple sources, encompassing a variety of sports including football, baseball, ice hockey and more. It also cross-references many different types of content media. It is capable of supporting queries for the most intense of sports data applications, yet is simple enough for use by those with minimal database experience. The database itself is comprised of over 100 tables and just as many sequences, unique constraints, foreign keys and indexes. The dataset also includes almost 80k rows of data. It has been ported to MySQL, SQL Server and PostgreSQL. You can check out a detailed ER diagram here.

In this post we are going to walk you through how to download and install the PostgreSQL compatible version of SportsDB onto the YugaByte DB distributed SQL database with a replication factor of 3.

What’s YugaByte DB? It’s a high performance distributed SQL database for global, internet-scale apps. YugaByte DB is a PostgreSQL-compatible database. Similar to Google Spanner, YugaByte DB gives you all the scalability characteristics of NoSQL, without sacrificing the ACID transactions or strong consistency you are accustomed to with PostgreSQL.

Download and Install YugaByte DB

For complete instructions on how to get up and running on a variety of platforms including prerequisites, check out our Quickstart Guide. In the following section we’ll cover the basic steps for getting up and running in just a few minutes with a local 3 node cluster on your Mac.

Download and Extract YugaByte DB

$ wget https://downloads.yugabyte.com/yugabyte-2.0.1.0-darwin.tar.gz
$ tar xvfz yugabyte-2.0.1.0-darwin.tar.gz && cd yugabyte-2.0.1.0/

Note: The above instructions are for version 1.3.0. To find the latest version of YugaByte DB, visit the quickstart page.

Configure Loopback Addresses

Add a few loopback IP addresses for the various YugaByte DB nodes to use.

sudo ifconfig lo0 alias 127.0.0.2
sudo ifconfig lo0 alias 127.0.0.3
sudo ifconfig lo0 alias 127.0.0.4
sudo ifconfig lo0 alias 127.0.0.5
sudo ifconfig lo0 alias 127.0.0.6
sudo ifconfig lo0 alias 127.0.0.7

Create a 3 Node Cluster

With the command below you’ll create a 3 node cluster with a replication factor of 3.

$ ./bin/yb-ctl --rf 3 create --tserver_flags "memory_limit_hard_bytes=6442450944”

Note: Due to the hardware limitations of my laptop and size of the scripts, I temporarily bumped up the available memory accessible to YugaByte by adding the above tserver_flags argument. Depending on your setup, you might need to bump it up as well if you see errors like the one below while executing the database scripts.

Service unavailable (yb/tserver/tablet_service.cc:239): Soft memory limit exceeded (at 96.13% of capacity)]

Check the Status of the YugaByte DB Cluster

Now let’s take a look at the status of the cluster and all the nodes that comprise it.

$ ./bin/yb-ctl status

As you can see from the output, we have three nodes running locally with a replication factor of 3. This means that every piece of data is being replicated on all three nodes.

Enter the YSQL shell

Next run the YSQL shell command to enter the PostgreSQL shell.

$ ./bin/ysqlsh --echo-queries
ysqlsh (11.2)
Type "help" for help.
postgres=#

What’s YSQL? It’s YugaByte DB’s PostgreSQL-compatible, distributed SQL API.

We are now ready to build the SportsDB database.

Download and Install the SportsDB Database

Download the SportsDB Scripts

You can download the SportsDB database that is compatible with YugaByte DB from our GitHub repo. The five files you’ll need are:

We’ve purposely broken up what would otherwise be a very large script. By breaking it up into building blocks, it’ll be easier to see what YugaByteDB is doing and spot any problems (if you encounter them) a lot easier.

Create the SportsDB Database

CREATE DATABASE sportsdb;

Let’s confirm we have the sportsdb database by listing out the databases on our cluster.

postgres=# \l

Switch to the sportsdb database.

postgres=# \c sportsdb
You are now connected to database "sportsdb" as user "postgres".
sportsdb=#

Build the SportsDB Tables and Sequences

sportsdb=# \i /Users/yugabyte/sportsdb_tables.sql

We can verify that all 203 tables and sequences have been created by executing:

sportsdb=# \d

Load Sample Data into SportsDB

Next, let’s load our database with sample data, ~80k rows.

sportsdb=# \i /Users/yugabyte/sportsdb_inserts.sql

Let’s do a simple SELECT to pull data from the basketball_defensive_stats table to verify we now have some data to play with.

sportsdb=# SELECT * FROM basketball_defensive_stats WHERE steals_total = '5';

Create Unique Constraints and Foreign Keys

Next, let’s create our unique constraints and foreign keys by executing:

sportsdb=# \i /Users/yugabyte/sportsdb_constraints.sql

and

sportsdb=# \i /Users/yugabyte/sportsdb_fks.sql

Create Indexes

Finally, let’s create our indexes by executing:

sportsdb=# \i /Users/yugabyte/sportsdb_indexes.sql

Note: If you have worked with the SportDB sample database in the past, you know that the index creation section specifies the use of a BTREE index. YugaByte DB makes use of LSM trees, so we’ve modified the script as such. You can read more about LSM vs BTREE in our post, “A Busy Developer’s Guide to Database Storage Engines  –  The Basics.” Even if we had not specified LSM, you would have seen an informational message that advised you that YugaByte DB had made the switch behind the scenes.

Explore SportsDB

That’s it! You are ready to start exploring SportsDB running on YugaByte DB using your favorite PostgreSQL admin or development tool. You can learn more about the SportsDB project, libraries, samples, web services and more by visiting the project page here.

What’s Next

  • Compare YugaByte DB in depth to databases like CockroachDB, Google Cloud Spanner and MongoDB.
  • Get started with YugaByte DB on macOS, Linux, Docker, and Kubernetes.
  • Contact us to learn more about licensing, pricing or to schedule a technical overview.

Related Posts

VP Developer Relations