The Distributed SQL Blog

Thoughts on distributed databases, open source and cloud native

How to Migrate Data from Cassandra or MySQL to YugaByte DB?

VP Marketing and Community

If you work with databases, at some point you are going to need to get data in and out them using a format that can be consumed by a completely different system. YugaByte DB makes use of CSV files to make this as easy as possible. The CSV format is arguably the most universally portable way to get data migrations accomplished.

TL;DR – YugaByte DB makes use of Cassandra’s COPY FROM command and a forked version of Cassandra’s Bulk Loader to get data into the system. It uses a tool called Cassandra Unloader to get data out of the system and into a CSV format.”

For the purposes of this blog post we are going to be working with the Cassandra-compatible YCQL API.

Importing Data to YugaByte DB

In this section we’ll walk you through the following operations:

  • Creating a destination table in YugaByte DB
  • Exporting data from Cassandra
  • Exporting data from MySQL
  • Importing data into YugaByte DB

Create a destination table

In this example we are going to create a table called SensorData in YugaByte DB and assume that a compatible dataset already resides in either a Cassandra or MySQL database. If you’d like to generate a sample dataset for use with this example, check out the “Prepare Source Data” section in the import documentation. The DDL to create the table is as follows:

Exporting from Apache Cassandra

Getting data out of Cassandra and into a CSV file is a straightforward operation:

Exporting from MySQL

Similarly, exporting data out of MySQL and into a CSV file is a simple process:

Best practices for importing data into YugaByte DB depends on dataset size

How best to import data into YugaByte DB largely depends on the size of database. For small datasets in the MB rage we recommend using Cassandra’s copy from command. For example:

For datasets in the GB and multi-TB range we recommend using our fork of the Cassandra bulk loader. You can get the bulk loader here. Running the import using the bulk loader would look as follows:

Exporting Data from YugaByte DB into a CSV

In this section we’ll show you how to export data from YugaByte DB into CSV files using the cassandra-unloader tool.

Creating a source table

Let’s use the following DDL to create a source table called USERS for use in our export example:

If you’d like to generate some sample data to populate the table above, check out the “Generate Sample Data” and “Load Sample Data” sections here.

Exporting data to CSV

Using the cassandra-unloader tool, execute the following command:

You can analyze the csv data directly or import it into a different YugaByte DB cluster. That’s all there is to it! For more information concerning data migration features, check out the “Data Migration” section in the docs.

What’s Next?

Related Posts

VP Marketing and Community