The YugaByte Database Blog

YugaByte Database Community & Engineering Update — July 20, 2018

Welcome to the inaugural edition of the YugaByte DB Community and Engineering update series! Let’s dive in and take a look at what has happened over the last few weeks.

Community News

There has been a lot of activity in terms of meetups and events. In June, YugaByte was at DockerCon. We also did a hands-on lab about building geo-distributed cloud apps at a Datariders meetup and a talk at Samsung about building modern apps at cloud scale. We will be at Google Cloud NEXT 2018 from July 24–26, 2018. Stop by and say hello!

You can also see our other upcoming events.

We are Hiring!

YugaByte is looking for a passionate Developer Advocate! Are you excited about becoming the voice of our users? Do you love experimenting with new technologies and presenting them at conferences, meetups and workshops? We would love to talk to you. Check out our list of open positions.

Release Updates

We recently released YugaByte DB 1.0.4. This release builds on the 1.0 version by adding:

Cassandra-compatible YCQL Features

  • Secondary indexes and unique constraints to ensure that a column does not have duplicate values.
  • The JSONB datatype now supports fine-grained select/update of attributes and built-in operators.
  • Built-in functions to compute averages and to convert blobs to types.

Redis-compatible YEDIS Features

  • Ability to read from the local datacenter.
  • Support for the ZSCORE command.
  • Support for bounded staleness in follower reads.

Ecosystem and Deployments

  • YugaByte DB now works with Presto.
  • C++, C# and Go client drivers are now supported.

Roadmap

For the upcoming 1.1 release, we are working towards the general availability of a number of critical features:

Other major items on the roadmap include:

  • Security features like authentication of users in YCQL and YEDIS APIs
  • Support for managed Kubernetes environments such as GKE and PKS

To view a list of all items being worked on, browse our GitHub projects page.

Documentation, Blogs, Tutorials and Videos

Docs Updates

Technical Blogs

Videos and Technical Presentations

Enhancement Requests

There have been a few important enhancement requests from the community.

Unique Secondary Index Enhancement

A recent request was to implement a unique secondary index in order to ensure that there were no duplicate values in a column. For example, consider an employee table which has employee_id as the primary key column, and an email column where the entries need to be unique.

There are a number of such scenarios in OLTP applications where the uniqueness of the values in a column needs to be ensured, and we believe this is a great addition to a transactional NoSQL database. Hence, we decided to prioritize this feature. Under the hood, the unique constraint performs a distributed transaction using conditional insert statements.
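As a minimal YCQL sketch of the employee example, the schema and a violating insert might look like the following. The table name, index name and sample rows are illustrative assumptions, not names from the original request. The sketch also assumes that a keyspace is already selected and that the table has to be created with transactions enabled for the index to be maintained.

```sql
-- Hypothetical schema for the employee example; all names are illustrative.
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name        TEXT,
    email       TEXT
) WITH transactions = { 'enabled' : true };

-- Enforce uniqueness of the email column through a unique secondary index.
CREATE UNIQUE INDEX employees_email_idx ON employees (email);

-- The first insert is expected to succeed.
INSERT INTO employees (employee_id, name, email)
VALUES (1, 'Alice', 'alice@example.com');

-- A second employee with the same email is expected to be rejected,
-- because it would duplicate a value in the unique index.
INSERT INTO employees (employee_id, name, email)
VALUES (2, 'Bob', 'alice@example.com');
```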

Fine-Grained Errors in Batch Inserts

In the current Apache Cassandra/CQL wire protocol, when any error occurs in a batch of insert operations, only a single error code can be returned. This is not ideal when only a few insert operations fail, because the app cannot determine which inserts in the batch failed.

The feature request was to implement a way to return fine-grained errors in a batch. This can now be achieved by adding a RETURNS STATUS AS ROW clause to an insert statement. As an example:

INSERT INTO t (k, c) VALUES (2, 2) RETURNS STATUS AS ROW;

If the above insert failed because of a unique index violation, it would return an error as follows:

 [applied] | [message]                                  | k | c
-----------+--------------------------------------------+---+---
     false | Duplicate value disallowed by unique index | 1 | 2 
           | k.t_unique_c                               |   |

LIMIT and OFFSET Support

LIMIT and OFFSET enable paginating through results. If a limit count is given, no more than that many rows will be returned. OFFSET says to skip that many rows before beginning to return rows. If both OFFSET and LIMIT appear, then OFFSET rows are skipped before starting to count the LIMIT rows that are returned.

We felt that supporting LIMIT and OFFSET would enable a lot of use cases that need to paginate through results. Imagine an e-commerce site that wants to display the orders placed by a user in a paginated fashion, showing 10 orders per page. The first page of orders can be retrieved as follows:

SELECT * FROM orders WHERE user_id = 1000 LIMIT 10;

And the second page of orders can subsequently be retrieved using the following query:

SELECT * FROM orders WHERE user_id = 1000 LIMIT 10 OFFSET 10;
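More generally, with a fixed page size the offset for page n (counting from 1) is (n - 1) times the page size. As a sketch against the same example orders table, the third page would be fetched like this:

```sql
-- Page 3 at 10 orders per page: skip (3 - 1) * 10 = 20 rows, then return up to 10.
SELECT * FROM orders WHERE user_id = 1000 LIMIT 10 OFFSET 20;
```

This assumes that the rows for a user come back in the same order on every request, for example the clustering order within that user's partition. Otherwise pages may overlap or skip rows.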

Highlights from GitHub, StackOverflow and Forums

The following are a few recent questions, comments and discussions that are worth pointing out.

Best Code Reading

One of our community users asked about a case in the code where the schema object gets unconditionally flushed to disk. That definitely required some serious code-reading sessions. Way to go, BiterrorChen!

Best Use Case Discussion

Here is a great discussion about a use case to track, store and serve user actions on a platform. It starts out with a discussion on microsecond precision, dives into support for various isolation levels in distributed transactions and finally discusses hash partitioning in YCQL. Great discussion, yjiangnan!

What’s Next?

Karthik Ranganathan

Founder & CTO