The Distributed SQL Blog

Thoughts on distributed databases, open source and cloud native

Implementing PostgreSQL User-Defined Table Functions in YugabyteDB

Welcome to part two of a three-part series of posts on PostgreSQL’s table functions. These functions can be easily leveraged in a distributed SQL database like YugabyteDB, which is PostgreSQL compatible.

In part one I gave a brief introduction to PostgreSQL’s table functions. Part three will cover some realistic use cases. I’ll introduce this second post by quoting that paragraph:

A regular language plpgsql user-defined function is implemented using the plain return statement.


An Introduction to PostgreSQL Table Functions in YugabyteDB

Welcome to the first of a three-part series of posts on PostgreSQL’s table functions. These functions can be easily leveraged in a distributed SQL database like YugabyteDB, which is PostgreSQL compatible.

This series follows on from my “Using Stored Procedures in Distributed SQL Databases” post. In this series of posts we’ll cover:

  • What table functions are and why they’re useful
  • How to use some of the built-in SQL table functions
  • How you can implement a user-defined table function,


YugabyteDB Community Update, Tricks and Tips – Dec 13, 2019

Welcome to this week’s community update, where we recap a few interesting questions that have popped up in the last week or so on the YugabyteDB Slack channel, Forum, GitHub, or Stack Overflow. We’ll also review upcoming events, new blogs, and documentation. OK, let’s dive right in:

How best to configure clusters across deployment types

Ava over on Stack Overflow asked how best to set up configurations for different deployment models like single AZ,


Using Stored Procedures in Distributed SQL Databases

These days, most monolithic SQL databases support stored procedures. This support first emerged in commercially available offerings in the late 1980s. However, stored procedure support is not yet standard in distributed SQL databases. In fact, YugabyteDB is one of just two databases in this category that do, supporting stored procedures written in PostgreSQL’s PL/pgSQL. (Aurora also supports stored procedures.) This post recaps the case for stored procedures that motivated their introduction all those years ago.


How Plume Handled Billions of Operations Per Day Despite an AWS Zone Outage

Enterprises deploy YugabyteDB clusters across multiple availability zones (AZs) in order to ensure continuous availability of their business-critical services even when faced with cloud infrastructure failures like zone outages. On November 12, 2019, there was one such outage of an entire availability zone in the eu-central-1 region of AWS. This was reported on the AWS status page on that day,


What is Distributed SQL?

SQL has been the de facto language for relational databases (aka RDBMS) for almost four decades. Relational databases are therefore also known as SQL databases. However, the original SQL databases like Oracle, PostgreSQL, and MySQL are monolithic from an architectural standpoint. They are unable to distribute data and queries across multiple instances automatically. NewSQL databases emerged to make SQL scalable. However,


The Benefit of Partial Indexes in Distributed SQL Databases

If a partial index is used instead of a regular one on a nullable column where only a small fraction of the rows have non-null values, then the response time for inserts, updates, and deletes can be shortened significantly. As a bonus, the response times for single-row selects shorten a little too. This post explains what a partial index is,


How YugabyteDB Scales to More than 1 Million Inserts Per Sec

There are a number of well-known experiments where eventually-consistent NoSQL databases were scaled out to perform millions of inserts and queries. Here, we do the same using YSQL, YugabyteDB’s PostgreSQL-compatible, strongly-consistent, distributed SQL API. We created a 100-node YugabyteDB cluster, ran single-row INSERT and SELECT workloads with high concurrency, each for an hour, and measured the sustained performance (throughput and latency).
