The Distributed SQL Blog

Thoughts on distributed databases, open source, and cloud native

Distributed SQL Sharding: How Many Tablets, and at What Size?

The first answer to this question is the usual “it depends”. The second answer, thanks to YugabyteDB’s auto-splitting feature and distributed SQL sharding principles, is “don’t worry, this is managed automatically.”

However, it’s still important to understand how sharding works, how to handle corner cases correctly, and how to split tablets to save resources.

Read More

YugabyteDB and Red Hat OpenShift: Resilient Kubernetes Workloads at Scale

Kubernetes has become widely adopted in the Fortune 500. Many companies are now using the platform to run stateless and stateful applications on-premises or as hybrid cloud deployments in production. Of course, with any new technology, there are growing pains when running resilient Kubernetes workloads. But most executives and developers agree that the benefits far outweigh the challenges.

The data-on-Kubernetes ecosystem is evolving rapidly with the rise of stateful applications.

Read More

DSS Asia 2022: Event Highlights and Key Takeaways

We held our second annual Distributed SQL Summit (DSS) Asia conference on the 30th and 31st of March 2022, following great feedback from our initial event and user demand.

This two-day online conference was packed with fascinating presentations, discussions, and demos from customers, partners, and our own experts.

With 35 engaging sessions to choose from,

Read More

Five Benefits to Running a Distributed SQL Database in Kubernetes

A distributed SQL database is a single logical relational database deployed on a cluster of servers. The database automatically replicates and distributes data across multiple servers. These databases are strongly consistent and support consistency across availability and geographic zones in the cloud.

At a minimum, a distributed SQL database has the following characteristics:

  • A SQL API for accessing and manipulating data and objects
  • Automatic distribution of data across nodes in a cluster
  • Automatic replication of data in a strongly consistent manner
  • Support for distributed query execution so clients do not need to know about the underlying distribution of data
  • Support for distributed ACID transactions
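
To make the "automatic distribution" and "clients do not need to know" points above concrete, here is a minimal toy sketch of hash-based sharding. It is an illustration under stated assumptions, not how any particular database implements it: the 16-bit hash space, the MD5-based hash, the round-robin tablet placement, and every function name (`key_hash`, `make_tablets`, `tablet_for`, `route`) are hypothetical choices made for this example.

```python
import hashlib
from bisect import bisect_right

# Assumption for this sketch: keys hash into a fixed 16-bit space.
HASH_SPACE = 2 ** 16


def key_hash(key: str) -> int:
    """Map a row key to a stable integer in [0, HASH_SPACE)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")


def make_tablets(num_tablets: int) -> list[int]:
    """Split the hash space into contiguous ranges; return each range's start."""
    return [i * HASH_SPACE // num_tablets for i in range(num_tablets)]


def tablet_for(key: str, starts: list[int]) -> int:
    """Find the tablet whose hash range contains the key's hash."""
    return bisect_right(starts, key_hash(key)) - 1


def route(key: str, starts: list[int], nodes: list[str]) -> tuple[int, str]:
    """Return (tablet index, node name); tablets are placed round-robin here."""
    tablet = tablet_for(key, starts)
    return tablet, nodes[tablet % len(nodes)]


if __name__ == "__main__":
    starts = make_tablets(4)
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
    for key in ("user:1", "user:2", "order:77"):
        print(key, route(key, starts, nodes))
```

In this sketch the caller only supplies a key; the routing layer decides which tablet and node hold it, which is the property the list describes. A real system would also replicate each tablet and rebalance tablets as nodes join or leave.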

But should you run a distributed SQL database in Kubernetes?

Read More

What Every Application Developer Needs to Know About Geo-Distributed Databases

I’ve been working with distributed systems, platforms, and databases for the last seven years. Back in 2015, many architects began using distributed databases to scale beyond the boundaries of a single machine or server. They selected such a database for its horizontal scalability, even if its performance remained comparable to a conventional single-server database.

Now, with the rise of cloud native applications and serverless architecture,

Read More

Securing YugabyteDB: Part 2 – Client-to-Server Encryption in Transit

In the first post in this series, we covered how to secure YugabyteDB’s internal RPC protocol using the TLS encryption protocol, also referred to as server-to-server encryption in transit. In this post, we secure the communication between SQL clients and the PostgreSQL query interface of YugabyteDB, also called client-to-server encryption in transit.

YugabyteDB—a 100% open source,

Read More

Automation Workflows: Using GitOps for YugabyteDB with Argo CD and Helm

GitOps is an operational framework for declaratively driven systems such as Kubernetes. More specifically, it provides a set of best practices that converge the runtime state of the services with the declarative state defined in Git. Argo CD, for its part, is a declarative continuous delivery tool for Kubernetes. It follows the GitOps pattern of using Git repositories as the source of truth for defining the desired application state.

Read More

Implementing Change Data Capture (CDC) in YugabyteDB

Databases are the systems of record in any enterprise. However, unlocking the true business potential of data requires making it available across the entire enterprise ecosystem: the different applications, services, and systems that power the organization’s business processes and use cases. As a result, change data capture (CDC) is an ideal mechanism for integrating downstream applications and services with a database efficiently,

Read More