Basic Introduction to YugaByte DB Components
This short post gives a quick overview of the components that make up a YugaByte DB universe.
YugaByte DB is composed of nodes, and the collection of all nodes is called a universe. These nodes can be physical machines, virtual machines, or containers (for example, running on Kubernetes).
A YugaByte DB universe is made up of one or more clusters: it always contains a primary cluster and, if enabled, a read replica cluster. A cluster is a logical grouping of nodes that run YugaByte DB processes. Data is replicated synchronously among the nodes of the primary cluster and asynchronously to the nodes of the read replica cluster.
Depending on your business and technical requirements you might choose to deploy YugaByte DB in a single availability zone (AZ), multiple AZs within a region, or across multiple regions and cloud providers, making use of either synchronous or asynchronous replication.
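To make the hierarchy concrete, here is a minimal sketch of how a universe, its clusters, and its nodes relate to one another. It is only a mental model, not YugaByte DB code: the class names, the fields, and the rule that a universe has exactly one primary cluster are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ClusterKind(Enum):
    PRIMARY = "primary"            # nodes replicate synchronously with each other
    READ_REPLICA = "read_replica"  # nodes receive data asynchronously


@dataclass
class Node:
    # A physical machine, a virtual machine, or a container.
    name: str
    region: str
    zone: str


@dataclass
class Cluster:
    # A logical grouping of nodes.
    kind: ClusterKind
    nodes: List[Node] = field(default_factory=list)


@dataclass
class Universe:
    name: str
    clusters: List[Cluster] = field(default_factory=list)

    def all_nodes(self) -> List[Node]:
        # The universe is the collection of every node in every cluster.
        return [node for cluster in self.clusters for node in cluster.nodes]

    def primary(self) -> Cluster:
        # Assumption of this sketch: exactly one primary cluster per universe.
        primaries = [c for c in self.clusters if c.kind is ClusterKind.PRIMARY]
        if len(primaries) != 1:
            raise ValueError("expected exactly one primary cluster")
        return primaries[0]


# Hypothetical multi-AZ primary cluster plus a read replica in another region.
universe = Universe(
    name="demo",
    clusters=[
        Cluster(ClusterKind.PRIMARY, [
            Node("n1", "us-west", "us-west-1a"),
            Node("n2", "us-west", "us-west-1b"),
            Node("n3", "us-west", "us-west-1c"),
        ]),
        Cluster(ClusterKind.READ_REPLICA, [
            Node("n4", "eu-central", "eu-central-1a"),
        ]),
    ],
)
```

Calling `universe.all_nodes()` returns all four nodes regardless of which cluster they belong to, which mirrors the idea that the universe is simply the full set of nodes.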
Keyspaces & Databases
If you are coming from the NoSQL world and using our Cassandra-compatible YCQL API, a universe contains one or more logical keyspaces. If you are using our PostgreSQL-compatible YSQL API, think of keyspaces as the equivalent of MySQL or PostgreSQL databases. Keyspaces (or databases) are in turn made up of tables. YugaByte DB automatically shards, replicates, and rebalances tables across the universe; the shards of table data are called tablets. Note that a keyspace can span clusters when a universe contains both a primary and a read replica cluster.
To make all this work, YugaByte DB relies on two processes: YB-TServer and YB-Master.
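As a rough illustration of what sharding a table into tablets means, the sketch below splits a hash space into contiguous ranges and routes each row key to the tablet that owns its hash. The 16-bit hash space, the CRC32 stand-in hash, and the function names are assumptions chosen for the example; they are not taken from YugaByte DB itself.

```python
import zlib
from typing import List

HASH_SPACE = 0x10000  # assumed 16-bit hash space, for illustration only


def split_hash_space(num_tablets: int) -> List[range]:
    """Divide the hash space into contiguous, near-equal ranges, one per tablet."""
    bounds = [i * HASH_SPACE // num_tablets for i in range(num_tablets + 1)]
    return [range(bounds[i], bounds[i + 1]) for i in range(num_tablets)]


def tablet_for_key(key: str, tablets: List[range]) -> int:
    """Return the index of the tablet whose hash range contains the key's hash."""
    key_hash = zlib.crc32(key.encode("utf-8")) % HASH_SPACE  # stand-in hash
    for index, hash_range in enumerate(tablets):
        if key_hash in hash_range:
            return index
    raise AssertionError("unreachable: the ranges cover the whole hash space")


# A hypothetical table split into four tablets.
tablets = split_hash_space(4)
owner = tablet_for_key("user:42", tablets)
```

Because each key always hashes to the same value, it always lands on the same tablet, while different keys spread across all tablets.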
The YB-TServer process manages the I/O operations of one or more tablets. This process coordinates with other YB-TServer processes to provide the following functionality to the cluster:
- Replicating data to other servers using Raft and storing it persistently using DocDB (built on a highly optimized, customized version of RocksDB).
- Using memory efficiently via a global block cache.
- Auto-sizing the block cache and memstore based on available memory.
- Distributing tablet load uniformly across the server's data disks to even out the load on the system.
- Throttling compactions across tablets to prevent “compaction storms”, which can contribute to high foreground latencies.
- Prioritizing a queue of small and large compactions to minimize maintenance load on the system.
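To give a feel for what spreading tablet load uniformly across data disks can look like, here is a small sketch that always places the next tablet on the disk currently holding the fewest tablets. This is a generic least-loaded placement strategy, not the algorithm YB-TServer actually uses, and the function name and disk paths are hypothetical.

```python
from typing import Dict, List


def place_tablets(tablet_ids: List[str], disks: List[str]) -> Dict[str, str]:
    """Assign each tablet to whichever disk currently holds the fewest tablets."""
    counts = {disk: 0 for disk in disks}
    placement = {}
    for tablet_id in tablet_ids:
        # Ties are broken by the order in which the disks were listed.
        target = min(disks, key=lambda disk: counts[disk])
        placement[tablet_id] = target
        counts[target] += 1
    return placement


# Five hypothetical tablets spread over two hypothetical data disks.
placement = place_tablets(
    ["tablet-0", "tablet-1", "tablet-2", "tablet-3", "tablet-4"],
    ["/mnt/d0", "/mnt/d1"],
)
```

With this strategy, the number of tablets on any two disks never differs by more than one.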
The YB-Master process uses Raft to maintain system metadata and records, such as where tablets are located and which security policies are enabled on them. It also performs background operations, such as initiating rebalancing, and other administrative tasks. Note that the YB-Master process does not participate in the I/O operations that serve data requests.
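The division of labor between the two processes can be sketched as follows: a master that only answers "where does this tablet live?", and tablet servers that perform the actual reads and writes. All class and method names here are hypothetical, and the client-side location cache is an assumption added to show why the master stays out of the data path.

```python
class Master:
    """Holds metadata only: which tablet server hosts which tablet."""

    def __init__(self):
        self._tablet_locations = {}
        self.lookups = 0  # counts metadata requests, for demonstration

    def register(self, tablet_id, tserver):
        self._tablet_locations[tablet_id] = tserver

    def locate(self, tablet_id):
        self.lookups += 1
        return self._tablet_locations[tablet_id]


class TServer:
    """Serves the I/O for the tablets it hosts."""

    def __init__(self, name):
        self.name = name
        self._rows = {}  # tablet_id -> {key: value}

    def write(self, tablet_id, key, value):
        self._rows.setdefault(tablet_id, {})[key] = value

    def read(self, tablet_id, key):
        return self._rows[tablet_id][key]


class Client:
    """Asks the master for a tablet's location once, then talks to the tserver."""

    def __init__(self, master):
        self._master = master
        self._location_cache = {}

    def _tserver_for(self, tablet_id):
        if tablet_id not in self._location_cache:
            self._location_cache[tablet_id] = self._master.locate(tablet_id)
        return self._location_cache[tablet_id]

    def write(self, tablet_id, key, value):
        self._tserver_for(tablet_id).write(tablet_id, key, value)

    def read(self, tablet_id, key):
        return self._tserver_for(tablet_id).read(tablet_id, key)


master = Master()
tserver = TServer("ts-1")
master.register("tablet-0", tserver)

client = Client(master)
client.write("tablet-0", "k1", "v1")
value = client.read("tablet-0", "k1")
```

In this sketch the master is consulted a single time for `tablet-0`; the write and the read that follow go straight to the tablet server.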
In a nutshell
- A YugaByte DB universe is made up of nodes.
- Logical groupings of nodes are called clusters.
- Clusters can be primary or read replica clusters, depending on the replication scheme.
- All nodes run the YB-TServer process and some also run the YB-Master process.
- The YB-TServer process handles I/O operations.
- The YB-Master process handles administrative, maintenance, DDL and resiliency operations.
- Data is stored as tables, which are logically grouped into keyspaces (or databases).
- Keyspaces belong to universes.
- Tables are sharded into tablets.