The YugaByte Database Blog

Thoughts on open source, cloud native and distributed databases

AWS re:Invent 2018 Recap – The Freedom to Build

Team YugaByte was at AWS re:Invent in Las Vegas last week. While AWS was announcing a flurry of new product releases and existing product updates, we had some excellent deep dive conversations at our booth on the future of transactional databases and how YugaByte DB is playing its part in shaping that future. This post summarizes our key learnings from the conference,

Read More

Data Modeling Basics – PostgreSQL vs. Cassandra vs. MongoDB

Application developers usually spend considerable time evaluating multiple operational databases to find that one database that’s best fit for their workload needs. These needs include simplified data modeling, transactional guarantees, read/write performance, horizontal scaling and fault tolerance. Traditionally, this selection starts out with the SQL vs. NoSQL database categories because each category presents a clear set of trade-offs. High performance in terms of low latency and high throughput is usually treated as a non-compromisable requirement and hence is expected in any database chosen.

Read More

Are MongoDB’s ACID Transactions Ready for High Performance Applications?

Web app developers initially adopted MongoDB for its ability to model data as “schemaless” JSON documents. This was a welcome relief to many who were previously bitten by the rigid structure and schema constraints of relational databases. However, two critical concerns that have been a thorn on MongoDB’s side over the years are that of data durability and ACID transactions.

Read More

Apache Cassandra: The Truth Behind Tunable Consistency, Lightweight Transactions & Secondary Indexes

ACID transactions were a big deal when first introduced formally in the 1980s in monolithic SQL databases such as Oracle and IBM DB2. Popular distributed NoSQL databases of the past decade including Apache Cassandra initially focused on “big data” use cases that did not require such guarantees and hence avoided implementing them altogether. Our post, “A Primer on ACID Transactions: The Basics Every Cloud App Developer Must Know”

Read More

Google Spanner vs. Calvin: Is There a Clear Winner in the Battle for Global Consistency at Scale?

Prof. Daniel Abadi, lead inventor of the Calvin transaction management protocol and the PACELC theorem, wrote a thought-provoking post last month titled “NewSQL database systems are failing to guarantee consistency, and I blame Spanner”. The post takes a negative view of software-only Google Spanner derivative databases such as YugaByte DB and CockroachDB that use Spanner-like partitioned consensus for single shard transactions and a two phase commit (2PC) protocol for multi-shard (aka distributed) transactions.

Read More

Understanding How YugaByte DB Runs on Kubernetes

As we reviewed in “Docker, Kubernetes and the Rise of Cloud Native Databases”, Kubernetes has benefited from rapid adoption to become the de-facto choice for container orchestration. This has happened in a short span of only 4 years since Google open sourced the project in 2014. YugaByte DB’s automated sharding and strongly consistent replication architecture lends itself extremely well to containerized deployments powered by Kubernetes orchestration.

Read More

How Does the Raft Consensus-Based Replication Protocol Work in YugaByte DB?

As we saw in ”How Does Consensus-Based Replication Work in Distributed Databases?”, Raft has become the consensus replication algorithm of choice when it comes to building resilient, strongly consistent systems. The YugaByte DB database uses Raft for both leader election and data replication. Instead of having a single Raft group for the entire dataset in the cluster,

Read More

How Does Consensus-Based Replication Work in Distributed Databases?

Whether it be a WordPress website’s MySQL backend or Dropbox’s multi-exabyte storage system, data replication is at the heart of making data durable and available in the presence of hardware failures such as machine crashes, disk failures, network partitions and clock skews. The basic idea behind replication is very simple: keep multiple copies of data on physically isolated hardware so that the failure in one does not impact the others and as a result,

Read More

New to Google Cloud Databases? 5 Areas of Confusion That You Better Be Aware of

After billions of dollars in capital expenditure and reference customers in every major vertical, Google Cloud Platform has finally emerged as a credible competitor to Amazon Web Services and Microsoft Azure when it comes to enterprise-ready cloud infrastructure. While Google Cloud’s compute and storage offerings are easier to understand, making sense of its various managed database offerings is not for the faint-hearted.

Read More