YugaByte DB

The YugaByte Database Blog

Thoughts on open source, cloud native and distributed databases

A for apple, B for ball, C for “CAP theorem”

Image credit: https://loveforprogramming.quora.com

In the world of databases today, consistency is one of the most misunderstood concepts. It is also one of the big reasons NoSQL databases are difficult to reason about.

The CAP theorem states that “in the presence of a network partition, one has to choose between consistency and availability”. In order to provide higher write availability, some NoSQL databases implement a weaker form of consistency called eventual consistency. This post dives into the types of consistency as referenced by the CAP theorem, how they impact the application and some common (but highly incorrect) assertions about eventual consistency.

Types of consistency

Primarily, there are three forms of consistency as it relates to the CAP theorem — strong consistency, timeline consistency and eventual consistency. Let us try to understand these in the context of a simple, real-world application.

Let us take the example of an email messaging application. Imagine there is a user, Smith, who exchanges three messages with his friend John — m1, m2 and m3. Let us say that the following sequence of operations occur:

  1. Smith first sends a message m1 (“Hi John, how are you?”) to John — making m1 the first message in Smith’s email account.
  2. John responds with a message m2 (“I am fine, Smith. How are you?”), which is therefore the second message in Smith’s email account.
  3. Smith replies to message m2 with a message m3 (“I am fine too. Thanks for your response, John!”), making it the third message in his email account.

How Smith’s inbox should really look.

We are going to examine the effect of various levels of consistency on Smith’s inbox. We assume that the application server that handles Smith’s account does the following:

  • writes the messages in the order m1, m2, m3 into a replicated database
  • each write receives a success response from the database
  • a read of the Smith’s entire inbox is performed after all the three messages are written

We assume that the database can tolerate failures at any point (since data is replicated), and that these failures are within the fault tolerance configuration of the database (for example, if replication is 3, there is at most one failure).

Now let us look at how the types of consistency impact this email application in the real world.

Strong Consistency

If a database offers strong consistency, then an application must be able to read all the writes that were successfully acknowledged.

In our email example, for a read request that tries to read all the messages in Smith’s inbox, the only valid result for a strongly consistent database is:

  • (m1, m2, m3)

How Smith’s inbox looks with strong consistency

The application querying the database always sees a complete, consistent result at any point.

Timeline Consistency

In a timeline consistent database, the messages are replicated asynchronously to a replica in the order in which they were received. Typically, timeline consistency is the result of async, master-slave replication.

In our example, any of the following responses are valid:

  • (m1, m2, m3)
  • (m1, m2)
  • (m1)

How Smith’s inbox might look with async replication

Thus, in case of failures, the application might not see all data that was acknowledged. Furthermore, with timeline consistency, some recent data may be lost in the case of a failure.

Eventual consistency

In an eventually consistent database, there is no ordering of the messages being replicated. Hence, the system can return data that is inconsistent temporarily.

In our example, upon the failure of the master, the valid responses are any combination of m1, m2, m3, for example:

  • m3
  • m2, m3
  • m1, m3

How Smith’s inbox might look with eventual consistency

In other words, upon querying the database for all the messages in Smith’s email account, a response with just the message m3 (“I am fine too. Thanks for your response, John!”) and without m1, m2 is perfectly valid. This is because eventually consistency allows the database to become temporarily inconsistent. The possibility of reading inconsistent data in turn makes it hard for this application to work with such a database.

Myths of eventual consistency

We live in a world of heavy digital marketing with a lot myths around eventual consistency. This is an attempt to reason through some of them.

Myth #1: Using quorum reads and writes in an eventually consistent database makes it strongly consistent

Reality: Implementing quorum reads and writes on an eventually consistent database may give inconsistent results in failure scenarios.

Most applications are not built to handle scenarios such as the one outlined in the message inbox example above. The recommendation from eventually consistent databases to handle situations such as these is to use “quorum reads and writes” in order to implement “strong consistency on top of eventual consistency”. This is an incorrect assertion.

As a simple counter example, imagine you have 3 nodes A, B, C. You first perform (INSERT key=k1) which gets written to all the three nodes. Then, you try to perform (DELETE key) which gets written only to nodes A and B. Now, after a sufficient time period, nodes A and B would have deleted the key as well as dropped the (DELETE key) delete marker. Now, if you read the key, node C would have the key and it would re-surface (due to read-repair).

Myth #2: Eventually consistent DBs are more performant, because strong consistency is achieved by sacrificing performance.

Reality: When using quorum reads and writes, an eventually consistent database performs worse than a strongly consistent one.

A majority of applications built on top of eventually consistent databases use quorum reads and writes to achieve strong consistency. To understand this, let us consider a replication-factor=3 database setup. With eventual consistency, the following always happens:

  • Every write happens on 3 nodes (and we wait for 2 to succeed)
  • Every read happens on 3 nodes (and we wait for 2 to succeed)

With strong consistency, the write pattern is the same, but the reads happen only on one node in the steady state — which is the majority of the time (only in the case of failures — which are occasional, they happen on all the 3 nodes) — making it a much more performant design.

Myth #3: Eventually consistent NoSQL databases are as easy to scale

Reality: An eventually consistent database is much more inefficient to scale out than a strongly consistent one.

None of the replicas of an eventually consistent database have all the data (if they did, it would not be eventually consistent). Therefore, in order to scale out and load balance the data onto newly added nodes, the database has to read data from all its replicas and perform a logical resolve. This is extremely inefficient, and causes unpredictable latencies in the foreground operations because scaling out uses the same mechanism as serving foreground requests. A strongly consistent database — if architected correctly, can just copy compressed data files from a peer, making the rebalancing operation as efficient as possible.

But you may still say:

CP databases do not have high availability, so they really don’t cut it for me.

Not totally true. It is completely possible to architect a CP database with strong consistency and active replicas which can serve data in just a couple of seconds after a failure, making them highly available. Contrast that with an AP database, that takes all your writes but may not necessarily be able to serve it back to you with any guarantees.

So in a nutshell, it is possible to build a database with the following characteristics:

  • strong consistency
  • efficiency in scaling out even for large datasets
  • high performance, low latency
  • high availability
  • BONUS: high data density given all the above

Where can I find such a database?

Stay tuned to find out 😉.

Karthik Ranganathan

Founder & CTO