The Distributed SQL Blog

Thoughts on distributed databases, open source and cloud native

Why I Moved from Oracle to YugaByte

Developer Advocate

I’m thrilled at the prospect of what lies ahead of me in my new job at YugaByte. I’ve just started in the role of Developer Advocate for YugaByte DB. This is an open source, cloud native, distributed SQL database—in other words, a database for the modern world. There’s an excellent brief description here.

I made this move a couple of weeks ago after almost thirty years at Oracle. I started with Oracle in the UK, in various roles, ending up in Consulting. Then, in 1996, I moved to Oracle Database R&D as a product manager. This gave me the bonus of moving across the pond from the UK to the San Francisco Bay Area. You can see the full chronology on LinkedIn.

My previous job was similar in broad outline to my new job. Each requires a deep understanding of the mental model and API that developers who build applications against a SQL database of record need. And each depends on communicating this understanding proactively in various forms of collateral, and reactively in response to questions that arrive through an ever-increasing proliferation of channels. From my previous life, I know SQL and what makes it uniquely attractive for implementing database-backed applications; I know how to program applications that issue SQL statements and deal with the outcomes; I can write technical prose; I can present to a live audience; and I can drive a round of questions and answers to a usefully formulated problem statement. I’m used to the fact that such problem statements may well lead to specification and implementation work by the developers of the database system to bring it new functionality or, dare I say, to fix bugs. And I’m used to participating in the softer aspects of this product improvement work. I’m assuming that all these skills are transferable to the new job.

Both Oracle Database and YugaByte DB are SQL databases. So… is the new job just more of the same? No, definitely not! There are big differences, and it’s these that make me so excited. The clue to the differences lies in three notions: distributed, cloud native, and open source. Only the SQL API remains the same. The tersest characterization of YugaByte DB combines the notions of the SQL API and the distributed architecture in the label distributed SQL database. The subtext is that SQL is the only sensible way to deal with what the database of record persists. The abstraction that it embodies famously allows developers to program only in the domain of the semantics that they need, delegating all aspects of the physical representation of the data, and its persistence and retrieval, to the database management system. By convention, the term SQL database management system also connotes full-on ACID.

I’m now going to take the opportunity of my first YugaByte blog to go a little deeper. However, I’m still just skimming the surface. You can go as deep as you’d like by visiting our website.

Distributed SQL

Some years ago, the term “SQL Database” implied executing SQL statements using processes running on a single machine, writing and reading data to and from locally attached storage. In this model, the only way to handle increasing demand for data volumes and processing performance is to add capacity to your single machine: so-called scaling up. And scaling up a single machine inevitably meets a limit. Now that this model is no longer the only one, the term monolithic SQL database system is used to denote it. The monolithic model brings other challenges. For example, the only way to protect against the total collapse of the single host machine is to run a second, standby, machine and to keep its data synchronized with the primary machine through replication. However, unless performance is to be unacceptably compromised, the replication can only be asynchronous, and asynchronous replication brings further problems in its train: writes that the primary has acknowledged but not yet sent to the standby are lost if the primary fails.
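To make the asynchronous replication problem concrete, here is a minimal toy sketch in Python. It is not how Oracle Database, YugaByte DB, or any real system implements replication; the class and method names (Node, AsyncPair, ship, fail_over) are hypothetical, invented purely for illustration. It only models the ordering of events: the client gets its acknowledgment before the standby has the data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy database node holding an ordered log of committed writes."""
    log: list = field(default_factory=list)


@dataclass
class AsyncPair:
    """Hypothetical primary/standby pair with asynchronous replication."""
    primary: Node = field(default_factory=Node)
    standby: Node = field(default_factory=Node)
    shipped: int = 0  # number of log entries already copied to the standby

    def write(self, value):
        # The client is acknowledged as soon as the primary has the write.
        # The standby only catches up later, when ship() runs.
        self.primary.log.append(value)
        return "ack"

    def ship(self):
        # Background replication: copy everything not yet shipped.
        self.standby.log.extend(self.primary.log[self.shipped:])
        self.shipped = len(self.primary.log)

    def fail_over(self):
        # The primary is lost. Return the acknowledged writes that
        # never reached the standby.
        return self.primary.log[self.shipped:]


pair = AsyncPair()
pair.write("order-1")
pair.ship()                # order-1 reaches the standby
pair.write("order-2")      # acknowledged, but not yet shipped
lost = pair.fail_over()
print(lost)                # acknowledged writes missing from the standby
```

In this run the second write is acknowledged to the client and then lost at failover, which is exactly the kind of problem that asynchronous replication brings.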

The only way to defeat the data processing challenges that the monolithic SQL database system brings (the inherent limit of scaling up, and the challenge of surviving catastrophic failure) is to add more and more machines, each with its own storage, and to have them all work in concert as a so-called cluster to meet the overall requirements for data processing. The terms distributed architecture and scaling out characterize this model. Notice that I said data processing rather than SQL processing, and distributed architecture rather than distributed SQL system. This is to remind you of the various schemes that the journey towards usable distributed architectures has brought, and of the various databases that have implemented them.

Another goal of a distributed architecture is to gain fault tolerance: not by having a whole standby machine waiting in the wings to take over from the primary in the event of its total collapse; but, rather, at the much lower level of granularity of table shards that are automatically replicated. But not all distributed architectures manage this. The journey to develop optimally usable distributed architectures has been tortuous. It’s sufficient to say NoSQL and eventually consistent to characterize the teething troubles. The terms connote tricky programming and wrong results. Of course, nobody chooses such inconsistencies voluntarily. Rather, the inconsistencies have been chosen as part of a bigger trade-off, in the context of currently available possibilities, driven by the non-negotiable requirement to handle vast data volumes with acceptable performance.
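The idea of fault tolerance at the granularity of replicated table shards can be sketched in a few lines of Python. This is a toy model under stated assumptions, not YugaByte DB's actual placement or replication algorithm: the node names, the shard count, the replication factor of three, the round-robin placement, and the rule that a shard stays available while a majority of its replicas is alive are all assumptions made for illustration.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]   # hypothetical cluster
NUM_SHARDS = 8                                      # hypothetical shard count
REPLICATION_FACTOR = 3                              # assumed for this sketch


def shard_for(key: str) -> int:
    """Map a row key to a shard by hashing (stable across runs)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS


def replicas_for(shard: int) -> list:
    """Place a shard's replicas on consecutive nodes, round-robin."""
    return [NODES[(shard + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]


def available_shards(failed: set) -> list:
    """Shards that still have a majority of replicas on live nodes."""
    majority = REPLICATION_FACTOR // 2 + 1
    return [
        s for s in range(NUM_SHARDS)
        if sum(n not in failed for n in replicas_for(s)) >= majority
    ]


placement = {s: replicas_for(s) for s in range(NUM_SHARDS)}
after_one_failure = available_shards({"node-b"})
after_two_failures = available_shards({"node-b", "node-c"})

print(len(after_one_failure), "of", NUM_SHARDS, "shards available after one failure")
print(len(after_two_failures), "of", NUM_SHARDS, "shards available after two failures")
```

With these assumptions, losing any single node leaves every shard with two of its three replicas, so no standby machine is needed. Losing two nodes at once takes some shards below a majority, which is why the replication factor is chosen to match the number of simultaneous failures that must be survived.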

The era of trade-offs is now over. YugaByte DB brings an intrinsically fault-tolerant distributed architecture and the time-honored abstraction of SQL and ACID transactions, using the PostgreSQL dialect and the PostgreSQL wire protocol. In short, a distributed SQL database.

Cloud Native

The term cloud native means different things to different people.

  • To a cloud vendor it means various properties that make its cloud offerings putatively better than those of other vendors.
  • To application architects, it means the opportunity to bring something new to market, for use by the general public, without the lead time of specifying and purchasing infrastructure, and with the freedom to change infrastructure choices with the minimum of discomfort—in particular, as is always the hope, as requirements for data volume and performance that start at a modest level grow to planet scale.
  • To the development team it means agility: the ability to spin up and destroy sandboxes for original programming, functional testing, stress testing, user acceptance testing, and so on at the click of a browser button or an API call.
  • To the operations engineers and database administrators, it means the relief from owning many of their more tedious tasks and the ability to respond quickly, and ideally mechanically, to the failure of individual machines in the cluster that jointly implements the distributed database. Notice that rented, virtual machines and containers, running on a cloud vendor’s commodity hardware, do suffer a high chance of failure. Orchestration technologies such as Kubernetes make the management of containers easy but do not solve the problem of failures.
  • And to business executives, it means freedom from cloud vendor lock-in. In other words, the ability to replace the machines that implement a cluster, and that run on one vendor’s cloud, with those provided by another vendor. Ideally this should be done incrementally and without downtime by taking advantage of the mechanism that provides resilience to machine failure within a single cloud, relying on the fact that the data that one machine hosts is also hosted, and multiply so, on other machines.

Before my new colleagues implemented YugaByte DB, they set for themselves the overarching requirement that it should deliver unlimited scalability and seamless fault tolerance within a cluster deployed not just in a regionally or globally distributed fashion but also when the machines that implement the cluster are distributed across different vendors’ clouds. In other words, it was required—before it existed—that it should be truly cloud native. And now that it is implemented, you can download it and try it for yourselves using our Quick Start tutorial.

Open Source

There’s no getting away from the fact that the world of application development has changed dramatically during the decades since robust monolithic SQL database system implementations became commercially available. Back in the day, it was accepted—because there simply was no alternative—that the software used to underpin applications had to be licensed, and usually from the moment that the development of the application began, from vendors who controlled the prices. This implied that the choices of such enabling software were made up front and in no small part based on post-negotiation licensing costs—and that they were substantially influenced by the financial officers. Then came the open source movement. Now the developers of a new application can try various kinds of roughly equivalent enabling software, with no financial commitment, and can even cooperate with its authors, and with other users, to tailor it to their specific needs.

The world has changed, and these days fewer and fewer new application development efforts are based on proprietary enabling software.

Summary

  • Oracle Database is a monolithic RDBMS; YugaByte DB is a distributed RDBMS.
  • The deployment of Oracle Database in the cloud is a straight relocation of its historical on-premises deployment model; YugaByte DB was designed from the get-go for the cloud.
  • Oracle Database is a proprietary system; YugaByte DB is an open source implementation.

Different use cases are best served by different databases. And the use case “Exciting job for Bryn” is definitely best served by YugaByte DB! In other words, I was seduced by the modernity of YugaByte DB, by its promise to restore SQL’s proper popularity in the world of distributed architecture after an epoch where trade-offs forced the goodness that it brings to be sacrificed, and by the prospect of working at a high-growth startup, with a best-in-class team, where I will play many roles during a single day. I was also attracted by the possibility of working with people whom I know, respect, and trust. Kannan Muthukkaruppan, Co-Founder & CEO, and one of the founding engineers were close colleagues of mine in the PL/SQL Team at Oracle HQ. And many others among my new colleagues have each had one or more stints with Oracle.

All this will be a huge change for me. But, as they say, a change is as good as a rest.

We Are Hiring!

If you’re passionate about solving problems at the cutting edge of transactional databases, distributed systems and cloud native infrastructure, please check out our open positions.

What’s Next?

  • Compare YugaByte DB in depth to databases like CockroachDB, Google Cloud Spanner and MongoDB.
  • Get started with YugaByte DB on macOS, Linux, Docker and Kubernetes.
  • Contact us to learn more about licensing, pricing or to schedule a technical overview.
