The Distributed SQL Blog

Thoughts on distributed databases, open source, and cloud native

Why I Joined Yugabyte

Databases are omnipresent

Back in 2016 I started at Nutanix, fresh out of graduate school. This was barely a couple of weeks after two of Nutanix’s employees, Karthik and Kannan, left to start Yugabyte with Mikhail Bautin. They were still the subject of a few watercooler chats at Nutanix, mainly because of their bold decision to enter the database market, one that is definitely not known for being easy.

It soon became apparent to me why databases are hard. In my five-year stint at Nutanix, I worked on and used nearly half a dozen databases. It seemed like we needed a different database for everything. At first, we needed a consistent, distributed database for our file system metadata. Later, we needed one that could give us transaction support to manage all of our config entities. After that, we started to use yet another database for time series stats. On top of all that, we needed a relational database for identity management. Nutanix didn’t exactly have a database in its portfolio at the time, yet I was just one of dozens of engineers there who built databases. It didn’t take a genius to understand how important databases were and how ridiculously difficult it was to build one.

How I learned about Yugabyte

Around this time, Kannan, Karthik, and Mikhail came to Nutanix to give a talk about Yugabyte. The database seemed to address a wide set of requirements – reliability, consistency, a rich query layer, and transactions, all at high scale and without sacrificing performance. Needless to say, the room was packed. The Yugabyte co-founders had no trouble hitting the numerous deep questions out of the park, and I realized no one was surprised by this. Many of the Nutanix employees had personally worked with Kannan and Karthik, and their engineering proficiency was old news. That the database implementation would be excellent was a given. The icing on the cake – the database was completely open source.

Starting off with open source

I was generally happy with my career at Nutanix, but one of the things on my bucket list was to contribute to open source, a desire I had harbored since undergrad. This came up when I was talking to Karthik one day, and he naturally suggested that I consider contributing to YugabyteDB. A quick email to Nutanix Legal confirmed what I already knew – Nutanix was supportive of its employees contributing to open source. I started off with a small change to expose the number of dead/unresponsive YB-TServers as a Prometheus metric from the YB-Master. I knew Yugabyte was a small startup at the time, but it still felt really cool to have someone like Kannan review my changes!

Row level geo-partitioning

As I was looking at other areas to explore in YugabyteDB, Karthik proposed a very interesting project to contribute to – Row Level Geo-Partitioning, i.e. the ability to pin individual rows of a user-created table to geographic locations. As I was relatively new to the world of relational databases, the scope of this project seemed mind-boggling to me! However, when the Yugabyte team split it into individual deliverables, I was really pleased at how neatly such a huge project divided into small pieces, each one a customer-visible deliverable.

Working with Yugabyte code

Yugabyte’s YSQL layer is built using a fork of PostgreSQL 11.2. The first phase of my project was to enable a well-known PostgreSQL feature called Table Partitioning and ensure that it works in YugabyteDB. Every issue I faced with dev-environment setup was promptly addressed in the YugabyteDB community Slack channel.

Since the company was still a three-year-old startup at the time, I expected to find hard-to-understand, hard-to-test, hacky code spread everywhere. Instead, I was completely blown away by how neat the entire codebase was. I didn’t have any difficulty finding anything I needed. Even for someone who was new to Yugabyte, Postgres, and SQL itself, it was relatively easy to get the Table Partitioning project up and running.

As for tests, I was stunned to find that even command-line tools meant for internal use had a decently exhaustive set of unit tests. “Detective” (or Mikhail’s “weekend project”) is the coolest piece of test infrastructure. Every patch submitted for code review has all unit tests run against it across a complex matrix of platforms (CentOS/Ubuntu/Mac), compilers (GCC/Clang), and build types (debug/release/ASAN/TSAN), so the committer knows whether any unit tests are failing because of their patch. There are also tools that identify which set of commits broke a given unit test. These tools make the arduous process of preventing regressions and unit test failures far more manageable than I would have expected.
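To get a feel for how quickly such a matrix grows, here is a rough Python sketch. It is not how Detective works internally: the function names `build_matrix` and `failing_configs` are made up for illustration, and the assumption that every platform is paired with every compiler and build type is mine (a real setup likely skips some combinations). Only the axis values come from the description above.

```python
from itertools import product

# Axis values come from the post; the full cross product is an assumption.
PLATFORMS = ["CentOS", "Ubuntu", "Mac"]
COMPILERS = ["GCC", "Clang"]
BUILD_TYPES = ["debug", "release", "ASAN", "TSAN"]


def build_matrix(platforms, compilers, build_types):
    """Return every (platform, compiler, build type) combination."""
    return list(product(platforms, compilers, build_types))


def failing_configs(results):
    """Given a {config: passed} mapping, return the configs that failed."""
    return sorted(cfg for cfg, passed in results.items() if not passed)


matrix = build_matrix(PLATFORMS, COMPILERS, BUILD_TYPES)

# Hypothetical outcome for one patch: everything passes except one TSAN build.
results = {cfg: True for cfg in matrix}
results[("Ubuntu", "Clang", "TSAN")] = False

print(len(matrix))               # 3 * 2 * 4 = 24 configurations
print(failing_configs(results))  # [('Ubuntu', 'Clang', 'TSAN')]
```

Even with these three short lists, a single patch fans out into 24 separate test runs, which is why automated reporting of the failing configurations matters so much.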

Working with the Yugabyte team

Once my code was up and running, I presented Table Partitioning to the Yugabyte team, and we got into discussions on how we would implement the next steps for Row Level Geo-Partitioning. I worked with Neha and Mihnea from the YSQL team, and with Bogdan and Rahul from the DocDB team. Technology and code aside, this was my first foray into the company culture itself.

That they were talented and extremely smart engineers came as no surprise. However, I also liked how well structured and organized each and every meeting was. Every engineer displayed a very clear thought process, answered questions patiently, and never failed to show appreciation for every small contribution I made. The super cool Community Hero swag that I received for my contribution didn’t hurt either.

Joining Yugabyte

As much as I liked my career at Nutanix at the time, I realized that in my heart of hearts, my decision was already made. I was ready to move heaven and earth to join Yugabyte, and was preparing for the arduous interview process I expected them to throw at me. Yugabyte, however, has always been a picture of efficiency. When I expressed interest in joining, I received an offer straight away, given that my open source contributions had already demonstrated my ability to add value to the company.

On my first official day at work I received a pleasant surprise – the Yugabyte blog had just showcased my work on Row Level Geo-Partitioning! Promotions, awards, and company all-hands announcements are great, but your workplace pride really skyrockets when you see your work on a public-facing blog (and in the hands of customers!).

Over the past 3 months I have been working on completing the story on Row Level Geo-Partitioning. It is with tremendous pride that I start my work every day. Yugabeings differ widely in their thought processes and backgrounds, but we are all united in how motivated and invested we are in the company’s success and the success of our customers.

This is not to say that every day at Yugabyte has zero challenges! A growing customer base means a growing list of expansive feature requests. We are also expanding now, and working on streamlining our onboarding processes for new hires, and our build and test processes for new code. You could be the next person who helps us with all of the above! I can’t tell you it will be easy, but it will be more than worth it 🙂
