My Yugabyte Journey: From Intern to Full-Time Software Engineer

Tim Elgersma

Hello. My name is Tim Elgersma, and I’m a software engineering intern on the YSQL team at Yugabyte. I have one semester left in my Bachelor of Computer Science program at the University of Waterloo. In this blog post, I’d like to talk about my experience interning here over the past several months, and why I’m excited to join the company full time upon graduation.

Adding tablespaces to tablegroups

My onboarding at Yugabyte went pretty smoothly. I learned a bit about the codebase, how YugabyteDB is set up, and how YugabyteDB compares to PostgreSQL. My mentor, Deepthi Srinivasan, assigned me a small bug to become familiar with YugabyteDB’s testing and code review processes.

After that, Deepthi and I went over a list of a dozen possible projects. She gave me a high-level explanation of each project. There were some pretty interesting options, and it was nice to have the choice given to me. I ended up choosing to implement tablespaces on tablegroups.

Tablespaces are well explained here. Simply put, they allow users to define a specific cloud, region, and/or zone for a table. This means that data can be stored near where it’s used, fault tolerance can be configured, and GDPR requirements can be met.

Tablegroups are a beta feature in YugabyteDB that allows co-locating multiple tables on a single tablet. Since each tablet has some network, storage, and memory overhead, there is a limit to the number of tablets that can fit on a node. Sharing tablets between tables allows us to increase the number of tables per node.
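To make the tablet math concrete, here is a rough sketch in Python. The function names and the numbers (1,000 tables, 4 tablets per table, 10 tablegroups) are made up for illustration, and the sketch ignores replication, so treat it as back-of-the-envelope arithmetic rather than how YugabyteDB actually accounts for tablets.

```python
def tablets_without_tablegroups(num_tables: int, tablets_per_table: int) -> int:
    """Every table is split into its own set of tablets."""
    return num_tables * tablets_per_table


def tablets_with_tablegroups(num_tablegroups: int) -> int:
    """All tables in a tablegroup share a single tablet."""
    return num_tablegroups


# Hypothetical workload: 1,000 small tables, each split into 4 tablets.
separate = tablets_without_tablegroups(num_tables=1000, tablets_per_table=4)

# The same 1,000 tables packed into 10 tablegroups.
grouped = tablets_with_tablegroups(num_tablegroups=10)

print(separate, grouped)
```

Since the per-tablet overhead is paid once per tablet, fewer tablets for the same number of tables leaves room for more tables on each node.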

Coming up with a design document

I came up with a design document after a few days of poking around the code base. I spent time searching for relevant details, asking questions, and reading (surprisingly good!) documentation. Deepthi scheduled a design review for me, and after addressing some comments, I was ready to start on the project.

The implementation involved several steps, and I added a few more tasks to my to-do list when I found some simple things missing. Here is what I ended up doing:

  • Created a migration for a YugabyteDB system table to rename the pg_tablegroup table to pg_yb_tablegroup and add a tablespace_id column.
  • Enhanced the grammar to allow setting a tablespace on a tablegroup and to enable DROP CASCADE for tablegroups.
  • Improved the cluster load balancer to store the tables in a tablegroup in the correct location for the tablegroup.
  • Updated our internal dependency logic to handle a few new cases introduced by tablegroups.
  • Updated pg_dump and pg_dumpall to work with tablegroups.
  • Added support for tab-completion for tablegroups.

I touched a wide range of areas of the database to implement this feature, which was fun. I ran into some interesting problems and got to pick the brains of some SQL geniuses.

Selectivity support

During one of our weekly YSQL team meetings, a presentation on query planning caught my eye. Query planning looked like a really interesting and challenging topic: figuring out what the user wants and how to do it faster. I mentioned to Deepthi and Sushant that I thought this was cool. They then started asking around for any query planning projects that I could do in my last month at Yugabyte.

The topic of selectivity came up, so I started reading through some very helpful Postgres documentation on how it works. Then I met with Tanuj, who walked me through Yugabyte’s cost functions and where they fell short. Table statistics had recently been enabled but were not yet used. Our query plans were costed using a very basic heuristic-based approach. For example, we guessed that a query would return 1, 10, 100, or 1000 rows.

The topic was different from my previous project, so there were new things to learn, which was exciting. I wrote a document that explained the state of the world, and came up with a simple POC that uses Postgres’s selectivity logic in two places: when evaluating the selectivity of indexes with restriction clauses, and when estimating the number of rows returned by a relation with restriction clauses. These two fixes can have a big impact on plan selection, since more efficient plans mean faster query execution.
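For readers who haven’t run into selectivity before, here is a minimal Python sketch of the general idea, not the actual Postgres or YugabyteDB code. It assumes a deliberately simplified model: an equality clause on a column matches roughly 1 / (number of distinct values) of the rows, clauses are treated as independent so their selectivities multiply, and the row estimate is never allowed to drop below 1. The `ColumnStats` class and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ColumnStats:
    """Hypothetical, heavily simplified per-column statistics."""
    n_distinct: int  # number of distinct values in the column


def eq_selectivity(stats: ColumnStats) -> float:
    """Fraction of rows expected to match `column = constant`,
    assuming values are evenly spread across the distinct values."""
    return 1.0 / stats.n_distinct


def combined_selectivity(selectivities: Iterable[float]) -> float:
    """Combine clauses joined by AND, assuming they are independent."""
    result = 1.0
    for s in selectivities:
        result *= s
    return result


def estimate_rows(table_rows: int, selectivities: Iterable[float]) -> int:
    """Estimated rows returned: table size times combined selectivity,
    clamped so the estimate is at least one row."""
    return max(1, round(table_rows * combined_selectivity(selectivities)))


# Hypothetical table with 1,000,000 rows and two equality restrictions.
country = ColumnStats(n_distinct=100)
status = ColumnStats(n_distinct=50)

rows = estimate_rows(
    1_000_000,
    [eq_selectivity(country), eq_selectivity(status)],
)
print(rows)  # 200
```

Compared with picking a fixed guess such as 1, 10, 100, or 1000 rows, an estimate like this changes with the table and the clauses, which gives the planner a better basis for comparing an index scan against other plans.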

Wisdom within

Yugabyte started doing a biweekly session called “Wisdom Within” during my internship. Every second Wednesday, the interns would have an hour-long meeting with a senior leader at Yugabyte to just ask questions about anything. We could take the conversations in any direction interesting to us, so I asked Suda Srinivasan, VP of Product Strategy and Solutions, if he thought Yugabyte was an ethical company. The conversation resulted in our director of HR telling me that Yugabyte would apply for World’s Most Ethical Companies next year. It’s really exciting to work for a company that not only wants to be ethical, but is also willing to have a third party evaluate and critique it in a range of different areas.

Exchange and extensions

Back when I interviewed with Yugabyte, I was planning on completing my degree with a semester abroad starting in February 2021. Yugabyte had told me that they could be flexible in extending my internship for the month of January. Those plans got canceled in September, and then uncanceled in mid-November, so we went ahead with the extension. To give me some time to enjoy the holidays and pack, we extended my contract at three days per week for the month of January. Yugabyte’s flexibility here was awesome.

Reflections

I worked on some pretty impactful features, and got to choose what I worked on. There were a ton of cool options, and it was also fun to see Sushant wanting to learn about query planning alongside me. Deepthi created Yugabyte’s tablespaces, and I got to learn a lot from her as I built on top of her work.

I was able to learn a ton during this internship. Before this, I didn’t have much database experience beyond using them in CRUD applications, and I had never taken a database course. I worked with a lot of really smart and helpful people, I was able to ask questions about anything I was curious about, and I got to see a pretty substantial subset of Yugabyte’s code base. My understanding of databases changed from magically fast black boxes to some well-written and understandable code.

I really enjoyed my time at Yugabyte. I’m happy I’ll be back full-time soon!

Want to join a dynamic company in growth mode? We are currently hiring for a number of open positions across Yugabyte. Discover your next opportunity today!
