YugabyteDB Open Source Community Spotlight – October 2021
The Yugabyte community is always active and its members are constantly having interesting conversations and making valuable contributions. We spotlight members of the community to recognize their contributions to making the Yugabyte community a great place.
Warren Wilfred L. Cruz, Data Solutions Architect @ EXIST Software Labs, Inc.
If you’re on the Yugabyte community Slack, you should be familiar with Warren Cruz. He’s consistently active, with notable, demonstrated expertise in migrating to YugabyteDB. He has also written about YugabyteDB more than once, including “The Future of the Database: YugabyteDB” and “A Fully Dockerized MySQL to YugabyteDB Migration Strategy Using pgloader.” Warren is a veteran software developer, dev team lead, and database administrator. Currently, Warren is a Data Solutions Architect at EXIST Software Labs, where he focuses on implementing enterprise database and data warehouse solutions for clients across industries.
Tell us about yourself
How long have you been coding? What got you interested? What languages/DBs have been your focus?
I began my career in the IT industry way back in 1998 when I was hired by one of the universal banks in the country [Philippines]. I was given formal training in Visual Basic 5, but the open solutions department actually used the just-released Visual Basic 6 as its main desktop database apps programming language, with MS Access as the database “backend”. I developed the “More Than Double Your Money” system, which was rolled out across the country (10 diskettes!), along with other desktop database apps.
My next projects were with other banks. In one of them, the Allied Bank risk management project, I used Informatica PowerCenter as the ETL tool to extract departmental data from various sources and collate it into a SQL Server 2000 data warehouse.
The longest Unisys project that I was involved in was with the National Statistics Authority, where I was deployed for 11 years. My main role and function was as Team Leader and Senior Database Administrator. I was tasked with ensuring database performance, availability, and reporting along with data conversion tasks. Our database backends ranged from SQL Server 2000 to 2012 and 2016 with some Postgres.
Finally, I became part of the EXIST family, and this is where I am now. I’ve been involved in projects in the energy, banking, manufacturing, and telecoms industries, implementing enterprise database solutions through EDB Postgres and a particular implementation of Community Postgres which I dubbed “PostgrEX” (Postgres EXIST Enterprise Xpertise). I got certified in EDB Postgres a little over a year after joining the company. I’ve also implemented Greenplum data warehouse solutions in a few projects and have implemented YugabyteDB in production for two projects now.
Describe your database experience
Which databases have you worked with before? Which do you currently work with? What pain points have you encountered?
My first dealings with the “database” were with MS Access in desktop database application development. After this, I moved on to a full-blooded RDBMS with SQL Server 2000, which I used a lot for many years. My MCSD certification majored in database development with SQL Server 2000. I also got to use versions 2012 and 2016. One of the major pain points I encountered with SQL Server was in the area of index optimization. Before I settled on the drop-and-recreate method, which proved far faster, reindexing very large tables could go on for days!
I mainly work with Postgres and its derivatives now. In the enterprise database space, I presently implement Community Postgres, PostgrEX, EDB Postgres, and YugabyteDB, ensuring high availability, automatic failover, backup management, and monitoring. Some lament the lack of multi-threading in Postgres, but I think this dedication to process-safety is one of the reasons Postgres has earned a reputation for being the robust and dependable RDBMS that it is, garnering DB of the year awards for many years.
In the modern data platform and data analytics space, I mainly implement Greenplum and have done proofs of concept (POCs) on Postgres-XL.
How did you first hear about YugabyteDB?
What caught your attention? Why was it interesting? Were you trying to solve a problem?
I first got involved with YugabyteDB through one of our close partners in our Greenplum practice who moved to YugabyteDB. I was given the task of learning the platform, and I found the documentation to be very helpful. The videos on Vimeo and YouTube were also instrumental in my gaining knowledge and experience in the solution.
User scalability has been a recurring problem in many of our enterprise database implementations. I researched 2ndQuadrant’s Postgres-BDR in order to find a good multi-master, sharding replication solution, but, as it turns out, it falters in the resiliency department (lose one node and everything is over). YugabyteDB addresses the issue of user scalability well: aside from providing multi-master functionality, its replication mechanisms ensure that node failures can be tolerated quite comfortably.
What has your experience been like with YugabyteDB?
What were your expectations? What have you been focusing on? What are your thoughts about it?
I have had only pleasant experiences with YugabyteDB and the team. As mentioned above, the online resources and documentation are phenomenal. The Slack group is also indispensable in helping new adopters add YugabyteDB to their arsenal of enterprise database solutions more quickly.
I have deployed YugabyteDB in two production environments for two different clients, and I have discovered that any roadblock I encounter can easily be overcome through the official documentation and other online sources that detail the experiences of other users.
Describe what surprised you the most
Technically, operationally, etc.
I really reveled in the fact that any node can be a master! While I haven’t POC’d geographically separated nodes, the architecture itself is conducive to a kind of user experience that has long been desired but is only now being realized.
In terms of being surprised, I was kinda caught off-guard by some potential performance problems that accepting the default sharding setting can introduce. If you have many, many tables and each is sharded into 8 pieces, the RPC calls between shards needed to maintain consistency can become a bottleneck. Good thing that this can easily be tweaked through a gflag.
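As a rough illustration of why the default can add up, the sketch below multiplies out the number of tablet replicas a cluster has to manage. It assumes, as a simplification, that every table is split into the same number of shards and that each shard is copied once per unit of replication factor. The function names and the example figures (500 tables, replication factor 3, three nodes) are hypothetical and not taken from any real deployment.

```python
import math


def total_tablet_replicas(num_tables: int, shards_per_table: int,
                          replication_factor: int) -> int:
    """Tablet replicas in the cluster, assuming every table has the
    same shard count (a simplifying assumption for illustration)."""
    return num_tables * shards_per_table * replication_factor


def replicas_per_node(total_replicas: int, num_nodes: int) -> int:
    """Replicas hosted per node if they are spread evenly, rounded up."""
    return math.ceil(total_replicas / num_nodes)


if __name__ == "__main__":
    # Hypothetical cluster: 500 tables, replication factor 3, 3 nodes.
    for shards in (8, 2):
        total = total_tablet_replicas(500, shards, 3)
        per_node = replicas_per_node(total, 3)
        print(f"{shards} shards/table: {total} replicas, {per_node} per node")
```

Under these assumptions, going from 8 shards per table to 2 cuts the replica count from 12,000 to 3,000, which is the kind of reduction that lowering the shard setting is meant to achieve.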
Have you built or migrated any applications to YugabyteDB?
Why? What were your expectations? Has YugabyteDB delivered on them?
In terms of migration, I have successfully migrated MySQL and SQL Server to YugabyteDB using pgloader. If I’m not mistaken, it was my communication with some of the YB team members in Slack about pgloader that prompted the team’s development of its own fork of pgloader. I also reported a bug in the use of pgloader for SQL Server to Postgres migration, which tipped off the YB team. They have confirmed this bug in pgloader and are on the way to correcting it in the YB fork.
As mentioned, two clients are already using YugabyteDB in their newly-developed applications, and both are keen on migrating some of their legacy apps to YB as well.
What future plans do you have for YugabyteDB?
Contributions to the code base, building other projects with YugabyteDB as a backend, etc.
Our company, EXIST Software Labs, Inc., currently plans to migrate existing Postgres enterprise database applications to YugabyteDB. We are also actively involved in positioning YB in new application development initiatives.
In the future, I see more Kubernetes-based deployments of YugabyteDB and more cloud utilization. My current experience has been mostly with locally-hosted VM environments.
What product feature/enhancement are you most looking forward to?
You can always find our up-to-date roadmap on GitHub.
In terms of Day 2 Ops support, I hope more allowance is made for DDL (data definition language) changes that are even closer to native Postgres, notwithstanding the distributed nature of the underlying storage. This would really make migration to YB a lot easier in terms of developer adoption and change management.
Have you looked at other distributed SQL or distributed databases in addition to YugabyteDB?
In your opinion, what sets YugabyteDB apart from the rest?
As mentioned above, I did look into Postgres-BDR as one of the options for a distributed database implementation with an emphasis on user scalability in terms of multi-master accessibility. The key weakness of BDR is its lack of resilience, in that a single node failure brings down the whole cluster.
This is where YugabyteDB is a cut above the rest: its replication mechanisms keep the cluster available through node failures, with the number of failures tolerated depending on the replication factor chosen. YugabyteDB will definitely meet your five nines SLA requirements!
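To make the link between replication factor and fault tolerance concrete, here is a minimal sketch. It relies on the standard majority-quorum rule of Raft-style replication, where a group of replicas stays available as long as a majority of them survive. The function names are illustrative, and the downtime figure is plain arithmetic on a 365-day year rather than a measured result.

```python
def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be lost while a majority quorum still exists."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return (replication_factor - 1) // 2


def allowed_downtime_minutes_per_year(availability: float) -> float:
    """Downtime budget implied by an availability target such as 0.99999."""
    minutes_per_year = 365 * 24 * 60
    return minutes_per_year * (1.0 - availability)


if __name__ == "__main__":
    for rf in (1, 3, 5):
        print(f"RF={rf}: tolerates {tolerated_failures(rf)} failure(s)")
    budget = allowed_downtime_minutes_per_year(0.99999)
    print(f"five nines budget: {budget:.2f} minutes/year")
```

With a replication factor of 3 one replica can fail, and with 5 two can, while a five nines target leaves a little over five minutes of downtime per year.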
How would you describe YugabyteDB to somebody just starting the process of finding a new database?
What advice would you give them?
I would tell anyone looking to modernize their enterprise database implementations that being cloud-ready is the name of the game now. Being able to deploy your applications anywhere with your database backend having the same freedom, along with 24/7 accessibility, is key.
With YugabyteDB, your database backend can be deployed in any form factor, be it on-premises or in the cloud, on bare metal servers or virtual machines, in traditional software packages or unmanaged/managed containers through Docker/Kubernetes, and be accessible by your geographically-dispersed users through a global database cluster having nodes in strategically-placed geographical regions.
Anything else you would like people to know?
Oracle-to-Postgres migration is one of the key strategies that many businesses are now employing in order to streamline costs and divert funds to the core business. With this in mind, YugabyteDB is perfectly poised to fill this need.
You can start by migrating your in-house apps to Community Postgres first in order to get a feel for how natural the process can be. If this works out, you can easily migrate your Community Postgres to YugabyteDB with the use of a simple built-in backup utility – ysql_dump – that takes a YB-compatible dump of your Community Postgres database. You can then restore this dump to a waiting YB cluster. From then on, all your new app development can be done in YugabyteDB.
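The dump-and-restore flow described above might look roughly like the following sketch, which only builds the two command lines and prints them rather than running anything. The host names, database name, and file name are hypothetical placeholders, and the ports and users are the usual defaults (5432 for Postgres, 5433 and the `yugabyte` user for YSQL). Whether `ysql_dump` can connect to a particular Postgres server version is something to verify before relying on this.

```python
import shlex
from typing import List


def build_dump_command(host: str, port: int, user: str,
                       database: str, dump_file: str) -> List[str]:
    """ysql_dump invocation that writes a plain SQL dump to dump_file."""
    return ["ysql_dump", "-h", host, "-p", str(port),
            "-U", user, "-d", database, "-f", dump_file]


def build_restore_command(host: str, port: int, user: str,
                          database: str, dump_file: str) -> List[str]:
    """ysqlsh invocation that replays the SQL dump into a YB cluster."""
    return ["ysqlsh", "-h", host, "-p", str(port),
            "-U", user, "-d", database, "-f", dump_file]


if __name__ == "__main__":
    # Hypothetical source (Community Postgres) and target (YugabyteDB).
    dump = build_dump_command("pg.example.internal", 5432,
                              "postgres", "appdb", "appdb.sql")
    restore = build_restore_command("yb.example.internal", 5433,
                                    "yugabyte", "appdb", "appdb.sql")
    print(shlex.join(dump))
    print(shlex.join(restore))
```

The target database is assumed to exist already on the YB cluster; the printed commands can be reviewed and then run by hand or passed to `subprocess.run`.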