Earlier this year I performed my last act as Drizzle liaison to SPI by requesting that Drizzle be removed from the list of active SPI member projects and that about 6000 USD of donated funds be moved to the SPI general fund.
The Drizzle project started in 2008, when Brian Aker and a few other MySQL employees were moved to Sun Microsystems' CTO Labs. In my opinion, the background to why there was demand for such a project was twofold:
1) Even though MySQL was a leading open source product at the time, with a vibrant and opinionated open source community, it was also the first large single-vendor open source ecosystem. This meant that essentially all MySQL code was developed by engineers employed by MySQL itself. While this is not uncommon today, it was unusual at the time, and often came as a surprise to outside community members who tried, and then failed, to contribute code to MySQL.
2) MySQL's engineering department had managed to tie itself into a knot, and progress largely stalled during the 5.0 to 5.1 cycle. When I joined the company, MySQL was in the middle of a 13-month code freeze, trying to stabilize toward a 5.1 release. Crucial improvements, like support for more than 2 CPU cores, were developed outside the company, notably by a team at Google led by Mark Callaghan.
With all the frustrations accumulated both inside and outside MySQL-the-company, Drizzle was received with much enthusiasm. Even Monty Widenius, creator of MySQL, endorsed the project and its more community-oriented approach. As far as I remember, over 50 people contributed to Drizzle in its first year - another testament to how much goodwill and potential the project had in the MySQL community.
Drizzle's approach was to work from a clean slate. Breaking with full MySQL compatibility, the project deleted a lot of the MySQL 5.0 features (stored procedures, triggers...) to arrive at a simpler code base. A few months later, Monty would create MariaDB, and after that Percona would launch Percona Server, both of which remained compatible with the parent MySQL and, crucially, produced regular, stable, production-quality releases. In the end it was these two MySQL forks that filled the void that existed in the MySQL community at the time.
Let's rewrite everything
Drizzle's approach of "rewrite everything from scratch" is a very common lure that engineers around the world dream of. I have heard this so many times at so many companies. Apart from the obvious reason - the "legacy" code base is crap and we will be better off starting from a clean slate - I've heard and observed the following reasons1:
- The current code base is hard for you to understand, and writing your own code from scratch will result in a code base you understand better. Hopefully. (Yet other engineers may be perfectly capable of working with the legacy code base. The question is, why is it hard for you?)
- We should use Go for the new project.
- We should migrate all of our stack from AWS to GCP, said an engineer who had read a blog post claiming that GCP had better performance and might therefore solve a specific performance issue he had failed to fix.
- Writing your own tool is apparently easier than reading the documentation of an existing one that you could have just used.
- Writing your own tool from scratch is more fun. (Yes, one honest engineer really said this out loud once. I appreciated the honesty.)
- Etc...
One benefit - to the engineer, not the rest of the world - of such projects is that they give the engineer several months, maybe years, of time during which they can just happily hack on the clean-slate code base without any pressure from users (or managers) to fix bugs, or, you know, not create them in the first place. If you do decide to rewrite everything in Go, or migrate everything to GCP, then those are huge projects where success can only be evaluated months and years later. By that time the engineer (and their manager?) may have already moved to another company, where they can again start writing something from scratch...
In Drizzle's case it took about 3 years until the first "stable" release. While this is an eternity by today's standards, it's worth pointing out that for MySQL at the time it was also normal - or in any case a fact - to have 3 years between major releases. But even then, this was 3 years during which the Drizzle developers could just enjoy life, hacking on code that did not have a single production user. Later in my career I've also seen that engineers (and their managers) can commonly get away with working 3 years on projects that never reach production capability.
How do I know it didn't have a single production user? When I became the Drizzle SPI Liaison, I also wanted to contribute, and I contributed documentation, because it was largely missing. While writing installation instructions, I found a bug: if you installed Drizzle as an RPM or DEB and started it as root via init.d, then it would crash (when dropping privileges after opening the port). In other words, Drizzle developers would compile drizzled, run the tests, and the tests would pass - but only when starting it as a regular user. The ones who said they were running Drizzle "in production" to power their WordPress blogs must have been running it in a screen session. This bug was a great example of the importance of testing what your users are actually doing!
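For readers who haven't run into this pattern: a daemon that listens on a network port typically binds the socket while still running as root and only then switches to an unprivileged user. The sketch below is illustrative only - it is not Drizzle's actual code, and the "drizzle" user name and the port number are made up - but it shows the branch that only an init.d-style start as root ever exercises, and that a developer running the binary as themselves never does.

```cpp
// Minimal sketch of the usual "bind as root, then drop privileges" startup
// sequence. Not Drizzle's actual code; the user name and port are illustrative.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <pwd.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>

int main() {
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) { perror("socket"); return EXIT_FAILURE; }

  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_port = htons(4427);              // bind while still root
  addr.sin_addr.s_addr = htonl(INADDR_ANY);
  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
    perror("bind");
    return EXIT_FAILURE;
  }

  if (getuid() == 0) {                      // only reached when started as root,
    passwd *pw = getpwnam("drizzle");       // e.g. via init.d - exactly the code
    if (pw == nullptr ||                    // path that never ran in the tests
        setgid(pw->pw_gid) != 0 ||
        setuid(pw->pw_uid) != 0) {
      perror("dropping privileges");
      return EXIT_FAILURE;
    }
  }

  // ... listen() and accept connections as the unprivileged user ...
  return EXIT_SUCCESS;
}
```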
I usually have a strong preference for fixing the production code and rejecting the lure of rewriting everything from scratch. Forcing developers to work on code that is actually running in somebody's production helps keep everyone honest. If you can't work on the production code, why do you think you can create production-quality code from scratch? This preference dates back to my experience with Drizzle.
DB Engines
A Drizzle eulogy would not be complete without some friendly pointing and laughing at the DB Engines ranking. This website produces a monthly score estimating the "popularity" of hundreds of database products, by scraping the web for social media mentions, open job positions, and news articles. It is taken very seriously by many marketing departments at NoSQL startups, which tend to have a KPI to improve their DB Engines ranking.
Even if Drizzle development had already stalled, and even if Drizzle never had even a single production user, for many years Drizzle was far ahead of both MariaDB and Percona Server on the DB Engines list. To those of us who noticed this, and understood what it meant, it served as a good reminder of how seriously to take the monthly rankings.2 And to all the marketing departments who do take DB Engines seriously: if a handful of Drizzle hackers can create more positive publicity than you, then what is it that you are missing?
Acknowledgements
I would like to publicly thank Josh Berkus, who at the time served as an SPI board member and also a PostgreSQL lead developer, for sponsoring Drizzle in our SPI membership application. Your friendship and collegiality toward other open source database projects were always much appreciated.
nits
Great post.
For InnoDB:
* the InnoDB problem was scaling beyond 1 socket. It wasn't horrible with 4 cores on 1 socket. It was horrible going to 2 sockets.
* while we fixed many things in InnoDB, fixing the InnoDB rw-lock was first done by Yasufumi Kinoshita, and his work motivated my team at Google. Our diff was easier for upstream and included a formal model using http://spinroot.com/spin/whatispin.html, which might explain why our version made it upstream. AFAIK he now works on upstream InnoDB, which is awesome for the community. I think I first learned of him from an amazing presentation at Percona Live.
Writing the new thing from scratch is also perceived as a way to have more impact and get promoted faster. I think this is usually incorrect as impact = %improvement * %usage and a small improvement on a widely used thing == much impact. But that is my bias. The obvious way to get promoted faster is to join a small startup that turns into a successful company. Alas, that is hard to know up front. Another route to success is to find a project with fixable & low-hanging fruit. While many projects have low-hanging fruit, many of those problems aren't fixable for technical and social reasons.
While I love db-engines I suspect it overvalues the past, as so many dead/dying DBMSs on that list don't change much in their ratings.
Refactor heavily rather than rewrite
I did like that we pretty much stuck to “refactor not rewrite”, and design towards being able to add features back in with more modular code at a later date.
I’ve done a lot of thinking over the years on where and how we went wrong. It’s something that probably warrants its own blog post, but whenever I sit down to write it I don’t want to; it makes me sad.
We probably should have been aggressively courting production users earlier than we did, and just living with the fact we’d have to deal with upgrades and migration. Also, without a doubt, when the overwhelming majority of core developers go to work for the one company, you add a lot of risk.
Correct. Drizzle didn't start
Correct. Drizzle didn't start from scratch in the "rewrite everything in Go" sense. I just thought this was a good opportunity to reflect a bit more widely on this pattern, so I included thoughts beyond just what was relevant to Drizzle. But with the benefit of hindsight, I do think it was a mistake that Drizzle broke so much with the legacy MySQL that it lost its connection with real production users. In that sense it carried the same risk as full rewrites do, and that risk did materialize.
Another side of the coin: maintenance debt
Henrik, you are mostly spot on. Especially your first point reminds me of Henry Spencer's USENET signature: "Those who do not understand Unix are condemned to reinvent it, poorly."
A related observation is that it is often easier to add a new layer of code, or perhaps duplicate and modify existing code, than to thoroughly understand all the underlying code so that it could be refactored to fulfill the new requirements. In MySQL 5.7, one example of that is the SPATIAL INDEX implementation for InnoDB. It is a mix of both duplicated code (the files in the directory storage/innobase/gis mostly duplicate some files in storage/innobase/row) and modification of lower-level functions (introducing conditions to distinguish B-trees and R-trees).
I might even claim that new layers of code sometimes introduce maintenance debt, by making code harder to read and maintain, and slower to execute. An example of slowdown is the introduction of the function trx_get_id_for_print() in MySQL 5.7. It was mechanically added to every code path, even to code paths where it is obviously redundant (because a unique transaction identifier is known to exist). I think that in cases like this, it would be better to spend some more thought on design, to try to achieve zero-overhead abstractions. For example, a Boolean template parameter to an inline function could be used, so that in debug builds we can assert that a transaction identifier has been assigned, while in release builds we would assume that it has been assigned.
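To make that concrete, here is a minimal standalone sketch of the idea - not actual InnoDB code; the struct, the field, and the function name are simplified stand-ins:

```cpp
#include <cassert>
#include <cstdint>

using trx_id_t = uint64_t;

struct trx_t {
  trx_id_t id = 0;  // 0 means "no transaction identifier assigned yet"
};

// A Boolean template parameter selects the contract at compile time.
// With id_is_assigned == true, debug builds assert the invariant and
// release builds compile the call down to a plain field read, so the
// abstraction costs nothing on code paths where the id is known to exist.
template <bool id_is_assigned>
inline trx_id_t trx_id_for_print(const trx_t &trx) {
  if (id_is_assigned) {
    assert(trx.id != 0);
    return trx.id;
  }
  return trx.id;  // may legitimately still be 0 on this code path
}

// Usage:
//   trx_id_for_print<true>(trx);   // caller knows the id has been assigned
//   trx_id_for_print<false>(trx);  // caller is not sure
```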
I believe that some refactoring is necessary from time to time to take advantage of emerging technologies. Refactoring could enable an interface or a compiler optimization that improves performance, or it could enable some compile-time instrumentation that allows more errors to be caught during testing (say, interfacing custom memory management with Valgrind, AddressSanitizer, or MemorySanitizer, or custom synchronization primitives with ThreadSanitizer).
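As one concrete illustration of the memory-management case - a generic sketch, not InnoDB's or MariaDB's actual allocator - a custom pool can tell AddressSanitizer which of its bytes are currently handed out, so that stray accesses to "free" blocks are caught during testing:

```cpp
#include <cstddef>
#include <cstdlib>

// Use the manual poisoning interface when building with AddressSanitizer,
// and compile to no-ops otherwise.
#ifndef __has_feature
#define __has_feature(x) 0
#endif
#if defined(__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer)
#include <sanitizer/asan_interface.h>
#else
#define ASAN_POISON_MEMORY_REGION(addr, size)   ((void)(addr), (void)(size))
#define ASAN_UNPOISON_MEMORY_REGION(addr, size) ((void)(addr), (void)(size))
#endif

// Toy fixed-size block pool: blocks that the pool considers free are kept
// poisoned, so any access to them is reported much like a use-after-free.
class BlockPool {
 public:
  static constexpr size_t kBlockSize = 256;

  explicit BlockPool(size_t blocks)
      : blocks_(blocks),
        base_(static_cast<char *>(std::malloc(blocks * kBlockSize))) {
    ASAN_POISON_MEMORY_REGION(base_, blocks_ * kBlockSize);  // nothing handed out
  }

  ~BlockPool() {
    ASAN_UNPOISON_MEMORY_REGION(base_, blocks_ * kBlockSize);  // unpoison before free()
    std::free(base_);
  }

  void *acquire(size_t i) {
    void *p = base_ + i * kBlockSize;
    ASAN_UNPOISON_MEMORY_REGION(p, kBlockSize);  // block is now live
    return p;
  }

  void release(size_t i) {
    ASAN_POISON_MEMORY_REGION(base_ + i * kBlockSize, kBlockSize);  // catch later use
  }

 private:
  size_t blocks_;
  char *base_;
};
```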
I agree that disregarding compatibility is very dangerous, because not only could it alienate potential users, it could also invalidate large amounts of pre-existing regression tests.
Since I joined MariaDB Corporation in 2016, we have been working on InnoDB, refactoring and partially rewriting some things. We have tried to maintain compatibility with earlier MariaDB and MySQL versions whenever possible.
Finally, I would like to reply to Mark Callaghan. Yes, I remember the SPIN model of your refactoring of the InnoDB RW-latch, and was happy to see some use of formal methods in the wild. (Before joining Innobase Oy in 2003, for my doctoral thesis I worked on a model checker that had some features similar to SPIN.) Sadly, I was not assigned to work on the task, so I am not aware of what exactly your team improved back then. Yasufumi Kinoshita later joined Oracle, and for MySQL 5.7 he improved the InnoDB RW-latches with a new SX-latch mode that allows some fields of an index root page to be modified while other parts of the root page are S-latched. I phrased the "High Level Architecture" in https://dev.mysql.com/worklog/task/?id=6326 before the "Examples" section. The design is Yasufumi's.
Thanks Marko! These are great
Thanks Marko! These are great additional viewpoints from an InnoDB point of view.
I have also worked with code that was copied from a sibling directory, with the justification that not touching the existing code decreases risk.
rw-lock
I wrote about the InnoDB rw-lock at http://smalldatum.blogspot.com/2019/12/fixing-innodb-rw-lock.html
I hope to use Spin again.
Replacing the locks
In MariaDB Server 10.6, we replaced InnoDB’s implementation of mutexes and events with conventional mutexes and condition variables. This work also involved replacing most use of InnoDB rw-locks with a simple non-recursive rw-lock (using SRWLOCK on Microsoft Windows, std::atomic and futex on Linux or OpenBSD, and a generic rw-lock on other systems). For the index tree latch and buffer pool page latches, I concluded that we must continue to support recursion and the special lock mode that Yasufumi Kinoshita called SX mode. Hence, we have a futex-based sux_lock (Shared, Update, eXclusive) and a wrapper that supports recursive U and X lock acquisition as well as upgrading U locks to X.
The fact that SRWLOCK or the futex-based rw-lock fits in a machine word allowed another improvement: pushing down hash table locks into the hash array, using one latch per cache line. Thus, acquiring a latch on a hash table cell will, as a byproduct, also load the protected data into the cache. This was partly covered in my recent talk: https://fosdem.org/2021/schedule/event/mariadb_buffer_pool_improvements/
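To illustrate the layout - a rough sketch of the general technique, assuming a typical 64-bit target with 64-byte cache lines, not the actual MariaDB buffer pool code - a word-sized lock can share a cache line with the hash cells it protects, so taking the latch already pulls the protected pointers into the cache:

```cpp
#include <atomic>
#include <cstdint>

// One latch per cache line: a word-sized lock followed by the hash chain
// heads it protects, packed into a single 64-byte cache line. The real
// implementation would use SRWLOCK or a futex-based lock rather than the
// trivial spin lock shown here.
struct alignas(64) HashCacheLine {
  std::atomic<uint32_t> latch{0};
  void *cells[7] = {};  // hash cells guarded by (and co-located with) the latch

  void lock() {
    uint32_t expected = 0;
    while (!latch.compare_exchange_weak(expected, 1, std::memory_order_acquire))
      expected = 0;  // simple spin; stands in for the real word-sized lock
  }
  void unlock() { latch.store(0, std::memory_order_release); }
};

static_assert(sizeof(HashCacheLine) == 64,
              "latch and its cells fit in one cache line on an LP64 target");
```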