Notes from MySQL Conference 2012 - Part 2, the hard part

This is the second and final part of my notes from the MySQL conference. In this part I'll focus on the technical substance of talks I saw, and didn't see.

More than ever before, I was a contributor rather than an attendee at this conference. Looking back, this meant seeing fewer talks than I would have wanted to, since I was speaking or preparing to speak myself. Sometimes it was worse than speaking: for instance, I spent half a day picking up pewter goblets from an engraving shop... (congratulations to all the winners again :-) Luckily, I can make up for some of that by going back and browsing the slides. This is especially important whenever 2 good talks are scheduled in the same slot, or in the same slot as my own talk. So I have categorized topics here along various axes, but also along the "things I did see" versus "things I missed" axis.

My own talks

Tutorial: Evaluating MySQL High Availability alternatives
Using and Benchmarking Galera in Different Architectures

I'd say both the HA tutorial (with Ben Mildren) and the Galera benchmarking talk (with Alex Yurchenko) went well and were well attended (about 40 people). About 7 people stayed for questions after the Galera talk, and the questions went on for another 20 minutes. All in all it felt like people were really interested in Galera and we were able to answer their detailed questions.

For my Drizzle Day talk I did a live demo of the JSON Server. There were no slides, but I recently blogged about it here and here.

The hallway track (OpenShift, MySQL refactoring)

They say the hallway track is the best track. I suppose it's especially true once you reach a certain level of experience and skill, because you will already know the content of most talks. There were many good talks to learn from in this conference, but even so, some key teachings came from the hallway track:

Ironically, the most significant piece of information for me was not at all about MySQL. Mark Atwood is a fellow Drizzler, but his day job is as a Community Manager and Evangelist for Red Hat's OpenShift. Due to whatever misguided web news sources I've been reading, I had thought that OpenShift was Red Hat's equivalent of OpenStack and CloudStack. And of course, in such a comparison, for all I know, OpenShift is not in the top 2 IaaS offerings. A one-minute conversation with Mark was all that was needed to correct this piece of misinformation. Turns out that OpenShift is a PaaS offering that runs on top of OpenStack. (I don't know, maybe other IaaS vendors too?) It's a way to deploy and auto-scale your apps, which can be written in any language including C/C++, into the cloud. (Similar to something like Heroku.) It was just about to be open sourced, and you can try it for free at openshift.redhat.com.

The other piece of info from the hallway track came from MariaDB developers, who say that merging MySQL into MariaDB is becoming increasingly difficult because they see Oracle refactoring the MySQL codebase within the 5.6 series. I couldn't get more opinions on this; it seems that beyond Monty's team, not many people follow the 5.6 codebase yet. (No, I don't read the source code either.) But if true, this is truly interesting news. The refactoring work is the one thing that is really, really great about Drizzle. I've developed a few Drizzle plugins out of pure curiosity, ended up fixing a few things in the core too, including the build system (scaaaryy!!), and I have to say it is a pleasure to work with such clean and modular code. If Oracle can do anything closely resembling that, while of course maintaining backwards compatibility, then we could yet witness amazing things happen in the MySQL world. For the classic MySQL forks, Oracle is really the only one who can do this, since in fundamental things like refactoring, both Percona and MariaDB are still necessarily followers to the "upstream" MySQL. (Even if they are ahead when looking at single features.) I will certainly monitor this topic closely, and ask more people what they know about it.

Talks I saw

One talk was clearly the best of all that I saw:
Scripting MySQL with Lua and libdrizzle

It was not very well attended. What can you do: the speakers were 2 unknown engineers from China (TaoBao is the eBay of China, and apparently a Top 20 site on the Alexa ranking!) competing in the same timeslot against a Facebook talk on backups, Giuseppe on replication, Peter on the optimizer and Lachlan on Xtrabackup Manager (I suppose also a Facebook talk on backups...). But the 12 who came to this talk clearly appreciated the performance results of this trendy architecture. It was very interesting, presented by very competent engineers, and very relevant to my own work as a performance architect. The slides are up, and I recommend you browse through them.

I came late to Stewart's talk about replaying DB load with Percona Playback. It seems the tool is still a bit new, but the use case they target is obviously useful. Will have to play with it one day.

Most other talks I attended for rather poor personal reasons:

I attended Vadim's talk on Galera because our Galera talk was right after it. His talk covered the basics and ours was more advanced. OK, I had to be there, but of course I did not learn much.

I attended Johan's talk on MySQL Cluster 7.2 and learned exactly one thing I didn't know from before - and I don't even use NDB much nowadays. The one thing is that pushdown joins can only work if you stick with the KEY partition type (which is the default). This is a bit of a bummer, since doing custom partitioning is one of the first optimizations you want to do with NDB. (You can still do custom partitioning, but only with KEY partitioning.) The explanation is that the data nodes are ignorant of the partitioning used, so unless it is KEY, they are lost.
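To make that explanation a bit more concrete, here is a toy Python model of the idea. It is an illustration only and not how NDB is actually implemented: the `DataNode` class, the `key_partition` function and the MD5-modulo scheme are all assumptions made up for this sketch. The point it tries to show is that a node which knows the built-in hash can find the partition for a key by itself, while with a custom scheme it has nothing to go on.

```python
import hashlib

NUM_PARTITIONS = 4  # hypothetical partition count for this sketch


def key_partition(primary_key: str) -> int:
    """Toy stand-in for KEY partitioning: a fixed hash that every node knows."""
    digest = hashlib.md5(primary_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


class DataNode:
    """Hypothetical data node that only knows the built-in hash function."""

    def __init__(self, uses_key_partitioning: bool):
        self.uses_key_partitioning = uses_key_partitioning

    def locate(self, primary_key: str):
        # With KEY partitioning the node can compute the partition locally,
        # so it could follow a join to the next row without outside help.
        if self.uses_key_partitioning:
            return key_partition(primary_key)
        # In this toy model a custom partitioning rule lives outside the
        # data node, so the node cannot resolve the row on its own.
        return None
```

In this sketch, `DataNode(True).locate("order:42")` returns a partition number, while `DataNode(False).locate("order:42")` returns `None`, which is the "they are lost" case from the talk.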

I attended the Trading Screen talk about migrating from Sybase to MySQL, and it was an interesting talk, but it seems the Santa Clara audience isn't big on migrations, as this was another poorly attended talk. Btw, when I saw the proposal for the Trading Screen talk I somehow thought it was a US company and a Percona customer. Turns out it was a UK company and a SkySQL customer. Makes sense.

Anyway, I also went to Kaj's talk about cost savings from choosing MySQL. This was also poorly attended and honestly was too high level for a Percona conference (good, but not even superficially technical). I was excited about having both these talks at the conference, and gave them a good rating on the panel, since I was big on migrating Sybase and Oracle to MySQL back in my sales engineer days. But perhaps these kinds of talks fit better at a European conference. If I'm on the panel next year, I won't consider migration talks anymore.

Even for myself, I perhaps could have spent my time better than on these two talks. I'm out of the migration game now, and I suppose I attended them purely out of historical passion. Anyway, it's good to know migrations still happen in Europe, at least at SkySQL customers, if nothing else.

And those were really all the talks I attended when subtracting my own and time spent fetching pewter. Looking at the above list, I suppose here too it is time for me to let go of my past and embrace a new era. (Whatever that might be, but seems that Galera and GIS might be in it...)

Things I learned already prior to the conference

Being on the panel has the advantage that you see all these great talks as proposals several months ahead of everyone else. So I consider myself lucky to have learned about these technologies, and to have taken an in-depth look at them, before the conference even began.

Scalr.com

InnoDB row cache: A talk on this was submitted after the deadline, which was maybe the reason it wasn't accepted. I find this work really interesting and hope that one of the forks could include it, so that it would see more mainstream usage. (Note that I just completed a key-value JSON interface for Drizzle, in case the connection wasn't obvious...)

A Global In-memory Data System for MySQL (A very advanced MySQL Cluster talk from an end user.)

There were also some talks that were good, but where I already knew the technology from hands-on experience:

What's new in MySQL 5.5 and 5.6 replication?

Gizzard: Scale with Twitter's MySQL sharding framework (I never looked at Gizzard, but we have a similar setup using Voldemort at Nokia. It has been presented in 2 previous conferences.)

Looking at the list of sessions, I see now that I would end up listing a majority of the talks! So yeah, it was just a very good conference, with many good talks in parallel. Anyway, these are the topics I intend to follow up on:

- Vitesse vs Gizzard
- Amazon RDS vs OpenStack RedDwarf
- SSD
- 2 Security talks (Pythian, McAfee)
- Marco's talk on Oracle to MySQL migrations (just because I'm a sucker)
- Sphinx vs InnoDB Full Text
- MariaDB 5.5, I'm especially interested in the GIS feature
- Yoshinori on the Memcache API
- Not so interested in the optimizer stuff itself (just knowing the word "subqueries" is enough for me in practice) but I'll probably look at how Sergey and Timour vs Peter handled this topic in their respective talks. Do they agree or have different views?

So, pretty intensive summer reading ahead!

Thanks

Thanks to everyone who came, presented, attended, made friends, bought my book, and made all of it a great conference.

igor (not verified)

Mon, 2012-04-23 02:56

Henrik,

It seems to me that in Part 2 you contradict your own statements from Part 1. In Part 1 you say that the number of core MySQLers who left Oracle for different reasons is striking. In Part 2, when talking about possible re-factoring of the MySQL server code, you say: "For the classic MySQL forks, Oracle is really the ONLY ONE who can do this, since in fundamental things like refactoring, both Percona and MariaDB are still necessarily followers to the "upstream" MySQL".
How come? Or were the MySQL people who left Oracle (Sun) mainly not core engineers working on the MySQL server?

Only the following server engineers who worked for MySQL AB at the end of 2002 when I joined the company still stay with Oracle:
Alexander Barkov (Izhevsk, Russia)
Ramil Kalimullin (Izhevsk, Russia)
Sergey Glukhov (Izhevsk, Russia)
Guilhem Bichot (France)
To this list I can rightly add Dmitri Lenev (Moscow, Russia) and Mats Kindahl (Sweden), who joined MySQL AB in 2003.
That's it.

There are some other engineers who worked on the server code, but they joined MySQL AB later and they did not make any significant contributions into MySQL 5.0/5.1 server development.

I don't want to offend anybody, but for each of the above-mentioned engineers we, at Monty Program AB, have a counterpart with a more solid track record of contributions and at least as solid a tenure (and I pretend here that Monty doesn't count). From this point of view it's not clear to me why we would want to play the role of necessary followers of the Oracle re-engineering experiments. As a matter of fact, we do not.

As to the server re-factoring proper, starting from some projects in 2008-2009 we saw several attempts to do it, but none of them successfully ended up with a merge into the main tree. Mind that I do not consider putting some code into separate name spaces (like relocation of all optimizer functions into the newly created Optimizer class) as a real re-engineering.

At the same time, the code of MariaDB 5.3, for the first time in the history of MySQL development, moved all optimizer transformations into the optimization phase, and this architectural change created a solid foundation for future optimizer improvements and a proper implementation of prepared statements.

I see that in the latest DMR release, mysql 5.6.5, the code of sql_select.c is cut into 5 separate pieces (each moved into a separate file), but the functions from it are still called in the same order, and this order is not the one it should be. One of the consequences of this is that any merge of mariadb and mysql becomes extremely hard, mainly because the mysql 5.6.5 code partly loses its development history.

Finally, I really recommend that you peek into the mysql/mariadb development trees from time to time. Both are still open.

Hi Igor

When I say that Oracle is the only one that can do a re-factoring of the classic MySQL code base, I of course do not mean to say that they are the only ones competent or capable of doing so. In fact I believe most experienced developers can refactor a code base if they can fall back on a good test suite. In particular, the MariaDB team of course could refactor the code as much as you want.

However, as long as you desire to continue merging from MySQL you can't, because any refactoring you want to do will make it harder for you to continue merging. For Percona this is even more true, since their strategy explicitly is to not make any deep changes to the code base. Therefore, any wholesale refactoring simply must be done by Oracle.

Btw, I appreciated your earlier blog post about MariaDB vs MySQL optimizer work. It was a very good overview of both and highlighted well the approaches of each team.
