Reflections on MySQL conference - Part I: technology
This was the first time I attended the MySQL conference as an end user of MySQL, rather than selling something to someone. It gave a different perspective as I was mostly interested in learning how to solve my own problems, rather than someone else's problems. The conference catered to that need very well!
Below, links are generally to the O'Reilly page that contains a PDF or PPT file of the slides. Keynote and Ignite talks contain video. The first two are the most important and will already take you a long way toward becoming a MySQL ninja!
MySQL administration, tuning and architecture
Confuse of Dev or Ops? Simple rule: if you are praise for Web site success, you are Dev; if you are blame when Web site down, you are Ops.
The Monday tutorial by Yoshinori is one of the most useful conference talks ever! It is essentially a guide to what kind of HW you should use with MySQL nowadays, and how to tune your Linux system for it. I can give it to our DBA team and just tell them to take Yoshinori's advice.
Linux and H/W optimizations for MySQL (Yoshinori Matsunobu)
Yoshinori Matsunobu is the new Mark Callaghan: http://en.oreilly.com/mysql2011/public/schedule/detail/17111 Very useful stuff in PDF!
Baron Schwartz did a 5 minute Ignite talk about causes for MySQL downtime. It turns out SAN failures come up frequently in his studies. This created lots of discussion and tweeting. Boy, I wish I had had these results back when I was selling MySQL Cluster (shared nothing) against Oracle RAC (shared disk). Below I'm also linking to the Percona white paper where Baron published his study and I'd like to highlight the high quality of Percona white papers in general. Often I learn a lot from white papers out there. Sometimes they are just badly disguised marketing. But Percona white papers are stuff you could get a PhD for, they are results of scientific research, with very useful results.
Update: Reading the white paper it becomes clear that SAN is actually not the leading cause of downtime, as the tweets and soundbites suggest. User errors such as bad SQL, bad schema design, allowing the disk to become full and a plain DROP TABLE are far ahead. However, SANs are apparently the most common type of component failure in the sample, and this is still significant.
This was not a conference session, but at work we recently ran into the problem with how Linux allocates memory on modern NUMA architecture, and were greatly helped by Jeremy Cole's blog post:
I was under orders from my boss to offer Jeremy something at the hotel bar, and this task was executed diligently.
It was interesting to learn that Twitter uses MySQL in a way similar to what Nokia's Yekesa Kosuru presented. At Nokia we use MySQL as a backend for the open source Voldemort NoSQL storage. Twitter developed their own NoSQL layer called Gizzard that also runs on top of MySQL. Some of the main benefits of both of these are transparent sharding, online addition of nodes and shard re-balancing. Twitter then also has a social graph thing called FlockDB on top of Gizzard. It's all open source, so I'll have to take a look one day.
Big and Small Data at @Twitter (Jeremy Cole)
NoSQL with MySQL (Yekesa Kosuru)
OH: "Rails is like your parents having sex; I know it happens, but I don't want to think about it either way."
Other sessions I was at include:
I figured out a way to do unit testing on bash. @xaprb on Aspersa toolkit. #mysqlconf
MySQL and Linux Tuning - Better Together (Steve Francis)
MySQL and SSD: Usage Patterns (Vadim Tkachenko)
Vadim's findings generally support what I already learned in Yoshinori's session. In addition he found an InnoDB scaling issue when using more than 100GB for the buffer pool (which I've been doing lately, so I need to see if I'm affected).
High availability and replication
The past year I've been paying a lot of attention to replication and HA. Baron's study pretty much rules out SAN now, and re-affirms what I've intuitively felt for a time: Higher level replication is better than lower level, e.g. MySQL-level replication is better than DRBD replication, which is better than SAN.
My preference for MySQL-level replication is that this gives you the most redundancy and flexibility. With SAN, you have just one black box of disks. If it fails, you have no data. DRBD is kind of better, as you have 2 disks in 2 separate boxes. But only one can be mounted at any single point in time, and failover still includes InnoDB recovery time, which is minutes. With MySQL-level replication you have an active standby for immediate failovers, and it is easy to do things like rolling upgrades. It can even be used as a tool in splitting shards, etc.
Yet, it could be better. In particular, current MySQL replication in 5.1 and 5.5 is not good enough. It is single threaded, and juggling binary log positions is annoying at best, error prone at worst. In the future I'd want to see MySQL replication evolve to something that matches MySQL Cluster, Voldemort and other modern solutions, where replication is an automated tool for provisioning new nodes and re-sharding painlessly. This is why I refer generically to "MySQL-level replication" rather than specifically to the built-in "MySQL replication". So I spent a lot of time looking into solutions for replication.
So classic mysql replication is still the main HA solution for the web. (Yahoo, DeNA...) #mysqlconf
But unfortunately I mostly learned that people just use MySQL 5.0/5.1 replication. As a former MySQL Cluster sales guy, I will just refuse to accept that asynchronous replication is a solution for HA. Of course, in Japan their main threat scenario is an earthquake taking down the data center. I'll accept that asynchronous MySQL replication is appropriate for geo-redundancy. So actually I need both synchronous and asynchronous. But even then I hate the idea of having to deal with juggling MySQL binary log positions. But if I ever have to, then I'm eager to take a look at Yoshinori's scripts that take care of that job.
MySQL High Availability at Yahoo (Jay Janssen)
Automated, Non-Stop MySQL Operations and Failover (Yoshinori Matsunobu)
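To make the "juggling binary log positions" complaint concrete, here is a minimal sketch of the bookkeeping a failover involves. It is not taken from Yoshinori's scripts; the host names, the `BinlogCoord` type and the helper functions are all hypothetical, and the only real MySQL syntax used is the `CHANGE MASTER TO` statement.

```python
from __future__ import annotations

import re
from typing import NamedTuple


class BinlogCoord(NamedTuple):
    """A replication position: binary log file name plus byte offset."""
    log_file: str
    log_pos: int

    def sort_key(self) -> tuple[int, int]:
        # Binlog files end in a numeric sequence, e.g. "mysql-bin.000042".
        match = re.search(r"\.(\d+)$", self.log_file)
        if match is None:
            raise ValueError(f"unexpected binlog file name: {self.log_file!r}")
        return (int(match.group(1)), self.log_pos)


def most_advanced(slaves: dict[str, BinlogCoord]) -> str:
    """Return the slave that has executed furthest into the master's binlog."""
    return max(slaves, key=lambda host: slaves[host].sort_key())


def change_master_sql(new_master: str, coord: BinlogCoord) -> str:
    """Build the statement that repoints a slave at new_master (hypothetical helper)."""
    return (
        f"CHANGE MASTER TO MASTER_HOST='{new_master}', "
        f"MASTER_LOG_FILE='{coord.log_file}', "
        f"MASTER_LOG_POS={coord.log_pos};"
    )


# Hypothetical positions, as one would read them from each slave's status.
slaves = {
    "db2": BinlogCoord("mysql-bin.000041", 98000),
    "db3": BinlogCoord("mysql-bin.000042", 1200),
}
candidate = most_advanced(slaves)
```

The annoying part is what this sketch leaves out: the coordinates handed to `CHANGE MASTER TO` must be positions in the new master's own binary log, not the old master's, and translating between the two by hand is exactly where mistakes creep in.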
So, given that what I want doesn't seem to exist, I spent a lot of time looking into replication solutions. One of them, Tungsten, was even awarded product of the year.
Learn how to cure MySQL replication deprivation with Tungsten! (Robert Hodges, Edward Archibald)
Galera Replication (Seppo Jaakola, Alexey Yurchenko)
A State of Multi-Master Replication (Seppo Jaakola, Alexey Yurchenko)
I actually skipped Oracle's session. Here's a tip to MySQL marketing: If you'd blog more about what you are up to, we'd know what you are doing and might be interested in your work. If your only blog splash is on Monday morning when the conference already has started - well that's the one day in the year I'm not reading it.
The following table summarizes what I came up with:
| | MySQL 5.1 | MySQL 5.5 | MySQL 5.6 | Tungsten | Galera | NDB |
| Global transaction id | | | * | x | x | x |
*) Future release. (Tungsten does parallel slave application for different schemas, however you'd need to write a few lines of Java to parallelize based on other criteria. MySQL 5.6 is also developing parallel slave application based on different schemas. Since I don't know if that will or can be extended, I've left them empty for now, but it could well be a start if I had more information.)
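As a rough illustration of what "parallel slave application for different schemas" means, the sketch below assigns replicated events to worker queues by schema. This is a toy model, not how Tungsten or MySQL 5.6 implement it; the `Event` type and `partition_by_schema` function are made up for this example, and it assumes every transaction touches exactly one schema.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    seqno: int       # position in the master's commit order
    schema: str      # database the transaction touched (assumed to be just one)
    statement: str


def partition_by_schema(events, workers):
    """Assign each event to a worker queue, keyed by schema.

    All events for one schema land on the same worker in their original
    order, so per-schema ordering is preserved while different schemas
    can be applied concurrently.
    """
    queues = defaultdict(list)
    assignment = {}
    for event in events:
        # The first time a schema is seen, give it the next worker round-robin.
        if event.schema not in assignment:
            assignment[event.schema] = len(assignment) % workers
        queues[assignment[event.schema]].append(event)
    return dict(queues)


events = [
    Event(1, "shop", "INSERT ..."),
    Event(2, "blog", "UPDATE ..."),
    Event(3, "shop", "DELETE ..."),
]
queues = partition_by_schema(events, workers=2)
```

The limitation is visible right away: a single busy schema still ends up on one worker, which is why parallelizing on other criteria matters.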
I put in MySQL Cluster NDB since I think its properties are a good benchmark for a HA database. But it is not InnoDB-based, so it is disqualified because of that. I learned at the conference that Galera isn't fully synchronous after all; rather, slaves apply the transactions asynchronously. However, as an advantage over MySQL 5.5, Galera internally guarantees that the transactions will be able to apply, there's just a small lag before that actually happens. Alexey already had an idea how to provide fully synchronous mode too (block new reads until necessary statements have been applied). This is however not an important topic for HA, current Galera and MySQL 5.5 satisfy what I want in this respect.
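The "block new reads until necessary statements have been applied" idea can be sketched with a global transaction sequence number and a condition variable. This is only my toy model of the concept, not Galera's code or API; the `Node` class and its method names are invented for illustration.

```python
import threading


class Node:
    """Toy replica that applies replicated transactions with a lag."""

    def __init__(self):
        self._applied_seqno = 0
        self._cond = threading.Condition()

    def apply(self, seqno):
        """Called by the applier thread once transaction seqno is applied."""
        with self._cond:
            self._applied_seqno = max(self._applied_seqno, seqno)
            self._cond.notify_all()

    def read(self, required_seqno, timeout=None):
        """Block until everything up to required_seqno is applied.

        Returns True if the read may proceed, False on timeout.
        """
        with self._cond:
            return self._cond.wait_for(
                lambda: self._applied_seqno >= required_seqno, timeout
            )


node = Node()
# Simulate the small apply lag: transaction 1 is applied 50 ms from now.
threading.Timer(0.05, node.apply, args=(1,)).start()
ok = node.read(required_seqno=1, timeout=2.0)
```

A client that has just committed transaction N on one node would pass N as `required_seqno` when reading from another, so it never sees data older than its own write.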
So from the above table, it looks like Galera can potentially do what I want. Tungsten comes out strong too, but due to their architectural approach I don't see how they can ever do synchronous replication - which I will still insist is the only truly correct approach to HA. (Update: Robert Hodges comments below that he does have a plan for (semi)synchronous replication, no other details shared.) MySQL 5.6 looks to implement most of this list, but it is still an early development release, and they need to diversify options for parallel replication.
Forks, alternatives and trends
This is not information that can be learned at a session, but I spent a lot of time trying to find out which of the forks everyone is using. Percona released download numbers of Percona Server, claiming 250K downloads so far. Monty claimed "hundreds of thousands" for MariaDB in his keynote. Talking to others, it seems this can be interpreted as "more than 100K", which is good for a first year. Drizzle was just released as GA, so I cannot compare, but their monthly totals on Launchpad are pretty good, comparable to MariaDB a year ago. (Note that Drizzle not being 100% compatible probably means a slower adoption curve, so that is a good number.) Vanilla MySQL from Oracle of course beats these numbers in a week.
Talking to people it also seems the spread is pretty even, all forks are gaining attention. SkySQL gets the prize for first support vendor to install Drizzle in production at a customer, that's pretty cool of them!
The conclusion of this research was simply that "the tide lifts all boats". Oracle's MySQL sales are growing (to the best of our rumor-based knowledge), the other vendors are growing, all forks are growing. PostgreSQL is growing - according to Josh Berkus's guesstimate, overall Postgres business grew 400% over the last 2 years. NoSQL is growing, but this doesn't take away from the traditional MySQL-PostgreSQL market.
The situation is much like the Linux distribution market in the late 90's. There are many alternatives to choose from. If you find a feature that is a must have for you - pick that one. Otherwise, pick whatever looks good. If it works for you, fine, if not, pick another one.