Here are the slides of the talk I just gave at Froscon. (Video should be available soon.)
I've never been to Burning Man myself, but I'm aware of the event because Drizzle development grinds to a halt during the festival. In other words, I have many friends who go there.
It was interesting to read a statement from the organizers of Burning Man about the fact that this year there is way more demand for Burning Man tickets than they can sell. Apparently even the desert has its limits (and more so the road leading to it).
Organically growing volunteer projects are exciting because they just grow and grow and there seems to be nothing there to stop them. But once in a while they hit bottlenecks that need to be solved.
An important part of benchmarking is to draw graphs. A graph can reveal results you wouldn't have spotted just by looking at raw numbers. By the way, the process of massaging the raw numbers into graphs will often reveal things too.
Sysbench output tends to be quite wordy, especially when you have a script that runs the same test with 1, 2, 4, 8... threads. Manually copying and pasting the numbers into a spreadsheet is tiresome. So I came up with this monster shell one-liner to condense the output into a CSV file. I'm posting it here so I will find it the next time I need it:
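The same condensing step can also be sketched in Python instead of shell. This is not the one-liner referred to above, only a minimal illustration of the idea. It assumes that each run prints a "Number of threads:" line followed later by a "transactions: N (X per sec.)" line; the sample text, the regular expressions and the function names condense and to_csv are all made up for this example, and the exact sysbench wording may differ between versions.

```python
import csv
import io
import re

# Assumed sample of sysbench-style output; real output is much longer
# and its exact wording may vary between sysbench versions.
SAMPLE = """\
Number of threads: 1
    transactions:                        1000   (100.00 per sec.)
Number of threads: 2
    transactions:                        1900   (190.00 per sec.)
"""

THREADS_RE = re.compile(r"Number of threads:\s*(\d+)")
TPS_RE = re.compile(r"transactions:\s*\d+\s*\(([\d.]+) per sec\.\)")


def condense(text):
    """Return a list of (threads, transactions_per_second) pairs."""
    rows = []
    threads = None
    for line in text.splitlines():
        match = THREADS_RE.search(line)
        if match:
            threads = int(match.group(1))
            continue
        match = TPS_RE.search(line)
        if match and threads is not None:
            rows.append((threads, float(match.group(1))))
            threads = None  # wait for the next run's thread count
    return rows


def to_csv(rows):
    """Render the pairs as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["threads", "tps"])
    writer.writerows(rows)
    return buf.getvalue()


print(to_csv(condense(SAMPLE)), end="")
```

The resulting text can be pasted or imported straight into a spreadsheet to draw the threads-versus-throughput graph.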
Nowadays small network-attached boxes that function as home-office disk servers are so cheap that most of the time I take backups by having a couple of those lying around the house. But for many years I used to back up my personal Linux desktop by burning CDs and then DVDs.
The challenge with backing up to a CD/DVD is that you easily have more data to back up than fits on one disc. The TAR utility was created for tapes and completely lacks any capability of splitting itself into parts. What's more, with TAR you create the archive first, and then compress it, so it is not possible to know in the TAR phase when you've actually reached the size of your CD or DVD, and in the compression phase it is generally too late.
Mythbusters: How to configure InnoDB buffer pool on large MySQL servers
Yesterday I wrote about the dangers in using top on systems with 100+ GB of RAM, not to mention future systems with 1+ TB. A related topic is, how should I configure MySQL on such a large system?
There is a classic rule of thumb that on a dedicated MySQL server one should allocate 80% of memory to the InnoDB buffer pool. On a 128 GB system that is 102.4 GB. This means that I would leave 25.6 GB of RAM "unused". So surely on these large systems, this old piece of advice cannot hold anymore. If the database previously ran on a server that had less memory than that altogether, it seems wrong to leave so much memory unused. Let's tentatively label the old rule of thumb a "myth" and ask mythbusters to figure out a new MySQL configuration for us...
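To see how the "unused" remainder grows with server size, the arithmetic of the 80% rule can be sketched in a few lines of Python. The function name buffer_pool_split and the list of RAM sizes are made up for this illustration; only the 80% figure and the 128 GB example come from the text above.

```python
def buffer_pool_split(total_gb, fraction=0.8):
    """Split total RAM into (buffer pool, remainder) under the rule of thumb."""
    pool = total_gb * fraction
    return pool, total_gb - pool


# Hypothetical server sizes in GB; 128 is the example used in the text.
for total in (16, 128, 1024):
    pool, rest = buffer_pool_split(total)
    print(f"{total:>5} GB RAM: buffer pool {pool:.1f} GB, remainder {rest:.1f} GB")
```

The fraction stays the same, but the absolute amount left outside the buffer pool scales with the machine, which is exactly what makes the old rule look suspicious on large servers.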
We all know that a megabyte in the binary system is not the same as one million bytes (in the decimal system). But have you actually cared much about it? I have to admit I haven't. I know there is a small rounding error, but by and large I always treated 2^10 = 1 kB = 1024 bytes and 10^3 = 1 kB = 1000 bytes as the same thing. (Update: Opening sentence was edited to remove units MB and MiB since it seems even I managed to use them backwards! The math in this article is correct. The rest of the article uses MB, GB and TB mostly to refer to binary magnitudes, which is apparently incorrect. See comments for wikipedia links and discussion.)
More importantly, when you move into larger numbers, rounding errors usually become even less important. Unfortunately, in this case they become bigger:
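A small Python sketch shows the growth. It compares 1024^n with 1000^n for each prefix step; the function name discrepancy_percent and the prefix list are chosen for this illustration.

```python
# Prefix steps: kilo = power 1, mega = power 2, and so on.
PREFIXES = ["kilo", "mega", "giga", "tera"]


def discrepancy_percent(power):
    """How much larger 1024**power is than 1000**power, in percent."""
    return (1024 ** power / 1000 ** power - 1) * 100


for power, name in enumerate(PREFIXES, start=1):
    print(f"{name:>4}: {discrepancy_percent(power):.2f}% larger in binary")
```

The gap is 2.4% at the kilo level and roughly 10% at the tera level, because the 2.4% error compounds with every step up.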
This is the first in a series of blog posts in which I want to document how the MepSQL packages were built. By doing that, I will also end up covering the MariaDB build system (which this is based on), some of BuildBot, the Amazon EC2 cloud and packaging DEBs and RPMs in general, so it could be interesting from many perspectives. In this first part I'll simply scribble some notes about reviewing the OpenSuse Build Service, the Launchpad PPA service vs using your own servers, and automating the builds with BuildBot.
Originally I just wanted to work on some new ideas for the automated build and QA system used by MariaDB. But after leaving Monty Program I no longer had access to any of those servers, so as a first step I had to look into what alternatives there are for building binary packages for many operating systems and hardware platforms. In fact, this was another thing I had wanted to learn more about for a while. For instance, Michal Hrušecký uses OpenSuse Build Service to build both MySQL and MariaDB packages for all RPM based distributions in the blink of an eye - I was interested to find out what's behind that magic.