open life blog

Microverse guest lecture: How to build a career working in Open Source (and also remotely)

In 8th grade, our home economics class got an assignment to create a scrapbook describing our adult life. Students would cut and paste - I mean literally, with scissors - pictures from magazines to create a portfolio with their 3 bedroom houses, one or two cars, a dog, and so on.

I of course did it a bit differently. I spent hours drawing a log cabin that my future family would live in. The cabin was in the wilderness in Lapland, where I operated some tourism related company as a family business. This was a few years before I would hear about the internet, or own a cell phone. But in my dream future I used a fax and satellite phone to operate my business. Btw, it was also years before I heard the word "startup" :-D

Automated System Performance Testing at MongoDB - the end of a trilogy

We are pleased to announce that our paper Automated System Performance Testing at MongoDB will be presented at DBTest 2020. (A workshop in conjunction with SIGMOD/PODS.) It is available today on Arxiv.org.

This paper presents the framework we developed to do automated performance testing on realistic MongoDB clusters. We have used and evolved this system to run hundreds of benchmarks every day as part of our Continuous Integration system. In conjunction with publishing this paper, we have also finally open sourced the Python code to this framework. The framework is called Distributed Systems Infrastructure 2.0, or DSI for short.

My conference talks on Youtube

My kids watch a lot of youtube. They follow the famous Finnish youtubers every week. At some point my son had realized there are many videos on youtube with his father doing conference talks. Some of them have a thousand viewers. I've never gotten so much adoration and respect from my son as that day!

I've created a playlist of all my conference talks that have been published on youtube.

Paper review: Strong and Efficient Consistency with Consistency-Aware Durability

Mark Callaghan pointed me to a paper for my comments: Strong and Efficient Consistency with Consistency-Aware Durability by Ganesan, Alagappan and Arpaci-Dusseau ^2. It won Best Paper award at the Usenix Fast '20 conference. The paper presents a new consistency level for distributed databases where reads are causally consistent with other reads but not (necessarily) with writes.

My comments are mostly on section 2 of the paper, which describes current state of the art and a motivation for their work.

Writing a data loader for database benchmarks

A task that I've done many times in my career in databases is to load data into a database as a first step in some benchmark. To do it efficiently you want to use multiple threads. Dividing the work onto many threads requires good comprehension of third grade math, yet can be surprisingly hard to get right.

The typical setup is often like this:

  1. The benchmark framework launches N independent threads. For example in Sysbench these are completely isolated Lua environments with no shared data structures or communication possible between the threads.
  2. Each thread gets as input its thread id i and the total number of threads launched N.

Automatic retries in MongoDB

At work we were discussing whether MongoDB will retry operations in some circumstances or whether the client needs to be prepared to do so. After a while we realized different participants in the discussion were discussing different retries.

So I sat down to get to the bottom of all the retries that can happen in MongoDB, and write a blog post about them. But after googling a bit it turns out someone has already written that blog post, so this will be a short post for me linking to other posts.

Retries by the driver

If you set retryWrites=true in your MongoDB connection string, then the driver will automatically retry some write operations for some types of failures. Ok, can I be more specific? Yes I can...

Let's rewrite everything from scratch (Drizzle eulogy)

Earlier this year I performed my last act as Drizzle liaison to SPI by requesting that Drizzle be removed from the list of active SPI member projects and that about 6000 USD of donated funds be moved to the SPI general fund.

Drizzle project started in 2008, when Brian Aker and a few other MySQL employees were moved to Sun Microsystem's CTO labs. The background to why there was demand for such a project was in my opinion twofold:

About the bookAbout this siteAcademicAccordAmazonAppleBeginnersBooksBuildBotBusiness modelsbzrCassandraCloudcloud computingclsCommunitycommunityleadershipsummitConsistencycoodiaryCopyrightCreative CommonscssDatabasesdataminingDatastaxDevOpsDistributed ConsensusDrizzleDrupalEconomyelectronEthicsEurovisionFacebookFrosconFunnyGaleraGISgithubGnomeGovernanceHandlerSocketHigh AvailabilityimpressionistimpressjsInkscapeInternetJavaScriptjsonKDEKubuntuLicensingLinuxMaidanMaker cultureMariaDBmarkdownMEAN stackMepSQLMicrosoftMobileMongoDBMontyProgramMusicMySQLMySQL ClusterNerdsNodeNoSQLNyrkiöodbaOpen ContentOpen SourceOpenSQLCampOracleOSConPAMPParkinsonPatentsPerconaperformancePersonalPhilosophyPHPPiratesPlanetDrupalPoliticsPostgreSQLPresalespresentationsPress releasesProgrammingRed HatReplicationSeveralninesSillySkySQLSolonStartupsSunSybaseSymbiansysbenchtalksTechnicalTechnologyThe making ofTransactionsTungstenTwitterUbuntuvolcanoWeb2.0WikipediaWork from HomexmlYouTube