Presentation: Databases and the Cloud (and why it is more difficult for databases)

A week ago I again had the pleasure of giving a guest lecture at Tampere University of Technology. I first visited them when I worked in MySQL pre-sales at Sun.

To be trendy, I of course had to talk about the cloud. It turns out every section has the subtitle "...and why it is more difficult for databases". I also rightfully claim to have invented the NoSQL key-value development model in 2005.

Actually, I'm aware of OQGraph and I mention it when giving this presentation, but I'm not convinced it meets my criteria for a graph database.

As I see it, a true graph database would:

1) store data on disk in an order that is optimized for graph queries. In other words, the storage format would minimize the average seek time on disk for all paths in the graph. As far as I know, OQGraph is actually just a layer on top of InnoDB and will use InnoDB auto_increment keys in the table.

2) use a query language or API appropriate for graph queries, since SQL is really not suited to the paradigm of nodes and edges. XPath is an example of what I mean. OQGraph is essentially just an SQL function you can use. This is of course because MySQL/MariaDB doesn't make introducing your own query language as simple as plugging in a storage engine.
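To make the second criterion a bit more concrete, here is a minimal sketch of the same question ("which nodes can I reach from this one?") asked in two ways: once through a small node-and-edge style function, and once against a plain edge table in SQL. Everything here is an assumption for illustration only: the `EDGES` data and both function names are hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for the relational side. It is not how OQGraph or MySQL do it.

```python
import sqlite3
from collections import deque

# Hypothetical example graph: directed edges between node ids.
EDGES = [(1, 2), (2, 3), (3, 4), (1, 5), (5, 4)]


def reachable_native(edges, start):
    """Breadth-first traversal over an adjacency list (the node/edge view)."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def reachable_sql(edges, start):
    """The same question asked of an edge table, via a recursive CTE.

    SQLite is only a stand-in here; the point is the shape of the query.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER)")
    conn.executemany("INSERT INTO edges VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?
            UNION
            SELECT e.dst FROM edges e JOIN reach r ON e.src = r.node
        )
        SELECT node FROM reach
        """,
        (start,),
    ).fetchall()
    conn.close()
    return {row[0] for row in rows}
```

Both functions return the same set of nodes, but the SQL version has to express a traversal as a self-join repeated until nothing new turns up, which is the kind of mismatch I mean.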

The actual point I make in the presentation is that if you did both of the above (don't use InnoDB and don't use the SQL parser), then there isn't much MySQL code left that is actually used. So it seems to me that the problem space a graph database solves really doesn't have a lot of overlap with an RDBMS. The benefit of having a MySQL storage engine like OQGraph is, in the end, mainly the "familiarity" of using MySQL, not necessarily appropriateness.

For comparison, consider that a key-value store on top of InnoDB does make sense, and several implementations exist that have gathered more awareness than OQGraph.
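As a rough illustration of why the key-value case fits so much better, here is a minimal sketch of a key-value interface layered over a single two-column table. The `KeyValueStore` class, the `kv` table and its column names are all hypothetical, and SQLite (Python's standard `sqlite3` module) is used purely as a stand-in for InnoDB; none of this reflects how any of the real implementations work internally.

```python
import sqlite3


class KeyValueStore:
    """Hypothetical key-value wrapper over a two-column relational table."""

    def __init__(self, path=":memory:"):
        # SQLite stands in for the storage engine in this sketch.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)"
        )

    def set(self, key, value):
        # Insert the pair, overwriting any existing value for the key.
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )
        self.conn.commit()

    def get(self, key, default=None):
        # A get is nothing more than a primary key lookup.
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else default

    def delete(self, key):
        self.conn.execute("DELETE FROM kv WHERE k = ?", (key,))
        self.conn.commit()


store = KeyValueStore()
store.set("user:1", "henrik")
store.set("user:1", "monty")  # overwrites the previous value
```

Every operation here maps directly onto a primary key lookup, which is exactly what the storage engine is already good at, so almost all of the existing code still earns its keep.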
