19 April 2011

Traversing larger graphs with OQGRAPH 3. Time for Bacon?

Using the IMDB data, I have loaded the actor and actress database into a few simple InnoDB tables and have performed the following test:

MariaDB [imdb]> show status like "Innodb_rows_read";
+------------------+---------+
| Variable_name    | Value   |
+------------------+---------+
| Innodb_rows_read | 1237459 |
+------------------+---------+
1 row in set (0.00 sec)

MariaDB [imdb]> select x.* from entity as orig, entity as dest, imdb_graph, entity as x where dest.type='ACTOR' and dest.name='Kevin Bacon' and imdb_graph.latch=2 and imdb_graph.origid=orig.id and imdb_graph.destid=dest.id and x.id=imdb_graph.linkid and orig.type='ACTOR' and orig.name='James Earl Jones';
+---------+---------+------------------------------------------------------------------+
| id      | type    | name                                                             |
+---------+---------+------------------------------------------------------------------+
|  780890 | ACTOR   | James Earl Jones                                                 |
| 1922997 | ARCHIVE | "20 to 1" (2005) {Our All Time Favourite Films (#9.11)}  (voice) |
|  131965 | ACTOR   | Antonio Banderas                                                 |
| 1945521 | ARCHIVE | Oscar, que empiece el espect?culo (2008)                         |
| 1003551 | ACTOR   | Kevin Bacon                                                      |
+---------+---------+------------------------------------------------------------------+
5 rows in set (2.80 sec)

MariaDB [imdb]> show status like "Innodb_rows_read";
+------------------+---------+
| Variable_name    | Value   |
+------------------+---------+
| Innodb_rows_read | 2102482 |
+------------------+---------+
1 row in set (0.00 sec)

MariaDB [imdb]> select (2102482 - 1237459)/2.8;
+-------------------------+
| (2102482 - 1237459)/2.8 |
+-------------------------+
|             308936.7857 |
+-------------------------+
1 row in set (0.09 sec)


Over 300,000 edges examined per second... Nifty.

The entity table has approximately 3 million entries and the link table has approximately 30 million edges.

Note - this performance is with a warmed up buffer pool. With a cold buffer pool, the query takes nearly 30 seconds.

14 April 2011

OQGRAPH at MySQL UC 2011

I had a good and lively audience at my presentation today. I was happy to give them a sneek preview of the Mk.III implementation of OQGRAPH.

Oh yeah! I guess I'm announcing the availability of the source code of the MkIII implementation.

Links are in my slides... http://goo.gl/UrybZ

Btw, building the storage engine requires Boost libraries and libJudy installed. It's currently in a MariaDB repo ... versions for MySQL and Drizzle are to follow.

02 August 2010

Tip that will save hours of sleep...

This tip is several months late but here goes anyway.

In my opinion, when you have an environment where you're using multiple MySQL storage engines (using the right tool for the job and all that jazz), there is nothing more annoying than to have tables created or altered into a storage engine that you did not request. The default behaviour of MySQL is to use the default engine if the requested storage engine is not available.

Why is this annoying? Because when a specific storage engine is requested, it needs to be using that specific engine and anything else is an error.

Fortunately, there is an SQL-Mode in MySQL which will enable the most desirable behaviour ... that is, if an engine, specifically requested, is not available, there will be an error instead of substituting the default storage engine: NO_ENGINE_SUBSTITUTION

Everyone should put this into their site's my.cnf files. I am glad to see more and more people using it.

(My only apology is that I am sorry I failed to argue strongly enough to make this behaviour default when I implemented it, years ago)