S h o r t S t o r i e s

// Tales from software development

Archive for April 2016

MySQL covering indexes for performance

leave a comment »

Vitality’s application has a function to create clinical letters. While most customers print a physical copy of the letter and post it, increasingly we are asked to implement an interface to a document management system that sends the letter electronically to the recipient. Typically, this is the patient’s GP.

The interface executes a SELECT query on the correspondence table in the Vitality database similar to this:

SELECT
  c.FileName,
  cs.Description
  FROM 
    correspondence AS c
    INNER JOIN codesets AS cs ON cs.CodeId = c.Type
  WHERE
    c.Approved = true
    AND c.Processed = NULL
    AND cs.CodeSetNo = 20;

Note that the columns referenced on the correspondence table are FileName (of the Microsoft Word document), Type (the type of letter), Approved (indicating the letter is complete and may be sent), and Processed (indicating whether the interface has processed the letter). Codeset 20 contains a list of the letter templates and their descriptions.

This SELECT query has executed in less than a second for the past five years that the interface has been installed at one of our largest customers where approximately 60,000 letters are created each year.

However, a few weeks ago it was noticed that the interface was intermittently failing to send letters. Investigation showed that the SELECT query was sometimes taking in excess of 90 seconds to execute. The command was timing out and throwing an exception that caused the interface to suspend processing. The puzzle was, why was a query that typically executed in less than a second now taking over 90 seconds ?

I’m not sure that we’re ever going to fully understand the change in behaviour but, in part, it appears to be that the MySQL query planner changed the execution plan as the size of the correspondence table grew to over 300,000 rows. After rebuilding the indexes on the table and repairing it the execution time was consistently around 10 seconds.

We realised that whilst we had always thought that the execution time was less than a second and the interface’s processing logs usually showed this, this was likely to be because much of the time MySQL was using a cached query result.

Because the correspondence table contains a summary of the letter content, each row is quite large and a table scan is expensive. With over 300,000 rows in the table, a table scan is very expensive.

So we started to look at how the query could be optimised. Initially we thought that a suitable index would help the query engine to efficiently access the table data but after a number of failed experiments we realised that whilst this was better than a table scan the performance was still quite poor.

The last resort was to try creating a covering index. This is an index in which all the referenced columns are contained in the index so that the query engine doesn’t need to access the table data at all. Our first attempt didn’t show any improvement until we realised that whilst we had included the FileName, Approved, and Processed columns referenced in the SELECT column list and WHERE clause, we had overlooked the reference to the Type column in the INNER JOIN condition. As soon as this column was added to the index there was a massive improvement and the query now consistently executes in 0.3-06 seconds with caching disabled.

Using EXPLAIN to display the query plan shows that the correspondence table is now not being access at all. Instead the covering index is providing all the correspondence column data referenced in the query.

Advertisements

Written by Sea Monkey

April 13, 2016 at 8:00 pm

Posted in MySQL

Tagged with

Flipboard – what’s going on ?

leave a comment »

I’ve been using Flipboard for a couple of years to keep up with several software development blogs and noticed recently that these were showing up in Flipboard as ‘Content unavailable’. As the affected blogs are all hosted by Microsoft MSDN, I guessed that the URLs may have changed and I need to edit the Flipboard entry to point to the updated URL.

This morning I tried to do that on my iPad and discovered that there is no way to edit any property of a Flipboard entry. The Edit functionality consists of one action: X, i.e. Delete. This is poor.

As an alternative to correcting the URL, I thought I’d add a new entry for the new URLs and then delete the old entries. Erm… How do I add a URL for a blog ? It must be possible because I’ve previously done it, obviously, but it isn’t at all obvious now. After trying a number of different ways of searching and adding the blog URL I realised that Flipboard just won’t do this anymore. Or, if it can do it, the functionality is well hidden.

As I wasn’t getting very far with the iPad app I thought I’d try using the web interface on my desktop PC. Bizarrely, the web UI feels even more limited. I don’t think it is as I think it offers exactly the same functionality as the iPad app but there’s an expectation that the browser UI on a desktop PC would offer more. At least on the iPad app I get a message indicating ‘Content unavailable’ but in the desktop browser UI I just get a blank page. This is getting a bit rubbish…

I’m at a loss to understand what Flipboard is trying to achieve. They appear to have removed the ability to follow a blog web site and there is no edit functionality worth mentioning. It’s possible to delete a source but that’s it.

When Apple released its News app on iOS there were suggestions that this would compete directly against apps like Flipboard but rather than build on the existing functionality, Flipboard appears to be removing functionality and dumbing down to ensure that its app fails against News. Weird. Right now, I can’t see any reason not to delete the Flipboard app from my iPad as it’s not providing me with anything that the Apple News app doesn’t do better.

Written by Sea Monkey

April 13, 2016 at 8:30 am

Posted in Comment

Tagged with