## CodePoint Open and MySQL spatial extensions

The geographical location data in the Ordnance Survey’s CodePoint Open dataset is expressed as 6 digit eastings and northings values in the British National Grid Reference System. It’s not immediately obvious how this data can be used.

In fact, because the 6 digit values represent an easting or northing at 1 metre resolution, SQL spatial extensions can be used to work on the data directly to produce useful results. For example, the Distance() function can be used against two Point values to return the distance between two points in the same scale as the geometry system in use, i.e. in this case, in metres. Or, it would be if MySQL implemented the Distance function but there’s an easy workaround for this.

First, how do we create a Point value ? The OpenGIS standard defines two formats for expressing geometry data: Well Known Text (WKT) and Well Known Binary (WKB). For example, expressing a Point value in the British National Grid Reference System in WKT might look like this:

POINT(530622, 183959)

(This is actually easting and northing for the London postcode N1 0AA.)

To create a Point value in MySQL you need to pass this WKT expression, as text, to the PointFromText() function:

SELECT PointFromText('POINT(530622 183959)');

If you display the value created you’ll see that it’s binary data. It can be rendered as WKT or WKB using the appropriate function:

SELECT AsWKT(PointFromText('POINT(530622 183959)'));

Ideally, to calculate the distance between two postcodes you’d create Point values for each location and then use the Distance() function to return the distance between them. As MySQL doesn’t yet implement Distance() we have to use this workaround: create a LineString value using the two location values and then call the GLength() function to return the lenth of the line.

The WKT for a LineString is:

LINESTRING(x1 y1, x2 y2)

The following SQL statement creates the LineText value by constructing the WKT value from two postcode locations and returning the GLength value scaled to from metres to kilometres:

SELECT p1.PC , p2.PC , GLength(LineStringFromText(CONCAT('LINESTRING(',p1.`EA`, ' ', p1.`NO`,', ',p2.`EA`, ' ', p2.`NO`,')'))) / 1000 AS Distance_Km FROM postcode p1, postcode p2 WHERE p1.PC = 'N1 0AA' AND p2.PC = 'EH1 1AD';

The second postcode is in the Edinburgh area and the result is:

PC PC Distance_Km N1 0AA EH1 1AD 530.76

One thing to note about the CodePoint Open dataset is that the fomatting of the postcode in the PC field is inconsistent. A postcode consists of an outward code and an inward code separated by a space but some of the postcodes in the dataset do not have a separating space, some have a single space, and others have two spaces. The CodePoint Open User Guide suggests normalizing the formatting of this data to make it usable in your application.

## Leave a Reply