Rob's tech blog...

Don't put your DB in the Cloud Mrs Worthington

(With apologies to Noel Coward) A couple of occurrences inspired this post, one was a recent meeting with a local hosting provider to discuss their Virtual Private Cloud (which turned out to be nothing of the sort) and the other was a post from the Percona team comparing performance & cost of databases in the cloud vs databases on a traditional SSD array. To quickly summarise what is a well written and informed post, the basic premise is simple - EBS storage, even when clustered and running in RAID configuration has latency varience that makes it unsuitable for high thoughput applications. What's interesting (to me at least) is the comparison of cost of an SSD array which is written off at $4,000 per year vs the equivalent cloud storage for ~$3,000 per year and the immense difference in throughput on both systems. And this isn't S3 storage - this is EBS and therefore applies equally to any disk intensive cloud server whether it be MySQL or Couch or similar. Yes I am perfectly aware that part of the appeal of Cloud hosting is the freedom to scale but it's interesting to see that, at some point that scaling has limits and introduces issues that do not exist in traditional big iron computing. (Src: www.mysqlperformanceblog.com)

GeoIp - Binary DB vs MySQL Lookups

Recently I had to implement GeoIP lookups for a client project to determine which country user's were coming from in order to supply them locale specific content. That, in itself, is not very interesting. What really prompted sufficient interest to write this post was the difference in look up performance between the two available formats for this database.

Read the rest of this post »

MySQL Indices - what, how and when

One simple area of MySQL optimisation that is often overlooked is indexes* on tables. Basically an index is a sub set of the data that allows the MySQL engine to match against values much faster than normal. In short, it speeds up the sorting of data and therefore reduces the time it takes to return results and the load on the server as a result.

Read the rest of this post »

Don't use Query of Queries for simple datasets!

I've just finished a whole batch of optimisation routines for an Excel import/export that had to run on our CF8 based CRM system (thanks to Ben Nadel for his amazing POI CFC). The export had a tricky requirement whereby each row (of some 56,000 rows) had n items which needed to be set according to a lookup in a link table in the DB. Rather that running each comparison as a query against the live DB (56,000 rows x 20+ items = 1,120,000 queries!) I figured I could pull the whole link table into memory using a simple query and then use Query of Queries to save a bit of time.

Read the rest of this post »

11
To Posterous, Love Metalab