Rob's tech blog...

Don't put your DB in the Cloud Mrs Worthington

(With apologies to Noel Coward) A couple of occurrences inspired this post, one was a recent meeting with a local hosting provider to discuss their Virtual Private Cloud (which turned out to be nothing of the sort) and the other was a post from the Percona team comparing performance & cost of databases in the cloud vs databases on a traditional SSD array. To quickly summarise what is a well written and informed post, the basic premise is simple - EBS storage, even when clustered and running in RAID configuration has latency varience that makes it unsuitable for high thoughput applications. What's interesting (to me at least) is the comparison of cost of an SSD array which is written off at $4,000 per year vs the equivalent cloud storage for ~$3,000 per year and the immense difference in throughput on both systems. And this isn't S3 storage - this is EBS and therefore applies equally to any disk intensive cloud server whether it be MySQL or Couch or similar. Yes I am perfectly aware that part of the appeal of Cloud hosting is the freedom to scale but it's interesting to see that, at some point that scaling has limits and introduces issues that do not exist in traditional big iron computing. (Src: www.mysqlperformanceblog.com)

Scotch on the Rocks 2011 – Day 2 – Expand your mind with noSQL and then relax on the CouchDB - Mark Drew

Final day, final session and the gods of tech had intervened by killing Mark’s laptop first thing Friday morning which meant I was able to go and see his presentation* in place of the EHCache session I was due to see. NoSQL and I have a love hate relationship – I’d love to be able to use it and hate that I can’t. That said, Mark’s preso covered the basic issues surrounding relational databases in the modern world and how poorly a lot of data fits into the table / join concept as well as pointing out the differences in some of the common NoSQL alternatives. After a global overview (which given the amount of data and the number of platforms was pretty impressive in itself) Mark moved onto the specifics of CouchDB and the ease with which a quick Key Value data store can be setup. I was already aware of the RESTful nature of the platform but didn’t realise that this protocol was used by everything (client tools, replication, the works). It was quite impressive to see the JSON being returned live with the listener requests and the act of inserting data, attachments etc. was demonstrated very quickly and cleanly. One thing I thought about later was, considering the issues surround ColdFusion and the inconsistencies in serializing / deserializing data to and from JSON had any one actually tried this against a CF server? The replication demo was fantastic (fully replicated servers in ~ 10 seconds) and certainly raised some eyebrows amongst those (myself included) who’d setup MySQL replication. The big issues surrounding alternate data stores  will be present for some time yet but Mark gave us a 60 minute, whirlwind around what the future should look like for a lot of apps and did it all (well most of it) with a pint in his hand! Scores (out of 10) Direct Professional Value: 9 Ongoing General Value: 9 Contention / Debate: 8 Style: 10 Overall: 9 *Rumours that I was seen lurking around Mark Drew’s laptop with a magnet on a fishing rod are total fabrications and it’s just a happy coincidence (for me at least) that I managed to see his fantastic preso.

GeoIp - Binary DB vs MySQL Lookups

Recently I had to implement GeoIP lookups for a client project to determine which country user's were coming from in order to supply them locale specific content. That, in itself, is not very interesting. What really prompted sufficient interest to write this post was the difference in look up performance between the two available formats for this database.

Read the rest of this post »

MySQL - What if the worst happens?

After the recent Tweetalpyse that followed Oracle's update of the MySQL version comparison table I got to thinking, what if MySQL were to be rendered no longer viable as my web database of choice. What are the options for open source relational data management?

Read the rest of this post »

ColdFusion CachedWithin & Max Rows - bug or feature?

Recently a colleague and I have been working through the beating heart of our flagship application to try and improve performance and throughput. There was nothing wrong with the performance we were getting but we'd got some time to spare and so figured tweaking the code to get the absolute maximum out of it would be a fun excercise for a Thursday afternoon. One of the areas we looked at was caching some of the common MySQL queries that are called on every request using the CachedWithin attribute. Having merrily rolled this out we fired up our test site expecting pleasing improvements in page load times and instead found that the site was pretty much completely borked.

Read the rest of this post »

MySQL Indices - what, how and when

One simple area of MySQL optimisation that is often overlooked is indexes* on tables. Basically an index is a sub set of the data that allows the MySQL engine to match against values much faster than normal. In short, it speeds up the sorting of data and therefore reduces the time it takes to return results and the load on the server as a result.

Read the rest of this post »

Don't use Query of Queries for simple datasets!

I've just finished a whole batch of optimisation routines for an Excel import/export that had to run on our CF8 based CRM system (thanks to Ben Nadel for his amazing POI CFC). The export had a tricky requirement whereby each row (of some 56,000 rows) had n items which needed to be set according to a lookup in a link table in the DB. Rather that running each comparison as a query against the live DB (56,000 rows x 20+ items = 1,120,000 queries!) I figured I could pull the whole link table into memory using a simple query and then use Query of Queries to save a bit of time.

Read the rest of this post »

11
To Posterous, Love Metalab