Over a million developers have joined DZone.
Platinum Partner

Using SQL Instead of Ruby to Speed Migrations

· Big Data Zone

The Big Data Zone is brought to you in partnership with Hortonworks.  Learn more about Connected Data Platforms that power the creation of modern data applications and how they deliver actionable intelligence. 

This post originally authored by Reed Law.

When building a web app, there’s often a mountain of data that I need to move from one format to another. Most web application frameworks provide a mechanism for this called migrations. A migration is a snippet of code that creates a change in the database. Migrations are convenient because they can run inside each environment (e.g. development, staging, and production) and create the same results without having to dump and restore databases.

Usually when I think about doing a one-off data migration, I first think of how easy it is to do in Ruby. Programming in Ruby all day makes me think in Ruby, so I almost automatically come up with something like this:

Zone.all.map{|z|z.city + "," + z.state}.uniq.map{|c|City.create(name: c.gsub(/,.+$/), state: c.gsub(/^.+/))}

I needed to create cities for each unique city/state in the zones table. So I mapped each unique city plus state, to an array that then created a new city (with the help of a little regex to split the city and state). But with over forty thousand rows, this expression took several minutes before it overheated my laptop. Turns out this same migration can happen in under a second with a carefully crafted SQL statement. Here it is:

INSERT INTO cities ("name", "state", "created_at", "updated_at") SELECT DISTINCT city, state, current_timestamp, current_timestamp FROM zones;

Normally, to figure out the proper syntax for a query like the one above, I have to review the PostgreSQL manual. In this case, the time saved was clearly worth the extra effort. The next time you need a migration hopefully you’ll head for the right tool. After a while, you may begin thinking in SQL.

The Big Data Zone is brought to you in partnership with Hortonworks.  Learn, Collaborate, and Thrive with Hortonworks Community Connection


Published at DZone with permission of Nick Jordan .

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}