For many of us, databases are no fun… surely not as fun as ActiveRecord or Rails in general. Every railers I know love the “dot notation” offered by ActiveRecord associations immensely… while refering to the vast majority of raw SQL queries as “Some ugly non-sense gibberish that we would all prefer not know the existence”.
Our hate towards the SQL language and databases is not a valid excuse to not assume our responsabilities as application developers. Recently I was working on a Rails plugin when the sentiment of being incompetent struck my body exactly like the lightning strikes an adventurous golfer who happily plays golf during a thunderstorm.
It all happened when I had a query that was generating a lot of results… and was slow as hell to process… even though I was using the :include option for Eager Loading.
@stuff = ParentStuff.find(:all, :include => :child_stuff, :order => “created_at DESC”)
Then, in my view, I had something like :
@stuff.each do |parent_stuff|
bla bla bla <%=parent_stuff.name%>
parent_stuff.child_stuff.each do |stuff|
bla bla bla <%=stuff.title%>
end
end
It was awfully slow! I started looking at the logs and saw this typical monstruous sql query. Look at it… you have to look at it. I know you don’t want to… but it won’t go away :
SELECT parent_stuff.`id` AS t0_r0, parent_stuff.`field1` AS t0_r1, parent_stuff.`field2` AS t0_r2, parent_stuff.`field3` AS t0_r3, parent_stuff.`field4` AS t0_r4, child_stuff.`id` AS t1_r0, child_stuff.`field5` AS t1_r1, child_stuff.`field6` AS t1_r2, child_stuff.`field7` AS t1_r3, child_stuff.`field8` AS t1_r4, child_stuff.`field9` AS t1_r5, child_stuff.`field10` AS t1_r6, child_stuff.`field11` AS t1_r7 FROM parent_stuff LEFT OUTER JOIN child_stuff ON child_stuff.the_foreign_key = parent_stuff.id WHERE (parent_stuff.created_at >= ‘2008-04-24 00:00:00’ AND parent_stuff.created_at <= '2008-04-24 09:44:56') ORDER BY parent_stuff.created_at
What's the problem with this? It's just one query... cannot take that much time. I thought it was a "Rails problem" until I ran the query directly in a MySql web interface. It took the same amount of time, meaning that the problem was within the query itself. Oh sh**... it was a database problem! Like many Rails Developers, my database skills are, while not inexistant, inadequate.
I tried to figure out what could be the problem with this query. I replaced every LEFT OUTER JOIN with INNER JOIN and Boom! the query executed in less than half a second.
My urgent desire of leaving the MySql web interface made me go back to my Rails code. I then tried some stupid random stuff like :
@stuff = ParentStuff.find(:all, :include => :child_stuff, :joins => “INNER JOIN child_stuff ON child_stuff.parent_stuff_id=id” :order => “created_at DESC”) but the slow LEFT OUTER JOINs remain in the query instead of being replace by the faster INNER JOIN. Then, I learned that in Rails 2.0, you could do Eager Loading with INNER JOIN by passing an association name to the :joins option, like that :
@stuff = ParentStuff.find(:all, :joins => :child_stuff, :order => “created_at DESC”)
Problem is I was working on a plugin… and I’d like it to be compatible with older versions of Rails. So I did what I was scared of doing since the beginning : Going to some MySql forums to read the advice of some DBA’s …
Then I found the answer to my problem… and this answer proved me that I needed to stop playing blindly with ActiveRecord like if the backend database was none of my business…. because it is. The answer is so obvious… but given the fact that I never wanted to care about databases since Rails exists, I didn’t think about it :
Answer : Add an index to the foreign key in the child table!
This is just one case that shows the dangers of not being aware of what’s going on in the database. It’s sad, but you cannot blindly let ActiveRecord manage everything that is Database related for you. It’s only a question of being in control of the entire application… and the relational database is a part of it.
UPDATE
This post has kind of turned against me. The only thing I wanted to say with this article is that ActiveRecord needs our supervision… not that I am a moron who can’t understand databases. I know what joins are, I know what indexes are… but in most situations I don’t have to “care” about them with ActiveRecord. Sorry but I felt the need to defend myself about this issue 🙂