How to Optimize Active Record Efficiently

Uncategorized

How to Optimize Active Record Efficiently

Hi there! In this article, we will look at the various tricks and techniques that can be adopted to optimize data retrieval time and make our Rails application run faster. Ride with me as we go through a number of ways to optimize queries in Active Record.

What is Active Record?

Active Record is an Object Relational Mapping (ORM) framework that seeks to connect the rich objects of an application to tables in a relational database management system. Read more on ORM here. In the Ruby on Rails framework, Active Record is the M in MVC (Model, View, Controller). Active Record facilitates the creation and use of business objects whose data requires persistent storage to a database. An ORM ensures that we do not have to write sql queries that add relationships between tables in a database, or those that retrieve data from a database using sql raw queries. Active Record is a type of ORM.

TECHNIQUES TO OPTIMIZE ACTIVE RECORD QUERIES

1. Eager Loading

By default in Ruby on Rails, Active Record has lazy loading enabled, which means it delays the loading of data until the point where we actually need it. This is not ideal when retrieving an object and its associated objects when you know you will need the associated objects.

With lazy loading, each associated object that is used by the code will result in an additional DB query, which can cause a significant performance hit when there are many objects. This is commonly known as the “N+1 Problem”.

For an example, let us write code to retrieve a list of participants and their respective assessments.

participant.rb

class Participant
  has_many :assessments
end

participants controller

def index
  participants = Participant.all
  participants.each do {|p| p.assessments }
end

The code above will result in the sql query below by generating n+1 queries, where n is the number of participants, namely:


# SELECT * FROM participants;

# SELECT * FROM assessments WHERE assessments.participant_id = 1;

# SELECT * FROM assessments WHERE assessments.participant_id = 2;

# SELECT * FROM assessments WHERE assessments.participant_id = 3;

# SELECT * FROM assessments WHERE assessments.participant_id = n;

Notice how many sql queries were executed? Imagine a scenario where each participant has 1000 assessments, that means we will retrieve a thousand assessments one at a time, that will not only be time consuming but can also make our application very slow and inefficient.

To fix or avoid such queries, we apply eager loading when retrieving the participant objects. So the refactored code becomes:

def index
  participants = Participant.includes(:assessments)
  assessments = participants.each { |p| p.assessments }
end

The Output


Participant Load (0.8ms) SELECT "participants".* FROM "participants"
Assessment Load (1.8ms) SELECT "assessments".* FROM "assessments" WHERE "assessments"."deleted_at" IS NULL AND 
"assessments"."participant_id" IN ($1, $2, $3, $4, $5) [["participant_id", 375233176], ["participant_id", 724201174], ["participant_id", 26889763], ["participant_id", 96668790], ["participant_id", 457187834]]

The above output is the sql query that is being executed. The stark difference between both sql queries is very obvious. The refactored code ensures the assessment is pre-loaded which helps to avoid n+1 query problems.

Click here to learn more about eager loading in Active Record.

2. Bulk Delete With “delete_all”

When trying to delete a set of assessments for instance, it is easier, more efficient, and less time consuming to use the delete_all method provided by Active Record. Let’s look at an example below.

archived_assessment =  Assessment.where(archived: true)

archived_assessment.each do |assessment|
  assessment.delete
end
# Delete From assessments where id=26
# Delete From assessments where id=33
# Delete From assessments where id=34
# Delete From assessments where id=43
# Delete From assessments where id=53

As we can see in the above output, the deletion is done one object at a time. What if we had a 1000 assessments? This is where the delete_all method shines as bright as the sun… lol.

archived_assessment = Assessment.where(archived: true)

archived_result.delete_all

# DELETE FROM assessments WHERE assessment.archived = true;

The Output


Assessment Load (78.7ms) SELECT "assessments".* FROM "assessments" WHERE "assessments"."deleted_at" IS NULL AND "assessments"."participant_id" = $1 [["participant_id", 375233176]]
#<Assessment:0x000055e841b45f08>
#<Assessment:0x000055e841b5bad8>
#<Assessment:0x000055e841b5b858>
  Assessment Load (0.4ms) SELECT "assessments".* FROM "assessments" WHERE "assessments"."deleted_at" IS NULL AND "assessments"."participant_id" = $1 [["participant_id", 724201174]]
#<Assessment:0x00007f0d91c3edf8>
#<Assessment:0x00007f0d91c3ebc8>
  Assessment Load (0.4ms) SELECT "assessments".* FROM "assessments" WHERE "assessments"."deleted_at" IS NULL AND "assessments"."participant_id" = $1 [["participant_id", 26889763]]
#<Assessment:0x00007f0d91c46030>
  Assessmment Load (0.4ms) SELECT "assessments".* FROM "assessments" WHERE "assessments"."deleted_at" IS NULL AND "assessments"."participant_id" = $1 [["participant_id",96668790]]
#<Assessment:0x00007f0d91c4d8d0>
  Assessment Load (0.3ms) SELECT "assessments".* FROM "assessments" WHERE "assessments"."deleted_at" IS NULL AND "assessments"."participant_id" = $1 [["participant_id", 457187834]]
#<Assessment:0x00007f0d91c550a8>

As we can see, a single query is executed and all archived_assessments are deleted irrespective of the number. We can do the same for bulk_create and bulk_update. These methods eliminate the need for an iteration when a group of data is to be created, updated or deleted.

3. Selecting And Plucking Only Needed Attributes

Sometimes we don’t need all the model attributes, but only a few. Using the pluck method, we query the needed attributes and return them in an array. However, select returns an active record object with only the selected attribute.

archived_assessment = Assessment.where(archived: true).pluck(:title)

The above code will return an array of the titles of all assessments where the archived field is true. The pluck method is called when specific attributes are needed.

archived_assessment = Assessment.where(archived: true).select(:title)

Using select will return an active_record relation with the title attribute. This means, we have access to the assessment object but with only the specified attribute (the title in this case).

4. present? vs exists?

Sometimes we do not want to return an object data, we only want to verify its existence. The present? or exists? method can be used in this case. However, one is more efficient than the other.

If we want to know if a record exists but do not want to use the object, then using present? is an overkill as it selects all the columns and loads the object into memory. However, exists? only does a check without loading the object into memory.

archived_assessment = Assessment.where(archived: true).exists?

archived_assessment = Assessment.where(archived: true).present?

There are tons of other active record methods that can be used to enhance data manipulation in a database, but these highlighted ones are some of the most basic of them. Regards!

Johnbosco Okoror

Leave a Reply

Your email address will not be published. Required fields are marked *