django select_related

Introduction

The Django select_related method allows you to speed up your querysets and therefore your application.
In this post, I’ll show you how you can minimize database accesses and thus maximize the speed of your Django app.

To learn how to use select_related, we will walk through a simple example. We will start by creating the models and then compare two approaches: first the slow one, without select_related, and then the fast one, which uses it.

You will see that the difference in the number of database accesses is dramatic.

Speed up Django queryset using select_related method

The Django documentation describes this method in this way:

Returns a QuerySet that will “follow” foreign-key relationships, selecting additional related-object data when it executes its query

Django documentation

Let’s see now how to use this method.

Models and records creation

Suppose we have these two simple models, where for simplicity each album has only one artist:

from django.db import models


class Artist(models.Model):
    name = models.CharField(max_length=10)


class Album(models.Model):
    name = models.CharField(max_length=30)
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)

With this simple piece of code we can create 100 artists and 100 albums, where each album is associated with its own artist.

for idx in range(100):
    artist_name = "artist_{}".format(idx)
    artist_obj = Artist.objects.create(name=artist_name)
    album_name = "album_{}".format(idx)
    Album.objects.create(name=album_name, artist=artist_obj)

Without Django select_related method – Slow approach

In the code below, we simply want to print the associated artist for each album.
To do this, we first collect all the albums and then use a for loop to print the information we need for each one.

from django.db import connection
print("Initial number of queries: {}".format(len(connection.queries)))
album_qs = Album.objects.all()
for album in album_qs:
    artist = album.artist
    print("Album name: {} - Artist name: {}".format(album.name, artist.name))
print("Final number of queries: {}".format(len(connection.queries)))

As we said, the initial queryset is on the Album model.
After this, we use a loop to print each album and the linked artist.

Using connection.queries, we can check how many queries Django runs. Note that connection.queries is only populated when DEBUG is set to True.
The output of the code snippet above is the following:

Initial number of queries: 0
Album name: album_0 - Artist name: artist_0
Album name: album_1 - Artist name: artist_1
Album name: album_2 - Artist name: artist_2
Album name: album_3 - Artist name: artist_3
Album name: album_4 - Artist name: artist_4
Album name: album_5 - Artist name: artist_5
...
Album name: album_95 - Artist name: artist_95
Album name: album_96 - Artist name: artist_96
Album name: album_97 - Artist name: artist_97
Album name: album_98 - Artist name: artist_98
Album name: album_99 - Artist name: artist_99
Final number of queries: 101

The first query runs when all the albums are retrieved, but what about the other 100?
Basically, every time the artist attribute is accessed on an album, Django makes an additional database hit to load that artist.

artist = album.artist
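
To make this pattern more concrete, here is a minimal sketch that reproduces it outside Django, using only Python's built-in sqlite3 module. The table layout and the run_query helper are assumptions made for illustration; they are not what Django generates, but the shape of the problem (one query for the albums, plus one per album for its artist) is the same.

```python
import sqlite3

# Hypothetical in-memory tables standing in for the Artist and Album models.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE album (id INTEGER PRIMARY KEY, name TEXT, "
    "artist_id INTEGER REFERENCES artist(id))"
)
conn.executemany(
    "INSERT INTO artist (id, name) VALUES (?, ?)",
    [(idx + 1, f"artist_{idx}") for idx in range(100)],
)
conn.executemany(
    "INSERT INTO album (id, name, artist_id) VALUES (?, ?, ?)",
    [(idx + 1, f"album_{idx}", idx + 1) for idx in range(100)],
)

# Simple counter, similar in spirit to len(connection.queries).
stats = {"queries": 0}


def run_query(sql, params=()):
    """Run a SELECT statement and count it (illustrative helper)."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


# One query to fetch all the albums...
albums = run_query("SELECT id, name, artist_id FROM album ORDER BY id")

lines = []
for album_id, album_name, artist_id in albums:
    # ...plus one extra query per album to fetch its artist.
    artist_rows = run_query("SELECT name FROM artist WHERE id = ?", (artist_id,))
    artist_name = artist_rows[0][0]
    lines.append(f"Album name: {album_name} - Artist name: {artist_name}")

print(lines[0])
print("Total queries:", stats["queries"])
```

With 100 albums the counter ends at 101, and it keeps growing linearly as albums are added.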

Using Django select_related method – Faster approach

Starting from the code presented in the previous section, how can we speed up our code?
In this case, the bottleneck is the number of database accesses, but no problem!
To solve this problem we can use the select_related method.

print("Initial number of queries: {}".format(len(connection.queries)))
album_qs = Album.objects.all().select_related("artist")
for album in album_qs:
    artist = album.artist
    print("Album name: {} - Artist name: {}".format(album.name, artist.name))
print("Final number of queries: {}".format(len(connection.queries)))

The only difference is on the second line: by using select_related, Django ‘pre-loads’ the artist of each album in the same query that fetches the albums.

Let’s try running the code again now and see the difference:

Initial number of queries: 0
Album name: album_0 - Artist name: artist_0
Album name: album_1 - Artist name: artist_1
Album name: album_2 - Artist name: artist_2
Album name: album_3 - Artist name: artist_3
Album name: album_4 - Artist name: artist_4
Album name: album_5 - Artist name: artist_5
...
Album name: album_95 - Artist name: artist_95
Album name: album_96 - Artist name: artist_96
Album name: album_97 - Artist name: artist_97
Album name: album_98 - Artist name: artist_98
Album name: album_99 - Artist name: artist_99
Final number of queries: 1

Amazing: this one change has decreased the number of database hits from 101 to 1!
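
The reason a single query is enough is that select_related makes Django fetch the related rows with a SQL JOIN. The sketch below illustrates the idea with Python's built-in sqlite3 module rather than with Django itself. The table layout, the run_query helper and the exact SQL are assumptions made for illustration; the statement Django actually generates will differ in its details.

```python
import sqlite3

# Hypothetical in-memory tables standing in for the Artist and Album models.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE album (id INTEGER PRIMARY KEY, name TEXT, "
    "artist_id INTEGER REFERENCES artist(id))"
)
conn.executemany(
    "INSERT INTO artist (id, name) VALUES (?, ?)",
    [(idx + 1, f"artist_{idx}") for idx in range(100)],
)
conn.executemany(
    "INSERT INTO album (id, name, artist_id) VALUES (?, ?, ?)",
    [(idx + 1, f"album_{idx}", idx + 1) for idx in range(100)],
)

# Simple counter, similar in spirit to len(connection.queries).
stats = {"queries": 0}


def run_query(sql, params=()):
    """Run a SELECT statement and count it (illustrative helper)."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


# A single query returns each album together with its artist.
rows = run_query(
    "SELECT album.name, artist.name "
    "FROM album INNER JOIN artist ON album.artist_id = artist.id "
    "ORDER BY album.id"
)

lines = [
    f"Album name: {album_name} - Artist name: {artist_name}"
    for album_name, artist_name in rows
]

print(lines[-1])
print("Total queries:", stats["queries"])
```

Because the artist columns arrive in the same result set as the album columns, the loop no longer needs to go back to the database.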

Conclusion

With this brief use case we have seen how to decrease the number of database accesses using the select_related queryset method.
Django offers other methods for speeding up queries that are just as effective. I will look at some of them in upcoming posts.

As always, if you have any doubts or run into trouble, feel free to leave a comment. Otherwise, take a look at the latest posts!
