Introduction
The Django select_related method allows you to speed up your querysets and therefore your application.
In this post, I’ll show you how you can minimize database accesses and thus maximize the speed of your Django app.
To learn how to use the select_related method, we will walk through a concrete example. We start by creating the models and then compare two approaches: a slow one that does not use select_related, and a fast one that does.
You will see that the difference in the number of database accesses is dramatic.
Speed up Django queryset using select_related method
The Django documentation describes this method in this way:
Returns a QuerySet that will “follow” foreign-key relationships, selecting additional related-object data when it executes its query
Django documentation
Let’s see now how to use this method.
Models and records creation
Suppose we have these two simple models, where for simplicity each album has only one artist:
from django.db import models


class Artist(models.Model):
    name = models.CharField(max_length=10)


class Album(models.Model):
    name = models.CharField(max_length=30)
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)
With this simple piece of code we can create 100 artists and 100 albums, where each album is associated with exactly one artist.
for idx in range(100):
    artist_name = "artist_{}".format(idx)
    artist_obj = Artist.objects.create(name=artist_name)
    album_name = "album_{}".format(idx)
    Album.objects.create(name=album_name, artist=artist_obj)
Without Django select_related method – Slow approach
In the piece of code below we simply want to print the associated artist for each album.
To do this, we first collect all the albums and then use a for loop to print the information we need for each one.
from django.db import connection

print("Initial number of queries: {}".format(len(connection.queries)))
album_qs = Album.objects.all()
for album in album_qs:
    # Accessing album.artist runs a separate query for every album.
    artist = album.artist
    print("Album name: {} - Artist name: {}".format(album.name, artist.name))
print("Final number of queries: {}".format(len(connection.queries)))
As we said, the initial queryset is on the Album model.
After this, we use a loop to print each album and the linked artist.
Using connection.queries, it is possible to check how many queries Django has executed. Note that Django only records queries there when the DEBUG setting is True.
The output of the code snippet above is the following:
Initial number of queries: 0
Album name: album_0 - Artist name: artist_0
Album name: album_1 - Artist name: artist_1
Album name: album_2 - Artist name: artist_2
Album name: album_3 - Artist name: artist_3
Album name: album_4 - Artist name: artist_4
Album name: album_5 - Artist name: artist_5
...
Album name: album_95 - Artist name: artist_95
Album name: album_96 - Artist name: artist_96
Album name: album_97 - Artist name: artist_97
Album name: album_98 - Artist name: artist_98
Album name: album_99 - Artist name: artist_99
Final number of queries: 101
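The first query is executed when all the albums are retrieved, but what about the other 100?
Every time the loop accesses the artist of an album, Django issues an additional query to fetch that artist, so 100 albums produce 100 extra database hits. This pattern is commonly known as the N+1 query problem. The line responsible is: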
artist = album.artist
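To make those extra hits concrete, here is a minimal sketch of the same access pattern written directly against SQLite with Python's standard sqlite3 module, outside Django. The table names artist and album and the run_select helper are assumptions made for illustration; the SQL that Django actually generates differs in naming and quoting.

```python
import sqlite3

# In-memory database with two hypothetical tables mirroring the models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE album (
        id INTEGER PRIMARY KEY,
        name TEXT,
        artist_id INTEGER REFERENCES artist(id)
    );
""")
for idx in range(100):
    conn.execute("INSERT INTO artist (id, name) VALUES (?, ?)",
                 (idx, "artist_{}".format(idx)))
    conn.execute("INSERT INTO album (id, name, artist_id) VALUES (?, ?, ?)",
                 (idx, "album_{}".format(idx), idx))

executed = []  # every SELECT we run, similar in spirit to connection.queries


def run_select(sql, params=()):
    """Hypothetical helper: record the SELECT, run it, return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


# One query to load all albums...
albums = run_select("SELECT id, name, artist_id FROM album ORDER BY id")

pairs = []
for album_id, album_name, artist_id in albums:
    # ...plus one more query per album to look up its artist.
    artist_rows = run_select("SELECT name FROM artist WHERE id = ?", (artist_id,))
    pairs.append((album_name, artist_rows[0][0]))

print("Number of SELECT queries: {}".format(len(executed)))
```

Running this sketch prints 101 as the number of SELECT queries: one for the albums and one per album for its artist.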
Using Django select_related method – Faster approach
Starting from the code presented in the previous section, how can we speed up our code?
In this case, the bottleneck is the number of database accesses, but no problem!
To solve this problem we can use the select_related method.
print("Initial number of queries: {}".format(len(connection.queries)))
album_qs = Album.objects.all().select_related("artist")
for album in album_qs:
    # No extra query here: the artist was loaded together with the album.
    artist = album.artist
    print("Album name: {} - Artist name: {}".format(album.name, artist.name))
print("Final number of queries: {}".format(len(connection.queries)))
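The only difference is the select_related("artist") call on the second line. With it, Django retrieves each album together with its artist in a single query by using a SQL JOIN, so accessing album.artist inside the loop no longer hits the database.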
Let’s try running the code again now and see the difference:
Initial number of queries: 0
Album name: album_0 - Artist name: artist_0
Album name: album_1 - Artist name: artist_1
Album name: album_2 - Artist name: artist_2
Album name: album_3 - Artist name: artist_3
Album name: album_4 - Artist name: artist_4
Album name: album_5 - Artist name: artist_5
...
Album name: album_95 - Artist name: artist_95
Album name: album_96 - Artist name: artist_96
Album name: album_97 - Artist name: artist_97
Album name: album_98 - Artist name: artist_98
Album name: album_99 - Artist name: artist_99
Final number of queries: 1
Amazing: this one change has reduced the number of database hits from 101 to 1!
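To see why a single query is enough, the sketch below runs a JOIN directly against SQLite with Python's standard sqlite3 module, outside Django. It only approximates what select_related does: the table names artist and album and the run_select helper are assumptions for illustration, and the real SQL produced by Django depends on the app label, the backend, and whether the foreign key is nullable.

```python
import sqlite3

# In-memory database with two hypothetical tables mirroring the models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE album (
        id INTEGER PRIMARY KEY,
        name TEXT,
        artist_id INTEGER REFERENCES artist(id)
    );
""")
for idx in range(100):
    conn.execute("INSERT INTO artist (id, name) VALUES (?, ?)",
                 (idx, "artist_{}".format(idx)))
    conn.execute("INSERT INTO album (id, name, artist_id) VALUES (?, ?, ?)",
                 (idx, "album_{}".format(idx), idx))

executed = []  # every SELECT we run


def run_select(sql, params=()):
    """Hypothetical helper: record the SELECT, run it, return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


# A single JOIN returns every album already paired with its artist.
rows = run_select("""
    SELECT album.name, artist.name
    FROM album
    INNER JOIN artist ON album.artist_id = artist.id
    ORDER BY album.id
""")

for album_name, artist_name in rows[:3]:
    print("Album name: {} - Artist name: {}".format(album_name, artist_name))
print("Number of SELECT queries: {}".format(len(executed)))
```

Here the database does the matching between albums and artists, so all 100 pairs arrive in the result of one SELECT.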
Conclusion
With this brief use case we have seen how to reduce the number of database accesses using the select_related queryset method.
Django offers other methods for speeding up queries that are just as effective. I will analyze them in upcoming posts.
As always, if you have any doubts or run into trouble, feel free to leave a comment. Otherwise, take a look at the latest posts!