From the blog

Using Database Views in Django ORM

disc-hanging
Sometimes using database views rather than database tables can be very helpful for querying aggregated data. Typically in the Django ORM, a database table is tied to a Django ORM model, but we found it is also possible to tie a database view with a Django ORM model. By doing so, you can hide the data aggregation logic on the database level (in view creating SQL). And most of the ORM features, like double underscore foreign key lookup, still work for the model tied to a database view. In this blogpost, I’ll walk you through a simple example to illustrate how.

The regular models

Our example uses the following regular models. Each model has a database table tied to it.

from django.db import models
class Job(models.Model):
    name = models.CharField(max_length=200)
class User(models.Model):
    name = models.CharField(max_length=200)
class Charge(models.Model):
    HARDWARE = 'hw'
    SOFTWARE = 'sw'
    TYPE_CHOICES = (
        (HARDWARE, 'Hardware'),
        (SOFTWARE, 'Software'),
    )
    type = models.CharField(choices=TYPE_CHOICES, max_length=2)
    amount = models.DecimalField(max_digits=8, decimal_places=2)
    month = models.DateField()
    user = models.ForeignKey(User)
    job = models.ForeignKey(Job, null=True, blank=True)

The Charge model is used to store the charges. It has a foreignkey to the user model and an optional foreignkey to the job model. Not every Charge record can be associated to a job.

The database view

The database view is created by following the SQL and it can be injected into a customized data migration with the raw SQL execution command.

CREATE OR REPLACE VIEW app_chargesummary AS
    SELECT row_number() OVER () as id,
        ci.user_id,
        ci.job_id,
        ci.month,
        SUM(CASE WHEN ci.type = 'hw' THEN ci.amount ELSE 0 END) AS hardware,
        SUM(CASE WHEN ci.type = 'sw' THEN ci.amount ELSE 0 END) AS software
    FROM app_charge ci
    GROUP BY ci.user_id, ci.job_id, ci.month;

This view sums up all the hardware charges and software charges for the Charge records with the same job, user, and month. It contains the following columns:

   [id, user_id, job_id, month, hardware, software]

The next step is to create a Django model which maps to this view so we can use Django ORM to retrieve the data from the view.

The model for the view

We created the model below to map the database view.

class ChargeSummary(models.Model):
 id = models.BigIntegerField(primary_key=True)
 user = models.ForeignKey(User, on_delete=models.DO_NOTHING)
 job = models.ForeignKey(Job, on_delete=models.DO_NOTHING)
 month = models.DateField()
 hardware = models.DecimalField(max_digits=19, decimal_places=2)
 software = models.DecimalField(max_digits=19, decimal_places=2)
class Meta:
        managed = False
        db_table = 'app_chargesummary'

Although the model above looks very similar to a regular Django model, there are three differences worth noticing.

  1. For the foreignkey fields, we need to specify an extra kwarg: on_delete=models.DO_NOTHING. Although normally we won’t try to call delete() on the ChargeSummary model because of the view at behind is not writable, without specifying this kwarg would lead to some test cases failure. The might be related to the auto cleaning mechanism for the test database after test cases running.
  2. managed = False needs to be specified in class Meta so that the south or Django database auto migration will ignore that model and won’t try to create a database schema migration for it.
  3. db_table = ‘name_of_the_view’needs to be specified in class Meta to let the Django ORM know which database entity to retrieve the data from.

Query with the new model

With the new model being set up, you can use the regular ORM query methods with no difference compared with the regular Django model.
Get all ChargeSummary objects:

ChargeSummary.object.all()

Get all ChargeSummary objects for May 2015:

ChargeSummary.object.filter(month=datetime(year=2015, month=5, day=1, tzinfo=utc))

Get all ChargeSummary objects with a certain job name:

ChargeSummary.object.filter(job__name='Test Job')

Get all ChargeSummary objects with a specific user email address (suppose you have a 1-to-1 userprofile model associate with the user model):

ChargeSummary.object.filter(user__profile__email='irwen@rescale.com')

Of course certain methods like save() or delete() will presumably fail since DML SQLs are not applicable to database views by nature.

Last but not least

If you are upgrading your app to Django 1.7 or later from an earlier version, you need to run

 python manage.py makemigrations your_app_label

to generate 0001_initial.py and then run the command below to upgrade your database.

python manage.py migrate --fake-initial

Unfortunately, this would fail if you have the database view and the unmanaged ChargeSummary model. The reason is when Django runs the migrations with the fake initial flag, it verifies that the tables do exist in the database for all the models in CreateModel operations in the 0001_inital.py migration. CreateModel operations are created for every model, even those with managed=False. So for ChargeSummary model, it will fail since it can’t find the actual table. Fortunately, we can work around this issue by adding the view create sql into 0001_initial.py, and then run the migrations with the fake initial flag.

...
migrations.RunSQL(
    """
    DROP VIEW IF EXISTS app_chargesummary;
    CREATE OR REPLACE VIEW app_chargesummary ...;
    """
),
...

Happy modeling and coding with Django ORM.

Related articles

Rescale: 2014 in Review

Team 2014 was a year of immense growth and development for Rescale. Our team expanded with the addition of several new members, including, Engineer: Mairi Macdonald, Hiraku Nakamura, Alex Kudlick, Mark Whitney, and Kenneth Chung, Operations Manager: Sarah Dietz, VP […]

read more »

3 Hidden Benefits of Cloud HPC

Companies are moving to high-performance computing (HPC) in the cloud for access to flexible, varied, pay-per-use HPC resources.  At the same time, they don’t have to manage these assets or spend a massive amount of capital and time to deploy […]

read more »