Kogan.com Dev Blog

Nam Ngo

A hidden gem in Django 1.7: ManifestStaticFilesStorage

The biggest change in Django 1.7 was built-in schema migration support, which everyone is aware of. However, 1.7 also shipped with lots of other great additions, and ManifestStaticFilesStorage, the new static files storage backend, was one of them.

Static file caching is everywhere

Before explaining what ManifestStaticFilesStorage is and how it works, here is an overview of why we need it at Kogan.com:

[Figure: cache busting static]

In order to deliver content to our customers as fast as possible, we cache the downloaded static files by setting max-age in the Cache-Control response header. This allows our customers to download the content once, and subsequent requests for static files are served from a cache. As shown in the diagram, if we were to use normal static file names like base.css, the content of the file would be cached in the CDN as well as in the browser, and we would have a hard time trying to invalidate these caches. We cache-bust the content by adding an MD5 hash of the file's content to the file name. When we deploy a new base.css, the {% static %} template tag turns base.css into base.d1833e.css and the browser then requests the new file. The {% static %} template tag is able to translate base.css into base.d1833e.css thanks to the static files storage backend, which is configured through the STATICFILES_STORAGE setting in Django.
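To make the renaming step concrete, here is a minimal sketch of how a content hash can be worked into a file name. The helper name hashed_name and the 12-character default length are assumptions for illustration; this is not Django's actual implementation.

```python
import hashlib
import os


def hashed_name(name, content, length=12):
    """Return `name` with a short MD5 digest of `content` before the extension.

    Hypothetical helper for illustration, not a Django API.
    """
    digest = hashlib.md5(content).hexdigest()[:length]
    root, ext = os.path.splitext(name)
    return "{0}.{1}{2}".format(root, digest, ext)


# Different content gives a different file name, so caches see a new URL.
old = hashed_name("base.css", b"body { color: black; }")
new = hashed_name("base.css", b"body { color: navy; }")
```

Because the name changes only when the content changes, unchanged files keep their cached copies across deployments.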

Before ManifestStaticFilesStorage

Our Django app was previously configured to use CachedStaticFilesStorage, which placed the file mappings in the CACHES backend (for us, Redis). Django adds these mappings during collectstatic, when it gathers all static files and puts them in one place.

[Figure: collectstatic]

This solution coupled static asset deployment with code deployment, resulting in a number of issues:

  • Running collectstatic as part of code deployment --> slow deploys
  • Extra load on Redis
  • App servers were sometimes out of sync because we deploy them in batches. When we started a deployment, Redis would be updated with the new keys and the first batch of app servers would get the new code, but the remaining servers still had the old code.

[Figure: Out of sync app servers]

ManifestStaticFilesStorage to the rescue

ManifestStaticFilesStorage has helped us to decouple the static compilation stage from deployments by allowing Django to read static file mappings from staticfiles.json on the filesystem. staticfiles.json is an artifact file produced by collectstatic when ManifestStaticFilesStorage is the backend. We can now include this staticfiles.json in our code package and deploy it to a single app server without affecting the others.
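As a rough illustration of the lookup, the manifest is a JSON document that maps original file names to hashed names. The sketch below assumes the mapping sits under a "paths" key; the sample entry and the load_paths and resolve_static_name helpers are hypothetical, and falling back to the original name for unmapped files is a simplification.

```python
import json

# Sample manifest content; the entry is made up for illustration.
SAMPLE_MANIFEST = """
{
    "version": "1.0",
    "paths": {"css/base.css": "css/base.d1833e.css"}
}
"""


def load_paths(manifest_text):
    """Parse manifest JSON and return the name -> hashed name mapping."""
    return json.loads(manifest_text).get("paths", {})


def resolve_static_name(paths, name):
    """Return the hashed name for `name`, or `name` itself if unmapped."""
    return paths.get(name, name)


paths = load_paths(SAMPLE_MANIFEST)
resolved = resolve_static_name(paths, "css/base.css")
```

Since the mapping is a plain file, each app server only needs the copy shipped with its own code package; no shared cache is involved.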

[Figure: New ManifestStaticFilesStorage]

Where is staticfiles.json located?

By default, staticfiles.json resides in STATIC_ROOT, the directory where all static files are collected, and ManifestStaticFilesStorage looks for it there in order to read the mappings. We host all our static assets in an S3 bucket, which means staticfiles.json would end up being synced to S3 by default. However, we wanted it to live in the code directory so we could package it and ship it to each app server. To override the default behaviour, we subclassed ManifestStaticFilesStorage:


import os

from django.conf import settings
from django.contrib.staticfiles.storage import ManifestStaticFilesStorage


class KoganManifestStaticFilesStorage(ManifestStaticFilesStorage):

    def read_manifest(self):
        """
        Looks up staticfiles.json in the project directory
        instead of STATIC_ROOT.
        """
        manifest_location = os.path.abspath(
            os.path.join(settings.PROJECT_ROOT, self.manifest_name)
        )
        try:
            # Read bytes so that decode() works on both Python 2 and 3
            with open(manifest_location, 'rb') as manifest:
                return manifest.read().decode('utf-8')
        except IOError:
            # No manifest file has been produced yet
            return None

With the above change, Django's static template tag now reads the mappings from the staticfiles.json that resides in the project root directory.
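For the subclass to take effect, the storage setting has to point at it. A sketch of the relevant settings, assuming a hypothetical module path myproject.storage and assuming PROJECT_ROOT is the directory that contains the settings module:

```python
# settings.py (fragment)
import os

# Directory that holds the packaged staticfiles.json.
# Assumption: it is the directory containing this settings module.
PROJECT_ROOT = os.path.dirname(os.path.abspath(__file__))

# Hypothetical dotted path; point it at wherever the subclass lives.
STATICFILES_STORAGE = 'myproject.storage.KoganManifestStaticFilesStorage'
```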

Thanks Django

Thanks to Django 1.7, we've not only gotten a better schema migration system but also improved our deployment process. It is also worth mentioning that the ManifestStaticFilesStorage addition was only 40-50 lines of code (as of the day this blog post was published).

Tips for writing unit tests for Django middleware

The Django framework provides developers with great testing tools, and it's dead easy to write tests for views using Django's test client. It has extensive documentation on how to use django.test.Client to write automated tests. However, we often want to write tests for components that we have no control over when using django.test.Client. An example of that is Django middleware, which is used to add business logic either before or after view processing. django.test.Client has no public API for developers to access the internal request object.

Here is a simple example of a middleware class that creates a stash from data saved in the session.


class Stash(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

class StashMiddleware(object):
    """
    Reconstructs the stash object from session
    and attach it to the request object
    """
    def process_request(self, request):
        stashed_data = request.session.get('stashed_data', None)
        # Instantiate the stash from data in session
        if stashed_data is None:
            stash = Stash()
        else:
            stash = Stash(**stashed_data)
        # Attach the stash to request object
        request.stash = stash
        return None

Let's analyze what needs to be tested:
1. Assert that if the stashed data exists in the session, it should be set as an attribute of the request
2. Assert that if the stashed data doesn't exist in the session, an empty stash is created and attached to the request object
3. Assert that all attributes of the stash can be accessed

How about dependencies? What do we need in order to write this test?
- StashMiddleware class (this can be easily imported)
- request object as an argument in process_request(). This one is a bit harder to obtain, and since we are writing a unit test, let's just mock it.

We are now ready to write the test:


from django.test import TestCase
from mock import Mock
from bugfreeapp.middleware import StashMiddleware, Stash

class StashMiddlewareTest(TestCase):

    def setUp(self):
        self.middleware = StashMiddleware()
        self.request = Mock()
        self.request.session = {}

This sets up an instance of StashMiddleware and mocks a request. I'm using Michael Foord's mock library to assist me with this. Since we know the session is a dictionary-like object, we can mock it with an empty dictionary.


    def test_process_request_without_stash(self):
        self.assertIsNone(self.middleware.process_request(self.request))
        self.assertIsInstance(self.request.stash, Stash)

    def test_process_request_with_stash(self):
        data = {'foo': 'bar'}
        self.request.session = {'stashed_data': data}
        self.assertIsNone(self.middleware.process_request(self.request))
        self.assertIsInstance(self.request.stash, Stash)
        self.assertEqual(self.request.stash.foo, 'bar')

The first test asserts that (without stashed data in the session):
- process_request returns None
- Stash object has been attached to request

The second test asserts that:
- process_request returns None
- Dictionary containing data in session is unpacked and used to create a Stash object.
- Stash attributes can be accessed

In both cases, we assert on the return value of process_request. This might sound like a redundant thing to test, but it actually helps us to identify regressions. When process_request returns anything other than None, Django treats it as the response and skips the subsequent middleware and the view. Knowing that process_request returns None, we don't have to worry about this middleware short-circuiting the request.
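To see why the return value matters, here is a simplified model of a request passing through a chain of process_request hooks: the first hook that returns something other than None ends the chain. The run_request_middleware function and both middleware classes are hypothetical stand-ins, not Django's handler code.

```python
class PassThroughMiddleware(object):
    """Records that it ran and lets the request continue."""

    def __init__(self, name, calls):
        self.name = name
        self.calls = calls

    def process_request(self, request):
        self.calls.append(self.name)
        return None


class ShortCircuitMiddleware(PassThroughMiddleware):
    """Records that it ran, then returns a response-like value."""

    def process_request(self, request):
        self.calls.append(self.name)
        return "early response"


def run_request_middleware(middlewares, request):
    """Call each hook in order; stop at the first non-None result."""
    for middleware in middlewares:
        response = middleware.process_request(request)
        if response is not None:
            return response
    return None


calls = []
chain = [
    PassThroughMiddleware("first", calls),
    ShortCircuitMiddleware("second", calls),
    PassThroughMiddleware("third", calls),
]
# "third" never runs because "second" returned a value.
result = run_request_middleware(chain, request={})
```

An accidental return value in a middleware like StashMiddleware would have the same effect as "second" here, which is what the assertIsNone checks guard against.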

Tips

  • Not all tests can be written with django.test.client.Client.
  • Keep your dependencies for unit tests as light as possible, use mocks.
  • Write unit tests that run fast. Don't test ORM or network calls; try using mock.patch instead.
  • Revisit your code if you have a hard time trying to set up dependencies; that normally indicates that the code is too tightly coupled.
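As a small illustration of the mock.patch tip, the sketch below replaces a network call with a mock so the test never leaves the process. It uses unittest.mock from the Python 3 standard library, which offers the same API as the mock package; fetch_status and the URL are hypothetical.

```python
import unittest
from unittest import mock
from urllib import request as urlrequest


def fetch_status(url):
    """Return the HTTP status code for `url` (hypothetical code under test)."""
    response = urlrequest.urlopen(url)
    return response.getcode()


class FetchStatusTest(unittest.TestCase):

    def test_fetch_status_without_network(self):
        fake_response = mock.Mock()
        fake_response.getcode.return_value = 200
        # Patch urlopen on the module fetch_status reads it from,
        # so no real request is sent.
        with mock.patch.object(
            urlrequest, "urlopen", return_value=fake_response
        ) as fake_urlopen:
            status = fetch_status("http://example.invalid/health")
        self.assertEqual(status, 200)
        fake_urlopen.assert_called_once_with("http://example.invalid/health")
```

The same pattern applies to ORM calls: patch the query at the point where the code under test looks it up and return canned data.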
