Django, Nginx, WSGI and encoded slashes

You are serving a Django application using Nginx to proxy to an Apache server running mod_wsgi and you want to allow slashes in your URL keywords.

For example, you may want to edit some attribute of the page at URL /; hence, you want to use a URL regex of the form:

url(r'/edit/page/(?P<page_url>.*)/$', ...)

and use the URL /edit/page/%2F/ to edit this page, where the third path segment is URL-encoded.

This works fine in local development using Django's runserver but not when Nginx and Apache are involved. Both servers process the incoming request in a way that collapses repeated slashes, so Django sees the above request path as /edit/page/ and the URL regex no longer matches.
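To make the failure concrete, here is a minimal sketch that applies the pattern above with Python's plain re module rather than Django's resolver. The collapse_slashes helper is a hypothetical stand-in for whatever normalisation the proxy chain applies; it is not a function from Nginx, Apache or Django.

```python
import re

# The article's URL pattern, applied with plain `re` for illustration only.
EDIT_PAGE = re.compile(r'/edit/page/(?P<page_url>.*)/$')


def collapse_slashes(path):
    """Hypothetical stand-in for the slash-collapsing a proxy may perform."""
    return re.sub(r'/{2,}', '/', path)


def captured_page_url(path):
    """Return the page_url keyword captured from path, or None if no match."""
    match = EDIT_PAGE.search(path)
    return match.group('page_url') if match else None


# '/edit/page/%2F/' after percent-decoding
decoded = '/edit/page///'

print(captured_page_url(decoded))                    # prints: /
print(captured_page_url(collapse_slashes(decoded)))  # prints: None
```

With the slashes intact, the keyword captures /; once they are collapsed to /edit/page/, nothing is left for the pattern to capture and the match fails.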

Solution

First, to get Django to encode slashes, pass an empty string to the urlencode template filter. By default the filter treats / as a safe character and leaves it unencoded.

{% url edit-page url|urlencode:"" %}
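The filter's argument plays the same role as the safe parameter of the standard library's quote function, so the effect can be sketched outside a template. This example assumes Python 3 (urllib.parse) and uses only the standard library, not Django itself; the sample value 'docs/intro page' is made up for illustration.

```python
from urllib.parse import quote

# By default '/' is considered safe and is left as-is.
print(quote('/'))                         # prints: /

# An empty safe string forces the slash to be percent-encoded.
print(quote('/', safe=''))                # prints: %2F

# Hypothetical page URL containing both a slash and a space.
print(quote('docs/intro page', safe=''))  # prints: docs%2Fintro%20page
```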

Next, ensure Nginx's proxy_pass configuration transmits the URL in 'unprocessed form' by omitting the path on the proxied server argument. That is, instead of:

proxy_pass http://localhost:80/;

use:

proxy_pass http://localhost:80;

The only difference between these two examples is the trailing slash. When proxy_pass includes a URI part (even a bare /), Nginx passes on its normalised form of the request URI; without one, it passes the URI as the client sent it. See the nginx documentation for proxy_pass for more details on what 'unprocessed' means.

Next, alter your Apache config to include the AllowEncodedSlashes directive to ensure Apache recognises encoded slashes; by default it rejects URLs containing them with a 404 response:

<VirtualHost *>
    ...
    AllowEncodedSlashes On
    ...
</VirtualHost>

Finally, modify your WSGI script to ensure Django gets the slashes in the PATH_INFO variable of the WSGI environ, which it uses to resolve the URL to a view function:

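# ... other WSGI stuff: setting up path, virtualenv etc

import os
import urllib  # Python 2; on Python 3 use urllib.parse.unquote instead

os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'
import django.core.handlers.wsgi
_application = django.core.handlers.wsgi.WSGIHandler()

def application(environ, start_response):
    # Rebuild PATH_INFO from the raw REQUEST_URI: drop the query string,
    # then decode percent-escapes so that %2F reaches Django as a slash.
    raw_path = environ['REQUEST_URI'].split('?', 1)[0]
    environ['PATH_INFO'] = urllib.unquote(raw_path)
    return _application(environ, start_response)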

The key change is using the REQUEST_URI variable to set PATH_INFO. We pluck the path component from REQUEST_URI and use urllib.unquote to ensure encoded slashes are decoded.
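The wrapper can be exercised without Apache or Django. The sketch below assumes Python 3 (hence urllib.parse.unquote) and swaps Django's handler for a hypothetical echo_path_app that simply returns the PATH_INFO it was given. The names restore_path_info, echo_path_app and fake_start_response are illustrative only, and the environ dict is hand-built to mimic a request whose PATH_INFO has already had its slashes collapsed.

```python
from urllib.parse import unquote


def restore_path_info(app):
    """Wrap a WSGI app so PATH_INFO is rebuilt from the raw REQUEST_URI."""
    def wrapper(environ, start_response):
        # Strip the query string, then decode percent-escapes such as %2F.
        raw_path = environ['REQUEST_URI'].split('?', 1)[0]
        environ['PATH_INFO'] = unquote(raw_path)
        return app(environ, start_response)
    return wrapper


def echo_path_app(environ, start_response):
    """Hypothetical stand-in for Django: replies with the PATH_INFO it saw."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ['PATH_INFO'].encode('utf-8')]


application = restore_path_info(echo_path_app)


def fake_start_response(status, headers):
    """No-op start_response callable for exercising the app by hand."""


# Hand-built environ: PATH_INFO has lost the encoded slash, REQUEST_URI has not.
environ = {
    'REQUEST_URI': '/edit/page/%2F/?draft=1',
    'PATH_INFO': '/edit/page/',
}
body = application(environ, fake_start_response)
print(body)  # prints: [b'/edit/page///']
```

Note that REQUEST_URI is not part of the WSGI specification; the approach relies on the server placing it in the environ.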

Discussion

The PATH_INFO variable is decoded by mod_wsgi before Django sees it, effectively collapsing repeated slashes. REQUEST_URI holds the raw request URI, so it can be used to ensure encoded slashes make it through to Django.

Published at DZone with permission of the author, DZone MVB.