
We’ve recently been working on a project here at Media Suite using Django served up by NGINX. We have files stored remotely in an S3 bucket and need end users to be able to download them once we have verified they have permission. The metadata for the files is stored in the Django database.

I felt like this was a reasonably common problem but it took a bit of configuring to get it working in a way I was happy with, so I thought I’d share in the hopes that others find it useful.

The solution below assumes Django, but it should be easily adaptable to any other server backend.

Options

A direct link to the S3 bucket object wasn’t possible, as we need to be able to check the user’s permissions within Django. A direct link is also a problem if you would like to perform any other actions (such as logging the download).

As I saw it, there were three options:

  • Generate a pre-signed S3 object URL that expires within a short time and redirect the user once you’ve checked permissions etc.
  • Download the file to your server in the background first, and then forward it onto them.
  • Proxy the download.

Originally I tried using the pre-signed object URL, but you lose control over the exact headers sent back to the user. S3 does allow you to configure some things here but, as the metadata was stored within the Django database, it wasn’t going to work. Specifically, I wanted to use the Content-Disposition header to set the filename and to force the browser to download the file (rather than opening it directly, depending on the MIME type). It also means that the link is exposed to the user, which could be a security risk if the URL expiry isn’t short enough.
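To make the Content-Disposition requirement concrete, here is a minimal sketch of building a header value that forces a download under a chosen filename. The function name `attachment_disposition` is hypothetical, and the ASCII fallback plus `filename*` combination is one common way to handle non-ASCII names, not necessarily what our project does.

```python
from urllib.parse import quote


def attachment_disposition(filename: str) -> str:
    """Build a Content-Disposition value that forces a download with this name."""
    # ASCII fallback for older clients: non-ASCII characters become "?",
    # and characters that would break the quoted string are replaced.
    fallback = filename.encode("ascii", "replace").decode("ascii")
    fallback = fallback.replace("\\", "_").replace('"', "_")
    # The filename* parameter carries the real name, percent-encoded as UTF-8.
    encoded = quote(filename, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"


if __name__ == "__main__":
    print(attachment_disposition("report.pdf"))
```

The backend would set this value on its response, with the filename taken from the metadata held in the database.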

Downloading the file in the background and then sending it to the user would have worked pretty well. We would have had complete control over the response headers. However, it requires Django to be involved in streaming potentially large files (which isn’t ideal). It may also require storing the file temporarily on the server (streaming it would likely be possible, but I didn’t explore this option).

The above option was my fallback position; however, I was keen to see if there was a way to proxy downloads through NGINX.

X-Accel

NGINX has a mechanism (called X-Accel) to hand off a direct download, which works well for local files. From their documentation:

X-accel allows for internal redirection to a location determined by a header returned from a backend.

This allows you to handle authentication, logging or whatever else you please in your backend and then have NGINX handle serving the contents from redirected location to the end user, thus freeing up the backend to handle other requests. This feature is commonly known as X-Sendfile.

In a nutshell this allows the backend to pass off the actual transmission of the file to NGINX, but you still get an initial hook into protecting it or logging the download. It works by the backend – Django in our case – giving the URL in the response headers. NGINX will intercept that and serve the file.

Here is the simplest example, which would serve up /var/www/files/somefile.png.
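As a rough sketch of the backend side of that hand-off, the snippet below builds the response headers a view might return. It assumes an NGINX location at the hypothetical prefix `/protected/`, marked `internal` and mapped to `/var/www/files/`; the prefix and the helper name `accel_headers` are illustrative assumptions, not taken from any real configuration.

```python
import posixpath

# Hypothetical internal NGINX location prefix (assumed to be marked
# `internal` and mapped to /var/www/files/ in the NGINX config).
INTERNAL_PREFIX = "/protected/"


def accel_headers(relative_name: str) -> dict:
    """Return headers asking NGINX to serve a file below INTERNAL_PREFIX."""
    # Normalising against "/" collapses any ".." segments, so the result
    # can never point above the protected directory.
    cleaned = posixpath.normpath("/" + relative_name).lstrip("/")
    if not cleaned:
        raise ValueError("a file name is required")
    return {"X-Accel-Redirect": INTERNAL_PREFIX + cleaned}


if __name__ == "__main__":
    print(accel_headers("somefile.png"))
```

In Django, these headers would be set on an otherwise empty response after the permission check; NGINX then replaces the body with the file contents.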

This works pretty well for local files, or where you’re doing a simple proxy_pass to another server, but arbitrary URLs require something a bit different.

Using X-Accel for remote URLs

Using this excellent post as a starting point, there is a way to pass the full URL to NGINX. The approach works by passing the protocol, host, and path of the URL through the X-Accel-Redirect header. The parts are then rebuilt by NGINX and used with proxy_pass.
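The splitting step can be sketched as follows. The internal prefix `/protected_download/`, the `scheme/host/path` layout, and the bucket hostname in the usage line are all assumptions chosen for illustration; the NGINX location would need a matching pattern to capture the parts and pass the query string on to the upstream.

```python
from urllib.parse import urlsplit

# Hypothetical internal location prefix; the NGINX side is assumed to
# capture scheme, host and path from the remainder of the URI.
REMOTE_PREFIX = "/protected_download/"


def remote_accel_path(remote_url: str) -> str:
    """Turn an absolute URL into a value for the X-Accel-Redirect header."""
    parts = urlsplit(remote_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("expected an absolute http(s) URL")
    path = parts.path.lstrip("/")
    redirect = f"{REMOTE_PREFIX}{parts.scheme}/{parts.netloc}/{path}"
    # Pre-signed URLs carry their signature in the query string, so keep it.
    if parts.query:
        redirect += "?" + parts.query
    return redirect


if __name__ == "__main__":
    url = "https://example-bucket.s3.amazonaws.com/files/a.png?X-Amz-Expires=60"
    print(remote_accel_path(url))
```

The exact escaping NGINX expects for this header value is not covered by the sketch and is worth checking against your NGINX version.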

Note: Just to be clear, this code assumes the remote_url in the File model is not one that a user would ever be able to set. Otherwise any random file on the internet could appear to come from your site. In our actual code we use the django-storages S3 backend to create a pre-signed S3 object URL with a short expiry, and only store the S3 reference in the File model.
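If there is any doubt about where a URL came from, an explicit host check before handing it to NGINX is a cheap extra guard. This is an illustrative sketch only: the allowlist contents and the helper name `is_trusted_remote` are assumptions, and it is not part of the approach described above.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; in practice this would name your own bucket host(s).
ALLOWED_HOSTS = frozenset({"example-bucket.s3.amazonaws.com"})


def is_trusted_remote(remote_url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Accept only https URLs whose host is explicitly allowlisted."""
    parts = urlsplit(remote_url)
    # .hostname is lower-cased and excludes any userinfo or port, so a URL
    # like https://allowed-host@evil.test/ is judged by its real host.
    return parts.scheme == "https" and parts.hostname in allowed_hosts


if __name__ == "__main__":
    print(is_trusted_remote("https://example-bucket.s3.amazonaws.com/files/a.png"))
```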

This works well. We have been able to run our permission and logging code, as well as set a Content-Disposition response header from our Django code. At this point I was feeling pretty happy with myself, but there is a problem if the remote server returns a redirect.

Handling remote redirects

If the remote server returns a redirect, it is passed straight through to the user (and the user’s browser redirects directly to the remote resource) rather than being handled by NGINX. S3 will occasionally return redirects, especially if you haven’t specified the correct AWS Region.

Thanks to Stack Overflow I found a solution to this:

By handling the relevant 3xx HTTP codes, we can deal with the redirects within NGINX.
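The behaviour we want from the proxy can be modelled in a few lines of Python. To be clear, this is not NGINX configuration; it is a hypothetical sketch of the logic (follow the upstream `Location` internally, up to a hop limit, instead of passing the redirect to the browser). The `fetch` callable and the hop limit are assumptions made for the example.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def resolve_upstream(fetch, url, max_hops=3):
    """Follow upstream redirects internally.

    `fetch(url)` is a stand-in that returns (status, headers) for a URL.
    Returns the final (url, status) pair, or raises if the chain is too long.
    """
    for _ in range(max_hops + 1):
        status, headers = fetch(url)
        location = headers.get("Location")
        if status not in REDIRECT_CODES or not location:
            return url, status
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("too many upstream redirects")


if __name__ == "__main__":
    fake = {
        "https://a.test/f": (307, {"Location": "https://b.test/f"}),
        "https://b.test/f": (200, {}),
    }
    print(resolve_upstream(lambda u: fake[u], "https://a.test/f"))
```

The hop limit matters: without it, a misconfigured upstream that redirects to itself would keep the proxy looping.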

Summary

So there we have it, a mechanism for directly proxying remote URLs. I was pretty happy with it in the end (although it took a fair amount of work getting it configured correctly). I found using Wireshark to check the exact headers really useful as I initially misunderstood some of the NGINX proxy_ directives.

If you want to see a simple app using this, check out the repository. You can run it in a Docker container and try it out.

