Varnish for Drupal 8

Our team had been using Varnish for a long time in our Dropsolid Drupal 7 project, and we thought the time had come to get it working for Drupal 8 as well. That is why our CTO, Nick Veenhof, organized a meetup about Caching & Purging in Drupal 8. Niels van Mourik gave an elaborate presentation about the Purge module and how it works.
I definitely recommend watching the video and the slides on his blog. In this blog post, we’ll elaborate and build on what Niels explained to us that day. 
First, let’s start off with a quick crash course on what Varnish actually is and how it can benefit your website.

 

Varnish 101

“Varnish Cache is a web application accelerator also known as a caching HTTP reverse proxy. You install it in front of any server that speaks HTTP and configure it to cache the contents. Varnish Cache is really, really fast. It typically speeds up delivery with a factor of 300 - 1000x, depending on your architecture.” (Source: Varnish-cache.org)

In layman’s terms, Varnish will serve a webpage from its own internal cache if it has it available. This drastically reduces the number of requests to the webserver where your application is hosted. This will, in turn, free up resources on your webserver, so your web application can handle more complicated tasks and more users.

In short, Varnish will make your web application faster and will allow you to scale it more efficiently.

How we used Varnish with Drupal 7

How did things typically work in D7? Well, you’d put a Varnish server with a Drupal-compatible Varnish configuration file (VCL) in front of your Drupal 7 site and it would start caching it right away - depending, of course, on what is in the VCL and the headers your Drupal site sent.
The next step would be to install the Varnish module from Drupal.org. This module’s sole purpose is to invalidate your Varnish cache, and it does so using telnet. This also requires the Varnish server to be accessible from the Drupal backend. This isn’t always an ideal scenario, certainly not when multiple sites are being served by the same Varnish instance.

The biggest issue when using Drupal 7 with the Varnish module is that invalidation of content just isn’t smart enough. For instance, if you updated one news item, you’d want only that page and the pages where the news item is visible to be removed from Varnish’s cache. But that isn’t possible. This isn’t the module’s fault at all - it’s simply the way Drupal 7 was built. There are a few alternatives that do make it a little smarter, but these solutions aren’t foolproof either.

Luckily, Drupal 8 is a whole new ballgame!

How we use Varnish with Drupal 8

Drupal 8 itself is very smart about caching. Its cache system revolves around three main pillars (explained here from a page cache perspective; Drupal will also cache parts of pages):

  • Cache tags: A page gets a list of tags based on the content that is on it. For instance, if you have a news overview, all rendered news items will be added as tags to that page. This allows you to invalidate that cached page if one of those news items changes.
  • Cache contexts: A page can differ based on variables from the current request, for instance a news overview that filters its news items based on a query parameter.
  • Cache max-age: A page can be served from cache for X amount of time. After that time has passed, it needs to be rebuilt.

You can read more about Drupal 8’s new caching system here.

All about invalidation

Niels van Mourik created a module called Purge. This is a modular external cache invalidation framework. It leverages Drupal’s cache system to provide easy access to the cache data so you only need to focus on the communication with an external service. It already has a lot of third-party integrations available like Acquia Purge, Akamai and Varnish Purge.

We are now adding another one to the list: the Dropsolid Purge module.

What does Dropsolid Purge do and why do I need it?

The Dropsolid Purge module enables you to invalidate caches in multiple Varnish load balancers. It also lets you cache multiple web applications with the same Varnish server. The module was heavily inspired by the Acquia Purge module, and we reused a lot of its initial code because it has a smart way of handling invalidation through tags, but we’ll get into that a little later. The problem with the Acquia Purge module is that it is designed to work on Acquia Cloud: it depends on certain environment variables, and the Varnish configuration is proprietary knowledge of Acquia. This means that it isn’t usable in other environments.

We also experimented with the Varnish Purge module, but it lacked support for cache invalidation when multiple sites or multisites are cached by a single Varnish server. This is because the module doesn’t tell Varnish which site it should invalidate pages for, so it invalidates pages for all the sites. It also doesn’t pass along invalidation requests very efficiently. It offers two ways of sending invalidation requests to Varnish: one by one or bundled together. The one-by-one option results in a lot of requests, given that updating a single node can easily invalidate 30 tags. Using the Bundle purger could make you hit the limit of your header size, but more on that later.

What's in the bag?

Currently we provide the following features:

  • Support for tag invalidation and everything invalidation,
  • The module will only purge tags for the current site by using the X-Dropsolid-Site header,
  • The current site is defined by the name you set in config and the subsite directory,
  • Support for multiple load balancers,
  • There is also a default vcl in the examples folder that contains the logic for the bans.

It can be used for any environment if you just follow the installation instructions in the readme.

Under the hood

Preparing and handling the responses for/by Varnish

By default, the module will add two headers to every response it gives:

  • X-Dropsolid-Site: A unique site identifier, hashed from config (which you provide through settings.php) and site parameters:
    • A site name
    • A site environment
    • A site group
    • The path of your site (e.g. sites/default or sites/somesubsite)
  • X-Dropsolid-Purge-Tags: A hashed version of each cache tag on the current page (hashed to keep the length low and avoid hitting the maximum header size).
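
As a rough illustration of how those two header values could be derived, here is a minimal Python sketch. The hashing scheme (a truncated SHA-256 digest), the separators, and the function names `short_hash`, `site_identifier` and `purge_tags_header` are all assumptions made for this example; the module itself is written in PHP and may well hash differently.

```python
import hashlib
from typing import Iterable


def short_hash(value: str, length: int = 8) -> str:
    """Return a short, stable hex digest of value (assumed scheme)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:length]


def site_identifier(name: str, environment: str, group: str, site_path: str) -> str:
    """Combine the four site parameters into one hashed identifier."""
    return short_hash("|".join([name, environment, group, site_path]), length=16)


def purge_tags_header(cache_tags: Iterable[str]) -> str:
    """Hash every cache tag and join the hashes with spaces (assumed separator)."""
    return " ".join(short_hash(tag) for tag in cache_tags)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(site_identifier("example", "production", "main", "sites/default"))
    print(purge_tags_header(["node:1", "node:2", "node_list"]))
```

Because the site path is part of the input, two multisites that share the same name, environment and group still end up with different identifiers.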

When the response reaches Varnish, it will save those headers along with the cache object. This will allow us to target these specific cache objects for invalidation.

In our VCL file we also strip those headers, so they aren’t visible to the end user:

sub vcl_deliver {
    unset resp.http.X-Dropsolid-Purge-Tags;
    unset resp.http.X-Dropsolid-Site;
}

Invalidation of pages

For Drupal 8 we no longer use telnet to communicate with Varnish; we send a BAN request instead. This request is sent from our site, and Varnish only accepts it when it comes from our site. We currently enforce this by validating the IP of the request against a list of IPs that are allowed to send BAN requests.
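
The IP check can be pictured with the following small Python sketch. It only imitates what a Varnish ACL does; the addresses in `PURGE_ACL` and the function names are hypothetical and chosen purely for illustration.

```python
import ipaddress

# Hypothetical allow-list; in Varnish this role is played by the 'purge' ACL.
PURGE_ACL = ["127.0.0.1", "10.0.0.0/24"]


def is_allowed(client_ip: str, acl=PURGE_ACL) -> bool:
    """Return True when client_ip falls inside one of the ACL entries."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(entry) for entry in acl)


def ban_status(client_ip: str) -> int:
    """Return 403 for clients outside the ACL, 200 for allowed clients."""
    return 200 if is_allowed(client_ip) else 403
```

With this allow-list, `ban_status("10.0.0.42")` returns 200, while a request from `192.0.2.10` is rejected with 403.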

As we mentioned earlier, we provide two ways of invalidating cached pages in Varnish:

  • Tag invalidation: We invalidate pages which have the same cache tags as we send in our BAN request to Varnish.
  • Everything invalidation: We invalidate all pages which are from a certain site.

Tag invalidation

Just like the Acquia Purge module, we send a BAN request that contains a group of 12 hashed cache tags, which are then compared to what Varnish has saved. We also pass along the unique site identifier to indicate that we only want to invalidate pages for a specific site.

Our BAN request has the following headers:

  • X-Dropsolid-Purge: Unique site identifier
  • X-Dropsolid-Purge-Tags: 12 hashed tags
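
To make the grouping concrete, here is a hedged Python sketch of how the header sets for these BAN requests could be assembled. It assumes the hashed tags are joined with `|` so that Varnish can treat the value as a regex alternation, and that a final group may hold fewer than 12 tags; neither detail is confirmed by the module, and `chunk` and `tag_ban_headers` are hypothetical names.

```python
GROUP_SIZE = 12  # number of hashed tags per BAN request, as described above


def chunk(items, size=GROUP_SIZE):
    """Split items into consecutive groups of at most size elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def tag_ban_headers(site_id, hashed_tags):
    """Build one header set per BAN request needed to cover hashed_tags."""
    return [
        {
            "X-Dropsolid-Purge": site_id,
            # Assumption: '|' turns the group into a regex alternation.
            "X-Dropsolid-Purge-Tags": "|".join(group),
        }
        for group in chunk(list(hashed_tags))
    ]
```

Under these assumptions, a node update that invalidates 30 tags results in three BAN requests (12, 12 and 6 tags) rather than 30 individual ones or a single very long header.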

When Varnish picks up this request, it will go through the following logic:

# Example ACL; replace the entries with the IP addresses of your own web servers.
acl purge {
    "127.0.0.1";
}

sub vcl_recv {
    # Only allow BAN requests from IP addresses in the 'purge' ACL.
    if (req.method == "BAN") {
        if (!client.ip ~ purge) {
            return (synth(403, "Not allowed."));
        }

        # Logic for banning based on tags
        # https://Varnish-cache.org/docs/trunk/reference/vcl.html#vcl-7-ban
        if (req.http.X-Dropsolid-Purge-Tags) {
            # Add bans for tags but only for the current site requesting the ban
            ban("obj.http.X-Dropsolid-Purge-Tags ~ " + req.http.X-Dropsolid-Purge-Tags + " && obj.http.X-Dropsolid-Site == " + req.http.X-Dropsolid-Purge);
            return (synth(200, "Ban added."));
        }
    }
}

We check if the request comes from an IP that is whitelisted. We then add bans for every cache object that matches our unique site identifier and matches at least one of the cache tags we sent along. 
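
The matching rule can be simulated outside Varnish. The Python sketch below is only a model of the ban expression, assuming the stored tags header holds space-separated hashes and the BAN request carries a `|`-separated alternation; the class, function and sample values are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class CachedObject:
    url: str
    site: str  # stored X-Dropsolid-Site header
    tags: str  # stored X-Dropsolid-Purge-Tags header (space-separated hashes)


def is_banned(obj: CachedObject, ban_site: str, ban_tags: str) -> bool:
    """Model of: obj tags ~ ban_tags && obj site == ban_site."""
    return obj.site == ban_site and re.search(ban_tags, obj.tags) is not None


cache = [
    CachedObject("/news", "site-a", "1a2b3c4d 5e6f7a8b"),
    CachedObject("/about", "site-a", "9c0d1e2f"),
    CachedObject("/news", "site-b", "1a2b3c4d"),
]

banned = [o.url + "@" + o.site for o in cache if is_banned(o, "site-a", "1a2b3c4d|ffffffff")]
print(banned)  # ['/news@site-a']
```

Only the news page of site-a is invalidated: the about page shares no tag with the request, and the page on site-b carries the same tag but a different site identifier.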

You can easily test this by updating a node and seeing that Varnish will be serving you a new version of its page. 
 

Everything invalidation

When the everything invalidation is triggered, a BAN request is sent with the following headers:

  • X-Dropsolid-Purge-All: True
  • X-Dropsolid-Purge: Unique site identifier

And we execute the following logic on Varnish’s side:
 

sub vcl_recv {      
    # Only allow BAN requests from IP addresses in the 'purge' ACL.
    if (req.method == "BAN") {
        # Same ACL check as above:
        if (!client.ip ~ purge) {
            return (synth(403, "Not allowed."));
        }
        # Logic for banning everything
        if (req.http.X-Dropsolid-Purge-All) {
            # Add bans for the whole site
            ban("obj.http.X-Dropsolid-Site == " + req.http.X-Dropsolid-Purge);
            return (synth(200, "Ban added."));
        }
    }
}

When Varnish receives a BAN request with the X-Dropsolid-Purge-All header, it will ban all cache objects that have the same unique site identifier. You can easily test this by executing the following command: drush cache-rebuild-external.

Beware: a normal drush cache-rebuild will not invalidate an external cache like Varnish.

Why this matters

To us, this is yet another step in making our cache smarter, our web applications faster and our servers leaner. If you have any questions about this post, you can always leave a comment in the comment section below or open an issue on drupal.org.

Are you looking for a partner that will help you to speed up your site, without having to switch hosting? The Dropsolid Platform helps you to adjust and streamline your development processes, without the typical vendor lock-in of traditional hosting solutions. At Dropsolid, we also offer dedicated hosting, but we never enforce our own platform. Dropsolid helps you to grow your digital business - from every possible angle!