Caching is the process of storing downloaded data for later use, so it can be read from local storage rather than requested again. Making proper use of browser and CDN caching can speed up your website significantly.
How Does Caching Work?
Every user’s browser has a built-in cache, which stores static objects downloaded from websites. The next time the user connects, if the object they’re requesting is still in the cache, the browser loads it from the cache rather than requesting it again, which speeds up page loads and reduces the load on your web server.
The user’s browser is a client-side cache. However, many large sites will also make use of a Content Delivery Network, or CDN. The CDN sits in front of your web server and caches your pages on the server side, usually on multiple edge servers located around the world. This improves latency and performance, and greatly reduces the stress on your web server. If you’d like to learn more about CDNs, you can read our guide on them here.
Cache-Control is a header that you can configure your web server to add to all outgoing responses. Using it, you can specify which resources get cached, and for how long. There are a few things to note, though, before you go adding it site-wide.
Certain pages should never be cached. Anything that requires a user to sign in should not be cached by a CDN, or else you will risk displaying one user’s personal information to others. You can still cache these kinds of pages on the browser side alone (by setting Cache-Control to private). As a general rule, if the page is going to be the exact same for all users, like your home page, you can cache it. Static resources, like CSS and images, can usually be cached, often for much longer.
You’ll also want to make sure you’re setting reasonable Time-To-Live (TTL) values for each resource. TTL controls how long the object will stay in cache before it is considered stale, prompting the browser to request a fresh copy. The trade-off here is between a long caching time and quick updates. You don’t want to cache your home page for a whole year, because you might be changing something on Tuesday. Setting a max age of a few minutes for your home page is long enough to cover immediate reloads, and short enough to allow for swift propagation of updates. Static resources like images, on the other hand, may never change, so you should be fine setting high TTL values for them, even as high as two years.
You can always use versioned filenames to trigger a cache reload. If you release a new version of a CSS style sheet, you can name it styles-1.0.1.css, and the user’s browser (and any CDNs in front of it) will see it as a new file that needs to be redownloaded. Additionally, for some CDNs, you can issue manual invalidations to flush the existing cache without changing any filenames.
How to Use Cache-Control in NGINX
Cache-Control has a few options:
public – May be cached by anyone, including browsers and CDNs. Use this for most static objects.
private – Contains sensitive data that cannot be cached by CDNs or reverse proxies. The user’s browser may cache it locally. Use this for most authenticated pages.
no-cache – Despite the name, it doesn’t disable caching. The browser may still cache the response for performance but must check with the origin server for updates before using it. Use this if you want the user to revalidate each time.
no-store – Disables caching entirely. Use this only for highly sensitive data that shouldn’t be stored anywhere.
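In the Cache-Control header itself, max-age is always given in seconds. NGINX’s expires directive, however, accepts a few more convenient values and converts them for you:
off – the default; NGINX does not add or modify the Expires and Cache-Control headers
-1 (or any negative time) – sets Cache-Control: no-cache and an Expires date in the past
epoch – sets Expires to Unix time zero and Cache-Control: no-cache, so browsers and proxies treat the response as already stale (useful if you’re using NGINX as a reverse proxy); it does not purge anything that is already cached
max – sets Expires to the 31st of December, 2037, and max-age to 10 years
30s, for seconds
1m, for minutes
24h, for hours
3d, for days
1M, for months (30 days each)
2y, for years (365 days each)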
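As a rough illustration of how a few of these values behave, here is a sketch meant to sit inside a server block. The paths are hypothetical, and the header values in the comments follow the documented behavior of the expires directive:

```nginx
location /api/ {
    # Negative time: sends "Cache-Control: no-cache" and an Expires date in the past
    expires -1;
}

location /news/ {
    # Five minutes: sends "Cache-Control: max-age=300"
    expires 5m;
}

location /static/ {
    # Sends an Expires date at the end of 2037 and "Cache-Control: max-age=315360000" (10 years)
    expires max;
}

location /legacy/ {
    # Leaves any existing Expires and Cache-Control headers untouched
    expires off;
}
```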
Additionally, you can add the no-transform directive, which disables any conversions that may be done to the resource. For example, some CDNs compress images to reduce bandwidth. This directive disables that behavior.
For NGINX, you can modify the Cache-Control headers with the following directives:
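A minimal sketch, assuming the standard expires and add_header directives from NGINX’s headers module, with values chosen to match the description that follows:

```nginx
# Sets the Expires header and "Cache-Control: max-age=31536000" (1 year)
expires 1y;
# Adds a second Cache-Control header field carrying the remaining directives
add_header Cache-Control "public, no-transform";
```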
The first line sets the max-age to 1 year (along with a matching Expires header), and the second sets the public and no-transform caching settings. You can add this to a server block to apply it site-wide, but a better method is to match file extensions with a location block to set different values depending on the file extension:
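A minimal sketch of such a block; the extension list and the one-year lifetime are illustrative assumptions rather than recommendations:

```nginx
# "~" starts a regular expression match; the trailing "*" makes it case-insensitive
location ~* \.(?:css|js|jpg|jpeg|png|gif|ico|svg)$ {
    expires 1y;
    add_header Cache-Control "public, no-transform";
}
```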
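This location block uses a regular expression match, denoted by the ~. This is useful for applying general settings for content type. If you want to make exceptions for specific locations, be aware that a plain prefix location block does not override a regex match: NGINX first finds the longest matching prefix, then checks the regex locations in order, and the first regex that matches wins. To make a prefix location take precedence over regex matches, add the ^~ modifier to it.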
You can also use the = modifier, which matches paths exactly, and will take precedence over a regex match and a standard location block.
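The sketch below shows how these modifiers interact, assuming NGINX’s documented selection order (exact matches first, then ^~ prefixes, then regexes in file order, then the longest plain prefix). The paths and lifetimes are hypothetical:

```nginx
# General rule: long-lived caching for stylesheets and images
location ~* \.(?:css|png|jpg)$ {
    expires 1y;
    add_header Cache-Control "public";
}

# "^~" prefix: if this is the longest matching prefix, the regex above is skipped,
# so /uploads/photo.jpg gets 1 hour instead of 1 year
location ^~ /uploads/ {
    expires 1h;
}

# "=" exact match: checked before everything else, so this one file gets 5 minutes
location = /styles/live.css {
    expires 5m;
}
```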
Use Surrogate-Control to Modify CDN Behavior
While you can disable CDN caching and still leverage browser caching by using Cache-Control: private, it’s better to have direct control over each layer. Many CDNs, Fastly among them, will respect the Surrogate-Control header, which takes a max-age value just like Cache-Control but is meant only for CDNs. This way, you can tell the CDN to do one thing, and the user’s browser to do another.
In NGINX, you’ll have to set this header manually, and set the max-age value instead of using NGINX’s expires directive.
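A minimal sketch, assuming a CDN that honors Surrogate-Control; the extension list and both lifetimes are illustrative:

```nginx
location ~* \.(?:css|js|png|jpg)$ {
    # Read by CDNs that support Surrogate-Control: keep the object at the edge for 1 day
    add_header Surrogate-Control "max-age=86400";
    # Read by browsers: keep the object locally for 10 minutes
    add_header Cache-Control "public, max-age=600";
}
```

Whether the CDN strips Surrogate-Control before passing the response on to the browser depends on the provider.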
You will definitely want to test with your CDN to verify that this works. Surrogate-Control isn’t part of the core HTTP caching standard, and support for it isn’t universal.