Mike Nelson

Storehouse: Distributed Rails Page Cache

@ 04 Jan 2013

ruby rails database mysql


What?

Storehouse provides a cache layer that wraps Rails’ page caching strategy. It provides a middleware that returns content from a centralized cache store and writes files to your local machine on-demand, allowing distribution to multiple servers. Cache stores can be easily defined by using or creating an adapter.

Features:

  • Distributed Rails Page Cache
  • Cache Expiration
  • Cache Clearing on demand
  • Many potential storage engines
  • Optional file storage in addition to shared cache (fallback to be served by nginx/apache)
  • Single generator / In-progress locks

Logic

With any application, it’s a good idea to cache whatever pages you can. When an application spans many servers, it’s also a good idea to cache pages in a common place so they can be shared. We couldn’t find an existing gem that offered the features listed above, so we built Storehouse.

The main goal of Storehouse was a common cache that can be expired. This means that I can cache a user’s profile, but tell all servers to update their cache when a change is made to that user’s attributes. I don’t want any web server to communicate directly with any other web server to do this. Using a shared disk (NFS) was too slow, as was S3. In production we started off using Riak as the store. We quickly figured out that we could fit the body of our cache in Redis, which is much faster. We are currently using Storehouse with a Redis backend.

You can see from the diagram the hierarchy of cache hits. Serving a static HTML file through nginx will always be the fastest, so you want to prefer that whenever possible. Next fastest is loading a fully rendered page from Storehouse, and slowest is a full Rails request. One thing to note is that files you opt to keep on disk will not be expired without a deployment or a periodic rm * on disk, and this will only happen under special circumstances.

How?

Storehouse supports the following storage engines (add more if you want):

  • Memcached
  • Dalli
  • Redis
  • Riak
  • S3
  • In-memory (great for tests)

Storehouse extends the Rails page cache using the method you define. You can configure expiration times for your content and rules for which types of pages to cache (more detail in the readme). Because we make use of Railties, Storehouse only works with Rails 3.0 and up.

Storehouse is resilient to errors. If your storage backend is taking too long or cannot be reached, Storehouse passes the request down to the Rails stack to render the page in the normal way, then tries to save the result after the response has been sent to the user.
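The fallback described above can be sketched in plain Ruby. This is not Storehouse's source: read_or_render, the budget parameter and the hash-like store are names assumed for illustration. For brevity the sketch writes to the store inline, whereas Storehouse saves the result after the request has been served.

```ruby
require 'timeout'

# Hypothetical helper (not part of Storehouse's API): try the cache store
# within a time budget, and fall back to rendering when it is slow or down.
def read_or_render(store, path, budget: 0.05)
  cached = begin
    Timeout.timeout(budget) { store[path] }
  rescue Timeout::Error, StandardError
    nil # a slow or failing backend is treated as a cache miss
  end
  return [cached, :cache] if cached

  body = yield # stand-in for the Rails stack rendering the page

  begin
    Timeout.timeout(budget) { store[path] = body }
  rescue Timeout::Error, StandardError
    # The page is already rendered, so a failed write is simply ignored.
  end
  [body, :rendered]
end

store = {} # any object that responds to [] and []= works here
read_or_render(store, '/users/1') { '<html>profile</html>' } # renders, then stores
read_or_render(store, '/users/1') { raise 'not reached' }    # served from the store
```

The important property is that a broken backend can only ever cost you the time budget, never the response itself.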

The Storehouse middleware listens for custom headers from your application in order to know what to do:

  • To make Storehouse push the rendered content into the backend, add an X-Storehouse header with a string value of '1': response.headers['X-Storehouse'] = '1' if cache_page?
  • Optionally, you can pass an expiration time for the content: response.headers['X-Storehouse-Expires-At'] = 10.minutes.from_now.to_i.to_s
  • You can also tell Storehouse to distribute the content: response.headers['X-Storehouse-Distribute'] = '1'

These headers will never reach your end user as they are always stripped out.
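To make the header handling concrete, here is a rough Rack-style sketch of a middleware that reads the three headers, records the page, and strips the headers before the response leaves the stack. It is an illustration only: the class name StorehouseHeaderSketch, the hash used as a store and the shape of the stored entry are assumptions, not Storehouse's actual implementation.

```ruby
# Illustrative only; NOT Storehouse's source. The class name and the
# hash-based store are assumptions made for this sketch.
class StorehouseHeaderSketch
  CONTROL_HEADERS = %w[
    X-Storehouse
    X-Storehouse-Expires-At
    X-Storehouse-Distribute
  ].freeze

  def initialize(app, store)
    @app = app
    @store = store
  end

  def call(env)
    status, headers, body = @app.call(env)
    headers = headers.dup

    # Remove the control headers so they never reach the end user.
    control = CONTROL_HEADERS.each_with_object({}) do |name, found|
      found[name] = headers.delete(name)
    end

    if control['X-Storehouse'] == '1'
      chunks = []
      body.each { |chunk| chunks << chunk }
      expires_at = control['X-Storehouse-Expires-At']
      @store[env['PATH_INFO']] = {
        body: chunks.join,
        expires_at: expires_at ? expires_at.to_i : nil,
        distribute: control['X-Storehouse-Distribute'] == '1'
      }
    end

    [status, headers, body]
  end
end
```

Wrapping any Rack-compatible app with this class and calling it with a PATH_INFO of '/about' leaves an entry under '/about' in the store, while the returned headers contain no X-Storehouse keys.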

Distribution is a great way to share cached resources on a multi-box setup. If you would like Storehouse to distribute the rendered page across all boxes, simply add the X-Storehouse-Distribute header. This will do three things:

  1. Add the content to the backend
  2. Mark the content as distributable
  3. Lay the file on the server handling the current request. This means only one server does the work, but your entire system reaps the benefits.
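The three steps, plus the on-demand file write on the other servers, can be modeled with a toy example. Everything here is hypothetical (the Box class, its method names, the hash standing in for the shared backend); it only shows the flow, not how Storehouse implements it.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical model of one web server. Names are assumptions made for this
# sketch and are not taken from Storehouse's source.
class Box
  def initialize(shared_store, cache_dir)
    @store = shared_store
    @cache_dir = cache_dir
  end

  # The box that rendered the page performs the three steps.
  def publish(path, html)
    @store[path] = { body: html, distribute: true } # steps 1 and 2
    write_file(path, html)                          # step 3
  end

  # Any other box lays the file on its own disk the first time it reads a
  # distributable entry from the shared store.
  def fetch(path)
    entry = @store[path]
    return nil unless entry

    write_file(path, entry[:body]) if entry[:distribute]
    entry[:body]
  end

  def file_for(path)
    File.join(@cache_dir, "#{path}.html")
  end

  private

  def write_file(path, html)
    target = file_for(path)
    FileUtils.mkdir_p(File.dirname(target))
    File.write(target, html)
  end
end

Dir.mktmpdir do |root|
  store = {} # stands in for the shared backend (Redis in our case)
  box_a = Box.new(store, File.join(root, 'box_a'))
  box_b = Box.new(store, File.join(root, 'box_b'))

  box_a.publish('/users/1', '<html>profile</html>') # only box A renders
  box_b.fetch('/users/1')                           # box B copies it on demand
end
```

Once the file is on a box's disk, nginx can serve it without touching the shared store or Rails.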

So how much better is it?

On average, we have been seeing about a 40% reduction in page load time for Storehouse cached pages.

A few caveats:

  • We expire cached pages every 10 minutes to ensure that list-view content is always fresh. However, this means that until a cached page gets at least 4.5K hits/month on average, the cache is actually wasting your time. We do have a few cached pages with traffic that low, so we saw an increase in page load time for those pages (which drags down the average improvement).
  • For pages like these, the increase in page load time is accounted for by the additional overhead of storing the cached pages.
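One plausible reading of the 4.5K figure is that it matches the number of 10-minute expiry windows in a month: a page with fewer hits than windows averages less than one hit per window, so most cached copies expire before anyone reads them. The helper name below is made up for this check.

```ruby
EXPIRY_MINUTES = 10

# Number of expiry windows in a month of the given length (hypothetical helper).
def expiry_windows_per_month(days)
  days * 24 * 60 / EXPIRY_MINUTES
end

expiry_windows_per_month(30) # => 4320
expiry_windows_per_month(31) # => 4464
```

Both values round to roughly 4.5K, which is in line with the threshold quoted above.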

Panic Mode

Storehouse provides an especially useful tool that lets you switch your site into “panic” mode. Panic mode is for when you’re experiencing massive load due to a traffic spike. Serving files from disk is always going to be more efficient, so Storehouse will attempt to make use of that. This is a destructive operation in that any content coming from your backend, or from your app with the X-Storehouse header, will be written to disk. It will also serve expired content read from the cache instead of passing control to your app.
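The expired-content rule can be illustrated with a small lookup function. This is a sketch under assumptions: serve_from_cache, the panic flag and the entry hash are hypothetical names, and the write-to-disk part of panic mode is left out.

```ruby
# Hypothetical lookup; `panic` stands in for however panic mode is toggled.
def serve_from_cache(entry, now:, panic: false)
  return nil if entry.nil?

  expired = entry[:expires_at] && entry[:expires_at] <= now
  return nil if expired && !panic # normal mode: let the app re-render

  entry[:body] # fresh content, or stale content while in panic mode
end

entry = { body: '<html>old</html>', expires_at: 100 }
serve_from_cache(entry, now: 200)              # nil: control passes to the app
serve_from_cache(entry, now: 200, panic: true) # the stale page is still served
```

The trade-off is deliberate: during a spike, a slightly stale page is better than a request that reaches Rails.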

Storehouse is available now. Enjoy!
