Caching is easy; expiration is hard

Still on a high from our success with memcached in AMO version 2, we decided to go a fairly common route and cache query results in version 3. This performs admirably, particularly with our ridiculously long and slow queries. Over time, though, the popularity of the site and the load on the servers climb, and soon we’re looking at slowness issues again. On a rough day we decided to increase the expiration timeout for our queries in memcached from a minute to around an hour. This gives the database servers some breathing room but causes excessive delay on the AMO site when add-ons are updated, and things like bug 425315 and its friends are born. Weird things happen when parts of a site expire at different times, and consequently the user experience (particularly for add-on developers) suffers.
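The timeout-based query caching described above can be sketched roughly as follows. This is a minimal Python illustration with a dict-backed stand-in for a memcached client; all names here are hypothetical and illustrative, not AMO's actual code.

```python
import hashlib
import time

class FakeCache:
    """In-memory stand-in for a memcached client (get/set with a TTL)."""
    def __init__(self):
        self._store = {}

    def set(self, key, value, ttl):
        self._store[key] = (value, time.time() + ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.time() > expires:
            del self._store[key]
            return None
        return value

def cached_query(cache, sql, run_query, ttl=3600):
    """Cache query results under a hash of the SQL text.

    The trade-off described above: a long ttl spares the database,
    but edits can take up to `ttl` seconds to appear on the site.
    """
    key = "sql:" + hashlib.md5(sql.encode()).hexdigest()
    result = cache.get(key)
    if result is None:
        result = run_query(sql)
        cache.set(key, result, ttl)
    return result
```

The whole problem in this post falls out of that last parameter: raising `ttl` from 60 to 3600 seconds is a one-line change, but it silently changes how stale every page on the site can be.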

The problem we’re running into in the bug linked above is knowing when to expire a cache. Consider when an add-on author updates the summary of their add-on. We know we’ll have to flush the queries on that page out of memcached, and that’s easy enough, but what about all the other places the summary is used? Search results pages, add-on detail pages, recommended lists, the API, etc. Now we’ve got to figure out the queries used on those pages and expire them too. Suddenly I’m wishing we were caching objects in memcached instead of queries.

I looked into other ways to use memcached, and they all have their pros and cons. Caching entire pages means we’d have to store different versions for logged-in and logged-out users, and also for each set of permissions (pages have different options for localizers, admins, developers, etc.). Caching objects is attractive, but the way CakePHP does queries makes this a non-option (namely, it doesn’t ask objects for values; it does joins directly on the database). Directly caching queries seems like the best fit because we can affect just the parts of the pages we want and it will work with CakePHP’s current system…just as soon as we figure out how to relate updating a row to all of its associated queries.

I attached an idea to the bug, but regardless of the process we use, figuring out how to implement a full-time cache that we can expire on the fly is going to be an important step in keeping the AMO site usable as our traffic increases.
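One way to relate a row update to its associated queries is to keep a reverse index alongside the cache: when a query result is stored, record which rows it depended on, and when a row changes, flush every query key recorded against it. This is a hypothetical Python sketch in the spirit of the idea attached to the bug, not the actual patch; all names are made up for illustration.

```python
import hashlib

class QueryCache:
    """Explicit invalidation via a reverse index from rows to queries."""
    def __init__(self):
        self._results = {}  # query key -> cached result
        self._deps = {}     # row id (e.g. "addon:42") -> set of query keys

    def store(self, sql, result, row_ids):
        """Cache a result and remember which rows it came from."""
        key = hashlib.md5(sql.encode()).hexdigest()
        self._results[key] = result
        for row_id in row_ids:
            self._deps.setdefault(row_id, set()).add(key)
        return key

    def fetch(self, sql):
        return self._results.get(hashlib.md5(sql.encode()).hexdigest())

    def invalidate(self, row_id):
        """Called when a row changes: drop every query that used it."""
        for key in self._deps.pop(row_id, set()):
            self._results.pop(key, None)
```

The hard part in practice is populating `row_ids` correctly: every query that touches an add-on's data (search, detail pages, the API) has to report that dependency, which is exactly the book-keeping problem discussed above.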

5 Comments

For the different-links-for-different-people case, I would suggest creating a single page with everything in it, and then serving hide-admin/hide-developer/hide-localizer CSS rules in a tiny non-cached CSS file for the current user if they're logged in. You can do the same with JS to stick the user's name in the right place if needed. <noscript> blocks in the content can provide a boring-but-workable alternative for the few who turn off JS for the site.

Explicit invalidation is definitely the way to go, if you can do it, but as you say that requires a fair amount of book-keeping to know what gets invalidated when. Does search invalidation need to be on the same schedule? If you just invalidated the main add-on page on any developer edit, and the rec-list/front page, but left search to expire over 15 mins or whatever, do you think that people would still complain? It would not be Truly Pure, but the main use cases (I made a change to my add-on, and want to make sure it looks OK, etc.) would seem to be well-served.
-- shaver, 23 Apr 2008
One idea is to construct the key for the data you're caching in a way that it is the same regardless of where you get the data from (which is often irrelevant, as you point out). For example, an addon's uuid+"title" to get its title?

Also, instant invalidation is not necessary if you do the above, and make sure to overwrite the previous entry as soon as it is updated with the new value.
-- Håkan W, 23 Apr 2008
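The suggestion above (one canonical per-attribute key, overwritten on update rather than invalidated) amounts to a write-through cache. A minimal Python sketch, assuming a made-up key scheme and function names purely for illustration:

```python
cache = {}  # stand-in for a memcached client

def attr_key(addon_uuid, attr):
    # Same key no matter which page asks: search, detail, API, etc.
    return "addon:%s:%s" % (addon_uuid, attr)

def get_title(addon_uuid, load_from_db):
    """Read through the cache; fall back to the database on a miss."""
    key = attr_key(addon_uuid, "title")
    if key not in cache:
        cache[key] = load_from_db(addon_uuid)
    return cache[key]

def update_title(addon_uuid, new_title, save_to_db):
    """Write-through: overwrite the cache entry as soon as the row
    changes, so no expiration timeout is needed at all."""
    save_to_db(addon_uuid, new_title)
    cache[attr_key(addon_uuid, "title")] = new_title
```

As the reply below notes, this only works if reads go through per-object accessors; it doesn't fit a framework that builds joined SQL queries directly.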
> For the different-links-for-different-people case, I would suggest creating a single page with everything in it, and then serving hide-admin/hide-developer/hide-localizer CSS rules in a tiny non-cached CSS file for the current user if they’re logged in. You can do the same with JS to stick the user’s name in the right place if needed. <noscript> blocks in the content can provide a boring-but-workable alternative for the few who turn off JS for the site.

I think we're starting to do this in a few places already as we rewrite stuff, but it's definitely a direction I see us going.

> If you just invalidated the main add-on page on any developer edit, and the rec-list/front page, but left search to expire over 15 mins or whatever, do you think that people would still complain?

Heh, probably ;), but it would be much more reasonable than what we have. If we can expire the main add-on page and the API (both relatively easy compared to search) I think we'd be in a pretty good place.
-- Wil Clouser, 23 Apr 2008
> One idea is to construct the key for the data you’re caching in a way that it is the same regardless of where you get the data from (which is often irrelevant, as you point out). For example, an addon’s uuid+”title” to get its title?

It's a nice idea, but with the way CakePHP works (straight queries with joins, etc. to the database) it wouldn't work. We'd have to rewrite a lot of the core Cake code.

It's actually a pretty similar idea to what the patch does: it stores arrays of the query hashes themselves so we can invalidate them later (using a reproducible key). I wrote a few paragraphs about what it did, but it seemed a little long-winded for the original post. I put the description in bug 425315 comment #8 instead.
-- Wil Clouser, 23 Apr 2008
Yeah, search sucks because you need to invalidate basically every search that hit on either the old or new version of the data. When Laura reinvents search, I'm sure she'll find a good way to solve that. :)
-- shaver, 23 Apr 2008

Post a comment

Feel free to email me with any comments, and I'll be happy to post them on the articles. This is a static site so it's not automatic.