When caching is bad and you should not cache.
(Everyday Code – instead of keeping our knowledge in a README.md let’s share it with the internet)
On Friday we did some refactoring at FLLCasts.com. We removed Refinery CMS, which is a topic for another article, but one issue popped up – on a specific page, caching was used in a way that made the page very slow. This article is about how and why. It is mainly for our team, as a way to share the knowledge among ourselves, but I think the whole community could benefit, especially the Ruby on Rails community.
TL;DR
When you make a request to a cache service, be it Memcached, Redis or any other, you are making a call to an external service. That call includes a get(key) and, if the value is not in the cache, a set(key, value). When the calculation you are caching is simple, it takes more time to get and set the result in the cache than to do the calculation again, especially if the calculation is a simple string concatenation.
Processors (CPUs) are really good at string concatenation and can do it in single-digit milliseconds. So if you are about to cache something, make sure it is worth caching. There is absolutely no reason to cache the result of:
# Simple string concatenation. You calculate the value. No need to cache it.
value = "<a href=\"#{link}\">Text</a>"

# The same result, but with caching.
# There isn't a universe in which the code below will be faster than the code above.
# (calculate_hash and cache stand in for whatever key derivation and cache client you use.)
hash = calculate_hash(link)
cached_value = cache.get(hash)
if cached_value.nil?
  cached_value = "<a href=\"#{link}\">Text</a>"
  cache.set(hash, cached_value)
end
value = cached_value
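In Rails the same pattern usually hides behind Rails.cache.fetch, which does the get/set round trip for you. A minimal sketch, assuming a Memcached or Redis cache store – the cache key here is just illustrative:

# Rails.cache.fetch reads the key and, on a miss, runs the block and stores the result.
# The round trip to the cache service still costs more than the concatenation it wraps.
value = Rails.cache.fetch(["page-link", link]) do
  "<a href=\"#{link}\">Text</a>"
end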
Context for Rails
Rails makes caching painfully easy. Any server-side generated HTML can be cached and returned to the user.
<% # The call below will render the partial "page" for every page and will cache the result %>
<% # Pretty simple, and yet there is something wrong %>
<%= render partial: "page", collection: @pages, cached: true %>
What’s wrong is that when we open the page in the browser, it takes more than 15 seconds to load.
Here is a profile result from New Relic.
As you can see, there are a lot of Memcached get calls – about 10 – and a lot of set calls. There are also a lot of Postgres find calls. All of this is because of how caching was set up in the platform. The whole “page” partial, after a decent amount of refactoring, turned out to be a simple string concatenation:
<a href="<%= page.path%>"><%= page.title %></a>
That’s it. We were caching the result of a simple string concatenation, which the CPU is quite fast at. Because there were a lot of pages and we were rendering the partial for all of them, opening the page for the first time simply took too long – every single page triggered a get(key) and a set(key, value) – and the request was returning a “Time out”.
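One way out, assuming the partial stays this cheap, is to simply stop caching it; if building the whole list ever becomes expensive, it can be cached under a single key instead, so there is one get/set round trip rather than one per page. A sketch – the cache key in the second option is just an example:

<% # Option 1: just render it - the string concatenation is cheaper than the cache calls %>
<%= render partial: "page", collection: @pages %>

<% # Option 2: cache the whole list under one key, one round trip instead of one per page %>
<% cache ["pages-list", @pages.maximum(:updated_at)] do %>
  <%= render partial: "page", collection: @pages %>
<% end %>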
Conclusion
You should absolutely use caching and cache the values of your calculations, but only if those calculations take more time than asking the cache for a value. Otherwise it is just not useful.
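If in doubt, measure before you cache. A quick sketch of such a measurement, meant to be run in a Rails console – the iteration count and cache key are illustrative:

require "benchmark"

link = "https://www.fllcasts.com/"

# Compare doing the calculation with asking the cache for it.
# For a plain string concatenation, the cache round trip is the slower of the two.
Benchmark.bm(22) do |x|
  x.report("concatenation:") do
    10_000.times { "<a href=\"#{link}\">Text</a>" }
  end
  x.report("Rails.cache.fetch:") do
    10_000.times { Rails.cache.fetch(["link", link]) { "<a href=\"#{link}\">Text</a>" } }
  end
end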