Inject is for Wizards

Indistinguishable from magic.

Read Through Caching of ActiveResource

ActiveResource makes talking to REST APIs easy. The beauty of ActiveResource objects is that they mimic ActiveRecord objects, providing a familiar interface to Rails developers.

The problem is that a REST API is not a database connection. There’s quite a bit going on to create your objects. ActiveResource sends an API request, your API handles it, fetches the records from the database, marshals them into JSON and sends them over the network. ActiveResource then un-marshals the data and finally returns a Ruby object.

If you have a page with many ActiveResource objects being displayed on it, you can kiss any hopes of scalability and performance goodbye.

So it makes sense to limit the number of times you request the same information from an API. Ideally you would take advantage of Cache-Control max-age headers, Last-Modified headers and ETags to limit the number of redundant API requests. ActiveResource doesn’t respect any of these caching strategies, and adding support for them turns out to be a non-trivial task. Believe me, I tried.

Thankfully there is a simple solution: implement a read through cache.

What is a Read Through Cache?

A read through cache is quite simple: it serves data from the cache if the cache has it, and otherwise fetches it from the source.

The algorithm is as follows:

  1. Attempt to read from the cache.
  2. If no results are found, read from the original source.
  3. Write the results to the cache.
  4. Return the results.
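
The four steps above can be sketched in plain Ruby before bringing Rails into it. This is only an illustration: ReadThroughCache and its fetch method are hypothetical names, and a Hash stands in for a real cache store.

```ruby
# A minimal read through cache over a plain Hash.
# ReadThroughCache and fetch are hypothetical names used for illustration.
class ReadThroughCache
  def initialize(store = {})
    @store = store
  end

  # Returns the cached value for key, or runs the block (the "original
  # source"), caches whatever it returns, and returns that.
  def fetch(key)
    return @store[key] if @store.key?(key) # 1. attempt to read from the cache

    result = yield                          # 2. read from the original source
    @store[key] = result                    # 3. write the results to the cache
    result                                  # 4. return the results
  end
end

cache = ReadThroughCache.new
source_calls = 0

2.times do
  cache.fetch("post/1") do
    source_calls += 1
    { :id => 1, :title => "Hello" }
  end
end

puts source_calls # prints 1: the source is only consulted on the first call
```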

An Implementation

In order to add read through caching to ActiveResource, we need to build a module that can hook into the ActiveResource find class methods:

require 'active_support/concern'

module CachedResource
  extend ActiveSupport::Concern

  included do
    class << self
      alias_method_chain :find, :read_through_cache
    end
  end

  module ClassMethods
    def find_with_read_through_cache(*arguments)
      puts "I'm hijacking the find method, but not doing anything."
      find_without_read_through_cache(*arguments)
    end
  end
end

If you include this module in any ActiveResource class and call the find method you will see that find_with_read_through_cache is called and prints a message letting you know about it.
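
A note for anyone on a newer Rails: alias_method_chain was deprecated in Rails 5, and Module#prepend is the usual replacement. The sketch below shows the same hijacking trick with prepend. It is not the code from this post: FakeResource and ReadThroughFind are hypothetical names, and FakeResource is a plain Ruby stand-in so the example runs without ActiveResource.

```ruby
# Stand-in for an ActiveResource class. FakeResource and its find method are
# hypothetical and exist only so this sketch runs without Rails.
class FakeResource
  def self.find(*arguments)
    "found #{arguments.inspect}"
  end
end

# Hypothetical module that wraps find, in the same spirit as
# find_with_read_through_cache.
module ReadThroughFind
  def find(*arguments)
    puts "I'm hijacking the find method, but not doing anything."
    super # calls the original find with the same arguments
  end
end

# find is a class method, so prepend onto the singleton class.
FakeResource.singleton_class.prepend(ReadThroughFind)

puts FakeResource.find(1)
```

The original find is reached with super, so there is no need for the find_without_read_through_cache alias.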

In order to cache an object, we need a suitable cache key: something unique that we can use to store and retrieve the objects. A combination of the ActiveResource class name and the arguments passed to find should do nicely:

def cache_key(*arguments)
  "#{name}/#{arguments.join('/')}".downcase
end
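
To get a feel for the keys this produces, here is a small sketch that can be pasted into irb. The Post class is a hypothetical stand-in for an ActiveResource class; only its name matters here.

```ruby
# Post is a hypothetical stand-in so cache_key can be called outside
# ActiveResource. Inside a class method, name returns the class name.
class Post
  def self.cache_key(*arguments)
    "#{name}/#{arguments.join('/')}".downcase
  end
end

puts Post.cache_key(1)    # post/1
puts Post.cache_key(:all) # post/all
```

Any options hash passed to find is turned into a string by join as well, so two finds with different options end up under different keys.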

Using Rails.cache we can implement the basic read through cache algorithm quite easily. The only quirk is that we must call dup on anything pulled from Rails.cache that we intend to modify later:

def find_with_read_through_cache(*arguments)
  key = cache_key(*arguments)
  # dup anything read from the cache that we intend to modify later.
  result = Rails.cache.read(key).try(:dup)
  unless result
    # Cache miss: go to the API, then store the result for next time.
    result = find_without_read_through_cache(*arguments)
    Rails.cache.write(key, result)
  end
  result
end

If you are familiar with how Rails.cache works, then you will have noticed the flaw in the above code. The cached value will be held onto forever. That is probably not what you want, as the results could change and you have no way to flush the cache.

So we need a way to set the cache expiry. We add a class level cache_for attribute, and a cache_expires_in method that returns its value, or a default of 60 seconds if none is provided. Then we can pass the :expires_in parameter to Rails.cache.write and our simple caching strategy is complete:

require 'active_support/concern'

module CachedResource
  extend ActiveSupport::Concern

  included do
    class << self
      alias_method_chain :find, :read_through_cache
    end

    # How long to cache results for, in seconds. Set per class.
    class_attribute :cache_for
  end

  module ClassMethods
    # Falls back to 60 seconds when cache_for has not been set.
    def cache_expires_in
      self.cache_for || 60
    end

    def find_with_read_through_cache(*arguments)
      key = cache_key(*arguments)
      # dup anything read from the cache that we intend to modify later.
      result = Rails.cache.read(key).try(:dup)
      unless result
        result = find_without_read_through_cache(*arguments)
        Rails.cache.write(key, result, :expires_in => self.cache_expires_in)
      end
      result
    end

    private

    def cache_key(*arguments)
      "#{name}/#{arguments.join('/')}".downcase
    end
  end
end

With this simple module we can now cache ActiveResource objects for as long as we’re comfortable.
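
If you want to see what :expires_in buys you without booting Rails, the sketch below models it in plain Ruby. TinyCache is a hypothetical stand-in for Rails.cache that only understands read and write with :expires_in, and the clock is passed in so that time can be moved forward by hand. It is an assumption-laden toy, not how any real cache store is implemented.

```ruby
# TinyCache is a hypothetical stand-in for Rails.cache. Only read, and write
# with an optional :expires_in (in seconds), are modelled.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  # clock is any object responding to call that returns the current Time.
  def initialize(clock = -> { Time.now })
    @clock = clock
    @entries = {}
  end

  def write(key, value, options = {})
    ttl = options[:expires_in]
    expires_at = ttl ? @clock.call + ttl : nil # nil means "never expires"
    @entries[key] = Entry.new(value, expires_at)
    value
  end

  def read(key)
    entry = @entries[key]
    return nil unless entry

    if entry.expires_at && @clock.call >= entry.expires_at
      @entries.delete(key) # stale, so treat it as a cache miss
      return nil
    end
    entry.value
  end
end

now = Time.at(0)
cache = TinyCache.new(-> { now })
cache.write("post/1", "cached post", :expires_in => 60)

p cache.read("post/1") # "cached post", still fresh
now += 61
p cache.read("post/1") # nil, so the next find would go back to the API
```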

I used this technique on a recent project, combined with a cron job to keep the caches warm, to ensure that pages including upwards of 30 ActiveResource objects were able to load with 100ms response times.

Poor Design

I hate my salt and pepper shakers.

[Image: salt and pepper shakers, side on]

I always use the wrong one. I’ve had them for nearly two years, and I still always reach for the wrong one.

[Image: salt and pepper shakers, top view]

You see, the pepper is in the red shaker, not the black. When I think of pepper, I think of black pepper, therefore I instinctively reach for the black shaker. The visual cues given to me by the colour of the shakers don’t align with the reality of what’s inside them.

I hate my salt and pepper shakers, they make me feel stupid.

Unit Tests Should Not Write to the Database

As a Rails developer, I’ve become accustomed to writing tests that interact with the database. Over the years I’ve transitioned from using fixtures to using factories when generating test data. Tools like Machinist make generating dynamic test data dead simple and repeatable. I think it’s time to stop, and here’s why:

Database access is slow

Reading from and writing to a database is slow when compared to manipulating objects in memory. Most of the time, you don’t actually need an object in the database to unit test it properly.

Object Graphs

Machinist makes generating test data easy, and it makes it even easier to build complex object graphs with one line of code. Hands up if you’ve ever wondered why your tests are grinding to a halt, only to realise one line of code is generating, validating and saving 30 objects you don’t really care about? It’s just too easy to build yourself a mess.

Cleaner, Focused Tests

With no object graph to distract you, you’re left with only one option: write small focused unit tests.

How?

In a side project of mine I decided to run an experiment: could I force myself to write model tests that never touch the database, leaving integration tests to exercise the full system and thus build large object graphs?

I don’t trust myself at the best of times, so I duck punched ActiveRecord to raise a custom exception on any attempt to execute an insert statement.

Making ActiveRecord Raise on Insert Statements (active_record_database_unavailable.rb)

module ActiveRecord
  class DatabaseUnavailable < ActiveRecordError
  end

  module ConnectionAdapters
    class AbstractAdapter
      def exec_insert(sql, name, binds)
        raise DatabaseUnavailable.new('Writing to the database makes for slow unit tests')
      end
    end
  end
end
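
With a patch like this in place, a unit test has to build its objects in memory (Model.new rather than saving them) and exercise their behaviour directly. The sketch below imitates that in plain Ruby so it runs without Rails. Everything in it is hypothetical: Article is not an ActiveRecord model, and its save method simply raises to mimic what the patched adapter would do.

```ruby
# Hypothetical stand-ins: DatabaseUnavailable mirrors the error above, and
# Article imitates just enough of a model to show the idea without Rails.
class DatabaseUnavailable < StandardError; end

class Article
  attr_reader :title

  def initialize(title)
    @title = title
  end

  # Pure in-memory behaviour, which is what a unit test should exercise.
  def valid?
    !title.to_s.strip.empty?
  end

  def slug
    title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
  end

  # Persisting is what the patch forbids in unit tests.
  def save
    raise DatabaseUnavailable, "Writing to the database makes for slow unit tests"
  end
end

article = Article.new("Inject is for Wizards")
puts article.valid? # true
puts article.slug   # inject-is-for-wizards

begin
  article.save
rescue DatabaseUnavailable => e
  puts "refused: #{e.message}"
end
```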

The Result?

Really fast unit tests that test one thing in isolation, like they’re supposed to, coupled with really slow integration tests that exercise the full system stack.

Indistinguishable From Magic

My friend, Marcus Crafter, once told me: inject is for wizards. He was referring to the inject method from Ruby’s Enumerable module.

It reminded me of the following quote:

Any sufficiently advanced technology is indistinguishable from magic.

Arthur C. Clarke, Third Law of Prediction

The inject method, of course, has nothing to do with magic; it just seems that way to the uninitiated. People new to the Ruby language avoid it, looking up the documentation every time they encounter it. Eventually, through repeated exposure, the use of inject becomes second nature.
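
For the uninitiated, here is roughly what the fuss is about. The method names total and word_lengths are made up for this sketch; inject itself is the standard Enumerable method.

```ruby
# inject walks a collection while carrying an accumulator along. Whatever the
# block returns becomes the accumulator for the next element.

# Summing: the accumulator starts at 0.
def total(numbers)
  numbers.inject(0) { |sum, n| sum + n }
end

# Building a Hash of word => length: the accumulator starts as an empty Hash.
def word_lengths(words)
  words.inject({}) do |memo, word|
    memo[word] = word.length
    memo # the block must hand the accumulator back
  end
end

puts total([1, 2, 3, 4]) # 10
p word_lengths(%w[salt pepper])
```

Forgetting to return the accumulator from the block is the classic mistake, and probably half the reason inject feels like wizardry.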

For a software developer, this journey from a complete lack of understanding through to all-powerful wizardry happens on a near-daily basis. Join me on a journey from stupidity to sorcery, or something like that.