AridCache supports Rails 3!
AridCache is a caching framework written in Ruby which makes caching easy and effective. AridCache supports Rails 2 and 3 and provides caching on any ActiveRecord class or instance method right out of the box. AridCache keeps caching logic in your cache configuration rather than dispersed throughout your code, and makes your code easy to manage by making cached calls explicit.
AridCache supports caching large, expensive ActiveRecord collections by caching only the model IDs, provides efficient in-memory pagination of your cached collections, and gives you collection counts for free. Non-ActiveRecord collection data is cached unchanged allowing you to cache the results of anything simply by prepending your method call with cached_
.
AridCache simplifies caching by supporting auto-expiring cache keys - as well as common options like :expires_in
- and provides methods to help you manage your caches. AridCache now supports counts, limits, pagination and in-memory sorting of any cached Enumerable, as well as Proxies, a powerful new feature which gives you total control over your cached data.
-
v1.5.2: In-memory bug fix: omit records which cannot be found
-
v1.5.1: Support in-memory sorting for ActiveRecord sets; Set AridCache.order_in_memory. Also add tentative support for sorting ActiveRecord collections using procs.
-
v1.5.0: Full Rails 3.1 support
-
v1.4.8: Add :pass_options option which causes all options to be passed into the method call
-
v1.4.7: Support generating cache keys by manually specifying the id, so the record does not need to be loaded.
-
v1.4.6: Tested with will_paginate prerelease and official 3.0 releases
-
v1.4.5: Return empty array if offset is greater than the array size. Raise exception if you try to apply limit or offset to a Hash.
-
v1.4.4: Fix empty relations return nil instead of an empty array
-
v1.4.2: Add
:proxy_out
and:proxy_in
options;AridCache::Proxies::IdProxy
. Support proxies asProcs
. -
v1.4.1: Default
:page
to1
if it isnil
-
v1.4.0: Rails 3 fully supported!
-
v1.3.5: Backwards-compatibility fixes
-
v1.3.4: Inherited cache configurations: Cache options and cache blocks are inherited from superclasses
-
v1.3.2:
AridCache.raw_with_options
configuration for better:raw
handling on cached ActiveRecord collections -
v1.3.1: Proxy support which allow you to control how your objects get serialized and unserialized
-
v1.3.0: Support limits, ordering and pagination on cached Enumerables
-
v1.2.0: Fix Rails 3 ActiveRecord hooks & remove some Rails dependencies…almost Rails agnostic!
-
v1.0.5: Support
:raw
and:clear
options.
Rails 3:
Add the gem to your ‘Gemfile`
gem 'arid_cache'
Then
bundle install
Rails 2:
Add the gem to your config/environment.rb
file:
config.gem 'arid_cache'
Then
rake gems:install
-
Include the AridCache module in any Class
-
Rails 2 & 3 supported with automatic ActiveRecord::Base integration.
-
When using ActiveRecord 3, returns lazy-loaded result sets.
-
Auto-generated, namespaced cache keys
-
auto-expiring cache keys - your cache expires when the updated_at timestamp changes
-
Supports limits, ordering & pagination of cached Enumerables and ActiveRecord collections
-
Define caches and their options on your class using
instance_caches
andclass_caches
-
Counts for free - if you have already cached the result, you get the count for free
-
Smart counts - if you only ask for the count it will only calculate the count; useful when the result is an Association Reflection or a named scope.
-
Supports eager-loading and other options to
ActiveRecord::Base#find
like:conditions
,:include
,:joins
,:select
,:readonly
,:group
,:having
,:from
-
Provides methods to clear caches individually, at the instance-level, class-level and globally
-
Preserves ordering of your cached ActiveRecord collections
-
Optimized to make as few cache and database accesses as absolutely neccessary
-
Define your own cache proxy to serialize your objects as they go to and from the cache
-
Inherited cache configurations - subclasses inherit cache options and cache blocks from superclasses.
The name AridCache comes from ActiveRecord ID Cache. It’s also very DRY…get it? :)
Out of the box AridCache supports caching on all your ActiveRecord class and instance methods and named scopes. If a class or class instance respond_to?
something, you can cache it.
AridCache supports limits, pagination and ordering options on cached ActiveRecord collections and any other Enumerable. Options to apply limits are :limit
and :offset
. Options for pagination are :page
and :per_page
and the :order
option accepts the same values as ActiveRecord::Base#find. If the cached value is an ActiveRecord collection most other options to ActiveRecord::Base#find are supported too. If the cached value is an Enumerable (e.g. an Array) the value for :order must be a Proc
. The Proc is passed to Enumerable#sort to do the sorting. Unless the Enumerable is a list of Hashes in which case it can be a String or Symbol giving the hash key to sort by.
The way you interact with the cache via your model methods is to prepend the method call with cached_
. The part of the method call after cached_
serves as the basis for the cache key. For example,
User.cached_count # cache key is arid-cache-user-count genre.cached_top_ten_tracks # cache key is arid-cache-genres/<id>-top_ten_tracks
You can also define caches that use compositions of methods or named scopes, or other complex queries, without having to add a new method to your class. This way you can also create different caches that all use the same method. For example,
class User named_scope :active, :conditions => { :active => true } class_caches do most_active_users(:limit => 5) do active.find(:order => 'activity DESC') end end end
This defines a cache most_active_users
on the User class which we can call with:
User.cached_most_active_users >> [#<User>, #<User>]
This will return up to five users, but there may be many more in the cache, because we didn’t apply a limit in our call to active.find()
. We can also pass options to the call to override the stored options:
User.cached_most_active_users(:limit => 1) >> [#<User>]
If the result of your cached_
call is an array of ActiveRecords, AridCache only stores the IDs in the cache (because it’s a bad idea to store records in the cache).
On subsequent calls we call find_all_by_id
on the target class passing in the ActiveRecord IDs that were stored in the cache. AridCache will preserve the original ordering of your collection (you can change this using the :order
option).
The idea here is to cache collections that are expensive to query. Once the cache is loaded, retrieving the cached records from the database simply involves a SELECT * FROM table WHERE id IN (ids, ...)
.
Consider how long it would take to get the top 10 favorited tracks of all time from a database with a million tracks and 100,000 users. Now compare that to selecting 10 tracks by ID from the track table. The performance gain is huge.
Cached enumerables support find-like options such as :limit
, :offset
and order
as well as pagination options :page
and :per_page
. :order
must be a Proc
. It is passed to Enumerable#sort to do the sorting. Unless the enumerable contains hashes in which case :order
can be a string or symbol hash key to order by.
Anything that is not an array of ActiveRecords is cached as-is. Numbers, arrays, hashes, nils, whatever.
An example of caching using existing methods on your class:
class User < ActiveRecord::Base has_many :pets has_one :preferences named_scope :active, :conditions => [ 'updated_at <= ', 5.minutes.ago ] end User.cached_count # uses the built-in count method User.cached_active # only stores the IDs of the active users in the cache User.cached_active_count # returns the count of active users directly from the cache user.cached_pets_count # only selects the count until the collection is requested user.cached_pets # loads the collection and stores the pets IDs in the cache
To dynamically define caches just pass a block to your cached_
calls. Caches can be defined on your classes or class instances. For example,
User.cached_most_active_users do active.find(:order => 'activity DESC', :limit => 5) end => [#<User id: 23>, #<User id: 30>, #<User id: 5>, #<User id: 2>, #<User id: 101>] user.cached_favorite_pets do pets.find(:all, :conditions => { 'favorite' => true }) end => [#<Pet id: 11>, #<Pet id: 21>, #<Pet id: 3>]
We can clean up our views significantly by configuring caches on our model rather than defining them dynamically and passing options in each time. You configure caches by calling instance_caches(options={})
or class_caches(options={})
with a block and defining your caches inside the block (you don’t need to prepend cached_
when defining these caches because we are not returning results, just storing options).
You can pass a hash of options to instance_caches
and class_caches
to have those options applied to all caches in the block. The following is a more complex example that also demonstrates nested cached calls.
# app/models/genre.rb class Genre class_caches do most_popular do popular(:limit => 10, :order => 'popularity DESC') end end instance_caches(:order => 'release_date DESC') do highlight_tracks(:include => [:album, :artist]) do cached_tracks(:limit => 10, :include => [:album, :artist]) end highlight_artists(:order => nil) do # override the global :order option cached_artists(:limit => 10) end highlight_albums(:include => :artist) do cached_albums(:limit => 3, :include => :artist) end end end # app/controllers/genre_controller.rb @most_popular = Genre.cached_most_popular @tracks = @genre.cached_highlight_tracks @artists = @genre.cached_highlight_artists @albums = @genre.cached_highlight_albums
You can configure your caches in this manner wherever you want, but I think the model is a good place. If you wanted to move all your cache configurations to a file in lib
or elsewhere, your calls would look like,
Genre.class_caches do ... end Genre.instance_caches do ... end
AridCache cache keys are defined based on the methods you call to interact with the cache. For example:
Album.cached_featured_albums => cache key is arid-cache-album-featured_albums album.cached_top_tracks => cache key is arid-cache-albums/<id>-top_tracks
Caches on model instances can be set to automatically incorporate the ActiveRecord cache_key
which includes the updated_at
timestamp of that instance, making them auto-expire when the instance is updated.
To incorporate the the cache_key
pass :auto_expire => true
to your cache method:
album.cached_top_tracks(:auto_expire => true) => cache key like arid-cache-albums/2-20091211120100-top_tracks
Or via the cache configuration:
Album.instance_caches do top_tracks(:auto_expire => true) end
If you need to examine values in the cache yourself you can build the AridCache key by calling arid_cache_key('method')
on your object, whether it is a class or instance. Using the examples above we would call,
Album.arid_cache_key('featured_albums') => arid-cache-album-featured_albums album.arid_cache_key('top_tracks') => arid-cache-albums/2-top_tracks album.arid_cache_key('top_tracks', :auto_expire => true) => arid-cache-albums/2-20091211120100-top_tracks
AridCache provides methods to help you clear your caches:
AridCache.clear_caches => expires all AridCache caches Model.clear_caches => expires class and instance-level caches for this model Model.clear_instance_caches => expires instance-level caches for this model Model.clear_class_caches => expires class-level caches for this model
The Model.clear_caches
methods are also available on all model instances.
Your cache store needs to support the delete_matched
method for the above to work. Currently MemCacheStore and MemoryStore do not.
Alternatively you can pass a :force => true
option in your cached_
calls to force a refresh of a particular cache, while still returning the refreshed results. For example:
Album.cached_featured_albums(:force => true) => returns featured albums album.cached_top_tracks(:force => true) => returns top tracks
If you just want to clear a cache without forcing a refresh pass :clear => true
. The cached value will be deleted with no unnecessary queries or cache reads being performed. It is safe to pass this option even if there is nothing in the cache yet. The method returns the result of calling delete
on your cache object. For example:
Album.cached_featured_albums(:clear => true) => returns false Rails.cache.read(Album.arid_cache_key(:featured_albums)) => returns nil
You can pass an :expires_in
option to your caches to manage your cache expiry (if your cache store supports this option, which most do).
Album.cached_featured_albums(:expires_in => 1.day) album.cached_top_tracks(:expires_in => 1.day)
Or via the cache configuration,
Album.instance_caches(:expires_in => 1.day) do top_tracks featured_albums end
AridCache gives you counts for free. When a collection is stored in the cache AridCache stores the count as well so the next time you request the count it just takes a single read from the cache.
To get the count just append _count
to your cached_
call. For example, if we have a cache like album.cached_tracks
we can get the count by calling,
album.cached_tracks => returns an array of tracks album.cached_tracks_count => returns the count with a single read from the cache
This is also supported for your non-ActiveRecord collections if the collection responds_to?(:count)
. For example,
album.cached_similar_genres => returns ['Pop', 'Rock', 'Rockabilly'] album.cached_similar_genres_count => returns 3
Sometimes you may want the collection count without loading and caching the collection itself. AridCache is smart enough that if you only ask for a count it will only query for the count. This is only possible if the return value of your method is a named scope or association proxy (since these are lazy-loaded unlike a call to find()
).
In the example above if we only ever call album.cached_tracks_count
, only the count will be cached. If we subsequently call album.cached_tracks
the collection will be loaded and the IDs cached as per normal.
Other methods for caching counts are provided for us by virtue of ActiveRecord’s built-in methods and named scopes, for example,
Artist.cached_count # takes advantage of the built-in method Artist.count
AridCache supports pagination using WillPaginate. If you are not changing the order of the cached collection the IDs are paginated in memory and only that page is selected from the database - directly from the target table, which is extremely fast.
An advantage of using AridCache is that since we already have the size of the collection in the cache no query is required to set the :total_entries
on the WillPaginate::Collection
.
To paginate just pass a :page
option in your call to cached_
. If you don’t pass a value for :per_page
AridCache gets the value from Model.per_page
, which is what WillPaginate
uses.
The supported pagination options are:
:page, :per_page, :total_entries, :finder
Some examples of pagination:
User.cached_active(:page => 1, :per_page => 30) User.cached_active(:page => 2) # uses User.per_page user.cached_pets(:page => 1) # uses Pet.per_page
If you want to paginate using a different ordering, pass an :order
option. Because the order is being changed AridCache cannot paginate in memory. Instead, the cached IDs are passed to your Model.paginate
method along with any other options and the database will order the collection, apply limits and offsets, etc. Because the number of records the database deals with is limited, this is still much, much faster than ordering over the whole table.
For example, the following queries will work:
user.cached_companies(:page => 1, :per_page => 3, :order => 'name DESC') user.cached_companies(:page => 1, :per_page => 3, :order => 'name ASC')
By specifying an :order
option in our cached call we can get different “views” of the cached collection. I think this a “good thing”. However, you need to be aware that in order to guarantee that the ordering you requested is the same as the order of the initial results (when the cache was primed), we have to order in the database. This results in two queries being executed the first time you query the cache (one to prime it and the other to order and return the results). If no order option is specified, we can skip the second query and do everything in memory.
If you have an expensive cache and don’t want that extra query, just define a new cache with your desired ordering and use that. Make sure that the order of the initial results matches your desired ordering. Building on the example above we could do:
User.instance_caches do companies_asc do companies(:order => 'name ASC') end companies_desc do companies(:order => 'name DESC') end end user.cached_companies_asc(:page => 1, :per_page => 3) user.cached_companies_desc(:page => 1, :per_page => 3)
You apply :limit
and :offset
options in a similar manner to the :page
and :per_page
options. The limit and offset will be applied in memory and only the resulting subset selected from the target table - unless you specify a new order.
user.cached_pets(:limit => 2, :include => :toys) user.cached_pets(:limit => 2, :offset => 3, :include => :toys) genre.cached_top_ten_tracks { cached_tracks(:limit => 10, :order => 'popularity DESC') }
The supported options to find
are:
:conditions, :include, :joins, :limit, :offset, :order, :select, :readonly, :group, :having, :from, :lock
You can pass options like :include
(or any other valid find
options) to augment the results of your cached query. Just because all of the options are supported, does not mean it’s a good idea to use them, though. Take a look at your logs to see how AridCache is interacting with the cache and the database if you don’t get the results you expect.
For example, we could call:
User.cached_active(:page => 2, :per_page => 10, :include => :preferences)
To return page two of the active users, with the preferences
association eager-loaded for all the users.
When you cache a collection of ActiveRecords, the records IDs are stored in the cache in a AridCache::CacheProxy::CachedResult
which is a Struct
with methods to return the ids
, count
and klass
of the cached records.
Sometimes you may want to access what is in the cache without instantiating any records. :raw => true
is what you use for that.
The current (and deprecated) behaviour of :raw => true
is to return the CachedResult
object and ignore all other options - if the cached result is an ActiveRecord collection. Passing :raw => true
when you have cached an Enumerable or some other base type does not share this behaviour. For these types :raw => true
returns whatever is in the cache, after applying all options. This means that we can apply limits, ordering and pagination to our cached results and return the same type of data as is stored in the cache.
v1.3.2 introduces a new configuration option which makes the behaviour on cached ActiveRecord collections the same as for other cached types. Set
AridCache.raw_with_options = true
to enable the new behaviour. With this option set, rather than return a CachedResult
:raw => true will return the list of IDs itself, after applying all options. So you can limit, order (with a Proc) and paginate the IDs before they are returned. Keep in mind the <tt>:order Proc is applied <b>to the cached/raw data
.
Note that passing the :raw
option to your cache store is not supported, because the AridCache option shares the same name. If you really want to get the marshalled result from your cache you can do a cache read manually.
The current (deprecated) behaviour:
user = User.first user.cached_favorite_tracks => returns [#<Track:1>, #<Track:2>] user.cached_favorite_tracks(:raw => true) => returns { :klass => "Track", # stored as a string :count => 1, :ids => [1, 2] } user.cached_favorite_tracks(:raw => true).ids => returns [1, 2]
The cache will be primed if it is empty, so you can be sure that it will always return a AridCache::CacheProxy::CachedResult
.
In some circumstances - like when you are querying on a named scope - if you have only requested a count, only the count is computed, which means the ids array is nil
. When you call your cached method passing in :raw => true
AridCache detects that the ids array has not yet been set, so in this case it will perform a query to seed the ids array before returning the result. This can be seen in the following example:
class User named_scope :guests, :conditions => { :account_type => ['guest'] } end User.cached_guests_count => returns 4 Rails.cache.read(User.arid_cache_key(:guests)) => returns { :klass => "User", :count => 4, :ids => nil # notice the ids array is nil in the cache } User.cached_guests(:raw => true) => returns { :klass => "User", :count => 4, :ids => [2, 235, 236, 237] # the ids array is seeded before returning }
With AridCache.raw_with_options = true
:
user = User.first user.cached_favorite_tracks >> [#<Track:1>, #<Track:2>] user.cached_favorite_tracks(:raw => true) >> [1, 2] user.cached_favorite_tracks(:order => Proc.new { |a, b| b <=> a }, :raw => true) >> [2, 1] user.cached_favorite_tracks(:offset => 1, :raw => true) >> [2]
Proxies allow you to do anything you want to your objects as they go into - and come out of - the cache. They are most useful for serializing your objects, for example if you want to store JSON, or hashes.
The :proxy
option should be a Symbol giving the name of a class method. The method is passed the result of your cache block (or method) as the first parameter, and in future may also pass a Hash of options. The method is called on the class that calls cached_
.
A simple example of a proxy which stores hashes of ActiveRecord attributes:
class User has_many :companies instance_caches do companies(:proxy => :serializing_proxy) end def self.serializing_proxy(records, *args) return records if records.empty? records.first.is_a?(ActiveRecord::Base) ? records.collect(&:attributes) : records.collect { |r| Company.find_by_id(r['id']) } end end
Now when we first call User.first.cached_companies<tt> the cache is empty, so the cache method (<tt>companies
) is called and the result is passed to User.serializing_proxy
which converts the records to Hashes. The result of the proxy is stored in the cache. The next time we call User.first.cached_companies<tt> the cached result is retrieved and passed to <tt>User.serializing_proxy
which converts the hashes to records, so we get ActiveRecords back.
@user = User.first @user.cached_companies >> [#<Company id: 1>, #<Company id: 2>] @user.cached_companies(:raw => true) >> [{ 'id' => 1 }, { 'id' => 2 }]
The :raw
option bypasses the proxy on the way out of the cache because we are asking for a raw result. The :raw
option works a bit differently from the default behaviour (which is to return a CachedResult
object if the cached result is an ActiveRecord collection) because you can use options to do limiting, pagination and in-memory ordering with :raw
to get different views of your cached enumerable.
Proxies still support limits, pagination and ordering, which is applied to the cached result. For example:
Limits:
@user.cached_companies(:limit => 2, :offset => 1) >> [#<Company id: 2>]
Pagination returns a WillPaginate::Collection
:
result = @user.cached_companies(:page => 2, :per_page => 1) >> [#<Company id: 2>] result.total_entries >> 1 result.current_page >> 2
Order by:
@user.cached_companies(:order => Proc.new { |a, b| b['id'] <=> a['id'] }) >> [#<Company id: 2>, #<Company id: 1>]
The above Proc orders by id reversed. We can also combine all the options. The order by will be applied first, then the limits and finally the pagination.
All of the above calls work with raw => true
. With that option the results would be same but instead of returning records, it returns hashes.
In practice you probably want to define your proxy method on ActiveRecord::Base
so that it is available to all your models. In future AridCache will probably have some built-in proxies that you can make use of.
Subclasses automatically inherit cache definitions from their superclasses. The options for a particular cache are merged, with options from subclasses overriding the superclass. If a superclass defines a block for that cache, the one from the closest ancestor is used.
Here is a simple example:
class Abc include AridCache instance_caches do name(:limit => 2) { 'abc' } actual_name(:limit => 2) end def actual_name 'abc' end end class Def < Abc instance_caches do name(:limit => 1) actual_name(:limit => 1) end def actual_name 'def' end end > a, d = Abc.new, Def.new > a.cached_name "ab" > a.cached_actual_name "ab" > d.cached_name "a"
So here cached_name
is using its own :limit => 1
with the block from Abc
. So it’s returning the first character of ‘abc’.
> d.cached_actual_name "d"
Here cached_actual_name
is using its own :limit => 1
but since no ancestor class defines a block on the cache, it uses the method from the instance. So it’s returning the first character of ‘def’.
-
AridCache intercepts calls to
cached_
methods usingmethod_missing
then defines those methods on your models as they are called, so they bypass method missing on subsequent calls. -
In-memory pagination of cached collections speeds up your queries. See Pagination.
-
If you only request a count AridCache will only select the count. See Cached Counts.
-
If a collection has already been loaded, you get the count for free. See Cached Counts.
Ruby: 1.8.6, 1.8.7, REE 1.8.7 and 1.9.1, 1.9.2, 1.9.3 Rails: 2.3.8, 2.3.11, 3.0.0, Rails 3.0.5, 3.0.6 - 3.1.2 WillPaginate: 2.3.15 (Rails 2), 3.0.pre2 (Rails 3), 3.0
For Ruby < 1.8.7 you probably want to include the following to extend the Array class with a count
method. Otherwise your cached_<key>_count
calls probably won’t work:
Array.class_eval { alias count size }
-
Caches that contains duplicate records will only return unique records on subsequent calls. This is because of the way
find
works when selecting multiple ids. For example, if your query returns[#<User id: 1>, #<User id: 1>, #<User id: 1>]
, the IDs are cached as[1,1,1]
. On the next call to the cache we load the IDs usingUser.find_all_by_id([1,1,1])
which returns[#<User id: 1>]
, not[#<User id: 1>, #<User id: 1>, #<User id: 1>]
as you might have expected. -
You can’t cache polymorphic arrays e.g. [#<User id: 1>, #<Pet id: 5>] because it expects all ActiveRecords to be of the same class. If you need polymorphism consider using a proxy and do the record-loading yourself.
-
Rails ActiveRecord (or SQL) has a quirk where if you pass an
offset
without alimit
theoffset
is ignored. AridCache fixes this unexpected behaviour by including alimit
on its queries. -
When you use the
:order
option with a cached ActiveRecord collection we have to go to the database to order. So we also do pagination and limiting there because we want to retrieve as few records as possible. So we use WillPaginate’s ActiveRecord::Base#paginate method. Unfortunately if you pass:limit
or:offset
in addition to your pagination options, the limit and offset are ignored. This is not the behaviour when interacting with other cached types. In that case we apply the order, then the limits and finally the pagination, which is what you would expect. -
There is a bug in ActiveRecord 3’s AREL where calling
count
orsize
on an AREL query will cause a </tt>COUNT(*)<tt> to be executed but without applying aLIMIT
. Callinglength
works as expected, and returns the number of records that matched the query. So we aliascount
andsize
to thelength
on the AREL object that is returned.
Contributions are welcome! Please,
-
Fork the project.
-
Make your feature addition or bug fix.
-
Add tests for it (this is important so I don’t break it in a future release).
-
Commit (don’t mess with the Rakefile, version, or history).
-
Send me a pull request.
Copyright © 2009 Karl Varga. See LICENSE for details.