In a project I was working on recently, we were building up a web page by loading its template, content hierarchy and other information dynamically. This is a common setup, and a very flexible one, since it allows sweeping changes to be made very easily. Unfortunately, this flexibility comes at a cost: pulling everything through EJB calls can give you a speed not unlike that of a slightly stunned tortoise trying to sprint up a treacle-covered hill.
One way of mitigating this problem is to use a Facade pattern to load as much as possible in as few calls as possible. However, this is not always the best approach, especially if most of the content is unlikely to change very often. In these cases, it’s preferable to cache the content at the end where it’s being used, avoiding the remote call completely.
This could be implemented quickly and easily with a Map, but that would leave us with a lot of work to do when it comes to managing the cached content. For example, what if we wanted to make sure that the cache does not fill up indefinitely, or that it gets written to disk when it goes idle for some time, to preserve memory?
Since we were using Hibernate in the data layer, we already had a cache at that level in the form of EHCache. It made sense to see whether it could also be used in our controller layer, and it turned out to be extremely elegant to work with.
EHCache is largely configuration driven, so the actual amount of code needed to use it in your application is minimal. In class SimpleCacheDemo in the example, we only need to initialize the cache and close it when the application is done:
CacheManager manager = CacheManager.create();
manager.shutdown(); // when the application is done
The create method in CacheManager looks up the configuration file ehcache.xml, which should be located in the classpath, and creates a cache instance for each cache declared in the configuration. In our case, we’re declaring a single cache, called demo, which we will be using for our application.
We’re also declaring the required default cache, which is used for any caches which are created programmatically. So far, I have not needed to create caches in this way, but the option is there for anyone who needs it.
The configuration for the demo cache is minimal, and is certainly not suited for production use. It specifies a time to idle of 5 seconds, which means that items in the cache go stale after 5 seconds of inactivity. It will also keep a maximum of 10,000 elements in memory – overkill for the demo.
The cache is not configured to be persistent, so any elements evicted for lack of space are simply dropped. The value of memoryStoreEvictionPolicy tells the cache how to decide which items to drop; in our case it’s LRU, short for Least Recently Used. In other words, when the 10,001st object comes in, the cache drops the entry that has gone the longest without being used.
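Assembling the settings described above, a minimal ehcache.xml along these lines would do the job (the demo application’s actual file may differ slightly):

```xml
<ehcache>
    <!-- Required default cache: used for caches created programmatically -->
    <defaultCache maxElementsInMemory="1000" eternal="false"
                  timeToIdleSeconds="120" overflowToDisk="false"/>

    <!-- The demo cache: items go stale after 5 seconds of inactivity,
         and the least recently used entry is dropped past 10,000 elements -->
    <cache name="demo" maxElementsInMemory="10000" eternal="false"
           timeToIdleSeconds="5" overflowToDisk="false"
           memoryStoreEvictionPolicy="LRU"/>
</ehcache>
```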
Accessing the Cache
Once the caches are created, we can get the instances using the manager’s getCache method, simply providing the name of the cache we need. From there, we can get elements using the cache’s get method, and put elements into the cache using its put method. It’s quite easy to remember.
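As a minimal sketch of that round trip (the class name, the scratch cache name and the key are my own inventions, and the cache is added programmatically from the default cache settings so the snippet does not depend on any particular ehcache.xml):

```java
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class CacheAccessSketch {

    // Put a value under a key, read it back, and return the cached value.
    static Object roundTrip() {
        CacheManager manager = CacheManager.create();
        manager.addCache("scratch"); // cloned from the defaultCache settings
        Cache cache = manager.getCache("scratch");

        cache.put(new Element("greeting", "hello"));
        Element hit = cache.get("greeting");
        Object value = (hit == null) ? null : hit.getObjectValue();

        manager.shutdown();
        return value;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

Values go in and come out wrapped in an Element, which also carries bookkeeping such as the last access time that the idle-based expiry relies on.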
The test method CachingSequenceGeneratorTestCase.shouldQueryCacheWhenRequested demonstrates the workflow.
Suppose you repeatedly need to get the nth term of a Fibonacci sequence (something I’m reliably informed happens a lot if you’re the protagonist of a certain novel, but not so much to other people) very quickly (presumably because you’re being chased by angry monks with guns). While you have a method that calculates the sequence, the developer has very helpfully made that method extremely slow.
Luckily, you can cache the results, so what happens is this:
- You ask for a sequence of the given length.
- The cache checks whether this is available. Since this is the first time, it will return null.
- The sequence is generated, cached, and returned.
Now, if you happen to need the same length again before it expires, what will happen is:
- You ask for a sequence of the given length.
- The cache finds the element.
- The sequence is returned.
Since this case involves no computation, it’s a lot faster. You can run the demo application to get some idea of the difference.
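The two paths above can be sketched roughly as follows; the deliberately naive generator and all the names here are stand-ins of my own for the ones in the demo application:

```java
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class CachingFibonacci {
    private final Cache cache;

    CachingFibonacci(Cache cache) {
        this.cache = cache;
    }

    // Exponential-time generator standing in for the demo's "extremely slow" one.
    static long slowFib(int n) {
        return n < 2 ? n : slowFib(n - 1) + slowFib(n - 2);
    }

    long nthTerm(int n) {
        Element hit = cache.get(n);        // ask the cache for this length
        if (hit != null) {                 // found: return it, no computation
            return (Long) hit.getObjectValue();
        }
        long value = slowFib(n);           // miss: generate...
        cache.put(new Element(n, value));  // ...cache...
        return value;                      // ...and return
    }

    public static void main(String[] args) {
        CacheManager manager = CacheManager.create();
        manager.addCache("fib");           // hypothetical cache name
        CachingFibonacci fib = new CachingFibonacci(manager.getCache("fib"));
        System.out.println(fib.nthTerm(30)); // first call: computed and cached
        System.out.println(fib.nthTerm(30)); // second call: served from the cache
        manager.shutdown();
    }
}
```

The wrapper knows nothing about Fibonacci beyond the generator call, which is the point: the same get–compute–put shape works for any expensive lookup.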
Cleaning up even further
The example application is a very simple demonstration of how to wrap an object to make it use a cache. On the other hand, we could have rewritten the generator to make full use of the cache, populating terms 0 .. n-1 while generating term n – since it needs to compute them anyway, this would add hardly any cost, and would make subsequent accesses much faster.
Then again, our aim here is to demonstrate the use of a cache, not to build a super-fast Fibonacci calculator for the benefit of fictional characters.
One area which I would like to explore further is the use of self-populating caches. Unlike the vanilla cache we saw in this demo, you don’t actually need to put anything into it. When we create a self-populating cache, we give it an object factory; if we ask it for something it doesn’t have, it pulls an instance from the factory instead. This should make the usage even cleaner.
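A rough sketch of what that might look like, assuming EHCache’s SelfPopulatingCache and CacheEntryFactory from the constructs package (the cache name and the factory’s toy lookup are invented for illustration):

```java
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.constructs.blocking.CacheEntryFactory;
import net.sf.ehcache.constructs.blocking.SelfPopulatingCache;

public class SelfPopulatingSketch {

    // Look a key up; the factory fills in any miss, so we never put() anything.
    static Object lookup(Object key) {
        CacheManager manager = CacheManager.create();
        manager.addCache("terms"); // hypothetical cache name
        Cache backing = manager.getCache("terms");

        // The factory is consulted on a miss, so get() never hands back null.
        CacheEntryFactory factory = new CacheEntryFactory() {
            public Object createEntry(Object k) {
                return ((String) k).toUpperCase(); // stand-in for a slow lookup
            }
        };
        SelfPopulatingCache cache = new SelfPopulatingCache(backing, factory);

        Object value = cache.get(key).getObjectValue();
        manager.shutdown();
        return value;
    }

    public static void main(String[] args) {
        System.out.println(lookup("demo"));
    }
}
```

The calling code shrinks to a single get, with the population logic tucked away in the factory.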
The configuration of a distributed cache should also be interesting if I can find a suitably rainy afternoon and enough virtual machines.
Closing off with a note of thanks to Matthew Sant for pointing me in the direction of Mockito and reviewing the demo application.
Is there anything you would implement differently? Do you have any suggestions or tips on how to configure or use EHCache? Have you used any other caching systems which you prefer? Drop me a comment and let me know!