Let's continue the EJAPP Top 10 countdown with number 6.

Caching is a funny thing. Done right, it can improve the performance of your Enterprise Java application tremendously and can even be essential to reach acceptable performance levels. Done wrong, the cache itself can be the cause of your performance problems. Improper caching covers both cases.

There are quite a lot of things that are candidates for caching:

  • Local resources that require a lot of initialization such as JNDI resources (EJBs), Spring BeanFactories, AxisEngines, etc.
  • Network resources that are hard to set up such as database connections, HTTP(S) connections, etc.
  • Data that is hard to retrieve or calculate, such as objects retrieved from a database, rendered HTML pages, etc.

For each of these, there are different trade-offs to make when it comes to questions like:

  1. How does the cache check that the resource still works? For example, if a database connection is not used for a while, the database may decide to close it or a firewall may drop the connection. Checking the validity of a resource may be an expensive operation that negates any performance gain that could be had from the cache.
  2. How does the cache check whether the resource has changed? Not implementing this properly may mean that the application has to be restarted after a configuration (or content) change.
  3. How much memory does the cache take?
  4. Is the cache thread safe? Improperly implemented, lock contention can occur in the cache.
  5. What is the cache hit ratio? If the cache hit ratio is too low, the cache management overhead negates any positive impact the cache might have.
  6. How is the performance of the application tested with respect to the cache? Can the cache be pre-loaded with correct data or disabled while testing?
  7. Can the cache be monitored at runtime? Things to monitor include the number of objects in the cache, the amount of memory used, and the cache hit ratio.
  8. Can the cache be managed at runtime? Think about enabling/disabling the cache, flushing its contents or saving/restoring its contents (for testing purposes).
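To make questions 5, 7, and 8 a bit more concrete, here is a minimal sketch of a cache that counts hits and misses and can be disabled or flushed at runtime. The class name MonitoredCache and the loader function are hypothetical, made up for this illustration, and the sketch assumes the loader never returns null. It is not a complete cache: there is no eviction, no memory limit, and no validity check.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

/** Hypothetical cache wrapper that exposes the numbers worth watching at runtime. */
class MonitoredCache<K, V> {
    private final Map<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // assumed backend lookup; must not return null
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();
    private volatile boolean enabled = true;

    MonitoredCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        if (!enabled) {
            return loader.apply(key);      // cache bypassed, e.g. during a performance test
        }
        V value = map.get(key);
        if (value != null) {
            hits.incrementAndGet();
            return value;
        }
        misses.incrementAndGet();
        value = loader.apply(key);         // loaded outside any lock; concurrent misses may load twice
        map.put(key, value);
        return value;
    }

    // Monitoring (question 7)
    int size() {
        return map.size();
    }

    double hitRatio() {
        long h = hits.get();
        long total = h + misses.get();
        return total == 0 ? 0.0 : (double) h / total;
    }

    // Management (question 8)
    void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }

    void flush() {
        map.clear();
    }
}
```

In an Enterprise Java application you would probably expose size(), hitRatio(), setEnabled() and flush() through JMX rather than call them directly, but the idea stays the same.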

A colleague of mine found a nice example of criteria #1, #4, and #5 conspiring to cause bad performance. The check-and-get part of the cache was placed in a critical section like this:

public Object getFromCache(String key) {
        synchronized(map) {
                if(!map.containsKey(key)) {
                        // ... retrieve data from backend ...
                }
                return map.get(key);
        }
}

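However, retrieving data from the backend took 100 ms and the cache hit ratio was less than 10%. Because the lock was held during the backend call, more than 90% of the requests each blocked every other caller for 100 ms, so the cache could serve at most roughly 10 misses per second no matter how many threads were available. The net effect was a lot of contention on the cache lock. The problem went away after this cache was removed altogether!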
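For comparison, here is a sketch of how the same check-and-get could be written without one global lock around the backend call. This is not what my colleague did (the fix was to remove the cache), and with a hit ratio below 10% removing it may well remain the right call. The class name PerKeyLoadingCache and the backend function are hypothetical stand-ins, and the sketch assumes Java 8 or later and a backend that never returns null.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical variant of the cache above that avoids a single global lock. */
class PerKeyLoadingCache {
    private final ConcurrentMap<String, Object> map = new ConcurrentHashMap<>();
    private final Function<String, Object> backend; // stand-in for the slow backend call

    PerKeyLoadingCache(Function<String, Object> backend) {
        this.backend = backend;
    }

    public Object getFromCache(String key) {
        // Cache hits are plain reads and do not wait for loads in progress.
        Object value = map.get(key);
        if (value != null) {
            return value;
        }
        // On a miss, the backend is called at most once per key. Other threads
        // asking for the same key wait for that call; updates to other keys
        // may occasionally wait too if they share a hash bin.
        return map.computeIfAbsent(key, backend);
    }
}
```

The important difference is that a slow load of one key no longer blocks readers of keys that are already cached. It does not make a cache with a poor hit ratio worthwhile; it only removes the lock as the bottleneck.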

Like I said in the beginning, caching is a funny thing. 🙂

At the very least, keep the following in mind when thinking about caching:

  • Cache if and only if caching is needed.
  • Verify the functional and performance behaviour of your cache with performance tests.
  • Make sure your cache can be managed and monitored at runtime.