Today I started looking at different approaches and techniques used in scalable web and non-web applications. One technique used in many large systems is simply called a "memory cache": data is cached in memory so it does not have to be queried again.
Caches and memory caches have been around for a long time; even hardware components like hard disks and CD drives have some sort of memory cache.
Why does a memory cache become so important when we talk about web applications? It's simple: web applications have to fulfill thousands of requests simultaneously, or at least they sometimes have to. Obviously, keeping data in memory and reusing it the next time you need it will improve performance. And it looks very simple: at first sight a simple hashmap would do the job, unless...
There are a few facts we need to consider in the real world:
The caching logic itself is very simple: if the data can be retrieved from the cache, retrieve it from the cache; if not, retrieve it from the database and put it in the cache:
if (data is in cache)
    retrieve data from cache
else
    retrieve data from database
    add data to cache
use data
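Here is a minimal sketch of that flow in Java. The Database interface and its load method are placeholders I made up for the example; in a real application they would be your DAO or query layer.

import java.util.HashMap;
import java.util.Map;

interface Database {
    Object load(String key);   // placeholder for the real query layer
}

public class SimpleCache {
    // Plain in-memory cache; fine for a single thread, not yet for a web app.
    private final Map<String, Object> cache = new HashMap<>();
    private final Database database;

    public SimpleCache(Database database) {
        this.database = database;
    }

    public Object get(String key) {
        Object data = cache.get(key);
        if (data == null) {
            // Cache miss: load from the database and keep it for next time.
            data = database.load(key);
            cache.put(key, data);
        }
        return data;
    }
}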
Even if we have a simple application and use a simple hashmap as a memory cache, we still have to address this issue: when we update the database, the cache should be updated with the new data, or the old data should be removed from the cache.
Each time we update something in the database we have to remove the updated entities from the cache. We should take special care here, because our changes might affect other entities kept in the cache:
update database
invalidate saved and affected entities from cache
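Continuing the sketch from above, the write path could look something like this. The save and affectedKeys calls are also made up for the example: the Database placeholder would need to expose them, and how you find the affected entities depends entirely on your data model.

// Added to the SimpleCache class above.
public void update(String key, Object newValue) {
    database.save(key, newValue);       // 1. update the database
    cache.remove(key);                  // 2. invalidate the saved entity
    // 3. invalidate the entities the change affects; affectedKeys is a
    //    placeholder for whatever dependency tracking your model provides.
    for (String affected : database.affectedKeys(key)) {
        cache.remove(affected);
    }
}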
As I said, web applications are special: each HTTP request is served by its own thread. To make sure everything works correctly, we must ensure that concurrent access to the cache is safe, so simultaneous reads and writes do not corrupt the cached data.
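One common way to deal with this in Java is to build the cache on java.util.concurrent. A sketch, reusing the made-up Database interface from above:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ThreadSafeCache {
    private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();
    private final Database database;   // same placeholder interface as above

    public ThreadSafeCache(Database database) {
        this.database = database;
    }

    public Object get(String key) {
        // computeIfAbsent is atomic per key: if several request threads ask for
        // the same missing key, only one of them loads it from the database.
        return cache.computeIfAbsent(key, database::load);
    }

    public void invalidate(String key) {
        cache.remove(key);
    }
}

Using ConcurrentHashMap also avoids the coarse locking you would get by wrapping a plain hashmap in synchronized blocks, which matters when many request threads hit the cache at once.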
The applications where a memory cache is required are applications with many users, and they usually run in distributed environments. There are a few options here: