Distributed Caching for Airavata Gateway

Problem Statement

We need a way to provide a single logical view (and state) for session and security management across the gateway's web applications, using caching.

Possible Solutions

To maintain a single state for application data and web session data, the preferred approach is to use a distributed cache system. A distributed cache extends the traditional concept of a cache on a single machine so that it spans multiple servers.

A distributed cache system should be able to -

Two popular distributed cache systems are heavily used in production environments:

Feature | Options
Distributed Cache | Memcached, Redis

Solution Evaluations

Memcached

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from the results of database calls, API calls, or page rendering. It is a free and open source, high-performance, distributed memory object caching system that is generic in nature but intended for speeding up dynamic web applications by alleviating database load.

Memcached runs as a standalone server, and the web application has to know in advance how many Memcached servers are installed. If there are 10 Memcached servers, the web application is configured to use these 10 servers as hash buckets for storing <key, value> pairs.
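The hash-bucket idea can be illustrated with a minimal sketch. It is not a real Memcached client: plain dictionaries stand in for the servers, and the class name `ShardedCacheClient`, the host names, and the MD5-modulo scheme are all assumptions made for illustration. Real client libraries often use consistent hashing instead, so that adding or removing a server remaps fewer keys.

```python
import hashlib


class ShardedCacheClient:
    """Toy client that spreads keys over a fixed list of cache servers."""

    def __init__(self, servers):
        # The application is configured with the full server list up front.
        self.servers = list(servers)
        # Hypothetical stand-in: one dict per server instead of a network connection.
        self._stores = {server: {} for server in self.servers}

    def server_for(self, key):
        # A stable hash lets every web application instance pick the same bucket.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.servers[int(digest, 16) % len(self.servers)]

    def set(self, key, value):
        self._stores[self.server_for(key)][key] = value

    def get(self, key):
        return self._stores[self.server_for(key)].get(key)


# Ten hypothetical servers used as hash buckets.
servers = [f"cache{i}.example.org:11211" for i in range(10)]
client = ShardedCacheClient(servers)
client.set("session:42", "alice")
```

Because the bucket is derived only from the key and the server list, two application instances configured with the same list agree on where `session:42` lives without talking to each other.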

Figure: memcached (image from source)

Memcached Pros


Figure: repcached (image from source)

Memcached Cons

Redis

Redis is an open source, in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes with radius queries. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.

Redis has a master-slave architecture. Slaves are replicas of their master. All write requests are directed to the master server, and all data from the master is replicated to the slave nodes. A separate process, known as Sentinel, manages the automatic election of a new master when the primary master fails. Read requests can be distributed across the replica nodes of the Redis deployment.
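The routing described above can be sketched with a toy model. It does not use a Redis client: dictionaries stand in for the nodes, and the class `ReplicatedCache`, the node names, and the "promote the first replica" rule are assumptions made for illustration. In real Redis, replication is asynchronous and Sentinel promotes a replica only after the Sentinel processes agree that the master is down.

```python
import itertools


class ReplicatedCache:
    """Toy model: writes go to the master, reads rotate over the replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)
        # Hypothetical stand-in: one dict per node instead of a Redis server.
        self.data = {name: {} for name in [master, *self.replicas]}
        self._readers = itertools.cycle(self.replicas)

    def write(self, key, value):
        # All writes are applied on the master, then copied to each replica
        # (synchronously here, to keep the sketch simple).
        self.data[self.master][key] = value
        for replica in self.replicas:
            self.data[replica][key] = value
        return self.master

    def read(self, key):
        # Round-robin over replicas; fall back to the master if none are left.
        node = next(self._readers, self.master)
        return node, self.data[node].get(key)

    def fail_over(self):
        # Sentinel-like step: drop the failed master and promote a replica.
        failed = self.master
        self.master = self.replicas.pop(0)
        del self.data[failed]
        self._readers = itertools.cycle(self.replicas)
        return self.master


cache = ReplicatedCache("master", ["replica-1", "replica-2"])
cache.write("session:42", "alice")
```

After `cache.fail_over()`, later writes land on the promoted node and reads continue from the remaining replica, which is the behaviour the web application relies on for high availability.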

Figure: redis (image from source)

Redis Pros

Redis Cons

Conclusion

We tried both Memcached and Redis, and found Redis to be the better choice for large-scale, high-traffic web applications. Redis offers a wide selection of data structures for storing data, which improves efficiency in scenarios where objects would otherwise have to be wrapped and unwrapped on every cache access. It also provides built-in support for high availability, failover, and persistence. There are other side benefits as well: Redis allows keys and values to be as large as 512 MB each, whereas Memcached limits keys to 250 bytes and, by default, values to 1 MB. On the other hand, if the web application is meant to be simple and needs a quick implementation, Memcached (with Repcached) is a reasonable choice, and it works best where infrastructure cost and time investment are critical.
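The "wrap and unwrap" point can be made concrete with a small sketch. It uses two in-memory dictionaries rather than real servers: `blob_store` stands in for a plain key-value cache such as Memcached, and `hash_store` stands in for Redis hashes (the `HSET` and `HGET` commands). The helper names and the session fields are hypothetical.

```python
import json

blob_store = {}  # stand-in for a plain key-value cache holding serialized objects
hash_store = {}  # stand-in for Redis hashes, addressed by key and field


def blob_update_field(key, field, value):
    # The whole object is fetched and unwrapped, changed, then wrapped again.
    obj = json.loads(blob_store.get(key, "{}"))
    obj[field] = value
    blob_store[key] = json.dumps(obj)


def hash_set_field(key, field, value):
    # Only the one field is touched, as HSET would do on the server side.
    hash_store.setdefault(key, {})[field] = value


def hash_get_field(key, field):
    # Reads a single field, as HGET would, without loading the whole object.
    return hash_store.get(key, {}).get(field)


blob_update_field("session:42", "user", "alice")
blob_update_field("session:42", "role", "admin")

hash_set_field("session:42", "user", "alice")
hash_set_field("session:42", "role", "admin")
```

Both stores end up holding the same session data, but the blob variant serializes and deserializes the full object on every update, while the hash variant reads and writes individual fields.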

Associated GitHub issue(s)

Associated Discussion(s)

[#Spring17-Airavata-Courses] : Distributed Caching for Airavata Gateway

References Used for Experiment(s)