How to handle sessions in an active-active HA SaaS solution spanning multiple datacenters?
Hi, this is my first post on Laracasts, so bear with me if this is not the right place to ask for help like this.
I am trying to come up with a scalable HA solution for a SaaS that I am developing.
Currently the service is hosted within a single datacenter, and we would like to have a redundant fail-over solution ready in case of a disaster.
When it comes to session management, I see a few potential solutions:
- Run Redis on each application server and use sticky sessions.
  - Simple setup.
  - Best performance in terms of response time.
  - However, users are logged out if an application server fails.
- Run Redis on a separate server.
  - Introduces another single point of failure, unless scaled to a Redis cluster.
  - Application servers can fail without requiring a new login for our users.
- Make a Redis cluster that spans both datacenters.
  - Performance declines as the network delay increases.
  - Sessions are live in both datacenters.
  - Load balancing at DNS level could be distributed round robin, instead of manually handling specific sub-domains.
- Use the database as session driver.
  - Slower access than in-memory Redis.
  - Sessions are alive across both datacenters.
- Use cookies as session driver.
  - I do not have any experience with the performance of this solution, but I guess it would impact network traffic and CPU usage, since the session cookie needs to be transferred and decrypted on each request. Hence I wonder if this is a viable solution in a production environment.
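To make the cookie option a bit more concrete, here is a minimal Python sketch of the work a cookie-based session implies on every request: serialize the session, protect it, and verify it again when it comes back. This is not how Laravel implements it; as far as I understand, the Laravel cookie driver encrypts the payload, whereas this sketch only signs it with an HMAC as a stand-in. `SECRET_KEY`, `encode_session` and `decode_session` are hypothetical names used for illustration.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical secret; a real application would load this from configuration.
SECRET_KEY = b"change-me"


def encode_session(data: dict) -> str:
    """Serialize the session and append an HMAC so tampering can be detected."""
    payload = base64.urlsafe_b64encode(json.dumps(data, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def decode_session(cookie: str):
    """Return the session dict, or None if the cookie is malformed or tampered with."""
    payload, sep, signature = cookie.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # Compare as bytes in constant time to avoid leaking timing information.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


if __name__ == "__main__":
    cookie = encode_session({"user_id": 42, "tenant": "acme"})
    print(len(cookie), "bytes sent with every request")
    print(decode_session(cookie))
```

The point of the sketch is that the per-request cost is one serialization plus one cryptographic pass over a payload of a few hundred bytes, and that the whole cookie travels with every request, so the session content has to stay small.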
Do you have any experience with some of the above solutions? How did it work out in terms of maintainability and performance? Am I on the right track, or am I missing something?
TL;DR
While reading up on different approaches to overcome this problem, I found that an active-active solution seems like the way to go, partly because I like the thinking that no resources are simply idle, and partly to avoid finding out that the fail-over system is simply not ready to take over in case of an emergency.
The setup so far consists of 3 datacenters.
- Two datacenters will take care of processing requests.
- Each of these datacenters will contain:
  - Two load balancers (HAProxy) in a fail-over setup
  - Two or more application servers
  - A single (for now) database server
- The third datacenter should contain a single database server (to have at least 3 servers in the cluster).
- Databases will be kept in sync by setting up a Galera Cluster.
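The reason for the third database server can be illustrated with a simple majority check. The Python sketch below assumes every node has equal weight and that a partition stays writable only while it can see a strict majority of the cluster; Galera's real quorum calculation supports node weights and has more detail than this. The names `has_quorum`, `survives_outage` and the `nodes` layout are hypothetical.

```python
def has_quorum(total_nodes: int, reachable_nodes: int) -> bool:
    """A partition keeps quorum only if it sees a strict majority of all nodes."""
    return reachable_nodes > total_nodes / 2


# Database nodes per datacenter in the setup described above (assumed layout).
nodes = {"dc1": 1, "dc2": 1, "dc3": 1}


def survives_outage(failed_dc: str) -> bool:
    """Check whether the remaining datacenters still hold a majority."""
    total = sum(nodes.values())
    remaining = total - nodes[failed_dc]
    return has_quorum(total, remaining)


if __name__ == "__main__":
    for dc in nodes:
        print(dc, "down, cluster still has quorum:", survives_outage(dc))
    # With only two database servers, losing one leaves exactly half: no majority.
    print("two-node cluster, one down:", has_quorum(2, 1))
```

With two nodes, the surviving one cannot tell an outage of its peer from a network split, which is why the third site exists even though it does not process requests.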
In our service each of our clients gets their own sub-domain. Some load balancing will happen at DNS level: chosen clients are routed to DC1, while all others will use DC2. In case of a datacenter outage, all requests are routed to the online datacenter.
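The routing rule described here can be sketched in a few lines of Python. This is only an illustration of the decision logic, not an actual DNS configuration; the client names in `DC1_CLIENTS` and the function `pick_datacenter` are hypothetical.

```python
from typing import Optional, Set

# Hypothetical sub-domains pinned to DC1; every other client defaults to DC2.
DC1_CLIENTS = {"acme", "globex"}


def pick_datacenter(subdomain: str, online: Set[str]) -> Optional[str]:
    """Return the datacenter that should serve this client, or None if both are down."""
    preferred = "dc1" if subdomain in DC1_CLIENTS else "dc2"
    if preferred in online:
        return preferred
    # Preferred datacenter is offline: fail over to the other one if it is up.
    fallback = "dc2" if preferred == "dc1" else "dc1"
    return fallback if fallback in online else None


if __name__ == "__main__":
    print(pick_datacenter("acme", {"dc1", "dc2"}))     # normal operation
    print(pick_datacenter("acme", {"dc2"}))            # DC1 outage
    print(pick_datacenter("initech", {"dc1", "dc2"}))  # unlisted client
```

Since a client is pinned to one datacenter during normal operation, its sessions only need to be readable in the other datacenter at the moment of a fail-over, which matters when weighing the session options above.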
Here is an image illustrating the setup https://ibb.co/cDzJAo
Sorry for the long post, thank you in advance.
PS. If relevant: we serve sessions in the magnitude of hundreds; the average session duration is however relatively long (3-4 hours), with requests happening every minute or so.
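For a rough sense of scale, the numbers above translate into load like this. The sketch is back-of-the-envelope only: the 500 concurrent sessions and the 4096-byte cookie are assumed example values (the latter being a common upper bound for a single cookie), not measurements from our system.

```python
def requests_per_second(sessions: int, requests_per_minute: float) -> float:
    """Average request rate produced by a number of concurrent sessions."""
    return sessions * requests_per_minute / 60


def cookie_overhead_bytes_per_second(
    sessions: int, requests_per_minute: float, cookie_bytes: int
) -> float:
    """Extra inbound traffic if every request carries a session cookie of this size."""
    return requests_per_second(sessions, requests_per_minute) * cookie_bytes


if __name__ == "__main__":
    sessions = 500          # assumed: "hundreds" of concurrent sessions
    per_minute = 1.0        # a request every minute or so
    cookie_bytes = 4096     # assumed worst case for a single cookie

    rps = requests_per_second(sessions, per_minute)
    overhead = cookie_overhead_bytes_per_second(sessions, per_minute, cookie_bytes)
    print(f"{rps:.1f} requests/s, {overhead / 1024:.1f} KiB/s of cookie traffic")
```

Under these assumptions the rate is roughly 8 requests per second and about 33 KiB/s of cookie traffic, so at this scale the choice between session drivers is probably driven more by fail-over behaviour than by raw throughput.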