Did anybody have to deal with this type of scalability challenge?
Laravel and Master-Slave replication lag
Hi,
I'm using Laravel 5.3 and an AWS Aurora database. In AWS, I have an Aurora master and several slaves. What I like about Aurora is that it creates a database cluster and gives me two endpoints, one for writes and one for reads. The read endpoint automatically distributes the selects across all slaves, so the only thing I need to set up in my Laravel project is the database read and write hosts.
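For reference, the read/write setup looks roughly like the fragment below in config/database.php. This is only a sketch: the env variable names DB_READ_HOST and DB_WRITE_HOST are placeholders I made up, and the rest of the connection settings are the usual Laravel defaults.

```php
// config/database.php (fragment of the 'connections' array)
'mysql' => [
    'driver' => 'mysql',
    // SELECT statements go to the Aurora read endpoint (assumed env name).
    'read' => [
        'host' => env('DB_READ_HOST'),
    ],
    // INSERT, UPDATE and DELETE statements go to the Aurora write endpoint (assumed env name).
    'write' => [
        'host' => env('DB_WRITE_HOST'),
    ],
    // Settings below are shared by both the read and the write connection.
    'port' => env('DB_PORT', '3306'),
    'database' => env('DB_DATABASE', 'forge'),
    'username' => env('DB_USERNAME', 'forge'),
    'password' => env('DB_PASSWORD', ''),
    'charset' => 'utf8',
    'collation' => 'utf8_unicode_ci',
    'prefix' => '',
],
```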
So far so good. The problem with the Master-Slave architecture is replication lag; in my case the lag is anywhere from 100 ms to 700 ms depending on the database load. This means that if I insert a record and then immediately try to retrieve it, there is a high chance the read will not find it, because the slave has not received the row yet.
I do have a solution for this issue, which I'm going to share with you, but I'm wondering if there is a better way to deal with the M-S architecture.
The idea behind my solution is to wait 1 second after an Eloquent model is saved. So I listen to the Eloquent saved event and execute the following code:
// Wait for replication when the model is not saved inside a database transaction.
if (!$this->getConnection()->getPdo()->inTransaction()) {
    sleep(1); // Sleep for 1 second
}
And in my EventServiceProvider I have the following code that deals with database transactions:
// Wait for replication after a database transaction is committed.
Event::listen(TransactionCommitted::class, function ($event) {
    sleep(1); // Sleep for 1 second
});
The above solution works but I don't like the idea of delaying the execution of my code for 1 second every time a model is saved, especially when most of the time the replication lag is only 100ms.
Ideally, the above code would wait for only 100 ms. Then, if the newly created record cannot be retrieved from the slave, it would wait for another XXX milliseconds before retrying.
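A rough sketch of that wait-and-retry idea in plain PHP. Everything here is hypothetical: the helper name retryUntilFound, the attempt count and delay, and the Post model in the usage comment are made up for illustration, and none of it is part of Laravel.

```php
<?php

/**
 * Call $fetch until it returns a non-null value, sleeping between attempts.
 * Hypothetical helper for illustration only.
 *
 * @param callable $fetch       Returns the record, or null if it is not found yet.
 * @param int      $maxAttempts Total number of reads to try.
 * @param int      $delayMs     Pause between two reads, in milliseconds.
 * @return mixed The first non-null result, or null if every attempt failed.
 */
function retryUntilFound(callable $fetch, $maxAttempts = 5, $delayMs = 100)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = $fetch();
        if ($result !== null) {
            return $result;
        }
        // No point in sleeping after the last failed attempt.
        if ($attempt < $maxAttempts) {
            usleep($delayMs * 1000); // usleep() expects microseconds
        }
    }

    return null;
}

// Usage, assuming a hypothetical Eloquent model named Post:
// $post = retryUntilFound(function () use ($id) {
//     return Post::find($id); // find() returns null when the row is missing
// });
```

With these defaults the worst case is about 400 ms of waiting (four pauses of 100 ms), and a read that succeeds on the first attempt is not delayed at all.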
Does anyone know the best place to check whether a model was not found, wait, and then retry? Or is there a better approach altogether?
Thanks, Gabriel