Mar 9, 2017

Level 1

Laravel and Master-Slave replication lag

Hi,

I'm using Laravel 5.3 and AWS Aurora db. In AWS, I have an Aurora master and several slaves. What I like about Aurora is that it creates a database cluster and provides me with two endpoints, one for writes and one of reads and the read endpoint automatically distributes the selects across all slaves which means that the only thing that I need to setup in my Laravel project is the database read and write hosts.

So far so good. The problem with the Master-Slave architecture is that there is a replication lag, in my case the lag is anywhere from 100 ms to 700 ms depending on the database load. This means that if I insert a record and then try to retrieve it, the probability not to find it is high because of the replication lag.

I do have a solution for this issue which I'm going to share with you, but I'm wondering is there is a better way to deal with M-S architecture.

The idea behind my solution is to wait 1 second after an eloquent modal is saved. So I listen to the eloquent saved event and execute the following code:

  // Wait for replication when model is not in a database transaction.
  if (!$this->getConnection()->getPdo()->inTransaction()) {
          seep(1);  // Sleep for 1 second
  }

And in my EventServiceProvider I have the following code that deals with database transactions:

  // Wait for replication after a database transaction is committed.
  Event::listen(TransactionCommitted::class, function ($event) {
            seep(1);  // Sleep for 1 second
  });

The above solution works but I don't like the idea of delaying the execution of my code for 1 second every time a model is saved, especially when most of the time the replication lag is only 100ms.

Ideally would be to set the above code to wait for 100ms only. Then if the new record created cannot be retrieved from the slave, wait for another XXX milliseconds before retrying.

Does anyone know where would be the best place to check if a model was not found, wait again and then retry? Or any other better approach?

Thanks, Gabriel

gabvalentin

9 years ago

Level 1

Did anybody have to deal with this type of scalability challenge?

Speedy11

9 years ago

Level 2

Maybe it's AWS Aurora issue. Have you tried, AWS MySQL Multi-AZ?

gabvalentin

9 years ago

Level 1

All RDS databases have a replication lag, some of them more and some less.

Aurora is supposed to be the one with less replication lag. It also has the advantage of creating a cluster and it provides only 2 endpoints, one for read and one for write which is perfect for laravel because I don't have to handle several read connections for multiple salves, Aurora does it automatically.

The Multi-AZ doesn't solve the scalabillity challenge because the second instance is a stand-by replica which doesn't accept connections.

As far as I know, MySQL NDB Cluster is the only one that provides synchronous replication but I need to use RDS.

Ideally would be to detect in Laravel when a select query can't find the record (via Eloquent) and then wait 100ms and retry it but I didn't find a way to hook into Eloquent to check if the model returns data or not.

Thanks

LaraPhil

8 years ago

Level 2

I experiencing the exact same issues with my RDS Master-Slave system (MySQL). Did you find a better solution? I really dont like the idea of slowing down the performance of my laravel project just to be sure that syncing is over. I wonder why this problem is not discussed a lot more. Currently I think about going back to just one database and try to scale it up a little bit, but this cannot be the solution.

fallen

8 years ago

Level 1

On RDS I occasionaly had 1-2 second lag, that driven me nuts on why the app was not behaving, so what I did is:

read $user from replica
if (!$user) {try reading from master}

this way 98% queries were fulfilled by replica, and the rest by reading master.

afrayedknot

8 years ago

Level 14

In Laravel 5.5 I've done a PR for a "sticky" option for replicated read/write databases.

It means your application will use the "read" connection whenever possible, but if you do a "write", then any subsequent "reads" on the same request cycle will also come from the write connection.

This ensures data integrity.

https://github.com/laravel/framework/pull/20445

3 likes

narwy

8 years ago

Level 23

@laurence has it been documented yet?

Please or to participate in this conversation.