queue:work with redis cluster - strange behaviour w. sharding

Hello,

I'm building a high-traffic application that uses a Redis cluster with 3 shard nodes:

database.php / redis:

    'clusters' => [
        'default' => env('REDIS_CLUSTER', false) ? array_map(function ($host) {
            return [
                'host' => $host,
                'password' => env('REDIS_PASSWORD'),
                'port' => env('REDIS_PORT', 6379),
                'database' => env('REDIS_CACHE_DB', 0),
            ];
        }, explode(',', env('REDIS_CLUSTER_HOSTS', ''))) : [],
    ],

.env: REDIS_CLUSTER_HOSTS=10.0.0.1,10.0.0.2,10.0.0.3.
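To make the resulting node list easier to inspect, here is a minimal standalone sketch of what the closure above produces for that .env value. `buildClusterNodes` is a hypothetical helper, not part of Laravel; it takes the raw host string instead of calling `env()`, drops blank entries, and hard-codes database 0 (Redis Cluster does not support selecting other databases), which differs from the `REDIS_CACHE_DB` lookup in the original config.

```php
<?php

// Hypothetical stand-in for the closure in the config above. It receives the
// raw REDIS_CLUSTER_HOSTS string directly instead of using Laravel's env().
function buildClusterNodes(string $hosts, ?string $password = null, int $port = 6379): array
{
    // Trim whitespace and drop empty entries so that an empty or
    // trailing-comma value does not produce a node with a blank host.
    $list = array_filter(
        array_map('trim', explode(',', $hosts)),
        fn ($host) => $host !== ''
    );

    // Build one connection array per host; array_values() reindexes from 0.
    return array_values(array_map(fn ($host) => [
        'host' => $host,
        'password' => $password,
        'port' => $port,
        'database' => 0, // assumption: cluster mode, so only database 0 is used
    ], $list));
}

$nodes = buildClusterNodes('10.0.0.1,10.0.0.2,10.0.0.3');
```

Dumping `$nodes` (for example with `print_r`) shows one entry per host, which is a quick way to confirm that all three shards are actually in the configuration.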

As I also use Redis as the session driver, everything seems to be fine there. The entries get distributed across all shards, so if one node is down, the application doesn't fail and keeps distributing data across the remaining nodes.

The only strange thing is that when running a queue, the entries don't seem to be removed from the cluster correctly. Initially, the queue entries get processed just fine, with the correct result:

    2023-12-15 01:45:01 App\Mail\PasswordRecoveryEmail ................. RUNNING
    2023-12-15 01:45:01 App\Mail\PasswordRecoveryEmail ........... 125.62ms DONE
    2023-12-15 01:45:13 App\Mail\PasswordRecoveryEmail ................. RUNNING
    2023-12-15 01:45:13 App\Mail\PasswordRecoveryEmail ........... 126.18ms DONE

... but after a while, out of nowhere, the (SAME!) job gets executed again, this time marked as failed:

    2023-12-15 01:46:34 App\Mail\PasswordRecoveryEmail ................. RUNNING
    2023-12-15 01:46:34 App\Mail\PasswordRecoveryEmail ............ 32.10ms FAIL
    2023-12-15 01:46:34 App\Mail\PasswordRecoveryEmail ................. RUNNING
    2023-12-15 01:46:34 App\Mail\PasswordRecoveryEmail ............ 29.41ms FAIL
    2023-12-15 01:46:34 App\Mail\PasswordRecoveryEmail ................. RUNNING
    2023-12-15 01:46:34 App\Mail\PasswordRecoveryEmail ............ 26.69ms FAIL
    2023-12-15 01:46:34 App\Mail\PasswordRecoveryEmail ................. RUNNING

I can also see in Telescope that the job is initially marked as processed and done, but suddenly (sometimes after 1-2 minutes) the whole series of jobs keeps being reprocessed. I am just using the default queue:work command. The exception message in the (now marked as failed) job reads:

App\Mail\PasswordRecoveryEmail has been attempted too many times.

In the Redis tab I can see the following:

    setex APPNAME_staging_cache_:lT5teQh5HnGDvqQoOP93bCZj2p43Ju4hjpqQSsfo 7200... 0.63ms 29s ago
    setex... 0.78ms 29s ago
    zadd APPNAME_staging_cache_:tag:SystemVars:entries 1702602791... 0.63ms 29s ago
    get... 0.75ms 29s ago

When doing this on a single-node configuration (via the default database.php / redis config), the problem does not occur. Unfortunately I cannot use Horizon, as it doesn't seem to work with a Redis cluster setup.
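One setting that may be worth checking here: the Laravel queue documentation states that when the Redis queue connection uses a Redis Cluster, queue names must contain a key hash tag, so that all Redis keys for a given queue land in the same hash slot. The gap between the first DONE (01:45:01) and the first FAIL (01:46:34) is also roughly 90 seconds, which is close to the default `retry_after` value. Below is a minimal sketch of the relevant entry in config/queue.php, assuming the stock `redis` queue connection and a queue named `default`; whether this is the actual cause is not confirmed.

```php
'redis' => [
    'driver' => 'redis',
    'connection' => env('REDIS_QUEUE_CONNECTION', 'default'),
    // The braces form a key hash tag: Redis Cluster hashes only the part
    // inside {...}, so the queue's pending, reserved and delayed keys
    // all map to the same slot.
    'queue' => env('REDIS_QUEUE', '{default}'),
    // Seconds before a reserved job is considered abandoned and retried.
    'retry_after' => env('REDIS_QUEUE_RETRY_AFTER', 90),
    'block_for' => null,
    'after_commit' => false,
],
```

If jobs are dispatched to other named queues, those names would need the same brace-wrapped form, and the worker would be started against the tagged name.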

Any ideas?

fogbreaker

2 years ago

Anyone?