madprabh

Job batches processing issue

So, folks, here is the scenario:

I have two EC2 instances. Instance A and B.

Both instances run the same application and point to the same database. On Instance A, Supervisor is configured to run the queue workers that process batched jobs and jobs in general.

The problem is

When a batch is dispatched from Instance B, the workers on Instance A don't process the batched jobs. However, when batched jobs are dispatched from Instance A, the worker picks them up and executes them.

What could be wrong?

LaryAI

It sounds like you're encountering an issue where job batches dispatched from one EC2 instance are not being processed by workers on another instance. This could be due to a few different reasons, so let's go through some troubleshooting steps and potential solutions.

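  1. Queue Configuration: Ensure that both instances are configured to use the same queue connection and that the queue name is the same across both instances. If you're using a queue service like Amazon SQS, Redis, or a database queue, both instances should be pointed to the same queue. Also check that Instance B is not set to the sync connection: with sync, jobs run inline in the dispatching process and never reach a queue that Instance A's workers could read.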

  2. Supervisor Configuration: Verify that Supervisor on Instance A is correctly configured to listen to the queue that batch jobs are being dispatched to. The configuration file should specify the correct queue name.

  3. Job Serialization: If you're using Laravel, ensure that the job classes are available and identical on both instances. If there's a discrepancy in the job class definitions between the two instances, the unserialization of the job on Instance A might fail, causing it not to be processed.

  4. Database Connection: Since you mentioned that both instances point to the same database, make sure that the database connection is stable and that Instance A can read and write the job_batches table (and the jobs table, if you use the database queue driver) without any issues.

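  5. Cache Store: By default, batch metadata is stored in the job_batches database table rather than in the cache, but queue workers still rely on the cache for things like the queue:restart signal and unique-job locks. Ensure that the cache configuration is the same on both instances and that they point to the same cache store if you're using a distributed cache like Redis or Memcached.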

  6. Job Visibility: If you're using a queue system that supports message visibility (like Amazon SQS), ensure that the visibility timeout is longer than the time a job takes to run. If the timeout is too short, the message becomes visible again while the job is still running and another worker may process it a second time.

  7. Error Handling: Check the logs on both instances for any error messages related to job processing. There might be an exception being thrown that's preventing the jobs from being processed on Instance A.

  8. Laravel Horizon: If you're using Laravel Horizon, ensure that it's properly configured and running on Instance A. Horizon should be set up to monitor the same queues that the jobs are being dispatched to.
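
Points 1 and 5 can be checked quickly by comparing the queue-related settings in each instance's .env file. The following is a minimal sketch, not a definitive tool: the function names are hypothetical, the key list assumes the environment variable names used by a stock Laravel configuration, and it assumes you have copied both .env files to one machine.

```shell
# Print the queue-related settings from a Laravel .env file, sorted by key.
# The key list is an assumption based on a stock Laravel config; extend it
# to match whatever your config/queue.php actually reads.
queue_settings() {
  { grep -E '^(QUEUE_CONNECTION|REDIS_HOST|REDIS_PORT|REDIS_QUEUE|SQS_PREFIX|SQS_QUEUE|DB_HOST|DB_DATABASE)=' "$1" || true; } | sort
}

# Compare the queue-related settings of two .env files.
# Returns 0 when they match, 1 (and prints both sets) when they differ.
compare_queue_settings() {
  settings_a=$(queue_settings "$1")
  settings_b=$(queue_settings "$2")
  if [ "$settings_a" = "$settings_b" ]; then
    echo "queue settings match"
    return 0
  fi
  echo "queue settings differ"
  echo "--- $1"
  echo "$settings_a"
  echo "--- $2"
  echo "$settings_b"
  return 1
}
```

For example, compare_queue_settings instance-a.env instance-b.env (hypothetical file names) would print both sets of values if Instance B used a different QUEUE_CONNECTION. Keep in mind that if the configuration has been cached with php artisan config:cache, the running application uses the cached values until the cache is rebuilt, so the .env file alone may not tell the whole story.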

Here's a sample Supervisor configuration for a Laravel queue worker that you can check against your current setup:

[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /path/to/your/laravel/app/artisan queue:work --sleep=3 --tries=3 --max-time=3600
autostart=true
autorestart=true
user=www-data
numprocs=8
redirect_stderr=true
stdout_logfile=/path/to/your/laravel/app/storage/logs/worker.log
stopwaitsecs=3600

Make sure to replace /path/to/your/laravel/app with the actual path to your Laravel application and adjust the numprocs value based on your needs.

If after checking all these points the issue still persists, you may need to provide more specific details about your job batch setup, queue configuration, and any error logs that could help in diagnosing the problem further.
