It sounds like you're encountering an issue where job batches dispatched from one EC2 instance are not being processed by workers on another instance. This could be due to a few different reasons, so let's go through some troubleshooting steps and potential solutions.
- Queue Configuration: Ensure that both instances use the same queue connection and the same queue name. Whether you're using Amazon SQS, Redis, or a database queue, both instances should point to the same queue.
- Supervisor Configuration: Verify that the worker command Supervisor runs on Instance A listens on the connection and queue that the batch jobs are dispatched to. The command line in the Supervisor configuration file should name that queue.
- Job Serialization: If you're using Laravel, ensure that the job classes are available and identical on both instances. If the job class definitions differ between the two instances, unserializing the job on Instance A might fail, so the job is never processed.
- Database Connection: Since you mentioned that both instances point to the same database, make sure that the connection is stable and that Instance A can read and write the job batch table (job_batches in a default Laravel install) without any issues.
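- Cache Store: Laravel keeps batch metadata in the database rather than the cache, but workers still rely on the cache for restart signals and for job locks (for example, unique jobs and the WithoutOverlapping middleware). Ensure that the cache configuration is the same on both instances and that they point to the same store if you're using a distributed cache like Redis or Memcached.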
- Job Visibility: If you're using a queue system that supports message visibility (like Amazon SQS), ensure that the visibility timeout is longer than your longest-running job. If the timeout is too short, the message becomes visible again while the job is still running and can be picked up a second time by any worker polling that queue.
- Error Handling: Check the logs on both instances for any error messages related to job processing. An exception may be preventing the jobs from being processed on Instance A.
- Laravel Horizon: If you're using Laravel Horizon, ensure that it's properly configured and running on Instance A. Horizon should be set up to process the same queues that the jobs are being dispatched to.
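Several of the checks above come down to both instances having the same values in their .env files. As a rough sketch, the shell functions below compare a handful of queue-, database-, and cache-related keys between two .env files (for example, one copied over from the other instance). The function names and the key list are my own assumptions, not anything Laravel ships; the keys are the ones a stock Laravel config reads, so adjust the list if your config files use different variables.

```shell
#!/bin/sh
# Print the last value assigned to KEY in a dotenv-style FILE (empty if unset).
env_value() {
    # $1 = key, $2 = file
    sed -n "s/^$1=//p" "$2" | tail -n 1
}

# Compare queue-related settings between two .env files.
# Prints one line per mismatch and returns non-zero if any were found.
# Values are compared as raw text, so quoting differences count as mismatches.
queue_env_diff() {
    # $1 = .env from the dispatching instance, $2 = .env from the worker instance
    mismatch=0
    # Hypothetical key list based on a default Laravel config; edit to taste.
    for key in APP_NAME QUEUE_CONNECTION REDIS_HOST REDIS_PORT REDIS_PREFIX \
               REDIS_QUEUE SQS_PREFIX SQS_QUEUE DB_CONNECTION DB_HOST DB_PORT \
               DB_DATABASE CACHE_DRIVER CACHE_STORE CACHE_PREFIX; do
        a=$(env_value "$key" "$1")
        b=$(env_value "$key" "$2")
        if [ "$a" != "$b" ]; then
            printf '%s differs: "%s" vs "%s"\n' "$key" "$a" "$b"
            mismatch=1
        fi
    done
    return "$mismatch"
}
```

To use it, copy Instance B's .env to Instance A under a temporary name and run queue_env_diff /tmp/instance-b.env /path/to/your/laravel/app/.env. APP_NAME is in the list because the default Redis key prefix is derived from it, so a mismatch there can put the two instances on different queue keys even when they share one Redis server. After changing any value, clear the config cache (php artisan config:clear) and restart the workers (php artisan queue:restart), since long-running workers keep the old configuration in memory.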
Here's a sample Supervisor configuration for a Laravel queue worker that you can check against your current setup:
[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /path/to/your/laravel/app/artisan queue:work --sleep=3 --tries=3 --max-time=3600
autostart=true
autorestart=true
user=www-data
numprocs=8
redirect_stderr=true
stdout_logfile=/path/to/your/laravel/app/storage/logs/worker.log
stopwaitsecs=3600
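Make sure to replace /path/to/your/laravel/app with the actual path to your Laravel application and adjust the numprocs value based on your needs. The command above listens on the default connection and its default queue; if your batches are dispatched to a different connection or queue, add the connection name and a --queue option to the command (for example, queue:work redis --queue=your-queue).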
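To confirm which queue the workers on Instance A are actually told to listen on, you can read it straight out of the Supervisor config. The function below is only a sketch: worker_queues is a hypothetical helper name, and it recognises only the --queue=name form of the option, not --queue name with a space.

```shell
#!/bin/sh
# Print the --queue value from each "command=" line in a Supervisor config,
# or a note that the worker falls back to the connection's default queue.
worker_queues() {
    # $1 = path to a Supervisor program config file
    grep '^command=' "$1" | while IFS= read -r line; do
        case "$line" in
            *--queue=*)
                q=${line#*--queue=}   # drop everything up to the option value
                q=${q%% *}            # drop any options that follow it
                printf '%s\n' "$q"
                ;;
            *)
                printf '%s\n' '(connection default)'
                ;;
        esac
    done
}
```

Compare its output with the queue name your batch is dispatched to. If you then edit the config, Supervisor needs to be told about it: run supervisorctl reread, then supervisorctl update, and restart the program group if the workers were already running.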
If the issue persists after checking all these points, please share more specific details about your job batch setup, your queue configuration, and any error logs; these would help in diagnosing the problem further.
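For the error logs, a small helper like the one below can pull just the error-level entries out of a log file so they are easier to share. It is a sketch that assumes Laravel's default single-line log prefix ([timestamp] channel.LEVEL: message); recent_errors is a hypothetical name, and a custom log format would need a different pattern.

```shell
#!/bin/sh
# Print the most recent error-level entries from a Laravel-style log file.
# Stack-trace lines that follow an entry are skipped; only the header line
# of each entry is printed.
recent_errors() {
    # $1 = log file, $2 = how many entries to show (default 20)
    grep -E '^\[[^]]+\] [A-Za-z0-9_-]+\.(ERROR|CRITICAL|ALERT|EMERGENCY):' "$1" \
        | tail -n "${2:-20}"
}
```

Run it against the application log under storage/logs on both instances and against the worker.log named in the Supervisor config. Jobs that failed outright are also recorded in the failed_jobs table, which php artisan queue:failed will list.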