ahoi

Queues and timeouts: Jobs are running simultaneously

Hey there,

I have some very long-running jobs, which are handled by Horizon. I also keep track of the jobs using a job middleware. When a job starts, I write that event to the database. If it fails, I record that as well, and if it is released back to the queue, I keep track of that too.

Now I have this situation:

If a job runs longer than the timeout I specified, it is not marked as released, finished, or failed. It just starts over. I don't know why, but the job that timed out keeps running in the background, so the same job is running twice.

Let's take this job:

<?php
declare(strict_types=1);

class TestJob implements ShouldQueue
{
    use Trackable, Dispatchable, InteractsWithQueue, Queueable, SerializesModels;
    
    public int $tries   = 5;
    public int $timeout = 15 * 60;

    public function handle(): void
    {
        try{
            sleep(15*60);
        } catch (Exception $exception) {
            ray($exception);
        }
    }
}

It runs for exactly 15 minutes, but the catch block is never reached.

This is my horizon config:

What am I missing?

LaryAI

It looks like you're encountering an issue where your job is being restarted after it times out, but the original job continues to run in the background. This can happen if the job is not properly terminated when it hits the timeout limit.

Here are a few steps to help you resolve this issue:

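  1. Ensure Proper Job Termination: Make sure the worker can actually terminate the job when it hits the timeout. Laravel relies on the pcntl PHP extension to enforce job timeouts, so it must be installed. When the timeout fires, the worker process is killed instead of an exception being thrown inside handle(), which is why your catch block is never reached.

  2. Check Horizon Configuration: Ensure that your Horizon configuration is correctly set up to handle timeouts and retries.

  3. Do Not Rely on timeoutAt: The timeoutAt property (renamed to retryUntil in later Laravel versions) only sets the time after which the job is no longer attempted. It does not stop an attempt that is already running. If you want a timed-out job to be marked as failed, recent Laravel versions let you set the $failOnTimeout property on the job.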

Here's an updated version of your TestJob class:
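
A minimal sketch of such a class, assuming a Laravel version that supports the $failOnTimeout property and a worker with the pcntl extension installed. The App\Jobs namespace, the failed() body, and the log message are illustrative assumptions, and the Trackable trait from the question is left out because its code is not shown:

```php
<?php
declare(strict_types=1);

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Log;
use Throwable;

class TestJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public int $tries = 5;

    // Keep this a few seconds shorter than retry_after in config/queue.php.
    public int $timeout = 15 * 60;

    // Assumption: available in your Laravel version.
    // Marks the job as failed when the timeout is hit instead of retrying it.
    public bool $failOnTimeout = true;

    public function handle(): void
    {
        // Simulates long-running work that reaches the timeout.
        sleep(15 * 60);
    }

    // Called when the job is marked as failed, which includes a timeout
    // when $failOnTimeout is true. A try/catch inside handle() does not
    // see the timeout because the worker process is killed.
    public function failed(?Throwable $exception): void
    {
        Log::error('TestJob failed', [
            'exception' => $exception ? $exception->getMessage() : null,
        ]);
    }
}
```

With $failOnTimeout enabled, the first timeout ends the job rather than using up one of the five tries, so drop that property if you still want retries after a timeout.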

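  4. Check Supervisor Configuration: Ensure that the supervisor's timeout in config/horizon.php is a few seconds shorter than the retry_after value of the queue connection in config/queue.php. If retry_after is shorter than the runtime of the job, the job is handed to another worker while the first attempt is still running, which matches the duplicate run you are seeing.

Here's a snippet of a Horizon supervisor configuration with the relevant settings: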

'defaults' => [
    'supervisor-1' => [
        'connection' => 'redis',
        'queue' => ['default'],
        'balance' => 'auto',
        'autoScalingStrategy' => 'time',
        'maxProcesses' => 1,
        'maxTime' => 0,
        'maxJobs' => 0,
        'memory' => 128,
        'tries' => 1,
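        'timeout' => 15 * 60, // 15 minutes; keep this shorter than retry_after
        // retry_after is not a supervisor option; set it on the redis connection in config/queue.php (e.g. 20 * 60)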
        'nice' => 0,
    ],
],
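
The retry_after value lives on the queue connection, not on the Horizon supervisor. A default Laravel installation sets it to 90 seconds for the redis connection, which would let a 15-minute job be picked up again long before the first attempt ends. A sketch of the relevant excerpt, assuming the connection is named redis and that 20 minutes leaves enough headroom for your jobs:

```php
<?php
// config/queue.php (excerpt)

return [
    'connections' => [
        'redis' => [
            'driver' => 'redis',
            'connection' => 'default',
            'queue' => env('REDIS_QUEUE', 'default'),
            // Must be longer than the longest job timeout (15 * 60 here),
            // otherwise a still-running job is handed to another worker.
            'retry_after' => 20 * 60,
            'block_for' => null,
        ],
    ],
];
```

Restart Horizon after changing this file so the workers pick up the new value.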

  5. Check for Overlapping Jobs: Ensure that your job is not being dispatched multiple times unintentionally. You can use job middleware to prevent overlapping jobs.

Here's an example of a middleware to prevent overlapping jobs:

<?php

namespace App\Jobs\Middleware;

use Illuminate\Support\Facades\Cache;

class PreventOverlapping
{
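    public function handle($job, $next)
    {
        // The lock expires after the job timeout, so a killed worker cannot hold it forever.
        $lock = Cache::lock($job->uniqueId(), $job->timeout);

        if ($lock->get()) {
            try {
                $next($job);
            } finally {
                // Release the lock even if the job throws.
                $lock->release();
            }
        } else {
            // Another instance holds the lock: put this one back on the queue.
            $job->release($job->timeout);
        }
    }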
}

And in your job class, you can use this middleware:

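// Attach the middleware to the job (import App\Jobs\Middleware\PreventOverlapping).
public function middleware(): array
{
    return [new PreventOverlapping];
}

// Key used for the lock; $this->id assumes the job carries its own identifier.
public function uniqueId(): string
{
    return 'test-job-' . $this->id;
}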
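
Recent Laravel versions also ship a built-in WithoutOverlapping job middleware, which may save you from maintaining a custom one. A minimal sketch, assuming your Laravel version includes it; the lock key 'test-job' and the 60-second and 20-minute values are illustrative:

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\Middleware\WithoutOverlapping;
use Illuminate\Queue\SerializesModels;

class TestJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function middleware(): array
    {
        return [
            (new WithoutOverlapping('test-job'))
                // Put an overlapping job back on the queue for 60 seconds.
                ->releaseAfter(60)
                // Drop the lock after 20 minutes in case the worker was killed.
                ->expireAfter(20 * 60),
        ];
    }

    public function handle(): void
    {
        // Long-running work goes here.
    }
}
```

Each release counts as an attempt, so $tries may need to be raised if overlapping jobs are released often.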

By following these steps, you should be able to ensure that your job is properly terminated when it hits the timeout and prevent it from running multiple times simultaneously.
