aarondfrancis's avatar

Confusion about Queue Timeouts

Hey all, back with another queue question.

I'm trying to figure out the difference between the expiration in the queue config (using driver db) and the --timeout flag on the queue:listen command.

Here's my issue: I've got a command that takes a really long time because it does some HTTP requests and some image processing, so I set the --timeout=900 flag. The problem is that the queue is in the db with this configuration:

'database' => [
            'driver' => 'database',
            'table' => 'jobs',
            'queue' => 'default',
            'expire' => 60,
        ],

After running several queue:listen commands, I noticed that they were picking up the same job multiple times. Bad news. After digging around, I found that artisan will silently release those long-running jobs back into the queue if they have "expired" (i.e., passed the 60-second mark).

So what is the point of the two different options? I understand that --timeout actually allows the length of the processing handler to go on longer, so why have expire at all? If my command runs 61 seconds, artisan dumps it and it gets picked back up.

Does that make sense?

aarondfrancis's avatar

For those interested, here's the code that pops it off:

Illuminate\Queue\DatabaseQueue.php

    public function pop($queue = null)
    {
        $queue = $this->getQueue($queue);

        if ( ! is_null($this->expire))
        {
            $this->releaseJobsThatHaveBeenReservedTooLong($queue);
        }

        if ($job = $this->getNextAvailableJob($queue))
        {
            $this->markJobAsReserved($job->id);

            $this->database->commit();

            return new DatabaseJob(
                $this->container, $this, $job, $queue
            );
        }

        $this->database->commit();
    }


    /**
     * Release the jobs that have been reserved for too long.
     *
     * @param  string  $queue
     * @return void
     */
    protected function releaseJobsThatHaveBeenReservedTooLong($queue)
    {
        $expired = Carbon::now()->subSeconds($this->expire)->getTimestamp();

        $this->database->table($this->table)
                    ->where('queue', $this->getQueue($queue))
                    ->where('reserved', 1)
                    ->where('reserved_at', '<=', $expired)
                    ->update([
                        'reserved' => 0,
                        'reserved_at' => null,
                        'attempts' => new Expression('attempts + 1')
                    ]);
    }
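
To make the release query above concrete, here is a rough sketch of the same rule applied to a plain array instead of the jobs table. The function name releaseExpiredJobs and the array layout are made up for illustration; they are not part of Laravel.

```php
<?php

/**
 * Plain-array stand-in for the UPDATE query above (hypothetical helper,
 * not a Laravel API): releases every reserved job whose reservation is
 * at least $expire seconds old.
 */
function releaseExpiredJobs(array $jobs, int $expire, int $now): array
{
    // Same cutoff as Carbon::now()->subSeconds($expire)->getTimestamp()
    $cutoff = $now - $expire;

    foreach ($jobs as &$job) {
        if ($job['reserved'] === 1 && $job['reserved_at'] <= $cutoff) {
            $job['reserved'] = 0;
            $job['reserved_at'] = null;
            // Counted as an attempt even if the first worker is still running
            $job['attempts']++;
        }
    }
    unset($job);

    return $jobs;
}

// A job reserved at t=0 and checked at t=61 with expire=60 goes back on the queue.
$jobs = [['id' => 1, 'reserved' => 1, 'reserved_at' => 0, 'attempts' => 0]];
$released = releaseExpiredJobs($jobs, 60, 61);
```

Note that nothing in this check looks at whether the original worker is still busy, which is why a long-running job can be handed out twice.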
toniperic's avatar

@aarondfrancis not quite sure what's your question here?

Using artisan queue:listen command you can override that "expire" config variable. If you set --timeout flag to be 30, it will expire after 30 seconds.

There is also --tries flag which allows you to set how many times to attempt a job before logging it failed. By default it's set to zero, but you can override it like so:

php artisan queue:listen --tries 3 --timeout 30

Hope that helps.

aarondfrancis's avatar

@toniperic hey there, thanks for the response.

That's exactly my question! Consider the following scenario:

Make sure your db config is set to expire=60 (as in the default config).

Queue a job with the following handler:

var_dump("Hey, processing");
sleep(80);

Start two different listeners with --timeout=100

After 60 seconds, you'll see the db expire the job, the second listener pick it up, and the first listener keep processing away. They'll trade off like this back and forth, processing the same job over and over.

aarondfrancis's avatar

So what I mean is this: the timeouts do not know about each other. The command will keep processing up until the 100 second mark, as specified by --timeout, but the db will expire it and re-release, leading to unexpected outcomes.
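
Put as a quick check, the interaction looks roughly like this. It is only a sketch of the reasoning in this thread: isReleasedWhileRunning is a hypothetical helper, not something Laravel provides, and it assumes another listener happens to poll the queue at the moment the reservation expires.

```php
<?php

/**
 * Hypothetical check (not a Laravel API): can the database driver hand a
 * job to a second listener while the first one is still running it?
 *
 * $expire   - 'expire' from the queue config, in seconds
 * $timeout  - the --timeout passed to queue:listen, in seconds
 * $duration - how long the job actually runs, in seconds
 */
function isReleasedWhileRunning(int $expire, int $timeout, int $duration): bool
{
    // The job process lives for the shorter of its own duration and the timeout.
    $runningFor = min($duration, $timeout);

    // The reservation becomes releasable once it is $expire seconds old.
    return $runningFor >= $expire;
}

var_dump(isReleasedWhileRunning(60, 100, 80));  // expire=60, --timeout=100, sleep(80)
var_dump(isReleasedWhileRunning(120, 100, 80)); // expire raised above the timeout
```

The first call models the scenario above; the second shows the overlap disappearing once expire is larger than anything the job can run for.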

toniperic's avatar

The --timeout flag you specify is the timeout that gets set on the Symfony Process component, located at Symfony\Component\Process\Process. Symfony's Process component checks whether the timeout limit has been exceeded, and throws a ProcessTimedOutException if it has. You can see this in Symfony's docs if you prefer.

The "expire" variable for database queue driver (in the queue.php config file) literally has nothing to do with queue:listen command --timeout flag.

Imagine if it did - what if you used Beanstalkd - now what would the --timeout flag override for Beanstalkd driver? Note that Beanstalkd doesn't have the "expire" config variable like database queue driver does.

My first post might've been misleading, sorry if it was. Hope this helps you get things straight.

aarondfrancis's avatar

I guess the best question is, why would you ever want to expire a job that is still being processed? What's the point of the expire value?

toniperic's avatar

Well, you have to expect the unexpected.

If you are processing a small image on a solid server, you know that in the worst-case scenarios it just shouldn't take longer than 60 seconds. If it is taking longer than that, it means something's rotten here so you'd just mark it as failed job and move on with your other tasks, and then you can manually review what went wrong and fix it.

timhaak's avatar

I think the question isn't why there is a timeout, but why the db doesn't use the timeout given. I.e., if I specify --timeout 600, I would expect it to handle that, or at least to say something like "may not be longer than the db timeout".

trevorg's avatar

It does seem odd that there is both a --timeout flag as well as an expire variable for the database queue driver. What is the purpose of the expire variable? It seems redundant, and can lead to unexpected situations if not properly configured.

If in queue.php I specify the following:

'database' => [
            'driver' => 'database',
            'table' => 'jobs',
            'queue' => 'default',
            'expire' => 3600,
        ],

And then I run two workers with this command: php artisan queue:listen --timeout=30

What happens is the job goes to the first worker, and after 30 seconds it throws the ProcessTimedOutException and quits. HOWEVER, the other worker does not pick it up to retry it, since it respects the expire value in the config. (It should pick it up after 1 hour.)

However, if I set it up like so:

'database' => [
            'driver' => 'database',
            'table' => 'jobs',
            'queue' => 'default',
            'expire' => 60,
        ],

And then I run two workers with this command: php artisan queue:listen --timeout=120

What happens here is at 60 seconds, while the job is still running (and not yet timed out), the second worker will pick up the job and start processing it simultaneously. This can potentially be a very bad scenario where you have the same job processed multiple times.

So it seems like you must set both the expire config setting and the --timeout flag, and keep them consistent, if you are using the database driver.

JonDickson20@gmail.com's avatar

@trevorg Your last post cleared it up for me.

If you're using the database driver, never set your --timeout to a greater value than your 'expire' or you're just asking for trouble.
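
As a sketch of what that could look like for the original poster's 900-second listener (the 960 is just an illustrative value with some headroom, not a recommended number):

```php
<?php

// config/queue.php, trimmed to the relevant connection.
// The 'expire' value is an assumption chosen to sit above --timeout=900.
return [
    'connections' => [
        'database' => [
            'driver' => 'database',
            'table'  => 'jobs',
            'queue'  => 'default',
            'expire' => 960, // longer than the longest --timeout any listener uses
        ],
    ],
];
```

The listener would then be started with php artisan queue:listen --timeout=900, so the process is killed before the db ever considers the reservation expired.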

halexmorph's avatar

@aarondfrancis "I guess the best question is, why would you ever want to expire a job that is still being processed? What's the point of the expire value?" I just tweeted at Taylor Otwell with the same question. The difference is this:

A job that passes its --timeout parameter stops processing and releases the job for another retry (if you allow multiple tries). The expiration is there to clean up any jobs that have hung and never reached that state. So if your job somehow reaches a state where it has completely hung and stopped responding, the --timeout might not function properly and the job will be tied up forever. (I've noticed this happens with Out of Memory errors.) The expiration will release that job so it is either retried or failed.

I hope this helps to clarify why both a --timeout and expiration are useful!

jjudge's avatar

@themsaid we were just talking about contacting you about this, since the documentation leaves out a few important points in understanding how this works. It feels like deep dive stuff.

A problem we have encountered is that when running a listener in daemon mode, the timeout option has a very nasty side-effect. In summary:

  • In daemon mode, the jobs are run in the same process as the listener.
  • If the Process Control PHP extension (pcntl) is installed, then the queue listener will make good use of it to send and receive signals.
  • If the timeout timer goes off in a job, the job will kill itself, with a rather severe -9 aka SIGKILL. This is kind of fine (it's never fine really, but that's the way it is) in non-daemon mode to kill a rogue process that may be eating up resources.
  • In daemon mode, the job killing itself is also killing the listener since they are one and the same process.
  • The listener does not exit cleanly, so is not able to increment the number of tries for the job or mark it as failed; the number of tries does not increase.
  • On restarting, the listener sees the same job on the queue, runs it, and the cycle continues.

So running a 100 second long job in daemon mode (queue:work) and with a default timeout of 60 seconds, means that job gets re-run over and over, potentially blocking the queue if you only have one listener.

A workaround is not to use daemon mode, using queue:listen instead. That runs separate processes for each job, which is cleaner, safer, but involves more resources and a longer startup time.

The laravel queue workers contain some clever stuff, but also some odd behaviour, and some combinations of settings and runtime modes that can get you into an awful lot of trouble. I'm not sure if this is stuff for the main laravel documentation, or best in a deep-dive inner-workings chapter of a book ;-) Happy to help if the latter.
