pilat's avatar
Level 41

Is there a way to add jobs to a batch one by one or in chunks?

Hi, I'm thinking of creating a batch of jobs containing multiple items. But while tinkering with the $batch->add() method, I've realized that it accumulates all the jobs in RAM and then tries to write them all to the database with a single SQL query… Is there a way to add jobs one by one or, better, in manageable chunks?

I'm thinking of something like:

$batch = Bus::batch([]);
foreach (File::chunk(50) as $chunk) {
    foreach ($chunk as $file) {
        $batch->add(new MyJob($file));
    }
    $batch->PERSIST_THIS_CHUNK();
}

$batch->dispatch();
0 likes
14 replies
Glukinho's avatar

You can add jobs to an already dispatched batch:

$pending_batch = Bus::batch([
	new App\Jobs\TestJob(1), 
	new App\Jobs\TestJob(2), 
	new App\Jobs\TestJob(3), 
]);

$dispatched_batch = $pending_batch->dispatch();

$dispatched_batch->add(new App\Jobs\TestJob(4));
$dispatched_batch->add(new App\Jobs\TestJob(5));
$dispatched_batch->add(new App\Jobs\TestJob(6));

Be aware that you should call ->add() on the dispatched batch object, not the pending one! Otherwise the new jobs will not be queued.

// TestJob

public function __construct(private int $counter) { }

public function handle(): void
{
    Log::info('dispatched ' . $this::class . ', counter: ' . $this->counter);
}

in log:

[2025-09-02 11:11:43] local.INFO: dispatched App\Jobs\TestJob, counter: 1  
[2025-09-02 11:11:43] local.INFO: dispatched App\Jobs\TestJob, counter: 2  
[2025-09-02 11:11:44] local.INFO: dispatched App\Jobs\TestJob, counter: 3  
[2025-09-02 11:12:16] local.INFO: dispatched App\Jobs\TestJob, counter: 4  
[2025-09-02 11:12:16] local.INFO: dispatched App\Jobs\TestJob, counter: 5  
[2025-09-02 11:12:17] local.INFO: dispatched App\Jobs\TestJob, counter: 6 
1 like
pilat's avatar
Level 41

@Glukinho thank you! Here's what I've tried:

$files = File::take(10)->get();
$batch = Bus::batch([]);
foreach ($files->chunk(5) as $chunk) {
    $jobs = $chunk->map(fn ($file) => new MyJob($file))->all();

    $batch->add($jobs)->dispatch();
}

Expected behavior: it adds 10 jobs total (5 + 5, one SQL request per chunk).

Observed behavior: it adds 15 jobs total (5 + 10).

pilat's avatar
Level 41

@Glukinho In your example, I see three more jobs ->add()ed after dispatching. But ->add() itself does not persist anything to the database, as far as I understand… so when exactly are they persisted?

Glukinho's avatar

@pilat again:

$pending_batch = Bus::batch([]); // returns Illuminate\Bus\PendingBatch object
$pending_batch->add(new TestJob); // job is appended to pending batch and will be actually queued after ->dispatch() execution

$dispatched_batch = $pending_batch->dispatch(); // returns Illuminate\Bus\Batch object
$dispatched_batch->add(new TestJob); // job is actually queued to database

You need to add jobs to a dispatched batch, AFTER you executed ->dispatch().

$files = File::take(10)->get();

$batch = Bus::batch([])->dispatch();  // dispatch empty batch

foreach ($files->chunk(5) as $chunk) {
    $jobs = $chunk->map(fn ($file) => new MyJob($file))->all();

    $batch->add($jobs); // no need to ->dispatch() here, the batch is already dispatched, jobs go right to the queue as soon as they are added to the batch
}
pilat's avatar
Level 41

@Glukinho Ah, I see. I need to use a separate $dispatched_batch variable to add more jobs to the batch. This way I can see all the jobs actually added to the database in the proper amount, but the script itself crashes with an "Allowed memory size of 134217728 bytes exhausted" error.

Looks like it still tries to load all the jobs into memory at some moment of time :(

pilat's avatar
Level 41

@Glukinho

Here's what I've tried in Tinkerwell:

$s = FilesSession::first();

$batch = Bus::batch([]);
$files = $s->files()->notUploaded();
$isFirst = true;
$dispatchedBatch = null;
foreach ($files->lazyById(100) as $file) { // I've tried ->cursor() as well -- still not enough memory :(
    if ($isFirst) {
        $batch->add(new UploadOneFileToStorageJob($file));
        $dispatchedBatch = $batch->dispatch();
    } else {
        $dispatchedBatch->add(new UploadOneFileToStorageJob($file));
    }

    $isFirst = false;
}

This code runs for some time, and there are jobs in the database (as expected) as a result. But in the end it fatals because of the memory limit. Not sure where the leak is exactly…

Glukinho's avatar

@pilat Show the actual code, and the job class too. Are you putting the file content into the job, or just the file path?

I have just added 100,000 jobs to the queue from tinker the way I showed; 100,000 jobs showed up in the database jobs table, and memory consumption was about 16 MB the whole time:

> $batch = Bus::batch([])->dispatch()
> for ($i = 0; $i < 100_000; $i++) { $batch->add(new App\Jobs\TestJob(0)); echo "$i\n"; }
1
2
3
...
Glukinho's avatar

@pilat DON'T call ->dispatch() every time in the loop. Call ->dispatch() once, on an empty batch, before the loop:

$batch = Bus::batch([])->dispatch();

Then, in the loop, just add jobs to the batch. The batch is already dispatched, so each job goes to the queue immediately and is not collected in the batch object (so memory is not consumed):

foreach (...) {
	$batch->add(new UploadOneFileToStorageJob($file));

	// and DON'T call any ->dispatch() in a loop
}
Glukinho's avatar

@pilat

But ->add() itself does not persist anything to the database, as far as I understand

If the batch is pending (->dispatch() wasn't executed), yes. On a dispatched batch (one on which ->dispatch() was executed), no: the job is persisted to the queue (the database, in your case).

pilat's avatar
Level 41

@Glukinho Every $file is an instance of an Eloquent model. No actual file content in it. And I'm not running any queue workers at this stage, just trying to find a way to schedule a big batch :)

Glukinho's avatar

@pilat Ok, do as I wrote before; it will work.

Update: the queue worker is not related; it can be running or stopped, it doesn't matter.

pilat's avatar
Level 41

@Glukinho Considering that it actually adds all of the jobs to the DB, I assume it was something in Tinkerwell that ran out of memory. In particular, I can see that it tries to collect all the queries, even though I've specifically chosen to use CLI Output, not the Detail Dive pane…

...

Yup, if I run the same code in plain old php artisan tinker, it does not fail.

Thank you!!

SUMMARY:

  1. When dispatching a batch for the first time (with a single job or a manageable number of jobs), assign the result to a $dispatchedBatch variable: $dispatchedBatch = Bus::batch($firstChunkOfJobs)->dispatch();
  2. Add further jobs to the DISPATCHED batch, not the PENDING one: $dispatchedBatch->add($extraJobOrJobs);
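Putting the thread together, here is a minimal sketch of the memory-friendly pattern (File and MyJob stand in for your own model and job classes; it assumes a Laravel app with the database queue and batches table set up):

```php
use Illuminate\Support\Facades\Bus;

// Dispatch an empty batch once, BEFORE the loop. This returns an
// Illuminate\Bus\Batch (dispatched) object, not a PendingBatch.
$batch = Bus::batch([])->dispatch();

// Iterate the models lazily so only one chunk of models is in memory
// at a time. Because the batch is already dispatched, every ->add()
// persists the job to the queue immediately instead of accumulating
// it inside the batch object.
File::query()->lazyById(100)->each(function ($file) use ($batch) {
    $batch->add(new MyJob($file));
});
```

The key point from the discussion: ->add() on a PendingBatch only buffers jobs in memory, while ->add() on a dispatched Batch writes them to the queue right away, which is what keeps memory flat.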
Snapey's avatar

are you sure you are not passing too much data into the job?
