Be part of JetBrains PHPverse 2026 on June 9 – a free online event bringing PHP devs worldwide together.

jonathan1's avatar

Max Limit from PDO for Related Models

I'm using the queue manager to manage my API calls to Amazon. So I'm queueing up 10 SKUs at a time for each job. However, I ran into a hard limit of MySQL and PDO, I think I recall seeing it's something like 65k rows (don't quote me :) I have 85k products in my Product model's table right now.

That said, I'm using the below code to query on my main model, Product, based on the ProductDetails model which stores some data I need to find the right SKU, which belongs to the Product model.

I plan to have millions of products in my database and I want to get the prices updated every day, so I'll have to queue up these products everyday.

Here's the error I ran into:

Based on what I've read for similar issues, I think I need to chunk the results from Eloquent -- probably in batches of 60k -- and then run the foreach to chunk those into jobs, but I tried chunking the eloquent results and no go.

Here is my working code before the volume of products.

    public function handle()
    {
        $start = microtime(true);

        $builder = ProductDetail::with('product')
            ->where('vendor_id','1');
        $productDetails = $builder->get();

        echo "\n".$builder->count() . " products added.\n\n";
        echo $builder->toSql()."\n\n";

        $products = $productDetails->map(function($p){
            return $p->product;
        })->chunk(config('services.amazonapi.max_results'));
        $now = \Carbon\Carbon::now();
        foreach($products as $chunk){
            AmazonProductLookup::dispatch($chunk)->onQueue('amazon_product_lookups')->delay($now->addSecond());
        }
        $end = microtime(true);
        $this->info("Took ".($end - $start) ." seconds to complete\n");
    }

Any help is GREATLY appreciated, TIA!

0 likes
8 replies
bobbybouwmann's avatar
Level 88

The best way to get around the limit is using chunks. I see you're already doing that but only on the collection and not the query. Instead you can do something like this:

$productDetails = ProductDetail::with('product')->where('vendor_id','1')->chunk(200, function ($productDetail) {
    // Do your amazon product lookup here
});

See: https://laravel.com/docs/5.5/eloquent#chunking-results

1 like
jonathan1's avatar

Thanks @bobbybouwmann!

I've updated to use that method. The next issue I'm running into is that I now see 88k jobs queued, whereas before I would see #of products / amazon chunk = jobs, so 88k / 10 I should see 8k jobs.


        ProductDetail::with('product')
            ->where('vendor_id','1')
            ->where('detail_type','price_updated_at')
            ->chunk(1000, function($p) {
                $p->map(function($m){
                    return $m->product;
                })->chunk(config('services.amazonapi.max_results'));
                $now = \Carbon\Carbon::now();
                foreach($p as $c){
                    AmazonProductLookup::dispatch($c)->onQueue('amazon_product_lookups')->delay($now->addSecond());
                }
            });

Additionally, it's passing different information to the Redis queue manager now so when I run my queue:work, it's not processing the same way it was before. I think something with the map() and product there.

TIA!

bobbybouwmann's avatar

You are chunking the collection twice. You don't have to do that anymore. Something like this should do I believe.

 ProductDetail::with('product')
    ->where('vendor_id','1')
    ->where('detail_type','price_updated_at')
    ->chunk(1000, function($productDetails) {
        $now = \Carbon\Carbon::now();
        $products = $productDetails->map(function($productDetail){
            return $productDetail->product;
        });

        foreach($products as $product){
            AmazonProductLookup::dispatch($c)
                ->onQueue('amazon_product_lookups')
                ->delay($now->addSecond());
        }
    });
jonathan1's avatar

Simple fix!

        ProductDetail::with('product')
            ->where('vendor_id','1')
            ->where('detail_type','price_updated_at')
            ->chunk(1000, function($p) {
                $n = $p->map(function($m){
                    return $m->product;
                })->chunk(config('services.amazonapi.max_results'));
//                dd($p);
                $now = \Carbon\Carbon::now();
                foreach($n as $c){
                    AmazonProductLookup::dispatch($c)->onQueue('amazon_product_lookups')->delay($now->addSecond());
                }
            });

So changed the $n for a new variable and ran $n instead of $p

Worked like a charm. I see 8800 jobs in the queue manager and running queue:work runs properly!

jonathan1's avatar

@bobbybouwmann well I still need to chunk the amazon results because I want 10 SKUs per job because that's what Amazon allows for. The code above worked for me.

My last question, given I assume I'll have millions of rows soon, should I be using large chunks or small chunks from the query builder? 1000 vs 60000?

bobbybouwmann's avatar

The chunk size doesn't matter that much. The documentation uses 200, but I would personally go for a 1000. Not really a specific reason. With a 1000 you keep the memory usage and the query time low :D

jonathan1's avatar

@bobbybouwmann great, thanks.

Ok one more question ;)

I read something about paging and when the results get updated.

So I'm querying on ->where('detail_type','price_updated_at') and going to query where detail_value != curdate();

I have supervisord running that will pickup the jobs and run them immediately. Will it cause any issues that the price_updated_at is being updated while it's querying?

So in parallel, it will be querying and processing the same data set.

Thanks!

bobbybouwmann's avatar

The chunk method uses limit and offset to query the results. So something like pagination is used to actually query the results. I think the timestamp shouldn't really be an issue. Otherwise you need to query for something that is smaller than the current time. Only fetch the results that are not updated yet.

The chunk method will keep running until it can't find any new items that match the query, so when you do something with a date field it will keep looking for items that do that match your where statements.

Please or to participate in this conversation.