Be part of JetBrains PHPverse 2026 on June 9 – a free online event bringing PHP devs worldwide together.

jakubjv's avatar

How can i improve query in Laravel for few milions records?

Hello everyone, I have a question regarding the Laravel query builder. I have a task where I need to retrieve only those purchasegroups from the Purchase model that have a pricelist with ordered set to true and at the same time do not have any pricelist with tax set to null. I return the data through a cron job to the log. The problem, however, is that even after 63 minutes, which is the longest cron I have, the data still doesn't get logged simultaneously. It also happens that a purchase is logged despite the query settings specifying that the pricelist should have ordered as false or tax as null. But main task is, log finally just purchases which hase purchasegroup with pricelists that have differnt tax inside one purchase..For example Purchae has 3 purchasegroups and every purchase group has its own pricelist but every pricelist must have tax = 19 ˇ%, they cant have different taxes inside one purchase..Can someone please try to advise me?

This is my code in cron file which is filtering data.

 public function handlePurchases()
    {
        Log::info('Cron started');
        $purchaseGroupQuery = Purchasegroup::whereHas('purchase')
        ->whereHas('pricelist', function ($query) {
            $query->where('ordered', '=' ,true)->where('tax', '!=', null);
        })
        ->with([
            'pricelist' => function ($query) {
                $query->selectRaw("REPLACE(
                        REPLACE('tax', ',', ''), '.', ''
                    ) as formatted_tax")
                    ->groupBy('formatted_tax');
            }
        ]);



        $printedPurchases = [];

        $purchaseGroupQuery->chunk(1000, function ($chunks) use (&$printedPurchases) {
            foreach ($chunks as $purchaseGroup) {
                $purchaseGroupId = $purchaseGroup->purchase;

                if (!in_array($purchaseGroupId, $printedPurchases)) {
                    $purchaseGroupTaxes = $purchaseGroup->pricelists->pluck('tax')->filter(function ($tax) {
                        return $tax !== null;
                    });

                    if ($purchaseGroupTaxes->isNotEmpty()) {
                        Log::info("purchase: {$purchaseGroupId}", $purchaseGroupTaxes->toArray());
                    }

                    $printedPurchases[] = $purchaseGroupId;
                }
            }
        });

        Log::info('Cron Ended');
    }

This is relation in Purchase model

public function purchasegroups()
	{
		return $this->hasMany(Purchasegroup::class, 'purchase');
	}

Relations in Purchasegroup model

public function purchase()
	{
		return $this->belongsTo(Purchase::class, 'purchase');
	}

public function pricelists()
	{
		return $this->hasMany(Pricelist::class, 'purchasegroup','id');
	}

	public function pricelist()
	{
		return $this->hasOne(Pricelist::class, 'purchasegroup','id')->where("ordered", true);
	}

And relation in Pricelist model

public function purchasegroup()
	{
		return $this->belongsTo(Purchasegroup::class, 'purchasegroup');
	}

I will be glad for every advice!

0 likes
9 replies
Tray2's avatar

the easiest would be to look at the generated SQL and run an explain plan on it. My guess is that you probably need an index or two.

Tray2's avatar

@jakubjv add ->toSQL() on your query, and then you take that query and run it in a database tool with the words EXPLAIN or EXPLAIN PLAN in from of it.

EXPLAIN SELECT * FROM table WHERE col = 'Find Me' ORDER BY created_at;
1 like
jakubjv's avatar

@Tray2 okey so this is result ->

[2024-01-29 11:05:38] local.INFO: Explain Plan: {"sql":"select * from \"purchasegroup\" where exists (select * from \"purchase\" where \"purchasegroup\".\"purchase\" = \"purchase\".\"id\") and exists (select * from \"pricelist\" where \"purchasegroup\".\"id\" = \"pricelist\".\"purchasegroup\" and \"ordered\" = ? and \"tax\" is not null and \"ordered\" = ?)","explain":[{"stdClass":{"QUERY PLAN":"Result  (cost=0.00..0.00 rows=0 width=6728)"}},{"stdClass":{"QUERY PLAN":"  One-Time Filter: false"}}]} 
Tray2's avatar

@jakubjv The result of your explain should look something like this

MariaDB [foreign_keys]> EXPLAIN SELECT authors.name, books.title FROM books JOIN authors  ON books.author_id = authors.id;
+------+-------------+---------+--------+---------------+---------+---------+------------------------------+-------+-------+
| id   | select_type | table   | type   | possible_keys | key     | key_len | ref                          | rows  | Extra |
+------+-------------+---------+--------+---------------+---------+---------+------------------------------+-------+-------+
|    1 | SIMPLE      | books   | ALL    | NULL          | NULL    | NULL    | NULL                         | 10000 |       |
|    1 | SIMPLE      | authors | eq_ref | PRIMARY       | PRIMARY | 8       | foreign_keys.books.author_id | 1     |       |
+------+-------------+---------+--------+---------------+---------+---------+------------------------------+-------+-------+
2 rows in set (0,000 sec)
1 like
jakubjv's avatar

@Tray2 i did update it and now it showing somethil like this

[2024-01-29 13:16:33] local.INFO: Explain Plan: {"sql":"select * from \"purchasegroup\" where exists (select * from \"purchase\" where \"purchasegroup\".\"purchase\" = \"purchase\".\"id\") and exists (select * from \"pricelist\" where \"purchasegroup\".\"id\" = \"pricelist\".\"purchasegroup\" and \"ordered\" = ? and \"tax\" is not null and \"ordered\" = ?)","explain":[{"stdClass":{"QUERY PLAN":"Gather  (cost=17160.39..38727.25 rows=47718 width=6728)"}},{"stdClass":{"QUERY PLAN":"  Workers Planned: 2"}},{"stdClass":{"QUERY PLAN":"  ->  Parallel Hash Join  (cost=16160.39..32955.45 rows=19882 width=6728)"}},{"stdClass":{"QUERY PLAN":"        Hash Cond: (purchasegroup.purchase = purchase.id)"}},{"stdClass":{"QUERY PLAN":"        ->  Parallel Hash Semi Join  (cost=10579.22..27322.09 rows=19882 width=6728)"}},{"stdClass":{"QUERY PLAN":"              Hash Cond: (purchasegroup.id = pricelist.purchasegroup)"}},{"stdClass":{"QUERY PLAN":"              ->  Parallel Seq Scan on purchasegroup  (cost=0.00..15240.70 rows=232370 width=6728)"}},{"stdClass":{"QUERY PLAN":"              ->  Parallel Hash  (cost=10330.66..10330.66 rows=19885 width=4)"}},{"stdClass":{"QUERY PLAN":"                    ->  Parallel Seq Scan on pricelist  (cost=0.00..10330.66 rows=19885 width=4)"}},{"stdClass":{"QUERY PLAN":"                          Filter: (ordered AND (tax IS NOT NULL) AND ordered)"}},{"stdClass":{"QUERY PLAN":"        ->  Parallel Hash  (cost=4695.52..4695.52 rows=70852 width=4)"}},{"stdClass":{"QUERY PLAN":"              ->  Parallel Seq Scan on purchase  (cost=0.00..4695.52 rows=70852 width=4)"}}]} 

Tray2's avatar

@jakubjv Look at the rows for each part of the query, you can see that you look at a shitload of records, that is usually a sign that you need an index on the column in question.

Please don't show us the plan in json, it makes it hard to read.

Snapey's avatar
  • use correct indexes
  • eager load related models, eg purchaseGroup
krisi_gjika's avatar

you pluck a tax column from pricelists that you don't select in your query

Please or to participate in this conversation.