Be part of JetBrains PHPverse 2026 on June 9 – a free online event bringing PHP devs worldwide together.

thebigk's avatar
Level 13

GroupBy and Chunk - How do I go about it?

I've an intermediate table that maintains the performances of the students in the tests. The table structure is as follows -

user_id | test_id | question_id | marks | ....

1 | 1 | 3 | 23
1 | 1 | 4 | 34
1 | 1 | 5 | 22
2 | 1 | 3 | 23
2 | 1 | 4 | 34
2 | 1 | 5 | 22

I wish to calculate the total marks obtained by each student for given test. I'm therefore looking at grouping the users by user_id so that I can process individual student's performance like total marks, total time taken etc.

I'm wondering how do I go about it. The following throws error -

DB::table('performances')->orderBy('user_id')->groupBy('user_id')->chunk(100, function($scores) {

    foreach($scores as $score) {
        // Do some calculation here
    }   

});

My guess is that groupBy isn't working as I expect it. So far, I've only used it after ->get() that is when all the records are retrieved from the database.

In this case, the number of rows is very large; and I must figure out a way to use chunk. How do I go about using groupBy and chunk together?

0 likes
6 replies
thebigk's avatar
Level 13

Update

I found out that my approach with groupBy does not work if MySQL is running with strict => true in the config/database.php

So it's all about the strict model of MySql. I'm not in favor of turning it off just for the sake of having my approach works.

The problem however, is that the dataset I have is huge! It'll be impractical to use chunk() on the collection retrieved.

I need a mechanism to chunk the results by user_id so that I can process all the rows for each of the user.

Is that even possible?

jlrdw's avatar

Group by transverses the whole recordset anyway. If there is a way to avoid it do it.

I would usually reserve group by for things like monthly reports.

But you could use the length aware paginator with group by.

https://laracasts.com/discuss/channels/guides/length-aware-paginator

You could also "query a query" and have several smaller group by's. Point is there are many ways to do this.

The most efficient is, if a big dataset is of course dumping to local, and via odbc using a desktop database to generate reports and such.

I in the past have used Visual Foxpro and later MS Access.

thebigk's avatar
Level 13

Thank you @jlrdw . The above code is meant to run inside a job. So far, using pagination seems to be the only way to go.

Also, I cannot process this offline because the job must run immediately after each test.

But that said, I must find a way to group the data set based on user_id; because even if I'm using a paginator; it won't know if all of user X's rows have been fetched. In my case; each user may not have fixed number of rows.

jlrdw's avatar

I normally use group by like: see this

https://laracasts.com/discuss/channels/laravel/groupby-table-values-in-eloquent

And it's okay to turn strict off.

A regular group by is something like:

$sql = "SELECT powners.ownerid, powners.oname, ";
          $sql .= "COUNT(pets.petid) AS CountOfpetid ";
          $sql .= "FROM powners LEFT JOIN pets ON ";
          $sql .= "powners.ownerid = pets.ownerid ";
          $sql .= "GROUP BY powners.ownerid ";
          $sql .= "ORDER BY powners.oname";

Using PDO direct you can turn off full group by for just that connection instance:

$conn = new PDO("mysql:host=$host;dbname=$db", $user, $pass, array(PDO::MYSQL_ATTR_INIT_COMMAND => 'SET sql_mode="STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION"'));

Is a work a round.

Me I just set strict to false.

oroalej's avatar

Why not start your query in user table?

Like this


  $result = User::with('performances')
        ->get()
        ->each(function($student, $key) {
        // $total_mark = $student->performances->sum('marks') / $student->performances->count();
            // Do some of your config.
        });

In the each() , when you do $student->performances all the test of the user will be listed.

thebigk's avatar
Level 13

@oroalej - My guess is that get() would try to fetch all the records from the database and each would work on the results that have been fetched.

Or will it make queries in batches?

Please or to participate in this conversation.