alnouirah's avatar

Export thousand of records

Hi all.

This code export 150,000 row in 15 seconds.

public function handle(){
$rawQuerey = Transaction::with('source.kyc')
                            ->with('target.kyc')
                            ->with('sourceGroup')
                            ->with('targetGroup')
                            ->with('currency')
                            ->with('transactionType')
                            ->with('model');

 FastExcel::data($this->rowsGenerator($rawQuerey)))->export('public/test.csv');

}

    function rowsGenerator($query) {
        foreach ($query->cursor() as $row) {
            yield $row;
        }
    }
            

but then add this line :

    function rowsGenerator($query) {
        foreach ($query->cursor() as $row) {
            $row->source_name = $row->source->kyc->first_name;
            yield $row;
        }
    }

It takes more than 30 minutes. It turned out that the cursor method does not eager load the relations.so I have a 1+N. for each row, it will make a query to the database and that is what makes it too slow.

However, the query builder's cursor method returns a LazyCollection instance. This allows you to still only run a single query against the database but also only keep one Eloquent model loaded in memory at a time. In this example, the filter callback is not executed until we actually iterate over each user individually, allowing for a drastic reduction in memory usage

so how to retrieve the relations data without impacting the speed of exporting?

0 likes
11 replies
Tray2's avatar

Since you are exporting to csv, why not just let the database handle it?

Here is an example that exports 1.3 million records into a csv, in less than 5 seconds.

 select * from books into outfile '/users/tray2/code/books.csv' fields terminated by ',' enclosed by '"' lines terminated by 'n';
Query OK, 1311006 rows affected, 1 warning (4.426 sec)
1 like
alnouirah's avatar

@Tray2 because I am taking the query from the user when he using the screen. it's not a static SQL query, the user may filter the results and then want to export the result by that filter. I am using laravel eloquent to filter for easy use.

alnouirah's avatar

@Tray2 what do you mean by build the query dynamically can you give an example pleas?

Do you mean that convert ElquentBuilder instanse to a raw querey ?

alnouirah's avatar

@Tray2 I am using the same code snippet for all my application screens. when the use click export button I am sending the Illuminate\Database\Eloquent\Builder to the job , finally get the collections and iterate throw it using laravel cursor method .

i have tried to get the sql command by using toSql() method but i get only the main sql statment without relations and even without the binded barameters.

Tray2's avatar

@alnouirah Basically this

$query = Model::where('this', 'that')->toSQL() .  into outfile <path to file> fields terminated by ',' enclosed by '"' lines terminated by 'n';

DB::statement($query);
2 likes
alnouirah's avatar

@Tray2 ah, that's will bring us back to the first point which is the code snippet for all screens tables. with additional relations based on the user filter.

I have managed to solve the problem by taking the selected ids and then using the query builder and exporting it directly to .xlsx with the same performance. Thanks in advance for your help.

\DB::table('transactions')
->leftJoin('kyc as sender_kyc','sender_kyc.account_id','=','transactions.source_account_id')
->leftJoin('kyc as target_kyc','target_kyc.account_id','=','transactions.target_account_id')
->leftJoin('groups as sender_group','sender_group.id','=','transactions.source_group_id')
->leftJoin('groups as target_group','target_group.id','=','transactions.target_group_id')
->leftJoin('currencies as currency','currency.id','=','transactions.currency_id')
->leftJoin('transaction_types','transaction_types.id','=','transactions.transaction_type_id')
->select([
'transactions.id',
'sender_kyc.family_name as sender_family_name',
'target_kyc.first_name as target_first_name',
'target_kyc.family_name as target_family_name',
'sender_group.name->ar as sender_group_name',
'target_group.name->ar as target_group_name',
'transaction_types.name->ar as transaction_type_name','])
->whereIn('transactions.id', $this->ids);

FastExcel::data($this->rowsGenerator($rawQuerey)))->export('public/test.xlsx');

rodrigo.pedra's avatar
Level 56

Use lazy(), or use lazyById() if you have an id.

Both lazy load by chunks, so the eager load is reduced to the number of chunks.

function rowsGenerator($query) {
    foreach ($query->lazyById() as $row) {
        $row->source_name = $row->source->kyc->first_name;
        yield $row;
    }
}

https://laravel.com/docs/10.x/eloquent#chunking-using-lazy-collections

There is actually a note on the cursor() documentation regarding this:

Since the cursor method only ever holds a single Eloquent model in memory at a time, it cannot eager load relationships. If you need to eager load relationships, consider using the lazy method instead.

https://laravel.com/docs/10.x/eloquent#cursors

1 like
alnouirah's avatar

@rodrigo.pedra That's exactly what I want. wondering how I messed that up from the docs. SIDE NOTE It solved the problem partially with a little slow speed compared with the cursor() method. Raw DB query with the cursor() method is much faster.

rodrigo.pedra's avatar

@alnouirah

Raw DB query with the cursor() method is much faster.

It sure is, as it keeps the underlying PDOStatement open, and then use one record at a time, whereas lazy()/lazyById() is grabbing n-records at a time and loading into memory.

The deal is when you need to eager load any relations. On cursor() it is unfeasible as you would have one record at a time.

On lazy()/lazyById(), they fetch n-records (a thousand by default), hydrate the same number of Eloquent models, run each relation eager load query, hydrate as many results they have, and then handles it to the loop.

It is a trade-off.

  • No relations need to be loaded? Use cursor() any day.
  • Need to eager-load some relations? Default to lazy()/lazyById().
1 like

Please or to participate in this conversation.