
soham-laravel-dev:

Facing a performance issue while custom exporting 20M records

Community, I am facing a performance issue. I am running a custom export of around 20M records and it takes too much time, one or two days. For the custom export I already use a queued job with pagination, exporting around 10k records per page and appending the data to the same Excel file.

So, can anyone suggest how to manage an export of 20M records within a reasonable time?

JussiMannisto:

The maximum number of rows for a modern Excel file is around 1 million per sheet. You can't have more than that.

I wouldn't even try exporting 20M records as an Excel file — it's just not the right format for it. I'd export it as CSV.

You should also show your code. You didn't even mention which library you're using.

You can't easily "append" data to an Excel file. An Excel file is just a bunch of XML files in a zip archive, and you have to parse the entire sheet before you can write anything to it. If you keep restarting this process or don't use any kind of chunked writing, then it's going to be very slow with that many records.
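
For comparison, appending rows to a CSV is trivial: you just open the file in append mode and write. A rough sketch (the column names and $records are made up for illustration):

$file = fopen(storage_path('app/export.csv'), 'a');

foreach ($records as $record) {
	// $records here is whatever batch you fetched; the columns are placeholders.
	fputcsv($file, [$record->id, $record->tracking_id]);
}

fclose($file);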

soham-laravel-dev:

@JussiMannisto, let me share my code and the process I am following for the export.

Here is my code:

class CustomExportJob implements ShouldQueue
{
use Queueable, CommonTraits;

protected $fileName;
protected $dir;
protected $order;
protected $search;
protected $totalPages;
protected $currentPage;
protected $perPage;
protected $exportColumns;

public function __construct(
	$fileName,
	$dir,
	$order,
	$search,
	$totalPages,
	$currentPage,
	$perPage,
	$exportColumns
) {

	$this->fileName = $fileName;
	$this->dir =  $dir;
	$this->order = $order;
	$this->search = $search;
	$this->totalPages = $totalPages;
	$this->currentPage =  $currentPage;
	$this->perPage = $perPage;
	$this->exportColumns = $exportColumns;
}

public function handle()
{
	try {

		$directoryName = $this->exportFileDirectory('shipment_excel') ?? 'excel-export';
		$storagePath = 'app/public/' . $directoryName . '/';
		$csvFilePath = storage_path($storagePath . pathinfo($this->fileName, PATHINFO_FILENAME) . '.csv');
		$directory = dirname($csvFilePath);
		if (!File::exists($directory)) {
			File::makeDirectory($directory, 0755, true);
		}
		$columns = $this->exportColumns;

		$data = $this->Query(
			$this->currentPage,
			$this->order,
			$this->dir,
			$this->perPage
		);

		$shipments = collect($data->items());
		if (!$shipments->isEmpty()) {
			$fileHandle = fopen($csvFilePath, 'a');

			$shipments->chunk(2500)->each(function ($shipmentChunk) use ($fileHandle) {
				foreach ($shipmentChunk as $shipmentRecord) {
					$row = [];
					$row[] = $shipmentRecord->id ?? '';
					$row[] = $shipmentRecord->tracking_id ?? '';
					fputcsv($fileHandle, $row);
					fflush($fileHandle);
				}
			});
			fclose($fileHandle);

			if (intval($this->currentPage) < intval($this->totalPages)) {
				$this->updateCurrentPage($this->currentPage);
				$this->currentPage++;
				$delay = $this->currentPage % 5 === 0 ? 3 : 2;
				gc_collect_cycles();
				CustomExportJob::dispatch(
					$this->fileName,
					$this->dir,
					$this->order,
					$this->search,
					$this->totalPages,
					$this->currentPage,
					$this->perPage,
					$this->exportColumns
				)->onQueue('default')->delay(now()->addSeconds($delay));
			} elseif (intval($this->currentPage) == intval($this->totalPages)) {
				$this->updateCurrentPage($this->currentPage);
			}
		}

	} catch (Exception $e) {
		$this->updateFailedStatus();
		Log::channel('errorLog')->info(" EXCEPTION ", ['error' => $e->getMessage()]);
	}
}

}

JussiMannisto:

@soham-laravel-dev

This job looks bizarre. It's a CustomExportJob that dispatches other CustomExportJobs from a loop. That complicates things without performance benefits.

I would've just used a chunked database query and written the results to a file. The whole process should be very simple.

I'm not talking about chunking the result collections like you're doing — I'm talking about using the chunking methods on the query builder, which run separate database queries. Chunking a collection doesn't save any memory because you've already retrieved the rows from the database.

This is what the entire functional part of the job might look like if it were written by me:

$this->buildQuery()->chunkById(1000, function (Collection $results) use ($file) {
	foreach ($results as $result) {
		fputcsv($file, $this->toCsvColumns($result), ';', escape: '');
	}
});

Where buildQuery() constructs the Eloquent query builder and toCsvColumns() converts a result row to CSV column values. This would run as fast as the query returns results, and wouldn't exhaust memory because only 1000 results are in memory at any time.
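
For example, those two helpers could be as simple as this (just a sketch; the Shipment model and the selected columns are placeholders for whatever you actually export):

protected function buildQuery()
{
	// Select only what the CSV needs; chunkById pages on the id column itself.
	return Shipment::query()->select(['id', 'tracking_id', 'created_at']);
}

protected function toCsvColumns($shipment): array
{
	return [
		$shipment->id,
		$shipment->tracking_id ?? '',
		$shipment->created_at,
	];
}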

Also, if there was an exception, I'd let it bubble out and let the exception handler take care of it, so that it gets reported to the team and the bugs get fixed. You're stripping away all useful information from the exception, such as the stack trace, filename, line numbers, etc. You could do status updates in the failed() method of the job.
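
Something roughly like this, reusing the status update you already have:

public function failed(Throwable $exception): void
{
	// Called once the job has permanently failed; the exception itself is still
	// reported through the normal exception handler.
	$this->updateFailedStatus();
}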

Tray2:

For what reason are you exporting that many records?

VladPtashnyk06:

@Tray2 I need to export product analytics to a CSV file. I would be very grateful if you could take a look. I have the code there.

Name of post: laravel-job-for-exporting-millions-of-records-to-csv-how-to-optimize-and-avoid-memory-errors

soham-laravel-dev:

@Tray2 I am working on a product that has a lot of records, and all of them are crucial, so every week or month the whole dataset is exported as a backup whenever it is required, with date filtering options. But the current logic takes too much time as a background export process, and as far as I know there should be room to optimize this performance.

VladPtashnyk06:

I also have a problem with memory overflow. Although I initially used ->cursor(), I now use ->toBase()->cursor(). But I still get a memory overflow error at around 5 million records.

$query = ProductAnalytic::query()
    ->join('products as p', 'product_analytics.product_id', '=', 'p.id')
    ->leftJoin('categories as c', 'p.category_id', '=', 'c.id')
    ->select([
        'product_analytics.product_id as product_id',
        'product_analytics.views as views',
        'product_analytics.clicks as clicks',
        'product_analytics.buy as buy',
        'product_analytics.created_at as created_at',
        'p.title as product_title',
        'p.code as product_code',
        'p.category_id',
    ])
    ->selectRaw("JSON_UNQUOTE(JSON_EXTRACT(c.title, '$.uk')) as category_title_uk")
    ->whereBetween('product_analytics.created_at', [
        Carbon::parse($this->startDate)->startOfDay(),
        Carbon::parse($this->endDate)->endOfDay(),
    ])
    ->orderBy('product_analytics.id');

foreach ($query->toBase()->cursor() as $row) {
    $csv->insertOne([
        $row->product_id,
        $row->product_title,
        $row->product_code,
        $row->category_id,
        $row->category_title_uk,
        $row->views ?? '0',
        $row->clicks ?? '0',
        $row->buy ?? '0',
        $row->created_at,
    ]);

    fflush($stream);
}
JussiMannisto:

Since you're exporting to a CSV file, writing shouldn't be an issue. The limiting factor should be the speed of the DB queries.

I wouldn't mess around with any spreadsheet libraries. I'd just open a file handle with fopen and write to it using fputcsv, e.g.:

$file = fopen('example.csv', 'w');

$row = ['foo', 'bar', 'baz'];

fputcsv($file, $row, ';', escape: '');

I'd also fetch the results in chunks of 1000 rows or so, then write them out. That way you don't get memory issues or run any super long queries.

ProductAnalytic::query()
	...
	->chunkById(1000, function ($rows) use ($file) {
		foreach ($rows as $row) {
			fputcsv($file, $row, ';', escape: '');
		}
	});

But you have to select product_analytics.id in the query for that to work.
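
For your query that would look roughly like this (only the relevant parts shown; the qualified column and the alias arguments are needed because of the joins):

ProductAnalytic::query()
	// ... your joins, selects and whereBetween ...
	->addSelect('product_analytics.id')
	->chunkById(1000, function ($rows) use ($file) {
		foreach ($rows as $row) {
			fputcsv($file, [$row->product_id, $row->views ?? 0], ';', escape: '');
		}
	}, 'product_analytics.id', 'id');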

P.S. You should format any code blocks you post using three backticks (```). Right now the query is unreadable.

jlrdw:

Why would you need to export them if they are already in a database?

If all you need is a report, you could do some summary data, monthly, quarterly, etc.
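
For example, a monthly summary is a single aggregate query instead of a multi-million-row export. A sketch, assuming MySQL and a shipments table with a created_at column:

$monthly = DB::table('shipments')
	->selectRaw("DATE_FORMAT(created_at, '%Y-%m') as month, COUNT(*) as total")
	->groupBy('month')
	->orderBy('month')
	->get();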

soham-laravel-dev:

@jlrdw Yes, it's already in a database, but in my scenario I am using multiple databases. All the data is relational across those databases, so I need to export the matching records as required, and it's a lot, but the requirement is that the process has to handle this huge data export.

VladPtashnyk06:

This is how it turned out for me, processing millions of records.

    $batchSize = 1000;
    $lastId = 0;

    do {
        $batch = \DB::table('product_analytics')
            ->select('id', 'product_id', 'views', 'clicks', 'buy', 'created_at')
            ->where('created_at', '>=', $start)
            ->where('created_at', '<=', $end)
            ->where('id', '>', $lastId)
            ->orderBy('id')
            ->limit($batchSize)
            ->get();

        foreach ($batch as $row) {
            fputcsv($stream, [
                $row->product_id,
                $row->views ?? 0,
                $row->clicks ?? 0,
                $row->buy ?? 0,
                $row->created_at,
            ]);

            $lastId = $row->id;
        }

        fflush($stream);

    } while ($batch->count() === $batchSize);
VladPtashnyk06:

@Glukinho I don't know why, but my chunk() call kept consuming memory, even though in theory it only takes 1,000 records at a time.

VladPtashnyk06:

I also tried cursor(), but that didn't help either. That one fetches one record at a time, if I'm not mistaken. If I am mistaken, please correct me.
