floryan's avatar
Level 13

Weird memory leak

In this code each Item::create increases memory usage. Usage without load is 31 MB, but after 8k item creations the memory usage is already 60 MB. That is a huge problem for me as I need to process over 700k items.

saveQuietly() doesn't help, I cannot use the insert method, and I have Telescope disabled.

What can I do?

$path = Storage::path("data");

$file = fopen($path, "r");

while ($line = fgets($file)) {
  $line = json_decode(trim(trim($line), ","));

  Item::create([
    'a' => $line->b,
    'c' => $line->d,
    'e' => $line->f,
  ]);
}

fclose($file);
LaryAI's avatar
Level 58

It seems like you're experiencing a memory leak while processing a large number of items in a loop. This can happen in PHP scripts that run for a long time and deal with a lot of data. Here are a few suggestions to help you manage memory usage:

  1. Disable Eloquent Events: If you don't need model events, you can disable them temporarily which can save memory.

  2. Use unset: After each iteration, unset variables that are no longer needed.

  3. PHP's Garbage Collection: Force garbage collection after a certain number of iterations.

  4. Chunking: Read the file incrementally instead of loading it all at once. Your fgets() loop already does this, so the file read itself is unlikely to be the cause.

  5. Optimize JSON decoding: If your JSON structure is simple, consider writing a custom parser.

  6. Use Database Transactions: If applicable, use transactions to reduce the overhead of database operations.

Here's an example of how you might modify your code to incorporate some of these suggestions:

use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Storage;

$path = Storage::path("data");
$file = fopen($path, "r");

DB::beginTransaction();

$counter = 0;
$batchSize = 1000; // Adjust the batch size as needed

while ($line = fgets($file)) {
    $line = json_decode(trim(trim($line), ","), true);

    if ($line) {
        Item::create([
            'a' => $line['b'],
            'c' => $line['d'],
            'e' => $line['f'],
        ]);
    }

    $counter++;

    if ($counter % $batchSize === 0) {
        DB::commit();
        DB::beginTransaction();
        gc_collect_cycles(); // Force garbage collection
    }

    unset($line); // Unset the variable to free up memory
}

DB::commit(); // Commit the remaining transactions
fclose($file);

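In this example, I've added a transaction block that commits every 1000 inserts. This keeps each transaction small and avoids a separate commit per insert, but it does not by itself free PHP memory. Laravel's query log is off by default; if it has been turned on with DB::enableQueryLog(), it grows with every query until you call DB::flushQueryLog(). Additionally, I've added a call to gc_collect_cycles() to force garbage collection after each batch.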

Remember to adjust the $batchSize to a number that works well with your environment. A larger batch size may be faster but use more memory, while a smaller batch size may be slower but use less memory.

If you're still experiencing memory issues, you may need to look into optimizing your server configuration or using a queue system to process the items in smaller batches over time.

gych's avatar

@floryan It's because of your while loop. It's better to call a separate function within the while loop: this forces a scope change after each iteration, so the variables created for that line are released and garbage collection can kick in.

Small example:

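while ($line = fgets($file)) {
    handleData($line);
}

// Everything allocated for one line goes out of scope when this function returns.
function handleData(string $line): void
{
    $data = json_decode(trim(trim($line), ","));

    Item::create([
        'a' => $data->b,
        'c' => $data->d,
        'e' => $data->f,
    ]);
}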
JussiMannisto's avatar

Is every line a JSON document? json_decode is not very memory-efficient, and I've sometimes had to parse big JSON docs manually rather than rely on it. But how long are your lines?

If this is really everything your code does, it shouldn't leak any memory unless the Item model has some weird lifecycle hooks. Have you confirmed that the memory grows beyond 60 MB? It can take a while for garbage collection to kick in.
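One way to check whether memory really keeps growing is to sample it at fixed intervals while the loop runs. The following is a minimal sketch, not the poster's code: sampleMemory() is a hypothetical helper, and the workload is a stand-in that only decodes JSON. In the real import the callback would wrap the Item::create call.

```php
<?php

/**
 * Run $work once per item and record memory usage every $every items.
 * Returns a list of [itemsProcessed, bytesInUse] pairs.
 */
function sampleMemory(iterable $items, callable $work, int $every = 1000): array
{
    $samples = [];
    $count = 0;

    foreach ($items as $item) {
        $work($item);
        $count++;

        if ($count % $every === 0) {
            // false = memory used by the script, not the amount reserved from the OS
            $samples[] = [$count, memory_get_usage(false)];
        }
    }

    return $samples;
}

// Stand-in workload (assumption): one small JSON object per line, trailing comma included.
$lines = array_fill(0, 5000, '{"b":1,"d":2,"f":3},');
$samples = sampleMemory(
    $lines,
    fn (string $line) => json_decode(trim(trim($line), ',')),
    1000
);

foreach ($samples as [$n, $bytes]) {
    printf("%6d items: %.2f MB\n", $n, $bytes / 1048576);
}
```

If the numbers level off after a while, the growth is probably uncollected garbage rather than a true leak; if they climb steadily over the whole run, something is holding references.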

Snapey's avatar

I would be inclined to change it to DB inserts so that the impact of Eloquent can be assessed:

  DB::table('items')->insert([
    'a' => $line->b,
    'c' => $line->d,
    'e' => $line->f,
  ]);
wafto's avatar

Same here: use DB::table, and maybe batch the rows so that one insert carries multiple items. That saves queries compared with inserting one by one.
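The batching idea could look roughly like the sketch below. It is an illustration under assumptions: batchedRows() is a hypothetical helper, the sample lines stand in for the real file, and the batch size is arbitrary. Note that the original poster said the insert method is not an option for them, and that a query builder insert skips Eloquent model events and automatic timestamps.

```php
<?php

/**
 * Decode one JSON object per line and yield rows grouped into batches.
 * Lines that do not decode to an object (such as a lone "[" or "]") are skipped.
 */
function batchedRows(iterable $lines, int $batchSize): Generator
{
    $batch = [];

    foreach ($lines as $line) {
        $decoded = json_decode(trim(trim($line), ','));

        if (!is_object($decoded)) {
            continue;
        }

        $batch[] = ['a' => $decoded->b, 'c' => $decoded->d, 'e' => $decoded->f];

        if (count($batch) === $batchSize) {
            yield $batch;
            $batch = [];
        }
    }

    if ($batch !== []) {
        yield $batch; // final, partially filled batch
    }
}

// Hypothetical sample input standing in for the real file.
$lines = ['[', '{"b":1,"d":2,"f":3},', '{"b":4,"d":5,"f":6},', '{"b":7,"d":8,"f":9}', ']'];

$batches = iterator_to_array(batchedRows($lines, 2), false);
```

In the Laravel app, each yielded batch would be passed to DB::table('items')->insert($batch), which accepts an array of rows, and the lines would come from the fgets() loop instead of an in-memory array.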

floryan's avatar
OP
Best Answer
Level 13

Calling gc_collect_cycles() every couple of iterations helped.
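For anyone landing here later, the accepted fix can be sketched as follows. This is not the poster's actual code: processWithPeriodicGc() is a hypothetical helper, the interval of 1000 is arbitrary, and the handler is a stand-in that only creates reference cycles so the snippet runs without Laravel.

```php
<?php

/**
 * Call $handle for every item and force a cycle collection every $every items.
 * Returns the number of items processed and the total cycles collected.
 */
function processWithPeriodicGc(iterable $items, callable $handle, int $every = 1000): array
{
    $processed = 0;
    $collected = 0;

    foreach ($items as $item) {
        $handle($item);
        $processed++;

        if ($processed % $every === 0) {
            // gc_collect_cycles() returns how many cycles it freed
            $collected += gc_collect_cycles();
        }
    }

    return ['processed' => $processed, 'collected' => $collected];
}

// Stand-in handler (assumption): two objects that reference each other,
// which reference counting alone cannot free.
$makeCycle = function (int $i): void {
    $a = new stdClass();
    $b = new stdClass();
    $a->other = $b;
    $b->other = $a;
};

$result = processWithPeriodicGc(range(1, 5000), $makeCycle, 1000);
```

In the import itself, the handler would decode the line and call Item::create, and the items would be the lines read with fgets().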
