Be part of JetBrains PHPverse 2026 on June 9 – a free online event bringing PHP devs worldwide together.

chrisblackwell's avatar

What's the best approach for handling large data import?

I currently have a command that runs nightly, and grabs data from one database, runs some logic, then dumps the cleaned up data into another database. The problem is, the database contains 250,000+ rows and growing. What would be the best approach to running this, without slowing down the app? Once it's done, I only have to check for rows that have changed, so that saves some time, but I'm looking for an optimal approach.

0 likes
2 replies
jlrdw's avatar

Been discussed many times stored procedure outside of PHP directly in my SQL.

vvvphpdev's avatar
Level 5

Things you need to do

  • use raw sql instead of eloquent, much faster when processing large data
  • fetch the total rows / 5000 per array set = total loops + 1 (it is up to you, 5000 is minimal, it is up to your query logic too, most probably to maximize the default memory allocated.)
  • use for loop to do that
    • run the logic in the loop
    • unset/null the index from the 5000 set, like a russian doll
    • if you need to store the essentials for later process, you could store it as json data in your temp database table, or store it in your cache (redis)
$limit = 5000;
$loopCount = ($totalRows / $limit) + 1;
for ($i = 1; $i <= $loopCount; $i++) {
    // todo: sql query using ( ($i > 1 ?:  0) * $limit) as offset, and limit by $limit

    if (!count($queried)) continue;

    foreach ($queried as $idx => $val) {
        // ...
        unset($queried[$idx]);
    }
}
1 like

Please or to participate in this conversation.