alex_storm's avatar

How to up speed for insert?

Hello! I have simple array for insert to db. It adds 50,000 entries at a time. In total in one table more than 2 million records which need to be entered in a database.

The question is how to speed up the introduction of this data.

I made tests for that.

  1. When I use next construction, I getting 23400 records per min.

// Inserting one record at a time
foreach ($records as $record)
{
DB::table($table)->insert($record);
}

  1. When I use next construction, I getting 40 records per min.

// Inserting 50000 records at a time
// I tried to reduce the number of records at a time, but the result is approximately the same
DB::table($table)->insert([$records]);

  1. When I use next construction, I getting 230000 records per min.

// Inserting 50000 records at a time
// The downside of this construction is that need to generate data for the insert yourself.
DB::insert('INSERT INTO '.$prefix.$table.' ('.$columns.') VALUES '.$values);

What other options can be used in Laravel system for mass data insertion with fast speed? Thanks.

0 likes
16 replies
shez1983's avatar

i would say your third method is fast.. thanks for doing this..

1 like
Cronix's avatar

Try disabling the query logging, which speeds it up a ton and uses a LOT less memory.

DB::disableQueryLog();  // or DB::connection('connection-name')->disableQueryLog();

//your queries here
alex_storm's avatar

@Cronix yeap, I tried there.... But, all the tests I submitted are done with the inclusion of this method)

I can't understand why Laravel is so slow to insert data.

alex_storm's avatar

I almost forgot. I conducted these tests after configuring the database and tuning her for fast data insertion.

martinbean's avatar

@alex_storm Not so much related to speed, but if it’s imperative that all data must be inserted, then I’d strongly suggest putting this in a queue. It could prove to be faster if you have many workers processing the queue in the parallel.

2 likes
alex_storm's avatar

I checked the work through this service, no difference :)

MySQL by default has the functionality of the sequence of query execution and there is no sense in doing separately the functionality on php.

The speed of inserting large data for console applications remains the same.

Cronix's avatar

Can you just import it from a file and insert it directly, bypassing the laravel database layer?

something like

exec('mysql -u username –-password=password databaseName < file.sql', $output);
dd($output);

Also, what kind of system are you running this on? Is it a production server? My macbook is really slow at a ton of inserts (I deal with real estate data), but whips right through it on the production servers.

alex_storm's avatar

@Cronix, yes, but it's not correct for this tasks. I have next system: VPS i3 2.4GGHx4, 60SSD, 8RAM. This system should suffice for the processing of such data

1 like
Cronix's avatar

Have you tried doing it outside of laravel? Just a pure sql script to compare times against? Or just trying what I suggested to get a pure benchmark on what your system can handle vs how laravel is handling it?

alex_storm's avatar

@Cronix I know that this will work many times faster. But I need use laravel for validate each column. Validation is very fast, but not the addition of data (

If I could just go to mysqldump -Q -c -e I would have already done this (but I can not (

gregrobson's avatar

A few observations:

  1. In the original post your final method (building raw queries) will always be faster than the previous methods as Laravel will be doing fewer function calls to build the query and escape values.
  2. If you have indexes on the table and can afford to remove them during the import the DMBS won't be rebalancing the index as records are added. It's quicker for the index to be built in one go than update it on every insert.
  3. Running multiple threads as @martinbean suggested might be worth a try - but I'd hazard a guess that any table with an identity field or primary key is going to hit the same limit. At some point the database engine has to limit access to the primary key/identity to one process at a time.
  4. Batching inserts might help, for example:
INSERT INTO foo (column1, column2) VALUES ('val1', 'val2'), ('val3', 'val4), ('val5', 'val6');

It's a bit more work to compile the string: but Heap analytics really boosted their performance using this method (every query attracts an overhead): https://heapanalytics.com/blog/engineering/basic-performance-analysis-saved-us-millions

Depending on the database engine, deferring constraint checking is also an option. Also you can try adjusting memory settings for queries.

Your on the right track though: I would benchmark different strategies with 100,000 records and see what works best for you.

4 likes
ssquare's avatar

@alex_storm so what was your last choice? I keep facing a similar situation on time to time. Even worse, I need to insert on multiple tables.

1 like
WMage's avatar

probably PDO limitations. that can handle only ~ 15000 bindings.

aosdev's avatar

You can use chunk

$collection->chunk(100000, function ($subset) {
    DB::collection('collection_name')->insert($subset->toArray());
});
marcogermani87's avatar

In my case i was need to import 800k records. With this script i've made it in 90 seconds (9k records for sec):

$chunk = 1500;

DB::disableQueryLog();
DB::connection()->unsetEventDispatcher();

foreach ($rows as $row) {
    $currentChunk[] = $row;
    if (count($currentChunk) >= $chunk) {
        Model::insert($currentChunk);
        $currentChunk = [];
    }
}

When you use massive insert consider the default binding limit of 65,535 (2^16-1) for the values.

1 like

Please or to participate in this conversation.