diginize's avatar

Updating (Upserting) thousands of mysql records freezes regularly

I'm really struggling with a problem I can't seem to diagnose or resolve and I'm hoping someone here can help, please! I have a flat text file from another database (that I don't own) of 280,000 records that contain updates that require merging with my laravel/mysql database. Each row of the file represents one person (with a unique id), but contains attributes like phone and address which are stored in my database, normalized, in hasMany relationships. The business requirement is to update the person's bio, address, and phone in the Laravel database, but not overwrite any of the other biographic, phone, or address information that may be stored for that person (which is why I can't just drop records and use sql's INSERT INTO function.)

I have a while loop that reads that flat file one line at a time, and foreach line it does:

$contact = Contact::firstOrNew( ['sourceid' => $importedArray['person.sourceid'] ] );
$contact->fill( [array of specific fields I want to overwrite]  );
$contact->save();

$phone = Phone::firstOrNew([ 'contact_id' => $contact->id, 'type' => "external"]);
$phone->number = $fullPhoneNumber;
$phone->save();

$streetAddress = Address::firstOrNew([ 'contact_id' => $contact->id, 'type' => 'Street']);
$streetAddress->fill( [array of specific fields I want to overwrite]  );
$streetAddress->save();

I understand that this is a lot of query and IO load on the database (3 updates * 280K rows) and that the updates may be kinda slow, but each row seems to process in under a second which I consider acceptable. The part I can't understand is why the process just halts with no errors somewhere between 7,000 and 11,000 rows. I have php's max_execution_time set very high, so I no longer get that error. The best I can tell is that mysql just freezes up, but with no errors I can't be sure.

To ease the database I've removed all indexes from each table except the primary key and the person.sourceid that I'm using for the lookup. And I've even tried to increase the memory allocation to the database ("SET GLOBAL innodb_buffer_pool_size=8*1024*1024*1024;" = 8GB) . This is all running localhost on a mac with 16GB of RAM but the activity monitor says mysql isn't even asking for more than about 1 GB and the computer has more available. I tried adding a 4 min sleep() to the loop after every so many rows, to allow the database to breathe, but that had no effect.

I'd even be happy having a slower process, the painful part is having to babysit it for days and keep refreshing the process.

Eventually I've love to push it into a background Laravel job/queue but I can't do that if it freezes with no error or way to detect that it failed, which is the current state.

Any help or wisdom that the community might offer is much appreciated! Thanks in advance! -Robert

0 likes
5 replies
Snapey's avatar

I think you need to protect memory as much as possible, such as releasing the objects as soon as you have done with them

Also make sure you disable the mysql query log

diginize's avatar

Hi Snapey, thank you for your answer. If you don't mind providing a little more info, can you point me to what you mean about releasing the objects? I looked a little to determine what happens (and the SQL connection) with the object after I call $contact->save(); but I actually didn't find a close() or mysqli_close() equivalent in the Laravel docs.

One thing that did surprise me while this big update runs is that neither the php or mysql processes seem particularly stressed on CPU or RAM usage.

For the mysql query log, you're referring to this: https://dev.mysql.com/doc/refman/5.7/en/query-log.html , correct? That's something I actually didn't know much about. Thank you.

In my troubleshooting I learned also of Laravel's query log, but I believe that's still disabled by default: https://laravel.com/docs/5.0/database#query-logging and I'm not even seeing it in the 5.4 docs anymore.

Thanks for the help!

DirkZz's avatar

You could go plain SQL if your database throws an error when you try to insert a row which already exists or breaks an unique constraint. You can also batch up your inserts at that point.

For example:

INSERT INTO table(a,b,c) VALUES (1,2,3),(3,4,5),(6,7,8)
ON DUPLICATE KEY UPDATE  b=VALUES(b),c=VALUES(c)

In this example it will try to insert all rows, if it finds a conflict with your PK or any other unique constraint then it will trigger the "On duplicate key" part. In this example it will; leave colum "a" as is and update colums B and C with the new values.

If this is new; try it in development first. And I wouldn't do a batch insert with 280K rows you still need to chunk it up.

The maximum rows you can insert/update at once are determined by the mysql variables; max_allowed_packet, bulk_insert_buffer_size, key_buffer_size.

So it depends on your data and config if you can insert/update 100 at a time or 10.000 at a time.

Snapey's avatar

I was thinking about the DB query log. but as you say maybe it is not on by default.

Last time I did this sort of thing I used

            \DB::connection()->disableQueryLog();

and unset, eg;

    unset($contact);
    unset($phone);
    unset($streetAddress);
jlrdw's avatar

I have done inserts through stored procedures before, regular pdo not laravel, yes time consuming, but worked.

Please or to participate in this conversation.