I'm running some analysis on data from a system in Visual Fox Pro. The only option that i have to access the data is to dump the tables into CSV files. Even when i'm limiting the data to only one year a couple of tables have more than 500k records each. The process that i'm doing now is truncate the tables and upload the csv files into them. The 2 500k record files take a very long time to upload (i'm not collecting a lot of data from each file). my question is: do you think will be quicker to compare the data from the files and only create the new records and update the ones have changed.
Any ideas on speeding up the upload/update of a csv file.
@bugsysha you are probably correct. I was looking for someone that may have done something similar and it would say to me "oh, i got great results by ....".
I did some pruning and the 2 files have about half the records. It takes 30 minutes to process on my desktop, the server should be a little faster, but 30 minutes is doable.
@dan3460 I imagine you would see a big performance improvement if you insert it directly with MySQL. I had built an importer for an app a couple years back where looping in php took way too long, as I was dealing with a similar data volume. Give MySQL's load data infile a shot. Some benchmarks I took show about 1 million records into a table from CSV in about two minutes using that method (performed on some pretty cheap hardware). More info on MySQL's load data here:
@joefusco That is interesting, didn't know that such a command existed. I'm guessing you could put the execution of the command as a raw sql from Laravel. I will play with it.