dlook's avatar
Level 4

Laravel 5.2 - Uploading and processing big CSV files

Hi,

I'm developing a web app in Laravel, it's kinda mini crm type of app, but my client now wants to be able to import huge CSV files with data, let's say User records. I.e. he has a CSV with 50.000 records, and it needs to check each if exists, if not create a user, attach corresponding role and move to the next one.

I'm using Maatwebsite\Excel to handle csv and xlsx files, but it will probably time-out if I try to process 50k of records just like that.

Can anyone shed some light on this, if I need to use queue - how to really set it up so I have the following:

  • User uploads CSV with 50.000 records
  • After the upload it says "we're processing your file, we'll shoot you an email when we finish"
  • When the process is done, send an email?
public function store(Request $request)
    {
        if($request->hasFile('list')) {
            $fileItself = $request->file('list');
            $load = Excel::load($fileItself, function($reader) {})->get();
            foreach($load as $row) {
                // ...
            }
        }
    }

That is as far as I got code-wise, I don't think it will work like this on 50.000 scale.

0 likes
5 replies
tam1's avatar

hey @dlook ,

You seem to be on the right path, I would agree that this is a job for the queue.

Considering you really didn't give any code of your own and i'm not really sure what you want us to show you. I'm just gonna direct you to https://laravel.com/docs/5.2/queues which covers Laravel's Queues better than I ever would be able to.

dlook's avatar
Level 4

Hi @tam1,

I read that article, but I just can't wrap my head around my particular case.. I've updated the post with my "code" if we can even call it that way.

tam1's avatar

hey @dlook,

I really recommend watching this set of videos discussing queues: https://laracasts.com/lessons/beanstalkd-queues-with-laravel (Note: the videos are for laravel 4.2 I believe, so the syntax isn't exactly the same anymore.

Here is some psuedo-code that might help you get started:

// your controller
public function store(Request $request)
{
    if($request->hasFile('list')) {
        $path = $request->file('list')->path();

        $this->dispatch(new ImportsExcelFile($path));
    }
}
class ImportsExcelFile extends Job implements ShouldQueue
{
    use InteractsWithQueue, SerializesModels;

    protected $file;

    public function __construct($file)
    {
        $this->file = $file;
    }

    public function handle(Excel $excel, Mailer $mailer)
    {
        $excel->load($this->file, function($reader) {})->get();
        
        foreach($load as $row) {
            // handle the logic
        }

        // Send that email
        $mailer->send('the.email.to.send');
    }
}

Good luck!

dlook's avatar
Level 4

Hey @tam1,

So I followed your advice but it seems that nothing is happening, I created a store function and Job class, and when I run smaller things like 20 entries it finishes and I get an email.. But if I just pass (like just to count entries and send back to email) it never sends the email, and after uploading the file it just returns blank post route, here is the code:

public function queueOldUsers(Request $request)
    {
        if($request->hasFile('oldOnes')) {
            $path = $request->file('oldOnes')->path();
            $this->dispatch(new ImportOldContacts($path));
        }
    }
class ImportOldContacts implements ShouldQueue
{
    use InteractsWithQueue, Queueable, SerializesModels;

    protected $file;

    /**
     * ImportOldContacts constructor.
     * @param $file
     */
    public function __construct($file)
    {
        $this->file = $file;
    }

    /**
     * @param \Maatwebsite\Excel\Excel $excel
     * @param \App\ImportMail          $mailer
     */
    public function handle(Excel $excel, ImportMail $mailer)
    {
        $load = $excel->load($this->file, function($reader) {})->get();
        $data = 0;
        foreach($load as $row) {
            $data++;
        }
        $mailer->importIsDone($data);
    }
}

This is my model that sends the email

class ImportMail extends Model
{
public static function agileIsDone($data) {
        if(Auth::check()) {
            $receiver = "[email protected]";
            Mail::send('emails.notifications.import', ['data' => $data], function ($m) use ($receiver) {
                $m->from('[email protected]', 'import test!');
                $m->to($receiver)->subject('Testing Import!');
            });
        }
    }
}

So as I've said when it's smaller file it goes well, but when it is like 70.000, and again I'm just trying for now to "count" them trough foreach to see if it goes trough the entire file, but I never got the email...

Help?

Inquisitive's avatar

@dlook this might be little late to reply. If any other guys going into the same problem, then this might be helpful. I was also facing the same problem. But, I realized these queues are not the solution you are looking for. You are looking for some ajax requests, which needs to send again and again. Even if you run these kinds of huge tasks on queue, it's not going to solve the problem because of maximum time execution.

All you need to send the required data and offset value, and after completion of a chunk/sets return status true then as soon as the 1st request completed send the second request with updated offsets value. In between them, you can use jquery to show completion status.

The queue is something that you can run in the background. But these queues will never solve the execution timeout issues on their standalone.

These queues guys will just push you into another way, which don't provide the solution to you. I totally got frustrated, as after learning and doing queues, the problem still exists. Trust me you guys don't want to follow this.

Sweet and simply just use ajax requests.

Please or to participate in this conversation.