moretti

Parallel data processing and writing to DB

Hello,

I need help figuring out the most efficient way to run functions in parallel.

Here is the task that I want to do:

  1. Make a request to an API and get the data
  2. Process the data
  3. Write the data to the database (MongoDB)

This task will be executed about 200-300 times, and then all the entries should be returned as the response, sorted in some way. Currently I'm using Guzzle (https://github.com/guzzle/guzzle) to make the requests and MongoDB as the database, with the package https://github.com/jenssegers/laravel-mongodb to write the data. Running a single task is fast enough (0.2-0.3 s), but running all of them slows down the response time.

What is the fastest and most efficient way, in terms of response time, to run those tasks in parallel?

I was thinking of running each task as a job in a queue and using sockets to track when all tasks are finished, but this feels unsafe to me because a job might fail.

willvincent

"I was thinking of running each task as a job in a queue and using sockets to track when all tasks are finished, but this feels unsafe to me because a job might fail."

So... retry failed jobs?

A queue is absolutely the right solution here.
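
To make the retry idea concrete, here is a minimal sketch in plain PHP. It is not Laravel's queue, and `runQueue` and the sample tasks are hypothetical names made up for this example. It only illustrates the behaviour a queue worker gives you: a failed job goes back on the queue until it either succeeds or runs out of attempts, so "all tasks finished" can safely mean "every task has succeeded or been recorded as failed". In Laravel itself, the corresponding knobs are the `$tries` property on the job class (or `php artisan queue:work --tries=3`) and, on Laravel 8 or later, job batching via `Bus::batch()`.

```php
<?php
// In-process illustration of "queue + retry". All names here are hypothetical.

/**
 * Run every task, retrying failures up to $maxAttempts times.
 * Returns ['done' => [id => result], 'failed' => [id => last error message]].
 */
function runQueue(array $tasks, int $maxAttempts = 3): array
{
    $queue = new SplQueue();
    foreach ($tasks as $id => $task) {
        $queue->enqueue(['id' => $id, 'task' => $task, 'attempts' => 0]);
    }

    $done = [];
    $failed = [];

    while (!$queue->isEmpty()) {
        $job = $queue->dequeue();
        $job['attempts']++;

        try {
            // In the real job this would be: call the API, process, write to MongoDB.
            $done[$job['id']] = ($job['task'])($job['attempts']);
        } catch (Throwable $e) {
            if ($job['attempts'] < $maxAttempts) {
                $queue->enqueue($job); // put it back for another attempt
            } else {
                $failed[$job['id']] = $e->getMessage(); // give up, but record it
            }
        }
    }

    return ['done' => $done, 'failed' => $failed];
}

// Sample tasks: "b" fails on its first attempt only, "c" fails every time.
$tasks = [
    'a' => function (int $attempt) {
        return 30;
    },
    'b' => function (int $attempt) {
        if ($attempt === 1) {
            throw new RuntimeException('temporary API error');
        }
        return 10;
    },
    'c' => function (int $attempt) {
        throw new RuntimeException('permanent API error');
    },
];

$outcome = runQueue($tasks, 3);

// The queue is empty at this point, so every task has either succeeded
// or exhausted its retries. Sort the collected results before responding.
asort($outcome['done']);
```

One design point worth keeping in mind: because a retried job runs its whole body again, the database write should be safe to repeat (for example an upsert keyed on the API record's ID) so that a retry does not create duplicate entries.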
