Queued Jobs and Scaling

Hey guys,

Just a thought question about scalability.

Say I have a load balancer and 3 small, identical Linux boxes with my Laravel app deployed to them and running with nginx.

If I wanted to introduce a queuing system (such as Amazon SQS), which server or servers should the supervised queue worker be on? Should I only use one server or can I get all of the servers to each help with processing queued jobs?

fideloper

9 years ago

Level 11

Hey!

"It depends"!

Some strategies:

Create a server (or multiple, depending on how large your queue volume is) that isn't in the load balancing rotation, and whose job is just to churn through queue jobs
Choose a server in the load balancer rotation that also has a queue worker (if that server fails, you'll have a manual step of enabling queues on another server)
Run queue workers on each server (safest)

The "depends" part depends on your application:

Do queue jobs take up a lot of memory (image processing?) that might compete with the web-server/app for resources? (If so, a separate server might be best)
Is there a large volume of jobs? (Do you need more than one worker churning through jobs?) This can affect how many workers you enable and across how many servers.
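
As a rough way to put a number on "how many workers", here is a minimal Python sketch. It assumes you can measure how many jobs arrive per minute and how long an average job takes; the function name, the headroom factor, and the example figures are all hypothetical, not something from this thread.

```python
import math


def workers_needed(jobs_per_minute, avg_job_seconds, headroom=1.5):
    """Estimate how many workers are busy at once, padded by a safety factor.

    Busy workers on average = arrival rate (jobs/second) x average job duration.
    The headroom factor is an assumption that leaves room for bursts.
    """
    busy_workers = (jobs_per_minute / 60.0) * avg_job_seconds
    return max(1, math.ceil(busy_workers * headroom))


# Hypothetical load: 120 jobs per minute, each taking about 2 seconds.
print(workers_needed(120, 2))
```

If the estimate is larger than what one box can comfortably run alongside nginx, that points toward spreading workers across servers or using dedicated worker servers.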

Last notes:

I usually use SQS so I can avoid managing yet another service (such as supervisord or RabbitMQ). However, there are some things that make SQS different:

Standard SQS queues are neither LIFO nor FIFO: there's NO guaranteed order in which you'll receive a job, so if ordering matters, you'll need to write code to orchestrate it, or use another queue that is LIFO or FIFO
SQS has a "Visibility Timeout" on each job. If a job is NOT complete within that set time, it gets reset as "available" and another worker may pick it up. This means you need to carefully manage that:
- You'll need an idea of how long a job might take (so you don't accidentally make a job available and have it run twice)
- You might need to manage resetting the visibility timeout per job type, or have multiple SQS queues, each with a different default timeout. The AWS SDK will let you change the visibility timeout per job, which is something I've done before (I know Job A takes 10 minutes, but Job B only takes up to 10 seconds, so I set the visibility timeout per job accordingly within each worker as it gets processed).
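
To illustrate the per-job-type timeout idea, here is a minimal Python sketch. It assumes a boto3-style SQS client is passed in (`change_message_visibility` is the boto3 call for this), and the job type names and timeout values are hypothetical placeholders you would replace with your own measurements.

```python
# Hypothetical per-job-type timeouts in seconds; tune these from real measurements.
VISIBILITY_BY_JOB_TYPE = {
    "process_dataset": 15 * 60,  # slow job: roughly 10 minutes of work plus margin
    "send_email": 30,            # fast job: up to about 10 seconds of work plus margin
}
DEFAULT_VISIBILITY = 60  # assumed fallback for job types not listed above


def extend_visibility(sqs_client, queue_url, message, job_type):
    """Set the visibility timeout of one received message based on its job type.

    `sqs_client` is assumed to behave like boto3.client("sqs"), and `message`
    is one entry from the "Messages" list returned by receive_message.
    """
    timeout = VISIBILITY_BY_JOB_TYPE.get(job_type, DEFAULT_VISIBILITY)
    sqs_client.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=message["ReceiptHandle"],
        VisibilityTimeout=timeout,
    )
    return timeout
```

A worker would call this right after receiving a message and before starting the actual work, so a slow job is not handed to a second worker while the first one is still running.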

Jamesking56

9 years ago

Level 11

Hey @fideloper !

Thanks for your insight, it's really helpful!

To give a better idea of my queued jobs, some jobs will be email / SMS processing (which should take seconds since it's literally just cURLing another provider's API).

But my main concern for my queued jobs is processing raw data from a car engine that will be sent via a mobile device. This data will arrive at my API in chunks, and when all of the bits have been received (see my other thread here about checksumming to ensure I have all the data), I will then queue a job to chunk through each individual raw record, converting and processing the data into statistics and calling out to the Google Maps API.

This process will be really large and I'm looking at the best way to deal with it. I'm expecting each dataset from the mobile device to take 20 mins or more to process, depending on data sizes (as they can vary). For this, it might be worth breaking it up into smaller bits as follows:

Queued job runs
Processes 50 raw rows
Checks if there is more
If there is, it saves the state in cache of where it is at (Redis?) and queues itself again (so that the next run can do another 50 from where it left off)
If there is no more to process, it'll finish and mark the dataset as processed, meaning my application can then display the end results to the user.
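
The steps above can be sketched roughly like this in Python. It is only an illustration: a `deque` stands in for the real queue (SQS) and a plain dict stands in for the cache (Redis), and every name here (`process_chunk`, `run_worker`, `handle_row`) is hypothetical.

```python
from collections import deque

CHUNK_SIZE = 50  # rows handled per queued job run, as in the steps above


def process_chunk(dataset_id, rows, cache, queue, handle_row):
    """Process up to CHUNK_SIZE rows; return True once the dataset is finished."""
    offset = cache.get(dataset_id, 0)  # where the previous run left off
    for row in rows[offset:offset + CHUNK_SIZE]:
        handle_row(row)
    offset = min(offset + CHUNK_SIZE, len(rows))
    if offset < len(rows):
        cache[dataset_id] = offset  # save progress for the next run
        queue.append(dataset_id)    # the job queues itself again
        return False
    cache.pop(dataset_id, None)     # nothing left, so clear the saved state
    return True


def run_worker(datasets, cache, queue, handle_row):
    """Drain the queue and return the ids of datasets that finished processing."""
    processed = []
    while queue:
        dataset_id = queue.popleft()
        if process_chunk(dataset_id, datasets[dataset_id], cache, queue, handle_row):
            processed.append(dataset_id)
    return processed
```

One caveat with this shape: progress is saved only after a chunk is handled, so if a run dies or gets picked up twice, the same 50 rows may be handled again. Keeping the per-row work safe to repeat avoids surprises there.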

With this chunking approach, it might be possible then for me to use Amazon SQS and process an entire dataset in pieces rather than all at once.

Is this a better approach in your opinion? I think if I followed this approach I could then stick a queue worker on every load balanced server and use autoscaling to create new servers with queue workers ready to go as needed.

fideloper

9 years ago

Level 11

That sounds really interesting! I'm also curious how you get mobile devices hooked into a car and internet enabled so you can send data to your API :D

I think what you're saying there sounds like a good plan. It sounds like each API call you receive from a mobile device is not a complete data set - is that because it takes time to complete a data set, or because there is a lot of data to send?

jekinney

9 years ago

Level 47

Using a separate entity will be the way to go. Either ensure the connection for that API route always hits the same server after the initial connection, or send it to a specific server that processes the data via a route (like a subdomain).

I think the issue might be that your application gets all the data and then queues it for processing? If so, maybe look for a dedicated app/service to handle that complete feature on its own. In most cases that's what businesses do for big data, like periodic syncing from POS at a store to the main data center/server.

The data warehouse then also processes periodic reports and has them available. Think McDonald's here, with 10,000-plus stores all syncing for reports and inventory.

Jamesking56

9 years ago

Level 11

There's a lot of data, which is why I have to chunk it up into pieces and then tell the API when it finishes.

With the mobile device connecting to the car, that's my secret sauce ;) It will be a SaaS, so I'll let you know when it's ready!
