
thjeu
Level 3

Best way to schedule a large number of tasks

I'm currently building an API for retrieving prices from different hosts. There are currently 756 different 'coins' with different hosts to retrieve the prices from.

For example: Coin X

  • Host 1
  • Host 2
  • Host 3 -- up to 30 hosts

Coin Y

  • Host 1
  • Host 2
  • Host 4 -- up to 30 hosts

etc.

The problem here is that ideally each coin should be updated every 10 seconds. This means that each coin needs to call all of its hosts, calculate the average price, save the price in the DB, and finally save a JSON file with the total history of the coin. (Perhaps it would be better to also save the current price as JSON to save some time.)
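In code, the cycle per coin is roughly this (host URLs, table and file names are only placeholders):

<?php
// Rough sketch of one coin's update cycle; host URLs, table and file names are placeholders.
function updateCoin(PDO $db, string $coin, array $hostUrls): void
{
    $prices = [];
    foreach ($hostUrls as $url) {
        // One HTTP call per host -- this sequential part is what gets slow.
        $data = json_decode((string) file_get_contents($url), true);
        if (isset($data['price'])) {
            $prices[] = (float) $data['price'];
        }
    }

    if ($prices === []) {
        return; // nothing to average
    }

    $average = array_sum($prices) / count($prices);

    // Save the averaged price in the DB.
    $db->prepare('INSERT INTO prices (coin, price, created_at) VALUES (?, ?, NOW())')
       ->execute([$coin, $average]);

    // Append it to the coin's JSON history file (directory assumed to exist).
    $file    = "storage/{$coin}/history.json";
    $history = is_file($file) ? json_decode(file_get_contents($file), true) : [];
    $history[] = ['price' => $average, 'time' => time()];
    file_put_contents($file, json_encode($history));
}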

I've tried putting all of this in a class for each host, but the execution time is way too long (around 5 minutes using cURL).

I'm thinking of creating a task for each coin. This way the 'updating' of the coins can run simultaneously (multiple coins at once). But I'm not quite sure this would be the best way.

What approach would you guys recommend? All tips are welcome.

0 likes
7 replies
inctor

If the 5-minute execution time is based on the 700+ hosts' loading time, you'll need a large number of backend nodes that can process that amount of requests, spread out over several jobs.

You might be able to gain some speed by using a shell/Bash script that saves the results to a JSON file, and then parsing that file every 10 seconds instead.
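On the reading side that would just be something like this (the file name and structure are only an example):

<?php
// Read the cached JSON written by the background script instead of hitting the hosts.
// 'storage/prices.json' and its structure are assumptions for the example.
$cache = json_decode(file_get_contents('storage/prices.json'), true);

if (!is_array($cache) || time() - ($cache['updated_at'] ?? 0) > 10) {
    // Cache missing or older than 10 seconds: serve stale data or fall back.
}

echo $cache['coins']['coinX']['average'] ?? 'n/a';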

1 like
thjeu
Level 3

Hi Inctor,

I'm currently using curl_multi for each host. A large host (with around 500 coins connected to it) now takes around 12 seconds (including the saving and checking for active coins, etc.).
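The per-host fetch is roughly this shape (the URLs and response fields are placeholders):

<?php
// Fetch several coin endpoints from one host in parallel with curl_multi.
// The URLs are placeholders; the real endpoints depend on the host.
$urls = [
    'https://host1.example/api/price/coinX',
    'https://host1.example/api/price/coinY',
];

$multi   = curl_multi_init();
$handles = [];

foreach ($urls as $key => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    curl_multi_add_handle($multi, $ch);
    $handles[$key] = $ch;
}

// Run all transfers until they complete.
do {
    $status = curl_multi_exec($multi, $running);
    if ($running) {
        curl_multi_select($multi);
    }
} while ($running && $status === CURLM_OK);

$results = [];
foreach ($handles as $key => $ch) {
    $results[$key] = json_decode(curl_multi_getcontent($ch), true);
    curl_multi_remove_handle($multi, $ch);
    curl_close($ch);
}
curl_multi_close($multi);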

I'm now thinking of creating a task for each host and running them in parallel. Is this what you mean? I'm not quite sure I understand you correctly.

beetuco

Can each host calculate the average itself, reducing the extra step in the API sequence? You either have a lot of nodes processing small pieces of work or one node processing everything.

Run everything in memory, cached where possible and optimised (DB indexes if relational, etc.). I'd be using NoSQL.

I would also use Lumen for the API, to reduce the footprint.

And yeah, as ohffs mentioned, use asynchronous requests where possible.

1 like
thjeu
Level 3

Great, thanks guys. I'm going to dig into asynchronous requests and Guzzle.
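From a first look, a Guzzle version of one coin's update might look roughly like this (the URLs, field names, and concurrency value are only examples):

<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;

// Fire the host requests concurrently instead of one after another.
$client = new Client(['timeout' => 5]);

$urls = [
    'https://host1.example/api/price/coinX',
    'https://host2.example/api/price/coinX',
    // ... up to 30 hosts per coin
];

$requests = function () use ($urls) {
    foreach ($urls as $url) {
        yield new Request('GET', $url);
    }
};

$prices = [];

$pool = new Pool($client, $requests(), [
    'concurrency' => 10,
    'fulfilled'   => function ($response, $index) use (&$prices) {
        $data = json_decode((string) $response->getBody(), true);
        if (isset($data['price'])) {
            $prices[$index] = (float) $data['price'];
        }
    },
    'rejected'    => function ($reason, $index) {
        // Log the failed host and move on.
    },
]);

$pool->promise()->wait();

$average = $prices ? array_sum($prices) / count($prices) : null;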

The idea is to build the API completely from scratch (as it's a learning project which turned into a real project :) ).

@beetuco : The hosts themselves are 'stupid'. No logic can be provided; they just return data. And each host is different. Some hosts can provide a complete history, some of them only the current price. As I don't want to dedicate the API to just one host, multiple hosts will be called. I think that therefore the averaging must be done on the API side.

Perhaps the idea should be reframed. I'm now thinking that each host should have its own class for updating that host's prices in the storage. Then I can create a task for each of them (max. 30 tasks) and run them asynchronously. These tasks will only update the prices that are stored in the storage.

The folder structure will be something like:

CoinX

  • Host 1
    • history.json

CoinY

  • Host 1
    • history.json
  • Host 2
    • history.json

Then, to calculate the current price, a new task will crawl the folders of the coin and calculate the average.

The advantage here is that some hosts only have 5 coins, whereas others have 500. The smaller hosts will consume a lot less memory. Also, leaving out some steps in the 'crawling' process will reduce the time.
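The averaging task would then only touch local files, something like this (the field names are an assumption):

<?php
// Calculate a coin's current price from the per-host history.json files.
// Assumes the last entry of each history file holds that host's latest price.
function currentPrice(string $coinDir): ?float
{
    $prices = [];

    foreach (glob($coinDir . '/*/history.json') as $file) {
        $history = json_decode(file_get_contents($file), true);
        if (!$history) {
            continue;
        }
        $latest = end($history);
        if (isset($latest['price'])) {
            $prices[] = (float) $latest['price'];
        }
    }

    return $prices ? array_sum($prices) / count($prices) : null;
}

// Example: storage/CoinX/Host1/history.json, storage/CoinX/Host2/history.json, ...
$price = currentPrice('storage/CoinX');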

jekinney

Perfect scenario for a possible Lumen implementation.

Set your scheduling up in Lumen, for each coin. Either update the DB directly or set up a webhook to post to your Laravel app. Also, nothing is stopping you from updating each host at the same time.

I suggest: test with a single Lumen install. I'd probably use Redis as a DB. Set up a command to deal with your logic. Queue an event. Schedule the command to check/fire every X amount of time. I say check in case one or more are still in the queue being processed.
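Rough sketch of what I mean (the command, model and job names are made up; the scheduler's smallest interval is a minute, so the queued jobs do the actual work):

<?php
// app/Console/Kernel.php -- scheduling part only.

protected function schedule(Schedule $schedule)
{
    // withoutOverlapping skips a run if the previous one is still
    // working through the queue.
    $schedule->command('prices:update')->everyMinute()->withoutOverlapping();
}

// app/Console/Commands/UpdatePrices.php -- dispatch one queued job per host.

public function handle()
{
    foreach (Host::all() as $host) {
        // With a Redis-backed queue these jobs run in parallel across the workers.
        dispatch(new FetchHostPrices($host));
    }
}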

Another approach is to use PHP's timeout function: use a job to process your requests, then have it wait 10 seconds after it's done and redo the job (an infinite loop). Your command will be similar to queue:listen, in that it fires every X time to ensure the jobs are running.
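Sketched as a self-re-queueing job (the class name is made up):

<?php
// Sketch of the "wait 10 seconds and redo" approach as a job that re-queues itself.
// RefreshPrices is a placeholder; it extends the framework's queueable Job class.
class RefreshPrices extends Job
{
    public function handle()
    {
        // ... call the hosts, calculate the average, store the result ...

        // Re-dispatch ourselves with a 10 second delay instead of looping forever.
        dispatch((new static)->delay(10));
    }
}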

In both cases you need to log errors so your app knows what failed and why. If it failed because the outside API was busy, timed out, or something else that allows you to redo the event/job, then have the command retry. After X amount of failures you're probably just wasting resources, so send up a notification (email, etc.) to let you know.
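Most of that bookkeeping can sit on the job itself, something like this (the class name and limits are placeholders):

<?php
// Let the queue retry a flaky host a few times, then notify instead of retrying forever.
class FetchHostPrices extends Job
{
    public $tries   = 3;  // give up after 3 attempts
    public $timeout = 15; // seconds before the worker kills an attempt

    public function handle()
    {
        // ... call the host, store the prices ...
    }

    // Runs once all attempts are exhausted.
    public function failed(\Exception $e)
    {
        // Log the reason and send a notification (email, Slack, ...) so someone can check it.
    }
}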

I suggest Lumen as this is actually what it's designed for, and you can keep it isolated in its own environment. If it hangs up or otherwise craps the bed, it doesn't affect your main app.

1 like
thjeu
Level 3

@jekinney Thank you. I've dived a little deeper into Lumen as this is new to me. It seems very interesting, so thanks to you and beetuco for the heads-up on that. I now have the following structure:

  • Laravel app (plugin like system) / admin back-end
  • Laravel app for the API
  • Lumen for the data gathering?

Is that correct? And when in production, would that mean each app needs to run on a different server? (Cloud)

I've connected them all to one MySQL database. I'm not really familiar with any DB other than MySQL.

I'm also trying to build an advanced error logging system as you mentioned, so the API will be autonomous.
