Be part of JetBrains PHPverse 2026 on June 9 – a free online event bringing PHP devs worldwide together.

faroutchris's avatar

Help needed with caching data from API request

Background: I have a website/service that I'm building where the bulk of my data comes from an external API. This API has a limit of 150 000 requests per day. The data is basically live data and I want to throttle the amount of times I hit their endpoints (basically cache it).

So my plan is to run queries against this API almost like on a schedule and store the responses in my database. If an entry in my table is older than 10 minutes I want to make a request to the API and get fresh data.

By querying the data every 10 minutes, in a worst case scenario, I can get down to ~50 000 requests per day.

I'm not sure what the right approach is. I'm mainly a frontend/React developer so my experience designing backends with caching and scale in mind is limited.

Has anyone done something similar, if so, how did you approach this? I'd like to get some kind of high level overview of how to design such a system.

I can probably set up a Resource class that gets the data but as I've looked in the documentation, the Resource is usually consumed directly by the views. Can I use my Resource to update the underlying DB with the data?

I've also thought about using something like Redis instead of the DB (but then I can't really decorate the data and set up relationships which has been really nice).

I hope my question is clear enough :)

0 likes
4 replies
Cronix's avatar

The easiest thing to do would to create a scheduled task: https://laravel.com/docs/5.7/scheduling

You can just have it automatically run every 5 or 10 minutes to get the latest data. It has default everyFiveMinutes() and everyTenMinutes() interval methods that would work nicely.

1 like
faroutchris's avatar

Yeah, I looked into the schedule, but am I to understand that this is roughly the correct approach:

// psuedocode
...
protected function schedule(Schedule $schedule)
{
     $schedule->call(function () {

        $client = new \GuzzleHttp\Client();
        $res = $client->request('GET', 'https://api.com/someresource');

        if ($res->getStatusCode() == '200') {
            
            $body = $res->getBody();
            
            foreach($body['data'] as $field) {

                DB::table('resource')->delete();
                DB::table('resource')->add($field); 
                // is there some kind of mass assignment, like I can do with php artisan db:seed 
    
            }
        }

     })-> everyTenMinutes();
}

I'm not sure that I want to always hit their API. With a naive approach I would hit all of the 34 different API endpoints I use to get fresh data.

Could/should I check each table and see if the data is older than 10 minutes in the schedule function and run the queries as needed? But then some data that doesn't fit the criteria of being older than 10 minutes might not get updated.

Cronix's avatar

Pretty close. Could you provide an example of what $body['data'] contains (maybe just first record if there are many)?

faroutchris's avatar
[
...
{
    "key":"soccer",
    "label":"Soccer",
    "league":"UEFA Champions League",
    "teams":[
        "Liverpool",
        "Red Star Belgrade"
    ],
    "event_start":1541526900,
    "stats": [
        "score": [0, 1],
        "corners": [1, 2],
        "substitutes": [
            [{"Naby Keïta", "James Milner"}], []
        ]
        // etc
    ]
},
...
]

The Event and the Stats are different models in my application.

Please or to participate in this conversation.