Be part of JetBrains PHPverse 2026 on June 9 – a free online event bringing PHP devs worldwide together.

ibrahimwithi's avatar

Speed up relationships query on more than one million record

Hi,

I have three columns [ stores - rating_average - ratings ], if a user rated a store a new record will be added to the ratings table [ store_id -rating - guest_id - timestamps ] after that it will count all the ratings for that store id and crate or update the rating_average [ store_id - average - ratersCount ].

Now when the user click and view a listing (store) I make an ajax call after 10s of page fully load to the following function at ** RatingController ** to determine if the listing is trending or not.

#RatingController 

    public function show(int $id): \Illuminate\Http\JsonResponse
    {
        $store = Store::where('id', $id)->firstOrFail();

        $last_period = $store->ratings()->selectRaw('AVG(ratings.rating) as aggregate')->where('created_at', '<=', Carbon::now()->subDay(1) )->pluck('aggregate')->shift();
        $this_period = $store->ratings()->selectRaw('AVG(ratings.rating) as aggregate')->where('created_at', '>=', Carbon::today() )->pluck('aggregate')->shift();

        return response()->json(($last_period > $this_period ? false : ( $last_period < $this_period ? true : 0)), 200);
    }

Now what I want to do is to add a Boolean column to rating_average and name it trending so I can query all the trending stores (listings) but I don't when to call the function that do the calculation to determine if the store is trending!

Is there any more simple way to do this? User can rate a listing and I could increment the counter of people who have rated that listing and add an avg rating and compare average of ratings on two dates and see if the average rating is higher than the previous day.

#store model relationships
    public function ratings(): \Illuminate\Database\Eloquent\Relations\HasMany
    {
        return $this->hasMany(Rating::class, 'store_id');
    }

    public function rating_average(): \Illuminate\Database\Eloquent\Relations\HasOne
    {
        return $this->hasOne(ratingAverage::class, 'store_id');
    }

#ratingAverage model relationships
    public function store(): \Illuminate\Database\Eloquent\Relations\BelongsTo
    {
        return $this->belongsTo(Store::class,'store_id');
    }

# migrations

        Schema::create('rating_average', function (Blueprint $table) {
            $table->id();
            $table->bigInteger('store_id')->unsigned()->unique();
            $table->decimal('average')->unsigned();
            $table->bigInteger('ratersCount');

            $table->foreign('store_id')->references('id')->on('stores')->onDelete('cascade');

            $table->timestamps();
        });


#Rating model relationships

    public function stores(): \Illuminate\Database\Eloquent\Relations\BelongsTo
    {
        return $this->belongsTo(Store::class,'store_id');
    }

#migration 
        Schema::create('ratings', function (Blueprint $table) {
            $table->id();
            $table->bigInteger('store_id')->unsigned();
            $table->Integer('rating');
            $table->string('guest_id')->unsigned();

            $table->foreign('store_id')->references('id')->on('stores')->onDelete('cascade');

            $table->timestamps();
        });


0 likes
15 replies
Tray2's avatar

I just created a table with aprox 200K records and did a

select movie_id, avg(rating) from ratings group by movie_id;

To get the avg rating of 10K movies and the result was returned in 0.141 seconds.

For one movie 0.044 seconds.

So I don't think you need to worry too much about performance just yet

1 like
ibrahimwithi's avatar

Thanks @Tray2 for your reply, So to get the top 10 trending I should query each store whenever a user loads the homepage? imagine having 50K stores and each one have 500K of ratings?!

Tray2's avatar

@ibrahimwithi This is me getting the top ten from the table

select movie_id, 
			avg(rating), 
           count(movie_id)
from ratings 
group by movie_id 
order by count(movie_id) 
desc limit 10;

The result 685K records

+----------+-------------+-----------------+
| movie_id | avg(rating) | count(movie_id) |
+----------+-------------+-----------------+
|     4101 |      3.0385 |             104 |
|      671 |      3.2100 |             100 |
|     6388 |      3.2100 |             100 |
|     3521 |      2.8800 |             100 |
|     1229 |      2.9800 |             100 |
|     9497 |      2.9898 |              98 |
|     8943 |      2.9897 |              97 |
|     6478 |      3.1237 |              97 |
|     4051 |      3.2784 |              97 |
|     6559 |      2.9063 |              96 |
+----------+-------------+-----------------+
10 rows in set (0.274 sec)
1 like
Tray2's avatar

@ibrahimwithi I will generate 10M records and see how fast my query is.

At the moment I have 1.1M records and it takes less than half a second.

1 like
Tray2's avatar

@ibrahimwithi Do you know that you will have 50K stores from day one?

Probably not, so don't go overboard with optimizations just yet.

1 like
Tray2's avatar

@ibrahimwithi

select count(*) from ratings;
+----------+
| count(*) |
+----------+
| 11192260 |
+----------+
1 row in set (2.678 sec)
MariaDB [slask]> select movie_id, avg(rating), count(movie_id) from ratings group by movie_id order by count(movie_id) desc limit 10;
+----------+-------------+-----------------+
| movie_id | avg(rating) | count(movie_id) |
+----------+-------------+-----------------+
|     4291 |      3.0373 |            1261 |
|     1354 |      2.9491 |            1237 |
|     3249 |      2.9683 |            1232 |
|     6103 |      3.0268 |            1231 |
|     9314 |      2.9732 |            1231 |
|     9680 |      3.0911 |            1229 |
|     4022 |      2.9951 |            1229 |
|     6475 |      2.9967 |            1228 |
|     2587 |      3.0024 |            1227 |
|      775 |      2.9306 |            1224 |
+----------+-------------+-----------------+
10 rows in set (5.122 sec)
1 like
ibrahimwithi's avatar

@Tray2 The thing is I have a store with a one million rating and others vary from 50k to 300k, So is there is a way to store the result of this big query that takes 5 sec to execute and then increment it whenever a new record is added?

Instead of looping through the store ratings over and over, we store the average and then increment it idk if this is possible.

Tray2's avatar

@ibrahimwithi Sure there are several.

  1. Cache the result of the query for x hours before refreshing.
  2. Copy top ten to another table every x hours
  3. Copy all with with the average already calculated every x hours
  4. Use a materialized view that refreshes every x hours instead of copying the data with a job.
2 likes
ibrahimwithi's avatar

@Tray2 could you explain more on how to do step number 2 and 4, is there is some kind of a scheduler function that I can query the avg and update it in the ratings_average table?

ibrahimwithi's avatar

@Tray2 I will try it and give you feedback... Thank you so much for your replies ❤❤❤

Please or to participate in this conversation.