Caching vs. 2nd database table for fast-changing data

Posted 1 month ago by lambooni

I see a lot of guides using caching for things such as dashboards, where we are more than happy for the data to be an hour or two old. I do not see many examples where caching is used in cases where invalidation is critical as soon as the data is updated.

In my use case, I am doing stock counts against a delivery of items. There may be a few hundred items per delivery, and as stock is scanned in, the counts increment. When I run a query, I must be able to ensure that I am getting the most up-to-date data; however, when I view the whole lot at once I do not want a page load of 10 seconds or more!

The source of the data is a stock history table which records every scan coming in/out etc. So to find out how much was on the delivery I would need to sum all of these individual event rows, which is slow. Currently, to get around this, I have created another table which just holds the sums. When a scan is made, the history table is updated and the sums table is updated with the most up-to-date tally.
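To make this concrete, here is roughly what I am doing now, simplified into a Python/SQLite sketch. The table and column names (`stock_events`, `delivery_item_counts`) are illustrative, not my real schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock_events (
        delivery_id INTEGER,
        item_id     INTEGER,
        quantity    INTEGER   -- positive for scans in, negative for scans out
    );
    CREATE TABLE delivery_item_counts (
        delivery_id INTEGER,
        item_id     INTEGER,
        total       INTEGER,
        PRIMARY KEY (delivery_id, item_id)
    );
""")

def record_scan(delivery_id: int, item_id: int, quantity: int) -> None:
    """Write the event and bump the running sum in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO stock_events (delivery_id, item_id, quantity) "
            "VALUES (?, ?, ?)",
            (delivery_id, item_id, quantity),
        )
        # Upsert the running total (requires SQLite 3.24+).
        conn.execute(
            """INSERT INTO delivery_item_counts (delivery_id, item_id, total)
               VALUES (?, ?, ?)
               ON CONFLICT (delivery_id, item_id)
               DO UPDATE SET total = total + excluded.total""",
            (delivery_id, item_id, quantity),
        )

record_scan(1, 42, 5)
record_scan(1, 42, 3)
# Fast read: one indexed row instead of summing every event.
print(conn.execute(
    "SELECT total FROM delivery_item_counts "
    "WHERE delivery_id = 1 AND item_id = 42"
).fetchone())  # (8,)
```

Wrapping both writes in a single transaction at least keeps the sums table from drifting out of sync with the history.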

My approach does work, but storing two copies of the same data feels messy, especially when I need different queries/functions to retrieve data from each respective table. It could be cleaner if I could just run the original sum query against the histories table and use caching to save the result... but what are your thoughts?

Requirements:

  • Must be able to guarantee that the cache is invalidated whenever the sum changes for an item (see the sketch below this list).
  • The data is fast changing during the delivery process, but once it's complete it should never change.
  • The data would only be accessed occasionally after the delivery process is complete.
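Here is a sketch of the cache-aside alternative I am considering, again with illustrative names. An in-process dict stands in for a real cache store such as Redis; the invalidation logic would be the same either way, with every write deleting the affected key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stock_events (
        delivery_id INTEGER, item_id INTEGER, quantity INTEGER
    )
""")

_cache: dict[tuple[int, int], int] = {}

def record_scan(delivery_id: int, item_id: int, quantity: int) -> None:
    with conn:
        conn.execute(
            "INSERT INTO stock_events VALUES (?, ?, ?)",
            (delivery_id, item_id, quantity),
        )
    # Invalidate immediately so the next read recomputes the sum.
    _cache.pop((delivery_id, item_id), None)

def item_total(delivery_id: int, item_id: int) -> int:
    key = (delivery_id, item_id)
    if key not in _cache:  # cache miss: run the slow aggregate once
        row = conn.execute(
            "SELECT COALESCE(SUM(quantity), 0) FROM stock_events "
            "WHERE delivery_id = ? AND item_id = ?",
            key,
        ).fetchone()
        _cache[key] = row[0]
    return _cache[key]

record_scan(1, 42, 5)
print(item_total(1, 42))   # 5 (computed from the history, then cached)
record_scan(1, 42, 3)      # the write invalidates the cached value
print(item_total(1, 42))   # 8 (recomputed on the next read)
```

Since the data never changes once the delivery is complete, the cached value would effectively stay valid forever afterwards, which seems to fit the last two requirements.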

So what would you say is better in this case - a caching and invalidation mechanism, or a secondary "sums" table?
