I am trying to figure out the fastest way to access data stored in a junction object. The example below is analogous to my problem, but with a different context, because the actual dataset I am dealing with is somewhat unintuitive in its relationships.
We have 3 classes: User, Product, and Rating. User has a many-to-many relationship to Product with Rating as the junction/'through' class.
The Rating object stores the answers to several questions, which are integer ratings on a scale of 1-5 (example questions: how is the quality of the Product, how is the value of the Product, how user-friendly is the Product). For simplicity, assume every User rates every Product they buy.
Now here is the calculation I want to perform: for a User, calculate the average rating of all the Products they have bought (that is, the average over the ratings from all Users who rated each Product, one of which will be from this User themself). Then we can tell the user "On average, you buy products rated 3/5 for value by all customers who bought that product".
The simple and slow way is to iterate over all of a user's Rating objects and, for each rated Product, over all of that Product's Rating objects. If we assume that each user has bought a small number (<100) of products, and each product has n ratings, this is O(100n) = O(n) per user.
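To make the slow version concrete, here is a minimal sketch in plain Python. The class and field names (user_id, value, ratings) are stand-ins I made up for illustration, not my real schema, and I am assuming "average" means the mean of the per-product means for a single question (value):

```python
from dataclasses import dataclass, field


@dataclass
class Rating:
    user_id: int      # hypothetical field: who gave the rating
    value: int        # hypothetical field: 1-5 score for the "value" question


@dataclass
class Product:
    name: str
    ratings: list = field(default_factory=list)  # all Ratings for this product


def naive_user_average(user_id, products):
    """Mean of the per-product mean 'value' rating over products the user rated.

    Touches every Rating of every product the user bought, so the cost grows
    with the number of ratings per product.
    """
    per_product_means = []
    for product in products:
        # Skip products this user never rated (i.e. never bought).
        if not any(r.user_id == user_id for r in product.ratings):
            continue
        scores = [r.value for r in product.ratings]
        per_product_means.append(sum(scores) / len(scores))
    if not per_product_means:
        raise ValueError("user has not rated any product")
    return sum(per_product_means) / len(per_product_means)
```

In a real database the inner loops would be queries against the junction table rather than Python lists, but the amount of data touched is the same.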
However, I could also do the following: on the Product class, keep a counter of the number of Ratings that selected each number (e.g. how many Users rated this product 3/5 for value). If you increment that counter every time a Product is rated, then computing the average for a given Product just requires checking the 5 counters for each Rating criterion.
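Roughly what I have in mind for the counters, again as a plain-Python sketch under my own assumptions (ProductStats, add_rating and average are hypothetical names, and the counters would really be columns on Product):

```python
class ProductStats:
    """Per-product counters: for each criterion, how many ratings chose 1..5."""

    def __init__(self):
        # Maps criterion name -> list of 5 counts; index 0 holds the count of 1s.
        self.counts = {}

    def add_rating(self, criterion, score):
        """Increment the counter for this score; called once per new Rating."""
        if not 1 <= score <= 5:
            raise ValueError("score must be between 1 and 5")
        buckets = self.counts.setdefault(criterion, [0] * 5)
        buckets[score - 1] += 1

    def average(self, criterion):
        """Compute the mean from the 5 counters, without reading any Rating."""
        buckets = self.counts[criterion]
        total = sum(buckets)
        if total == 0:
            raise ValueError("no ratings recorded for this criterion")
        weighted = sum((i + 1) * count for i, count in enumerate(buckets))
        return weighted / total
```

With this, the per-user calculation only reads the 5 counters of each product the user bought, regardless of how many ratings each product has. The sketch ignores edited or deleted ratings, which would need a matching decrement.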
Is this a valid technique? Is it commonly employed/is there a name for it? It seems intuitive to me, but I don't know enough about databases to tell whether there's some fundamental flaw or not.