If you talk to the people who design systems to produce recommendations tailored to user preferences, you'll see lots of impressively daunting mathematics, with formulae for measuring things like "Mean Absolute Error".
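To be fair to the designers, a metric like Mean Absolute Error is at least well defined: the average absolute gap between the ratings a system predicts and the ratings users actually give on held-out data. A minimal sketch, with all the numbers invented purely for illustration:

```python
def mean_absolute_error(predicted, actual):
    """MAE: the average absolute difference between predicted and actual ratings."""
    assert len(predicted) == len(actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical held-out ratings on a 1-5 scale, purely for illustration.
predicted = [4.2, 3.1, 5.0, 2.4]
actual    = [4,   3,   4,   2]

print(mean_absolute_error(predicted, actual))  # 0.425
```

The catch, and the point of what follows, is that a low MAE only tells you the predicted ratings sit close to the observed ones; it doesn't, by itself, tell you whether the recommendations were any good.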
If you talk to the people who sell these systems, you'll hear stories of uncannily perceptive and prescient suggestions that anticipated interests Jo Consumer didn't yet know she had.
What neither group tends to volunteer until asked is that there are no agreed criteria for what makes a perfect, or even a good, recommendation.
Hence you will see reviews of different recommender systems where people feed the same starting point into each system and compare the results to judge which is best. But the judgements, like this one, are based on the subjective views of experts: the experts have no way of articulating what lies behind those judgements in a form that could be replicated in other studies.

And how reliable are those judgements? When Paul Lamere asked the readers of his blog to compare two sets of recommendations and identify which was produced by an expert reviewer and which by a machine, over three quarters of the respondents got it wrong. In other words, tossing a coin would have served you better than asking these people. (I was one of the few who got it right, but I had already said there was no way to determine which was which with any confidence, so my answer was a guess, and it turned out to be a lucky one.)
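Out of curiosity, you can ask how far "over three quarters wrong" sits from coin-tossing. A quick sketch using SciPy's exact binomial test; the respondent count below is a made-up placeholder, since I'm not reproducing the actual poll numbers:

```python
from scipy.stats import binomtest

# Hypothetical numbers: suppose 100 readers responded and 76 guessed wrong.
# (The actual counts from Paul Lamere's poll aren't reproduced here.)
n_respondents = 100
n_wrong = 76

# Under pure coin-tossing, each respondent guesses wrong with probability 0.5.
result = binomtest(n_wrong, n_respondents, p=0.5, alternative='greater')
print(result.pvalue)  # far below 0.05: readers did reliably *worse* than chance
```

Being systematically worse than chance is itself informative: it arguably suggests the readers shared a consistent, but mistaken, picture of what machine-generated recommendations look like.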
So what is a good recommendation, and can we measure value or 'accuracy' at all?