For this reason, Mahout will not have much to say about this sort of recommendation. These ideas can be built into, and on top of, what Mahout provides; an example of this will follow in a later chapter, where you will build a recommender for a dating site.
For now, however, it is time to experiment with collaborative filtering within Mahout by creating some simple input and finding recommendations based on the input.
2.3 Evaluating a Recommender
A recommender engine is a tool, a means to answer the question, “what are the best recommendations for a user?” Before investigating the answers, it’s best to investigate the question. What exactly is a good recommendation? And how does one know when a recommender is producing them? The remainder of this chapter pauses to explore the evaluation of a recommender, because this is a tool that will be useful when looking at specific recommender systems.
The best possible recommender would be a sort of psychic that could somehow know, before you do, exactly how much you would like every possible item that you've not yet seen or expressed any preference for. A recommender that could predict all your preferences exactly would merely present all other items ranked by your future preference and be done. These would be the best possible recommendations.
And indeed most recommender engines operate by trying to do just this, estimating ratings for some or all other items. So, one way of evaluating a recommender's recommendations is to evaluate the quality of its estimated preference values – that is, evaluating how closely the estimated preferences match the actual preferences.
2.4 Training data and scoring
Those “actual preferences” don't exist, though. Nobody knows for sure how you'll like some new item in the future (including you). This can be simulated for a recommender engine by setting aside a small part of the real data set as test data. These test preferences are not present in the training data fed into the recommender engine under evaluation, which is all data except the test data. Instead, the recommender is asked to estimate preferences for the missing test data, and the estimates are compared to the actual values.
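The hold-out procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Mahout's own implementation: the function name `split_preferences`, the `(user, item, rating)` tuple layout, the fixed seed, and the toy ratings are all hypothetical.

```python
import random


def split_preferences(preferences, test_fraction=0.1, seed=42):
    """Split (user, item, rating) triples into training and test sets.

    Hypothetical helper: a fixed seed is used so the split is repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(preferences)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test = shuffled[:n_test]       # held back; the recommender never sees these
    training = shuffled[n_test:]   # everything else is fed to the recommender
    return training, test


# Made-up preferences: 5 users x 4 items, ratings from 1.0 to 5.0.
preferences = [
    (user, item, float((user + item) % 5 + 1))
    for user in range(1, 6)
    for item in range(101, 105)
]

training, test = split_preferences(preferences, test_fraction=0.25)
print(len(training), "training preferences,", len(test), "test preferences")
```

The recommender under evaluation would be built from `training` only and then asked to estimate a rating for each `(user, item)` pair in `test`.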
From there, it is fairly simple to produce a kind of “score” for the recommender. For example, it’s possible to compute the average absolute difference between the estimated and actual preferences. With a score of this type, lower is better, because a lower score means the estimates differed less from the actual preference values. A score of 0.0 would mean perfect estimation: no difference at all between estimates and actual values.
Sometimes the root-mean-square of the differences is used: this is the square root of the average of the squares of the differences between actual and estimated preference values. Again, lower is better.
Above, the table shows the difference between a set of actual and estimated preferences, and how they are translated into scores. Root-mean-square more heavily penalizes estimates that are way off, as with item 2 here, and that is considered desirable by some. For example, an estimate that’s off by 2 whole stars is probably more than twice as “bad” as one off by just 1 star. Because the simple average of differences is perhaps more intuitive and easy to understand, upcoming examples will use it.
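Both scores discussed above can be computed directly. The sketch below is a minimal illustration, assuming the actual and estimated preferences arrive as parallel lists; the function names and the three sample ratings are hypothetical and chosen only to show one estimate that is far off.

```python
import math


def average_absolute_difference(actual, estimated):
    """Mean of |actual - estimate| over all test preferences."""
    diffs = [abs(a - e) for a, e in zip(actual, estimated)]
    return sum(diffs) / len(diffs)


def root_mean_square(actual, estimated):
    """Square root of the mean of the squared differences."""
    squares = [(a - e) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squares) / len(squares))


# Illustrative values only: the second estimate is off by 3 stars.
actual = [3.0, 5.0, 4.0]
estimated = [3.5, 2.0, 5.0]

mae = average_absolute_difference(actual, estimated)
rmse = root_mean_square(actual, estimated)
print("average difference:", mae)
print("root-mean-square:  ", round(rmse, 4))
```

With these values the root-mean-square score comes out higher than the average difference, because squaring gives the single large error more weight.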
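B. Translation of the Original Text
Introducing Recommenders
This chapter covers:
• A first look at recommenders in action
• Evaluating the accuracy of a recommender engine
• Evaluating an engine's precision and recall
• Evaluating a recommender on a real data set: GroupLens
Every day, we form opinions about things we like, things we dislike, and things we don't even care about. It happens without our noticing. You hear a song on the radio and either notice it because it is catchy, or because it sounds awful, or perhaps don't notice it at all. The same thing happens with T-shirts, salads, hairstyles, ski resorts, faces, and television shows.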