A Simple Example of User-Based Recommendation with Mahout (2)
jopen
9 years ago
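First, I wrapped the user-based recommender in a small utility class. It still uses Pearson correlation as the similarity measure; other similarity measures can be wrapped the same way.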
package com.liuxinquan.utils;

import java.io.File;
import java.io.IOException;
import java.util.Collections;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class UserPersonSim {

    // Builds a user-based recommender over the data file and returns up to
    // recCnt recommended items for userId. Returns an empty list on failure,
    // so callers can iterate over the result without a null check.
    public static List<RecommendedItem> userRec(String filePath, int nearCnt, int userId, int recCnt) {
        try {
            // Load the preference data from the CSV file.
            DataModel model = new FileDataModel(new File(filePath));
            // Pearson correlation between users' ratings.
            UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
            // Neighborhood made up of the nearCnt most similar users.
            UserNeighborhood neighborhood = new NearestNUserNeighborhood(nearCnt, similarity, model);
            Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
            return recommender.recommend(userId, recCnt);
        } catch (IOException e) {
            // The data file could not be read.
            e.printStackTrace();
        } catch (TasteException e) {
            // Mahout failed while computing the recommendations.
            e.printStackTrace();
        }
        return Collections.emptyList();
    }
}
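The four parameters:
filePath -- the data file to analyze (.csv)
nearCnt -- the neighborhood size, i.e. how many of the most similar users are taken into account
userId -- the ID of the user to generate recommendations for
recCnt -- the maximum number of items to recommend to that user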
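For intuition about what the Pearson similarity used above measures, here is a minimal sketch in plain Java with no Mahout dependency. It is not Mahout's implementation; the class name PearsonSketch, the method pearson, and the ratings in main are made up for illustration. It assumes the correlation is computed only over the items both users have rated.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout.
class PearsonSketch {

    // Pearson correlation between two users' ratings (itemId -> rating),
    // computed only over the items that both users have rated.
    // Returns NaN when there are fewer than two co-rated items or when
    // either user's co-rated ratings have no variance.
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        int n = 0;
        double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other == null) {
                continue; // item rated by only one of the two users
            }
            double x = e.getValue();
            double y = other;
            n++;
            sumA += x;
            sumB += y;
            sumAA += x * x;
            sumBB += y * y;
            sumAB += x * y;
        }
        if (n < 2) {
            return Double.NaN;
        }
        double cov = sumAB - sumA * sumB / n;
        double varA = sumAA - sumA * sumA / n;
        double varB = sumBB - sumB * sumB / n;
        if (varA <= 0 || varB <= 0) {
            return Double.NaN;
        }
        return cov / Math.sqrt(varA * varB);
    }

    public static void main(String[] args) {
        // Made-up ratings for two users; not taken from intro.csv.
        Map<Long, Double> u1 = new HashMap<>();
        u1.put(101L, 5.0);
        u1.put(102L, 3.0);
        u1.put(103L, 2.5);
        Map<Long, Double> u2 = new HashMap<>();
        u2.put(101L, 2.0);
        u2.put(102L, 2.5);
        u2.put(103L, 5.0);
        u2.put(104L, 2.0);
        System.out.println(pearson(u1, u2));
    }
}
```

A value close to 1 means the two users rate their shared items in the same pattern, and a value close to -1 means opposite tastes. The nearCnt parameter then decides how many of the highest-scoring users form the neighborhood.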
Usage:
package com.liuxinquan.recommmder;

import java.util.HashMap;
import java.util.Map;

import org.apache.mahout.cf.taste.recommender.RecommendedItem;

import com.liuxinquan.utils.UserPersonSim;

public class UserRecommder {

    public static void main(String[] args) {
        // Map the numeric item IDs in the data file back to readable names.
        Map<String, String> map = new HashMap<>();
        map.put("101", "orange");
        map.put("102", "apple");
        map.put("103", "banana");
        map.put("104", "pear");
        map.put("105", "watermelon");
        map.put("106", "Hami melon");
        map.put("107", "grape");

        String filePath = "xxx/intro.csv";
        // Use the 2 nearest neighbors to recommend 1 item to user 1.
        for (RecommendedItem item : UserPersonSim.userRec(filePath, 2, 1, 1)) {
            System.out.println(map.get(item.getItemID() + ""));
        }
    }
}

Result: pear
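This matches item 104 from the previous post. Printing a name instead of an ID is closer to real-world use, and it also suggests a general approach. Real data rarely arrives as numeric IDs; it more often looks like "Zhang San: pear". So we need to define a mapping rule: first abstract the features of the real-world objects (for a book, say, its author, publisher, and publication date), map those features to numbers, run the recommender, and then map the numeric results back to the original values.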
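The mapping idea can be sketched as a small two-way lookup table in plain Java. This is only an illustration under stated assumptions: the class NameIdMapper, its methods, the starting IDs, the sample name "Li Si", and the placeholder preference value 1.0 are all hypothetical and do not come from Mahout or from the original data file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: assigns consecutive numeric IDs to names and maps them back.
class NameIdMapper {
    private final Map<String, Long> idByName = new HashMap<>();
    private final List<String> nameByIndex = new ArrayList<>();
    private final long firstId;

    NameIdMapper(long firstId) {
        this.firstId = firstId;
    }

    // Returns the ID for a name, assigning the next free ID the first time it is seen.
    long toId(String name) {
        Long existing = idByName.get(name);
        if (existing != null) {
            return existing;
        }
        long id = firstId + nameByIndex.size();
        idByName.put(name, id);
        nameByIndex.add(name);
        return id;
    }

    // Returns the name for an ID, or null if that ID was never assigned.
    String toName(long id) {
        long index = id - firstId;
        if (index < 0 || index >= nameByIndex.size()) {
            return null;
        }
        return nameByIndex.get((int) index);
    }

    public static void main(String[] args) {
        NameIdMapper users = new NameIdMapper(1);   // user IDs start at 1 (assumption)
        NameIdMapper items = new NameIdMapper(101); // item IDs start at 101 (assumption)

        // Made-up "name: item" records standing in for real-world data.
        String[][] raw = {
            {"Zhang San", "pear"},
            {"Li Si", "apple"},
            {"Zhang San", "apple"},
        };
        for (String[] record : raw) {
            // Emit userID,itemID,preference lines; 1.0 is a placeholder preference.
            System.out.println(users.toId(record[0]) + "," + items.toId(record[1]) + ",1.0");
        }
        // After recommending, translate a numeric item ID back to its name.
        System.out.println(items.toName(101));
    }
}
```

The same mapper instance has to be used for both directions, so that the IDs written to the data file and the IDs returned by the recommender refer to the same names.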