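Abstract: To address the computational efficiency problem that efficient collaborative filtering (CF) recommendation techniques face when processing big data, a parallel ASUCF algorithm is proposed. The algorithm adopts the MapReduce parallel programming model of the Hadoop platform to improve on the computational performance of the efficient CF algorithm when it runs on a single machine in a big data environment. Finally, in the experimental section, the ASUCF algorithm is parallelized with Mahout, and speedup experiments on different data sets are designed to verify that the parallelized algorithm achieves good computational performance in a big data environment.
Keywords: collaborative filtering; computational efficiency; speedup; Hadoop; Mahout
CLC number: TP391
Document code: A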
Foundation item: Research on Intelligent Algorithms for Information System Performance Auditing Oriented to Service Workflows (GGSS2015-06).
Research on Parallel ASUCF Algorithm Based on Hadoop and Mahout
CAO Ping
(Nanjing Audit University, Nanjing 211815, China)
Abstract: To address the computational efficiency problem of collaborative filtering (CF) in big data processing, this paper proposes a parallel ASUCF (Average Similarity of User-Item Collaborative Filtering) algorithm. The algorithm applies the MapReduce parallel programming model on the Hadoop platform to improve on the computational performance of CF algorithms that run on a single machine. The parallelization of ASUCF is implemented with Mahout, and speedup experiments are designed on different data sets. The experimental results show that the parallelized algorithm achieves good computational performance in big data processing.
Keywords: collaborative filtering; computing efficiency; speedup; Hadoop; Mahout