Abstract
In recipe recommendation specifically, most recommender systems rely on collaborative filtering. This approach struggles when there is too little user behavior data, suffers from the cold-start problem, and makes it hard for the many new items to be discovered. This thesis takes a different approach: it uses a content-based recommendation algorithm that focuses on the intrinsic similarity between recipes rather than on users' historical behavior, and can therefore effectively address the problems above. The thesis also examines the other parts of the system: how to design and store recipes in structured form, how to back-test and evaluate the recommendation results, and how to build a website that serves as the front-end display.
The recommendation algorithm is implemented mainly in Java, and the recipes are stored in structured form in a MySQL database. Once designed and developed, the system is presented as a website: the pages are written with JSP and hosted on Tomcat, the design follows the MVC architecture, the DAO pattern handles the connection to the database, and the main development tool is IntelliJ IDEA.
Keywords: recommender system, content-based recommendation algorithm, text similarity
Abstract
Contents
1 Introduction
1.1 Research background and significance
1.2 State of research in China and abroad
1.3 Main work and structure of the thesis
2 Overview of content-based recommender systems
2.1 Introduction
2.2 Data collection and storage
2.3 Content-based recommendation algorithms
2.4 Evaluating recommendation results
2.5 Chapter summary
3 A content-based recipe recommendation method
3.1 Introduction
3.2 Word segmentation of text
3.3 Text similarity computation
3.4 Similarity computation based on Tongyici Cilin (Extended)
3.5 Recommendation algorithm experiments
3.6 Chapter summary
4 Building and evaluating the recommender system
4.1 Introduction
4.2 Recommender system architecture
4.3 Recommender system functions
4.4 Evaluating recommendation results
4.5 Chapter summary
5 Performance testing and outlook for the recommender system
5.1 Introduction
5.2 System performance testing and optimization
5.3 Outlook for the recommender system
5.4 Chapter summary
6 Conclusion
References
Translated material