Abstract
Text is an important tool for human communication. With the continuing development of technology and the Internet, the form, or carrier, of text has changed greatly. Text is no longer confined to paper; it also appears in daily life on signs, banners, billboards, and similar media. In other words, it exists as text information within images. Detecting and recognizing this information by computer would bring great convenience, for example when an autonomous driving system reads roadside signs, when a parking lot recognizes license plates, or when ID card information is captured by scanning.
This graduation project falls under object detection and recognition within computer vision. Its targets are the various kinds of text found in natural scenes; put simply, it detects and recognizes the text in an image. Owing to the particular nature of text, the project divides the information extraction process into two parts: detection and recognition.
The thesis introduces and analyzes the relevant technical concepts, such as machine learning, deep learning, and the network models used, together with how they work.
The detection part detects text by locating horizontal text lines, drawing mainly on the CTPN method of Qiao Yu's team. The main text analyzes the system in some detail, from building the model to designing and implementing the neural network.
The recognition part uses DenseNet + CTC, which recognizes printed text well.
Keywords: deep learning; text detection; text recognition; CTPN; DenseNet; CTC
Contents
1 Introduction
1.1 Background and Motivation
1.2 Current State of Research
2 Related Technologies
2.1 TensorFlow Framework
2.2 OpenCV
2.3 DenseNet (Dense Convolutional Network)
2.4 CTC (Connectionist Temporal Classification)
2.5 Faster R-CNN Framework
2.5.1 RPN
2.5.2 Fast R-CNN
3 Basic Concepts of Deep Learning
3.1 Convolution
3.2 Pooling
3.3 Padding
3.4 Convolutional Neural Networks
3.5 VGG16 Model
3.6 LSTM Model
4 Detailed System Design
4.1 Overview of the Basic Pipeline
4.2 Image Preprocessing
4.3 Model Training
4.3.1 Dataset Preparation
4.3.2 Model Construction
4.4 Text Detection
4.4.1 Overview of the Detection Process
4.4.2 Feature Map (W*H*C)
4.4.3 Sliding Window
4.4.4 BLSTM
4.4.5 Fully Connected Layer
4.4.6 Text Proposals
4.4.7 Text Line Construction
4.5 Text Recognition
4.5.1 DenseNet
4.5.2 CTC
5 Experimental Results
5.1 Data Sources
5.2 Software and Hardware Environment
5.3 Evaluation Criteria
5.4 CTPN Detection Results on ICDAR 2011, ICDAR 2013, and ICDAR 2015
5.5 Sample Detection and Recognition Results
5.6 Analysis of Detection and Recognition Results
6 Conclusion
6.1 Summary
6.2 Future Work
Acknowledgments
References