Research on Image Style Transfer Based on Convolutional Neural Networks
Abstract
Image Style Transfer (IST) is a technique that algorithmically extracts the style of a famous painting and transfers that style to a given content image. Researchers have carried out extensive work on the two key technologies of the field: style modeling and image reconstruction. The research in this paper is therefore organized around these two technologies.
First, this paper implements four types of IST algorithm, distinguished by the key technology each one uses, and describes two of them in detail: IST based on a Convolutional Neural Network (CNN-IST) and the real-time PSPM algorithm based on Perceptual Loss (PL-PSPM). Each of the two algorithms is presented in the following order: (1) research framework and network structure, covering reflection padding, residual layers, fractionally strided convolution, and related components; (2) the mathematical theory behind the algorithm; (3) experimental details and performance evaluation, covering the experimental details and parameter settings, followed by quantitative and qualitative analysis of the reproduced results. Second, building on existing evaluation methods, this paper attempts to propose reasonable evaluation methods for comparing the four IST algorithms from two perspectives: (1) three-way trade-off evaluation, which compares result quality, algorithm efficiency, and style flexibility; (2) loss-metric evaluation, which compares the training loss curves and the final losses.
In addition, this paper conducts extension experiments on the ASPM algorithm based on Feature Transformation (FT-ASPM). There are two experiments: (1) multi-style fusion transfer based on texture synthesis, which fuses multiple styles by linear interpolation; (2) region-specific style transfer based on mask operations, which transfers a particular style to a region of interest. Mathematically, both experiments rely only on simple linear interpolation and logical operations on masks, so they can be readily carried over to the latest IST algorithms, giving users more stylization options.
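As a rough illustration of why these two extensions are easy to port, the following minimal NumPy sketch shows a weighted linear blend of several stylized arrays and a mask-based selection of a region of interest. The function names `blend_styles` and `masked_transfer`, the array shapes, and the toy values are assumptions made for illustration only; this section does not state whether the thesis applies these operations in feature space or pixel space, so the arrays here simply stand in for either.

```python
import numpy as np


def blend_styles(stylized_arrays, weights):
    """Linearly interpolate several stylized arrays of identical shape.

    Hypothetical helper: computes sum_k w_k * F_k with weights normalised to 1.
    """
    weights = np.asarray(weights, dtype=np.float64)
    if weights.ndim != 1 or len(weights) != len(stylized_arrays):
        raise ValueError("need exactly one weight per stylized array")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    stacked = np.stack([np.asarray(a, dtype=np.float64) for a in stylized_arrays])
    # Contract the weight vector with the leading "style" axis of the stack.
    return np.tensordot(weights, stacked, axes=1)


def masked_transfer(stylized, content, mask):
    """Keep stylized values inside the mask and content values outside it."""
    mask = np.asarray(mask, dtype=bool)
    if mask.ndim == 2:
        mask = mask[..., None]  # broadcast an (H, W) mask over the channel axis
    return np.where(mask, stylized, content)


# Toy data standing in for two stylized results and one content image (H, W, C).
style_a = np.zeros((4, 4, 3))
style_b = np.ones((4, 4, 3))
fused = blend_styles([style_a, style_b], weights=[0.25, 0.75])

content = np.full((4, 4, 3), 0.5)
roi = np.zeros((4, 4), dtype=bool)
roi[:2, :] = True  # region of interest: the top half of the image
result = masked_transfer(fused, content, roi)
```

Because both helpers operate on plain arrays, the same two steps could in principle be wrapped around the output of any stylization model, which is the portability argument made above.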
Finally, this paper summarizes the strengths and weaknesses of the four types of IST algorithm using the evaluation methods proposed above, and outlines future prospects for the field in light of its current development.
Key words: Image Style Transfer, Convolutional Neural Network, CNN-IST, PL-PSPM, FT-ASPM
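Contents
1 Introduction
1.1 Research Background
1.2 Research Purpose and Significance
1.2.1 Application Value of the Research
1.2.2 Frontier and Academic Significance of the Research
1.3 Research Content
1.3.1 Style Modeling
1.3.2 Image Reconstruction
1.4 Research Status and Challenges
1.4.1 Evaluation Methods
1.4.2 Theoretical Support
1.4.3 Three-Way Trade-off in Style Transfer
1.5 Previous Work
1.5.1 IOB-IST Algorithms
1.5.2 MOB-IST Algorithms
1.6 Paper Structure and Chapter Arrangement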
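2 CNN-Based Image Style Transfer Algorithm
2.1 CNN-IST Research Framework
2.2 CNN-IST Basic Network Structure
2.2.1 Two-Dimensional Convolutional Layer
2.2.2 Activation Function
2.2.3 Pooling Layer
2.2.4 Fully Connected Layer
2.3 CNN-IST Algorithm Theory
2.4 Experimental Design and Algorithm Evaluation
2.4.1 Experimental Details and Parameter Settings
2.4.2 Quantitative Evaluation
2.4.3 Qualitative Evaluation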
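3 Real-Time PSPM Algorithm Based on Perceptual Loss
3.1 PL-PSPM Network Framework
3.1.1 Reflection Padding
3.1.2 Residual Block
3.1.3 Fractionally Strided Convolution
3.1.4 Batch Normalization
3.2 PL-PSPM Algorithm Theory
3.3 Experimental Design and Algorithm Evaluation
3.3.1 Experimental Details and Parameter Settings
3.3.2 Qualitative Evaluation
3.3.3 Quantitative Evaluation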
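4 Algorithm Comparison
4.1 Three-Way Trade-off Evaluation
4.1.1 Quality Evaluation
4.1.2 Algorithm Efficiency and Flexibility Evaluation
4.2 Loss-Metric Evaluation
4.2.1 Training Loss Comparison
4.2.2 Final Loss Evaluation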
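5 Algorithm Extensions
5.1 FT-ASPM Algorithm Theory
5.2 Multi-Style Fusion Transfer Based on Texture Synthesis
5.3 Region-Specific Style Transfer Based on Mask Operations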
Conclusion and Outlook
Summary of This Paper
Outlook of This Paper
References
Acknowledgements