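Abstract (Chinese)
With the deepening of the AI era, computers and electronic products have become an indispensable part of modern life, and people's reliance on them grows by the day. Behind this technological progress, however, network security threats have intensified: malicious code such as viruses and Trojans is spreading at an alarming rate, and most of it consists of variants of known types, posing a serious challenge to personal privacy and information security.
This paper proposes a malicious code classification system based on convolutional neural networks, aimed at the exponential growth in the volume of malicious code and the diversification of its variants. The system first uses an improved B2M algorithm to convert the disassembly code of malware into square images, and then applies Gist, a global feature extraction technique, to extract key features from those images. On top of these features, we design and optimize a convolutional neural network (CNN) model, improving its performance by tuning the convolutional layer parameters, adopting a learning rate decay strategy, and introducing batch normalization. Experimental results show that the system achieves high accuracy in identifying and classifying malware, with a low likelihood of misclassification.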
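The image conversion step can be illustrated with a minimal sketch. It assumes the simplest possible mapping (one byte becomes one grayscale pixel, and the tail is zero-padded to fill a square); the improvements this paper makes to B2M are not reproduced here, and the function name `bytes_to_square_image` and the sample input are hypothetical.

```python
import math
from typing import List


def bytes_to_square_image(data: bytes, pad_value: int = 0) -> List[List[int]]:
    """Lay a byte sequence out as a square grid of grayscale pixels (0-255).

    Assumption: one byte maps to one pixel, row by row; unused cells at the
    end of the grid are filled with pad_value.
    """
    if not data:
        raise ValueError("input must contain at least one byte")
    # Smallest side length whose square can hold every byte.
    side = math.isqrt(len(data) - 1) + 1
    padded = list(data) + [pad_value] * (side * side - len(data))
    # Cut the flat pixel list into rows of equal length.
    return [padded[row * side:(row + 1) * side] for row in range(side)]


# Hypothetical input: a fragment of disassembly text read as raw bytes.
image = bytes_to_square_image(b"push ebp\nmov ebp, esp\n")
print(len(image), len(image[0]))
```

Padding to a square keeps every sample's aspect ratio fixed, so images built from files of different sizes can later be rescaled to a common resolution before feature extraction.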
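In addition, we developed a system with a user-friendly interface that displays the image representation of a malware sample and predicts the family to which it likely belongs. Experiments verify that the method is effective at identifying different kinds of malicious code: it achieves a high correct identification rate while keeping the false alarm rate under control. The system can accurately predict the family of an unknown sample and intuitively display the image features of the corresponding malicious code sample, which strengthens users' defensive awareness and their ability to respond quickly to potential malware.
Keywords: malicious code classification; convolutional neural network; Gist feature extraction; system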
ABSTRACT
With the deepening of the AI era, computers and electronic products have become an indispensable part of modern life, and people are increasingly dependent on them. Behind this technological progress, however, network security threats have also intensified. Malicious code such as viruses and Trojan horses reproduces and spreads at an alarming speed, and most of it consists of variants of known types, posing a serious challenge to personal privacy and information security.
This paper presents a malicious code classification system based on convolutional neural networks, aiming to meet the challenge posed by the exponential growth in the number of malicious code samples and the diversification of their variants. The system first uses an improved B2M algorithm to convert the disassembly files of malicious code into square images, and then extracts features from the images using the Gist global feature extraction method. A convolutional neural network model is then constructed and optimized for training, with strategies that include tuning the convolution kernels, learning rate decay, and batch normalization. The experimental results show that the system achieves a high accuracy and a low false positive rate in malicious code family classification. In addition, this paper develops a graphical interface system that displays the image of a malicious code sample and predicts its family, thus improving users' vigilance and their ability to guard against malicious code.
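To make the learning rate decay strategy concrete, the following is a minimal sketch of one common schedule, exponential decay. The abstract does not state which schedule the system uses, so the choice of schedule, the function name `exponential_decay`, and all numeric values below are assumptions for illustration only.

```python
def exponential_decay(initial_lr: float, decay_rate: float,
                      decay_steps: int, step: int) -> float:
    """Return the learning rate at a given training step.

    Assumed schedule: lr = initial_lr * decay_rate ** (step / decay_steps),
    so the rate is multiplied by decay_rate once every decay_steps steps.
    """
    if decay_steps <= 0:
        raise ValueError("decay_steps must be positive")
    return initial_lr * decay_rate ** (step / decay_steps)


# Hypothetical settings: start at 0.01 and halve the rate every 10 steps.
schedule = [exponential_decay(0.01, 0.5, 10, step) for step in (0, 10, 20)]
print(schedule)
```

A larger learning rate early in training speeds up convergence, while the smaller rate later on reduces oscillation around a minimum; this is the usual motivation for pairing a decay schedule with batch normalization.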
Experimental verification confirms that the method effectively identifies the categories of various malicious code, with both a high correct identification rate and a low false positive rate. The prediction system accurately predicts the family of unknown samples and intuitively displays the image features of the corresponding malicious code samples, which greatly improves users' defensive awareness and their ability to respond quickly to potential malware.
Keywords: malicious code classification; convolutional neural network; disassembly image; Gist feature extraction; Python implementation
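Contents
Abstract (Chinese)
ABSTRACT
1 Introduction
1.1 Research Background and Significance
1.2 Research Status at Home and Abroad
1.3 Technical Roadmap
2 Background Knowledge
2.1 Overview of Malicious Code Detection
2.2 Fundamentals of Deep Learning
2.2.1 Basic Structure of Convolutional Neural Networks
2.2.2 Activation
2.2.3 Pooling
3 Malicious Code Classification Method Based on Convolutional Neural Networks
3.1 Malicious Code Classification with Convolutional Neural Networks
3.1.1 Convolutional Network Structure
3.1.2 Deep Learning Classifier
3.2 Public Dataset
3.3 Data Preprocessing
3.4 Evaluation Metrics
3.5 Experimental Environment
3.6 Experimental Data
3.6.1 Experimental Results and Analysis
3.6.3 TensorBoard Visualization
4 Interactive Interface
4.1 Overall Framework of Interface Analysis
4.2 Design and Implementation of the Result Display System
4.2.1 Development Environment
4.2.2 System Design
4.2.3 System Function Implementation
References
Acknowledgements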