基于AFL的覆盖引导模糊测试优化技术研究
摘 要
随着互联网的发展,应用软件已经渗透至各行各业,极大地方便人们的生产和生活,随之而来的是软件的安全问题。软件中的缺陷被称为漏洞,漏洞的存在除了影响软件的正常运行外,更大的威胁是被攻击者发现并利用,进而造成隐私泄露和财产损失。模糊测试作为一种自动化漏洞挖掘技术具有可扩展性强、专业要求低、适用性广等优势。覆盖引导模糊测试是模糊测试领域近几年来最成功的技术分支,它的核心思想是跟踪代码覆盖范围,并利用覆盖信息作为反馈来指导输入的生成,使测试朝着更高的代码覆盖率的方向前进。AFL是最流行的覆盖引导模糊测试器,其凭借高效性和易扩展性受到安全研究者的青睐,学术界已经形成了基于AFL的覆盖引导模糊测试技术生态圈。基于AFL的覆盖引导模糊测试主要包括预处理、种子选择、确定变异、随机变异、执行追踪和反馈指导六个模块,对以上任意一个模块进行合理地优化仍然可以提高模糊测试器的速度、效率或针对性。经过前期的调研分析工作发现技术生态圈仍然存在如下问题:第一,在随机变异阶段,现有随机变异调度策略无法为种子的每个变异位置选择最合适的变异算子。第二,在确定变异阶段存在大量的无效变异,将无效变异生成的测试用例输入到目标程序中执行会造成性能浪费。第三,在预处理阶段为目标程序插入的记录覆盖信息的桩代码存在哈希冲突,导致收集到的覆盖信息不准确。针对上述问题,本文分别设计了优化方案,并实现了新的模糊测试器,主要工作如下:
(1)提出基于强化学习的随机变异调度策略。变异调度是指随机变异阶段每一次变异时变异算子的选择,现有的变异调度策略能为不同种子选择出较适合的变异算子,但是无法为种子的每个变异位置选择最有效的变异算子,针对上述问题提出一种位置自适应变异调度策略,该技术通过一种自定义的双层多臂老虎机模型为变异位置和变异算子建立联系,并且采用置信区间上界算法解决模糊测试场景中的多臂老虎机问题。基于AFL实现了位置自适应的模糊测试器Position-Adaptive Mutation Scheduling Strategy AFL(PAMSSAFL),实验结果表明,位置自适应的变异调度策略能明显提升模糊测试器的bug发现能力和覆盖能力。
(2)提出轻量级自适应敏感区域确定变异技术。确定变异阶段存在大量无效变异,将无效变异生成的测试用例输入到目标程序中执行会造成性能浪费。针对上述问题提出一种自适应敏感区域变异算法,该算法将变异后执行新路径的字节标记为敏感字节,并基于敏感字节延展形成敏感区域,将后续变异限制在敏感区域内,并且根据测试用例的执行结果自适应地调整种子的敏感区域,实现减少无效变异的目的。此外,为了更深入地探索敏感区域,设计了新的种子选择策略和修剪策略来配合敏感区域变异。将自适应敏感区域变异算法集成到AFL中,并将其命名为Sensitive-region-based Mutation American Fuzzy Lop(SMAFL)。实验结果表明,自适应敏感区域变异算法能够提升模糊测试器的探索效率。
(3)提出基于位运算的覆盖信息记录方案。覆盖引导模糊测试利用覆盖信息指导探索方向,所以覆盖信息的准确性一定程度上决定了模糊测试器发现漏洞的能力。AFL利用一个位图存储边的命中次数作为覆盖信息,每条边都被随机分配到位图中的一个位置,并在该位置上记录命中次数。由于分配的随机性,不同的边可能在位图中的同一个位置存储命中计数,即发生哈希冲突。哈希冲突导致覆盖信息不准确,甚至可能影响模糊测试器的探索方向。针对上述问题,对造成哈希冲突的情况进行定量分析,根据分析结果提出了一种基于位运算的哈希冲突缓解方案,该方案将目标程序分为小型程序和大型程序,对小型程序采用位运算解决哈希冲突,对大型程序采用再哈希法缓解哈希冲突。在AFL之上实现该方案形成了BitAFL,实验结果表明该方案能为AFL提供更准确的覆盖信息并提升覆盖率。
(4)实现了新的覆盖引导模糊测试器Smart Coverage-guided Fuzzer(SmartCGF)。将基于强化学习的随机变异调度策略、轻量级自适应敏感区域确定变异技术和基于位运算的覆盖信息记录方案进行有机融合,实现了新的覆盖引导模糊测试器SmartCGF。实验结果表明,在可接受的额外开销的前提下,SmartCGF的漏洞挖掘能力和覆盖能力提升显著。
关键词:软件漏洞,漏洞挖掘,模糊测试,覆盖率,遗传算法,变异
Abstract
With the development of the Internet, application software has penetrated into all walks of life, greatly facilitating people's production and life, followed by software security issues. The defects in software are called bug. In addition to affecting the normal operation of software, the existence of bugs is more threatening to be found and used by attackers, thus causing privacy leakage and property losses. As an automated vulnerability mining technology, fuzzing has the advantages of strong scalability, low professional requirements, and wide applicability. Coverage-guided fuzzing is the most successful technology branch in the field of fuzzing in recent years. Its core idea is to track code coverage and use coverage information as feedback to guide the generation of input, so that testing moves towards higher code coverage. AFL is the most popular coverage-guided fuzzer, which is favored by security researchers because of its efficiency and scalability. The academic community has formed an AFL based coverage-guided fuzzing technology ecosystem. The coverage-guided fuzzing based on AFL mainly includes six modules: pretreatment, seed selection, deterministic mutation, random mutation, execution tracking and feedback guidance. Reasonable optimization of any of the above modules can still improve the speed, efficiency or pertinence of the fuzzer. After the previous research and analysis work, it is found that the following problems still exist in the technology ecosystem: First, in the random mutation stage, the existing random mutation scheduling strategy cannot select the most appropriate mutation operator for each mutation position of the seed. Second, there are a lot of invalid mutations in the deterministic mutation stage, and it will waste performance to input the test cases generated by invalid mutations into the target program for execution. Third, there is hash conflict in the instrumentation code inserted for the target program to record the coverage information in the preprocessing stage, resulting in inaccurate coverage information collected. In view of the above problems, this paper has designed optimization schemes and implemented new fuzzers. The main work is as follows:
(1) A random mutation scheduling strategy based on reinforcement learning is proposed. The mutation scheduling refers to the selection of mutation operators at each mutation in the random mutation stage. The existing mutation scheduling strategy can select more suitable mutation operator for different seeds, but it is unable to select the most effective mutation operator for each mutation position of a seed. To solve the above problem, a position-adaptive mutation scheduling strategy is proposed. This technology establishes the relationship between the mutation position and the mutation operator through a user-defined double-layer multi-armed bandit model, and uses the Upper Confidence Bound algorithm to solve the multi-armed bandit problem in the fuzzing scenario. PAMSSAFL(Position-Adaptive Mutation Scheduling Strategy AFL), a position-adaptive fuzzer, is implemented based on AFL. The experimental results show that the position-adaptive mutation scheduling strategy can significantly improve the bug detection ability and coverage ability of the fuzzer.
(2) A lightweight adaptive sensitive region determination mutation technique is proposed. There are a lot of invalid mutations in the deterministic mutation stage, and it will cause performance waste to input the test cases generated by the invalid mutations into the target program for execution. To solve the above problem, an adaptive sensitive region mutation algorithm is proposed. The algorithm marks the byte executing the new path after mutation as sensitive byte, and forms a sensitive region based on the extension of sensitive byte. The subsequent mutation is limited to the sensitive region, and the sensitive region of the seed is adaptively adjusted according to the execution results of the test case to reduce invalid mutations. In addition, in order to explore the sensitive region more deeply, new seed selection strategies and trimming strategies were designed to match the sensitive region mutation. The adaptive sensitive region mutation algorithm is integrated into AFL and named Sensitive-region-based Mutation AFL (SMAFL). The experimental results show that the adaptive sensitive region mutation algorithm can improve the exploration efficiency of the fuzzer.
(3) A coverage information recording scheme based on divide and conquer is proposed. Coverage-guided fuzzing uses coverage information to guide the direction of exploration, so the accuracy of coverage information to some extent determines the ability of fuzzers to detect vulnerabilities. AFL uses a bitmap to store the hit counts of edges as coverage information. Each edge is randomly assigned to a location in the bitmap, and the hit counts are recorded at that location. Due to the randomness of the allocation, different edges may store hit counts at the same location in the bitmap, that is, hash conflicts occur. Hash conflict leads to inaccurate coverage information and may even affect the exploration direction of the fuzzer. In view of the above problems, a quantitative analysis of the situation that causes hash conflict is carried out. According to the analysis results, a hash conflict mitigation scheme based on bit operation is proposed. The scheme divides the target program into small programs and large programs. For small programs, bit operation is used to solve hash conflict, and for large programs, rehash method is used to alleviate hash conflict. The scheme is implemented on top of AFL to form BitAFL. Experimental results show that the scheme can provide more accurate coverage information for AFL and improve coverage.
(4) A smart coverage-guided fuzzer (SmartCGF) is implemented. By organically integrating the random mutation scheduling strategy based on reinforcement learning, the lightweight adaptive sensitive region determination mutation technology and the coverage information recording scheme based on divide and conquer, the smart coverage-guided fuzzer SmartCGF is realized. The experimental results show that under the premise of acceptable additional overhead, the vulnerability mining ability and coverage ability of SmartCGF are significantly improved.
Key words:Software vulnerability, vulnerability mining, fuzzing, coverage, genetic algorithm, mutation