基于CUDA C并行计算架构的大规模图着色算法毕业论文+任务书+开题报告+文献资料+外文翻译及原文+答辩PPT+代码及测试数据-毕业作品网站

设计任务书文档开题答辩说明书格式模板外文翻译范文资料作品文献课程实习指导调研下载网络教育 计算机 网站网页 小程序 商城购物订餐电影安卓 Android Html Html5 SSM SSH Python 爬虫大数据 管理系统 图书校园网考试选题网络安全推荐系统机械模具夹具自动化数控车床汽车故障诊断电机建模 机械手 去壳机千斤顶变速器减速器图纸电气变电站电子 Stm32 单片机 物联网 监控密码锁 Plc 组态控制智能 Matlab 土木建筑结构框架教学楼住宅楼造价施工办公楼给水排水桥梁刚构桥水利重力坝水库采矿环境化工固废工厂 视觉传达 室内设计产品设计 电子商务 物流盈利案例分析评估报告营销报销会计

首　页

机械毕业设计

电子电气毕业设计

计算机毕业设计

土木工程毕业设计

视觉传达毕业设计

理工论文

文科论文

毕设资料

帮助中心

设计流程

您现在所在的位置：首页 >>计算机毕业设计 >> 文章内容

我们提供全套毕业设计和毕业论文服务，联系微信号：biyezuopinvvp QQ：1015083682

基于CUDA C并行计算架构的大规模图着色算法毕业论文+任务书+开题报告+文献资料+外文翻译及原文+答辩PPT+代码及测试数据

文章来源：www.biyezuopin.vip 发布者：毕业作品网站

摘要
随着当前信息时代的日益发展，数据量的逐渐增大，大规模图数据的处理在网页搜索应用，社交网络和交通网络等场景中作用也愈发明显。因此，大规模图着色的研究也开始成为高性能计算领域的热点。而通用GPU（Graphics Processing Unit）的发展使其作用逐渐广泛，用于图着色的加速也成了一种研究的趋势。但调研后，现有的较广泛的并行图着色均存在以下问题中的部分或全部：反复着色，在着色时，后续迭代轮次中顶点颜色会导致已着色顶点颜色出错，需要重新修改；并行度不够，每轮着色中，正确的着色顶点比例稀少；长尾问题，在着色最后，每轮着色顶点数大幅度减少；负载不均衡，每个线程运行时间开销相差过大，导致整体性能较差。
设计的新的基于并行架构的图着色算法，解决了反复着色，并行度不够以及长尾的问题。首先，预处理图数据，将其变成只有小编号顶点指向大编号顶点的有向图，避免了反复着色，也确保了算法能在有限轮次后终止。此外，配合使用CSR和CSC的存储格式保存图数据，可以做到一定程度的连续访存，并减少修正工作一半的工作量。其次，着色过程采用无冲突的并行着色思路，每一轮分为着色和修正两个步骤，提高了并行度，并可以避免原子操作。最后，在未着色顶点数剩余极少时，避免多次迭代，将着色策略变成串行，解决长尾问题。
本算法使用CUDA C语言编程设计完成，提供了针对不同的图数据的接口参数。通过在相同环境下针对普遍的大规模图数据进行的实验，结果表明，该算法相较于CPU下的贪心着色算法有1.51-125.81倍的性能提升，相较于JPL（Jones-Plassman-Luby）有8.37-71.07倍的性能提升。
关键词：图着色；图形处理器；并行运算技术；两阶段
Abstract
With the increasing amount of data in the current information age, the processing of large-scale map data has become more and more obvious in the fields of web search applications, social networks and traffic networks. Therefore, the study of large-scale graph coloring has also become a hot spot in the field of high-performance computing. The development of general-purpose GPUs has gradually become more and more effective, and the acceleration for graph coloring has become a research trend. However, after the investigation, the existing parallel graph coloring has the following problems: repeated coloring. When coloring, the color of the vertex in the subsequent iteration round will cause the color of the shaded vertex to be wrong and needs to be re-modified; the degree of parallelism is not enough, in each round of coloring, The correct coloring vertex ratio is sparse; the long tail problem, at the end of the coloring, the number of colored vertices per round is greatly reduced; the load is unbalanced, and the running time overhead of each thread is too large, resulting in poor overall performance.
The parallel graph coloring algorithm designed solves the problem of repeated coloring, insufficient parallelism and long tail. First, the map data is preprocessed, turning it into a directed graph with only small numbered vertices pointing to the large numbered vertices, avoiding repeated coloring and ensuring that the algorithm can terminate after a finite number of rounds. In addition, in conjunction with the use of CSR and CSC amount storage format to save map data, a certain degree of continuous memory access can be achieved, and the workload of partial correction can be reduced. Secondly, the coloring process adopts a collision-free parallel coloring idea. Each round is divided into two steps of coloring and correction, which improves the degree of parallelism and avoids atomic operations. Finally, when the number of vertices remains very small, the coloring strategy is changed to serial to solve the long tail problem.
The algorithm is designed using CUDA C language programming and provides interface parameters for different graph data. Experiments on general large-scale graph data in the same environment show that the algorithm has a performance improvement of 1.51-125.81 times compared with the greedy coloring algorithm, and has a performance improvement of 8.37-71.07 times compared with JPL.
Keywords：Graph Coloring， GPU，CUDA，Two stages
目录
摘要 I
Abstract II
1 绪论 1
1.1 课题背景 1
1.2 国内外研究现状 1
1.3 研究目的和主要内容 4
1.4 论文结构 6
2 相关背景 7
2.1 CUDA相关知识 7
2.2 数据存储 13
2.3 难点分析 14
2.4 本章小结 16
3 算法实现 17
3.1 算法思路 17
3.2 算法分析 22
3.3 算法关键技术 24
3.4 本章小结 29
4 性能评估 30
4.1 实验设置 30
4.2 切换执行阶段的时间 31
4.3 与最新的技术比较 34
4.4 在不同的GPU设备上运行本算法 35
4.5 本章小结 36
5 总结与展望 37
致谢 39
参考文献 40

全套毕业设计论文现成成品资料请咨询微信号：biyezuopinvvp QQ：1015083682 返回首页如转载请注明来源于www.biyezuopin.vip

打印本页 \| 关闭窗口
上一篇文章：基于CUDA C并行计算架构的大规模图着色算法开题报告	下一篇文章：Enterprise:图在GPU上的宽度优先遍历

本类最新文章

基于MatlabSimulink …	35kV输电线路继电保护的设计 …	分布式风电场低电压穿越故障建模与 …
大学生内容分享和社交平台的设计与 …	基于SSM框架的企业人事薪酬管理 …	基于大模型的代码注释自动生成与维 …

| 关于我们 | 友情链接 | 毕业设计招聘 |

Email：biyeshejiba@163.com 微信号：biyezuopinvvp QQ：1015083682
本站毕业设计和毕业论文资料均属原创者所有,仅供学习交流之用,请勿转载并做其他非法用途.如有侵犯您的版权有损您的利益,请联系我们会立即改正或删除有关内容!