Abstract
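In the early 21st century, the Internet entered a new period of rapid growth. Data volumes grew exponentially, and the need to manage and access massive amounts of data became increasingly urgent. With the rise of cloud computing and big data technologies, large-scale data-intensive applications keep emerging, and their demands on data storage keep growing. Traditional relational databases such as Oracle and MySQL, however, find it increasingly difficult to meet the storage needs of cloud computing environments. Distributed non-relational databases combine the resistance of distributed systems to single points of failure with natural horizontal scalability, and can therefore cope with the challenge of storing massive data. NoSQL databases have flourished in recent years. Common NoSQL products include key-value stores, document stores, column stores, graph stores, and XML databases; of these, the key-value type is the most popular and most widely used, and Redis is its representative. While using Redis, I found it inconvenient in a number of ways, and so I implemented Mondis, a high-performance distributed key-value storage system. Mondis has the following basic features and innovations: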
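(1) Building on existing distributed-systems theory, this thesis studies the technologies related to distributed storage, introduces the concept, characteristics, and theoretical foundations of distributed NoSQL databases, and analyzes the architectures, strengths, and weaknesses of widely used key-value storage systems. On this basis, it designs Mondis, a high-performance, highly available, distributed key-value storage system.
(2) Motivated by the inconvenience encountered when using Redis, the Redis source code was studied and nested data structures were added to key-value storage, so that Mondis can store arbitrarily structured data. This is the main innovation of this thesis.
(3) The Raft consensus protocol is studied as a way to handle failure of the master node during master-slave replication, thereby ensuring availability and eventual consistency of the data.
(4) Based on the above, a distributed key-value storage engine, Mondis, is implemented in C++. An improved Raft consensus protocol is deployed in it to provide strongly consistent data replication and thus ensure availability. The system is tested in detail, and a plan for future improvements is proposed on that basis.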
The Java client used in this thesis is available at the following link:
Keywords: key-value storage; distributed; high performance; nesting; JSON
ABSTRACT
In the early 21st century, the Internet entered a new period of rapid growth: the amount of data increased exponentially, and the demand for managing and accessing massive data became more and more urgent. With the rise of cloud computing and big data technologies, a variety of large-scale data-intensive applications are emerging, and these new applications place increasing demands on data storage. However, traditional relational databases, such as Oracle and MySQL, find it increasingly difficult to meet the data storage needs of cloud computing environments. Distributed non-relational databases combine the resistance of distributed systems to single points of failure with natural horizontal scalability, and can cope with the challenges of massive data storage. NoSQL databases have been booming in recent years. Common NoSQL products include key-value stores, document stores, column stores, graph stores, XML databases, and so on. Among these, the key-value type is the most popular and most widely used, and Redis is its representative. While using Redis, I found it inconvenient in a number of ways, so I implemented a high-performance, distributed key-value storage system, Mondis. Mondis has the following basic features and innovations:
(1) Based on existing distributed-systems theory, the technologies related to distributed storage are studied. The concept, characteristics, and theoretical foundations of distributed NoSQL databases are introduced, and the architectures, advantages, and disadvantages of widely used key-value storage systems are analyzed. On this basis, a high-performance, highly available, distributed key-value storage system, Mondis, is designed.
(2) Motivated by the inconvenience encountered when using Redis, the Redis source code was studied and nested data structures were added to key-value storage, so that Mondis can store arbitrarily structured data. This is the main innovation of this thesis.
(3) The Raft consensus protocol is studied as a way to handle failure of the master node during master-slave replication, so as to ensure availability and eventual consistency of the data.
(4) Based on the above, a distributed key-value storage engine, Mondis, is implemented in C++. An improved Raft consensus protocol is deployed to provide strongly consistent data replication and thus ensure availability. The system has been tested in detail, and a plan for future improvements is proposed on that basis.
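To make the nesting idea in item (2) concrete, the following is a minimal C++ sketch of a value type that can hold a string, a list of values, or a hash of named values, together with a path lookup into the nested structure. It is an illustration only and not the actual Mondis implementation: the names `Value`, `makeString`, `makeList`, `makeHash`, `locate`, and the sample data are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical types for illustration; not taken from the Mondis source.
struct Value;
using ValuePtr = std::shared_ptr<Value>;
using List = std::vector<ValuePtr>;
using Hash = std::map<std::string, ValuePtr>;

// A value is a plain string, a list of values, or a hash of named values.
// Lists and hashes hold pointers to further values, which allows nesting.
struct Value {
    std::variant<std::string, List, Hash> data;
};

ValuePtr makeString(std::string s) { return std::make_shared<Value>(Value{std::move(s)}); }
ValuePtr makeList(List items) { return std::make_shared<Value>(Value{std::move(items)}); }
ValuePtr makeHash(Hash fields) { return std::make_shared<Value>(Value{std::move(fields)}); }

// Walks a path of hash field names and list indices.
// Returns nullptr if any step does not exist.
const Value* locate(const Value* v, const std::vector<std::string>& path) {
    for (const auto& step : path) {
        if (v == nullptr) return nullptr;
        if (const auto* hash = std::get_if<Hash>(&v->data)) {
            auto it = hash->find(step);
            v = (it == hash->end()) ? nullptr : it->second.get();
        } else if (const auto* list = std::get_if<List>(&v->data)) {
            // Accept only short, purely numeric steps as list indices.
            if (step.empty() || step.size() > 9 ||
                step.find_first_not_of("0123456789") != std::string::npos) {
                return nullptr;
            }
            std::size_t index = std::stoul(step);
            v = (index < list->size()) ? (*list)[index].get() : nullptr;
        } else {
            return nullptr;  // a string has no children
        }
    }
    return v;
}

// Builds a small nested sample value (made-up data).
ValuePtr buildExampleUser() {
    return makeHash({
        {"name", makeString("alice")},
        {"tags", makeList({makeString("admin"), makeString("dev")})},
        {"address", makeHash({{"city", makeString("Beijing")}})},
    });
}

// Looks up values two levels deep in the sample.
void checkExample() {
    ValuePtr user = buildExampleUser();
    const Value* city = locate(user.get(), {"address", "city"});
    assert(city != nullptr && std::get<std::string>(city->data) == "Beijing");
    const Value* tag = locate(user.get(), {"tags", "1"});
    assert(tag != nullptr && std::get<std::string>(tag->data) == "dev");
    assert(locate(user.get(), {"tags", "7"}) == nullptr);
}
```

The sketch stores children behind `std::shared_ptr` so that the recursive type is well-formed; a real engine would likely choose its own memory layout and a richer set of value types.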
The Java client used in this thesis is available at the following link:
Keywords: key-value storage; distributed; high performance; nesting; JSON