How to Approach the Data Model and System Architecture Design of the Graph Database Nebula Graph

2025-04-16 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article introduces the data model and system architecture design of the graph database Nebula Graph.

Directed Property Graph

Nebula Graph models data as an easy-to-understand directed property graph: logically, a graph consists of two kinds of elements, vertices and edges.

Vertex

In Nebula Graph, a vertex consists of one or more tags and the property group that corresponds to each tag. A tag represents a type of vertex, and its property group holds one or more properties owned by that tag. A vertex must have at least one tag and may have several. The set of properties that belongs to a tag is called its schema.

In the example figure there are two tags, player and team. The player schema has three properties, ID (vid), Name (string) and Age (int); the team schema has two properties, ID (vid) and Name (string).

Like MySQL, Nebula Graph is a strong-schema database: the name and data type of each property are fixed before any data is written.
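The strong-schema rule can be illustrated with a minimal Python sketch. The names `SCHEMAS` and `make_vertex` are hypothetical and are not part of Nebula Graph's API. The sketch only shows the idea that property names and types are declared per tag before any vertex is written, and that a vertex needs at least one tag.

```python
# Hypothetical illustration of a strong schema; not Nebula Graph's API.
# Tag name -> {property name: required Python type}, fixed before any write.
SCHEMAS = {
    "player": {"name": str, "age": int},
    "team": {"name": str},
}


def make_vertex(vid, tags):
    """Build a vertex from {tag name: {property: value}}, enforcing each schema."""
    if not tags:
        raise ValueError("a vertex must have at least one tag")
    for tag, values in tags.items():
        if tag not in SCHEMAS:
            raise KeyError(f"tag {tag!r} has no schema")
        schema = SCHEMAS[tag]
        # Property names must match the schema exactly.
        if set(values) != set(schema):
            raise ValueError(f"{tag}: expected properties {sorted(schema)}")
        # Property values must have the declared type.
        for prop, value in values.items():
            if not isinstance(value, schema[prop]):
                raise TypeError(f"{tag}.{prop} must be {schema[prop].__name__}")
    return {"vid": vid, "tags": tags}


# Example values are made up for illustration.
tim = make_vertex(100, {"player": {"name": "Tim", "age": 42}})
```

Writing a vertex whose `age` is a string, or a vertex with no tag at all, is rejected before anything is stored.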

Edge

In Nebula Graph, an edge consists of a type and edge properties. All edges are directed: an edge expresses a relationship from one vertex (the source, src) to another vertex (the destination, dst). The type of an edge is called its edgetype. Each edge has exactly one edgetype, and each edgetype defines the schema of the properties on that edge.

In the same example figure there are two edgetypes. One is the like relationship from player to player, with the property likeness (double); the other is the serve relationship from player to team, with the two properties start_year (int) and end_year (int).

Note that there can be multiple edges, of the same or of different types, between the same source vertex and destination vertex.
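A small Python sketch of these edge rules follows. The class `Graph` and its methods are hypothetical names chosen for illustration, and the property values are made up. It shows directed edges, one edgetype per edge with its own property schema, and several edges between the same pair of vertices.

```python
from collections import defaultdict

# Hypothetical illustration; edgetype name -> {property name: Python type}.
EDGE_TYPES = {
    "like": {"likeness": float},
    "serve": {"start_year": int, "end_year": int},
}


class Graph:
    def __init__(self):
        # source vid -> list of (edgetype, destination vid, properties)
        self.out_edges = defaultdict(list)

    def add_edge(self, src, edge_type, dst, props):
        schema = EDGE_TYPES[edge_type]  # each edge has exactly one edgetype
        if set(props) != set(schema):
            raise ValueError(f"{edge_type}: expected properties {sorted(schema)}")
        for name, value in props.items():
            if not isinstance(value, schema[name]):
                raise TypeError(f"{edge_type}.{name} must be {schema[name].__name__}")
        # Appending (not overwriting) allows parallel edges between src and dst.
        self.out_edges[src].append((edge_type, dst, props))

    def edges_between(self, src, dst):
        """Edges are directed, so only edges from src to dst are returned."""
        return [e for e in self.out_edges.get(src, []) if e[1] == dst]


g = Graph()
g.add_edge(100, "serve", 200, {"start_year": 1997, "end_year": 2005})
g.add_edge(100, "serve", 200, {"start_year": 2010, "end_year": 2016})
g.add_edge(100, "like", 101, {"likeness": 95.0})
```

Here vertex 100 has two serve edges to vertex 200, while `g.edges_between(200, 100)` is empty because the edges only point one way.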

Graph Partition

A very large relational network can contain tens to hundreds of billions of vertices and trillions of edges, so even storing just the vertices and edges far exceeds the capacity of an ordinary server. The graph elements therefore have to be split up and stored on different logical shards, called partitions. Nebula Graph uses edge-cut partitioning, the default sharding policy is hashing, and the number of partitions is set statically and cannot be changed afterwards.
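As a rough sketch of hash sharding with a fixed partition count, the Python below maps vertex IDs to partitions by taking a stable hash modulo the number of partitions. The use of `zlib.crc32`, the function names, and the count of 100 are all assumptions made for illustration; the text above only states that the default policy is a hash.

```python
import zlib


def partition_of(vid, num_partitions):
    """Map a vertex ID to a partition with a stable hash (crc32 is an assumption)."""
    return zlib.crc32(str(vid).encode("utf-8")) % num_partitions


def moved_fraction(vids, old_n, new_n):
    """Fraction of vertices whose partition would change if the count changed."""
    moved = sum(partition_of(v, old_n) != partition_of(v, new_n) for v in vids)
    return moved / len(vids)


NUM_PARTITIONS = 100  # illustrative value, chosen once and then kept fixed
vids = [f"player{i}" for i in range(1000)]
placement = {v: partition_of(v, NUM_PARTITIONS) for v in vids}
```

Calling `moved_fraction(vids, 100, 101)` gives a feel for how many vertices a modulo-based scheme would have to relocate if the partition count changed, which is one plausible reason for keeping the count static.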

Data Model

In Nebula Graph, each vertex is modeled as one key-value pair. The vertex is hashed by its vertex ID (vid for short) and stored on the corresponding partition.

A logical edge is modeled as two separate key-value pairs, called the out-key and the in-key. The out-key is stored on the same partition as the edge's source vertex, and the in-key is stored on the same partition as the edge's destination vertex.
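The placement of the out-key and in-key can be sketched in Python as follows. The key layout (plain tuples), the hash function, and the names `store_edge` and `partition_of` are assumptions for illustration only; the actual key encoding is not described here.

```python
import zlib

NUM_PARTITIONS = 4  # illustrative value


def partition_of(vid):
    """Stable hash of the vertex ID (crc32 is an assumption)."""
    return zlib.crc32(str(vid).encode("utf-8")) % NUM_PARTITIONS


def store_edge(partitions, src, edge_type, dst, props):
    """Write one logical edge as two key-value pairs."""
    # Hypothetical key layout. A real key would need extra fields to
    # distinguish parallel edges of the same type between the same vertices.
    out_key = ("out", src, edge_type, dst)
    in_key = ("in", dst, edge_type, src)
    partitions[partition_of(src)][out_key] = props  # lives with the source vertex
    partitions[partition_of(dst)][in_key] = props   # lives with the destination vertex
    return out_key, in_key


# One dict per partition stands in for the key-value store.
partitions = [dict() for _ in range(NUM_PARTITIONS)]
out_key, in_key = store_edge(partitions, 100, "like", 101, {"likeness": 95.0})
```

With this layout, the outgoing edges of a vertex can be read from the partition that holds the vertex itself, and the same holds for its incoming edges.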

System Architecture

Nebula Graph consists of four main functional modules: the storage layer, the metadata service, the computing layer, and the client.

Storage Layer

The process that corresponds to the storage layer is nebula-storaged. Its core is a distributed key-value store built on the Raft protocol, a consensus algorithm that manages log replication. The main storage engines currently supported are RocksDB and HBase. Raft keeps data consistent through a leader/follower scheme. On top of this, Nebula Storage adds the following features and optimizations:

Parallel Raft: the replicas of the same partition ID on multiple machines form one Raft group. Concurrent operation is achieved by running multiple Raft groups.

Write path & batch: multi-machine synchronization in Raft depends on the ordering of log IDs, so throughput is low. Higher throughput is achieved through batching and out-of-order commit.

Learner: a learner role based on asynchronous replication. When a new machine is added to the cluster, it can first be marked as a learner and pull data from the leader or a follower asynchronously. Once the learner has caught up with the leader, it is marked as a follower and takes part in the Raft protocol.

Load balance: for machines under heavy access pressure, the partitions they serve are migrated to cooler machines to achieve better load balancing.
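Two of the ideas above, one Raft group per partition ID and the learner role, can be sketched in Python. This is a heavily simplified model under stated assumptions: there is no election, no network, and no real consensus, and the class and host names are hypothetical. It only shows how a learner pulls log entries in batches and is promoted once it has caught up.

```python
class Replica:
    def __init__(self, host, role):
        self.host = host
        self.role = role  # "leader", "follower", or "learner"
        self.log = []     # replicated log entries


class RaftGroup:
    """All replicas of one partition ID, spread across machines."""

    def __init__(self, partition_id, hosts):
        self.partition_id = partition_id
        self.replicas = [
            Replica(h, "leader" if i == 0 else "follower")
            for i, h in enumerate(hosts)
        ]

    @property
    def leader(self):
        return next(r for r in self.replicas if r.role == "leader")

    def append(self, entry):
        # Simplification: every voting replica receives the entry at once.
        for r in self.replicas:
            if r.role != "learner":
                r.log.append(entry)

    def add_learner(self, host):
        learner = Replica(host, "learner")
        self.replicas.append(learner)
        return learner

    def catch_up(self, learner, batch_size=2):
        """Pull missing entries in batches; promote once the logs match."""
        leader_log = self.leader.log
        while len(learner.log) < len(leader_log):
            start = len(learner.log)
            learner.log.extend(leader_log[start:start + batch_size])
        learner.role = "follower"


# One independent Raft group per partition, all on the same three machines.
groups = {pid: RaftGroup(pid, ["host-a", "host-b", "host-c"]) for pid in range(3)}
```

Because each partition has its own group and its own log, writes to different partitions do not have to wait on one another in this model.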

Metadata Service Layer (Metaservice)

The process that corresponds to Metaservice is nebula-metad. Its main functions are:

User management: the user system of Nebula Graph has four roles, Goduser, Admin, User and Guest, each with different operating permissions.

Cluster configuration management: supports bringing servers online and taking them offline.

Graph space management: create and delete graph spaces, and modify graph space configuration (the number of Raft replicas).

Schema management: Nebula Graph is designed as a strong-schema database.

The field types of Tag and Edge properties are recorded through Metaservice. Supported types include integer (int), double-precision floating point (double), timestamp, list, and others.

Multi-version management: supports adding, modifying and deleting schemas, and records the version number of each.

TTL management: supports automatic data deletion and space reclamation by identifying fields whose time-to-live has expired.
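The TTL idea can be sketched in a few lines of Python. This assumes that expiry is decided from a timestamp property plus a fixed duration; the function names, the `created` column and the numbers are hypothetical and chosen only for illustration.

```python
def is_expired(row, ttl_col, ttl_seconds, now):
    """A row expires once its timestamp column plus the TTL is in the past."""
    return row[ttl_col] + ttl_seconds <= now


def purge_expired(rows, ttl_col, ttl_seconds, now):
    """Return only the rows that are still alive at time `now`."""
    return [r for r in rows if not is_expired(r, ttl_col, ttl_seconds, now)]


# Timestamps are plain integers (seconds) to keep the example deterministic.
rows = [
    {"vid": 1, "created": 1_000},
    {"vid": 2, "created": 1_900},
]
alive = purge_expired(rows, ttl_col="created", ttl_seconds=500, now=2_000)
```

Passing `now` explicitly keeps the check easy to test; a real system would compare against the current clock.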

The MetaService layer is a stateful service. It persists its state in the same way as the storage layer, through KVStore.

Computing Layer: Query Engine & Query Language (nGQL)

The process that corresponds to the computing layer is nebula-graphd. It is made up of fully peer-to-peer, stateless computing nodes that are independent of one another and do not communicate with each other. The main job of the Query Engine layer is to parse the nGQL text sent by the client, generate an execution plan through lexical analysis (Lexer) and syntax analysis (Parser), and hand the plan to the execution engine after optimization. The execution engine obtains the schema of vertices and edges through MetaService and fetches vertex and edge data through the storage engine layer. The main optimizations of the Query Engine layer are:

Asynchronous and concurrent execution: because both I/O and network calls are long-latency operations, they have to be executed asynchronously and concurrently. In addition, to keep a single long query from affecting subsequent queries, Query Engine sets up a separate resource pool for each query to guarantee quality of service (QoS).

Computation pushdown: to keep the storage layer from sending too much data back to the computing layer and consuming valuable bandwidth, operators such as the conditional filter where are sent to the storage layer nodes together with the query conditions.

Execution plan optimization: although execution plan optimization has a long history in relational SQL databases, there has been little industry research on optimizing graph query languages. Nebula Graph explores execution plan optimization for graph queries, including execution plan caching and concurrent execution of context-free statements.
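The three optimizations above can be sketched together in Python with `asyncio` and `functools.lru_cache`. Everything in this sketch is an assumption made for illustration: the toy `PARTITIONS` data, the function names, and the use of a semaphore as the per-query resource pool. It is not how nebula-graphd is implemented. It shows a cached plan, a filter evaluated on the storage side, and partitions scanned concurrently.

```python
import asyncio
from functools import lru_cache

# Toy storage: partition ID -> rows. All names and values are hypothetical.
PARTITIONS = {
    0: [{"vid": 1, "age": 25}, {"vid": 2, "age": 31}],
    1: [{"vid": 3, "age": 42}, {"vid": 4, "age": 19}],
}


@lru_cache(maxsize=128)
def build_plan(min_age):
    """Stand-in for parse + optimize; cached so repeated queries skip it."""
    return {"scan": tuple(sorted(PARTITIONS)), "pushed_filter": ("age", min_age)}


async def storage_scan(part_id, pushed_filter, pool):
    """Runs on the storage side: rows are filtered before being sent back."""
    prop, bound = pushed_filter
    async with pool:
        await asyncio.sleep(0.01)  # simulated disk and network latency
        return [r for r in PARTITIONS[part_id] if r[prop] > bound]


async def run_query(min_age, max_in_flight=2):
    plan = build_plan(min_age)
    # A separate pool per query, so one long query cannot starve the others.
    pool = asyncio.Semaphore(max_in_flight)
    chunks = await asyncio.gather(
        *(storage_scan(p, plan["pushed_filter"], pool) for p in plan["scan"])
    )
    return sorted(r["vid"] for chunk in chunks for r in chunk)


async def main():
    # Three queries run concurrently; the third reuses the first one's plan.
    return await asyncio.gather(run_query(30), run_query(20), run_query(30))


results = asyncio.run(main())
```

After the run, `build_plan.cache_info()` reports how often a plan was served from the cache instead of being rebuilt.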

Client API & Console

Nebula Graph provides clients in C++, Java and Golang. The client communicates with the server over RPC, using the Facebook Thrift protocol. Users can also operate Nebula Graph through a console on Linux. Web access is currently under development.

That concludes this introduction to the data model and system architecture design of the graph database Nebula Graph.
