2025-04-04 Update From: SLTechnology News&Howtos > Servers
Shulou (Shulou.com) 06/03 Report
Blog Directory
I. Overview of GlusterFS
1. Features of GlusterFS
2. GlusterFS terminology
3. Modular stack architecture
II. Working principle of GlusterFS
1. Workflow of GlusterFS
2. Elastic HASH algorithm
III. Volume types of GlusterFS
1. Distributed volume
2. Stripe volume
3. Replication volume
4. Distributed stripe volume
5. Distributed replication volumes
I. Overview of GlusterFS
GlusterFS is an open-source distributed file system and the core of Gluster, a scale-out storage solution. GlusterFS has strong horizontal scalability and can support petabytes of storage capacity by adding nodes. It aggregates disparate storage resources over TCP/IP or InfiniBand RDMA networks to provide storage services, managing data under a single global namespace. Built on a stackable user-space design with no metadata server, GlusterFS delivers excellent performance for a variety of data workloads.
GlusterFS consists mainly of storage servers, clients, and an optional NFS/Samba storage gateway. The most distinctive design feature of the GlusterFS architecture is that there is no metadata server component, which helps improve the performance, reliability, and stability of the whole system. Traditional distributed file systems mostly keep metadata on dedicated metadata servers, which hold the directory information and directory structure of the storage nodes. Such designs are very efficient for browsing directories, but they have defects such as a single point of failure: once the metadata server fails, even highly redundant storage nodes cannot prevent the whole storage system from collapsing. Because GlusterFS is designed without a metadata server, it scales out well and offers high reliability and storage efficiency. GlusterFS supports TCP/IP and InfiniBand RDMA high-speed interconnects. Clients can access data through the native GlusterFS protocol; terminals that do not run the GlusterFS client can access data through the storage gateway using the standard NFS/CIFS protocols, as shown in the figure below.
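As a sketch of these two access paths, a machine with the GlusterFS client installed mounts the volume over the native protocol, while a machine without it goes through the NFS gateway. The host name `node1` and volume name `gv0` below are hypothetical placeholders, not names from this article:

```shell
# Native access: requires the GlusterFS (FUSE) client on the machine.
# "node1" and "gv0" are placeholder host/volume names.
mount -t glusterfs node1:/gv0 /mnt/gluster

# Gateway access: a terminal without the GlusterFS client can reach the
# same data through the storage gateway using the standard NFS protocol.
mount -t nfs node1:/gv0 /mnt/gluster-nfs
```

Both mounts expose the same global namespace; only the transport path differs.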
1. Features of GlusterFS
Scalability and high performance;
High availability;
Globally unified namespaces;
Based on standard protocols;
Flexible volume management.
2. GlusterFS terminology
Brick: a basic storage unit in GlusterFS; a dedicated storage directory exported by a server in the trusted storage pool. The directory is identified by the server address plus the absolute path, in the form SERVER:EXPORT, for example 192.168.1.4:/data/mydir/;
Volume: A logical volume is a collection of bricks. A volume is a logical device for data storage, similar to logical volumes in LVM. Most Gluster management operations are performed on volumes;
FUSE: A kernel module that allows users to create their own file systems without modifying kernel code;
VFS: the kernel-space interface through which user space accesses disk storage;
Glusterd (background management process): runs on every node in the storage cluster.
3. Modular stack architecture
As shown in the figure below, GlusterFS adopts a modular, stacked architecture, and customized environments can be configured as required, such as large-file storage, storage of massive numbers of small files, cloud storage, and multi-protocol access. Complex functions are achieved by combining modules in various ways. For example, the Replicate module implements RAID 1 and the Stripe module implements RAID 0; combining the two yields RAID 10 or RAID 01, with higher performance and reliability.
GlusterFS has a modular, stacked architecture design. Its modules are called Translators, a powerful mechanism GlusterFS provides to extend file system functionality efficiently and simply through well-defined interfaces.
1) The server and client designs are highly modular, with compatible module interfaces; the same translator can be loaded on both client and server.
2) All functionality in GlusterFS is implemented through translators, and the client side is more complex than the server side, so functionality is concentrated mainly on the client.
II. Working principle of GlusterFS
1. Workflow of GlusterFS
The figure gives only an overview of GlusterFS data access.
1) A client or application accesses data through the GlusterFS mount point;
2) The Linux kernel receives the request and processes it through the VFS API;
3) VFS hands the request to the FUSE kernel module, which is registered with the system as an actual file system; FUSE then passes the data to the GlusterFS client through the /dev/fuse device file. The FUSE file system can be thought of as a proxy;
4) The GlusterFS client receives the data and processes it according to its configuration file;
5) After client-side processing, the data is transferred over the network to the remote GlusterFS server and written to the server's storage device.
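The client-side portion of this flow can be observed directly on a machine that has mounted a volume. The host name `node1` and volume name `gv0` below are hypothetical placeholders:

```shell
# Mount a volume through the GlusterFS FUSE client (placeholder names).
mount -t glusterfs node1:/gv0 /mnt/gv0

# The registered FUSE file system now appears in the mount table
# with type fuse.glusterfs ...
mount | grep fuse.glusterfs

# ... and /dev/fuse is the device file through which the FUSE kernel
# module and the user-space GlusterFS client exchange requests.
ls -l /dev/fuse
```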
2. Elastic HASH algorithm
The elastic HASH algorithm uses the Davies-Meyer algorithm to compute a hash value in the 32-bit integer range. Assuming there are N storage units (Bricks) in the logical volume, the 32-bit integer range is divided into N consecutive subspaces, each corresponding to one Brick. When a user or application accesses a name in the namespace, the hash of that name is computed, and the subspace containing the hash value identifies the Brick where the data resides. The advantages are as follows:
Ensure that the data is evenly distributed in each Brick;
Eliminates the dependency on a metadata server, thereby removing the single point of failure and the access bottleneck.
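As a toy sketch of this bucketing (an assumption for illustration only: cksum's 32-bit CRC stands in for the real Davies-Meyer hash, and the Brick count N is arbitrary):

```shell
# Map file names to one of N Bricks by splitting the 32-bit hash
# space into N equal, consecutive sub-ranges.
N=4  # number of Bricks in the logical volume
for f in file1 file2 file3; do
  h=$(printf '%s' "$f" | cksum | cut -d' ' -f1)  # 32-bit hash of the name
  brick=$(( h / (4294967296 / N) ))              # which sub-range h falls in
  echo "$f -> Brick$brick"
done
```

Because the hash function spreads names uniformly over the 32-bit range, the files end up evenly distributed across the N sub-ranges without any central lookup table.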
III. Volume types of GlusterFS
GlusterFS supports seven types of volumes: distributed volume, stripe volume, replication volume, distributed stripe volume, distributed replication volume, stripe replication volume, and distributed stripe replication volume, which can meet the high-performance and high-availability requirements of different applications.
Distributed volume: files are distributed to the Brick servers by the HASH algorithm. This volume type is the basis of GlusterFS. Each file as a whole is hashed to a single Brick, so the volume merely expands disk space; if a disk is damaged, its data is lost. It is file-level RAID 0 and has no fault tolerance.
Stripe volume: similar to RAID 0, a file is divided into data blocks that are distributed round-robin across multiple Brick servers. Large-file storage is supported, and the larger the file, the higher the read efficiency.
Replication volume: files are synchronized to two or more Bricks so that multiple copies of each file exist; this is file-level RAID 1 and provides fault tolerance. Because reads can be served from multiple copies, read performance improves greatly, but write performance degrades.
Distribute Stripe Volume: The number of Brick Servers is a multiple of the number of stripes (the number of Brick distributed in blocks), which has the characteristics of both distributed and striped volumes.
Distributed replica volume: The number of Brick Servers is a multiple of the number of mirrors (the number of data replicas), and has the characteristics of distributed volume and replica volume;
Stripe replica volume: Similar to RAID 10, it has the characteristics of both stripe volume and replica volume.
Distributed stripe replication volume: a composite of the three basic volume types, usually used for Map-Reduce applications.
1. Distributed volume
Distributed volumes are the default volume for GlusterFS, and when creating volumes, the default option is to create distributed volumes. In this mode, the file is not chunked, and the file is stored directly on a Server node. Using the local file system directly for file storage, most Linux commands and tools will continue to work. HASH values need to be saved through extended file attributes. Currently, the supported underlying file systems include ext3, ext4, ZFS, XFS, etc.
Because of the use of local file systems, access efficiency is not improved, but will be reduced due to network communication reasons; in addition, supporting very large files will be difficult, because distributed volumes do not partition files. Although ext4 can already support a single file up to 16TB, the capacity of local storage devices is limited.
File1 and File2 are stored on Server1, while File3 is stored on Server2. Files are stored randomly. A file is either stored on Server1 or Server2. It cannot be stored in blocks on Server1 and Server2 at the same time.
Distributed volumes have the following characteristics:
Files are distributed across different servers, with no redundancy;
Easier and cheaper expansion of volume size;
A single point of failure can cause data loss;
Depends on underlying data protection;
Create a distributed volume:
[root@centos01 ~]# gluster volume create dis-volume server1:/dir1 server2:/dir2
Creation of dis-volume has been successful
Please start the volume to access data
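As the message says, a newly created volume must be started before clients can access it. A minimal follow-up, assuming the dis-volume created above:

```shell
# Start the volume so clients can mount it.
gluster volume start dis-volume

# Inspect the volume's type, status, and brick layout.
gluster volume info dis-volume
```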
2. Stripe volume
Stripe mode is equivalent to RAID 0: a file is divided into N blocks (N being the number of stripe nodes) according to offset, and the blocks are stored round-robin across the Brick server nodes. Each node stores its blocks as ordinary files in its local file system and records the total number of blocks and each block's sequence number in extended attributes. The number of stripes specified at configuration time must equal the number of storage servers in the volume. Performance is high, especially when storing large files, but there is no redundancy.
As shown in the figure below, the file is stored in different servers, File is divided into 6 segments, 1, 3 and 5 are placed on server1, and 2, 4 and 6 are placed on server2.
Stripe volumes have the following characteristics:
Data is divided into smaller chunks and distributed across the Bricks in the server cluster;
Distribution reduces the load, and the smaller chunks speed up access;
No data redundancy;
Create a stripe volume:
[root@centos01 ~]# gluster volume create stripe-volume stripe 2 transport tcp server1:/dir1 server2:/dir2
Creation of stripe-volume has been successful
Please start the volume to access data
3. Replication volume
Replication mode, also known as AFR, is equivalent to RAID 1: one or more copies of each file are kept, and every node stores the same content and directory structure. Disk utilization is low because of the copies, and if storage capacity differs across nodes, the lowest-capacity node determines the total capacity of the volume (the barrel effect). When configuring a replication volume, the number of replicas must equal the number of storage servers in the volume. Replication volumes are redundant: even if one node fails, data remains available.
File1 and File2 are stored on both Server1 and Server2, which is equivalent to the file in Server2 being a copy of the file in Server1.
Replication volumes have the following characteristics:
All servers in the volume keep a complete copy of the data;
The number of replicas can be set by the user when the volume is created;
At least two Brick servers are required;
Provides redundancy;
Create a replication volume:
[root@centos01 ~]# gluster volume create rep-volume replica 2 transport tcp server1:/dir1 server2:/dir2
Creation of rep-volume has been successful
Please start the volume to access data
4. Distributed stripe volume
A distributed stripe volume combines the functions of distributed and stripe volumes and is mainly used for large-file access; at least four servers are required to create one.
File1 and File2 are located to Server1 and Server2, respectively, through the functionality of distributed volumes, as shown in the following figure. In Server1, File1 is divided into 4 segments, where 1 and 3 are in the exp1 directory in Server1, and 2 and 4 are in the exp2 directory in Server1. In Server2, File2 is also divided into 4 segments, where 1 and 3 are in the exp3 directory in Server2, and 2 and 4 are in the exp4 directory in Server2.
Create a distributed stripe volume:
[root@centos01 ~]# gluster volume create dis-stripe stripe 2 transport tcp server1:/dir1 server2:/dir2 server3:/dir3 server4:/dir4
Creation of dis-stripe has been successful
Please start the volume to access data
When a volume is created, if the number of storage servers equals the number of stripes or replicas, a plain stripe volume or replication volume is created; if the number of storage servers is two or more times the stripe or replica count, a distributed stripe volume or distributed replication volume is created.
5. Distributed replication volumes
Distributed replication volumes combine the functions of distributed and replicated volumes, and are primarily used when redundancy is required.
File1 and File2 are located to Server1 and Server2, respectively, through the functionality of distributed volumes, as shown in the following figure. When File1 is stored, File1 will have two identical copies, the exp1 directory in Server1 and the exp2 directory in Server2, depending on the characteristics of the replication volume. When File2 is stored, File2 will have two identical copies, the exp3 directory in Server3 and the exp4 directory in Server4, depending on the characteristics of the replication volume.
Create a distributed replication volume:
[root@centos01 ~]# gluster volume create dis-rep replica 2 transport tcp server1:/dir1 server2:/dir2 server3:/dir3 server4:/dir4
Creation of dis-rep has been successful
Please start the volume to access data
If there are 8 servers, replica sets are formed in server-list order: with replica 2, servers 1 and 2 form one replica set, servers 3 and 4 another, servers 5 and 6 another, and servers 7 and 8 the last; with replica 4, servers 1/2/3/4 form one replica set and servers 5/6/7/8 the other.
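As a sketch of the 8-server, replica-2 case (server names and brick paths are hypothetical placeholders):

```shell
# Bricks are grouped into replica sets in list order:
# (server1,server2) (server3,server4) (server5,server6) (server7,server8)
gluster volume create big-rep replica 2 transport tcp \
  server1:/dir1 server2:/dir2 server3:/dir3 server4:/dir4 \
  server5:/dir5 server6:/dir6 server7:/dir7 server8:/dir8
```

Because adjacent bricks in the list become mirrors of each other, the order of the brick arguments determines fault-tolerance boundaries: putting two bricks from the same physical server next to each other would defeat the redundancy.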
The detailed configuration for building a GlusterFS distributed file system cluster will be covered in the next blog post!
--------This article ends here, thanks for reading-------