

GlusterFS


I. Introduction to GlusterFS

GlusterFS is the core of the Gluster Scale-Out storage solution. It is an open-source distributed file system with strong scale-out capability, supporting several PB of storage capacity and thousands of clients. GlusterFS aggregates physically distributed storage resources over TCP/IP or InfiniBand RDMA (a switched-fabric technology that supports many concurrent links) and manages the data under a single global namespace. GlusterFS is based on a stackable user-space design and delivers excellent performance for a wide variety of data workloads.

[Figure: GlusterFS working schematic]

GlusterFS supports standard clients running standard applications over any standard IP network. As shown in Figure 2, users can access application data in a globally unified namespace using standard protocols such as NFS/CIFS. GlusterFS frees users from the old model of independent, high-cost, closed storage systems: ordinary, inexpensive storage devices can be used to deploy storage pools that are centrally managed, scale out, and are virtualized, with capacity expandable to the TB/PB level.

The main features of GlusterFS are as follows:

1. Scalability and high performance

GlusterFS combines these two characteristics to provide highly scalable storage solutions from several TB to several PB. The Scale-Out architecture allows storage capacity and performance to be improved simply by adding resources: disk, compute, and I/O resources can all be added independently, and high-speed interconnects such as 10GbE and InfiniBand are supported. The Gluster elastic hashing algorithm removes the need for a metadata server in GlusterFS, eliminating that single point of failure and performance bottleneck and enabling truly parallel data access.

2. High availability

GlusterFS can automatically replicate files, as mirrors or multiple copies, to ensure that data is always accessible even in the event of a hardware failure. The self-heal function restores data to the correct state, and repairs run incrementally in the background with almost no performance impact. GlusterFS does not define its own private on-disk data format; it stores files on mainstream disk file systems of the operating system (such as EXT3 or ZFS), so data can be copied and accessed with a variety of standard tools.

3. Global uniform namespace

The global unified namespace aggregates disk and memory resources into a single virtual storage pool, shielding upper-level users and applications from the underlying physical hardware. Storage resources can be flexibly expanded or shrunk in the virtual storage pool as needed. When storing virtual machine images, there is no limit on the number of image files, and thousands of virtual machines can share data through a single mount point. Virtual machine I/O is automatically load-balanced across all servers in the namespace, eliminating the access hotspots and performance bottlenecks that often occur in SAN environments.

4. Elastic hashing algorithm

Instead of a centralized or distributed metadata server index, GlusterFS uses an elastic hashing algorithm to locate data in the storage pool. In other Scale-Out storage systems, metadata servers often become performance bottlenecks and single points of failure. In GlusterFS, every storage node in the Scale-Out configuration can intelligently locate any data fragment without looking up an index or querying other servers. This design fully parallelizes data access and achieves truly linear performance scaling.
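To make the idea concrete, here is a toy shell sketch of hash-based placement. It is not the actual GlusterFS elastic hashing algorithm (which assigns hash ranges to bricks per directory); it only illustrates how any node can compute a file's location from its name alone, without consulting a metadata server. The brick list and file name below are made up for the example.

# Toy illustration only -- not the real GlusterFS algorithm.
# Any node can map a file name to a brick using nothing but a hash function,
# so no metadata server lookup is needed.
BRICKS=("192.168.159.128:/data/gluster" "192.168.159.129:/data/gluster")
FILE="testfile.txt"
HASH=$(printf '%s' "$FILE" | md5sum | cut -c1-8)   # first 32 bits of the md5 digest
INDEX=$(( 0x$HASH % ${#BRICKS[@]} ))               # map the hash onto one of the bricks
echo "$FILE -> ${BRICKS[$INDEX]}"

Every machine that runs this computes the same brick for the same file name, which is the property that lets GlusterFS avoid metadata lookups.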

5. Elastic volume management

Data is stored in logical volumes, which are partitioned logically from the virtualized physical storage pool and are independent of it. Storage servers can be added and removed online without interrupting applications. Logical volumes can grow and shrink across all configured servers and can be migrated between servers for capacity balancing, or as systems are added and removed, all online. File system configuration changes can also be made and applied online in real time, adapting to changing workload conditions or enabling online performance tuning.

6. Based on standard protocol

Gluster storage services support NFS, CIFS, HTTP, FTP and the Gluster native protocol, and are fully POSIX-compatible. Existing applications can access data in Gluster without any modification or dedicated API. This is especially useful when deploying Gluster in a public cloud environment, where Gluster abstracts away the cloud provider's proprietary API and exposes a standard POSIX interface.
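As a quick illustration of protocol access, the sketch below mounts the gfs01 volume created later in this article over NFSv3 using GlusterFS's built-in gNFS server. This is only a sketch: it assumes the volume's NFS export is enabled (note that the example volume below is created with nfs.disable: on) and that an NFS client is installed; adjust the server address and mount point to your environment.

# Enable the built-in NFS server for the volume (it is disabled on the example volume below)
gluster volume set gfs01 nfs.disable off
# Mount the same volume over NFSv3 instead of the native FUSE client
mkdir -p /mnt/nfs_gluster
mount -t nfs -o vers=3 192.168.159.128:/gfs01 /mnt/nfs_gluster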

Reference link: http://www.tuicool.com/articles/AbE7Vr

GlusterFS terminology:

Brick: a storage unit in GFS; an export directory on a server in the trusted storage pool, identified by hostname and directory name, e.g. 'SERVER:EXPORT'.

Client: the device with the GFS volume mounted

Extended Attributes (xattr): a file system feature that allows users or programs to associate metadata with files and directories.

FUSE (Filesystem in Userspace): a loadable kernel module that lets non-privileged users create their own file systems without modifying kernel code. The file system code runs in user space and is bridged to the kernel through the FUSE module.

Geo-Replication: asynchronous, incremental replication of data from a master volume to a slave volume across a LAN, WAN, or the Internet.

GFID: each file or directory in a GFS volume is associated with a unique 128-bit identifier, which is used to simulate an inode.

Namespace: each Gluster volume exports a single namespace as a POSIX mount point.

Node: a device that hosts one or more bricks.

RDMA: remote direct memory access, which transfers data directly between the memory of two hosts without involving either operating system.

RRDNS (round-robin DNS): a load-balancing method in which DNS rotation returns a different device for each lookup.

Self-heal: a background process that detects inconsistencies between files and directories on replica volumes and resolves them.

Split-brain: a state in which the replicas of a file or directory have diverged (for example after a network partition) and can no longer be reconciled automatically.

Translator: a stackable module in the GlusterFS stack that implements a specific piece of functionality (such as distribution, replication, or caching); translators are chained together to form a volume.

Volfile: the configuration file for a glusterfs process, usually located under /var/lib/glusterd/vols/<volname>.

Volume: a logical set of bricks
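To see some of these concepts on disk, the commands below read a file's extended attributes directly on a brick, including the trusted.gfid attribute described above. This is a sketch: it assumes the getfattr tool (from the attr package) is installed and uses the brick path and test file from the example later in this article.

# Read all extended attributes of a file on a brick (run as root on the server, not the client)
getfattr -d -m . -e hex /data/gluster/testfile.txt
# Show only the 128-bit GFID of the file
getfattr -n trusted.gfid -e hex /data/gluster/testfile.txt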

II. Installation and configuration

1. Cluster architecture

System: CentOS 6.7 x86_64
Servers: 192.168.159.128, 192.168.159.129
Client: 192.168.159.130

2. Install on the server side

First install Gluster's yum source:
# yum install centos-release-gluster38 -y
Install glusterfs-server:
# yum install glusterfs-server -y
After the installation is complete, check the version:
# glusterfs -V
glusterfs 3.8.12 built on May 11 2017 18:24:27
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc.
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser General Public License, version 3 or any later version (LGPLv3 or later), or the GNU General Public License, version 2 (GPLv2), in all cases as published by the Free Software Foundation.
Start gluster:
# service glusterd start
Starting glusterd: [OK]
Enable start on boot:
# chkconfig --add glusterd
# chkconfig glusterd on
Check the listening port:
# netstat -tunlp
tcp    0    0 0.0.0.0:24007    0.0.0.0:*    LISTEN    65940/glusterd

3. Install on the client

First install Gluster's yum source:
# yum install centos-release-gluster38 -y
Install the client packages:
# yum install glusterfs glusterfs-fuse glusterfs-client glusterfs-libs -y

4. Configure on the server side

Note: when configuring a glusterfs cluster, you will generally attach new disks to the servers, partition and format them, and mount each whole disk on a directory dedicated to storing data. This improves data availability to a certain extent: if something goes wrong with the system disk, the data will not be lost. Reference, official website: http://gluster.readthedocs.io/en/latest/Quick-Start-Guide/Quickstart/

Configuring a glusterfs cluster requires:
1. At least 2 nodes
2. A network interconnecting the nodes
3. Each node needs 2 disks: one for the OS and one for the glusterfs brick; otherwise volume creation fails with:
volume create: gfs01: failed: The brick 192.168.159.128:/data/gluster is being created in the root partition. It is recommended that you don't use the system's root partition for storage backend. Or use 'force' at the end of the command if you want to override this behavior.

The official site says to format the disk as xfs, but some people format it as ext4 and add it to /etc/fstab:
/dev/sdb1 /opt ext4 defaults 0 0

So add a new disk on both servers, then partition, format, and mount it on /data:
# mkdir /data
# fdisk /dev/sdb      # create one partition: n, p, 1, press Enter twice, then w
# mkfs.ext4 /dev/sdb1
# mount /dev/sdb1 /data
Add to /etc/fstab:
/dev/sdb1 /data ext4 defaults 0 0

Start configuring. The following can be run on either server.

Add nodes (there is no need to probe the local machine):
# gluster peer probe 192.168.159.128
peer probe: success. Probe on localhost not needed
# gluster peer probe 192.168.159.129
peer probe: success.

View status:
# gluster peer status
Number of Peers: 1
Hostname: 192.168.159.129
Uuid: 1aed6e01-c497-4890-9447-c3bd548dd37f
State: Peer in Cluster (Connected)

View status on the other server:
# gluster peer status
Number of Peers: 1
Hostname: 192.168.159.128
Uuid: 851be337-84b2-460d-9a73-4eee6ad95e95
State: Peer in Cluster (Connected)

Create the gluster shared directory (needed on both the 128 and 129 servers):
# mkdir /data/gluster

Create a volume named gfs01:
# gluster volume create gfs01 replica 2 192.168.159.128:/data/gluster 192.168.159.129:/data/gluster
volume create: gfs01: success: please start the volume to access data

Start the volume:
# gluster volume start gfs01
volume start: gfs01: success

View volume information:
# gluster volume info
Volume Name: gfs01
Type: Replicate
Volume ID: 1cdefebd-5831-4857-b1f9-f522fc868c60
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.159.128:/data/gluster
Brick2: 192.168.159.129:/data/gluster
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

View volume status:
# gluster volume status
Status of volume: gfs01
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.159.128:/data/gluster          49152     0          Y       2319
Brick 192.168.159.129:/data/gluster          49152     0          Y       4659
Self-heal Daemon on localhost                N/A       N/A        Y       2339
Self-heal Daemon on 192.168.159.129          N/A       N/A        Y       4683

Task Status of Volume gfs01
------------------------------------------------------------------------------
There are no active volume tasks

Other commands:
# gluster volume stop gfs01      # stop the volume
# gluster volume delete gfs01    # delete the volume
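For reference, the official quick start linked above formats the brick disk as xfs rather than ext4. A minimal sketch of that variant, assuming the xfsprogs package is available, would look like this; the rest of the configuration is unchanged.

# xfs variant of the brick preparation (per the official quick start)
yum install xfsprogs -y
mkfs.xfs -i size=512 /dev/sdb1
mkdir -p /data
mount /dev/sdb1 /data
echo "/dev/sdb1 /data xfs defaults 0 0" >> /etc/fstab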

5. Configure mount on the client

Create a mount point and mount the volume:
# mkdir /opt/test_gluster
# mount -t glusterfs 192.168.159.128:/gfs01 /opt/test_gluster/
[root@client ~]# ll /opt/test_gluster/
total 0
[root@client ~]# df -h
Filesystem              Size  Used  Avail  Use%  Mounted on
/dev/sda3                18G  822M    16G    5%  /
tmpfs                   491M     0   491M    0%  /dev/shm
/dev/sda1               190M   27M   154M   15%  /boot
192.168.159.128:/gfs01  4.8G   11M   4.6G    1%  /opt/test_gluster

Note: 128 and 129 form one cluster, so you can mount either of them; pick whichever suits the client-side load.

Add an automatic mount at boot:
# cat /etc/rc.local
mount -t glusterfs 192.168.159.128:/gfs01 /opt/test_gluster/
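Instead of putting the mount command in /etc/rc.local, you can also mount the volume through /etc/fstab. A sketch, assuming the _netdev option so the mount waits for the network at boot, is shown below.

# /etc/fstab entry for the gluster volume (alternative to the rc.local approach)
192.168.159.128:/gfs01 /opt/test_gluster glusterfs defaults,_netdev 0 0
# Test the entry without rebooting
mount -a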

6. Test file

Create a test file testfile.txt on the client (130), write some content into it, and record its md5sum hash:
[root@client test_gluster]# ll /opt/test_gluster/
total 3
-rw-r--r-- 1 root root 2262 Jun 1 00:10 testfile.txt
[root@client test_gluster]# md5sum testfile.txt
6263c0489d0567985c28d7692bde624c  testfile.txt

Then check the two servers:
[root@server1 ~]# ll /data/gluster/
total 8
-rw-r--r-- 2 root root 2262 Jun 1 00:10 testfile.txt
[root@server1 ~]# md5sum /data/gluster/testfile.txt
6263c0489d0567985c28d7692bde624c  /data/gluster/testfile.txt
[root@server2 ~]# ll /data/gluster/
total 8
-rw-r--r-- 2 root root 2262 Jun 1 00:10 testfile.txt
[root@server2 ~]# md5sum /data/gluster/testfile.txt
6263c0489d0567985c28d7692bde624c  /data/gluster/testfile.txt

The md5sum hash is identical on the client and on both servers.

7. Test cluster data consistency

The point of a cluster is partly to share load and partly to provide high availability, so let's test that: shut down the 129 server directly, then check the cluster status on 128:

[root@server1 ~]# gluster volume status
Status of volume: gfs01
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.159.128:/data/gluster          49152     0          Y       2319
Self-heal Daemon on localhost                N/A       N/A        Y       2589

Task Status of Volume gfs01
------------------------------------------------------------------------------
There are no active volume tasks

You can see that the 129 node is no longer shown in the cluster.

Now delete the test file on client 130:
# rm -f /opt/test_gluster/testfile.txt
You will find that testfile.txt also disappears from the /data/gluster directory on server 128.

Now power the 129 server back on. Because glusterd is configured to start on boot, it starts automatically:
# service glusterd status
glusterd (pid 1368) is running...
[root@server2 ~]# ll /data/gluster/     # the file is still visible here, because this server was shut down before the client deleted it
total 8
-rw-r--r-- 2 root root 2262 Jun 1 00:10 testfile.txt
[root@server2 ~]# md5sum /data/gluster/testfile.txt     # the md5sum still matches the original testfile.txt
6263c0489d0567985c28d7692bde624c  /data/gluster/testfile.txt

After about 10 seconds, testfile.txt disappears from the 129 server as well, which shows that the gluster cluster keeps the data consistent.

Note: in a glusterfs cluster you should not write data directly on the servers. Write data on the client, and it will be synchronized to the servers automatically.
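A small helper like the one below can make the consistency checks in sections 6 and 7 less manual. It is a hypothetical sketch, assuming password-less SSH from the client to both servers and the host names and paths used in this article.

#!/bin/bash
# Compare the md5sum of a file on the client mount and on each server brick.
FILE=testfile.txt
echo "client: $(md5sum /opt/test_gluster/$FILE 2>/dev/null)"
for host in 192.168.159.128 192.168.159.129; do
    echo "$host: $(ssh root@$host md5sum /data/gluster/$FILE 2>/dev/null)"
done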

III. Commonly used gluster cluster commands

# Stop and delete a volume
gluster volume stop gfs01
gluster volume delete gfs01

# Remove a machine from the cluster
gluster peer detach 192.168.1.10

# Only allow clients from a given network to access glusterfs
gluster volume set gfs01 auth.allow 172.28.26.*
gluster volume set gfs01 auth.allow 192.168.222.1,192.168.*

# Add new machines and add them to the volume (because the replica count is 2, bricks must be added 2 at a time: 4, 6, 8, ...)
gluster peer probe 192.168.222.134
gluster peer probe 192.168.222.135

# Add new bricks to the volume
gluster volume add-brick gfs01 replica 2 192.168.222.134:/data/gluster 192.168.222.135:/data/gluster force

# Remove bricks from the volume
gluster volume remove-brick gfs01 replica 2 192.168.222.134:/opt/gfs 192.168.222.135:/opt/gfs start
gluster volume remove-brick gfs01 replica 2 192.168.222.134:/opt/gfs 192.168.222.135:/opt/gfs status
gluster volume remove-brick gfs01 replica 2 192.168.222.134:/opt/gfs 192.168.222.135:/opt/gfs commit
# Note: when expanding or shrinking a volume, the number of bricks added or removed must match the requirements of the volume type.

# After expanding or shrinking a volume, rebalance its data
gluster volume rebalance mamm-volume start|stop|status

# Migrate a volume -- mainly for moving data between bricks online
gluster volume replace-brick gfs01 192.168.222.134:/opt/gfs 192.168.222.134:/opt/test start force   # start the migration
gluster volume replace-brick gfs01 192.168.222.134:/opt/gfs 192.168.222.134:/opt/test status        # check the migration status
gluster volume replace-brick gfs01 192.168.222.134:/opt/gfs 192.168.222.134:/opt/test commit        # commit after the migration completes
gluster volume replace-brick gfs01 192.168.222.134:/opt/gfs 192.168.222.134:/opt/test commit force  # force the commit if the source machine has failed

# Trigger replica self-heal
gluster volume heal mamm-volume        # heal only the files that need it
gluster volume heal mamm-volume full   # heal all files
gluster volume heal mamm-volume info   # view self-heal details

# data-self-heal, metadata-self-heal and entry-self-heal enable or disable self-healing of file content, file metadata, and directory entries; all three are "on" by default.
# Example: turn one of them off
gluster volume set gfs01 entry-self-heal off

IV. GlusterFS shortcomings

Analysis reference: https://www.cnblogs.com/langren1992/p/5316328.html
Corrections and additions are welcome!
