
Build a GFS Distributed File System: Practice


1. Brief introduction to GlusterFS:

GlusterFS (GFS) is a scalable distributed file system for large distributed applications that access large amounts of data. It runs on inexpensive commodity hardware, provides fault tolerance, and can deliver high-performance service to a large number of users.

An open-source distributed file system

It is composed of storage servers, clients, and an NFS/Samba storage gateway.

(1) characteristics of GlusterFS:

Scalability and high performance

High availability

Global uniform namespace

Elastic volume management

Based on standard protocol

(2) Modular stack architecture:

1. Modular, stackable structure

2. Complex functions are realized by combining modules.

3. GlusterFS workflow:

4. Elastic HASH algorithm:

(1) a 32-bit integer is obtained with the HASH algorithm

(2) the 32-bit hash space is divided into N contiguous subspaces, each corresponding to a Brick

(3) advantages of the elastic HASH algorithm (a toy sketch follows below):

(4) data is distributed evenly across the Bricks.

(5) the dependence on a metadata server is removed, which eliminates the single point of failure and the server access bottleneck.
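To make the range-splitting idea concrete, here is a toy bash sketch. It is an illustration only, not GlusterFS's actual hash function: md5 is used as a stand-in 32-bit hash and the brick count of 4 is arbitrary.

#!/bin/bash
# Toy illustration of elastic-HASH-style placement: split the 32-bit hash
# space into N contiguous ranges and pick the brick whose range holds the hash.
N=4                                                     # number of bricks (arbitrary for the demo)
FILE="demo1.log"
HASH=$((16#$(echo -n "$FILE" | md5sum | cut -c1-8)))    # stand-in 32-bit hash value
RANGE=$(( 0xFFFFFFFF / N + 1 ))                         # size of each brick's subspace
BRICK=$(( HASH / RANGE ))                               # index of the brick owning this hash
echo "$FILE -> hash $HASH -> brick$BRICK"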

2. Volume types of GlusterFS:

(1) distributed volumes:

(1) the file is not divided into blocks.

(2) HASH values are saved in extended file attributes

(3) the underlying file systems supported are ext3, ext4, ZFS, XFS, etc.

Features:

(1) Files are distributed on different servers and do not have redundancy

(2) the volume size can be expanded easily and cheaply

(3) a single point of failure will result in data loss.

(4) relies on the underlying data protection.

(2) striped volume:

(1) the file is divided into N blocks (N stripe nodes) according to the offset and stored round-robin on each Brick Server node

(2) performance is particularly outstanding when storing large files

(3) no redundancy, similar to RAID 0

Features:

(1) the data is divided into smaller blocks and distributed to different stripes across the Brick servers

(2) distribution reduces the load, and the smaller files speed up access

(3) No data redundancy.

(3) replica volume:

(1) one or more copies of the same file are kept

(2) the disk utilization of replication mode is low because copies are stored

(3) if the storage space on the nodes is inconsistent, the capacity of the smallest node is taken as the total capacity of the volume

Features:

(1) all servers in the volume keep a complete copy

(2) the number of copies can be specified when the volume is created

(3) requires at least two Brick servers

(4) provides disaster tolerance.

(4) distributed striped volume:

(1) combines the functions of distributed volumes and striped volumes

(2) mainly used for large-file access

(3) requires at least 4 servers.

(5) distributed replica volumes:

(1) combines the functions of distributed volumes and replica volumes

(2) used where redundancy is needed (a generic syntax sketch follows below)
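For reference, the generic create syntax for these volume types looks roughly as follows. This is only a sketch with placeholder names in angle brackets; the concrete commands used in this walkthrough appear in section 4 below.

gluster volume create <vol> <node>:<brick> <node>:<brick>                   # distributed volume
gluster volume create <vol> stripe 2 <node>:<brick> <node>:<brick>          # striped volume
gluster volume create <vol> replica 2 <node>:<brick> <node>:<brick>         # replica volume
gluster volume create <vol> stripe 2 <brick1> <brick2> <brick3> <brick4>    # distributed striped volume
gluster volume create <vol> replica 2 <brick1> <brick2> <brick3> <brick4>   # distributed replica volume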

3. Getting started with GlusterFS:

Five virtual machines: one as the client and four as node servers, with four disks (20 GB each) added to each node virtual machine.

1. To partition, format, and mount each disk, the following script can be used:

vim disk.sh        # one-click script to partition, format, and mount the disks

#!/bin/bash
echo "the disks exist list:"
fdisk -l | grep 'disk /dev/sd[a-z]'
echo "=================================================="
PS3="chose which disk you want to create:"
select VAR in `ls /dev/sd* | grep -o 'sd[b-z]' | uniq` quit
do
    case $VAR in
    sda)
        fdisk -l /dev/sda
        break ;;
    sd[b-z])
        # create a primary partition, accepting the defaults for the remaining prompts
        echo -e "n\np\n\n\n\nw" | fdisk /dev/$VAR
        # make the filesystem
        mkfs.xfs -i size=512 /dev/${VAR}1 &> /dev/null
        # mount the filesystem
        mkdir -p /data/${VAR}1 &> /dev/null
        echo "/dev/${VAR}1 /data/${VAR}1 xfs defaults 0 0" >> /etc/fstab
        mount -a &> /dev/null
        break ;;
    quit)
        break ;;
    *)
        echo "wrong disk,please check again" ;;
    esac
done

2. Operations on the four node servers

(1) modify the hostname (node1, node2, node3, node4) and turn off the firewall.
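For example, a minimal sketch assuming CentOS 7 with firewalld; run on each node with its own hostname:

hostnamectl set-hostname node1    # use node2/node3/node4 on the other machines
systemctl stop firewalld
systemctl disable firewalld
setenforce 0                      # assumption: SELinux is also set to permissive (not stated in the original)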

(2) Edit the hosts file, adding each hostname and IP address. (When a hostname is resolved, the system first looks it up in the hosts file; if it is found, that IP address is used directly, otherwise the name is submitted to a DNS server for resolution.)

vim /etc/hosts

192.168.220.172 node1
192.168.220.131 node2
192.168.220.140 node3
192.168.220.136 node4

(3) set up the yum repository and install GlusterFS:

cd /opt/
mkdir /abc
mount.cifs //192.168.10.157/MHA /abc      # remotely mount the shared repo to the local machine
cd /etc/yum.repos.d/
mkdir bak
mv Cent* bak/                             # move the original repo files into the new folder
vim GLFS.repo                             # create a new repo file with the following content:

[GLFS]
name=glfs
baseurl=file:///abc/gfsrepo
gpgcheck=0
enabled=1

(4) install the software package

yum -y install glusterfs glusterfs-server glusterfs-fuse glusterfs-rdma

(5) start the service

systemctl start glusterd
systemctl status glusterd

(6) View status

3. Time synchronization (needs to be done on each node)

ntpdate ntp1.aliyun.com    # time synchronization

Build the storage trust pool by probing the three other nodes from one host:

This is done on the node1 node:

gluster peer probe node2
gluster peer probe node3
gluster peer probe node4
gluster peer status        # view the status of all nodes

4. Create the various volumes

1. Build a distributed volume

gluster volume create dis-vol node1:/data/sdb1 node2:/data/sdb1 force
# created using two disks, one on node1 and one on node2; dis-vol is the volume name; force means create forcibly
gluster volume start dis-vol    # start the volume
gluster volume info dis-vol     # view the status

2. Create stripe volume

gluster volume create stripe-vol stripe 2 node1:/data/sdc1 node2:/data/sdc1 force
gluster volume start stripe-vol
gluster volume info stripe-vol

3. Create a replication volume

gluster volume create rep-vol replica 2 node3:/data/sdb1 node4:/data/sdb1 force
gluster volume start rep-vol
gluster volume info rep-vol

4. Distributed stripe volume

gluster volume create dis-stripe stripe 2 node1:/data/sdd1 node2:/data/sdd1 node3:/data/sdd1 node4:/data/sdd1 force
gluster volume start dis-stripe
gluster volume info dis-stripe

5. Distributed replication volumes

gluster volume create dis-rep replica 2 node1:/data/sde1 node2:/data/sde1 node3:/data/sde1 node4:/data/sde1 force
gluster volume start dis-rep
gluster volume info dis-rep

6. Client configuration

(1) turn off the firewall

(2) configure and install GFS source:

cd /opt/
mkdir /abc
mount.cifs //192.168.10.157/MHA /abc      # remotely mount the shared repo to the local machine
cd /etc/yum.repos.d/
vim GLFS.repo                             # create a new repo file with the following content:

[GLFS]
name=glfs
baseurl=file:///abc/gfsrepo
gpgcheck=0
enabled=1

(3) install the software package

yum -y install glusterfs glusterfs-fuse

(4) modify hosts file:

vim /etc/hosts

192.168.220.172 node1
192.168.220.131 node2
192.168.220.140 node3
192.168.220.136 node4

(5) create temporary mount points and mount each volume:

mkdir -p /text/dis                                  # recursively create a mount point
mount.glusterfs node1:dis-vol /text/dis/            # mount the distributed volume
mkdir /text/strip
mount.glusterfs node1:stripe-vol /text/strip/       # mount the striped volume
mkdir /text/rep
mount.glusterfs node3:rep-vol /text/rep/            # mount the replica volume
mkdir /text/dis-str
mount.glusterfs node2:dis-stripe /text/dis-str/     # mount the distributed striped volume
mkdir /text/dis-rep
mount.glusterfs node4:dis-rep /text/dis-rep/        # mount the distributed replica volume
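Optionally (not part of the original steps), the client mounts can be made persistent across reboots with /etc/fstab entries along these lines:

# append to /etc/fstab on the client (example for the distributed volume only)
node1:dis-vol  /text/dis  glusterfs  defaults,_netdev  0 0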

(6) df -hT: view the mount information

5. Test each volume

(1) create five 40 MB files:

dd if=/dev/zero of=/demo1.log bs=1M count=40
dd if=/dev/zero of=/demo2.log bs=1M count=40
dd if=/dev/zero of=/demo3.log bs=1M count=40
dd if=/dev/zero of=/demo4.log bs=1M count=40
dd if=/dev/zero of=/demo5.log bs=1M count=40

(2) copy the five files to the different volumes:

cp /demo* /text/dis
cp /demo* /text/strip
cp /demo* /text/rep/
cp /demo* /text/dis-str
cp /demo* /text/dis-rep

(3) check how the files are distributed on the bricks (see the command summary after this list): ll -h /data/sdb1

1. Distributed volumes:

It can be seen that every file is complete.

2. Stripe volume:

All files are divided into halves for distributed storage.

3. Replica volume:

All files are copied completely and stored.

4. Distributed stripe volume:

5. Distributed replication volumes:
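The original post showed screenshots for these checks; to reproduce them, these are roughly the bricks to inspect on the node servers (paths follow the bricks used when the volumes were created above):

ll -h /data/sdb1    # on node1/node2: distributed volume bricks (whole files)
ll -h /data/sdc1    # on node1/node2: striped volume bricks (each file split in half, about 20M)
ll -h /data/sdb1    # on node3/node4: replica volume bricks (complete 40M copies)
ll -h /data/sdd1    # on all four nodes: distributed striped volume bricks
ll -h /data/sde1    # on all four nodes: distributed replica volume bricks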

(4) failure test:

Now shut down the second node server to simulate downtime, and then check each volume on the client:
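A minimal sketch of this test (the original simply powered off the node2 virtual machine):

# on node2: simulate the outage
systemctl poweroff        # or power off the virtual machine from the hypervisor
# on the client: re-check each mount point
ls /text/dis /text/strip /text/rep /text/dis-str /text/dis-rep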

Summary:

1. Distributed volume: all the files are still there.

2. Replica volume: all the files are still there.

3. Distributed striped volume: only demo5.log remains on the mount; the other 4 files have been lost.

4. Distributed replica volume: all the files are still there.

5. Striped volume: all files are lost.

(5) other operations:

1. Delete the volume (stop first, then delete):

gluster volume stop <volume name>
gluster volume delete <volume name>

2. Blacklist and whitelist settings:

gluster volume set <volume name> auth.reject 192.168.220.100    # refuse this host to mount the volume
gluster volume set <volume name> auth.allow 192.168.220.100     # allow this host to mount the volume
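Two other commonly used commands that may help when verifying these settings (not covered in the original post):

gluster volume list      # list all volumes
gluster volume status    # show the status of bricks and related processes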
