
Introduction of distributed system and installation and basic configuration of MogileFS

2025-01-16 Update From: SLTechnology News&Howtos


Distributed MogileFS

Outline

Foreword:

What is a distributed system?

Why do distributed systems exist?

The difficulties of distributed systems, and an introduction to CAP, BASE, 2PC and X/Open XA

Distributed storage and distributed file systems:

MogileFS implementation principle:

MogileFS installation and configuration

Summary

Foreword:

Before we knew it, we had entered the era of big data. What is big data? What is distributed computing? What is cloud computing? Those topics will be introduced later; this article focuses on distributed systems.

What is a distributed system?

The word "distributed" sounds impressive, but in fact we have built distributed systems before (see the author's earlier blog posts): separating MySQL out of a LAMP stack, introducing Varnish to cache pages, load balancing Nginx or Apache with LVS, load balancing Tomcat with Nginx, and so on. All of these are distributed systems in a broad sense.

To put it simply, a distributed system is one whose components (MySQL, PHP, Apache...) are spread across hosts on a network, and those components communicate and coordinate their work only by passing messages.

Why do distributed systems exist?

The earlier blog posts on load balancing already answered this question. The main reasons are:

Vertical scaling (scaling up) is not cost-effective.

A single machine has a ceiling beyond which its performance cannot be improved.

In terms of stability and availability, a single machine has many problems.

The difficulties of distributed systems, and an introduction to CAP, BASE, 2PC and X/Open XA

Distributed systems face the following difficulties:

Lack of a global clock

Components fail independently of one another

Single points of failure are hard to deal with

Transactions are hard to implement

A transaction should have the ACID properties, but these are very hard to achieve in a distributed system.

A: Atomicity

C: Consistency

I: Isolation

D: Durability

Many databases implement transactions on a single machine, but once they are built into a distributed system, single-machine ACID can no longer be guaranteed. There are two options: 1. abandon transactions; 2. introduce distributed transactions.

Implementation of distributed transactions:

The main roles in a transaction:

Participants in the transaction

Servers that support transactions

Resource servers

Transaction manager

Model and specification of distributed transactions: Distributed Transaction Processing: The XA Specification

X/Open DTP: Distributed Transaction Processing

AP: Application Program

RM: Resource Manager, usually a DBMS

TM: Transaction Manager, responsible for coordinating and managing transactions; it provides a programming interface to the AP and manages the resource managers

2PC:

Two-Phase Commit protocol

A transaction first prepares its resources on every node. Only after all nodes report that their resources are ready do they all commit. If any node fails or the process is interrupted, they all roll back together, which keeps the data consistent.
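The prepare-then-commit flow described above can be sketched in a few lines of Python. This is only an in-memory illustration, not an XA implementation: the `Participant` class, its `can_prepare` switch and the state names are hypothetical, and real systems also have to handle coordinator crashes and timeouts.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """A resource manager that can prepare, commit, or roll back."""
    name: str
    can_prepare: bool = True   # hypothetical switch used to simulate a failure
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the resources could be reserved.
        self.state = "prepared" if self.can_prepare else "failed"
        return self.can_prepare

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled_back"


def two_phase_commit(participants: list) -> bool:
    """Return True if every participant committed, False if all rolled back."""
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2 (commit): everyone voted yes, so everyone commits.
        for p in participants:
            p.commit()
        return True
    # Phase 2 (abort): at least one vote was no, so everyone rolls back.
    for p in participants:
        p.rollback()
    return False


if __name__ == "__main__":
    nodes = [Participant("db1"), Participant("db2", can_prepare=False)]
    print(two_phase_commit(nodes), [p.state for p in nodes])
```

Because a single "no" vote rolls back every participant, no node ends up committed while another is not, which is the consistency property the protocol is after.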

CAP:

Proposed by Eric Brewer in July 2000 and later formally proved by others: a distributed system cannot satisfy all three of C, A and P at the same time.

C: Consistency: the data on all hosts is in sync

A: Availability: the system remains available (a host going down does not affect users)

P: Partition tolerance: even if a network failure partitions the system, it keeps operating.

In general, distributed systems compromise on C (Consistency).

BASE: an alternative to ACID

BA: Basically Available

S: Soft state: the system accepts that state may be out of sync for a period of time

E: Eventually Consistent

Compared with ACID, BASE is less demanding. BASE tolerates partial failures as long as they do not bring down the whole system.

DNS is the best-known implementation of eventual consistency.
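To make "soft state" and "eventually consistent" concrete, here is a toy Python model. It is a sketch under simplifying assumptions: the class names are hypothetical, replication is triggered by an explicit `sync()` call, and there is no conflict resolution. It loosely mirrors a DNS primary whose secondaries catch up later.

```python
class Replica:
    """One copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class EventuallyConsistentStore:
    """Writes land on the first replica; the others catch up on sync()."""

    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]
        self.pending = []  # (key, value) updates not yet propagated

    def write(self, key, value):
        # The write is acknowledged as soon as one replica has it.
        self.replicas[0].data[key] = value
        self.pending.append((key, value))

    def read(self, key, replica_index):
        # A read from a lagging replica may return stale data (soft state).
        return self.replicas[replica_index].data.get(key)

    def sync(self):
        # Propagate pending updates in order; afterwards all replicas agree.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica.data[key] = value
        self.pending.clear()


if __name__ == "__main__":
    store = EventuallyConsistentStore(["primary", "secondary"])
    store.write("www", "172.16.1.7")
    print(store.read("www", 1))  # stale read before sync
    store.sync()
    print(store.read("www", 1))  # consistent after sync
```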

Distributed storage and distributed file systems:

There are generally two types of storage:

Centralized:

NAS: Network Attached Storage; file-system level, such as NFS, FTP, SAMBA...

SAN: Storage Area Network; block level, such as IP SAN, FC SAN...

Distributed:

Storage with a central node: each cluster has nodes dedicated to storing metadata, while the other nodes each store part of the data

Storage without a central node: every node in the cluster stores both metadata and part of the data

Distributed storage versus distributed file systems:

File system: exposes a file system interface

Storage: has no file system interface and is accessed through an API

Common implementations:

GFS: Google File System

The pioneer of distributed file systems. It was developed for Google's internal needs; Google later published a paper describing its technical details, but it has never been open-sourced.

HDFS: Hadoop Distributed File System

HDFS was implemented on the basis of the paper published by Google.

Both GFS and HDFS keep metadata in memory and periodically flush it to persistent storage; they are only suitable for storing millions to tens of millions of large files.

GlusterFS:

Decentralized design, no metadata nodes

Ceph:

A file system implemented at the Linux kernel level; it has been merged into the Linux kernel

MogileFS:

Suitable for storing large numbers of small files. It is written in Perl; developers in China rewrote it in C and open-sourced the result as FastDFS

TFS:

Taobao File System, developed on the basis of HDFS, suitable for storing large numbers of small files

MogileFS implementation principle:

Terms in MogileFS:

Tracker: with the help of the database, it keeps the metadata of the files on every node so that data locations can be looked up easily. It also monitors each node, tells clients where to store data, and directs storage nodes to replicate data. The process is mogilefsd.

Database: stores the file metadata on behalf of the tracker nodes

Storage: converts a key in the specified domain into a unique file name and stores the file on a specific device. The converted file name is the value, and the key-to-value mapping is maintained automatically. Storage nodes transfer data over HTTP and depend on Perlbal; the processes are mogstored and perlbal.

Domain: keys within a domain are unique. One MogileFS installation can have multiple domains to store different types of files.

Class: the smallest unit of replication; it manages file properties and defines how many copies of a file are stored on different devices

Device: a storage node can have multiple devices, each of which is a directory used to store files. Every device has a device ID, which must be configured in the mogstored configuration file and cannot be deleted. A device can only be marked dead; once it is marked dead, its data cannot be recovered and its device ID can never be reused.
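The relationship between these terms can be sketched with a toy in-memory model. This is not MogileFS code: `TinyTracker` and its methods are hypothetical names, and the real tracker keeps this mapping in the database rather than in Python dictionaries.

```python
class TinyTracker:
    """Toy stand-in for the tracker plus its database (not MogileFS code)."""

    def __init__(self):
        self.next_fid = 1
        self.fid_by_key = {}      # (domain, key) -> fid
        self.devices_by_fid = {}  # fid -> device IDs that hold a copy

    def store(self, domain, key, devids):
        """Record that `key` in `domain` now lives on the given devices."""
        fid = self.next_fid
        self.next_fid += 1
        self.fid_by_key[(domain, key)] = fid
        self.devices_by_fid[fid] = list(devids)
        return fid

    def locate(self, domain, key):
        """Return (fid, device IDs) for a key, or None if it is unknown."""
        fid = self.fid_by_key.get((domain, key))
        if fid is None:
            return None
        return fid, self.devices_by_fid[fid]


if __name__ == "__main__":
    tracker = TinyTracker()
    tracker.store("files", "/fstab.txt", [2, 3])
    tracker.store("images", "/fstab.txt", [1])  # same key, different domain
    print(tracker.locate("files", "/fstab.txt"))
    print(tracker.locate("images", "/fstab.txt"))
```

The same key can exist in two domains without conflict because the lookup is always on the (domain, key) pair.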

MogileFS Architecture:

Features of MogileFS:

Works at the application layer; no special components required

No single point of failure

Automatic file replication

Simple namespace

No need for RAID

No append and no random writes

Data is uploaded to Storage Node (mogstored) through HTTP/WebDAV service

MySQL stores MogileFS metadata (namespace, location)
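Because files are replicated and served over plain HTTP, a client that has been given several replica URLs can simply try them in turn. The sketch below shows that idea with the Python standard library; `fetch_first_available` is a hypothetical helper rather than part of any MogileFS client, and the `opener` parameter exists only so the logic can be exercised without a network.

```python
from urllib.request import urlopen


def fetch_first_available(paths, opener=urlopen, timeout=5):
    """Return the body from the first replica URL that answers."""
    last_error = None
    for url in paths:
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            last_error = exc    # remember the failure and try the next copy
    raise IOError(f"all replicas failed: {last_error}")
```

Given the two replica URLs that the tracker reports for a file, a call such as `fetch_first_available([url_on_node2, url_on_node3])` would still succeed while one of the two storage nodes is down.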

High-availability architecture for MogileFS:

MogileFS compilation, installation and configuration

I originally intended to compile and install from source, but for various reasons I installed from rpm packages instead; the packages were provided by Ma Ge (MageEdu).

Due to time constraints, not every step of the experiment is described in detail here. For more, see the official wiki.

Experimental environment

node6 172.16.1.7 tracker, database

node7 172.16.1.8 storage

node8 172.16.1.9 storage

Installation: the EPEL repository is required, and the packages must be installed on every host.

[root@node6 ~]# yum install perl-Net-Netmask perl-IO-AIO # must be installed on every host, otherwise mogstored may not be able to listen on its port properly

[root@node6 ~]# yum localinstall MogileFS-Server-2.46-2.el6.noarch.rpm MogileFS-Server-mogilefsd-2.46-2.el6.noarch.rpm MogileFS-Server-mogstored-2.46-2.el6.noarch.rpm MogileFS-Utils-2.19-1.el6.noarch.rpm Perlbal-1.78-1.el6.noarch.rpm Perlbal-doc-1.78-1.el6.noarch.rpm perl-MogileFS-Client-1.14-1.el6.noarch.rpm perl-Net-Netmask-1.9015-8.el6.noarch.rpm perl-Perlbal-1.78-1.el6.noarch.rpm

[root@node6 ~]# yum install mysql-server

[root@node6 ~]# scp *.rpm 172.16.1.8:/root/

[root@node6 ~]# scp *.rpm 172.16.1.9:/root/

[root@node7 ~]# yum install perl-Net-Netmask perl-IO-AIO # must be installed on every host, otherwise mogstored may not be able to listen on its port properly

[root@node7 ~]# yum localinstall MogileFS-Server-2.46-2.el6.noarch.rpm MogileFS-Server-mogilefsd-2.46-2.el6.noarch.rpm MogileFS-Server-mogstored-2.46-2.el6.noarch.rpm MogileFS-Utils-2.19-1.el6.noarch.rpm Perlbal-1.78-1.el6.noarch.rpm Perlbal-doc-1.78-1.el6.noarch.rpm perl-MogileFS-Client-1.14-1.el6.noarch.rpm perl-Net-Netmask-1.9015-8.el6.noarch.rpm perl-Perlbal-1.78-1.el6.noarch.rpm

[root@node8 ~]# yum install perl-Net-Netmask perl-IO-AIO # must be installed on every host, otherwise mogstored may not be able to listen on its port properly

[root@node8 ~]# yum localinstall MogileFS-Server-2.46-2.el6.noarch.rpm MogileFS-Server-mogilefsd-2.46-2.el6.noarch.rpm MogileFS-Server-mogstored-2.46-2.el6.noarch.rpm MogileFS-Utils-2.19-1.el6.noarch.rpm Perlbal-1.78-1.el6.noarch.rpm Perlbal-doc-1.78-1.el6.noarch.rpm perl-MogileFS-Client-1.14-1.el6.noarch.rpm perl-Net-Netmask-1.9015-8.el6.noarch.rpm perl-Perlbal-1.78-1.el6.noarch.rpm

Configure the database:

[root@node6 ~]# service mysqld start

mysql> GRANT ALL ON *.* TO root@'%' IDENTIFIED BY 'passwd'; # configure a root user who can connect remotely
Query OK, 0 rows affected (0.00 sec)

mysql> GRANT ALL ON mogilefs.* TO mogileuser@'%' IDENTIFIED BY 'passwd'; # configure a user who can manage the mogilefs database
Query OK, 0 rows affected (0.00 sec)

mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE DATABASE mogilefs; # create the mogilefs database
Query OK, 1 row affected (0.00 sec)

[root@node6 ~]# mogdbsetup --dbhost=172.16.1.7 --dbuser=mogileuser --dbpass=passwd --dbname=mogilefs --dbrootpass=passwd # generate the data tables

mysql> USE mogilefs
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

mysql> SHOW TABLES; # check that the tables have been generated
+----------------------+
| Tables_in_mogilefs   |
+----------------------+
| checksum             |
| class                |
| device               |
| domain               |
| file                 |
| file_on              |
| file_on_corrupt      |
| file_to_delete       |
| file_to_delete2      |
| file_to_delete_later |
| file_to_queue        |
| file_to_replicate    |
| fsck_log             |
| host                 |
| server_settings      |
| tempfile             |
| unreachable_fids     |
+----------------------+
17 rows in set (0.00 sec)

Configure mogilefsd

[root@node6 ~]# vim /etc/mogilefs/mogilefsd.conf

db_dsn = DBI:mysql:mogilefs:host=172.16.1.7

db_user = mogileuser

db_pass = passwd

listen = 0.0.0.0:7001

conf_port = 7001
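A quick way to sanity-check such a file before starting the service is to parse it and look at the values. The following is a minimal sketch that assumes the file consists only of `key = value` lines plus optional `#` comments; `parse_conf` and `listen_port` are hypothetical helpers, not MogileFS tools.

```python
def parse_conf(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, because values such as the DSN
        # contain '=' themselves.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key = value line: {line!r}")
        settings[key.strip()] = value.strip()
    return settings


def listen_port(settings):
    """Extract the port number from a 'host:port' listen value."""
    _host, _sep, port = settings["listen"].rpartition(":")
    return int(port)


SAMPLE = """\
db_dsn = DBI:mysql:mogilefs:host=172.16.1.7
db_user = mogileuser
db_pass = passwd
listen = 0.0.0.0:7001
conf_port = 7001
"""

if __name__ == "__main__":
    conf = parse_conf(SAMPLE)
    print(conf["db_dsn"], listen_port(conf))
```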

[root@node6 ~]# service mogilefsd start
Starting mogilefsd [OK]

[root@node6 ~]# mogadm host add node1 --ip=172.16.1.7 alive

[root@node6 ~]# mogadm host add node2 --ip=172.16.1.8 alive

[root@node6 ~]# mogadm host add node3 --ip=172.16.1.9 alive

[root@node6 ~]# mogadm host list
node1 [1]: down
  IP: 172.16.1.7:7500
node2 [2]: down
  IP: 172.16.1.8:7500
node3 [3]: down
  IP: 172.16.1.9:7500

Configure mogstored

[root@node6 ~]# mkdir /data/mogilefs/dev1 -pv
mkdir: created directory `/data'
mkdir: created directory `/data/mogilefs'
mkdir: created directory `/data/mogilefs/dev1'

[root@node6 ~]# vim /etc/mogilefs/mogstored.conf

maxconns = 10000

httplisten = 0.0.0.0:7500

mgmtlisten = 0.0.0.0:7501

docroot = /data/mogilefs/

[root@node6 ~]# chown mogilefs.mogilefs /data/mogilefs/ -R

[root@node6 ~]# service mogstored start
Starting mogstored [OK]

[root@node7 ~]# mkdir /data/mogilefs/dev2 -pv
mkdir: created directory `/data'
mkdir: created directory `/data/mogilefs'
mkdir: created directory `/data/mogilefs/dev2'

[root@node7 ~]# vim /etc/mogilefs/mogstored.conf

maxconns = 10000

httplisten = 0.0.0.0:7500

mgmtlisten = 0.0.0.0:7501

docroot = /data/mogilefs/

[root@node7 ~]# chown mogilefs.mogilefs /data/mogilefs/ -R

[root@node7 ~]# service mogstored start
Starting mogstored [OK]

[root@node8 ~]# mkdir /data/mogilefs/dev3 -pv
mkdir: created directory `/data'
mkdir: created directory `/data/mogilefs'
mkdir: created directory `/data/mogilefs/dev3'

[root@node8 ~]# vim /etc/mogilefs/mogstored.conf

maxconns = 10000

httplisten = 0.0.0.0:7500

mgmtlisten = 0.0.0.0:7501

docroot = /data/mogilefs/

[root@node8 ~]# chown mogilefs.mogilefs /data/mogilefs/ -R

[root@node8 ~]# service mogstored start
Starting mogstored [OK]

[root@node6 ~]# mogadm device add node1 1 alive

[root@node6 ~]# mogadm device add node2 2 alive

[root@node6 ~]# mogadm device add node3 3 alive

[root@node6 ~]# mogadm check
Checking trackers...
  127.0.0.1:7001 ... OK

Checking hosts...
  [1] node1 ... OK
  [2] node2 ... OK
  [3] node3 ... OK

Checking devices...
  host  device   size(G)   used(G)   free(G)   use%    ob state    I/O%
  ----  ------   -------   -------   -------   -----   ---------   ----
  [1]   dev1      74.435     2.069    72.366   2.78%   writeable   28.9
  [2]   dev2      74.435     1.958    72.477   2.63%   writeable    0.0
  [3]   dev3      74.435     1.954    72.481   2.63%   writeable    0.5
  ----  ------   -------   -------   -------   -----
        total:   223.306     5.982   217.324   2.68%
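The use% column is simply used space divided by total size. The short check below recomputes it from the figures in the output above; the `use_percent` helper is a hypothetical name, and the figures are copied from the transcript rather than measured.

```python
def use_percent(size_g, used_g):
    """Used space as a percentage of total size, rounded to two decimals."""
    return round(used_g / size_g * 100, 2)


# (size in GB, used in GB) per device, copied from the mogadm check output.
DEVICES = {
    "dev1": (74.435, 2.069),
    "dev2": (74.435, 1.958),
    "dev3": (74.435, 1.954),
}

if __name__ == "__main__":
    for name, (size_g, used_g) in DEVICES.items():
        print(name, use_percent(size_g, used_g))
    # The total row uses the totals printed by mogadm check.
    print("total", use_percent(223.306, 5.982))
```

The recomputed values agree with the 2.78%, 2.63%, 2.63% and 2.68% shown in the table.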

Create a domain

[root@node6 ~]# mogupload --trackers=172.16.1.7 --doma^C

[root@node6 ~]# mogadm domain list
 domain   class     mindevcount   replpolicy        hashtype
-------- --------- ------------- ----------------- ---------

[root@node6 ~]# mogadm domain add files

[root@node6 ~]# mogadm domain add images

[root@node6 ~]# mogadm domain list
 domain   class     mindevcount   replpolicy        hashtype
-------- --------- ------------- ----------------- ---------
 files    default   2             MultipleHosts()   NONE
 images   default   2             MultipleHosts()   NONE

Create a class

[root@node6 ~]# mogadm class list
 domain   class      mindevcount   replpolicy        hashtype
-------- ---------- ------------- ----------------- ---------
 files    default    2             MultipleHosts()   NONE
 images   default    2             MultipleHosts()   NONE

[root@node6 ~]# mogadm class add files fulltext --mindevcount=1

[root@node6 ~]# mogadm class list
 domain   class      mindevcount   replpolicy        hashtype
-------- ---------- ------------- ----------------- ---------
 files    default    2             MultipleHosts()   NONE
 files    fulltext   1             MultipleHosts()   NONE
 images   default    2             MultipleHosts()   NONE
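The mindevcount column says how many copies a class wants, and the MultipleHosts() policy name suggests that those copies should land on different hosts. The sketch below illustrates that idea only; it is a simplification under my own assumptions (prefer the device with the most free space, at most one copy per host) and is not the actual MogileFS replication policy code.

```python
def pick_devices(devices, mindevcount):
    """Pick up to `mindevcount` device IDs, at most one per host.

    `devices` is a list of (devid, hostname, free_gb) tuples. Devices with
    more free space are preferred (an assumption made for this sketch).
    """
    chosen = []
    used_hosts = set()
    for devid, host, _free_gb in sorted(devices, key=lambda d: d[2], reverse=True):
        if host in used_hosts:
            continue  # never place two copies on the same host
        chosen.append(devid)
        used_hosts.add(host)
        if len(chosen) == mindevcount:
            break
    return chosen


if __name__ == "__main__":
    # Device IDs, host names and free space as used in this experiment.
    cluster = [(1, "node1", 72.366), (2, "node2", 72.477), (3, "node3", 72.481)]
    print(pick_devices(cluster, 2))  # class "default", mindevcount=2
    print(pick_devices(cluster, 1))  # class "fulltext", mindevcount=1
```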

Upload and view files

[root@node6 ~]# mogupload --trackers=172.16.1.7 --domain=files --key='/fstab.txt' --file=/etc/fstab

[root@node6 ~]# mogfileinfo --trackers=172.16.1.7 --domain=files --key='/fstab.txt'
- file: /fstab.txt
     class: default
  devcount: 2
    domain: files
       fid: 2
       key: /fstab.txt
    length: 711
 - http://172.16.1.8:7500/dev2/0/000/000/0000000002.fid
 - http://172.16.1.9:7500/dev3/0/000/000/0000000002.fid
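The two URLs show how a fid maps to a path on a device: the fid is zero-padded to ten digits and the leading digits become directory levels. The function below reproduces that layout as inferred from this output; `fid_path` is a hypothetical helper, and the layout is only assumed to hold for fids of up to ten digits.

```python
def fid_path(devid, fid):
    """Build the on-device path for a fid, as seen in the mogfileinfo output."""
    padded = f"{fid:010d}"  # fid 2 becomes "0000000002"
    # Directory levels are taken from the first 1, next 3 and next 3 digits.
    return f"/dev{devid}/{padded[0]}/{padded[1:4]}/{padded[4:7]}/{padded}.fid"


if __name__ == "__main__":
    for host, devid in (("172.16.1.8", 2), ("172.16.1.9", 3)):
        print(f"http://{host}:7500{fid_path(devid, 2)}")
```

Spreading files over nested directories this way keeps any single directory from accumulating an enormous number of entries.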

Verification

Summary

Configuring MogileFS is fairly easy, but the theory of distributed systems matters more than the configuration, and that is what we must keep in mind!

The author's skill is limited; if you find any mistakes, please point them out promptly. If you think this article is good, please give it a like ~ (≧▽≦) / ~

Author: AnyISaIln QQ: 1449472454

Thank you: MageEdu
