2025-01-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
What is Lsyncd?
GitHub: axkibe/lsyncd
Official documentation: Lsyncd - Live Syncing (Mirror) Daemon
Lsyncd (Live Syncing Daemon) is an open-source real-time data synchronization tool (a background process) based on inotify and rsync. It is an auxiliary file synchronization tool: it listens for file change events from the system and calls rsync to synchronize. Note the word "auxiliary": lsyncd does not transfer files itself. It is only responsible for detecting which files have changed and then calling rsync; the actual synchronization is done by rsync. If you are not familiar with rsync, see the rsync section of "Real-time file synchronization using sersync + rsync".
Three synchronization modes of Lsyncd
default.rsync
default.rsyncssh
default.direct
For convenience, we directly call them rsync, rsyncssh, direct.
1. rsync synchronization mode
First, you need to know the basic usage of rsync. The following rsync command pushes files from the local "/data/wwwroot" directory to the remote "remote@192.168.1.6::wwwroot/" module. When lsyncd runs in rsync synchronization mode, it synchronizes files by assembling commands like this one:
rsync -avz --partial --delete /data/wwwroot remote@192.168.1.6::wwwroot/ --password-file=/etc/rsyncd.password
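To make "assembling a command" concrete, here is a minimal sketch that only prints the command instead of running it. The helper name build_rsync_cmd is hypothetical and not part of lsyncd; the options are the ones from the command above.

```shell
#!/bin/sh
# Sketch only: print the rsync command for a given source and target
# instead of executing it. build_rsync_cmd is a hypothetical helper.
build_rsync_cmd() {
    src=$1      # local source directory
    target=$2   # rsync daemon target in user@host::module/ form
    pwfile=$3   # file that holds the rsync daemon password
    printf 'rsync -avz --partial --delete %s %s --password-file=%s\n' \
        "$src" "$target" "$pwfile"
}

build_rsync_cmd /data/wwwroot 'remote@192.168.1.6::wwwroot/' /etc/rsyncd.password
```

Printing the command first is a convenient way to check paths and the module name before letting anything run with --delete.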
Some people may wonder: since rsync itself can synchronize, why do you need lsyncd? There are two reasons:
Real-time synchronization. rsync cannot know when to synchronize: it only scans files to determine what has been modified when a synchronization command is executed. On its own, the best you can do is set up a scheduled task that runs at regular intervals (every 5 minutes, every 10 minutes, and so on), which does synchronize, but not in real time. Lsyncd monitors file changes: when a file is modified, lsyncd is notified (through inotify on Linux, or fsevents on macOS) and then calls rsync to synchronize the modified files (that is, it assembles an rsync command and executes it), which achieves the effect of "real-time synchronization".
Reduced delay and performance loss caused by rsync scanning files. When lsyncd calls rsync, it uses options such as rsync's --include-from=FILE to specify exactly which files to synchronize. Why does this help? If rsync has to find out by itself which files have changed, it takes more time and server resources. Imagine that, among 1 million files, a single full stop was added to one file. Without lsyncd telling rsync which file changed, rsync has to scan all 1 million files to discover that only one is different. rsync's scan is very efficient, but it is still unnecessary work, and avoiding it is how lsyncd reduces latency and load.
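The idea of handing rsync an explicit file list can be sketched as follows. This is an illustration under assumptions, not lsyncd's actual code: include_list is a hypothetical helper that turns changed paths (relative to the source directory) into include rules. Parent directories are emitted too, because rsync only descends into directories that are themselves included when everything else is excluded.

```shell
#!/bin/sh
# Hypothetical helper: read changed paths (relative to the source directory)
# on stdin and print an include list containing each path and its parents.
include_list() {
    while IFS= read -r path; do
        prefix=""
        rest=$path
        # Emit every parent directory, outermost first.
        while [ "${rest#*/}" != "$rest" ]; do
            prefix="$prefix${rest%%/*}/"
            rest=${rest#*/}
            printf '%s\n' "$prefix"
        done
        printf '%s\n' "$path"
    done | awk '!seen[$0]++'   # drop duplicate lines, keep first occurrence
}

printf 'dir2/bb.txt\ndir2/cc.txt\n' | include_list

# The list could then be fed to rsync on stdin, for example:
#   include_list < changed.txt | rsync -a --include-from=- --exclude='*' SRC DEST
# (changed.txt, SRC and DEST are placeholders.)
```

With such a list rsync touches only the named entries instead of walking the whole tree.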
2. rsyncssh synchronization mode
Once you understand the rsync mode, the rsyncssh mode is not difficult: rsync itself can run over ssh, and lsyncd is again responsible for detecting which files have changed and then assembling the rsync commands.
The main benefit of the rsyncssh mode shows when files are moved.
Suppose there are two machines, A and B, and changes in the test directory on A are automatically synchronized to the test directory on B. The test directories on A and B currently look like this:
├── dir1
│   ├── aa.txt
│   ├── bb.txt
│   └── cc.txt
└── dir2
    └── dd.txt
Now suppose bb.txt and cc.txt on machine A are moved from dir1 to dir2, so the tree becomes:
├── dir1
│   └── aa.txt
└── dir2
    ├── bb.txt
    ├── cc.txt
    └── dd.txt
In the normal rsync mode, rsync first deletes bb.txt and cc.txt from dir1 on machine B, and then uploads bb.txt and cc.txt from machine A into dir2 on machine B. In rsyncssh mode, the files are instead moved from dir1 to dir2 directly on machine B with the mv command, and machine A sends no file data to machine B, so the efficiency gain is obvious (especially when a lot of data is moved). From this point of view the rsyncssh mode looks like the best choice, but it has one disadvantage: synchronization can only run as a single process (maxProcesses = 1), while the rsync mode can run several synchronization processes in parallel (which is faster).
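The move case can be simulated locally. In this sketch a temporary directory stands in for machine B (an assumption made purely for illustration); the mv at the end is the kind of operation the rsyncssh mode performs on the target, so no file content has to be transferred.

```shell
#!/bin/sh
# Local simulation only: $B is a temporary directory standing in for machine B.
B=$(mktemp -d)
trap 'rm -rf "$B"' EXIT

# Initial state on B, mirroring the first tree above.
mkdir -p "$B/dir1" "$B/dir2"
for f in aa.txt bb.txt cc.txt; do echo "content of $f" > "$B/dir1/$f"; done
echo "content of dd.txt" > "$B/dir2/dd.txt"

# rsyncssh-style handling of the move: rename on the target itself.
mv "$B/dir1/bb.txt" "$B/dir1/cc.txt" "$B/dir2/"

ls "$B/dir1" "$B/dir2"
```

The plain rsync mode would reach the same end state with a delete in dir1 followed by a fresh upload into dir2.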
3. direct synchronization mode
This mode is used for synchronization between two local directories, not for synchronization to a remote server. Lsyncd again listens for file change events and then synchronizes the changed files from the source directory to the target directory. The synchronization commands are ordinary Linux commands such as cp, rm and mv: added files are copied with cp, deleted files are removed from the target with rm, and moved files are moved on the target with mv.
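The event-to-command mapping described above can be sketched like this. The event names and the direct_action helper are hypothetical labels chosen for the example, not lsyncd identifiers, and the commands are only printed.

```shell
#!/bin/sh
# Hypothetical sketch: map a file event to the local command that mirrors it.
# Event names and direct_action are illustrative, not lsyncd identifiers.
direct_action() {
    event=$1; src=$2; dst=$3
    case $event in
        create|modify) printf 'cp -r %s %s\n' "$src" "$dst" ;;   # copy new or changed file
        delete)        printf 'rm -rf %s\n' "$dst" ;;            # remove it from the target
        move)          printf 'mv %s %s\n' "$src" "$dst" ;;      # rename inside the target
        *)             echo "unknown event: $event" >&2; return 1 ;;
    esac
}

direct_action create /home/user/src/a.txt /home/user/trg/a.txt
direct_action delete /home/user/src/old.txt /home/user/trg/old.txt
# For a move, both paths are inside the target directory.
direct_action move /home/user/trg/a.txt /home/user/trg/b.txt
```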
How synchronization works, in brief
Suppose there are two machines, A and B, and A is synchronized to B. Then:
A: install lsyncd and rsync, and run the lsyncd service.
B: install only rsync, and run the rsyncd service.
When lsyncd on A detects a file change, it calls rsync on A to push the file to B. B can receive the push because it is running the rsyncd service, and this completes the synchronization.
There can also be machines C, D, E, F and so on. They are set up exactly like B: as long as the rsyncd service is running on them, A can be configured to push to several machines at once.
Install lsyncd
CentOS uses yum; other systems use their own package manager, such as apt-get on Ubuntu or brew install on macOS:
yum -y install rsync lsyncd
After lsyncd is installed, the default configuration file is /etc/lsyncd.conf, and there are further configuration examples under /usr/share/doc/lsyncd-2.2.2/examples:
/usr/share/doc/lsyncd-2.2.2/examples/
├── lalarm.lua
├── lbash.lua
├── lecho.lua
├── lftp.lua
├── lgforce.lua
├── limagemagic.lua
├── lpostcmd.lua
├── lrsync.lua
├── lrsyncssh.lua
└── lsayirc.lua

0 directories, 10 files
/etc/lsyncd.conf can also be named /etc/lsyncd.lua, since the file is written in Lua (a scripting language).
Detailed explanation of Lsyncd configuration file
The configuration file is written in Lua, so comments use Lua's comment marker, two hyphens (--).
The configuration file has three main parts:
settings: settings of lsyncd itself, such as the log file path, the number of synchronization processes, whether to run in the background, and so on.
sync: settings related to synchronization, such as where to sync to, which files to ignore, how often to synchronize, and so on.
rsync: this section sits inside sync and mainly configures options of rsync itself.
Here are the two relevant pages of the official documentation:
Configuring settings: The Configuration File
Configuring sync and rsync: Config Layer 4: Default Config
The default content of /etc/lsyncd.conf is of little value and can be deleted altogether. The sections below explain how to write the configuration file.
default.rsync mode configuration file:
-- The configuration file uses Lua syntax, so comments start with "--", the Lua comment marker.

-- Configuration of lsyncd itself
settings {
    -- Log file location
    logfile = "/var/log/lsyncd/lsyncd.log",

    -- Status file location
    statusFile = "/var/log/lsyncd/lsyncd.status",

    -- Whether to run in the background. Note that the option is "nodaemon", a double
    -- negative: false means "not no-daemon", i.e. run in the background.
    -- Foreground mode (true) is generally used for debugging, together with
    -- verbose = true in the rsync section, so that sync details are printed to the console.
    nodaemon = false,

    -- Which inotify events trigger synchronization. CloseWrite means sync when a file
    -- is closed after writing (creating a file, or modifying and saving one, triggers
    -- CloseWrite). Possible values: "Modify", "CloseWrite" (the default),
    -- or "CloseWrite or Modify".
    inotifyMode = "CloseWrite",

    -- Maximum number of synchronization processes. In default.rsyncssh mode it must be 1
    -- (this is the disadvantage of that mode). In default.rsync mode it can be greater
    -- than 1, so several processes synchronize in parallel, which is faster.
    maxProcesses = 8,
    -- maxProcesses = 1,

    -- Used together with the delay option below (delay is in seconds). When the delay
    -- time is up, synchronization happens no matter what maxDelays is set to. Likewise,
    -- when the number of queued events reaches maxDelays, synchronization happens
    -- regardless of delay. In other words, either condition triggers a sync.
    -- For real-time synchronization we generally set it to 1, so that even a single
    -- file change triggers a sync.
    maxDelays = 1,
}

-- Synchronization configuration for default.rsync mode (where to sync from and to, which
-- files to ignore, how often to sync, and so on). There can be several sync blocks,
-- each of which sets up one target machine.
sync {
    -- There are three modes: default.rsync, default.direct and default.rsyncssh.
    -- default.rsync works for most cases.
    default.rsync,

    -- Source directory (a local directory)
    source = "/data/wwwroot",

    -- Destination address. Each mode writes it differently; since rsync mode is used in
    -- most cases, the rsync form is used here.
    target = "remote@192.168.1.6::wwwroot",

    -- Default is true: files on the target server that do not exist on the source server
    -- may be deleted. Possible values: true, false, startup, running.
    -- startup: only when the lsyncd service starts, find files on the target that do not
    --   exist on the source and delete them; files added to the target after startup are
    --   not deleted even though they do not exist on the source.
    -- running: the opposite of startup; nothing is deleted at startup, but deletions
    --   are applied afterwards.
    -- true = running + startup; false = neither.
    -- delete = true,

    -- Files that are not synchronized (patterns are allowed)
    exclude = {
        '.**',
        '.git/**',
        '*.bak',
        '*.tmp',
        'runtime/**',
        'cache/**'
    },

    -- Works together with maxDelays above: maxDelays is a cumulative number of events,
    -- delay is a time in seconds, and a sync runs as soon as either limit is reached.
    -- To keep synchronization real-time, maxDelays is generally set to 1 (every change
    -- event triggers a sync), while delay is comparatively large; the default is 15.
    -- If maxDelays were 100, there might be fewer than 100 changes after 15 seconds,
    -- but a sync would still run because the time limit was reached.
    delay = 15,

    -- With init = false, only files that change after the process starts are
    -- synchronized; existing differences between the directories are left alone.
    -- With true, differences between source and target are synchronized at startup.
    -- The default is true.
    -- init = false,

    -- rsync configuration (this is default.rsync mode; in default.rsyncssh mode this
    -- section looks different)
    rsync = {
        -- Absolute path of the rsync executable
        binary = "/usr/bin/rsync",

        -- Password file path (not needed in default.rsyncssh mode)
        password_file = "/etc/rsyncd.password",

        -- Archive mode (rsync -a): recurse and preserve permissions, times, symbolic
        -- links and so on. Note that archive mode is not compression.
        archive = true,

        -- Compress data during transfer
        compress = false,

        -- Print synchronization details (unnecessary when running in the background;
        -- set to true when running in the foreground for debugging)
        verbose = false,

        -- rsync has a great many options (see rsync --help). Less common ones can be
        -- passed through _extra, each in double quotes, separated by commas.
        -- --bwlimit is a bandwidth limit; --omit-link-times leaves the modification
        -- time of symbolic links alone.
        _extra = {"--bwlimit=200", "--omit-link-times"}
    }
}
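The interaction between delay and maxDelays described in the comments boils down to an "either limit reached" check. A minimal sketch, assuming a hypothetical helper should_sync that receives the number of queued events and the seconds elapsed since the first queued event:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) when a sync should be triggered.
# Arguments: queued events, seconds elapsed, maxDelays, delay.
should_sync() {
    queued=$1; elapsed=$2; max_delays=$3; delay=$4
    # Either condition is enough: event count reached, or delay time is up.
    [ "$queued" -ge "$max_delays" ] || [ "$elapsed" -ge "$delay" ]
}

# maxDelays = 1: a single change triggers a sync immediately.
should_sync 1 0 1 15 && echo "sync: event limit reached"

# maxDelays = 100: only 40 changes so far, but 15 seconds have passed.
should_sync 40 15 100 15 && echo "sync: delay elapsed"

# maxDelays = 100: 40 changes after 3 seconds, keep waiting.
should_sync 40 3 100 15 || echo "wait"
```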
A few of the options explained:
target = "remote@192.168.1.6::wwwroot": 192.168.1.6 is the IP of the rsync server, remote is the user name configured on the server, and wwwroot is the module name configured on the server.
password_file = "/etc/rsyncd.password": the content of rsyncd.password is a plain string (for example 123456; you do not write password=123456). It is the password configured on the rsync server (the account and password are set up on the server side). The file must be given permission 400 with chmod.
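Creating the password file can be sketched as follows. To keep the example harmless it writes to a temporary file (an assumption for the demo) rather than /etc/rsyncd.password, and 123456 is the sample password from the text.

```shell
#!/bin/sh
# Demo only: a temporary file stands in for /etc/rsyncd.password.
pwfile=$(mktemp)
trap 'rm -f "$pwfile"' EXIT

# The file contains just the password string, nothing else.
printf '%s\n' '123456' > "$pwfile"

# Owner read-only; rsync refuses password files that other users can read.
chmod 400 "$pwfile"

# Show the permission bits (first ten characters of the ls -l line).
ls -l "$pwfile" | cut -c1-10
```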
exclude can also be replaced by excludeFrom, so that the files to exclude from synchronization are listed in a separate external file:
excludeFrom = "/etc/lsyncd_exclude.lst"
The external exclusion file /etc/lsyncd_exclude.lst is written like this:
.svn
Runtime/*
Uploads/*
If any segment of an event's path matches one of these texts, the path is excluded. For example, /bin/foo/bar matches the rule foo.
If the rule starts with /, only the beginning of the path is matched.
If the rule ends with /, only the end of the path is matched.
? matches any single character that is not /.
* matches zero or more characters that are not /.
** matches any character, zero or more times.
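The wildcard rules can be tried out with a small approximation. This sketch translates a rule into an extended regular expression and tests a path with grep; pattern_to_regex and excluded are hypothetical helpers, and the translation is a simplified reading of the rules above (it ignores the trailing-slash rule), not lsyncd's own matcher.

```shell
#!/bin/sh
# Simplified approximation of the exclude rules, not lsyncd's implementation.
pattern_to_regex() {
    printf '%s\n' "$1" | sed \
        -e 's|\.|\\.|g' \
        -e 's|\*\*|<DS>|g' \
        -e 's|\*|[^/]*|g' \
        -e 's|?|[^/]|g' \
        -e 's|<DS>|.*|g' \
        -e 's|^/|^/|'
    # Steps: escape dots; park "**" as a placeholder; "*" and "?" stop at "/";
    # restore "**" as "any characters"; a leading "/" anchors at the path start.
}

# excluded RULE PATH: succeed if PATH would be excluded by RULE.
excluded() {
    printf '%s\n' "$2" | grep -Eq -- "$(pattern_to_regex "$1")"
}

excluded foo /bin/foo/bar && echo "foo matches /bin/foo/bar"
excluded '/bin' /usr/bin/foo || echo "/bin does not match /usr/bin/foo"
excluded 'a**c' /x/a/c && echo "a**c crosses directory separators"
```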
Rsync server-side configuration
At the beginning of the article we described what needs to be installed on machines A and B. The steps for machine A are covered above. The steps for machine B were written up earlier and are not repeated here; please read "The use of rsync".
Open port
The default port of rsync is 873. If you use the firewalld firewall on CentOS 7, you can open port 873 as follows:
firewall-cmd --zone=public --add-port=873/tcp --permanent
firewall-cmd --reload
If you are experimenting locally and find the firewall troublesome, you can also turn it off:
systemctl stop firewalld
Start the lsyncd service
Official documentation on how to launch: Invoking.
Once the lsyncd configuration file is written, you can start the service. Since we have a configuration file, the startup command is:
lsyncd -log Exec /etc/lsyncd.conf
-log Exec logs every process that lsyncd spawns (if maxProcesses is greater than 1, there will be several synchronization processes).
After startup, it only outputs:
21:46:54 Normal:-Startup, daemonizing
Check whether the startup was successful:
ps aux | grep lsyncd
If the process is running properly, you will see:
root 5238 7.7 0.6 13348 3340 ? Ss 21:46 0:15 lsyncd /etc/lsyncd.conf
If you look at the log file, you will see that many files have been synchronized:
tail -100f /var/log/lsyncd/lsyncd.log
In practice, on CentOS 7 systems we generally do not start lsyncd directly but use systemctl:
systemctl start lsyncd
View the status:
systemctl status lsyncd
Stop:
systemctl stop lsyncd
Restart:
systemctl restart lsyncd
Enable start at boot:
systemctl enable lsyncd
default.rsyncssh mode configuration file:
Compared with rsync mode, the main changes are as follows:
maxProcesses in settings must be 1, otherwise lsyncd cannot start and reports the following error:
Error: error preparing /etc/lsyncd.conf: /etc/lsyncd.conf:69: default.rsyncssh must have maxProcesses set to 1.
password_file in the rsync section is removed (or commented out), because with ssh the rsync password is no longer used for authentication.
sync gains a host option, written in ssh login form (the username@ip form).
target is replaced by targetdir, whose value is the absolute path on the target server, for example /data/wwwroot/ (the trailing slash is optional, but it is better to keep it because it makes clear at a glance that this is a directory).
The ssh user given in host needs permission on the directory given in targetdir. If lsyncd cannot start, try root, and set up passwordless login for that user (see "Linux: passwordless ssh login"); without passwordless login lsyncd will not start.
A note on passwordless login: if lsyncd on machine A is started as root (which is practically always the case) and host = zhangsan@12.34.56.78 (machine B), then you must make sure that the root user on machine A can ssh to machine B.
The ssh user in host should be the same as the user that owns the target folder. For example, website directories such as wwwroot are often owned by a user and group like www:www, so you must synchronize as that user. Otherwise the files created by synchronization will not have the right ownership, and the site may misbehave because of wrong permissions.
-- The configuration file uses Lua syntax, so comments start with "--".

settings {
    -- Log file location
    logfile = "/var/log/lsyncd/lsyncd.log",

    -- Status file location
    statusFile = "/var/log/lsyncd/lsyncd.status",

    -- inotify event mode: which events trigger synchronization. CloseWrite means sync
    -- when a file is closed (creating a file, or modifying one and closing it, for
    -- example with :wq in vim, triggers CloseWrite).
    inotifyMode = "CloseWrite",

    -- Maximum number of synchronization processes. In default.rsyncssh mode it must be 1,
    -- otherwise lsyncd cannot start; in default.rsync mode it can be greater than 1.
    maxProcesses = 1,

    -- Used together with the delay option below (delay is in seconds). When the delay
    -- time is up, synchronization happens no matter what maxDelays is set to. Likewise,
    -- when the number of queued events reaches maxDelays, synchronization happens
    -- regardless of delay. Either condition triggers a sync. For real-time
    -- synchronization we generally set it to 1, so that even a single file change
    -- triggers a sync.
    maxDelays = 1,

    -- Whether to run in the background. Note that the option is "nodaemon", a double
    -- negative: false means "not no-daemon", i.e. run in the background.
    -- Foreground mode is generally used for debugging, together with verbose = true in
    -- the rsync section, so that sync details are printed to the console.
    nodaemon = false,
}

-- Synchronization configuration for default.rsyncssh mode (where to sync from and to,
-- which files to ignore, how often to sync, and so on). There can be several sync
-- blocks, each of which sets up one target machine.
sync {
    -- There are three modes: default.rsync, default.direct and default.rsyncssh.
    -- default.rsyncssh is used here.
    default.rsyncssh,

    -- Source directory (a local directory)
    source = "/data/wwwroot/",

    -- Destination, written the rsyncssh way
    host = "192.168.1.6",
    targetdir = "/data/wwwroot/",

    -- Default is true: files on the target server that do not exist on the source server
    -- may be deleted. Possible values: true, false, startup, running.
    -- startup: only when the lsyncd service starts, find files on the target that do not
    --   exist on the source and delete them; files added to the target after startup are
    --   not deleted even though they do not exist on the source.
    -- running: the opposite of startup; nothing is deleted at startup, but deletions
    --   are applied afterwards.
    -- true = running + startup; false = neither.
    -- delete = true,

    -- Files that are not synchronized (patterns are allowed)
    exclude = {
        '.**',
        '.git/**',
        '*.bak',
        '*.tmp',
        'runtime/**',
        'cache/**'
    },

    -- Alternatively, keep the exclusion rules in an external file
    -- excludeFrom = "/etc/lsyncd_exclude.lst",

    -- Works together with maxDelays above: maxDelays is a cumulative number of events,
    -- delay is a time in seconds, and a sync runs as soon as either limit is reached.
    -- To keep synchronization real-time, maxDelays is generally set to 1, while delay
    -- is comparatively large; the default is 15. If maxDelays were 100, there might be
    -- fewer than 100 changes after 15 seconds, but a sync would still run because the
    -- time limit was reached.
    delay = 15,

    -- With init = false, only files that change after the process starts are
    -- synchronized; existing differences between the directories are left alone.
    -- With true, differences between source and target are synchronized at startup.
    -- We want true, which is also the default, so the option can be omitted; it is
    -- listed here only to explain it.
    -- init = false,

    -- rsync configuration (this is default.rsyncssh mode; in default.rsync mode this
    -- section looks different)
    rsync = {
        -- Absolute path of the rsync executable
        binary = "/usr/bin/rsync",

        -- Password file path (required in rsync mode, not used in rsyncssh mode)
        -- password_file = "/etc/rsyncd.password",

        -- Archive mode (rsync -a): recurse and preserve permissions, times, symbolic
        -- links and so on. Note that archive mode is not compression.
        archive = true,

        -- Compress data during transfer
        compress = true,

        -- Copy the files that symbolic links point to
        copy_links = true,

        -- Treat symbolic links to directories as the directories they point to
        copy_dirlinks = true,

        -- Print synchronization details (unnecessary when running in the background;
        -- set to true when running in the foreground for debugging)
        verbose = false,

        -- rsync has a great many options (see rsync --help). Less common ones can be
        -- passed through _extra, each in double quotes, separated by commas.
        -- --bwlimit is a bandwidth limit; --omit-link-times leaves the modification
        -- time of symbolic links alone.
        _extra = {"--bwlimit=200", "--omit-link-times"},

        -- ssh command and options
        rsh = "/usr/bin/ssh -l xiebruce -i /root/.ssh/id_rsa -o StrictHostKeyChecking=no"
    }
}
This line is what logs in to the server through ssh:
rsh = "/usr/bin/ssh -l xiebruce -i /root/.ssh/id_rsa -o StrictHostKeyChecking=no"
You usually log in to a server with something like ssh zhangsan@192.168.1.6, at most adding -p to specify the port, but ssh has many more options. For example, in ssh -l test root@192.168.1.6 the -l option (short for login) names the user to log in as, so although root is written in the address, the login ends up being as the test user. -i (short for identity) specifies a private key file, which is normally used to log in without a password. The reason is simple: synchronization cannot stop and ask you for a password every time, so passwordless login is required. -o (short for option) sets a configuration option; ssh has many of them, which are listed in man ssh, and the meaning of each one is explained in man ssh_config. StrictHostKeyChecking controls how strictly the host's key fingerprint is checked. When you log in to a server for the first time, you always see this prompt:
The authenticity of host '192.168.1.6 (192.168.1.6)' can't be established.
ECDSA key fingerprint is SHA256:xcDUp3zNlJvhY4fwfwDH1pgOyc5p8Vsr2OjopanEQBw.
Are you sure you want to continue connecting (yes/no)?
If you enter no, the login is aborted. If you enter yes, you are logged in and the key fingerprint is added to the known_hosts file in your ssh configuration directory. On Mac/Linux this file is ~/.ssh/known_hosts; on Windows it is under the C:\Users\<user name>\ directory.
Here it is rsync that logs in over ssh, so the same host key check applies. A background synchronization process cannot answer the prompt above, and with StrictHostKeyChecking=yes ssh refuses to connect to any host whose key is not already in known_hosts. So we set StrictHostKeyChecking=no, which makes ssh accept and record the key of a new host automatically instead of asking.
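The three options discussed above (-l, -i and -o) can be put together as follows. This is only a sketch that prints the value for rsh; build_rsh is a hypothetical helper, and the user name and key path are the sample values from the configuration.

```shell
#!/bin/sh
# Hypothetical helper: compose the ssh command string used as the rsh value.
build_rsh() {
    user=$1   # login user on the target machine (-l)
    key=$2    # private key used for passwordless login (-i)
    printf '/usr/bin/ssh -l %s -i %s -o StrictHostKeyChecking=no\n' "$user" "$key"
}

build_rsh xiebruce /root/.ssh/id_rsa
```

Before relying on it in lsyncd, it is worth running the printed ssh command by hand once against the target host to confirm that the login works without any prompt.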
default.direct mode (I have not tested it):
sync {
    default.direct,
    source = "/home/user/src/",
    target = "/home/user/trg/"
}
Synchronize to multiple computers at the same time
The format is as follows: each target server gets its own sync block, and each sync block is written as described above. In practice only the IP (and the directories) differ; everything else is the same:
settings {
    logfile = "/var/log/lsyncd/lsyncd.log",
    inotifyMode = "CloseWrite or Modify",
    -- statusFile = "/var/log/lsyncd/lsyncd.status",
}

-- B server configuration
sync {
    default.rsync,
    source = "/etc/nginx/",
    target = "rsync://rsync@192.168.1.6:1873/nginx/",
    exclude = {".*", "*.tmp", "*.swp", "*.bak", "*.log", "*.swx", "*~", "sets/config.json", "listen_local_*"},
    delay = 2,
    init = false,
    rsync = {
        password_file = "/etc/rsyncd.passwd",
        archive = true,
        compress = true,
        verbose = true,
        checksum = true,
        ignore_times = true
    }
}

-- C server configuration
sync {
    default.rsync,
    source = "/data/web/",
    target = "rsync://rsync@192.168.1.11:1873/web/",
    exclude = {".*", "*.tmp", "*.swp", "*.bak", "*.log", "*.out", "*/logs/*", "*.swx", "*~"},
    delay = 120,
    init = false,
    rsync = {
        password_file = "/etc/rsyncd.passwd",
        archive = true,
        compress = true,
        verbose = true,
        checksum = true,
        ignore_times = true
    }
}

-- D server configuration
sync {
    default.rsync,
    source = "/data/script/",
    target = "rsync://rsync@192.168.1.100:1873/script/",
    exclude = {".*", "*.tmp", "*.swp", "*.bak", "*.log", "*.out", "*/logs/*", "*.swx", "*~"},
    delay = 2,
    init = false,
    rsync = {
        password_file = "/etc/rsyncd.passwd",
        archive = true,
        compress = true,
        verbose = true,
        checksum = true,
        ignore_times = true
    }
}
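Because the sync blocks differ only in a few values, they can also be generated instead of copied by hand. The sketch below prints one sync block per target host; gen_sync is a hypothetical helper, and the user name remote and module wwwroot are the sample values from the rsync mode section, assumed to be the same on every target.

```shell
#!/bin/sh
# Hypothetical helper: print one sync block per target host.
# Assumes every target uses the same rsync user (remote) and module (wwwroot).
gen_sync() {
    src=$1
    shift
    for host in "$@"; do
        cat <<EOF
sync {
    default.rsync,
    source = "$src",
    target = "remote@$host::wwwroot",
    rsync = {
        password_file = "/etc/rsyncd.password",
        archive = true
    }
}

EOF
    done
}

gen_sync /data/wwwroot 192.168.1.6 192.168.1.11 192.168.1.100
```

The output could be appended to a configuration file after the settings block; review it before use, since per-target differences such as delay or exclude lists are not covered by this sketch.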
© 2024 shulou.com SLNews company. All rights reserved.