Shulou (Shulou.com), 06/03 report. Updated 2025-01-18, from SLTechnology News&Howtos.
1. The concept, necessity and basic ideas of log cutting
1.1 What is log cutting
Log cutting means that when the log file of an application or system reaches a configured trigger condition (for example by time period, such as every day, or by size, such as 500MB), it is cut (split), much like truncation: the content of the large log file is moved into another log file for retention and archiving, and logs generated from that moment on continue to be written to the original log file, whose write offset starts again from 0.
What changes: the size of the log file (it slims down) and the number of log files (one more history log per cut).
What stays the same: the log file name.
In addition, old log files need to be deleted after a retention period. The whole process is commonly known as log rotation.
1.2 Why do you need log cutting?
During the long-term operation of online applications (including operating systems), many runtime logs are generated. These are usually files written by the application that are useful to system administrators or developers, recording for example what is being executed and what errors have occurred.
As the log records accumulate, the log files become larger and larger, resulting in the following disadvantages over time:
Log files take up more and more hard disk space.
Once a log file reaches GB size, viewing its content is too time-consuming and tracking down errors becomes very inconvenient.
1.3 The basic idea of log cutting
Basic requirements for log cutting:
Application level
The normal operation of the application must not be affected during cutting (the application cannot be stopped in order to split the log)
Data level
No logs are lost, or only a very small, acceptable amount is lost
During cutting, the application keeps writing logs without interruption (the log file name remains the same)
Log capacity level
After cutting, the new log file starts from an empty file (size and write offset are reset), which makes later queries easier
Log archiving level
The old log produced by cutting is easy to archive and compress (file name plus a date suffix, etc.)
Old logs can be deleted in rotation according to the retention period
Management and maintenance level
Cutting runs automatically and periodically
Based on the above requirements, and with keeping the application running as the primary premise, two basic ideas for log cutting can be designed:
Idea 1, rename: move the old log file away and generate a new log file (with the same file name as before cutting)
mv the existing log to another file name, and generate a new log file with the original file name (the log file name before the mv)
The log file name remains the same, but you need to make sure the application switches to the file handle of the new file
The new log file is written from 0, so the log file has been slimmed down successfully
Idea 2, copy: copy the existing log file under another name, then empty the content of the existing large log file
cp the existing log file to another file name, then empty the existing log file as quickly as possible
The file handle of the existing log file does not change (more on this in section 3.1.1)
In this way neither the file name nor the handle of the log file changes, but the content has been cleared and writing continues from 0, so the file has slimmed down successfully
Finally, log cutting has to run periodically, so it is combined with scheduled tasks for automatic processing.
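Idea 2 can be tried out in a scratch directory. The sketch below is only an illustration under assumptions: the file name app.log and the use of file descriptor 4 are made up for the example, and the shell itself plays the role of the application that keeps its log open in append mode.

```shell
#!/bin/bash
# Hypothetical scratch directory and log name, for illustration only.
workdir=$(mktemp -d)
log="$workdir/app.log"

# The shell stands in for the application: keep the log open for appending on fd 4.
exec 4>>"$log"
echo "old line 1" >&4
echo "old line 2" >&4

# Idea 2: copy the log under another name, then empty the original in place.
cp "$log" "$log.1"
: > "$log"

# The "application" keeps writing through the same, unchanged descriptor.
echo "new line 1" >&4

old_count=$(grep -c '' "$log.1")   # number of lines kept in the copy
new_content=$(cat "$log")          # what the emptied log contains now
echo "lines in copy: $old_count; current log: $new_content"

exec 4>&-   # close the descriptor
```

Because the descriptor was opened in append mode, writes after the truncation land at the start of the emptied file. A writer that does not use append mode keeps its old offset, so the file would appear to grow back to its old size as a sparse file.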
2. Common log cutting schemes
2.1 Custom script for log cutting
The core principle of cutting logs with a custom script is to mv the existing log file and then have a new log file generated. Combined with kill -USR1 PID, the application reopens its log, so that its file handle points to the new log file and logs are written there.
Note:
The new log file has the same file name as before, but it is essentially a brand new file. For nginx, kill -USR1 PID only makes the process reopen its log files; it does not restart the application process, so the application keeps running.
Example script that cuts the nginx log:
#!/bin/bash
bakpath='/home/nginx/logs'
logpath='/var/log/nginx/logs'
# create the backup directory for the current year and month if it does not exist
if [ ! -d "$bakpath/$(date +%Y)/$(date +%m)" ]; then
    mkdir -p "$bakpath/$(date +%Y)/$(date +%m)"
fi
mv "$logpath/access.log" "$bakpath/$(date +%Y)/$(date +%m)/access-$(date +%Y%m%d%H%M).log"
# send the USR1 signal to nginx so that it reopens its log and generates a new log file
kill -USR1 `cat /usr/local/nginx/logs/nginx.pid`
2.2 Application level: cutting the log with log4j
Log4j is an Apache open source logging framework for Java applications. By loading the required jar packages and a configuration file, a Java application can standardize the format and level of its log output from within the application itself, and also cut its logs.
Cutting can be triggered along two dimensions: time period and log size.
Log4j generally requires the assistance of developers and is best implemented directly by them.
2.3 Cutting logs with third-party open source tools
2.3.1 logrotate
logrotate is the log processing tool that ships with Linux systems. It is very powerful and can be used for log cutting, compression, rolling deletion and other processing.
It is driven by the system's cron, so scheduling requires no manual configuration.
2.3.2 cronolog
cronolog is an open source log processing tool that automatically generates periodic log files according to rules. It must be installed and configured separately.
2.4 Comparison and selection of log cutting schemes
Scheme | Deployment workload | Affinity to the application | Functionality
Script | Large at the start; the underlying logic has to be implemented piece by piece | Average | Depends on the script itself
log4j | Small; load the jar package and configure it | Good | Fairly powerful, but does not support log compression
Third-party tools | Small; only configuration is needed | Fairly good | Fairly powerful
The third-party tools row refers to logrotate only (other tools can be compared using the same factors).
Selection:
Reuse what already exists and avoid reinventing the wheel (especially writing long scripts of your own)
First choice: log4j at the application level
If the application can use log4j, cut logs with log4j and combine it with a third-party tool for log compression and periodic deletion.
Second choice: the open source third-party tool logrotate
When log4j is not convenient, a third-party tool is recommended, logrotate in particular: it ships with the system and works out of the box.
3. Using logrotate
3.1 How logrotate works
3.1.1 Core principle: file handles
The operating system maintains a separate open file table (fdtable) for each process, and each new file opened by the process adds an entry to the table.
The file descriptor is an integer that represents the index location (subscript) in fdtable and points to the specific struct file (file handle / file pointer)
The file handle (file pointer) corresponds to the detailed information of the file, storing the status information, offset and inode information of the file.
The inode information stored in the file handle corresponds to a specific file being written by the application process
Each file descriptor corresponds to an open file (file handle)
The application process locates the file it wants to write to through the file handle (file pointer) in fdtable
To sum up, the file handle can uniquely define a specific file in the operating system.
Reasoning:
The file handle does not include the file path or file name, so changing the path or name does not change the file handle, and therefore does not change which file the application process writes to (the file the handle points to).
This conclusion can be verified by experiment.
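One possible version of such an experiment is sketched below. It is a minimal illustration, not the author's own test: the scratch directory, the file name app.log and file descriptor 3 are assumptions, and the shell stands in for the application process.

```shell
#!/bin/bash
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Open app.log for appending on file descriptor 3, as a long-running writer would.
exec 3>>app.log
echo "line 1" >&3

# Rename the file while the descriptor stays open.
mv app.log app.log.1
echo "line 2" >&3      # still reaches the renamed file: same inode, same handle

# A file created under the old name is a different inode; fd 3 does not see it.
: > app.log

renamed_lines=$(grep -c '' app.log.1)   # lines in the renamed file
new_size=$(wc -c < app.log)             # bytes in the newly created file
echo "app.log.1 lines: $renamed_lines, app.log bytes: $new_size"

exec 3>&-   # close the descriptor
```

Both lines end up in app.log.1 and the new app.log stays empty, which is why a process has to reopen its log after a rename before it writes to the new file.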
Log cutting process
Because of file handles, logrotate has two solutions for log cutting:
Use a new file handle after cutting: create mode
create is the default solution. Its core idea is to rename the original log file and create a new one. In create mode, after the mv and create steps, the application is notified to reopen the log and write to the new file.
So how do you tell the application to reopen the log file so that it writes to the new, empty one?
The crude way is to kill the process and start it again. But this affects the online business and is not desirable.
So some programs provide an interface for reopening the log. Nginx, for example, reopens its log files when a USR1 signal is sent to the Nginx process. There are other ways (such as IPC), as long as the program itself supports them.
Keep using the old file handle after cutting: copytruncate mode
However, some programs cannot work with create mode because they provide no interface for reopening the log at all, and restarting the application is bound to reduce availability, so the copytruncate scheme was introduced.
The idea is to copy the log that is currently being written under a new name, and then truncate the original log. As a result, the old log content is stored in the rotated file, and new logs are written to the emptied file.
Matters needing attention
Of the two schemes, if the default create scheme can be used, do not use copytruncate. Why?
Risk of data loss: log lines written between the copy and the truncate are lost.
Time cost: copying a large log file takes a long time, which also widens that window.
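The data loss window of copytruncate can be reproduced by hand. The sketch below is an illustration under assumptions (scratch directory, file name app.log and file descriptor 5 are made up); it performs the copy and the truncate as two separate steps and deliberately writes one line in between.

```shell
#!/bin/bash
# Hypothetical scratch directory and log name, for illustration only.
workdir=$(mktemp -d)
log="$workdir/app.log"

# The shell stands in for the application and keeps the log open on fd 5.
exec 5>>"$log"
echo "line 1" >&5

cp "$log" "$log.1"     # step 1 of copytruncate: copy
echo "line 2" >&5      # written after the copy but before the truncate
: > "$log"             # step 2 of copytruncate: truncate in place
echo "line 3" >&5

# grep -c returns non-zero when nothing matches, hence the "|| true".
in_copy=$(grep -c 'line 2' "$log.1" || true)
in_log=$(grep -c 'line 2' "$log" || true)
echo "line 2 found in copy: $in_copy, in current log: $in_log"

exec 5>&-   # close the descriptor
```

The line written between the two steps is in neither file. With create mode the rename is a single operation on the same inode, so this gap does not exist.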
The concrete implementation of create and copytruncate, and the related considerations, involves more key technical details. If the description above is not enough, you can follow the course https://edu.51cto.com/sd/3f309 for a free trial.
3.1.2 scheduled task execution
Logrotate is already integrated into the system's scheduled tasks. It is driven by cron and runs once a day by default.
Scheduled task:
cat /etc/cron.daily/logrotate
#!/bin/sh
/usr/sbin/logrotate -s /var/lib/logrotate/logrotate.status /etc/logrotate.conf
EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
    /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
exit 0
Main points:
/usr/sbin/logrotate: the executable
-s /var/lib/logrotate/logrotate.status: records the state after execution
/etc/logrotate.conf: the configuration file loaded at run time
Question: what time does cron.daily execute every day?
Logrotate is driven by cron, so this time is controlled by cron. You can check the configuration file /etc/anacrontab (on older versions the file is /etc/crontab).
cat /etc/anacrontab
# /etc/anacrontab: configuration file for anacron
# See anacron(8) and anacrontab(5) for details.
SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# the maximal random delay added to the base delay of the jobs
# (a random delay of at most 45 minutes)
RANDOM_DELAY=45
# the jobs will be started during the following hours only
# (start time window, 3:00 to 22:00)
START_HOURS_RANGE=3-22
#period in days   delay in minutes   job-identifier   command
1         5    cron.daily     nice run-parts /etc/cron.daily
7         25   cron.weekly    nice run-parts /etc/cron.weekly
@monthly  45   cron.monthly   nice run-parts /etc/cron.monthly
1 5 cron.daily
The first field is the recurrence period in days, and the second is the base delay in minutes.
Total delay = base delay + random delay = 5 to (5+45), that is, 5 to 50 minutes.
Start window: 3:00 to 22:00; barring accidents, jobs start from 3:00.
So cron.daily is executed in the window 3:00 + (5 to 50 minutes), that is, between 3:05 and 3:50.
The actual execution time of the task can also be confirmed from the logs and from the processed log files.
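The arithmetic behind that window can be written out as a small calculation. This is only a sketch; the variable names are made up, and the three values are the ones from the anacrontab shown above.

```shell
#!/bin/bash
# Values taken from the anacrontab shown above; variable names are illustrative.
start_hour=3          # first hour of START_HOURS_RANGE
base_delay=5          # delay in minutes for the cron.daily job
random_delay_max=45   # RANDOM_DELAY, upper bound of the random part

# Earliest start: window opens at 3:00, plus the base delay.
earliest_min=$(( start_hour * 60 + base_delay ))
# Latest start: earliest start plus the maximum random delay.
latest_min=$(( earliest_min + random_delay_max ))

earliest=$(printf '%02d:%02d' $(( earliest_min / 60 )) $(( earliest_min % 60 )))
latest=$(printf '%02d:%02d' $(( latest_min / 60 )) $(( latest_min % 60 )))
echo "cron.daily starts between $earliest and $latest"
```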
3.1.3 logrotate execution process
logrotate processes logs automatically as follows:
On Red Hat family systems, cron wakes up every day in the window 3:00 + (5 to 50 minutes) and triggers the logrotate scheduled task defined under cron.daily.
Logrotate loads the default configuration file /etc/logrotate.conf and locates the log files to be processed.
Based on the parameters in the configuration, the matched log files are cut, compressed, dumped, or periodically deleted.
When finished, the state file /var/lib/logrotate/logrotate.status is written.
In practice, we first configure the logrotate parameters and tasks, and then wait for cron to wake up the scheduled task and trigger execution.
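Instead of waiting for cron, a configuration can be checked by hand with logrotate's debug mode. The sketch below assumes a scratch directory with a made-up log file and configuration, so nothing under /etc or /var/log is touched, and it skips the check when logrotate is not installed.

```shell
#!/bin/bash
# Hypothetical scratch files, for illustration only.
workdir=$(mktemp -d)
echo "sample line" > "$workdir/app.log"

# A minimal configuration for the sample log.
cat > "$workdir/app.conf" <<EOF
$workdir/app.log {
    daily
    rotate 3
    missingok
    notifempty
}
EOF

if command -v logrotate >/dev/null 2>&1; then
    # -d: debug mode, only prints what would be done without changing any file
    # -s: use a private state file instead of the system one
    logrotate -d -s "$workdir/status" "$workdir/app.conf" || echo "logrotate reported a problem"
else
    echo "logrotate is not installed; skipping the dry run"
fi
```

Replacing -d with -f forces a real rotation even when logrotate considers it unnecessary, which is convenient for testing a new configuration once the dry run looks right.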
Note:
The principles above are the most important part. A thorough understanding of them clearly helps in reading configuration files, applying logrotate in production, and solving practical problems. This content looks simple but has many details and key points that even veterans may not have fully sorted out; you can follow the course https://edu.51cto.com/sd/3f309 for a free trial.
3.2 logrotate configuration file parsing
Now that the principle and process are clear, let's go through logrotate's configuration files. They are divided into the global default configuration /etc/logrotate.conf and the custom configurations under the /etc/logrotate.d directory.
Validity and priority of the configuration:
When logrotate loads its configuration, each custom configuration in the /etc/logrotate.d directory is merged with the global default configuration /etc/logrotate.conf. If the same or conflicting items exist, the custom configuration prevails:
Same item:
The same parameter key is set in both the global and the custom configuration, but with different values.
Conflicting items:
The global and custom configurations contain options that contradict each other.
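As an illustration of the priority rule, a hypothetical pair of files is sketched below. The path /var/log/myapp and the file name /etc/logrotate.d/myapp are assumptions made up for this example; only the directives themselves are real logrotate options.

```conf
# /etc/logrotate.conf (global defaults)
weekly
rotate 4

# /etc/logrotate.d/myapp (hypothetical custom configuration)
/var/log/myapp/*.log {
    # same key as the global "rotate 4" with a different value: 30 applies to these files
    rotate 30
    # conflicts with the global "weekly": daily applies to these files
    daily
}
```

Logs not covered by a custom block keep the global weekly period and four rotated files.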
3.2.1 Global default configuration file
Global configuration file:
cat /etc/logrotate.conf
# see "man logrotate" for details
# rotate log files weekly
# (rotate once a week)
weekly
# keep 4 weeks worth of backlogs
# (keep four rotated old log files as backups)
rotate 4
# create new (empty) log files after rotating old ones
# (after rotation a new empty file is created; create is the default mode)
create
# use date as a suffix of the rotated file
# (the rotated file gets a date suffix in the format YYYYmmdd, e.g. xxx.log-20200202)
dateext
# uncomment this if you want your log files compressed
# (by default rotated logs are not compressed)
#compress
# RPM packages drop log rotation information into this directory
# (like include in programming: all configuration files in this directory are loaded and take effect)
include /etc/logrotate.d
# no packages own wtmp and btmp -- we'll rotate them here
# (these two orphan logs are rotated here)
/var/log/wtmp {
    monthly
    create 0664 root utmp
    minsize 1M
    rotate 1
}
/var/log/btmp {
    missingok
    monthly
    create 0600 root utmp
    rotate 1
}
# system-specific logs may be also be configured here.
3.2.2 Custom configuration file
The path for custom configuration files is already declared in the default configuration file through include /etc/logrotate.d, so fine-grained configurations can be defined under this path.
Example 1: cut nginx logs in create mode
cat /etc/logrotate.d/nginx
# multiple paths can be given, separated by spaces or newlines; wildcards are supported
/var/log/nginx/*.log {
    # rotation period; weekly, monthly and yearly are also available
    daily
    # keep 30 rotated old logs; with a daily period, 30 days of old logs are kept and older ones are deleted
    rotate 30
    # rotate when the file exceeds 100M; units are k, M, G; takes priority over daily
    size 100M
    # compress the old log with gzip right after cutting; the opposite option is nocompress
    compress
    # add a date suffix to the rotated log file
    dateext
    # do not report an error if the log file is missing
    missingok
    # do not rotate when the log is empty; the default is ifempty
    notifempty
    # create the new empty log file with this mode; the mode, owner and group parameters can all be omitted
    create 640
    # run the script below only once, after all matched files have been processed
    sharedscripts
    postrotate
        # the script follows shell syntax and logic
        if [ -f /var/run/nginx.pid ]; then
            kill -USR1 `cat /var/run/nginx.pid`
        fi
    endscript
}
Other more commonly used configuration parameters:
nocopytruncate: after copying the log file, do not truncate the original; the log is not emptied and only a backup is made
nocreate: do not create a new log file; used when logs already cut by the application only need compression and rotating deletion
errors address: when an error occurs, the information is sent to the specified email address
olddir directory: rotated log files are placed in the specified directory, which must be on the same file system as the current log file
prerotate: commands to run before logrotate rotates the log, such as creating the directory for rotated logs
rotate count: the number of rotated files to keep before the oldest is deleted; 0 means no backups, 5 means keep 5 backups
maxage count: how long old logs are kept, in days; rotated logs older than this are removed
dateext: use the current date in the form -YYYYmmdd as the suffix of the rotated file; if not configured, the suffix is a number from 1 to n, where n is the value in rotate n
dateformat .%s: used together with dateext and placed on the line right after it; defines the suffix appended to the rotated file name. In older logrotate versions only the four specifiers %Y %m %d %s are supported
Other parameters can be studied in the man page and in the configuration files that ship with the system under /etc/logrotate.d/.
Note:
Logrotate itself is triggered by the system's cron scheduled task at some time between 3:00 a.m. and 4:00 a.m. every day. Whether it then actually cuts the log files we specified depends on the conditions we configured for logrotate. For more details, you can follow the course https://edu.51cto.com/sd/3f309 for a free trial.
3.3 Hands-on: automated log cutting and compression with logrotate
Application examples:
Cut the nginx log in create mode
cat /etc/logrotate.d/nginx
/var/log/nginx/*.log {
    daily
    missingok
    rotate 60
    compress
    notifempty
    dateext
    sharedscripts
    postrotate
        if [ -f /var/run/nginx.pid ]; then
            kill -USR1 `cat /var/run/nginx.pid`
        fi
    endscript
}
Cut tomcat catalina.out logs in copytruncate mode
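A possible configuration for this case is sketched below. It is not the author's original file: the file name /etc/logrotate.d/tomcat, the log path /usr/local/tomcat/logs/catalina.out and the retention of 30 files are assumptions to be adapted to the actual installation.

```conf
# /etc/logrotate.d/tomcat (hypothetical file name and log path)
/usr/local/tomcat/logs/catalina.out {
    daily
    rotate 30
    # copy the log, then truncate the original in place; Tomcat keeps its file handle
    copytruncate
    compress
    missingok
    notifempty
    dateext
}
```

copytruncate is used here because catalina.out is held open by the running process; the small data loss window described in section 3.1.1 still applies.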
Compress the localhost logs that have been cut by tomcat