Updated 2025-01-15. From: SLTechnology News&Howtos (shulou)
How can a monitoring Agent with an integrated Lua engine achieve multi-dimensional log collection? This article analyzes the problem and describes a solution in detail, in the hope of offering a simpler, workable approach to anyone facing the same problem.
In a monitoring system, log processing means collecting the raw logs that a service produces at runtime, extracting usable data from them, and turning that data into monitoring metrics according to parsing rules configured by users. This work is usually done by the monitoring system's log collection Agent.
A general-purpose log collection Agent usually offers several log parsing methods, such as delimiters, K:V (key-value) pairs, and regular expressions. To support commonly used systems and components (such as Nginx and Syslog), some log collection Agents also ship prebuilt parsing configurations that work out of the box.
Baidu's business scenarios are very complex, covering search services, community services, financial services, AI services, and more. The log formats produced by these programs differ greatly, so handling all of them becomes an important problem. This article discusses how the Baidu Noah monitoring platform approaches it.
1. K:V log
As shown in the figure above, this is a typical log in K:V (key-value) form.
We can split the log on a simple separator and then extract uri, c_time, idc, and other monitoring items from it as key-value pairs.
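The extraction described above can be sketched as follows. This is a minimal illustration in Python, not the Agent's own parser. The sample line, the space field separator, and the `=` key-value separator are assumptions; only the field names uri, c_time, and idc come from the text.

```python
def parse_kv_line(line, field_sep=" ", kv_sep="="):
    """Split a log line into fields, then split each field into key and value."""
    items = {}
    for field in line.strip().split(field_sep):
        # Skip tokens that are not key-value pairs (e.g. a bare log level).
        if kv_sep not in field:
            continue
        # Split on the first separator only, so values may contain it.
        key, value = field.split(kv_sep, 1)
        items[key] = value
    return items


# Hypothetical sample line; the real format depends on the service.
sample = "NOTICE uri=/search c_time=35 idc=bj status=200"
fields = parse_kv_line(sample)

# Keep only the fields configured as monitoring items.
monitoring_items = {k: fields[k] for k in ("uri", "c_time", "idc") if k in fields}
print(monitoring_items)
```

Values stay as strings here; converting c_time to a number would be a separate, format-specific step.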
2. Multi-line log
This is the stack information of a C++ program. The multi-line log must be extracted as a single Trace, and the function name, file name, and line number on each line must be extracted separately and pushed together, so that faults can be located across a batch of instances.
This example requires two capabilities: multi-line log processing, and string extraction within a single log line.
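Both capabilities can be sketched together in a few lines of Python. The frame layout below (gdb-style `#N ... func (args) at file:line`) is an assumption made for illustration, since the original stack sample is not reproduced here; a real script would adapt the pattern to the actual log format.

```python
import re

# Assumed frame layout: "#<index> [0x<addr> in ]<func> (<args>) at <file>:<line>"
FRAME_RE = re.compile(
    r"^#(?P<index>\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?"
    r"(?P<func>[\w:~<>]+)\s*\(.*\)\s+at\s+"
    r"(?P<file>[^\s:]+):(?P<line>\d+)$"
)


def group_traces(lines):
    """Group consecutive stack frames into traces; frame #0 starts a new trace."""
    traces = []
    for raw in lines:
        m = FRAME_RE.match(raw.strip())
        if m is None:
            continue  # not a stack frame line
        frame = {"func": m["func"], "file": m["file"], "line": int(m["line"])}
        if m["index"] == "0" or not traces:
            traces.append([])
        traces[-1].append(frame)
    return traces


# Hypothetical input lines.
sample_lines = [
    "#0  0x00007f3a in ns::Worker::run () at src/worker.cpp:42",
    "#1  0x00007f3b in main () at src/main.cpp:10",
    "some unrelated line",
    "#0  parse (buf=0x0) at src/parser.cpp:7",
]
traces = group_traces(sample_lines)
for trace in traces:
    print(trace)
```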
3. Mixed log
In this example, each log line mixes information such as the service name, code location, and user-defined data. These parts must be extracted by delimiter, K:V, and JSON parsing, respectively.
For these scenarios, open source tools such as Logstash and Collectd provide this functionality through configuration-file semantics or plug-ins. Drawing on these open source implementations and on Baidu's business scenarios, we meet these log processing requirements with a log plug-in feature on the monitoring collection Agent.
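A mixed line of this kind could be handled by chaining the three parsing methods, as in the Python sketch below. The line layout (four `|`-separated parts: service name, `file:line` location, space-separated `key=value` pairs, and a JSON payload) is hypothetical and chosen only to show the combination.

```python
import json


def parse_mixed_line(line):
    """Parse 'service|file:line|k=v k=v|{json}' into one structured record."""
    # Delimiter parsing: split at most 3 times so '|' inside the JSON survives.
    service, location, kv_part, json_part = line.rstrip("\n").split("|", 3)

    # Code location: the text after the last ':' is the line number.
    file_name, _, line_no = location.rpartition(":")

    # K:V parsing of the user-defined fields (raises ValueError on a bare token).
    kv = dict(pair.split("=", 1) for pair in kv_part.split())

    # JSON parsing of the trailing payload.
    data = json.loads(json_part)

    return {
        "service": service,
        "file": file_name,
        "line": int(line_no),
        "kv": kv,
        "data": data,
    }


# Hypothetical sample line.
sample = 'search-frontend|handler.cpp:128|uid=42 cost=17|{"query": "a|b", "hits": 3}'
record = parse_mixed_line(sample)
print(record)
```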
When implementing a plug-in, the following aspects need attention:
1. Versatility and ease of use: the plug-in should meet users' customization needs as far as possible, and developing a script should be simple.
2. Performance: in a typical log collection scenario, the processing engine has to handle log files that grow by several MB, or even dozens of MB, per second, performing operations such as field splitting, regular expression matching, and data format conversion.
3. Availability and security: the Agent runs on online production servers, which place high demands on stability and security.
Implementation of the Agent log plug-in
Implementing custom log parsing logic is straightforward. We encapsulate a Log parsing class that exposes interfaces for fetching a single log line and for returning the parsed monitoring items; user-defined log parsing scripts call these interfaces. In the script, the user implements a Callback function, which the Agent calls for each log line it parses.
All of the log processing logic lives in the script. For example, the user can maintain a global Context in the script and handle multi-line logs by tracking progress information saved in that Context.
We also provide a general-purpose log processing tool library, exposed as built-in Lua classes, with tools such as JSON and Debug.
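The callback-plus-Context pattern can be sketched as follows. The Agent's real scripts are written in Lua and its actual class and method names are not given in this article, so everything here (`LogParser`, `get_line`, `emit`, `callback`, `run_agent`, and the rule that a line starting with `#` is a stack frame) is a hypothetical Python stand-in that only shows the control flow.

```python
class LogParser:
    """Hypothetical stand-in for the Agent's log parsing class."""

    def __init__(self, line):
        self._line = line
        self.results = []

    def get_line(self):
        """Return the single log line currently being processed."""
        return self._line

    def emit(self, item):
        """Hand one parsed monitoring item back to the Agent."""
        self.results.append(item)


# Script-level state that survives between callback invocations.
context = {"pending": []}


def callback(parser):
    """Called once per log line; buffers frame lines until the trace ends."""
    line = parser.get_line()
    if line.startswith("#"):
        # Still inside a multi-line record: remember the line and wait.
        context["pending"].append(line)
        return
    if context["pending"]:
        # A non-frame line closes the buffered record.
        parser.emit({"trace_depth": len(context["pending"])})
        context["pending"] = []


def run_agent(lines):
    """Hypothetical Agent loop: one parser object and one callback per line."""
    emitted = []
    for line in lines:
        parser = LogParser(line)
        callback(parser)
        emitted.extend(parser.results)
    return emitted


print(run_agent(["#0 a", "#1 b", "END", "plain", "#0 c", "END"]))
```

In this sketch a record is only emitted when a later line closes it, so a trace at the very end of the input stays buffered until more lines arrive.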
Availability and security
The Agent runs on all servers, so availability and security are the most important considerations.
For availability, the main goal is to keep bugs in custom scripts or in the plug-in engine from breaking log collection. The plug-in must also avoid consuming excessive resources, which would affect other services on the server.
Resource usage by user code needs to be strictly limited. Each plug-in task runs as a separate process whose resource consumption is capped with mechanisms such as Cgroup and Ulimit. The separate process also isolates execution, so that a bug in a single script or in the plug-in engine cannot disrupt the other collection tasks.
In addition, the Agent controls the execution time of each task to keep tasks from overrunning.
For security, custom log parsing scripts are hosted by the configuration center to prevent tampering.
Some functions provided by Lua itself are also blocked, including high-risk interfaces such as io.open, io.popen, os.execute, and os.remove, so that scripts cannot call external programs or delete system files.
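The process isolation and time limit described above can be approximated with standard POSIX facilities. The Python sketch below is an assumption-laden illustration, not the Agent's implementation: it applies Ulimit-style limits with `resource.setrlimit` in the child process and enforces a wall-clock timeout from the parent; Cgroup configuration is not shown. The function names and default limits are hypothetical, and the code is Unix-only.

```python
import resource
import subprocess
import sys


def _cap(kind, wanted):
    """Set the soft limit for `kind` to `wanted`, never above existing limits."""
    soft, hard = resource.getrlimit(kind)
    for current in (soft, hard):
        if current != resource.RLIM_INFINITY:
            wanted = min(wanted, current)
    resource.setrlimit(kind, (wanted, hard))


def run_plugin(argv, cpu_seconds=5, memory_bytes=1 << 30, wall_timeout=10.0):
    """Run one plug-in task in its own process with resource and time limits."""

    def apply_limits():
        # Runs in the child just before exec, so only the plug-in is limited.
        _cap(resource.RLIMIT_CPU, cpu_seconds)   # CPU time in seconds
        _cap(resource.RLIMIT_AS, memory_bytes)   # address space in bytes

    try:
        done = subprocess.run(
            argv,
            preexec_fn=apply_limits,
            capture_output=True,
            timeout=wall_timeout,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return ("timeout", None)
    return ("exited", done.returncode)


# A trivial child process stands in for a plug-in task.
print(run_plugin([sys.executable, "-c", "pass"]))
```

Because the task runs in a child process, a crash or a runaway loop in it ends with that process, and the parent can carry on with its other tasks.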
Enhancement mode
After the plug-in had been running online for some time, log processing performance turned out to be insufficient in some scenarios.
For general log collection scenarios, replacing Lua with LuaJIT increased log parsing throughput by about 4 times, which covers almost all of our general log collection scenarios. During the replacement, compatibility issues need attention, for example regular expression semantics that differ from standard Lua and the limit on the maximum number of lua_ctx.
Scenarios with special business requirements need targeted optimization. For example, collecting some business logs involves converting UNIX timestamps to an RFC format and mapping IP addresses to machine room information. Doing these conversions or table lookups in the Lua script is very inefficient. For such cases we wrap classes written in C++ and other languages so that they can be called directly from Lua, which improves the performance of these operations by more than an order of magnitude. The same integration approach also supports customized features, such as collecting Protobuf data and BaiduRPC variables.
There is still room to improve performance. The log processing engine currently runs in a single process with a single thread; extending it to multiple threads and processing logs concurrently could further raise throughput.
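To make the two conversions concrete, the Python sketch below shows what such helper functions compute. It says nothing about the performance of the native C++ helpers. The article does not say which RFC format is meant, so RFC 3339 is assumed here, and the subnet-to-machine-room table is invented for illustration.

```python
import ipaddress
from datetime import datetime, timezone

# Hypothetical subnet-to-machine-room table; real data would come from an inventory.
IDC_SUBNETS = [
    (ipaddress.ip_network("10.1.0.0/16"), "idc-a"),
    (ipaddress.ip_network("10.2.0.0/16"), "idc-b"),
]


def unix_to_rfc3339(ts):
    """Convert a UNIX timestamp (seconds) to an RFC 3339 string in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()


def ip_to_idc(ip, default="unknown"):
    """Return the machine room whose subnet contains `ip`, else `default`."""
    addr = ipaddress.ip_address(ip)
    for network, idc in IDC_SUBNETS:
        if addr in network:
            return idc
    return default


print(unix_to_rfc3339(0))     # 1970-01-01T00:00:00+00:00
print(ip_to_idc("10.2.3.4"))  # idc-b
```

The linear scan over subnets is fine for a short table; a large table would call for a prefix tree or similar index.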
That concludes this look at how a monitoring Agent with an integrated Lua engine achieves multi-dimensional log collection. I hope it is of some help.