2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains in detail how to locate a problem within 60 seconds, following the day-to-day debugging routine of a "10x programmer". It is shared here as a reference, and we hope it gives you a working understanding of the topic.
TDengine is a cluster system, and any operation involves logical nodes such as APP, taosc, mnode and vnode, which communicate with each other through sockets. A test may also involve multiple TDengine instances, which makes the analysis more complicated. For a given operation, the key to analyzing a problem is to correlate the logs of the different logical nodes and to filter them efficiently.
Turn on the relevant log switch
Each independent module of TDengine has its own debugFlag, including taosc, dnode, vnode, mnode, tsdb, wal, sync, query, rpc, timer and so on. Currently, the log output of each module can be controlled at the following levels:
Fatal/Error: errors, shown as ERROR in the log
Warning: warnings, shown as WARN in the log
Info: important information
Debug: general information
Trace: very detailed and frequently repeated debugging information
Dump: raw data
The log output can be directed to:
File
Screen
All of the above settings are packed into the single byte debugFlag. The bit layout of this byte is as follows:
[Figure: bit layout of the debugFlag byte]
Therefore, to output error, warning, info and debug logs to a log file, set debugFlag to 135. To also output trace-level logs, set debugFlag to 143. To output only error and warning logs, set debugFlag to 131. Under normal circumstances, a debugFlag of 135 is recommended.
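The three values above can be checked by breaking them into individual bits. The following is a minimal sketch in Python. The bit assignments in ASSUMED_BITS are an assumption chosen to be consistent with 131, 135 and 143 as quoted above; they are not taken from the missing diagram, and the function names are hypothetical.

```python
# Assumed bit layout of debugFlag. It is consistent with the values
# 131, 135 and 143 quoted in the text, but it is an assumption and
# should be checked against the TDengine version in use.
ASSUMED_BITS = [
    (1, "error"),
    (2, "warning/info"),
    (4, "debug"),
    (8, "trace"),
    (16, "dump"),
    (64, "screen"),
    (128, "file"),
]


def decode_debug_flag(flag):
    """Return the names of the assumed bits that are set in flag."""
    return [name for bit, name in ASSUMED_BITS if flag & bit]


def encode_debug_flag(names):
    """Combine assumed bit names into a single debugFlag value."""
    lookup = {name: bit for bit, name in ASSUMED_BITS}
    return sum(lookup[name] for name in set(names))


if __name__ == "__main__":
    for value in (131, 135, 143):
        # Print the decimal value, its binary form, and the decoded names.
        print(value, format(value, "08b"), decode_debug_flag(value))
```

Under this assumed layout, all three values have the "file" bit (128) set, and they differ only in the low bits that select the levels.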
The debug flag of each module is set in the configuration file taos.cfg. The configuration parameter of each module and the module name shown in the log are as follows:
[Table: per-module debug flag parameters and log module names]
If there are too many parameters to configure, the easiest way is to set the global parameter debugFlag. Once it is set, the debug flag of every module except the tmr log takes the same value as debugFlag. The default value of debugFlag is 0; when debugFlag is non-zero, it overrides all per-module log configuration parameters.
Unless there is a special reason, setting the debugFlag of util and timer to 135 is not recommended; 131 is appropriate.
Log file
TDengine generates client-side and server-side logs and stores them in the log directory. The default log directory is /var/log/taos, and it can be changed with the configuration parameter logDir in taos.cfg.
The client log file is named taoslogY.X (because several clients can run on one machine, several client log files can exist on it)
The server-side log file is named taosdlog.X
The size of a log file is limited. Once it reaches a certain number of lines (set by the parameter numOfLogLines in taos.cfg), a new log file is started. TDengine keeps only two log files, whose names end in 0 and 1 and which are used alternately.
Log format:
Each log line is divided, from left to right, into four parts:
Timestamp, accurate to the microsecond
Thread ID. Because TDengine is multithreaded, this field is very important: only log lines written by the same thread are guaranteed to be in time order and to follow the designed flow.
Module name, three letters
The log message output by the module
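The four-part layout described above can be split with a regular expression. This is a minimal sketch, not TDengine code: the exact timestamp and thread ID formats are assumptions (the article's screenshots are not available), and SAMPLE is a made-up line used only for illustration.

```python
import re
from typing import NamedTuple, Optional


class LogLine(NamedTuple):
    timestamp: str
    thread_id: str
    module: str
    message: str


# Assumed layout: "MM/DD HH:MM:SS.ffffff <thread id> <3-letter module> <message>".
# Adjust the pattern if the real log lines differ.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread_id>\S+)\s+"
    r"(?P<module>[A-Za-z]{3})\s+"
    r"(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[LogLine]:
    """Split one log line into its four parts, or return None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    return LogLine(**match.groupdict())


def same_thread(lines, thread_id):
    """Keep only the parsed lines written by one thread, in file order."""
    parsed = (parse_line(line) for line in lines)
    return [p for p in parsed if p is not None and p.thread_id == thread_id]


# Hypothetical sample line, for illustration only.
SAMPLE = "06/03 10:15:42.123456 0x7f3a1c0 TSC 0x5572c4fab3a0 SQL: create table t1 (ts timestamp, v int)"
```

Filtering by thread ID with same_thread follows the point made above: ordering is only guaranteed within a single thread.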
Steps for analyzing the log
When a tester or a customer reports a bug, whether it was found manually or automatically, it was triggered by a specific operation, which is usually the execution of a SQL statement. The problem may be caused by the client code or by the server code. The analysis below uses create table as an example; the timestamps have been removed from the screenshots to keep the focus on the relevant content.
Look at the client first
The first thing to examine is the client log. An example screenshot is as follows:
First find the failing SQL statement: search the client log for "SQL:" and you will see it (the second line of the screenshot). Then search the log for "SQL result:" (line 11 of the screenshot). On success, "SQL result:success" is displayed; otherwise "SQL result: xxxx" is displayed, where xxxx is the error message. To find the failed SQL quickly, you need to know the approximate time range and what the SQL statement is.
A very important parameter in the taosc log is pSql, the address allocated for the internal SQL object. taosc prints this address at the start of every taosc log message, immediately after TSC. This value is critical: it is the key that links the client and server logs. It is marked with a green background in the screenshot above.
The parameter pSql is passed to the RPC module as ahandle, and RPC prints it in all of its messages (green background). Because many modules call the RPC module, RPC also prints which module called it. In the screenshot, for example, the caller is TSC, so "RPC TSC" is printed.
RPC sends the create-table message to the server and logs it (line 8 of the screenshot), including the End Point of the dnode the message was sent to. The screenshot shows that the message was sent to hostname 9be7010a917e, port 6030. If there is a problem, the next step is to check the server log on the machine where that End Point is located.
You can also see that the RPC module received a response from the server. To avoid the cost of translating the address, the log shows only the hexadecimal IP address (line 9 of the screenshot, 0x20012ac) and the port number. The RPC module's log is critical because it ties the logical nodes together.
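The hexadecimal address in the RPC log can be turned back into dotted form by hand. The sketch below uses only the Python standard library. It assumes the logged value is a 32-bit IPv4 address stored in network byte order and printed as an integer on a little-endian host; the function name is hypothetical.

```python
import socket
import struct


def decode_hex_ip(value: int) -> str:
    """Convert a logged 32-bit address to dotted-quad form.

    Assumption: the value is an IPv4 address held in network byte order
    and printed as an unsigned integer on a little-endian machine, so the
    lowest byte of the integer is the first octet.
    """
    return socket.inet_ntoa(struct.pack("<I", value))


if __name__ == "__main__":
    print(decode_hex_ip(0x20012AC))  # prints 172.18.0.2 under the assumption above
```

If the result looks implausible for your network, the byte order assumption is the first thing to revisit (use ">I" instead of "<I").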
Then look at the server
After the client log has been analyzed, the server log is the next critical piece. Again taking create-table as the example, see the screenshot:
From the client log we found that the value of pSql is 0x5572c4fab3a0, so search for 0x5572c4fab3a0 directly in taosdlog; the matching lines are the ones with the green background in the screenshot. pSql is therefore the key parameter for linking client and server logs.
The create-table operation is processed by mnode. In the screenshot, because this is the first table being created, a vnode has to be created first, followed by the table itself and other operations. Many modules are involved, so they are not explained in detail here.
Finally, mnode creates the table and sends the response back through the RPC module (line 52 of the screenshot, the last line), which shows that the server is working properly.
Note: after receiving a message, the dnode module dispatches it to the message queue of mnode or vnode according to the message type. A worker thread then takes the message from the queue and processes it. For vnode, the sub-modules tsdb, wal, sync and cq all run in a single thread, so after finding the pSql (the second line of the screenshot), follow the log downward by thread ID to see the whole process, which makes the analysis easy.
Several key points
Find the failed SQL statement first
Find the value of pSql and copy it; assume it is xxxxx
Run grep "xxxxx" taoslogx.x to find the client log lines associated with this SQL statement and see whether they reveal the problem
Open the taosdlog server log, search for the pSql value xxxxx, and check the timestamps to confirm that it is the failed operation
Then analyze the server log
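The grep steps above can be combined into one pass over both logs. The following is a minimal sketch: it collects every line containing a given pSql value from the client and server log files and orders them by the leading timestamp. It assumes each line starts with a "MM/DD HH:MM:SS.ffffff" timestamp, as in the log format described earlier; the function name and file paths are hypothetical.

```python
from pathlib import Path


def lines_for_psql(psql, log_paths):
    """Return (timestamp, file name, line) tuples for lines mentioning psql.

    Assumption: every log line starts with a date field and a time field,
    so the first two whitespace-separated fields sort chronologically
    (within one year) when compared as strings.
    """
    hits = []
    for path in log_paths:
        with open(path, errors="replace") as handle:
            for line in handle:
                if psql in line:
                    stamp = " ".join(line.split()[:2])
                    hits.append((stamp, Path(path).name, line.rstrip("\n")))
    # sorted() is stable, so lines with equal timestamps keep their file order.
    return sorted(hits, key=lambda hit: hit[0])
```

A call such as lines_for_psql("0x5572c4fab3a0", ["taoslog0.0", "taosdlog.0"]) then gives a single client-plus-server timeline for that one SQL statement.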
Messages from the RPC module are critical. For each RPC message, "parse code: xx" is printed; this is the result of protocol parsing, where 0 means no problem and any other value means the protocol parsing failed. The message itself also carries "code: 0xXX", which is the error code set by the sender and is usually sent by the server to the client. A code of 0 means success; any other value is an error.
Another log matching method
When the client sends a message through the RPC module, the log contains a string like this:
Sig:0x01000000:0x01000000:55893
These are RPC's source ID, dest ID and transaction ID, which together uniquely identify a link from a client. Each time a new message is sent, the transaction ID is incremented by one, so it is unique for a period of time (the transaction ID is a two-byte value and wraps around).
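The sig string can be split into its three fields for matching. This is a small sketch based only on the sample above: it assumes the two IDs are hexadecimal and the transaction ID is decimal, as they appear in that sample, and that the two-byte counter wraps from 65535 back to 0. The function names are hypothetical.

```python
def parse_sig(text):
    """Split 'Sig:<source>:<dest>:<tran>' into three integers.

    Assumptions: source and dest IDs are printed in hexadecimal and the
    transaction ID in decimal, as in the sample shown in the text.
    """
    prefix, source, dest, tran = text.strip().split(":")
    if prefix.lower() != "sig":
        raise ValueError("not a sig string: %r" % text)
    return int(source, 16), int(dest, 16), int(tran)


def next_tran_id(tran_id):
    """Increment a two-byte transaction ID, wrapping around at 65536.

    Assumption: the counter wraps to 0; the real implementation may
    skip or reserve certain values.
    """
    return (tran_id + 1) & 0xFFFF


if __name__ == "__main__":
    print(parse_sig("Sig:0x01000000:0x01000000:55893"))
```

Because the transaction ID wraps, a match on the sig string is only reliable within a limited time window, which is why the timestamps still need to be checked.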
Version 1.6 can only rely on the sig string to match client and server logs, which requires reading a lot of context and is therefore tedious and inefficient.
Version 2.0 passes pSql to the server side, which makes matching client and server logs much faster.
Getting familiar with the logs
First, you need to understand the design of TDengine and the flow of each major operation.
Then turn on all log switches (set debugFlag to 135), execute every SQL statement once, and check the corresponding client and server logs for each one.
View the SQL statements executed by the client
The client generates a large volume of logs. Looking at the SQL statements that were executed makes it easier to analyze and reproduce a problem. There are several ways to find out which SQL statements the system has executed:
If the client log is enabled, run grep "SQL:" taoslog* to see all executed SQL statements in the log.
If you executed the SQL statements manually with taos, look in the home directory for the hidden file .taos_history, which contains all the commands previously executed in taos.
Configure the client: in the configuration file, set the tscEnableRecordSql parameter to 1, so that the SQL statements entered on the client are written to a separate file (tscnote-xxxx.0, where xxxx is the pid) in the same directory as the client log.
For the RESTful interface, set the httpEnableRecordSql parameter to 1 in the taosd configuration file, so that HTTP requests are written to a separate file (httpnote.0) in the same directory as the server log.
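The first method, grep "SQL:" taoslog*, can also be scripted when the statements need further processing. Here is a minimal Python sketch that returns the text after "SQL:" from every client log file in a directory. It assumes the statement follows the "SQL:" marker on the same line; the function name is hypothetical.

```python
import glob
import os


def executed_sql(log_dir):
    """Return the SQL text found after 'SQL:' in every taoslog* file.

    Lines such as 'SQL result:success' are not matched, because they do
    not contain the exact marker 'SQL:'.
    """
    marker = "SQL:"
    statements = []
    for path in sorted(glob.glob(os.path.join(log_dir, "taoslog*"))):
        with open(path, errors="replace") as handle:
            for line in handle:
                position = line.find(marker)
                if position != -1:
                    statements.append(line[position + len(marker):].strip())
    return statements
```

For example, executed_sql("/var/log/taos") would list the statements recorded in the default log directory, in file order.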
Dynamically modify the log
Sometimes the server or client cannot be restarted, but the log level is not set correctly and has to be changed dynamically. Perform the following steps:
show dnodes; // find the ID of each dnode
alter dnode id debugFlag 143; // set the corresponding debugFlag
The id in the second step is the one obtained in the first step.
Sometimes it is useful to send subsequent log output to a new file to make viewing and searching easier. To do so, execute:
alter dnode id resetlog
Sometimes the shell cannot connect at all. In that case, send the SIGUSR1 signal to the taosd process on the machine where it is running, for example:
kill -SIGUSR1 pidxxx
That concludes this look at locating problems within 60 seconds and the daily debugging of a 10x programmer. I hope it is of some help to you.