
How to Print Logs Correctly in a Java Project

2025-02-24 Update From: SLTechnology News&Howtos


This article explains how to print logs correctly in a Java project, with a focus on writing error logs that make troubleshooting easier. The approach is simple, fast, and practical.

Contents

How mistakes are made

How to write an error log that is easier to troubleshoot

The main goal of writing error logs in a program is to provide important clues and guidance for troubleshooting and solving problems. In practice, however, the content and format of error logs vary widely. The message may be incomplete, lack the relevant background, or be impossible to interpret, which makes troubleshooting inconvenient and time-consuming.

In fact, a little care while programming saves a great deal of wasted effort during troubleshooting. Before explaining how to write an effective error log, it is important to understand how errors arise.

How mistakes are made

In any given system, errors are introduced in three places:

1. Illegal parameters passed in by the upper-level (calling) system. These errors can be intercepted through parameter checks and precondition checks.

2. Errors arising from interaction with lower-level systems. There are two kinds:

a. The lower-level system processes the request successfully, but the communication fails, which leaves data inconsistent between the subsystems.

In this case, a timeout compensation mechanism can be used: record the task in advance, then correct the data later through a scheduled task.

If you have a better design, feel free to leave a comment.

b. The communication succeeds, but the lower-level system fails while processing the request.

In this case, communicate with the lower-level developers and coordinate how the subsystems interact.

Handle the failure appropriately, or give a reasonable prompt, according to the error code and error description returned by the lower layer.

In either case, assume that the lower-level system is only moderately reliable, and design for its errors.
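The timeout compensation idea from case (a) can be sketched as follows. This is only an in-memory illustration under stated assumptions: `LowerLayer`, `CompensationLog`, and the three task states are hypothetical names, and a real system would persist the task record in a database and call `reconcile` from a scheduled job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lower-layer client: the call may succeed remotely yet time out locally.
interface LowerLayer {
    void create(String vmName) throws Exception;   // may throw when the reply is lost
    boolean exists(String vmName);                  // used later to learn the true outcome
}

final class CompensationLog {
    enum State { PENDING, CONFIRMED, FAILED }

    private final Map<String, State> tasks = new LinkedHashMap<>();
    private final LowerLayer lower;

    CompensationLog(LowerLayer lower) {
        this.lower = lower;
    }

    // Record the task BEFORE calling the lower layer, so a lost reply still leaves a trace.
    void createVm(String vmName) {
        tasks.put(vmName, State.PENDING);
        try {
            lower.create(vmName);
            tasks.put(vmName, State.CONFIRMED);
        } catch (Exception timeoutOrIoError) {
            // Outcome unknown: keep the task PENDING and let reconcile() decide later.
        }
    }

    // Meant to be run by a scheduled task: ask the lower layer what really happened.
    void reconcile() {
        for (Map.Entry<String, State> entry : tasks.entrySet()) {
            if (entry.getValue() == State.PENDING) {
                entry.setValue(lower.exists(entry.getKey()) ? State.CONFIRMED : State.FAILED);
            }
        }
    }

    State stateOf(String vmName) {
        return tasks.get(vmName);
    }
}
```

The key design choice is the ordering: the task is written down first, so even if the process crashes between the call and the reply, the scheduled pass still has something to check.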

3. Errors produced by this layer's own processing.

Causes of errors within this layer:

Reason 1: negligence. Negligence means that the programmer was able to avoid the error but did not. Examples: typing & instead of &&, or = instead of ==; boundary errors; mistakes in compound logical conditions. Negligence comes either from a lack of concentration (being tired, working through the night, coding during a meeting) or from rushing to deliver features without considering the robustness of the program.

Improvement measures: static code analysis tools, together with unit tests that achieve good line coverage, effectively prevent such problems.

Reason 2: errors and exceptions are not handled thoroughly, for example input problems. To add two numbers, you must consider not only arithmetic overflow but also illegal input. The former can be avoided through understanding, past mistakes, or experience. The latter must be constrained to a range we can reason about, for example by using regular expressions to filter out illegal input; the regular expressions themselves must be tested. For illegal input, give a prompt, a reason, and a suggestion that are as detailed, understandable, and friendly as possible.

Improvement measures: consider every kind of error and exception as carefully as possible. After implementing the main flow, add one more step: scrutinize all possible errors and exceptions, and return a reasonable error code and error description for each. When every interface or module handles its own errors and exceptions properly, bugs caused by complex scenario interaction are largely avoided.

For example, suppose a business use case is completed by three scenarios A, B, and C interacting. If A and B succeed but C fails, B must roll back and return a reasonable code and message to A based on what C returned; A then rolls back based on what B returned and returns a reasonable code and message to the client. This is a segmented rollback mechanism, and it requires every scenario to plan for rollback in exceptional cases.
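The segmented rollback described above might look like the sketch below. It is a simplification: instead of A calling B calling C, the chain is flattened into a list handled by one coordinator. `SegmentedRollback`, `Step`, `Result`, and the `STEP_FAILED` code are hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class SegmentedRollback {
    // One scenario in the chain: it can run, and it knows how to undo itself.
    interface Step {
        String name();
        void execute() throws Exception;
        void rollback();
    }

    // What the client receives: a code plus a message saying what failed and what was undone.
    static final class Result {
        final String code;
        final String message;

        Result(String code, String message) {
            this.code = code;
            this.message = message;
        }
    }

    static Result run(List<Step> steps) {
        Deque<Step> done = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.execute();
                done.push(step);
            } catch (Exception e) {
                List<String> undone = new ArrayList<>();
                while (!done.isEmpty()) {
                    // Undo in reverse order: the most recent successful step first.
                    Step previous = done.pop();
                    previous.rollback();
                    undone.add(previous.name());
                }
                return new Result("STEP_FAILED", String.format(
                        "step %s failed: %s; rolled back %s", step.name(), e.getMessage(), undone));
            }
        }
        return new Result("OK", "all steps succeeded");
    }
}
```

If C fails after A and B succeeded, the returned message names C as the failing step and lists B and A as rolled back, in that order.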

Reason 3: tightly coupled logic. When business logic is tightly coupled, the logical relationships grow more complex as the product evolves, and it becomes hard to see the whole picture. Local modifications then ripple out globally and cause unpredictable problems.

Improvement measures: write short functions and methods, preferably no more than 50 lines each. Write stateless functions and methods that at most read global state, always produce the same result for the same input, and do not change their behavior depending on external state. Define reasonable structures, interfaces, and logical segments so that interactions between interfaces are as orthogonal and loosely coupled as possible. For the service layer, provide interfaces that are as simple and orthogonal as possible. Refactor continuously to keep the application modular and loosely coupled, and keep the logical dependencies sorted out.

Where many business interfaces affect one another, sort out the logic flow and interdependencies of each interface and optimize them as a whole. For entities with many states, sort out the related business interfaces and the transitions between states.

Reason 4: incorrect algorithm.

Improvement measures: first, separate the algorithm from the application. If the algorithm has several implementations, errors can be found with cross-checking unit tests (for example, sorting). If the algorithm is reversible, errors can be found with round-trip unit tests (for example, encryption and decryption).
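Both kinds of check can be sketched with the standard library alone. In this illustration the hand-written `insertionSort` is cross-checked against `Arrays.sort`, and Base64 encoding stands in for an encrypt/decrypt pair as the reversible operation; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.Random;

final class AlgorithmChecks {
    // The implementation under test: a hand-written insertion sort.
    static int[] insertionSort(int[] input) {
        int[] a = input.clone();
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
        return a;
    }

    // Cross-check: two independent implementations must agree on random inputs.
    static boolean crossCheckSort(int rounds, long seed) {
        Random random = new Random(seed);
        for (int r = 0; r < rounds; r++) {
            int[] data = random.ints(random.nextInt(50), -1000, 1000).toArray();
            int[] expected = data.clone();
            Arrays.sort(expected);
            if (!Arrays.equals(expected, insertionSort(data))) {
                return false;
            }
        }
        return true;
    }

    // Round-trip check: decode(encode(x)) must give back x.
    static boolean roundTrip(String text) {
        byte[] original = text.getBytes(StandardCharsets.UTF_8);
        byte[] encoded = Base64.getEncoder().encode(original);
        byte[] decoded = Base64.getDecoder().decode(encoded);
        return Arrays.equals(original, decoded);
    }
}
```

A fixed seed keeps the random cross-check reproducible, so a failing input can be replayed when a mismatch is found.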

Reason 5: parameters of the same type passed in the wrong order. For example, the method is declared as modifyFlow(int rx, int tx) but is actually called as modifyFlow(tx, rx).

Improvement measures: make types as specific as possible. Use floating-point types for floating-point values, strings for strings, and specific object types for specific objects. Interleave parameters of the same type with parameters of other types where possible. If neither is feasible, verify the call through interface tests, and make sure the test values of the parameters are all different.
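One way to make the types specific is to wrap each value in its own small class, so that swapped arguments fail at compile time. The sketch below assumes the two values are rates in kbps; `FlowLimits`, `RxKbps`, and `TxKbps` are hypothetical names, and the original `modifyFlow` may well look different.

```java
final class FlowLimits {
    // Distinct wrapper types: passing them in the wrong order no longer compiles.
    static final class RxKbps {
        final int value;

        RxKbps(int value) {
            this.value = value;
        }
    }

    static final class TxKbps {
        final int value;

        TxKbps(int value) {
            this.value = value;
        }
    }

    // Returns a description of the change; a real method would apply it.
    static String modifyFlow(RxKbps rx, TxKbps tx) {
        return String.format("[modify flow] rx=%d kbps, tx=%d kbps", rx.value, tx.value);
    }
}
```

With these types, `FlowLimits.modifyFlow(new FlowLimits.RxKbps(100), new FlowLimits.TxKbps(20))` compiles, while the swapped call is rejected by the compiler instead of silently misbehaving at run time.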

Reason 6: null pointer exception. A null pointer exception usually means that an object was not initialized correctly, or that the code did not check whether the object was non-null before using it.

Improvement measures: for configuration objects, check that they were initialized successfully; for ordinary objects, check that they are non-null before using the entity obtained.
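A minimal sketch of both checks, assuming a hypothetical status table keyed by VM name (`NullChecks` and `findVmStatus` are illustrative names only). `Objects.requireNonNull` guards the object that should have been initialized, and the explicit check names the entity that was missing.

```java
import java.util.Map;
import java.util.Objects;

final class NullChecks {
    // Fail early, and say which object or entity was missing.
    static String findVmStatus(Map<String, String> statusByVmName, String vmName) {
        Objects.requireNonNull(statusByVmName, "statusByVmName must be initialized before lookup");
        String status = statusByVmName.get(vmName);
        if (status == null) {
            throw new IllegalStateException(String.format(
                    "[query vm status] vm '%s' not found in status table, check that the vm was created",
                    vmName));
        }
        return status;
    }
}
```

Failing at the point of lookup, with the entity name in the message, is usually easier to diagnose than a bare NullPointerException several calls later.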

Reason 7: network communication error. Network communication errors are usually caused by network delay, congestion, or failure. They are usually low-probability events, but low-probability events can cause large-scale failures and bugs that are hard to reproduce.

Improvement measures: write an INFO log at the exit point of the upstream subsystem and at the entry point of the downstream subsystem. The time difference between the two provides a clue.

Reason 8: transaction and concurrency errors. The combination of transactions and concurrency easily leads to errors that are very hard to locate.

Improvement measures: for concurrent operations that involve shared variables or important state changes, add INFO logs.

If you know a more effective way, you are welcome to point it out in a comment.

Reason 9: configuration error.

Improvement measures: when the application starts or the configuration is loaded, check every configuration item and print a corresponding INFO log to confirm that all configuration was loaded successfully.
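A startup check along these lines could be sketched with `java.util.Properties` and `java.util.logging`. The article does not say which logging framework or configuration format it uses, so both are assumptions here, as are the `ConfigCheck` name and the example keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.logging.Logger;

final class ConfigCheck {
    private static final Logger log = Logger.getLogger(ConfigCheck.class.getName());

    // Returns the required keys that are missing or blank, and logs every item it checks.
    static List<String> validate(Properties config, String... requiredKeys) {
        List<String> missing = new ArrayList<>();
        for (String key : requiredKeys) {
            String value = config.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
                log.severe(String.format("[load config] required item '%s' is missing or blank", key));
            } else {
                log.info(String.format("[load config] %s = %s", key, value));
            }
        }
        return missing;
    }
}
```

The caller can refuse to start when the returned list is non-empty. Secrets such as passwords should be masked rather than printed as plain values.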

Reason 10: errors caused by unfamiliarity with the business. In medium and large systems, some of the business logic and business interactions are complex. The whole picture may be spread across the heads of several developers, none of whom understands it completely. This easily leads to coding errors in the business logic.

Improvement measures: design the correct business use cases through group discussion, and write the business logic according to those use cases. Archive the final business logic and use cases in full. In the business interface, specify the preconditions, processing logic, post-checks, and caveats. When the business changes, update the business comments at the same time. Conduct code reviews. Business comments are important documentation for business interfaces and act as a cache of business understanding.

Reason 11: errors caused by design problems. For example, a synchronous serial design suffers from poor performance and slow response. A concurrent asynchronous design solves those problems but introduces risks to safety and correctness. It also changes the programming model and adds new problems such as pushing and receiving asynchronous messages. Caching improves performance but raises the problem of cache updates.

Improvement measures: write design documents and review them carefully. A design document must describe the background, the requirements, the business goals to be met, the performance targets to be reached, the possible impact, the overall design idea, and the detailed plan, and it must anticipate the advantages, disadvantages, and possible impact of the plan. Through testing and acceptance, make sure that the revised design really meets the business goals and performance targets.

Reason 12: errors caused by unknown details, such as buffer overflows and SQL injection attacks. The code works functionally but leaves holes open to malicious use. As another example, if you choose the Jackson library to parse JSON strings, then by default parsing fails when the JSON contains a new field that the object does not declare. You must annotate the class with @JsonIgnoreProperties(ignoreUnknown = true) to cope with such changes. Other JSON libraries may not have this problem.

Improvement measures: on the one hand, accumulate experience; on the other hand, consider security issues and exceptional cases, and choose mature libraries that have been rigorously tested.

Reason 13: bugs that emerge over time. Solutions that looked good in the past often become clumsy or even useless in current or future scenarios. For example, an encryption algorithm once considered perfect should be used with caution after it has been broken.


Improvement measures: follow change announcements and bug fix notices, and correct outdated code, libraries, and behavior promptly.

Reason 14: hardware-related errors, such as memory leaks, insufficient storage space, and OutOfMemoryError.

Improvement measures: add performance monitoring for important indicators of the application system such as CPU, memory, and network.

Common errors in the system:


The record of the entity does not exist in the database. State which entity it is and what its identifier is.

The entity configuration is incorrect. State which configuration item has the problem and what the correct configuration should be.

The entity's resources do not meet the conditions. State what the current resources are and what the resource requirements are.

The preconditions of the entity operation are not met. State which precondition must be met and what the current state is.

The post-check of the entity operation is not satisfied. State which post-check must be satisfied and what the current state is.

Performance problems lead to timeouts. State what caused the performance problem and how to optimize it later.

Errors in communication between subsystems leave their state or data inconsistent.

Errors that are hard to locate generally occur in the lower layers. Because the lower layers cannot anticipate the specific business scenario, the error messages they give are rather generic.

The upper business layer therefore has to supply as many clues as possible. An error must come from a precondition that was not met at some level of the stack during multi-system or multi-layer interaction. When programming, make sure as far as possible that every necessary precondition is met at each level, avoid passing wrong parameters down to the lower layers, and intercept errors in the business layer.

Most errors have a combination of causes, but every error has a cause. After fixing an error, analyze in depth how it happened and how to prevent it from happening again. Hard work brings success, but only reflection brings progress!

How to write an error log that is easier to troubleshoot

Basic principles for writing error logs:

Be as complete as possible. Each error log gives a complete description of which error occurred in which scenario, what caused it (or may have caused it), and how to solve it (or hints for solving it).

Be as specific as possible. For example, if NC resources are insufficient, can the program state directly which resource is short? For generic errors such as VM NOT EXIST, state the scenario in which they occurred, which may also help later statistics.

Be as direct as possible. The ideal error log lets people see at a glance what caused the error and how to solve it, rather than working through several steps to find the real cause.

Build existing experience directly into the system. Every problem already solved and every lesson learned should be integrated into the system in a friendly way, so that newcomers get better hints instead of the knowledge being buried elsewhere.

Keep the layout neat and orderly and the format unified and standardized. Dense, haphazard logs are painful to read, unfriendly, and hard to troubleshoot with.

Use several keywords to identify the request uniquely, and highlight them: time, entity identifier (such as vmname), and operation name.

Basic steps for troubleshooting:

Log in to the application server -> open the log file -> navigate to the error log -> follow the clues in the error log to investigate, confirm, and resolve the problem.

Where:

From logging in to opening the log file. Because there are many application servers, logging in to view them one by one is inconvenient. You need a tool placed on AG that lets you view all server logs directly on AG, or even filter out the error logs you need.

Locating the error log. At present the log layout is so dense that error logs are hard to find. You can generally use the time to get close to the error log, then use a combination of entity keyword and operation name to pin it down. Locating the error log by requestId is the more traditional way, but you must find the requestId first, and it is not descriptive. It is best to locate the error log directly by time and content keywords.

Analyzing the error log. The content of the error log should be direct and clear, should show plainly that it matches the problem being investigated, and should give important clues.

The usual problem with error logs is that their content can only be understood in the context of the current code. They look concise but are incomplete and written in broken English. Away from the code, it is hard to know what they are talking about, and you have to think hard or read the code to understand what the log means. Isn't that self-inflicted pain?

For example:

    if ((storageType == StorageType.dfs1 || storageType == StorageType.dfs2)
            && (zone.hasStorageType(StorageType.io3) || zone.hasStorageType(StorageType.io4))) {
        // proceed: dfs1 and dfs2 can be stored in io3 and io4
    } else {
        log.info("zone storage type not support, zone:" + zone.getZoneId() + ", storageType:" + storageType.name());
        throw new BizException(DeviceErrorCode.ZONE_STORAGE_TYPE_NOT_SUPPORT);
    }

Which storage types does the zone actually support? Don't make me think!

The error log should work like this: even without the code context, it describes clearly what happened.

In addition, if the error log states the reason directly, it also saves effort when inspecting logs. In a sense, the error log can serve as a very useful document that records all kinds of illegal use cases.

The content of error logs in current programs may have the following problems:

1. The error log does not specify the erroneous parameters and content:

    catch (Exception ex) {
        log.error("control ip insert failed", ex);
        return new ResultSet(ControlIpErrorCode.ERROR_CONTROL_IP_INSERT_FAILURE);
    }

The log does not say which control ip failed to insert. If the control ip is added as a keyword, the error is easier to search for and pin down.

Similarly:

    log.error("Get some errors when insert subnet and its IPs into database. Add subnet or IP failure.", e);

It does not say which subnet or which of its IPs is involved. Note that specifying these extra details may have a slight impact on performance, so performance and debuggability need to be weighed here.

Solution: use String.format("Some msg to ErrorObj: %s", errobj) to state the erroneous parameters and content.

This usually requires writing readable toString methods on DO objects.
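As an illustration of that solution, the sketch below gives a data object a readable `toString` and uses it in a formatted message. `SubnetDO`, its two fields, and `insertFailedMessage` are hypothetical; the article does not show the real class.

```java
final class SubnetDO {
    private final String subnetId;
    private final String cidr;

    SubnetDO(String subnetId, String cidr) {
        this.subnetId = subnetId;
        this.cidr = cidr;
    }

    // A compact, searchable representation for log messages.
    @Override
    public String toString() {
        return String.format("Subnet{id=%s, cidr=%s}", subnetId, cidr);
    }

    // Builds the error text; the caller would pass it to log.error(message, exception).
    static String insertFailedMessage(SubnetDO subnet, String ip) {
        return String.format(
                "[add subnet] failed to insert %s and ip %s into database", subnet, ip);
    }
}
```

Because `%s` calls `toString` on the argument, the subnet id and CIDR appear in the log line without any extra code at the call site.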

2. The error scenario is not clear:

    log.error("nc has exist, nc ip" + request.getIp());

This error is detected in createNc when the NC already exists. However, the log does not state the scenario, so readers have to guess why an "NC already exists" error was reported.

It can be changed to

    log.error("nc already exists when trying to create nc, please check nc parameters. Given nc ip: " + request.getIp());

or

    log.error("[create nc] nc already exists, please check nc parameters. Given nc ip: " + request.getIp());

Similarly:

    log.error("not all vm destroyed, nc id" + request.getNcId());

Change to

    log.error(String.format("[delete nc] some vms [%s] in the nc are not destroyed. nc id: %s", vmNames, request.getNcId()));

Solution: add a "when" clause to the error message, or put [API name] in front of it, so that the error scenario can be read directly from the error log.

Generally, code that knows which API is executing can add the [API name] prefix, while service-layer code can add a "when" clause.

3. The content or its meaning is not clear:

    if (aliMonitorReporter == null) {
        log.error("aliMonitorReporter is null!");
    } else {
        aliMonitorReporter.attach(new ThreadPoolMonitor(namePrefix, asynTaskThreadPool.getThreadPoolExecutor()));
    }

Change to:

    log.error("aliMonitorReporter is null, probably not initialized properly, please check configuration in file xxx.");

Similarly:

    if (diskWbps == null && diskRbps == null && diskWiops == null && diskRiops == null) {
        log.error("none of attribute is specified for modifying");
        throw new BizException(DeviceErrorCode.NO_ATTRIBUTE_FOR_MODIFY);
    }

Change to

    log.error("[modify disk attribute] None of [diskWbps,diskRbps,diskWiops,diskRiops] is specified for disk id: " + diskId);

Solution: describe the error more clearly and precisely.

4. The guidance for troubleshooting is not clear:

    log.error("get gw group ip segment failed. ZkPath: " + LockResource.getGwGroupIpSegmnetLockPath(request.getGwGroupId()));

ZkPath? How do I troubleshoot this? Whom should I ask? Where can I find more specific clues?

Solution: add the relevant background knowledge and point to the steps to take when investigating.

5. The error content is not specific enough:

    if (!ncResourceService.isNcResourceEnough(ncResourceDO, vmResourceCondition)) {
        log.error("disk space is not enough at vm's nc, nc id:" + vmDO.getNcId());
        throw new BizException(ResourceErrorCode.ERROR_RESOURCE_NOT_ENOUGH);
    }

Which resource is short? How much is left? How much is needed? Note that specifying these extra details may have a slight impact on performance, so performance and debuggability need to be weighed here.

Solution: improve the program or the programming technique so that the log reveals the specific difference as far as possible and reduces manual comparison.
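A message that answers those three questions could be built as in the sketch below. The method name, the GB unit, and the "[create vm]" operation label are assumptions made for illustration; the real resource model is not shown in the article.

```java
final class ResourceMessages {
    // States what is short, how much is needed, and how much is left.
    static String diskNotEnough(String ncId, long requiredGb, long availableGb) {
        return String.format(
                "[create vm] disk space is not enough on nc '%s': required %d GB, available %d GB, short by %d GB",
                ncId, requiredGb, availableGb, requiredGb - availableGb);
    }
}
```

For example, `ResourceMessages.diskNotEnough("nc-12", 200, 150)` reports a shortfall of 50 GB directly, so nobody has to look up the current usage by hand.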

6. Sentences in broken English are hard to read, and the reader has to piece the full meaning together:

    log.warn("cache status conflict, device id" + deviceDO.getId() + "db status" + deviceDO.getStatus() + ", nc status" + status);

Change to:

    log.warn(String.format("[query cache status] device cache status conflicts between regiondb and nc, status of device '%s' in regiondb is %s, but is %s in nc.", deviceDO.getId(), deviceDO.getStatus(), status));

Solution: rewrite the message as a natural, readable English sentence.

To sum up, the error log format can be:

    log.error("[interface name or operation name] [Some Error Msg] happens. [params] [Probably Because]. [Probably need to do].");
    log.error(String.format("[interface name or operation name] [Some Error Msg] happens. [%s]. [Probably Because]. [Probably need to do].", params));

Or

    log.error("[Some Error Msg] happens to error parameter or content when [in some condition]. [Probably Because]. [Probably need to do].");
    log.error(String.format("[Some Error Msg] happens to %s when [in some condition]. [Probably Because]. [Probably need to do].", parameters));

[Probably Because]. [Probably need to do]. can be omitted in some cases; it is best to include them for important interfaces and scenarios.

Each error log stands on its own and describes, as completely, specifically, and directly as possible, which error occurred in which scenario, what caused it, and what measures or steps to take.
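The first template above can be captured in a small helper so that every message follows the same shape. This is only one possible sketch: `ErrorLogFormat` and its parameter names are hypothetical, and the optional parts are simply skipped when null.

```java
final class ErrorLogFormat {
    // Shape: [operation] what happened. params. Probably because <cause>. Probably need to <action>.
    static String format(String operation, String what, String params, String cause, String action) {
        StringBuilder sb = new StringBuilder();
        sb.append('[').append(operation).append("] ").append(what).append('.');
        if (params != null) {
            sb.append(' ').append(params).append('.');
        }
        if (cause != null) {
            sb.append(" Probably because ").append(cause).append('.');
        }
        if (action != null) {
            sb.append(" Probably need to ").append(action).append('.');
        }
        return sb.toString();
    }
}
```

A call such as `ErrorLogFormat.format("create nc", "nc already exists", "given nc ip: 10.0.0.8", null, "check nc parameters")` yields one self-contained sentence that can be passed straight to the logger.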

Questions:

1. Will the performance of String.format affect logging?

In general, error logs should be relatively few, so String.format is not called often enough to affect the application or its logging.

2. When development time is very tight, is there time to deliberate over wording?

Establishing a standard content format and filling the content into it saves the time spent weighing words.

3. When should info, warn, and error be used?

info prints normal status information that the program is expected to produce, which makes tracing and locating easier.

warn indicates that something in the system is slightly off but does not affect operation or use.

error indicates that a system error or exception occurred and the target operation could not be completed.
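If the cost raised in question 1 ever does matter, the message can be built lazily. The sketch below assumes `java.util.logging`, whose levels INFO, WARNING, and SEVERE roughly correspond to info, warn, and error; the article's own logger is not specified. `LazyLogging` and the counter exist only to make the laziness observable.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Logger;

final class LazyLogging {
    // Counts how often the message was actually built.
    static final AtomicInteger formatted = new AtomicInteger();

    static String expensiveMessage(String deviceId, String dbStatus, String ncStatus) {
        formatted.incrementAndGet();
        return String.format(
                "[query cache status] status of device '%s' is %s in regiondb, but %s in nc",
                deviceId, dbStatus, ncStatus);
    }

    static void report(Logger log, String deviceId, String dbStatus, String ncStatus) {
        // The Supplier overload runs String.format only if WARNING is enabled for this logger.
        log.warning(() -> expensiveMessage(deviceId, dbStatus, ncStatus));
    }
}
```

When the logger's level is set above WARNING, `report` returns without formatting anything, so detailed messages cost nothing while they are switched off.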

Error logs are one of the most important means of troubleshooting. When we implement a feature, we usually consider the various errors that may occur and their corresponding causes.

To find the cause, you need some key descriptions that point to it.

This forms a triple:

Error phenomenon -> key description of the error -> final cause of the error.

Provide a key description for every error wherever possible, so that its cause can be located.

That is, when programming, think carefully about which descriptions are most helpful in locating the cause of an error, and add them to the error log as far as possible.
