A Case Analysis of How Google Built Its Static Code Analysis Tools


In this issue, the editor brings you a case analysis of how Google built its static code analysis tools. The article is rich in content and analyzed from a professional point of view; we hope you take something away from reading it.

Software bugs cost developers and software companies a great deal of time and money. In 2014, for example, the ("goto fail") bug in a widely used implementation of the SSL protocol caused invalid SSL certificates to be accepted, and another bug related to date formatting caused a widespread service outage at Twitter. Such errors can often be detected by static analysis, and in fact can be identified quickly while reading the code or documentation, yet they still end up in production environments.

Previous work has reported on the experience of applying bug-detection tools in software development. But despite the many successful examples of developers using static analysis tools, engineers are still not always willing to use them, or actively ignore the warnings they produce, for the following reasons:

Not properly integrated. The tool is not integrated into the developer's workflow, or it takes too long to run.

Invalid warnings. The warning messages are not actionable.

Not trustworthy. Users no longer trust the results because of false positives.

Unclear real-world relevance. The reported bug is theoretically possible, but its impact in actual usage scenarios is unclear.

Too expensive to fix. Fixing the detected code defect costs too much or carries other risks.

Warnings are hard to understand. Users do not understand the specifics of the warning or the rationale behind it.

Next, this article describes how we learned from Google's earlier experience using FindBugs for Java analysis and from the academic literature, and ultimately succeeded in building a static analysis infrastructure used by Google's software engineers. Taking engineers' feedback into account, Google's tools detect thousands of problems per day that engineers fix before the problematic code is merged into the company-wide code repository.

In terms of scope, we focus on integrating static analysis into Google's core development process in a way that serves the majority of Google developers. Many static code analysis tools are dwarfed by the 2 billion lines of code deployed at Google, so techniques for running sophisticated analyses at that scale are not a high priority.

Of course, developers outside Google who work in specialized domains (such as aerospace and medical devices) may use specific static analysis tools and workflows. Likewise, developers on particular kinds of projects, such as kernel code and device drivers, may need specialized analyses. There has been a great deal of excellent work on static analysis, and we do not claim our observations are unique, but we firmly believe that collating and sharing our work on improving the quality of Google's code and the development experience is helpful.

Terminology. We use the following definitions: an analysis tool runs one or more "checkers" over source code and identifies "defects" that may represent software faults. If a developer does not take positive action after seeing an issue, we regard it as an "effective false positive". If the analysis incorrectly reports a defect, but the developer nevertheless changes the code to improve its readability and maintainability, that is not an effective false positive. If the analysis reports a genuine fault, but the developer does not understand the problem and therefore takes no action, that does count as an effective false positive. We use this distinction to emphasize the developer's perspective: it is the developer's perception of the tool, not the tool author's, that directly determines the tool's false positive rate.

How Google compiles and builds software

Below we outline the key points of Google's software development process. At Google, almost all development tools (except the local development environment) are centralized and standardized. Much of this infrastructure was built from scratch by internal teams, which retain the flexibility to experiment.

Source control and code ownership. Google has developed and uses a single source code management system, with a single branch storing (almost) all of Google's proprietary code. Developers follow a "trunk-based" development approach that limits branching, usually branching per release rather than per feature. Any engineer can change any code, subject to approval by the code's owners. Code ownership is path-based; the owners of a directory have the same rights over its subdirectories.

Build system. All code in the Google codebase is compiled with the same version of the Bazel build system, and all inputs must be explicitly declared and checked into source control, so the build is easy to distribute and parallelize. In Google's build system, Java rules depend on a checked-in JDK and Java compiler, and those binaries can be updated for all users by rolling out a new version. Builds come almost entirely from source (at head); very few binary artifacts are checked in. Because every developer uses the same build system, any piece of code is expected to compile without errors.

Analysis tools. The static analysis tools Google uses are generally not complex. Google's infrastructure does not support running interprocedural or whole-program analysis at this scale, nor does it use advanced static analysis techniques (such as separation logic) broadly. Even simple checkers need analysis infrastructure to support their integration into the workflow. The kinds of analyzers deployed as part of the general development process include:

Style checkers (such as Checkstyle, Pylint, and Golint)

Extensible bug-finding compilers (such as Error Prone, ClangTidy, Clang Thread Safety Analysis, Go vet, and the Checker Framework), including but not limited to abstract-syntax-tree pattern matchers, type-based checkers, and analyzers that detect unused variables

Analyzers that call production services (for example, checking whether an employee mentioned in a code comment still works at Google)

Checkers of build-output properties (such as the size of output binaries)

Google's C++ linter catches the "goto fail" vulnerability by checking whether if statements are followed by braces, and a pattern-matching checker recognizes the date-formatting error, so the code that caused the Twitter outage would not even compile at Google. Google developers also use dynamic analysis tools such as AddressSanitizer to find buffer overflows and ThreadSanitizer to find data races. These tools run during testing and sometimes even against production traffic.
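
To make the date-formatting pitfall concrete, here is a minimal Java sketch of the commonly cited form of that bug (our own illustration, not code from the article): in java.text.SimpleDateFormat, the pattern letter "YYYY" denotes the week-based year rather than the calendar year, so dates in the last days of December can silently format with next year's number.

    import java.text.SimpleDateFormat;
    import java.util.Calendar;

    public class WeekYearBug {
        public static void main(String[] args) {
            Calendar cal = Calendar.getInstance();
            cal.set(2014, Calendar.DECEMBER, 29); // Dec 29, 2014 falls in week 1 of 2015

            // Bug: "YYYY" is the week-based year, so this prints 2015-12-29.
            System.out.println(new SimpleDateFormat("YYYY-MM-dd").format(cal.getTime()));

            // Fix: "yyyy" is the calendar year, so this prints 2014-12-29.
            System.out.println(new SimpleDateFormat("yyyy-MM-dd").format(cal.getTime()));
        }
    }

A pattern-matching checker only needs to flag "YYYY" used together with day-of-month patterns to catch this whole class of bug.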

Integrated development environments (IDEs). One entry point for static analysis early in the development process is IDE integration. But Google developers use a wide variety of editors, so it is difficult to reliably detect errors for all developers before they invoke the build tool. While Google does ship analyses integrated with popular internal IDEs, requiring a particular IDE in order to get analysis results would reach only a fraction of developers.

Testing. Almost all Google code has corresponding tests, from unit tests to large-scale integration tests. Testing is a first-class concept in the build system and, like compilation, is hermetic and distributed. For most projects, the developers write and maintain the tests for their own code; projects usually do not have a separate test or QA group.

Google's continuous build-and-test system runs the tests on every commit, giving timely feedback when a developer's change breaks the build or fails a test case. It also supports testing a change before commit to avoid breaking downstream projects.

Code review. Every commit at Google goes through code review first. Although any developer can change any part of Google's code, an owner of that code must review and approve the change before it is merged. In addition, even code owners must have their own changes reviewed before committing. Code review happens through a centralized, web-based tool that is tightly integrated with the rest of the development infrastructure, and static analysis results can be displayed within it.

Code release. Google teams release frequently, and most of the release verification and deployment process is automated through a "push on green" approach, meaning a laborious manual release-verification process is impractical. If Google engineers find an error in production, they can roll back to a previous version and deploy it to the production servers at relatively low cost compared with interrupting the service.

Lessons learned from FindBugs

During the early exploratory phase from 2008 to 2010, Google's static analysis work focused on Java analysis using FindBugs: a standalone tool created by William Pugh of the University of Maryland and David Hovemeyer of York College of Pennsylvania. It works by analyzing compiled Java class files and matching patterns of code structure that are likely to be bugs. As of January 2018, FindBugs was used only as a command-line tool by a very small number of engineers at Google. A small Google team called "BugBot" worked with Pugh, the original author, and made three major attempts to integrate FindBugs into the Google development process.

We learned the following from those attempts:

Attempt 1: the bug dashboard. Initially, in 2006, FindBugs was integrated as a centralized tool that scanned the entire Google codebase nightly and recorded the results for engineers to view through a dashboard. Although FindBugs found hundreds of errors in Google's Java codebase, the dashboard had little effect, because it was disconnected from developers' day-to-day workflow and could not be combined organically with other existing static analysis results.

Attempt 2: manually triaging and filing bugs.

Next, the BugBot team began manually triaging the new issues found each night, to identify and file the relatively important bug reports. In May 2009, hundreds of Google engineers took part in a company-wide "Fixit" week focused on addressing FindBugs warnings. A total of 3954 warnings (42% of the 9473 total) were reviewed, but only 16% of those (640) were actually fixed, even though 44% of the reviewed warnings (1746) resulted in a bug report being filed. Although the Fixit confirmed that many of the issues FindBugs found were real code defects, many were not important enough to be worth fixing in practice. Manually triaging issues and filing bug reports is hard to sustain at scale.

Attempt 3: integrating into code review. Next, the BugBot team built a system in which, when a reviewer was assigned, FindBugs ran automatically and its results were posted as code review comments; the code review team had already done the same for coding-style issues. Google developers could suppress false positives and filter results by FindBugs confidence level. The tool also tried to show only new FindBugs warnings, but misclassification sometimes caused old issues to be treated as new. When the code review tool was replaced in 2011, this integration ended, for two reasons: a high rate of effective false positives had cost the tool developers' confidence, and the freedom to customize filtering meant everyone saw a different view of the analysis results.

Integrating into the compilation process

In parallel with the FindBugs experiments, Google's C++ development process was improving through the addition of new check rules to the Clang compiler. The Clang team implemented new compiler checks, together with suggested-fix information, then used ClangMR to run the updated compiler over the entire Google codebase in a distributed fashion and applied the fixes to clean up the existing instances of each defect. Once the codebase had been scrubbed of an issue, the Clang team enabled the new checker so that new instances are flagged as compiler errors (rather than warnings, which the Clang team found Google developers ignore) that abort the build and must be resolved. The Clang team was very successful at improving codebase quality with this strategy.

We followed this idea and built an easy-to-use, pattern-based Java static analysis tool on top of the javac compiler, called Error Prone. Its first check rule, called PreconditionsCheckNotNull, detects calls where the arguments of a precondition check are transposed so that the method never actually validates its input, such as checkNotNull("uid is null", uid) rather than checkNotNull(uid, "uid was null").
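
As a concrete sketch of the transposed-argument pattern this check targets (our own illustration built on Guava's Preconditions.checkNotNull, whose first parameter is the reference under test):

    import static com.google.common.base.Preconditions.checkNotNull;

    public class UserSession {
        private final String uid;

        UserSession(String uid) {
            // Bug: the arguments are transposed. The always-non-null string
            // literal is what gets checked, so the call never fails, and a
            // null uid slips through instead of failing fast here.
            checkNotNull("uid is null", uid);

            // Correct: the value under test comes first, the message second.
            // checkNotNull(uid, "uid was null");

            this.uid = uid;
        }
    }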

To turn on a checker like PreconditionsCheckNotNull without breaking any continuous builds, the Error Prone team first runs the check over the entire codebase using a javac-based MapReduce program, analogous to ClangMR and built with FlumeJava, called JavacFlume. JavacFlume produces a set of suggested fixes, diffs them, and applies them across the codebase. The team then uses the internal tool Rosie to split the large-scale change into small changes, each affecting only a single project, tests those changes, and sends each one to the appropriate team for code review. Each team reviews only the fixes that apply to its own code, and Rosie commits only the changes they approve. Once all the fixes for existing defects are in and the codebase is clean, the team turns the check on as a compiler error.

When we surveyed the developers who received these patches, 57% of those who received a proposed fix to their code were happy to get it, and 41% were neutral. Only 2% responded negatively, saying "this will only increase my workload."

The value of compiler checks

Compiler errors are shown early in the development process and are integrated into the developer workflow. We have found that extending the compiler's checks effectively improves code quality at Google. Because checks in Error Prone are written against javac's abstract syntax tree rather than bytecode (unlike FindBugs), developers outside the team find it relatively easy to write checks. Harnessing these external contributions is critical to Error Prone's overall impact: as of January 2018, 162 authors had contributed 733 checkers.
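
To give a flavor of what contributing a check involves, here is a minimal sketch using Error Prone's public BugChecker API (our own simplified example, not one of Google's shipped checkers; the com.example.Hashing class and its legacyHash method are hypothetical, and API details vary across Error Prone releases):

    import static com.google.errorprone.matchers.method.MethodMatchers.staticMethod;

    import com.google.errorprone.BugPattern;
    import com.google.errorprone.VisitorState;
    import com.google.errorprone.bugpatterns.BugChecker;
    import com.google.errorprone.bugpatterns.BugChecker.MethodInvocationTreeMatcher;
    import com.google.errorprone.matchers.Description;
    import com.google.errorprone.matchers.Matcher;
    import com.sun.source.tree.ExpressionTree;
    import com.sun.source.tree.MethodInvocationTree;

    @BugPattern(
        name = "LegacyHashUsage",
        summary = "Hashing.legacyHash is unstable across releases; use stableHash instead",
        severity = BugPattern.SeverityLevel.ERROR)
    public class LegacyHashUsageChecker extends BugChecker
            implements MethodInvocationTreeMatcher {

        // Matches calls to the (hypothetical) static method com.example.Hashing.legacyHash.
        private static final Matcher<ExpressionTree> LEGACY_HASH =
            staticMethod().onClass("com.example.Hashing").named("legacyHash");

        @Override
        public Description matchMethodInvocation(MethodInvocationTree tree, VisitorState state) {
            if (!LEGACY_HASH.matches(tree, state)) {
                return Description.NO_MATCH;
            }
            // Report a diagnostic anchored at the offending call site.
            return describeMatch(tree);
        }
    }

Because the checker works on javac's syntax tree with type information available, contributors write ordinary Java against a small matcher vocabulary rather than manipulating bytecode.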

The sooner you report the problem, the better.

Google's centralized build system records all build processes and results, so we can ensure that all users see the error messages within a specified time window. We sent a survey to developers who had recently encountered a compiler error and to developers who had received a suggested fix for the same problem in checked-in code. Google developers perceive issues flagged at compile time (as opposed to patches against merged code) as catching more important bugs; for example, survey participants judged 74% of the issues flagged at compile time to be "real problems", versus 21% of those found in checked-in code. In addition, participants rated 6% of the issues found at compile time (versus 0% of those found in checked-in code) as "critical". This result can be explained by "survivorship bias": by the time code is submitted, the worst errors are likely to have been caught by more expensive means such as testing and code review. Moving as many checks as possible into the compiler is a reliable way to avoid those costs.

Criteria for compiler checks

Because breaking the build is a drastic action, rolling out our work at scale required defining strict criteria for enabling a check in the compiler, and we set that bar high. A compiler check at Google should be easy to understand; actionable and easy to fix (wherever possible, the error message should include a suggested fix that applies universally); produce no effective false positives (the check should never break the build of code that is actually correct); and report only real bugs rather than style or convention issues. The main goal of an analyzer that meets these criteria is not merely to detect problems but to fix the resulting compiler errors automatically across the codebase. These criteria also limit the scope of the checks the Error Prone team enables at compile time; many problems that cannot be detected precisely, or for which no common fix exists, remain ahead of us.

Displaying warnings in the code review phase

Once the Error Prone team had built the infrastructure needed to detect problems at compile time and that approach had proved effective, we wanted to surface higher-impact bugs than our compiler-error checks allow and to provide analysis results for languages beyond Java and C++. The second integration point for static analysis results is Google's code review tool, Critique; static analysis results are displayed in Critique via Tricorder, Google's program analysis platform. As of January 2018, C++ and Java builds at Google compiled with zero compiler warnings, and all analysis results were shown either as compiler errors or during code review.

Criteria for code review checks

Unlike compile-time checks, analysis results displayed during code review are allowed an effective false positive rate of up to 10%. Feedback given during code review is not expected to be flawless: developers evaluate a suggested fix before actually adopting it. A checker in Google's code review phase should meet the following criteria:

Easy to understand. The message is clear and comprehensible to engineers.

Actionable and easy to fix. The fix may take more time, thought, and effort than at the compiler-check stage, but the result should still include guidance on how to address the problem.

An effective false positive rate under 10%. Developers should feel that the checker is finding a real bug at least 90% of the time.

A significant impact on code quality. The issue may not affect the program's correct operation, but developers should take it seriously and choose to fix it.

Some problems are serious enough to flag in the compiler, but reducing their false positives or developing an automated fix is not feasible; for example, some fixes may require refactoring the code. Enabling such checks as compiler errors would require manually cleaning up all existing instances, which is not feasible in a codebase as large as Google's. Showing these checks in code review means they prevent new problems from being introduced while leaving it to developers to decide whether to fix existing ones. Code review is also a good time to report relatively less important issues, such as style problems or opportunities to simplify code. In our experience, reporting those at compile time is never acceptable to developers and makes rapid iteration and debugging harder; for example, a detector for unreachable code paths would hinder temporarily disabling a block of code for debugging. But at code review time, developers are carefully putting the finishing touches on their code; they are in a receptive frame of mind and more open to readability and stylistic details.

Tricorder

Tricorder is designed to be easy to extend and to support many different kinds of program analysis tools, including both static and dynamic analyzers. We show in Tricorder some Error Prone checkers that cannot be enabled as compiler errors. A parallel set of C++ analyses, called ClangTidy, is likewise integrated with Tricorder. Tricorder analyzers report results for more than 30 languages, support simple syntactic analyses such as style checkers, can leverage compiler information for Java, JavaScript, and C++, and can integrate directly with production data (for example, about currently running jobs). Tricorder continues to succeed at Google because it is a plug-in platform supporting an ecosystem of analyzer writers, because it highlights possible fixes during code review, and because it provides a feedback channel for improving analyzers and ensures that analyzer developers act on that feedback.

Enabling users to contribute. As of January 2018, Tricorder included 146 analyzers, of which 125 came from outside the Tricorder team, and seven of them were plug-in systems hosting hundreds of additional checks (Error Prone and ClangTidy are two of the seven).

Reviewers drive the adoption of suggested fixes.

Tricorder checkers can attach reasonable suggested fixes to the code review, visible to both reviewers and authors. A reviewer can ask the author to fix defective code by clicking the "Please fix" button on an analysis result. Reviewers usually do not approve a change for merging until all of their comments, both manual and automated, have been resolved.

Iterating on user feedback. In addition to the "Please fix" button, Tricorder provides a "Not useful" button that reviewers or authors can click to indicate they disagree with an analysis result. The click automatically files a bug in the issue tracker, routed to the team that owns the analyzer. The Tricorder team follows up on these "Not useful" clicks and computes the ratio of "Please fix" to "Not useful" clicks. If an analyzer's "Not useful" rate exceeds 10%, the Tricorder team disables it until the author improves it. While the Tricorder team has rarely disabled an analyzer permanently, it has disabled some (in several scenarios) until their authors removed or reworked the checks whose results were confusing and unhelpful.

The bugs filed this way often improve the analyzers, and thus greatly increase developer satisfaction with them. For example, in 2014 the Error Prone team developed a check that flags when too many arguments are passed to a printf-like function in Guava. Guava's printf-like functions do not actually accept all printf specifiers, only %s. About once a week the Error Prone team would receive a "Not useful" bug claiming the analysis was wrong because the number of format specifiers in the flagged code matched the number of arguments passed. In every case the user was trying to pass a placeholder other than %s, and the analyzer was in fact correct. So the team changed the diagnostic text to state directly that the function accepts only %s placeholders, and the stream of bugs about that check stopped.
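
A minimal sketch of the pattern that check flags (our own example, using Guava's Preconditions, whose error-message templates substitute %s only):

    import static com.google.common.base.Preconditions.checkArgument;

    public class Validation {
        static void setPort(int port) {
            // Flagged: %d is not a supported specifier here, so the message
            // would keep a literal "%d" and append the extra argument at the
            // end instead of substituting it.
            checkArgument(port > 0, "port must be positive, was %d", port);

            // Correct: Guava's message templates substitute %s only.
            checkArgument(port > 0, "port must be positive, was %s", port);
        }
    }

From the analyzer's point of view the first call has zero %s specifiers but one argument, which is exactly the "too many arguments" mismatch the check reports.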

The scale of Tricorder usage. As of January 2018, Tricorder analyzed roughly 50,000 code review changes per day, running three analyses per second at peak. Reviewers click "Please fix" more than 5,000 times a day, and authors apply an automated fix about 3,000 times a day. Tricorder analyzers receive 250 "Not useful" clicks per day.

The success of code review analysis shows that it occupies a "sweet spot" in Google's developer workflow. Analysis results shown at compile time must meet a bar of quality and accuracy that analyzers cannot reach while still flagging serious but less clear-cut problems. After review and merging, developers face increasing friction to making changes: they are reluctant to modify code that has already been tested and released, and are far less likely to address low-risk, low-importance issues. Analysis projects at many other software development organizations (such as Facebook's Infer analysis for Android/iOS apps) likewise emphasize code review as a key entry point for reporting analysis results.

Extending the analyzers

As Google developers came to accept Tricorder analyzer results, they kept asking for more analyzers. Tricorder addresses this in two ways: allowing customization at the project level, and surfacing analysis results at other points in the development process. In this section we also discuss some of the reasons Google has not adopted more sophisticated analysis techniques in its core development process.

Project-level customization

Not all requested analyzers are equally valuable across the entire Google codebase; for example, some analyzers have high false positive rates, and a checker with a correspondingly high false positive rate may need to be enabled per project to be effective. Those analyzers are useful only to the right teams.

To meet these needs, our goal was to make Tricorder customizable. Our earlier customization experience with FindBugs had been ineffective: per-user customization led to discrepancies within and across teams and reduced tool usage. Because each user could see a different view of the issues, there was no way to ensure that everyone working on the same project saw a particular problem. If one developer removed all unused imports from their team's code, that cleanup could quickly be undone by another developer who did not see the same warnings and was inconsistent about removing unused imports.

To avoid such problems, Tricorder allows configuration only at the project level, ensuring that everyone who changes a given project sees a consistent view of the analysis results for that project. Maintaining a consistent view of results enables several types of analyzers to do the following:

Produce binary results. For example, Tricorder includes an analyzer for protocol buffer definitions that identifies backward-incompatible changes. Developer teams use it to guarantee that serialized protocol buffer data remains readable, but it is an annoyance for teams that do not store data in that form. Another example is an analyzer that suggests using Guava or modern Java idioms, which makes no sense for projects that cannot use those libraries or language features.

Require specific setup or in-code annotations. For example, teams can use the Checker Framework's nullness analysis only if their code is annotated appropriately. Another analyzer, when configured, checks the growth in binary size and function-call count of specific Android binaries and warns developers when the growth is unexpected or a limit is near.

Support domain-specific languages (DSLs) and team-specific coding guidelines. Some Google software development teams have written small checkers for their own DSLs that they want to run. Other teams have codified best practices for readability and maintainability and want those checks enforced.

Be resource-intensive. One example is hybrid analysis that incorporates results from dynamic analysis. Such analyses offer high value to some teams but are too expensive or too slow for everyone.

As of January 2018, there were roughly 70 optional analyses inside Google, and 2500 projects had at least one of them enabled. Dozens of teams across the company are actively developing new analyzers, most of them outside the developer-tools group.

Other workflow integration points

As developers' trust in these tools grew, they asked for further integration into their workflows. Tricorder now also provides analysis results through a command-line tool, as a gate on code submission, and in the code browser.

Command-line support. The Tricorder team added command-line support for developers who effectively act as code janitors, regularly browsing and cleaning up analyzer warnings across their team's codebase. These developers know exactly which kinds of fixes each analyzer generates and have a high degree of trust in specific analyzers, so they can use the command-line tool to automatically apply all the fixes from a given analysis and produce cleanup changes.

Gating code submission. Some teams want specific analyzers to block code submission, not merely appear in the code review tool. Typically, the ability to block submission is requested by a team with a highly customized checker guaranteed to have no false positives, often for a custom DSL or library.

Results in the code browser. The code browser is best suited for showing the scale of a problem across a large project (or the entire codebase). For example, browsing the analysis results about a deprecated API can show how much work a migration will take, and some security and privacy analyses are global, requiring a specialized team to review the results before deciding whether there is a real problem. Because analysis results are not displayed by default, the code browser lets a specific team enable an analysis view, scan the entire codebase, and review the results without distracting other developers. If an analysis result has an associated fix, a developer can apply it with a single click in the code browser. The code browser is also well suited to showing the results of analyses over production data, since that data is not available until the code has been committed and is running.

Complex analysis

All of the static analyses deployed widely at Google are relatively simple, although some teams run interprocedural analyses with project-specific analysis frameworks for particular domains, such as Android apps. Interprocedural analysis at Google scale is technically feasible, but very challenging to implement. As mentioned above, all Google code lives in a single monolithic source repository, so conceptually any code in the repository can be part of any binary; one can therefore imagine a situation where analyzing a particular code review would require analyzing the entire repository. Although Facebook's Infer focuses on interprocedural analysis and scales its separation-logic-based analyzers to codebases of millions of lines, scaling such analyzers to Google's billions of lines would still require major engineering effort. As of January 2018, implementing a more complex analysis system was not a priority for Google:

Large investment. The upfront infrastructure investment would be prohibitive.

Work needed to reduce false positive rates. Analysis teams must develop techniques to significantly lower false positive rates and/or strictly limit which error messages are displayed, as Infer does.

More still to do. Analysis teams still have plenty of "simple" analyzers to implement and integrate.

High upfront costs. We found "simple" analyzers to be cost-effective, which was the core motivation behind FindBugs; by contrast, even determining the ROI of a more complex checker carries a high upfront cost.

Note that this ROI calculus can be very different for developers outside Google who work in specialized domains (such as aerospace and medical devices) or on particular kinds of projects (such as device drivers and mobile apps).

Experience

Our attempts to integrate static analysis into Google's workflow taught us the following valuable lessons:

Finding bugs is easy. When a codebase is large enough, it contains almost every imaginable code pattern. Bugs lurk even in mature codebases with full test coverage and a rigorous code review process. Sometimes a problem is not obvious on local inspection, and sometimes an error is introduced by a refactoring that looks perfectly harmless. For example, consider the following code snippet, which uses a field f of type long:

    result = 31 * result + (int) (f ^ (f >>> 32));

Imagine what happens if a developer changes the type of f to int. The code still compiles, but the right shift by 32 becomes a no-op (in Java, shift distances for int values are taken modulo 32), the field is XORed with itself, and its contribution to the hash becomes the constant 0: f no longer affects the value produced by the hashCode method. Any tool that can compute the type of f can correctly detect a right shift by more than 31 bits; we fixed 31 occurrences of this bug in Google's codebase while enabling the check as a compiler error in Error Prone.
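
Here is a minimal runnable demonstration of that failure mode (our own sketch, not code from Google's codebase):

    // Shows why the hash contribution collapses once f is narrowed from long
    // to int: Java masks int shift distances to their low 5 bits, so
    // (f >>> 32) == (f >>> 0) == f, and f ^ f == 0 for every value of f.
    public class HashShiftDemo {
        static int f = 123456; // was: long f

        public static void main(String[] args) {
            System.out.println(f ^ (f >>> 32)); // always prints 0
        }
    }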

Because finding bugs is easy, Google uses simple tooling to detect bug patterns, and analysis writers then fine-tune the checks based on the results of running them over Google's code.

Most developers will not go out of their way to use static analysis tools. As with many commercial tools, Google's initial FindBugs deployment relied on engineers choosing to visit a central dashboard to see the issues found in their projects, but few actually did. Finding bugs in merged code that may already be deployed and running is too late; the problem may already be affecting users unnoticed. To ensure that static analysis warnings are seen by most or all engineers, the analysis tools must be integrated into the workflow and enabled by default for everyone. Projects such as Error Prone do not provide a bug dashboard; instead they extend the compiler with additional checkers and surface the analysis results during code review.

Developer happiness is crucial. In our experience and in the accumulated literature, many attempts to integrate static analysis into a software development organization fail. Google management does not usually mandate that engineers use static analysis tools; engineers working on static analysis must demonstrate its impact with valid real-world data. For a static analysis project to succeed, developers must feel they benefit from it and enjoy using it.

To build a successful analysis platform, we built tools that deliver high value to developers. The Tricorder team carefully reviews which issues are fixed, runs user research to understand how developers feel, makes it easy to file bugs against analyzers, and uses all this data for continuous improvement. Developers need to build trust in analysis tools: if a tool wastes their time with false positives and feedback on low-priority problems, they will lose confidence and ignore its results.

Do not stop at finding bugs; fix them. A typical way to pitch a static analysis tool is to enumerate a large number of problems in a codebase, with the goal of prompting action by pointing out potential errors or preventing future bugs. But if developers are not motivated to act, that expected outcome never materializes. This is a fundamental flaw: analysis tools measure their usefulness by the number of problems they identify, while process integration fails because very few bugs actually get fixed. By contrast, Google's static analysis teams take responsibility for fixing as well as finding bugs, and treat closed-loop fixes as the criterion for success. Focusing on fixing bugs ensures that tools give actionable advice and minimize false positives. In many cases, fixing bugs automatically is as easy as finding them. Even for hard problems, research over the past five years has demonstrated new techniques for automatically creating fixes for static analysis findings.

Analyzer development requires a collective effort. Although particular static analysis tools require expert developers to write the analyses, those few experts may not actually know which checks will have the most impact. Moreover, analyzer writers are usually not domain experts (in, say, a particular API, language, or security). With the FindBugs integration, only a handful of Googlers knew how to write a new checker, so the small BugBot team had to do all the work itself. That limited the rate at which new checks could be added and prevented the tool from benefiting from others' domain knowledge. Teams like Tricorder now focus on lowering the bar for checks contributed by developers with no prior static analysis experience. For example, the Google tool Refaster lets developers write a checker by providing before-and-after example code snippets. Since contributors are often motivated to contribute after debugging their own broken code, the new checks gradually compound into saved developer time.
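
To show how low that bar is, here is a minimal Refaster rule in the before/after style (a standard introductory example of the technique, not one taken from the article): it rewrites length-based string emptiness checks to String.isEmpty().

    import com.google.errorprone.refaster.annotation.AfterTemplate;
    import com.google.errorprone.refaster.annotation.BeforeTemplate;

    // A Refaster rule is just a class whose annotated methods show the code
    // pattern to match and the replacement to apply.
    public class StringIsEmpty {
        @BeforeTemplate
        boolean before(String s) {
            return s.length() == 0; // pattern to find
        }

        @AfterTemplate
        boolean after(String s) {
            return s.isEmpty(); // replacement
        }
    }

A contributor writes two ordinary Java methods; the tooling handles matching, type checking, and generating the suggested fix.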

Conclusion

Our experience is that integration into the development process is the key to the adoption of static analysis tools. While checker authors may believe developers should be delighted by a list of defects in the code they write, in practice we did not find that such lists motivate developers to fix those defects. As analysis tool developers, we must measure success in terms of defects actually corrected, rather than in numbers reported to developers. This means our responsibility extends far beyond the analysis tool itself.

We advocate a system that pushes workflow integration as early as possible. Wherever possible, enable a checker as a compiler error. To avoid breaking builds, the check's writers first take on the task of fixing all the existing problems in the codebase, which lets us improve the quality of Google's codebase one step at a time. Because we present errors in the compiler, developers deal with them immediately after writing the code, while changes are still cheap to make. To achieve this, we developed infrastructure for running analyses and generating fixes across the huge Google codebase. We also benefit from code review and submission automation that allows changes to hundreds of files, and of course from an engineering culture that usually permits changes to legacy code, because improving the code outweighs the aversion to risk.

Code review is the best entry point for displaying analysis warnings before code is committed. To ensure developers are receptive to analysis results, Tricorder shows problems only while the developer is modifying the code, before the change is committed, and the Tricorder team applies a set of criteria to select which warnings to display. Tricorder further collects statistics in the code review tool, which are used to detect analyzers that generate large numbers of invalid warnings and to find the root causes.

To overcome warning blindness, we worked to regain the trust of Google engineers: Google developers had developed a strong bias toward ignoring static analysis, and any report with an unsatisfactory false positive rate gave them a reason for inaction. An analysis team is very careful to surface a result as an error or warning only after it has been vetted against the objective criteria described above, so developers are rarely flooded, confused, or annoyed by analysis results. Surveys and feedback channels are important quality-control methods in this process. Now that developers have regained confidence in analysis results, the Tricorder team is meeting the demand for involving more analyses at more points in the Google developer workflow.

We have built a successful static analysis infrastructure at Google that prevents hundreds of bugs per day from entering the Google codebase, both at compile time and during code review. We hope others can benefit from our experience and successfully integrate static analysis into their own workflows.

The above is the editor's case analysis of how Google built its static code analysis tools. If you happen to have similar questions, the analysis above may help you understand them; if you want to learn more, you are welcome to follow the industry news channel.
