A preliminary look at open source white-box audit tools for PHP

2025-07-08 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article takes a first look at open source white-box audit tools for PHP and how they work.

In the previous installment, Xiao Lu gave a preliminary introduction to open source PHP white-box audit tools based on text features. This installment takes a first look at open source PHP white-box audit tools based on static analysis.

Tools based on static analysis carry out white-box auditing with traditional static analysis techniques. Common techniques include data flow analysis, taint propagation, and control flow analysis. Static analysis can judge more accurately whether external input has been processed by a security function, which is hard to achieve with text features alone. It is also better at determining whether a variable comes partly or entirely from external input, especially when auditing framework-based code. The approach has its drawbacks as well:

1) A complete analysis takes a long time, much longer than a scan based on text features.

2) The adaptation cost is high. Each kind of target needs a suitable parser to generate the required AST and CFG information, and adapting the detection rules is also a relatively heavy cost.
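To make the contrast with text features concrete, here is a minimal sketch of forward taint propagation over a toy straight-line program. The statement format and the names SOURCES, SANITIZERS, SINKS, and find_tainted_sinks are assumptions made for illustration; none of them come from the tools discussed in this article, and real tools work on an AST and CFG rather than a flat list.

```python
# Toy program representation: each statement is (target, operation, arguments).
# All names below are illustrative, not taken from any real audit tool.
SOURCES = {"$_GET", "$_POST"}
SANITIZERS = {"htmlspecialchars", "intval"}
SINKS = {"echo"}


def find_tainted_sinks(program):
    """Forward taint propagation over straight-line code (no branches)."""
    tainted = set()
    findings = []
    for index, (target, op, args) in enumerate(program):
        arg_tainted = any(a in SOURCES or a in tainted for a in args)
        if op in SINKS:
            if arg_tainted:
                findings.append(index)   # tainted data reaches a sink
        elif op in SANITIZERS:
            tainted.discard(target)      # sanitizer output is treated as clean
        elif arg_tainted:
            tainted.add(target)          # taint flows through the assignment
        else:
            tainted.discard(target)      # overwritten with clean data
    return findings


unsafe = [
    ("$name", "assign", ["$_GET"]),
    ("$msg", "concat", ["'Hello '", "$name"]),
    (None, "echo", ["$msg"]),
]
safe = [
    ("$name", "assign", ["$_GET"]),
    ("$clean", "htmlspecialchars", ["$name"]),
    ("$msg", "concat", ["'Hello '", "$clean"]),
    (None, "echo", ["$msg"]),
]
```

Both programs contain the same source and the same sink, so a purely textual match would treat them alike. The propagation above reports only the first one, because it follows the value through the sanitizer.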

Based on static analysis

Cobra

Github: https://github.com/WhaleShark-Team/cobra

Language: Python2

Last commit on 2019.08.23

Cobra, an open source white-box audit tool dating from 2016, will be familiar to many readers. So far it has received 2.1k stars on GitHub, the most of any open source white-box audit tool in this analysis. Cobra can run audits directly from the command line. It can also start a local web service, so that audits can be launched and reports viewed through the web interface or the API. Cobra supports PHP, Java, Python, and other development languages, recognizes dozens of file types, and can identify web frameworks. The details are listed below:

Development languages: PHP, Java, Python, JSP, C, Ruby, Perl, Lua, Go, Swift, C++, C#, Header, Objective-C, Scala, Ceylon, Kotlin, Shell, Bat, JavaScript, HTML, CSS

File types: Image, Font, Conf, CMake, SQL, Compression, Executable, Log, Text, Office, Media, Certificate, Source, Thumb, Git

Frameworks: WordPress, Joomla, Drupal, CodeIgniter, ThinkPHP, Laravel, Kohana, Yii, Symfony, Phalcon, Slim, CakePHP, Django, Flask, Sprin

The official documentation describes how Cobra works in detail. It is quoted below:

For some obvious features, such as hard-coded passwords and misconfigurations, regular expressions can be matched directly. For OWASP Top 10 vulnerabilities, Cobra compiles a list of dangerous functions in advance and locates every place in the code where one appears. It then parses the corresponding source code into an AST (Abstract Syntax Tree) based on Lex (Lexical Analyzer Generator) and Yacc (Yet Another Compiler-Compiler), and analyzes whether the input parameters of the dangerous function are controllable to decide whether a vulnerability exists (currently only the PHP AST is integrated; AST support for other languages is in progress).

Quoted from http://cobra.feei.cn/
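The first two steps in the quoted description, direct regular-expression matching and locating dangerous functions, can be sketched roughly as below. The patterns and the scan function are assumptions made for illustration and are not Cobra's actual rules or API. The later AST-based controllability check is deliberately left out.

```python
import re

# Illustrative patterns only; these are not Cobra's real rules.
HARDCODED_PASSWORD = re.compile(
    r"""\$(?:password|passwd|pwd)\s*=\s*['"][^'"]+['"]""", re.IGNORECASE
)
DANGEROUS_CALL = re.compile(r"\b(eval|system|exec|shell_exec|passthru)\s*\(")


def scan(source):
    """Return (line number, finding) pairs for one PHP source string."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Step 1: obvious features are matched directly with a regex.
        if HARDCODED_PASSWORD.search(line):
            findings.append((lineno, "hardcoded-password"))
        # Step 2: record every place a dangerous function is called.
        for match in DANGEROUS_CALL.finditer(line):
            findings.append((lineno, "dangerous-call:" + match.group(1)))
    return findings


php = """<?php
$password = 'hunter2';
$cmd = $_GET['cmd'];
system($cmd);
"""
```

Calling scan(php) flags the hard-coded password on line 2 and the system call on line 4. Whether that call is actually exploitable is the question the AST stage is meant to answer.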

Cobra's detection rules fall into two groups: dependency checking rules and code security scanning rules. Dependency checking supports only three languages: Python, Java, and NodeJS. Its main purpose is to check whether the versions of the third-party libraries declared in each language's dependency file (requirements.txt, pom.xml, package.json) meet the required versions. Cobra provides a total of 95 code security scanning rules, broken down by target as follows:

php: 57, java: 8, *: 3, jsp: 2, conf: 2, certificate: 1, source: 1, lua: 1, log: 1, thumb: 1
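The dependency check described above can be pictured with a small sketch for the requirements.txt case. The package names, the MINIMUM_SAFE table, and all function names are hypothetical, and only plain pinned versions of the form name==1.2.3 are handled. Cobra's own rule format is not reproduced here.

```python
def parse_version(text):
    """Turn '1.11.29' into (1, 11, 29); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def parse_requirements(text):
    """Read 'name==version' lines, skipping blanks, comments, and unpinned entries."""
    pinned = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pinned[name.strip().lower()] = parse_version(version.strip())
    return pinned


# Hypothetical rule set: package name -> first version considered acceptable.
MINIMUM_SAFE = {"examplelib": (2, 0, 0), "otherlib": (1, 4)}


def check_dependencies(requirements_text, rules=MINIMUM_SAFE):
    """Return the sorted names of pinned packages older than the rule allows."""
    pinned = parse_requirements(requirements_text)
    return sorted(
        name for name, version in pinned.items()
        if name in rules and version < rules[name]
    )


requirements = """examplelib==1.9.3
otherlib==1.4.0  # pinned
unrelated==0.1
"""
```

Here check_dependencies(requirements) reports only examplelib, since 1.9.3 is below the hypothetical minimum of 2.0.0 while otherlib 1.4.0 satisfies its rule.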

Cobra-W

Github: https://github.com/LoRexxar/Cobra-W

Language: Python3

Last commit on 2020.01.17

Cobra-W is a white-box audit tool based on Cobra 2.0 that aims to improve the accuracy and precision of vulnerability detection. The two tools work on essentially the same principles; the main differences are:

The AST handling is deeply rewritten, which greatly reduces the false positive rate.

Rules are written in a way that makes it easier to express audit ideas at the code level, easier for white hats to use, and easier to extend.

The underlying API is rewritten and supports multiple platforms such as Windows and Linux.

Multi-layer semantic parsing, function backtracking, the secret mechanism, and several other new mechanisms are applied to semantic analysis.

JavaScript semantic analysis is added for scanning JS-related code.

Cobra-W not only changes the format of the detection rules but also simplifies and optimizes the existing ones. It officially provides 17 detection rules covering 12 different vulnerabilities:

Reflected XSS: 2

SSRF: 3

SQLI: 3

RFI: 1

XML injection: 1

RCE: 2

LDAPI: 1

Information Disclosure: 1

URL Redirector Abuse: 1

Variable shadowing: 1

Unserialize vulnerability: 1

phpcs-security-audit

Github: https://github.com/FloeDesignTechnologies/phpcs-security-audit

Language: PHP

Last commit on 2019.08.06

phpcs-security-audit is another white-box audit framework, built on PHP_CodeSniffer (https://github.com/squizlabs/PHP_CodeSniffer). PHP_CodeSniffer is a code style checker that consists mainly of two scripts: 1) phpcs, which detects code that violates a predefined set of coding standards and reports warnings or errors; 2) phpcbf, which automatically corrects some of those violations. Strictly speaking, phpcs-security-audit is not a complete white-box audit framework; it simply provides a set of coding-standard rules that can detect potential security vulnerabilities in code.

The core of phpcs-security-audit's vulnerability detection is the phpcs script. The project predefines coding standards that can detect potential security vulnerabilities, then uses phpcs to find code that violates them and so identify potential vulnerabilities. phpcs itself detects violations by tokenizing PHP, JavaScript, and CSS files.

The detection rules of phpcs-security-audit fall into two categories: 1) general rules, comprising 16 rules in the BadFunctions directory and 2 rules in the Misc directory, for a total of 18 vulnerability detection rules; 2) framework-specific rules for Drupal7, Drupal8, and Symfony2.
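The token-based idea can be illustrated with a rough sketch. The tokenizer below is a simplified stand-in written for this example, not the tokenizer phpcs uses, and the BAD_FUNCTIONS list and function names are assumptions. It shows why working on tokens avoids false hits inside comments and string literals.

```python
import re

# Simplified PHP-like tokenizer for illustration only.
TOKEN = re.compile(r"""
    (?P<comment>//[^\n]*|\#[^\n]*|/\*.*?\*/)
  | (?P<string>'(?:\\.|[^'\\])*'|"(?:\\.|[^"\\])*")
  | (?P<variable>\$[A-Za-z_][A-Za-z0-9_]*)
  | (?P<name>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<other>\S)
""", re.VERBOSE | re.DOTALL)

BAD_FUNCTIONS = {"eval", "system", "exec", "passthru"}  # illustrative list


def tokenize(source):
    """Return (kind, text, line) tuples, dropping comments."""
    tokens = []
    for match in TOKEN.finditer(source):
        kind = match.lastgroup
        if kind != "comment":
            line = source.count("\n", 0, match.start()) + 1
            tokens.append((kind, match.group(), line))
    return tokens


def find_bad_calls(source):
    """Flag a bad function name only when it is a name token followed by '('."""
    tokens = tokenize(source)
    hits = []
    for (kind, text, line), (_, next_text, _) in zip(tokens, tokens[1:]):
        if kind == "name" and text.lower() in BAD_FUNCTIONS and next_text == "(":
            hits.append((line, text))
    return hits


php = """<?php
// system($x) in a comment is ignored
$msg = "call system(1) later";
system($cmd);
"""
```

For this sample, find_bad_calls(php) reports only the real call on line 4. The occurrences in the comment and in the string literal never become name tokens.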

Progpilot

Github: https://github.com/designsecurity/progpilot

Language: PHP

Last commit on 2019.06.02

Progpilot is a white-box audit framework for PHP, written in PHP. It builds the control flow graph (CFG) with PHP-CFG (https://github.com/ircmaxell/php-cfg/) and the abstract syntax tree (AST) with PHP-Parser (https://github.com/nikic/php-parser).

Progpilot performs its vulnerability analysis on top of the CFG and AST. It uses the generated CFG to check whether the execution order of specified functions conforms to predefined rules, and it also checks whether the arguments of specified functions satisfy the rules. Progpilot's detection is driven by the following four kinds of definitions:

Sources: function parameters specified in sources.json are treated as tainted during analysis.

Sinks: sinks.json lists dangerous functions and the types of vulnerability they may cause.

Sanitizers: sanitizers.json specifies the security functions that defend against specific vulnerabilities and how they are used. These functions modify the value of their argument, for example by escaping it.

Validators: validators.json specifies the security functions that defend against specific vulnerabilities and how they are used. These functions do not modify the tainted value.

Progpilot's detection rules are located in package/src/uptodate_data. It provides general detection rules for PHP and JavaScript. For PHP, it also provides customized rules for five frameworks: CodeIgniter, PrestaShop, SuiteCRM, Symfony, and WordPress.
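The interplay of the four definition types can be sketched as follows. The JSON layout, the key names ("attack", "prevents"), and the analyze function are assumptions invented for this example and do not reproduce Progpilot's real file schema. PHP functions are also simplified to a single argument.

```python
import json

# Hypothetical configuration; Progpilot's actual JSON schema is different.
CONFIG = json.loads("""
{
  "sources":    [{"name": "$_GET"}],
  "sinks":      [{"name": "echo", "attack": "xss"},
                 {"name": "mysqli_query", "attack": "sql_injection"}],
  "sanitizers": [{"name": "htmlentities", "prevents": ["xss"]}],
  "validators": [{"name": "ctype_digit", "prevents": ["xss", "sql_injection"]}]
}
""")

ALL_ATTACKS = {s["attack"] for s in CONFIG["sinks"]}
SOURCES = {s["name"] for s in CONFIG["sources"]}
SINKS = {s["name"]: s["attack"] for s in CONFIG["sinks"]}
SANITIZERS = {s["name"]: set(s["prevents"]) for s in CONFIG["sanitizers"]}
VALIDATORS = {v["name"]: set(v["prevents"]) for v in CONFIG["validators"]}


def analyze(program):
    """Track, per variable, which attack types its value can still carry."""
    risk = {}
    findings = []
    for target, op, arg in program:
        incoming = set(ALL_ATTACKS) if arg in SOURCES else set(risk.get(arg, ()))
        if op in SINKS:
            if SINKS[op] in incoming:
                findings.append((op, arg, SINKS[op]))
        elif op in SANITIZERS:
            # A sanitizer returns a modified value, safe only for what it prevents.
            risk[target] = incoming - SANITIZERS[op]
        elif op in VALIDATORS:
            # A validator leaves the value unchanged but vouches for it.
            risk[arg] = incoming - VALIDATORS[op]
        else:
            risk[target] = incoming
    return findings


program = [
    ("$id", "assign", "$_GET"),
    ("$html", "htmlentities", "$id"),
    (None, "echo", "$html"),           # sanitized for xss
    (None, "mysqli_query", "$html"),   # not sanitized for sql_injection
    (None, "ctype_digit", "$id"),      # validation guard
    (None, "mysqli_query", "$id"),     # accepted after validation
]
```

Only the fourth statement is reported. Tying each sanitizer to specific vulnerability types is what lets the analysis notice that HTML escaping does not protect a SQL query.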

Pixy

Github: https://github.com/oliverklee/pixy

Language: Java

Last commit on 2018.01.24

Pixy is a white-box audit framework for PHP, written in Java. The version of Pixy introduced in this article is developed by Oliver Klee; its original author is Nenad Jovanovic. To develop Pixy, Oliver Klee also created another open source project, PhpParser (https://github.com/oliverklee/phpparser), which parses PHP programs in Java.

Pixy parses the source code with PhpParser, generates the corresponding AST and CFG from the parse results, and then performs taint analysis to check whether the parameters of sensitive functions are controllable.

Pixy's detection rules live mainly in the ./config/model_*.ini files. Each ini file can define five types of functions or parameters, such as security functions, dependent functions, and sensitive functions.

RIPS

Github: https://github.com/ripsscanner/rips

Language: PHP

Last commit on 2016.05.22

RIPS is a white-box audit tool that began using syntax analysis to find vulnerabilities long ago, and most readers will be familiar with it. Although it is now a commercial product, and the latest version on SourceForge (https://sourceforge.net/projects/rips-scanner) is only v0.55, updated in 2017, this does not stop us from studying the ideas behind its code. The three RIPS analysis articles by the author "Plum Wine" are well worth reading. Commercial RIPS offers a free trial for anyone interested.

RIPS uses the PHP Zend engine's parser to obtain the token stream of the source code and builds the corresponding AST and CFG from it. On that basis it locates sensitive functions, then traces each sensitive function's parameters backward to check whether they are user-controllable input, which determines whether a vulnerability exists. During this backtracking, RIPS also checks whether the variable has been handled by a security function, to reduce false positives.
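The backtracking idea can be sketched in a few lines. This is a deliberately naive approximation that works on source lines with regular expressions, whereas RIPS works on tokens; the lists and the function names trace_back and audit are assumptions made for illustration.

```python
import re

SOURCES = ("$_GET", "$_POST", "$_COOKIE")
SECURING = ("escapeshellarg", "intval")   # illustrative securing functions
SINK_CALL = re.compile(r"\b(system|exec|eval)\s*\(\s*(\$\w+)")
ASSIGN = re.compile(r"^\s*(\$\w+)\s*=\s*(.+);")


def trace_back(lines, variable, before):
    """Walk upward from line index `before`, following simple assignments."""
    for index in range(before - 1, -1, -1):
        match = ASSIGN.match(lines[index])
        if not match or match.group(1) != variable:
            continue
        expression = match.group(2)
        if any(func + "(" in expression for func in SECURING):
            return "secured"              # handled by a securing function
        if any(src in expression for src in SOURCES):
            return "user-controlled"      # reaches an input source
        inner = re.search(r"\$\w+", expression)
        if inner:
            return trace_back(lines, inner.group(), index)
        return "constant"
    return "unknown"


def audit(source):
    """Locate sink calls, then trace their first argument backward."""
    lines = source.splitlines()
    report = []
    for index, line in enumerate(lines):
        for sink, variable in SINK_CALL.findall(line):
            report.append((index + 1, sink, trace_back(lines, variable, index)))
    return report


php = """<?php
$a = $_GET['cmd'];
$b = $a;
system($b);
$c = escapeshellarg($_GET['arg']);
system($c);
"""
```

For this sample, audit(php) classifies the call on line 4 as user-controlled and the call on line 6 as secured, which mirrors the locate, trace, and check sequence described above.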

RIPS has no separate detection rule files; its rules are built into the configuration files in the config directory:

info.php: defines functions that deserve extra attention

securing.php: defines security handling functions

sink.php: defines sensitive functions and the corresponding security functions

sources.php: defines the input functions and variables that the user may be able to control

token.php: defines the token information related to sensitive functions that needs attention during analysis

After looking at more than a dozen white-box audit tools, my impression is that automated code auditing still has a long way to go. As mentioned at the beginning, the current mainstream PHP white-box audit tools all follow the same approach, which can be broken down into three sub-problems:

Locating sensitive functions

Whether some or all of their parameters come from external input

Whether the external input is processed by a security function along the way

The first problem is essentially solved by current approaches. The second can be solved in most cases by static analysis, but there is still much room for improvement, for example more comprehensive and deeper analysis of the source code, or handling second-order input that passes through a database. The third remains the hardest-hit area: tools mostly just check whether a security function appears somewhere in the propagation chain, without considering bypasses caused by multiple encoding or security functions that can themselves be bypassed. In addition, the principles behind white-box audit tool design have stayed essentially the same in recent years. Whether today's technology allows a better design is an open question. Interestingly, many people in both industry and academia study binaries, but when it comes to the web, the main force is industry practitioners (said tongue in cheek).
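The multiple-encoding bypass mentioned above can be shown with a small analogy in Python, using html.escape and urllib.parse.unquote in place of PHP functions such as htmlspecialchars and urldecode. The handler names and the chain_contains_sanitizer check are hypothetical and only model the "is there a security function in the chain" test.

```python
from html import escape
from urllib.parse import unquote


def handler_sanitize_then_decode(value):
    # The security function runs, but on data that is still URL-encoded.
    return unquote(escape(value))


def handler_decode_then_sanitize(value):
    # Decoding first means the security function sees the real characters.
    return escape(unquote(value))


def chain_contains_sanitizer(call_chain):
    """The presence check many tools stop at: is any sanitizer in the chain?"""
    return "escape" in call_chain


payload = "%3Cscript%3Ealert(1)%3C/script%3E"
```

Both handlers pass the presence check, since escape appears in either call chain. Only the second one neutralizes the payload; the first returns a literal script tag because the decoding happens after the escaping.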
