Talk about some lethal knowledge about code security of PHP

2025-04-20 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

Objectives

This tutorial shows you how to defend against the most common security threats: SQL injection, manipulation of GET and POST variables, buffer overflows, cross-site scripting, data manipulation in browsers, and remote form submission.

Prerequisites

This tutorial is written for PHP developers with at least one year of programming experience. You should understand the syntax and conventions of PHP; these are not explained here. Developers with experience in other languages such as Ruby, Python, and Perl can also benefit from this tutorial, as many of the rules discussed here apply to other languages and environments as well.

Quick introduction to Security

What is the most important part of a Web application? The answer depends on who you ask. Business people need reliability and scalability. The IT support team needs robust, maintainable code. End users need an attractive user interface and high performance when performing tasks. But if you answer "security", everyone will agree that it is important for Web applications.

However, most discussions stop there. Although security is on the project checklist, it is often not considered until just before the project is delivered. A staggering number of Web application projects work this way: developers work for months and add security features only at the very end, just before the application is made available to the public.

The result is often chaos and even rework, because the code has already been tested, unit tested, and integrated into a larger framework before security features are added to it. After security is added, major components may stop working. Integrating security adds extra burden or extra steps to an otherwise smooth (but unsafe) process.

This tutorial provides a good way to integrate security into PHP Web applications. It discusses several general security topics, and then discusses in depth the major security vulnerabilities and how to plug them. After completing this tutorial, you will have a better understanding of security.

Topics include:

SQL injection attacks

Manipulating GET strings

Buffer overflow attacks

Cross-site scripting (XSS)

Data manipulation in browser

Remote form submission

Web Security 101

Before discussing the details of implementing security, it is best to look at Web application security from a high level. This section introduces some basic tenets of security philosophy that you should keep in mind no matter what Web application you are creating. Some of these ideas come from Chris Shiflett (his book on PHP security is a priceless treasure trove), some from Simson Garfinkel (see Resources), and some from years of accumulated knowledge.

Rule 1: never trust external data or input

The first thing to realize about Web application security is that external data should not be trusted. External data includes any data that is not entered directly into the PHP code by the programmer. Any data from any other source, such as GET variables, form POSTs, databases, configuration files, session variables, or cookies, cannot be trusted until steps have been taken to make it safe.

For example, the following data elements can be considered safe because they are set in PHP.

Listing 1. Secure and flawless code

$myUsername = 'tmyer';
$arrayUsers = array('tmyer', 'tom', 'tommy');
define("GREETING", 'hello there' . $myUsername);

However, the following data elements are flawed.

Listing 2. Unsafe, flawed code

$myUsername = $_POST['username']; // tainted!
$arrayUsers = array($myUsername, 'tom', 'tommy'); // tainted!
define("GREETING", 'hello there' . $myUsername); // tainted!

Why is the first variable, $myUsername, tainted? Because it comes directly from the form POST. Users can enter any string in this input field, including malicious commands to delete files or run previously uploaded files. You might ask, "Can't you avoid this danger by using client-side (JavaScript) form validation that accepts only the letters A-Z?" Yes, that is always a good step, but as you'll see later, anyone can download any form to their own machine, modify it, and resubmit whatever they like.

The solution is simple: you must run cleanup code on $_POST['username']. If you don't, you risk contaminating everything else that uses $myUsername, such as arrays or constants.

An easy way to clean up user input is to use regular expressions to process it. In this example, you only want to accept letters. It may also be a good idea to limit strings to a specific number of characters, or to require all letters to be lowercase.

Listing 3. Make user input secure

$myUsername = cleanInput($_POST['username']); // clean!
$arrayUsers = array($myUsername, 'tom', 'tommy'); // clean!
define("GREETING", 'hello there' . $myUsername); // clean!

function cleanInput($input) {
    $clean = strtolower($input);
    $clean = preg_replace("/[^a-z]/", "", $clean);
    $clean = substr($clean, 0, 12);
    return $clean;
}
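To see what a routine like this does to hostile input, here is a minimal, self-contained sketch. It repeats the cleanup function so that it runs on its own; the sample string is a made-up stand-in for $_POST['username'].

```php
<?php
// Sketch: the cleanup routine from the listing, repeated so this runs alone.
function cleanInput($input) {
    $clean = strtolower($input);                    // lowercase only
    $clean = preg_replace("/[^a-z]/", "", $clean);  // drop everything but a-z
    $clean = substr($clean, 0, 12);                 // cap the length at 12
    return $clean;
}

// Hypothetical hostile submission standing in for $_POST['username'].
$submitted = "Tom'; DROP TABLE users";
echo cleanInput($submitted), PHP_EOL; // prints "tomdroptable"
```

The quote, semicolon, and spaces are gone, and the result is capped at 12 characters, so nothing that could change the meaning of a later SQL statement survives.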

Rule 2: disable PHP settings that make security difficult to enforce

You already know that user input cannot be trusted. You should also know not to trust the way PHP is configured on the machine. For example, make sure that register_globals is disabled (the setting was deprecated in PHP 5.3 and removed in PHP 5.4). If register_globals is enabled, you can do careless things, such as using $variable in place of the GET or POST value of the same name. With this setting disabled, PHP forces you to reference the correct variable in the correct namespace. To use a variable from a form POST, you reference $_POST['variable']. That way, this particular variable cannot be mistaken for a cookie, session, or GET variable.

The second setting to check is the error reporting level. During development, you want as much error reporting as possible, but when you deliver the project, you want errors written to a log file rather than displayed on the screen. Why? Because malicious attackers use error messages (such as SQL errors) to guess what the application is doing. This kind of reconnaissance helps an attacker break into the application. To plug this hole, edit the php.ini file, provide an appropriate destination for the error_log entry, and set display_errors to Off.
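If you cannot edit php.ini, the same settings can usually be applied at run time. The following is a minimal sketch, assuming the host allows ini_set() for these directives; the log file location is a placeholder chosen for illustration.

```php
<?php
// Sketch: production-style error handling applied at run time.
// Roughly equivalent php.ini entries: display_errors = Off, log_errors = On.
ini_set('display_errors', '0');   // never show errors to visitors
ini_set('log_errors', '1');       // write them to a log instead

// Hypothetical log location; pick a path outside the Web root on a real server.
ini_set('error_log', sys_get_temp_dir() . '/php_app_errors.log');

error_reporting(E_ALL);           // still record every class of error
```

Setting these in php.ini remains preferable where possible, because errors raised before the script starts running are not covered by ini_set().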

Rule 3: if you can't understand it, you can't protect it

Some developers use unusual syntax or pack statements tightly to produce short but obscure code. This approach may be efficient, but if you don't understand what the code is doing, you can't decide how to protect it.

For example, which of the following two pieces of code do you prefer?

Listing 4. Make the code easy to protect

// obfuscated code
$input = (isset($_POST['username']) ? $_POST['username'] : '');

// unobfuscated code
$input = '';
if (isset($_POST['username'])) {
    $input = $_POST['username'];
} else {
    $input = '';
}

In the second, clearer code snippet, it is easy to see that $input is tainted and needs to be cleaned before it can be handled safely.

Rule 4: defense in depth is a new magic weapon

This tutorial uses examples to show how to secure an online form while also taking the necessary precautions in the PHP code that processes the form. Similarly, even if you use a PHP regular expression to ensure that a GET variable is entirely numeric, you can still take steps to ensure that the SQL query uses escaped user input.

Defense in depth is not just a good idea, it ensures that you don't get into serious trouble.

Now that we've covered the basic rules, let's look at the first threat: SQL injection attacks.

Prevent SQL injection attacks

In a SQL injection attack, the user adds information to a database query by manipulating a form or a GET query string. For example, suppose you have a simple login database. Each record in this database has a username field and a password field. You build a login form so that users can log in.

Listing 5. Simple login form

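<form action="verify.php" method="post">
Username <input type="text" name="user" />
Password <input type="password" name="pw" />
<input type="submit" />
</form>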

This form accepts the user name and password entered by the user and submits the user input to a file named verify.php. In this file, PHP processes the data from the login form, as shown below:

Listing 6. Unsafe PHP form processing code

$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];

$sql = "select count(*) as ctr from users where
    username='" . $username . "' and password='" . $pw . "' limit 1";

$result = mysql_query($sql);

while ($data = mysql_fetch_object($result)) {
    if ($data->ctr == 1) {
        // they're okay to enter the application!
        $okay = 1;
    }
}

if ($okay) {
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
} else {
    header("Location: login.php");
}

This code looks fine, doesn't it? Hundreds (if not thousands) of PHP/MySQL sites around the world use code like it. What's wrong with it? Remember that user input cannot be trusted. None of the information from the user is escaped, which leaves the application vulnerable. Specifically, any type of SQL injection attack is possible.

For example, if the user enters foo as the user name and ' or '1'='1 as the password, PHP actually builds the following string and passes the query to MySQL:

$sql = "select count(*) as ctr from users where
    username='foo' and password='' or '1'='1' limit 1";

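Because '1'='1' is always true, the WHERE clause no longer depends on the password at all, so the query can return a count of 1 without any valid credentials and PHP allows access. By injecting some malicious SQL at the end of the password string, an attacker can masquerade as a legitimate user.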
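You can reproduce the effect without a database. The sketch below only rebuilds the concatenation from the unsafe listing and prints what MySQL would receive; buildUnsafeLoginSql() is a hypothetical name introduced for this illustration.

```php
<?php
// Sketch: rebuild the unsafe concatenation to inspect the resulting SQL text.
// No database is contacted; this only shows the string MySQL would be sent.
function buildUnsafeLoginSql($username, $pw) {
    return "select count(*) as ctr from users where "
         . "username='" . $username . "' and password='" . $pw . "' limit 1";
}

// The password field carries the injected fragment.
$sql = buildUnsafeLoginSql('foo', "' or '1'='1");
echo $sql, PHP_EOL;
```

The attacker's quote closes the password literal early, and everything after it is parsed by MySQL as SQL rather than as data.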

The solution to this problem is to wrap all user input in PHP's built-in mysql_real_escape_string() function. This function escapes characters in a string so that special characters such as apostrophes cannot be passed through for MySQL to act on. (The function belongs to the original mysql extension, which was removed in PHP 7.0; mysqli_real_escape_string() is its counterpart in the mysqli extension.) Listing 7 shows the code with escaping applied.

Listing 7. Secure PHP form processing code

$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];

$sql = "select count(*) as ctr from users where
    username='" . mysql_real_escape_string($username) . "'
    and password='" . mysql_real_escape_string($pw) . "' limit 1";

$result = mysql_query($sql);

while ($data = mysql_fetch_object($result)) {
    if ($data->ctr == 1) {
        // they're okay to enter the application!
        $okay = 1;
    }
}

if ($okay) {
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
} else {
    header("Location: login.php");
}

By wrapping user input in mysql_real_escape_string(), you prevent malicious SQL from being injected through it. If a user tries to pass a malformed password through SQL injection, the following query is sent to the database:

select count(*) as ctr from users where
    username='foo' and password='\' or \'1\'=\'1' limit 1

Nothing in the database matches such a password. With just one simple step, a gaping hole in the Web application has been plugged. The lesson here is that user input used in SQL queries should always be escaped.
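On current PHP versions, the same protection is commonly achieved with prepared statements instead of manual escaping. The sketch below swaps in PDO with bound parameters, which is a different technique from the tutorial's mysql_real_escape_string() wrapper. It assumes the pdo_sqlite driver is available and uses an in-memory SQLite table as a stand-in for the MySQL users table; loginOkay() is a hypothetical helper, and the plaintext password column only mirrors the tutorial's listing.

```php
<?php
// Sketch under assumptions: PDO with the SQLite driver and an in-memory
// database stand in for the tutorial's MySQL "users" table.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("create table users (username text, password text)");
$db->exec("insert into users values ('foo', 'secret')");

// Hypothetical helper: counts matching rows using bound parameters.
function loginOkay(PDO $db, $username, $pw) {
    $stmt = $db->prepare(
        "select count(*) as ctr from users where username = :u and password = :p"
    );
    // Values travel separately from the SQL text, so quotes inside them
    // cannot change the structure of the query.
    $stmt->execute(array(':u' => $username, ':p' => $pw));
    return (int) $stmt->fetchColumn() === 1;
}

var_dump(loginOkay($db, 'foo', 'secret'));       // valid credentials
var_dump(loginOkay($db, 'foo', "' or '1'='1"));  // injection attempt
```

With bound parameters the injected string is compared literally against the password column, so the second call finds no matching row.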

However, several other security vulnerabilities still need to be plugged. The next one is manipulation of GET variables.

Prevent users from manipulating variables

In the previous section, users were prevented from logging in with malformed passwords. If you're smart, you'll apply the method you've learned to ensure that all user input to SQL statements is escaped.

However, the user is now safely logged in. Just because a user has a valid password doesn't mean he will play by the rules: he has plenty of opportunities to cause damage. For example, an application might let users view special content. All the links point to locations such as template.php?pid=33 or template.php?pid=321. The part of the URL after the question mark is called the query string. Because the query string is placed directly in the URL, it is also called the GET query string.

In PHP, if register_globals is disabled, you can access this string with $_GET['pid']. In the template.php page, you might do something similar to Listing 8.

Listing 8. Sample template.php

$pid = $_GET['pid'];

// we create an object of a fictional class Page
$obj = new Page;
$content = $obj->fetchPage($pid);

// and now we have a bunch of PHP that displays the page
// ...

Is there anything wrong here? First, the code implicitly trusts that the GET variable pid from the browser is safe. What can go wrong? Most users aren't savvy enough to construct a semantic attack. However, if they notice pid=33 in the browser's URL location field, they may start to make mischief. If they enter another number, that's probably fine; but what happens if they enter something else, such as a SQL command or the name of a file (such as /etc/passwd), or pull some other prank, such as entering a value 3,000 characters long?

In this case, remember the basic rule: don't trust user input. The application developer knows that the personal identifier (PID) accepted by template.php should be numeric, so you can use PHP's is_numeric() function to ensure that non-numeric PIDs are not accepted, as follows:

Listing 9. Use is_numeric() to restrict the GET variable

$pid = $_GET['pid'];

if (is_numeric($pid)) {
    // we create an object of a fictional class Page
    $obj = new Page;
    $content = $obj->fetchPage($pid);

    // and now we have a bunch of PHP that displays the page
    // ...
} else {
    // didn't pass the is_numeric() test, do something else!
}

This method seems to work, but the following inputs can easily pass the is_numeric() check:

100 (valid)

100.1 (there should be no decimal places)

+123.45e6 (scientific notation: not good)

0xff33669f (hexadecimal: dangerous! PHP 5 accepts it; PHP 7.0 and later no longer treat hexadecimal strings as numeric)
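The difference is easy to observe directly. The following sketch compares is_numeric() with a digits-only regular expression; onlyDigits() is a hypothetical helper name used only for this comparison.

```php
<?php
// Sketch: compare is_numeric() with a digits-only regular expression.
// onlyDigits() is a hypothetical helper name used for illustration.
function onlyDigits($value) {
    // \A and \z anchor the whole string, so nothing may precede or follow the digits.
    return preg_match('/\A[0-9]+\z/', $value) === 1;
}

foreach (array('100', '100.1', '+123.45e6') as $candidate) {
    printf(
        "%-10s is_numeric=%-5s onlyDigits=%s\n",
        $candidate,
        var_export(is_numeric($candidate), true),
        var_export(onlyDigits($candidate), true)
    );
}
```

All three candidates satisfy is_numeric(), but only the first passes the digits-only test, which is the behavior a page identifier needs.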

So what should a security-conscious PHP developer do? Years of experience have shown that the best practice is to use a regular expression to ensure that the entire GET variable is made up of digits, as follows:

Listing 10. Use regular expressions to restrict GET variables

$pid = $_GET['pid'];

if (strlen($pid)) {
    // preg_match() replaces ereg(), which was removed in PHP 7
    if (!preg_match("/^[0-9]+$/", $pid)) {
        // do something appropriate, like maybe logging
        // them out or sending them back to home page
    }
} else {
    // empty $pid, so send them back to the home page
}

// we create an object of a fictional class Page, which is now
// moderately protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);

// and now we have a bunch of PHP that displays the page
// ...

The code uses strlen() to check that the variable has a non-zero length; if so, it uses an all-digit regular expression to ensure that the data element is valid. If the PID contains letters, slashes, periods, or anything resembling hexadecimal, this routine catches it and shuts the page off from further user activity. If you look behind the scenes of the Page class, you'll see that the security-conscious PHP developer has also escaped the user input $pid, thus protecting the fetchPage() method, as shown below:

Listing 11. Escape the fetchPage() method

class Page {
    function fetchPage($pid) {
        // `desc` is a reserved word in MySQL, so it is quoted with backticks
        $sql = "select pid,title,`desc`,kw,content,
            status from page where pid='" .
            mysql_real_escape_string($pid) . "'";
        // etc, etc....
    }
}

You might ask, "Why escape when you've already ensured that the PID is a number?" Because you don't know in how many different contexts and situations the fetchPage() method will be used. It must be protected wherever it is called, and escaping inside the method is what defense in depth means.

What happens if a user tries to enter a very long number, say 1,000 characters long, in an attempt to trigger a buffer overflow attack? The next section discusses this in more detail, but for now you can add another check to make sure that the PID entered has the correct length. You know that the maximum length of the pid field in the database is 5 digits, so you can add the following check.

Listing 12. Use regular expressions and length checks to restrict GET variables

$pid = $_GET['pid'];

if (strlen($pid)) {
    // reject input that is not all digits OR is longer than 5 characters
    // (preg_match() replaces ereg(), which was removed in PHP 7)
    if (!preg_match("/^[0-9]+$/", $pid) || strlen($pid) > 5) {
        // do something appropriate, like maybe logging
        // them out or sending them back to home page
    }
} else {
    // empty $pid, so send them back to the home page
}

// we create an object of a fictional class Page, which is now
// even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);

// and now we have a bunch of PHP that displays the page
// ...

Now no one can cram a 5,000-digit value into your database application, at least not where GET strings are involved. Imagine the attacker gnashing his teeth in frustration while trying to break into your application. And because error reporting is turned off, reconnaissance is that much harder.
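The two checks can also be folded into a single pattern. The sketch below is one way to do that, assuming the same 5-digit limit; isValidPid() is a hypothetical helper, not part of the tutorial's code.

```php
<?php
// Sketch: fold the digits-only test and the 5-digit limit into one pattern.
// isValidPid() is a hypothetical helper, not part of the tutorial's code.
function isValidPid($pid) {
    // 1 to 5 digits and nothing else; \A and \z anchor the entire string.
    return is_string($pid) && preg_match('/\A[0-9]{1,5}\z/', $pid) === 1;
}

$pid = isset($_GET['pid']) ? $_GET['pid'] : '';
if (!isValidPid($pid)) {
    // Reject instead of continuing, for example by redirecting to the home page.
    $pid = null;
}
```

The is_string() guard matters because a query string such as pid[]=1 makes $_GET['pid'] an array rather than a string.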

Buffer overflow attacks

A buffer overflow attack attempts to overflow the memory allocation buffers in a PHP application (or, more precisely, in Apache or the underlying operating system). Keep in mind that you may be writing your Web application in a high-level language like PHP, but ultimately you are calling C (in the case of Apache). Like most low-level languages, C has strict rules for memory allocation.

A buffer overflow attack sends a large amount of data to a buffer, overflowing some of it into adjacent memory buffers, thereby corrupting those buffers or rewriting logic. This can cause a denial of service, corrupt data, or execute malicious code on a remote server.

The only way to prevent buffer overflow attacks is to check the length of all user input. For example, if you have a form element that asks for the user's name, add a maxlength attribute with a value of 40 to the field and check it with substr() on the back end. Listing 13 shows a short example of the form and the PHP code.

Listing 13. Check the length of user input

if ($_POST['submit'] == "go") {
    $name = substr($_POST['name'], 0, 40);
    // continue processing....
}

Why both the maxlength attribute and the substr() check on the back end? Because defense in depth is always good. The browser prevents users from entering overly long strings that PHP or MySQL cannot safely handle (imagine someone trying to enter a name 1,000 characters long), while the back-end PHP check ensures that no one is manipulating the form data remotely or in the browser.
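The back-end half of that check can be tried without a browser. In this sketch a plain array stands in for $_POST and simulates a submission that ignored the maxlength attribute; limitLength() is a hypothetical helper.

```php
<?php
// Sketch: enforce the 40-character limit on the server, whatever the
// browser's maxlength attribute allowed. limitLength() is a hypothetical helper.
function limitLength($value, $max) {
    return substr((string) $value, 0, $max);
}

// Simulated POST data standing in for a tampered form submission.
$post = array('submit' => 'go', 'name' => str_repeat('A', 1000));

if (isset($post['submit']) && $post['submit'] == 'go') {
    $name = limitLength(isset($post['name']) ? $post['name'] : '', 40);
    echo strlen($name), PHP_EOL; // prints 40
}
```

Whatever the client sends, the value that continues through the handler is never longer than the limit the database field expects.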

As you can see, this approach is similar to using strlen() in the previous section to check the length of the GET variable pid. In that example, any input value longer than 5 digits was rejected, but you could just as easily truncate the value to the appropriate length, as shown below:

Listing 14. Change the length of the input GET variable

$pid = $_GET['pid'];

if (strlen($pid)) {
    // preg_match() replaces ereg(), which was removed in PHP 7
    if (!preg_match("/^[0-9]+$/", $pid)) {
        // if non numeric $pid, send them back to home page
    }
} else {
    // empty $pid, so send them back to the home page
}

// we have a numeric pid, but it may be too long, so let's check
if (strlen($pid) > 5) {
    $pid = substr($pid, 0, 5);
}

// we create an object of a fictional class Page, which is now
// even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);

// and now we have a bunch of PHP that displays the page
// ...

Note that buffer overflow attacks are not limited to long strings of digits or letters. You may also see long hexadecimal strings (which often look like \xA3 or \xFF). Remember that the goal of any buffer overflow attack is to flood a specific buffer and place malicious code or instructions into the next buffer, thereby corrupting data or executing malicious code. The easiest way to deal with hexadecimal buffer overflows is, again, not to allow input to exceed a certain length.

If you are dealing with a form text area that allows longer entries in the database, you cannot easily limit the length of the data on the client side. Once the data reaches PHP, you can use a regular expression to clean out any hex-like strings.

Listing 15. Prevent hexadecimal strings

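<?php
if ($_POST['submit'] == "go") {
    $name = substr($_POST['name'], 0, 40);
    // clean out any potential hexadecimal characters
    $name = cleanHex($name);
    // continue processing....
}

function cleanHex($input) {
    // match a literal backslash, then x or X, then 1 to 3 hex digits
    $clean = preg_replace('!\\\\[xX]([A-Fa-f0-9]{1,3})!', '', $input);
    return $clean;
}
?>
<form action="" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40" /></p>
<p><input type="submit" name="submit" value="go" /></p>
</form>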

You may find this series of operations a little too strict. After all, hexadecimal strings have legitimate uses, such as outputting characters in a foreign language. How you deploy the hexadecimal regex is up to you. A better strategy is to delete hexadecimal strings only when there are too many of them in a row, or when the string exceeds a certain number of characters, such as 128 or 255.
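That more lenient strategy could look something like the sketch below. It is only one interpretation: the function name cleanHexIfExcessive() and both thresholds (more than 4 sequences in a row, or more than 255 characters overall) are assumptions chosen for illustration.

```php
<?php
// Sketch of the "only when there are too many" strategy.
// cleanHexIfExcessive() and both default thresholds are hypothetical choices.
function cleanHexIfExcessive($input, $maxRun = 4, $maxLength = 255) {
    // One hex-like sequence: a literal backslash, x or X, then 1 to 3 hex digits.
    $one = '\\\\[xX][A-Fa-f0-9]{1,3}';

    // Overly long input: strip every hex-like sequence.
    if (strlen($input) > $maxLength) {
        return preg_replace('!' . $one . '!', '', $input);
    }

    // Otherwise strip only runs of more than $maxRun consecutive sequences.
    $tooMany = $maxRun + 1;
    return preg_replace('!(?:' . $one . '){' . $tooMany . ',}!', '', $input);
}

echo cleanHexIfExcessive('caf\xE9'), PHP_EOL;                          // short, single sequence
echo cleanHexIfExcessive('a' . str_repeat('\x41', 6) . 'z'), PHP_EOL;  // run of six sequences
```

A short string with a single sequence passes through untouched, while a run of six consecutive sequences is removed as a block.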

Cross-site scripting attacks

In a cross-site scripting (XSS) attack, a malicious user typically enters information in a form (or through some other user input) that inserts malicious client-side markup into the process or the database. For example, suppose the site has a simple guestbook program that lets visitors leave their names, e-mail addresses, and short messages. A malicious user can take this opportunity to insert something other than a short message, such as JavaScript that is inappropriate for other users, redirects them to another site, or steals cookie information.

Fortunately, PHP provides the strip_tags() function, which removes HTML tags from a string (the tags themselves are stripped; the text between an opening and a closing tag is kept). The strip_tags() function also lets you supply a list of allowed tags that it should leave in place.

Listing 16 shows an example that builds on the previous example.

Listing 16. Clear the HTML tag from user input

<?php
if ($_POST['submit'] == "go") {
    // strip_tags
    $name = strip_tags($_POST['name']);
    $name = substr($name, 0, 40);
    // clean out any potential hexadecimal characters
    $name = cleanHex($name);
    // continue processing....
}

function cleanHex($input) {
    // match a literal backslash, then x or X, then 1 to 3 hex digits
    $clean = preg_replace('!\\\\[xX]([A-Fa-f0-9]{1,3})!', '', $input);
    return $clean;
}
?>
<form action="" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40" /></p>
<p><input type="submit" name="submit" value="go" /></p>
</form>

From a security perspective, using strip_tags() on public user input is a must. If the form is in a protected area, such as a content management system, and you trust users to perform their tasks correctly, such as creating HTML content for a Web site, then using strip_tags() may be unnecessary and can hurt productivity.

One more point: if you accept user input, such as comments on posts or guestbook entries, and need to display that input to other users, be sure to pass the output through PHP's htmlspecialchars() function. This function converts ampersands and the < and > symbols to HTML entities. For example, the ampersand (&) becomes &amp;. That way, even if malicious content slips past strip_tags() on the way in, htmlspecialchars() neutralizes it on the way out.
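The two functions can be seen working together in a few lines. This is a minimal sketch; the sample message and the allowed-tag list passed to strip_tags() are illustrative assumptions.

```php
<?php
// Sketch: clean a guestbook-style message on input and escape it on output.
// The sample message and the allowed-tag list are illustrative assumptions.
$raw = '<b>Hi</b><script>alert(1)</script> & bye';

// Input side: drop all tags except <b>; the text between tags is kept.
$stored = strip_tags($raw, '<b>');

// Output side: convert &, <, > and quotes to entities before echoing.
$shown = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

echo $stored, PHP_EOL; // <b>Hi</b>alert(1) & bye
echo $shown, PHP_EOL;  // &lt;b&gt;Hi&lt;/b&gt;alert(1) &amp; bye
```

Note that strip_tags() removes the script tags but leaves their text content behind, which is one more reason to escape on output as well.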

Data manipulation in browser

There is a type of browser plug-in that allows users to tamper with header and form elements on a page. With Tamper Data, a Mozilla plug-in, you can easily manipulate a simple form that contains many hidden text fields to send instructions to PHP and MySQL.

Before the user clicks Submit on the form, he can start Tamper Data. When he submits the form, he sees a list of the form's data fields. Tamper Data allows the user to tamper with this data, and then the browser completes the form submission.

Let's go back to the example we established earlier. The string length has been checked, the HTML tag has been cleared, and hexadecimal characters have been deleted. However, some hidden text fields have been added, as follows:

Listing 17. Hidden variable

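<?php
if ($_POST['submit'] == "go") {
    // strip_tags
    $name = strip_tags($_POST['name']);
    $name = substr($name, 0, 40);
    // clean out any potential hexadecimal characters
    $name = cleanHex($name);
    // continue processing....
}

function cleanHex($input) {
    // match a literal backslash, then x or X, then 1 to 3 hex digits
    $clean = preg_replace('!\\\\[xX]([A-Fa-f0-9]{1,3})!', '', $input);
    return $clean;
}
?>
<form action="" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40" /></p>
<input type="hidden" name="table" value="users" />
<input type="hidden" name="action" value="create" />
<p><input type="submit" name="submit" value="go" /></p>
</form>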

Notice that one of the hidden variables exposes the table name: users. You will also see an action field with a value of create. With basic SQL experience, you can see that these commands may control a SQL engine in the middleware. People who want to wreak havoc can simply change the name of the table or provide another option, such as delete.

Figure 1 illustrates the scope of damage that Tamper Data makes possible. Note that Tamper Data gives users access not only to form data elements, but also to HTTP headers and cookies.

The easiest way to defend against this tool is to assume that any user may use Tamper Data (or similar tools). Provide only the minimum amount of information the system needs to process the form and submit the form to some dedicated logic. For example, the registration form should be submitted only to the registration logic.

What if you have built a general-purpose form handler and many pages use this shared logic? What if you use hidden variables to control the flow? For example, you might specify in a hidden form variable which database table to write to or which file repository to use. There are four options:

Change nothing, and secretly pray that there are no malicious users on the system.

Rewrite the functionality to use more secure, dedicated form handlers and avoid hidden form variables.

Use md5() or another hashing mechanism to obscure table names or other sensitive information in hidden form variables. Because md5() is a one-way hash, the value cannot be decrypted; on the PHP side, compare it against the hashes of the values you expect.

Obscure the meaning of the values by using abbreviations or nicknames, and convert them back in the PHP form handler. For example, if you want to reference the users table, you can refer to it as u or as any string (such as u8y90x0jkL).

The last two options are not perfect, but they are much better than letting users easily guess the middleware logic or data model.
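The last two options can be combined. The sketch below maps opaque aliases to real table names on the server and signs each alias so that tampering is detectable. It swaps in HMAC-SHA256 via hash_hmac() rather than md5(); the alias strings, the secret key, and the function names are all hypothetical.

```php
<?php
// Sketch: resolve an opaque alias from a hidden field to a real table name,
// and verify a signature so a changed alias is rejected.
// The secret, the aliases, and the function names are hypothetical.
const FORM_SECRET = 'replace-with-a-long-random-secret';

$tableAliases = array('u8y90x0jkL' => 'users', 'p3k17qz' => 'pages');

// Signature to embed in a second hidden field next to the alias.
function signValue($value) {
    return hash_hmac('sha256', $value, FORM_SECRET);
}

function resolveTable(array $aliases, $alias, $signature) {
    if (!is_string($alias) || !is_string($signature)) {
        return null;
    }
    // Constant-time comparison of the expected and submitted signatures.
    if (!hash_equals(signValue($alias), $signature)) {
        return null;
    }
    // Only whitelisted aliases ever map to a table name.
    return isset($aliases[$alias]) ? $aliases[$alias] : null;
}
```

A handler would call resolveTable() with the two submitted hidden values and stop processing when it returns null, so the real table name never appears in the page source.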

What are the remaining problems now? Remote form submission.

Remote form submission

The advantage of the Web is that you can share information and services. The downside is also that you can share information and services, because some people act without scruples.

Take a form as an example. Anyone can visit a Web site and use File > Save As in the browser to make a local copy of the form. He can then modify the action parameter to point to a fully qualified URL (not to formHandler.php but to http://www.yoursite.com/formHandler.php, because the form is on that site), make any changes he wants, click Submit, and the server receives the form data as if it were legitimate traffic.

You might first consider checking $_SERVER['HTTP_REFERER'] to determine whether the request came from your own server. This blocks most malicious users, but not the most sophisticated attackers. Those people are smart enough to tamper with the referrer information in the header so that a remote copy of the form looks like it was submitted from your server.

A better way to handle remote form submission is to generate a token based on a unique string or timestamp and place the token in both a session variable and the form. After the form is submitted, check whether the two tokens match. If they don't match, you know that someone is trying to send data from a remote copy of the form.

To create a random token, you can use PHP's built-in md5(), uniqid(), and rand() functions, as follows:

Listing 18. Defend against remote form submission

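<?php
session_start();

if ($_POST['submit'] == "go") {
    // check token
    if ($_POST['token'] == $_SESSION['token']) {
        // strip_tags
        $name = strip_tags($_POST['name']);
        $name = substr($name, 0, 40);
        // clean out any potential hexadecimal characters
        $name = cleanHex($name);
        // continue processing....
    } else {
        // stop all processing! Remote form posting attempt!
    }
}

$token = md5(uniqid(rand(), true));
$_SESSION['token'] = $token;

function cleanHex($input) {
    // match a literal backslash, then x or X, then 1 to 3 hex digits
    $clean = preg_replace('!\\\\[xX]([A-Fa-f0-9]{1,3})!', '', $input);
    return $clean;
}
?>
<form action="" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40" /></p>
<input type="hidden" name="token" value="<?php echo $token; ?>" />
<p><input type="submit" name="submit" value="go" /></p>
</form>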

This technique works because session data cannot be migrated between servers in PHP. Even if someone obtains your PHP source code, moves it to their own server, and submits information to your server, your server sees only an empty or malformed session token alongside the form token originally provided. Because the two don't match, the remote form submission fails.
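On PHP 7 and later, the same idea can be expressed with random_bytes() for the token and hash_equals() for the comparison. The sketch below is a variant under that assumption, not the tutorial's original code; the function names are hypothetical, and the session is passed in as an array so the logic can be exercised outside a Web request.

```php
<?php
// Sketch: the token check using random_bytes() (PHP 7+) and hash_equals().
// Function names are hypothetical; pass $_SESSION as $session in a real page.
function issueFormToken(array &$session) {
    $session['token'] = bin2hex(random_bytes(16)); // 32 hex characters
    return $session['token'];
}

function isFormTokenValid(array $session, $submitted) {
    return isset($session['token'])
        && is_string($submitted)
        && hash_equals($session['token'], $submitted);
}

$session = array();                 // stand-in for $_SESSION
$token = issueFormToken($session);  // value for the hidden form field

var_dump(isFormTokenValid($session, $token));   // token from our own form
var_dump(isFormTokenValid($session, 'forged')); // token from a remote copy
```

In a page, the returned token goes into the hidden form field, and isFormTokenValid() is called with $_SESSION and $_POST['token'] before any other processing.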

Concluding remarks

This tutorial discusses a number of issues:

Use mysql_real_escape_string() to prevent SQL injection problems.

Use regular expressions and strlen() to ensure that GET data has not been tampered with.

Use regular expressions and strlen() to ensure that user-submitted data does not overflow memory buffers.

Use strip_tags() and htmlspecialchars() to prevent users from submitting potentially harmful HTML tags.

Prevent your system from being breached by tools like Tamper Data.

Use a unique token to prevent users from submitting forms remotely to the server.

This tutorial does not cover more advanced topics, such as file injection, HTTP header spoofing, and other vulnerabilities. However, what you have learned can help you immediately add enough security to make your current project more secure.
