One of the biggest mistakes you can make in website development is
trusting data that has come from the user. All information that is
passed from the user (e.g. cookies, form data and the query string)
must be checked and verified before it is used inside your PHP script.
If you miss this step you can end up leaving wide security holes in
your program. These holes can be used not only to hack your website,
but also to attack the server the site is hosted on.
Introduction to PHP security
Before I start going into validating the data, we really need to first
look at PHP security in general. There are a few settings we can change
to make our PHP install a lot more secure.
- Register Globals:
This setting (register_globals) has been off by default in all new
installs of PHP since 4.2.0, and was removed entirely in PHP 5.4.0,
but on older versions some people turn it on to support legacy scripts
or applications. It should always be disabled in your main PHP
configuration file. If a specific script or application needs it on,
use a .htaccess file or settings in your Apache configuration file to
enable it for that one folder or domain only.
- URL Opening:
This feature (allow_url_fopen) lets PHP's file functions read from
remote URLs as if they were local files. It is very useful, but it can
be a source of security problems if your PHP scripts have holes in
them. If none of your scripts need it, turn it off, so that a security
hole you have somehow missed cannot be used to pull remote content
onto your server. From PHP 5.2.0 a separate setting, allow_url_include,
controls whether include and require may load remote files; leave that
one off as well.
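As a rough way to audit these settings, a script along the following lines reports which of them are switched on in the PHP you are actually running. The directive names are the php.ini names for the features described above; the helper name `risky_settings` is made up for this sketch, and the list of "on" spellings it recognises is an assumption about how the values may be reported.

```php
<?php
// Hypothetical helper: returns the names of the given directives that are switched on.
function risky_settings(array $directives)
{
    $enabled = array();
    foreach ($directives as $name) {
        // ini_get() returns false when the directive does not exist in this PHP version
        $value = ini_get($name);
        if ($value === false) {
            continue;
        }
        // Boolean directives may be reported as "1", "On", "true" or "yes"
        if (in_array(strtolower(trim($value)), array('1', 'on', 'true', 'yes'), true)) {
            $enabled[] = $name;
        }
    }
    return $enabled;
}

$check = array('register_globals', 'allow_url_fopen', 'allow_url_include');
foreach (risky_settings($check) as $name) {
    echo "Warning: $name is enabled\n";
}
```

Running this once on each server you deploy to is usually enough; it is a check, not a substitute for setting the values in the configuration file.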
Now that we have this covered, let's look at how you should check data that is passed to your PHP script.
Checking data from the user
The way you check the data you receive from the user really depends on
what information you are expecting. PHP includes many functions which
can be used to verify some types of information.
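- is_numeric:
This function is used when you are expecting a number of some type.
It works on both real numbers and numeric strings (i.e. numbers passed
through form data) and simply returns true or false. Be aware that it
also accepts decimals and exponent notation such as 1e5, so add a
stricter check if you need a plain whole number.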
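- strtotime:
When you need to check a date, strtotime is a good first step. If it
can parse the input it returns the timestamp for that date; if not (e.g.
it is unexpected data) it returns false (-1 before PHP 5.1.0). Be
aware that an impossible date like 31st Feb is generally rolled over
into the following month rather than rejected, so follow up with
checkdate when the calendar date itself has to be valid.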
- preg_match:
Regular expressions can be very useful when checking data that is
supposed to conform to a certain pattern (e.g. email addresses, phone
numbers). If you are unfamiliar with regular expressions, work through
a good introduction to them before relying on them for validation.
- is_uploaded_file:
When dealing with files uploaded through forms you must always check
that the file information that has been passed to you is valid. This
function checks this information and returns either true or false. If
you have never worked with file uploads using forms then you must read
this section of the PHP manual: Handling File Uploads. It covers a lot of the basics you need to handle them properly and securely.
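The checks above can be sketched as a handful of small helper functions. Everything named here (`valid_number`, `valid_date`, `valid_phone`, `valid_upload`, the YYYY-MM-DD date format and the phone pattern) is an assumption made for illustration, not something taken from the PHP manual. The date helper uses `checkdate` instead of relying on `strtotime` alone, so that impossible calendar dates are rejected.

```php
<?php
// Values from forms and query strings arrive as strings (or arrays), never as ints.
function valid_number($input)
{
    return is_string($input) && is_numeric($input);
}

// Assumed format: YYYY-MM-DD. The D modifier stops "$" matching before a trailing newline.
function valid_date($input)
{
    if (!is_string($input) || !preg_match('/^(\d{4})-(\d{2})-(\d{2})$/D', $input, $m)) {
        return false;
    }
    // checkdate() takes month, day, year and rejects dates like 31 February
    return checkdate((int) $m[2], (int) $m[3], (int) $m[1]);
}

// Assumed pattern: optional "+", then 7 to 20 digits with spaces or hyphens in between.
function valid_phone($input)
{
    return is_string($input)
        && preg_match('/^\+?[0-9][0-9 \-]{5,18}[0-9]$/D', $input) === 1;
}

// $file is one entry of $_FILES, e.g. $_FILES['avatar'] (the field name is hypothetical).
function valid_upload($file)
{
    return is_array($file)
        && isset($file['error'], $file['tmp_name'])
        && $file['error'] === UPLOAD_ERR_OK
        && is_string($file['tmp_name'])
        && is_uploaded_file($file['tmp_name']);
}

var_dump(valid_number('42'));              // bool(true)
var_dump(valid_date('2010-02-31'));        // bool(false)
var_dump(valid_phone('+44 20 7946 0958')); // bool(true)
```

Each helper checks the type first, because a query string such as ?page[]=x makes PHP hand your script an array where you expected a string.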
Including files based on user input
I have often come across websites which have one PHP file that
includes other files based on information in the query string. This
practice can be very dangerous if not done correctly. If you are just
passing the variable straight into the include function then you are
leaving your server wide open to attack. If you also have URL opening
enabled it is possible for me to run ANY PHP code of my choosing on
your server. All I need to do is include a text file from a remote
server which contains the PHP code I want to execute.
There are a few things you can do to check that the information being
passed is valid. You can either restrict the data to information
contained in an array you have defined, or you can check the file
system and see if the file being requested actually exists.
- Restricting the data can be easily done using the following code:
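$pages = array("info" => "information.htm", "contact" => "contactus.htm");
// Treat a missing or non-string value as invalid instead of raising a notice
$page = (isset($_GET['page']) && is_string($_GET['page'])) ? $_GET['page'] : '';
if (array_key_exists($page, $pages)) {
    // page is valid, safe to do include
    include($pages[$page]);
} else {
    // page is not valid, throw error
}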
- Checking the file system is done using the following code:
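$valid_files = glob('*.htm');
// Treat a missing or non-string value as invalid instead of raising a notice
$page = (isset($_GET['page']) && is_string($_GET['page'])) ? $_GET['page'] : '';
// glob() can return false on error, so make sure we really have a list
if (is_array($valid_files) && in_array($page . ".htm", $valid_files, true)) {
    // page is valid, safe to do include
    include($page . ".htm");
} else {
    // page is not valid, throw error
}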
The first option is the more secure of the two but does require more
maintenance. In the first case we are never actually using the data
from the user in the include function. The data is simply a flag which
is then used to look up the file we want to include for that page.
The second option requires no changes when you add a file to the site,
because it only checks that the requested file actually exists before
including it. This is not as secure, since you are merely restricting
the include to the list of files on the server. If a hacker has found
a way to upload a file, this script can then be used to include and
run it. Generally, though, if the hacker has got this far, you already
have a different security hole in your server that you need to fix.
SQL injection
The final section of this article covers methods of sanitising data
before it is used in SQL statements. Strings (e.g. free-text
information) are very hard to validate, as they are normally meant to
contain all sorts of different characters, including punctuation which
can break SQL statements. The best way to handle any information going
into a database is to filter it in such a way that a malicious attack
is handled the same as a valid request.
The methods used for preventing SQL injection vary a bit depending on
what database you are using. PHP is commonly used with MySQL and so
includes the mysql_real_escape_string function for sanitising data
going into a MySQL database. The PHP manual page includes a "best
practice" example which is what I would recommend most people use.
Note that the original mysql extension, and this function with it, was
deprecated in PHP 5.5.0 and removed in PHP 7.0.0; on current versions
the equivalent is mysqli_real_escape_string. If you are using a
different database I would recommend using a database abstraction
layer like ADOdb. The ADOdb package includes the qstr function, which
escapes the string in different ways depending on the database you are
using.
If you don't like escaping the data before passing it to the database
you can also use prepared statements. Prepared statements are like
normal SQL except that you use placeholders instead of the actual data
within your SQL. When you want to execute the SQL statement you send
it to the database along with the data. Because the database receives
the SQL and the data separately, the data is never interpreted as part
of the SQL, so no manual escaping is needed. An example of using
prepared statements with PostgreSQL can be seen in the PHP manual. The
exact syntax of prepared statements differs between databases.
If you are using MySQL you might be wondering if it supports this
functionality. The original MySQL library included with PHP does not
support prepared statements, but they are supported in the newer
MySQLi library. If you are using other databases with PHP I would
(again) strongly recommend using a database abstraction layer like
ADOdb, which does include the prepare functionality.
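To make the idea of placeholders concrete, here is a minimal sketch. It uses PDO, the database abstraction layer bundled with PHP, rather than ADOdb or the PostgreSQL functions mentioned above, and it assumes the pdo_sqlite driver is available so that it can run against a throwaway in-memory database. The `users` table and the `find_user` helper are invented for the example.

```php
<?php
// Assumption: the pdo_sqlite driver is installed; swap the DSN for your own database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table used only for this demonstration
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec("INSERT INTO users (name) VALUES ('alice')");
$db->exec("INSERT INTO users (name) VALUES ('bob')");

// Hypothetical helper: looks a user up by name using a named placeholder.
function find_user(PDO $db, $name)
{
    // The SQL contains only the placeholder; the value is supplied separately
    $stmt = $db->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute(array(':name' => $name));
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}

// A classic injection attempt is treated as an ordinary (non-matching) name
$attack = "' OR '1'='1";
var_dump(find_user($db, 'alice') !== null); // bool(true)
var_dump(find_user($db, $attack));          // NULL
```

The injection string never becomes part of the SQL text, which is exactly the "malicious attack is handled the same as a valid request" behaviour described earlier.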
Well, that about does it for my take on validating user input in PHP.
If you are new to PHP or website programming in general, read this
article a few times to make sure you take it all in. It is scary to see
the number of sites out there which literally have the door wide open
to attacks simply because they have not checked user input correctly.
Never trust user data!