As part of my site rewrite and migration to MySQL PDO, I am making sure that all input is sanitized before use (which has the side effect of stopping injection attempts, as discussed in the previous post) and that every query uses either prepared statements or whitelisted inputs.
Here’s the sanitizing portion of a crossword solver page. The input is the number of letters in the word plus up to 14 known letters. Each space holds at most one letter or is left empty, and the number can be from 1 to 14. I haven’t had any attacks on my forms yet, so I’ll assume any invalid input is due to fat fingers and make reasonable changes.
<?php
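$MAX_LETTERS = 14;
$letters = array();
// $showError and $calledFileName are assumed to be set earlier in the page.
// Read in the letters and number of letters first so you can repopulate the fields.
// If it’s not a single letter in the letter field, or a valid number in the number field,
// don’t let it into the query.
// Log it in case we get lots of injection attempts.
// $_POST always exists, so test the request method rather than isset($_POST).
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    $submitType = isset($_POST['submitType']) ? $_POST['submitType'] : '';
    if ($submitType != 'Clear') {
        // Get and validate letters
        for ($i = 1; $i <= $MAX_LETTERS; $i++) {
            // Treat a missing or non-string field as empty.
            $raw = isset($_POST['letter' . $i]) ? $_POST['letter' . $i] : '';
            $letters[$i] = is_string($raw) ? str_replace(" ", "", $raw) : '';
            // \z matches only at the very end, so "a\n" is rejected (a plain $ would accept it).
            if (!preg_match('/^[a-zA-Z]\z/', $letters[$i]) && ($letters[$i] !== '')) {
                if ($showError) error_log("Not a letter {$letters[$i]} in $calledFileName");
                $letters[$i] = "";
            }
        }
        // Get and validate the number of letters.
        // Check the raw value: after an (int) cast, is_integer() is always true.
        $rawNum = isset($_POST['num_letters']) ? $_POST['num_letters'] : '';
        if (!is_string($rawNum) || !preg_match('/^-?[0-9]+\z/', trim($rawNum))) {
            if ($showError) error_log("Not a number in $calledFileName");
        }
        $num_letters = is_string($rawNum) ? (int)trim($rawNum) : 0;
        // Flip negatives first so the result is still capped at $MAX_LETTERS.
        if ($num_letters < 0) {
            $num_letters = $num_letters * -1;
            if ($showError) error_log("Negative number in $calledFileName");
        }
        if ($num_letters > $MAX_LETTERS) {
            $num_letters = $MAX_LETTERS;
            if ($showError) error_log("Too many letters in $calledFileName");
        }
    }
}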
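The same checks can be packaged as small functions, which makes them easier to reuse on other pages and to test. This is only a sketch: sanitizeLetter() and sanitizeCount() are hypothetical names that are not part of the page above, and sanitizeCount() assumes that invalid input should simply become 0.

```php
<?php
// Hypothetical helpers (not from the original page) that return a cleaned
// value instead of modifying $letters and $num_letters in place.

/** Return a single ASCII letter, or '' for anything else. */
function sanitizeLetter($raw)
{
    if (!is_string($raw)) {
        return '';
    }
    $value = str_replace(' ', '', $raw);
    // \z anchors at the very end of the string, so "a\n" is rejected.
    return preg_match('/^[a-zA-Z]\z/', $value) === 1 ? $value : '';
}

/** Return a whole number from 0 to $max; invalid input becomes 0 (an assumption). */
function sanitizeCount($raw, $max)
{
    if (!is_string($raw) || preg_match('/^-?[0-9]+\z/', trim($raw)) !== 1) {
        return 0;
    }
    // Flip negatives, then cap at $max.
    $n = abs((int) trim($raw));
    return (int) min($n, $max);
}
```

With these, sanitizeLetter(' q ') gives 'q', sanitizeLetter('ab') gives '', and sanitizeCount('20', 14) gives 14.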
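Note that I don’t use htmlspecialchars or mysql_real_escape_string when getting input because I explicitly allow only letters or numbers when validating the input. I don’t think that they would hurt anything, but they aren’t necessary. (mysql_real_escape_string belongs to the old mysql extension, which was removed in PHP 7, so it isn’t available after a move to PDO anyway.)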
$letters[$i] = htmlspecialchars($_POST['letter' . $i]);
$letters[$i] = mysql_real_escape_string($_POST['letter' . $i]);
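Some of my pages allow more than one letter in each input space. There I anchor the pattern and put a + after the character class, so the whole input has to consist of allowed characters; adding .* to the preg_match instead would accept anything after the first letter. I also allow a wildcard, *, in the web page, so it goes in the character class too, where it doesn’t need escaping.
if (!preg_match('/^[a-zA-Z*]+\z/', $letters[$i]) && ($letters[$i] !== '') ) {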
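One thing to watch with multi-letter patterns is anchoring: an unanchored pattern succeeds as soon as any part of the string fits, so stray characters can ride along. A minimal sketch of an anchored check follows; isLettersOrWildcards() is a hypothetical name, and it assumes that only ASCII letters and * are acceptable.

```php
<?php
// Hypothetical helper: accept one or more letters or '*' wildcards, nothing else.
function isLettersOrWildcards($value)
{
    // ^ and \z force the whole string to match; '*' needs no escape inside [].
    return is_string($value) && preg_match('/^[a-zA-Z*]+\z/', $value) === 1;
}

// For contrast: an unanchored pattern matches any string that merely
// contains a letter or '*', so it does not restrict the input.
function looseMatch($value)
{
    return preg_match('/[a-zA-Z\*].*/', $value) === 1;
}
```

Here isLettersOrWildcards('ab*') is true and isLettersOrWildcards('a;b') is false, while looseMatch('a;b') is true.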
The safest way to sanitize input is to whitelist the query. There are lots of ways to do this. One way is to construct the query based on the input.
switch ($loc) {
    case "I":
    case "i":
        $location = "Initial";
        $searchString = "^{$letters}[A-Z ]*";
        break;
    case "M":
    case "m":
        $location = "Medial";
        $searchString = "[A-Z ]+{$letters}[A-Z ]+";
        break;
    case "F":
    case "f":
        $location = "Final";
        $searchString = "[A-Z ]*{$letters}\$";
        break;
.....
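The user provides a location (initial, medial, or final) and I construct the search string based on their input. No injection is possible through the location field because its value never reaches the search query; only the strings I hard-coded do. The $letters value is interpolated into the search string, so it still has to pass the letter validation first.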
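The same idea can be written as a lookup table that refuses anything it does not recognize. This is a sketch only: buildSearchString() is a hypothetical function, the three entries are the cases shown above (the rest of the switch is not reproduced), and it assumes $letters should contain nothing but ASCII letters.

```php
<?php
// Hypothetical helper, not the original page's code.
// Returns array($location, $searchString), or null if either input is unacceptable.
function buildSearchString($loc, $letters)
{
    // $letters is interpolated into the pattern, so it is validated here too.
    if (!is_string($loc) || !is_string($letters)
        || preg_match('/^[A-Za-z]+\z/', $letters) !== 1) {
        return null;
    }
    // Only these hard-coded templates can ever reach the query.
    $patterns = array(
        'I' => array('Initial', "^{$letters}[A-Z ]*"),
        'M' => array('Medial',  "[A-Z ]+{$letters}[A-Z ]+"),
        'F' => array('Final',   "[A-Z ]*{$letters}\$"),
    );
    $key = strtoupper($loc);
    return isset($patterns[$key]) ? $patterns[$key] : null;
}
```

A null result means the caller should skip the query altogether rather than fall back to a default pattern.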
Another example of whitelisting is to only allow certain values. It works if the list is small and not changed often.
if ($input === 'first value' || $input === 'second value' || $input === 'third value') {
    // do stuff with the database
} else {
    include_once('dieInAFire');
}
You can do something similar by checking whether the input is part of a hard-coded array. The problem with the last two approaches is that they only work well if the list of acceptable values doesn’t change often. They can work to sanitize user input for things like states and provinces, occupation, and taxable status.
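A sketch of the hard-coded array version, using in_array() with strict comparison. The list contents and the isAllowed() name are made up for illustration.

```php
<?php
// Hypothetical list of acceptable values.
$ALLOWED_STATUSES = array('taxable', 'exempt', 'zero-rated');

function isAllowed($input, array $allowed)
{
    // The third argument turns on strict comparison, so values such as
    // 0 or true cannot match a string through type juggling.
    return in_array($input, $allowed, true);
}
```

Only an exact match passes: isAllowed('exempt', $ALLOWED_STATUSES) is true, while 'Exempt', 0, and true are all rejected.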