Sanitizing Database Query Input

As part of my site rewrite and migration to MySQL PDO I am making sure that all of the input is sanitized before using—which has the side effect of stopping injection attempts, as discussed in the previous post—and either using prepared statements or whitelisted inputs.

Here’s the sanitizing portion of a crossword solver page. The input is the number of letters in the word and up to 14 letters. There should only be one letter in each space, the space can be empty, and the numbers can be from 1 to 14. I haven’t had any attacks on my forms yet, so I’ll assume any invalid input is due to fat fingers and make reasonable changes.


<?php
$MAX_LETTERS = 14;
$letters = array();

// Read in the letters and number of letters first so you can repopulate the fields.
// If it’s not a single letter in the letter field, or a valid number in the number field,
// don’t let it into the query.
// Log it in case we get lots of injection attempts.
if(isset($_POST)) {
    $submitType = $_POST['submitType'];

    if ($submitType != 'Clear') {
        // Get and validate letters
        for($i = 1; $i <= $MAX_LETTERS; $i++) {
            $letters[$i] = $_POST['letter' . $i];
            $letters[$i] = str_replace(" ","",$letters[$i]);

            if (!preg_match("/^[a-zA-Z]$/",$letters[$i]) && ($letters[$i] <> '') ) {
                if ($showError) error_log("Not a letter {$letters[$i]} in $calledFileName");
                $letters[$i] = "";
            }
        }
        // Get and validate the number of letters
        $num_letters = (integer)$_POST['num_letters'];
        if (!is_integer($num_letters) ) {
            if ($showError) error_log("Not a number in $calledFileName");
        }
        $num_letters = (integer)$num_letters;
        if ( $num_letters > $MAX_LETTERS ) {
            $num_letters = $MAX_LETTERS;
            if ($showError) error_log("Too many numbers in $calledFileName");
        } else if ($num_letters < 0 ) {
            $num_letters = $num_letters * -1;
            if ($showError) error_log("Negative number in $calledFileName");
        }
    }
}

Note that I don’t use htmlspecialchars or mysql_real_escape_string when getting input because I explicitly allow only letters or numbers when validating the output. I don’t think that they would hurt anything, but they aren’t necessary.


$letters[$i] = htmlspecialchars($_POST['letter' . $i]);
$letters[$i] = mysql_real_escape_string($_POST['letter' . $i]);

Some of my pages allow more than one letter in each input space. I just add .* to the pre_match to allow more than one letter. I also allow a wildcard, *, in the web page so I need to escape it in the pre_match.


if (!preg_match("/[a-zA-Z\*].*/",$letters[$i]) && ($letters[$i] <> '') ) {

The safest way to sanitize input is to whitelist the query. There are lots of ways to do this. One way is to construct the query based on the input.


switch ($loc) {
    case "I":
    case "i":
        $location = "Initial";
        $searchString = "^{$letters}[A-Z ]*";
        break;
    case "M":
    case "m":
        $location = "Medial";
        $searchString = "[A-Z ]+{$letters}[A-Z ]+";
        break;
    case "F":
    case "f":
        $location = "Final";
        $searchString = "[A-Z ]*{$letters}\$";
        break;
.....

The user provides a location—initial, medial, or final—and I construct the search string based on their input. No injection is possible because the user input is not seen by the search query.

Another example of whitelisting is to only allow certain values. It works if the list is small and not changed often.


if ($input == 'first value' || $input == 'second value' || $input == 'third value') {
    -- do stuff with the database
} else {
   include_once('dieInAFire');
}

You can do something similar by checking whether the input is part of a hard-coded array. The problem with the last two approaches is that they only work well if the list of acceptable values doesn’t change often. They can work to sanitize user input for things like states and provinces, occupation, and taxable status.

More Thoughts on SQL Injection Attacks

As I discussed in an earlier post, my sites are frequently attacked by malicious actors trying to get into my database by using SQL injection techniques. I stopped most of them by doing input validation and if the input fails validation, returning a 404 not found. I didn’t do this on all of my pages so from time to time they find a page that does”t fail with bad input and send thousands of page requests to it.

I recently updated my sites and took the opportunity to familiarize myself with the latest best practices for mitigating attacks. I added a line to my database initialization script, sanitize all inputs, and use either prepared statements or queries that use whitelisted terms. Here’s my current initialization script.


<?php

$user = 'mysql_user';
$pass = 'supersecurepasswors';
$host = 'localhost';
$db_name = 'website_database';

try {
    $dbWG = new PDO("mysql:host=$host;db_name=$dbname", $user, $pass, array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8") );
    $dbWG->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION );
    $dbWG->setAttribute(PDO::ATTR_DEFAULT_FETCH_MODE, PDO::FETCH_ASSOC);
    $dbWG->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
}
catch(PDOException $e) {
    $e->getMessage();

The only way I found to get attacks to stop is to return a 404 not found message to the bot. After a few tries they usually give up. Otherwise they will try thousands of times.

So first I sanitize the input. Here’s a simple one where the input has to be a digit:


$number_of_letters = (isset($_GET['n']) ? $_GET['n'] : 1);
if ( !is_numeric($number_of_letters) ) {
    include_once('dieInAFire.inc');
}

I have a $showError variable in my header that lets me turn off and on information on attacks. I usually leave it on and peruse the error log to make sure legitimate requests aren’t accidentally blocked. I wrote this include file to use after sanitizing inputs fails.


<?php
$url = isset($_SERVER['REQUEST_URI'])  ? $_SERVER['REQUEST_URI']  : '';
$ref = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$ip  = isset($_SERVER['REMOTE_ADDR'])  ? $_SERVER['REMOTE_ADDR']  : '';

if ($showError == TRUE) {
    error_log("Long string in $calledFileName: URL is $url and IP is $ip & ref is $ref");
}
header("HTTP/1.0 404 Not Found");
die();
?>

Note that this has to be called before the statement.

If it is a legitimate request, I make then use a prepared statement to execute it.


$qry.=  "HAVING COUNT(sorted_letters) > :number_of_letters ";

$stmt = $dbWG->prepare($qry);
$stmt->bindParam(':number_of_letters', $number_of_letters);
$stmt->execute();

PHP Warning

I’ve been getting this warning when I open PhpMyAdmin and since it doesn’t affect anything, I haven’t bothered to fix it.


PHP Warning:  is_readable(): open_basedir restriction in effect. File(js/codemirror/lib/codemirror.js) is not within the allowed path(s): (/usr/share/phpmyadmin/:/etc/phpmyadmin/:/var/lib/phpmyadmin/) in /usr/share/phpmyadmin/js/get_scripts.js.php on line 31

Today I had some time, so I looked at what might be causing it. By grepping on part of the error message,


/usr/share/phpmyadmin/:/etc/phpmyadmin/

I found four files with that text, and more importantly, it was prefixed with
‘php_admin_value open_basedir’.
They were:

./apache2/conf.d/phpmyadmin.conf
./apache2/conf-enabled/phpmyadmin.conf
./apache2/sites-available/wellgolly.com
./apache2/sites-enabled/wellgolly.com

I tried adding ‘/usr/share/javascript/’ to the first one and restarting Apache, but that did’t work. Then I added it to the sites-available file and the did the trick.

Converting my site to https

I already use https with an SSL certificate for my orders, but given the rumblings from Google, I thought it would be nice to make sure the whole site uses secure links and definitely uses https links when I am connecting to pages where I collect customer input. Although, as we’ll see below, since I redirect the whole site to https, it doesn’t matter if I explicitly use https links to pages that receive customer data.

The first thing I did was to check for places in the text where I have hard coded a link to the site. These are mostly from manuals for apps that I copied from the app. A few are from links to manuals that were copied from the URL field and not cleaned up. All I need to do is remove the website portion and they will resolve as https links. There are also a few from redirects and they just need an ‘s’ added. The code I used is:


grep -r --exclude=\*.{png,jpg,pdf,pdf,odt,ods,svg,exe,dmg,psd, indd} "http://www.wellgolly"

Note that I have a bunch of pdf files and graphics files as well as the master LibreOffice and Photoshop files that I don’t want to search. Using this list excludes them for me. Your list may be different.

There are still a bunch of links that are not https and to see what they are, I changed the search slightly to print the file name and the part of the line that matches the search term. The option ‘o’ shows the part of the line that matches the search pattern. Not particularly useful if the pattern is ‘http:’ but more informative if the pattern is ‘http:.*’. There are lots of these, probably not worth changing now, but something to keep in mind when updating that portion of the site.

After I did this I noticed that the -I option excludes binary files. This is much shorter.


grep -roI "http:.*"

One more step and you are done. It took a while to find this and I should have looked on the Apache site first, but I didn’t. The preferred way to redirect your site is not with mod_rewrite, which is what most of the pages that turn up in a search suggest, but with a redirect directive inside your virtual host. Since I only have one website of many that I am converting, this solution works for me. I simply commented out my Document Root line and added a permanent redirect to the SSL section.


#DocumentRoot /www/WG/wellgolly
Redirect permanent / https://www.wellgolly.com/

Restart Apache and it works.

You can test your server at SSL Labs:


https://www.ssllabs.com/ssltest/analyze.html?d=www.wellgolly.com

Once you have it up and running, open all of your pages and see if you get the secure lock at the top of the page. If you missed any links to content that is displayed on your page—images, Facebook like buttons, etc. then it won’t be locked. I like to use Chrome for testing since the lock is bright green. Links to things outside of your site can still be http.

Once thing that I was concerned about was that the links to Yahoo’s Pure CSS files are not https. It doesn’t affect the browser’s reporting the the page is secure but it won’t load the styles either. I copied them to my server and now they load. I don’t know if it is a Safari security feature or if it is from Ghostery.

A couple of things to note. My SSL certificate only covers the www portion of the domain. So things like beta.wellgolly.com will not redirect to https. This change only affects the website so if you are adding SSL to your site, you might want to change your mail delivery as well to use the certificate.

Write to a file from PHP

I probably covered this before, but I want to keep some log files of what is happening on various pages on a site. I could write to the php error log, but I look at those every morning and then erase them. I want something that sticks around but I don’t need a whole lot.

This solution works for me.


 $fname = "./interesting_stuff.log";
 $info = "Message";
 
 $fp = fopen($fname, 'w');
 fwrite ($fp, "$info" . "\r");
 fclose($fp);

You’ll need to create the file and set the permissions so everyone can write to it before it will work.