Backups

Backup

It’s amazing how many files you can create when developing software. I had backups scattered over several disks and CDs and recently cleaned everything up. This is the backup of my backup.

I upgraded to a Mac Mini and just use my laptop for accounting and my database since the programs are too old to run on Mountain Lion. I got a little scare though when my laptop wouldn’t boot up. Turns out the battery was so far gone that it wouldn’t even hold enough charge to start the computer. Since I wasn’t using the computer for daily work, I got out of the habit of backing it up so the last backup was a couple of months old. I picked up a cheap ($12) 10GB USB stick for backup. Then I decided to make a second copy of the data on my computer. This one goes to the office so I’ll have a recent backup in case something happens to the computer and the backup in the house.

I also have a git repository for the apps that I’m writing and it gets updated every day. Worst case there is that I lose a couple of hours of work.

Checking the logs

We’ve been getting hit with SSH login attempts. Sometimes there were thousands per minute and they slowed the machine to a crawl. So we installed fail2ban and that has slowed the attempts considerably.

Recently one site has been hit with huge numbers of SQL injection attacks, as many as 18,000 per day. Right now, I trap them and return a static page.

Here’s what my URL looks like:

/products/product.php?id=1
This is what an attack looks like:


/products/product.php?id=-3000%27%20IN%20BOOLEAN%20MODE%29%20UNION%20ALL%20SELECT%2035%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C%27qopjq%27%7C%7C%27ijiJvkyBhO%27%7C%7C%27qhwnq%27%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35%2C35--%20

I know for sure that this isn’t just a bad link or fat-fingered typing so I don’t want to send them to an overview page. I also don’t want to use any resources on my site delivering static pages.

First I get the productID, then check to see if it is a number. If it is, all is good and I skip the rest of this code. If not, the URL might contain extra spaces from copying and pasting, so I give the visitor the benefit of the doubt and strip them out. If productID is still not a number, I send a page-not-found response and kill the rest of the page load.


$productID = (isset($_GET['id']) ? mysql_real_escape_string($_GET['id']) : '55');

// Attempts have been made to exploit the database with long strings. 
// This stops it without filling up the error log.
if ( !is_numeric($productID) ) {
    $url = $_SERVER['REQUEST_URI'];
    // HTTP_REFERER isn't always set, so guard against an undefined index notice.
    $ref = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
    $ip  = $_SERVER['REMOTE_ADDR'];
    error_log("long string in products.php: URL is $url and IP is $ip & ref is $ref");
    // Copy-and-pasted URLs sometimes pick up whitespace, so strip it and try again.
    $productID = preg_replace('/\s+/', '', $productID);
    if ( !is_numeric($productID) ) {
        error_log("Still a long string in products.php after replacement: URL is $url and IP is $ip & ref is $ref");
        header("HTTP/1.0 404 Not Found");
        die();
    }
}

The bot thinks that there isn’t a page there and usually goes away. Sometimes it tries a few more times, but not the thousands of times it used to.

Command line tips and tricks

While cleaning up my server recently I found that stringing together several commands with pipes made it easier to check logs, find defaults, and remember Linux commands.

history
I don’t do a whole lot of things from the command line, so when I want to do something that I’ve done recently, I just use the up arrow key to find the command from the last time I used it. If I’ve done a lot of work on the command line, the history command will save some scrolling. Here’s the tail end of a recent history command.


  499  exit
  500  history
  501  sudo grep sshd:session /var/log/messages 
  502  sudo tail -n 1000 /var/log/php/error.log
  503  sudo grep Authentication /var/log/messages | wc -l
  504  history

If I want to check the error log I can just up arrow a couple of times and then hit return. Or I could type !502 and hit return to rerun command number 502 from the history.

grep and pipes
I can never remember all the options for tarring up a file, so I almost always find the last time I used it and use the same command again. But it was a while ago and searching through 500 lines of history isn’t particularly efficient. That’s where grep and pipes come in.

history | grep tar

history normally writes to standard output, which in this case is the terminal. But you can redirect the output to another command or a file. I used the pipe | to redirect the 500 lines of history into the grep command and looked for the characters tar. Rather than displaying 500 lines, I got a few with restart and a couple with tar. The matching characters are highlighted in red on the terminal.


  279  sudo /etc/init.d/apache2 restart
  280  sudo /etc/init.d/mysql restart 
  428  sudo tar -czvf ./mysql-backup.sql.tgz mysql-backup.sql 
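To see why restart shows up too, you can run the same kind of pipeline on made-up input; grep matches the characters tar anywhere in the line, including inside the word restart:

```shell
# Toy input instead of real history: grep prints every line containing
# the characters "tar", which includes the "apache2 restart" line.
printf 'apache2 restart\ntar -czvf backup.tgz www\nls -l\n' | grep tar
```

The first two lines are printed and the ls line is filtered out.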

While checking the messages log, I noticed that someone was trying to break into the server by sending login requests every second or so. I was curious about how many attempts there were, so I looked for the words ‘Authentication failure’ in the logs. Note that there is a space in the text I’m looking for so I need to put the search text in quotes. I then piped the result to the wc -l command to count the number of lines. There were almost 10,000 all from the same IP address. We changed our iptables config to only allow 3 attempts from the same IP address and then disable logins for a while.


493  sudo grep 'Authentication failure' /var/log/messages
503  sudo grep 'Authentication failure' /var/log/messages | wc -l

The output of 493 was thousands of lines like this—all with different users.


Jan  1 11:04:56 server sshd[12551]: error: PAM: Authentication failure for illegal user testtest from 218.25.99.148
Jan  1 11:05:05 server sshd[12564]: error: PAM: Authentication failure for root from 218.25.99.148
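The iptables change we made was roughly the following sketch, using the recent match module; the port, the 300-second window, and the hitcount are assumptions you would tune for your own server:

```shell
# Sketch only, assuming iptables with the "recent" match module.
# Record every new SSH connection attempt per source IP.
iptables -A INPUT -p tcp --dport 22 -m state --state NEW -m recent --set --name SSH
# Drop the 4th and later attempts from the same IP within 300 seconds,
# i.e. allow 3 tries and then ignore that address for a while.
iptables -A INPUT -p tcp --dport 22 -m state --state NEW -m recent --update --seconds 300 --hitcount 4 --name SSH -j DROP
```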

You can have multiple pipes as well. Here I want to check just the Authentication lines for Jan 3, so I cat the messages file into a grep that looks for 'Jan  3' (note the two spaces inside the quotes, matching the log's date format) and then pipe that to a grep that looks for Authentication.


sudo cat /var/log/messages | grep 'Jan  3'| grep Authentication

Now that the logs are cleaned up, I check for successful logins with this command:


sudo grep Accepted /var/log/auth.log | tail -20

If there have been more than 20 logins since I last checked, I can make the number larger.

php error log
We have our server set up to put error messages into an error log rather than displaying them on the screen. Visitors to the site don’t care about the error messages and crackers can take advantage of the messages to exploit vulnerabilities so there is no reason to display them. However, if you are writing a new page or changing an existing one, you as a programmer can benefit from knowing where your code failed. When I’m coding I always have a terminal window open with this command running.

tail -f /var/log/php/error.log

One more grep command
When we updated to the latest version of PHP we started getting messages in the logs for deprecated commands. We fixed most of them, but the locations of others weren’t obvious from the error messages. Specifically, we needed to replace all of the places we used PEAR to access the MySQL database. So starting at the root of our website code I look for every file where we use the PEAR initialization code for MySQL. The -R in the grep command means to recursively search through all folders. You start with the current location and traverse the entire directory tree. Notice the * at the end of the command. I don’t want to look at a specific file, like I did in other examples, but want to look at all files.

grep -R initialize_db.inc *

Organizing the spice drawer

We try to keep our code clean and commented but every once in a while we do some maintenance that falls into what I call “Organizing the spice drawer.” It’s not something that has to be done and the amount of time lost by not doing it is small, but it’s just one sign of a well organized shop. The same applies to our server, backups, and the office in general. This time of year is perfect for these endeavors because there’s not usually much else going on.

Log files
We set up the server logs to automatically archive and then delete themselves so we don’t have to worry about them. We do have some custom logging programs that write data to the MySQL database. They’re not huge, but we’ll never use more than a year’s worth, so they get manually cleaned from time to time. I just deleted 45,000 records from one database.
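The cleanup itself is a one-liner from the shell; the database, table, and column names here are hypothetical:

```shell
# Hypothetical names: replace sitelogs, page_log, and logged_at with your
# own database, table, and timestamp column.
mysql -u root -p sitelogs -e "DELETE FROM page_log WHERE logged_at < NOW() - INTERVAL 1 YEAR"
```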

/tmp
It’s a good idea to check what’s in here from time to time. We had some session files that weren’t being expired properly, so we wrote a cron job to remove them. There were also some files created by a shell script that weren’t getting deleted, so we adjusted the script to remove them.
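Our cron job is specific to our setup, but a sketch of one looks like this; the schedule, filename pattern, and age cutoff are assumptions to adjust:

```shell
# Hypothetical /etc/crontab entry: at 3:15 each night, remove PHP session
# files under /tmp that haven't been modified in two days.
15 3 * * * root find /tmp -maxdepth 1 -name 'sess_*' -type f -mtime +2 -delete
```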

php.ini
Every once in a while, it’s a good idea to review the settings in the php.ini file to make sure you are happy with them. In our case, we were getting warnings from two deprecated directives, register_long_arrays and magic_quotes_gpc, and we decided to turn them both to Off since we don’t have any code that relies on them and the code that triggered the warnings is in WordPress. The WordPress code works fine with the settings Off.
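For reference, the relevant php.ini lines end up looking like this (assuming the PHP 5.3-era deprecations; check your own php.ini for the exact directives):

```ini
; Both of these are deprecated; we have no code that depends on them.
register_long_arrays = Off
magic_quotes_gpc = Off
```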

BAK files and databases
Before we do any major change to a file or database we make a backup copy. These usually get deleted when we’re happy with the result, but sometimes they stay around after they aren’t needed any more. For the database, I make a complete dump to a .sql file and then remove the databases that are backups. If I ever need them, I’ve got the dump in storage.
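The dump-then-drop step might look like this; the database name is made up:

```shell
# Hypothetical backup database named products_bak: dump it to a .sql file
# first, then drop it. The dump can be restored later if it's ever needed.
mysqldump -u root -p products_bak > products_bak.sql
mysql -u root -p -e 'DROP DATABASE products_bak'
```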

Orphans
Our websites have been up since 1998, and we sometimes end up with files that are no longer referenced. If we suspect that a file isn’t used anymore, we grep for its name in the web directory. If nothing references it, we put a line of code in it that writes to the error log whenever the file is accessed.


// HTTP_REFERER isn't always set, so guard against an undefined index notice.
// Nothing here touches the database, so no SQL escaping is needed.
$refer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : 'none';
error_log("Accessing Orphan File with referrer $refer");

If we reference it from one of our pages, the referrer will be our site. If not, then either it’s being accessed by a spam-bot or another website that has an old link to us. Either way, we can then deal with it.

Spambots
Lately, spambots have been hitting just about every page with a submit button on it. They can’t do anything on those pages but I’ve started logging their attempts using the method described in Orphans above. If it’s a real problem we may start requiring CAPTCHAs or simple radio buttons that default to “I am a robot, please ignore this request”.

Error logs
When we updated to the latest version of PHP, our database access method, PEAR, was deprecated. That generated literally thousands of error messages. Over time we updated the code, but there were still lots of errors generated by pages that aren’t accessed much. With so many error messages, it’s hard to see real errors. So we spent a half day finishing off the update. Now the error logs only show the messages that we want them to show, and we can quickly check for errors.

Wikis
We have a wiki for office procedures, how-tos for ordering supplies, adding users, version numbers for our products, and so on. This is the kind of thing that can quickly become out of date if it is not updated regularly. A couple of times a year, we review the wiki and update it for process changes that didn’t make it in when they happened.

Office Electronics
Do we really need 12 VGA monitor cables? Three turntables? The 12-year-old computer that only runs OS 9? The box of old hard drives, the largest of which is 5GB? We don’t keep a lot of stuff we don’t use, but things get shoved under desks, put in boxes on the bottom shelf, and forgotten about. Every once in a while we make a big pile of stuff and take it to the electronics recycling center. I have yet to regret throwing away anything.

Books
We got rid of a complete shelf of books. I will never need the manual for Photoshop 4 or Flash 5. In fact, I haven’t used a manual for years, relying instead on the web for the answers to most of my questions. My guess is next year the other two shelves of books will be gone.

Hiding and showing files on OSX

OSX Mountain Lion hides a lot of files that were visible in previous versions. You can make them visible in the Finder with a single command in the Terminal.

For example, this line makes the contents of the /usr directory visible.

sudo chflags -R nohidden /usr

And this one makes your Library folder and its contents visible.
sudo chflags -R nohidden ~/Library

And this one makes the main Library folder and its contents visible.
sudo chflags -R nohidden /Library

If you are setting up Apache on your computer so you can test websites before deployment, you might want to make the private folder visible.
sudo chflags -R nohidden /private/

The first part of the command, sudo, runs the rest of the line with administrator privileges, which is why you are prompted for your password before the command is executed.

Sometimes I want to see all the files in the Finder so I enter this line in the terminal.
sudo defaults write com.apple.Finder AppleShowAllFiles TRUE
Restart the Finder with
killall Finder
or use Force Quit to relaunch the Finder.
After I’ve done whatever it was that required access to all the files, I usually hide them again by changing TRUE to FALSE.
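Turning them back off is the same pair of commands with the flag flipped (macOS only):

```shell
# Hide the normally-hidden files again, then relaunch the Finder.
sudo defaults write com.apple.Finder AppleShowAllFiles FALSE
killall Finder
```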