Efficient, minimal MySQL backup

When someone asks you about what you do to backup your website's database, you'll probably have one of these responses:

  • "I don't think that we do any..." Read on! You must get some sort of backup in place.
  • "We pay company x to do that!" That's great - but are you sure that they're doing it right, cost-effectively or efficiently? Think who suffers if it goes horribly wrong... read on to see whether you could do it a whole lot cheaper yourself.
  • "We've got some scripts, and it seems to back things up ok." This article is for you. You're technically-minded, but have let the backup process slip. See what I recommend!
  • "Everything's under control. We manage everything ourselves and regularly test the integrity of our backups." Wow, that's awesome! You probably won't learn anything new here, but please read on and let me know what you think of my approach.

Introduction

MySQL has two simple ways of doing backup: mysqldump and SELECT INTO OUTFILE. The former is a command-line program, usually run from a shell script, and the latter is the start of a SQL statement that would be queried within a SQL file or a MySQL-friendly language (eg, PHP). With both of these methods you need to be thinking about table locking, so that the the rows in those backed-up tables make sense together.

Important questions to ask, to get an idea of your data:

  • Are there any large tables that might take a while to backup? These should be examined carefully to work out the best backup approach.
  • Which tables reference other tables? Tables with common fields (eg, foreign keys) should be backed up together.
  • Which tables are simply used for logging? Their existing rows are never changed, so you don't need to backup the entire table all the time. An incremental backup strategy can be used.
  • Are there any tables that you don't want to backup? Temporary tables and cache tables are an example. You can ignore these ones! (Although you probably still need to backup the schema.)

You should be able to group your tables according to the above considerations. Big tables should be backed up separately and incrementally where possible. No table should be backed up more than once. A group could contain just one table. Each group of tables can then be backed up like so:

LOCK the tables
BACKUP the tables
UNLOCK the tables

The locking and unlocking steps ensure that none of the tables can change whilst the backup is happening.

Full backup

Doing a full backup is easy; every row in every table is retrieved.

Incremental backup

We need a way of knowing which row to start from for each table that can be incrementally backed up; an obvious method is to add the key value of the last row to the backup filename. All other tables (ie, the non-incremental ones) must be fully backed up.

Verifying

It is important to verify the contents of backups. They should be verified as soon as possible after creation, and then later on at increasing intervals (just to make sure that they haven't got corrupted).

Restoring

The request to restore a dataset is generally along the lines of: "Restore to x date" or "Restore to the most recent backup". To do this, we will find the most recent full backup before the requested date, restore it, then restore all incremental backups up to the requested date.

It is also very important to be able to quickly locate the restore script when you need it, so I have decided to keep mine at the top level of my backup directory structure. When the backup files are shifted off-site (for extra security), the backup/restore script goes with them.

Harmonious Full and Incremental Backups

Instead of forcing a full backup for every table at regular intervals, why not allow each table to do its own thing? We can use the information_schema database (a MySQL meta-database) to check when a table was last updated, and only do a new backup for it if the table has changed since the last backup. The format of that backup would be incremental if possible, and only if there had also been a reasonably recent full backup.

The All-in-One Solution!

This is what you've been waiting so patiently for: the single PHP script that does it all! Download it now by clicking on the green source code, have a read of it, push it to your backups folder, and run it. Each operation has a dry-run option, which should stop the script from doing anything with side effects. Please take heed though: You use this script entirely at your own risk! I can't be held responsible for what it does when you run it, so do take the time to study the script and verify whether it is safe for you to use.

  • Run an analyse first, and check the PHP config file that it produces. Modify that file according to your personal preference and knowledge of the database.
  • Run a backup. The generated files will be in subdirectories of the directory where the script itself resides.
  • Run a verify. This creates a temporary database, which you can examine at your leisure. Don't forget to DROP the temporary database when you're done with it.
  • Add a crontab entry that will the script once a day (or more, or less, depending on your needs).

Finally...

Please let me know what you think! If you have any questions, suggestions, concerns or even praise, I'd love to hear from you! Either leave a comment on this page, or get in touch with me using the contact page.


Comments

It's quiet in here...Add your comment