Backup and Restore Fundamentals
Back up your SQL Server databases regularly to prevent lost data in the event of a disaster.
May 27, 2008
Most companies that are in business for a while eventually encounter a disastrous event that has the potential to put the company out of business. And every company that uses databases will at some point experience a database crash. A database backup is a copy of the data, structures, and security objects contained within a database. Each database should be backed up on its own schedule, based on the number of write transactions that occur each day. To minimize data loss when disaster hits, you must back up your databases—all of them. And to ensure that your backups are good, you should test them by using them in a restore operation. At the very least, you need to have copies of your databases that you can quickly restore, and you need to be comfortable with the restore operation itself.
After people, data is a company’s most valuable asset. Your responsibility as a DBA is to ensure that your company’s data is safe—that is, that you have a copy of the data that you can reinstate even if the entire data center is reduced to rubble. Database backups are the simplest, most cost-effective means of safeguarding your company’s data.
Don’t be lulled into a false sense of security by a high-availability system you might have recently purchased. If you’ve virtualized and consolidated, then you might have actually increased your risk. Life was easy when you ran just one SQL Server instance per computer—but 10 SQL Server instances running on virtual machines (VMs) can all come crashing down when the physical box fails. If you can afford the additional investment, you can avoid a disaster of such large proportions by clustering the virtual server hosts. High-availability schemes are crucial when your data and systems must be available all the time. But even high-availability systems can be affected by fire, flood, and earthquake. You still need to do backups. Not just anyone in a company can do a database backup. For information about who should be backing up your databases, see the Web sidebar “Who Can Do Backups?”
How often you should back up a database depends on how long you have to restore it. In general, the more often you back up a database and the type of backup you take, the shorter the restore time. You can tailor backups and restores for each database. The kind of backup you decide to use will depend on the size of the database and the amount of transactional activity. The three most common types of backups are full, log, and differential. (For information about recovery models, see the Web sidebar “Database Recovery Models;” for information about SQL Server backup commands, see the Web Sidebar “Standard Backup Commands.”)
Full Backups
The full backup strategy is the easiest to understand and implement. At the end of every business day, or during whatever time window you’ve allocated for database backups, you simply perform a full backup of the database, as Figure 1 illustrates. You perform no separate log backups, and you don’t have to remember any special parameters. Backup file management is simple, because you only need to manage the full backup file. In addition, restoring from a full backup is extremely easy because you have only one full backup file to apply. Full backups are especially useful in organizations with a limited or relatively new IT staff.
A full backup works best for a “small” database—which you can define as a database whose full backup can be completed in the time allowed. When SQL Server performs a full database backup, it first backs up all the extents on the hard disk (an extent is eight contiguous pages, with each page being 8K in size). Then, SQL Server backs up the transaction log so that any user changes made during the database backup are also captured in the full backup file.
If you’re performing only full backups, you might lose some data in the event of a system crash—specifically, any changes made since the last full backup. If your database is updated infrequently, such as by high-speed bulk operations, then you can plan full backups to run only immediately after the bulk data modifications, and your data should be protected.
Full backups aren’t appropriate for production systems that have anything other than a few transactions. After you use the full backup strategy to restore a database, you must redo any transactions or bulk data loads that were applied to the database after the backup. If your most current backup file is damaged, you need to use the next previous full backup to restore the database—and you’ll have to ensure that all transactions applied to the database after that backup are manually redone.
To perform a full backup, run the following code:
BACKUP DATABASE AdventureWorksTO DISK = ‘E:SQLdataBACKUPSAdventureWorks_FullDbBkup.bak’WITH INIT, NAME = ‘AdventureWorks Full Db backup’,DESCRIPTION = ‘AdventureWorks Full Database Backup
DISK is the destination for the backup file. You can back up to disk or to tape; in this case, you’re backing up to the hard disk. Make sure the folder you’ll use as a store for the backup files exists before you begin. In most cases, backing up to hard disk is faster but more expensive than backing up to tape. For an extra level of protection, you can first back up to hard disk, then perform a file-level backup to write the database backup file to tape. WITH INIT specifies that the backup file should be overwritten. This method works well as long as a Windows backup occurs after every database backup. NAME is the name you give the backup file, which can contain as many as 128 characters. If you don’t specify a name, it’s left blank. DESCRIPTION is a longer, friendly description that makes identifying the file weeks or months after the backup was made a relatively straightforward process.
To perform a full restore of the database, run the following code:
RESTORE DATABASE AdventureWorksFROM DISK = ‘E:SQLdataBACKUPSAdventureWorks_FullDbBkup.BAK’WITH RECOVERY, REPLACE
WITH RECOVERY instructs the restore operation to roll back any uncommitted transactions that might be on the transaction log and leave the database in operational mode, ready to resume work. REPLACE means overwrite any existing file with the same name. For more information, see the Web sidebar “Replacing a Database.”
If you use the full backup strategy, you need to monitor the size of the transaction log. A full backup doesn’t truncate (remove inactive entries from) the transaction log. If you perform only full database backups, you should follow the full backup with a log backup using the TRUNCATE_ONLY option, as the following code shows:
BACKUP LOG AdventureWorksWITH TRUNCATE_ONLY
TRUNCATE_ONLY doesn’t back up the transaction log; it simply forces SQL Server to take a checkpoint, which then truncates the log, getting rid of inactive entries and shrinking the size of the log file. Because this option will be dropped in future releases of SQL Server, you might instead use the simple recovery model and let SQL Server automatically rid the transaction log of inactive entries.
Full Plus Log Backups
If you can’t tolerate any data loss on restore, use the full plus log backup strategy. This method guarantees against data loss and works well for databases that are updated frequently. Although using this strategy increases your database’s complexity and maintenance, the total amount of time necessary to back up the database will decrease.
Figure 2 shows a sample schedule for a full plus log backup strategy—a weekly full backup on Sunday, a transaction log backup on Monday, a second log backup on Tuesday, and a log backup every day of the week until the following Sunday rolls around, when you take a new full backup. A log backup includes all data and structures that have changed since the last log backup. Thus, each log backup in this schedule contains only the changes for that day: Monday’s log backup contains all of Monday’s changes, Tuesday’s log backup contains all of Tuesday’s changes, and so on.
Unless you specify otherwise, inactive records in the log are “removed” (marked for overwriting) at the end of a transaction log backup by default. You can add NO_TRUNCATE or COPY_ONLY to the BACKUP LOG command, which will leave the log records as they were before the log backup began. However, you shouldn’t use this option unless you have a lot of experience.
SQL Server 2005 lets you perform a tail-log backup, which is a backup taken after a database crash—assuming that the transaction log file isn’t corrupt. A tail-log backup captures the last few transactions since the last transaction log backup. (For a more complete explanation of tail-log backups, see the Web sidebar “What Is a Tail-Log Backup?”)
Using the full recovery model provides relatively straightforward recovery and is preferable when using the full plus log backup strategy. You simply restore the full backup followed by each of the transaction log backups in chronological order (i.e., the order in which they were taken), finishing with a restore of the tail-log backup. This strategy works well for production systems, especially those that are mostly transactional with few bulk operations.
If your database has regular bulk operations (e.g., bulk inserts done daily), then you might want to use the bulk-logged recovery model. Because individual records included in the bulk operation aren’t logged, this approach eliminates the overhead of SQL Server writing to the transaction log. Although you might achieve a performance advantage during the time the bulk operations are running, you run the risk of losing data on a restore operation if you don’t have the source data needed to rerun the bulk operations. If you’re using a simple recovery model, you can’t perform a log backup because this model causes the log file to be truncated on checkpoint.
To perform a full plus log backup, you must first back up the entire database, as follows:
BACKUP DATABASE AdventureWorksTO DISK = ‘E:SQLdataBACKUPSAdventureWorks_FullDbBkup.bak’WITH INIT, NAME = ‘AdventureWorks Full Db backup’,DESCRIPTION = ‘AdventureWorks Full Database Backup’
Then run the following code for the transaction log backup:
BACKUP LOG AdventureWorksTO DISK = ‘E:SQLdataBACKUPSAdventureWorks_TlogBkup.bak’WITH NOINIT, NAME = ‘AdventureWorks Translog backup’,DESCRIPTION = ‘AdventureWorks Transaction Log Backup’, NOFORMAT
WITH NOINIT specifies that the backup files should be appended to the backup media, whether you’re using disk or tape. In this case, all the transaction log backups will be written to the same disk file, one after another, in sequence. NOFORMAT instructs the backup process to preserve any header information that might already be on the backup disk headers. This behavior is the default, so you don’t necessarily need to use this option, although doing so is helpful for self-documentation.
To restore a database that’s using a full or full plus log backup strategy, perform the following steps.
If the database is online, restrict database access by switching the database availability option (in the property window) to RESTRICTED_USER, which allows only members of the db_owner fixed database role and members of the dbcreator and sysadmin fixed server roles to access the database.
Perform a tail-log backup (new to SQL Server 2005).
Fix the problem that caused the database to crash.
Restore using the full backup with the NO-RECOVERY option.
Apply each of the transaction log backups with the NORECOVERY option, if applicable.
Restore the tail-log backup with the RECOVERY option.
The code to perform a tail-log backup is as follows:
BACKUP LOG AdventureWorksTO DISK = ‘E:SQLdataBACKUPSAdventureWorks_TaillogBkup.bak’WITH NORECOVER
To perform a complete restore from a full backup, you must first restore the files for the database as follows:
RESTORE DATABASE AdventureWorksFROM DISK = ‘E:SQLdataBACKUPSAdventureWorks_FullDbBkup.bak’WITH NORECOVERY
NORECOVERY instructs the recovery operation to leave partial transactions intact rather than roll them back. The transaction log backup(s) that follow the full database restore contain additional data that complete these partial transactions. NORECOVERY leaves the database in a nonoperational state. The full restore is immediately followed by a restore of each of the transaction log backups in chronological order, all using NORECOVERY as follows:
RESTORE LOG AdventureWorksFROM DISK = ‘E:SQLdataBACKUPSAdventureWorks_TlogBkup.bak’WITH NORECOVERY
Finally, apply the tail-log backup with the RECOVERY option, as follows:
RESTORE LOG AdventureWorksFROM DISK = ‘E:SQLdataBACKUPSAdventureWorks_TaillogBkup.bak’WITH RECOVERY
The full plus log backup strategy isn’t bulletproof. If one of the transaction log backups is corrupted, then you can restore only to a point before the corrupted log backup. For instance, suppose you run a weekly full backup on Sunday and transaction log backups on Monday through Saturday. If Tuesday’s log backup is corrupt, you can restore only through Monday’s backup. All of Tuesday’s work would be missing because of the corrupted log backup, so you wouldn’t want to risk violating data integrity by applying Wednesday’s transactions to Monday’s data. Even the tail-log backup would be useless.
Full Plus Differential Backups
If you want an extra level of insurance, consider adding differential backups to your full backup scheme instead of doing just log backups. This strategy is good for a transactional database that has many record inserts and updates and that can sustain little to no data loss on restore and recovery, as well as for administrators who place a priority on fast recovery.
A differential backup is cumulative; it includes all data and structures that have changed since the last full backup, regardless of when that last full backup was made, or how many previous differential backups have been run. Suppose you perform a full backup on Sunday and differential backups on subsequent days of the week, as Figure 3 illustrates. Monday’s differential backup will contain all of Monday’s changes, Tuesday’s differential backup will contain all of Monday’s plus Tuesday’s changes, Wednesday’s differential backup will contain all of Monday’s plus Tuesday’s plus Wednesday’s changes, and so on.
Restoring a differential backup generally takes less time than restoring a full plus log backup, because restoring just one differential backup takes less time than restoring a string of log backups. To perform a differential backup, run the following code:
BACKUP DATABASE AdventureWorksTO DISK = ‘E:SQLdataBACKUPSAdventureWorks_DiffDbBkup.bak’WITH INIT, DIFFERENTIAL, NAME = ‘AdventureWorks Diff Db backup’,DESCRIPTION = ‘AdventureWorks Differential Database Backup’
To restore a database using the full plus differential strategy, perform the following steps.
If the database is online, restrict database access by switching the database availability option (in the property window) to RESTRICTED_USER, which allows only members of the db_owner fixed database role and members of the dbcreator and sysadmin fixed server roles to access the database.
Perform a tail-log backup (new to SQL Server 2005).
Fix the problem that caused the database to crash.
Restore using the full backup with the NO-RECOVERY option.
Apply the latest differential backup with the NO-RECOVERY option.
Apply the tail-log backup with the RECOVERY option.
After restoring the full backup, do a differential restore as follows:
RESTORE DATABASE AdventureWorksFROM DISK = ‘E:SQLdataBACKUPSAdventureWorks_DiffDbBkup.bak’WITH NORECOVERY
Then, restore the tail-log backup with the RECOVERY option, as discussed previously.
The differential backup provides a level of insurance you can’t achieve when performing only log backups. If the most current differential backup is corrupted, you can still restore from the previous differential and maintain full data integrity.
Combining Strategies
If redoing transactions for the missing day isn’t practical, you can combine full, differential, and multiple daily log backups. For example, you could perform a full backup on Sunday and differential backups on subsequent nights, plus log backups on Monday through Saturday mornings and afternoons, as Figure 4 shows. If the database came down on Friday night and needed to be restored, but Thursday’s differential backup was corrupted, you could use Wednesday’s differential for the restore, then restore the log backups taken on Thursday and Friday. The database would then be restored to the point of failure. For more information, see the Web sidebar “How Do I Recover to a Point in Time?”
To minimize the risk of data loss, you should consider mixing and matching full, log, and differential backups, even though doing so will complicate your backup strategy and backup file management. You also need to realistically evaluate how much data loss you can live with following a database crash and restore. Using a full or bulk-logged recovery model rather than a simple recovery model means more transaction log file activity and a larger (and longer) log file backup, but the benefit is less lost data.
Alternative Backup Strategies
SQL Server backups aren’t limited to full, log, and differential. More advanced options include the file or file group backup strategy, the partial backup strategy, and the copy-only backup strategy. For information about these strategies, see the Web sidebar “Alternative Backup Strategies.”
Database Access During Backups and Restores
SQL Server backups are an online process; the data stored in SQL Server is highly available during this time. Operations such as INSERT, UPDATE, and DELETE are allowed, as are SELECT statements. However, operations that would modify the underlying table or file space architecture, such as ALTER DATABASE, ADD FILE, or SHRINKFILE, can’t be done while the backup is running. If auto-shrink is turned on in your database configuration file, you might experience a conflict during a backup operation. For example, if auto-shrink tries to initiate while the backup is running, both operations might fail. Whichever operation starts first will set a lock on the file; the second operation will wait for that lock to be released before it begins. If the first operation releases the lock, then the second operation will commence. If the lock times out on the first operation, the second operation will fail. This development seems unfair to the second operation, which has to wait for the lock timeout, only to then fail. However, the rationale is that the second operation’s viability is based on the first operation succeeding. If the first operation fails, the second operation doesn’t need to proceed. To prevent this problem, consider turning off auto-shrink before performing a backup.
Most SQL Server restores are offline operations; users can’t access the database while it’s being restored. If you’re using SQL Server 2005 Enterprise Edition with the full recovery model, partial restores and restores of nonprimary file groups are online operations by default. The parts of the database that aren’t being restored, such as read-only file groups, are accessible throughout the entire restore operation. Read/write file groups are available except when they’re pulled offline to be restored. This option is immensely valuable for large databases that are heavily accessed 24 × 7 × 365. For more information, see “Why Can’t My Database Restore Be an Online Operation?”
Putting It All Together
Data has become so central to businesses’ success that safeguarding it is mandatory. Backups are therefore crucial to maintain a healthy back office environment. The first step toward building business and database continuity is to make regular database backups and test them to ensure they can be successfully restored. When you create a new database, you should write the backup and restore scripts at the same time. SQL Server gives you many backup and restore options, which you can customize to meet the needs of each database.
About the Author
You May Also Like