SQL Server also needs to perform redo operations when restoring full, differential, or log backups. As we've discussed, for a minimally logged operation the affected pages are on disk at the point the transaction completes and so SQL Server simply copies those pages into any full or differential backup files, and restores from these backups are unaffected.
Restores from log backups, however, are more interesting. If the log backup only
contained the log records relating to extent allocation, then on restoring the log backup there would be no way to re-create the contents of the extents that the minimally logged operation affected. This is because, as we saw earlier, the log does not contain the inserted data, just the extent allocations and metadata.
In order to enable log restores when there are minimally logged operations, included in the log backup are not just the log records, but also the images of any extent (set of eight pages) affected by the minimally logged operation. This doesn't mean images of them as they were after the minimally logged operation, but the pages as they are at the time of the log backup. SQL Server maintains a bitmap allocation page, called the ML map or bulk-logged change map, with a bit for every extent. Any extents affected by the minimally logged operation have their bit set to 1. The log backup operation reads this page and so knows exactly what extents to include in the backup. That log backup will then clear the ML map.
So, for example, let's say we have a timeline like that shown in Figure 6.7, where a log backup occurs at 10:00, then at (1) a minimally logged operation (let's say a BULK INSERT) affected pages 1308–1315 among others, at (2) an UPDATE affected pages 1310 and 1311 and at (3) another log backup occurred.
Figure 6.7: Example timeline for database operations and log backups.
The log backup at 10:30 backs up the log records covering the period 10:00–10:30. Since there was a minimally logged operation within that log interval, it copies the extents affected by the minimally logged operations into the log backup. It copies them as they appear at the time of the log backup, so they will reflect the effects of the BULK INSERT and the UPDATE, plus any further modifications that may have taken place between the UPDATE and the log backup.
This affects how we can restore the log. It also affects the size of the log backups and, under certain circumstances, may affect tail log backups, but we'll get to that in more detail in the next section.
Let's take a look an example of how minimally logged operations can affect a point-in-time restore. Figure 6.8 depicts an identical backup timeline for two databases. The green bar represents a full database backup and the yellow bars represent a series of log backups. The only difference between the two databases is that the first is operating in FULL recovery model, and the second in BULK LOGGED.
Figure 6.8: Database backup timeline.
The time span of the fifth log backup is 10:00 to 10:30. At 10:10, a BULK INSERT
command (1) loaded a set of data. This bulk data load completed without a hitch but, in an unrelated incident at 10:20, a user ran a "rogue" data modification (2) and crucial data was lost. The project manager informs the DBA team and requests that they restore the database to a point in time just before the transaction that resulted in data loss started, at 10:20.
In the FULL recovery model database, this is not an issue. The bulk data load was fully logged and we can restore the database to any point in time within that log file. We simply restore the last full database backup, without recovery, and apply the log files to the point in time right before the unfortunate data loss incident occurred, using the RESTORE LOG command with the STOPAT parameter, to stop the restore operation sometime before 10:20.
In the BULK_LOGGED database, we have a problem. We can restore to any point in time within the first four log backups, but not to any point in time within the fifth log backup, which contains the minimally logged operations. Remember that for this log backup we only have the extents affected by the minimally logged operation, as they existed at the
file, or we apply all of them, restoring to the end of the file, or proceeding to restore to any point in time within the sixth log backup.
If we tried to restore that fifth log backup, with STOPAT 10:15 (a time between the minimally logged operation and the rogue modification), SQL is not going to walk the rest of the log backup figuring out what operations it needs to undo on the pages that were affected by the minimally logged operation. It has a simpler reaction:
Msg 4341, Level 16, State 1, Line 2
This log backup contains bulk-logged changes. It cannot be used to stop at an arbitrary point in time.
Unfortunately, if we apply the whole of the fifth log file backup, this would defeat the purpose of the recovery, since the errant process committed its changes somewhere inside of that log backup file, so we'd simply be removing the data we were trying to get back! We have little choice but to restore up to the end of the fourth log, recover the database, and report the loss of any data changes made after this time.
This inability to restore the database to point-in-time if there are minimally logged operations within the log interval is something that must be considered when choosing to run a database in BULK_LOGGED recovery, either for short or long periods. It is easy to identify whether or not a specific log backup contains any minimally logged operations.
A RESTORE HEADERONLY returns a set of details about the backup in question, including a column HasBulkLoggedData. In addition, the msdb backupset table has a column, has_bulk_logged_data. If the column value is 1, then the log backup contains minimally logged operations and can only be restored entirely or not at all. That said, finding this out while planning or executing a restore may be an unpleasant surprise.