Oracle7 Server Concepts

Contents Index Home Previous Next

Recovery from Media Failure

Media failure is a failure that occurs when a file, portion of a file, or a disk either cannot be read from or cannot be written to because it is damaged or missing. For example, this can happen if one or more datafiles are erased accidentally or lost due to a disk head crash.

Recovery from a media failure can take two forms, depending on the archiving mode in which the database is operated:

Recovery from a media failure, no matter what form, always recovers the entire database to a transaction-consistent state before the media failure. It is not logical or possible to recover a part of a database (such as a tablespace) to one point in time, and recover (or leave untouched) another part of a database to a different point in time; otherwise, the database would not be in a transaction-consistent state with respect to itself.

The following sections describe the different types of media recovery available if a database is operated in ARCHIVELOG mode: complete media recovery and incomplete media recovery.

Read-Only Tablespaces and Media Recovery

Normal media recovery does not check the read-only status of a datafile. When you perform media recovery of a tablespace that was once read-only, you have three possible options, depending upon when the tablespace was made read-only and when you performed the most recent backup. These scenarios are illustrated in Figure 24 - 3.

Figure 24 - 3. Type of Media Recovery

Case 1 The tablespace being recovered is read-only, and was read-only when the last backup occurred. In this case, you can simply restore the tablespace from the backup. There is no need to apply any redo information.
Case 2 The tablespace being recovered is writeable, but was read-only when the last backup occurred. In this case, you would need to restore the tablespace from the backup and apply the redo information from the point of time when the tablespace was made writeable.
Case 3 The tablespace being recovered is read-only, but was writeable when the last backup occurred. Because you should always backup a tablespace after making it read-only, you should not experience this situation. However, if this does occur, you must restore the tablespace from the backup and recover up to the time that the tablespace was made read-only.
Unlike writeable datafiles, read-only datafiles are not taken offline automatically if a media failure occurs. If you experience a media failure that affects only a portion of your datafiles, you should take these datafiles offline and follow the instructions in the Oracle7 Server Administrator's Guide for performing recovery of offline tablespaces in an open database.

Complete Media Recovery

Complete media recovery recovers all lost changes; no work is lost. Complete media recovery is possible only if all necessary redo logs (online and archived) are available.

Different types of complete media recovery are available, depending on the files that are damaged and the availability of the database that is required during recovery operations.

Closed Database Recovery

Complete media recovery of all or individual damaged datafiles can proceed while a database is mounted but closed and completely unavailable for normal use. Closed database recovery is used in the following situations:

Open Database-Offline Tablespace Recovery

Complete media recovery can proceed while a database is open. Undamaged tablespaces of the database are online and available for use, while a damaged tablespace is offline, and all datafiles that constitute the damaged tablespace are recovered as a unit. Offline tablespace recovery is used in the following situations:

Open Database-Offline Tablespace-Individual Datafile Recovery

Complete media recovery can proceed while a database is open. Undamaged tablespaces of the database are online and available for use, while a damaged tablespace is offline and specific damaged datafiles associated with the damaged tablespace are recovered. Individual datafile recovery is used in the following situations:

Complete Media Recovery Using a Backup of the Control File

Complete media recovery can proceed without loss of data, even if all copies of the control file are damaged by a disk failure. Media recovery of datafile backups can be done even if the control file is a backup. The control file is not recovered by media recovery; rather the RESETLOGS at database open recovers the control file.

The Mechanisms of Complete Media Recovery

The mechanisms that Oracle uses to perform any type of complete media recovery are best described using an example. The following is an example of complete media recovery of damaged datafiles while the database is open and a damaged tablespace is offline. Assume the following:

Recovery of the two datafiles that constitute the USERS tablespace is necessary because Disk X has been damaged, and the system has automatically taken the tablespace offline. In this case, the datafile of the SYSTEM tablespace is not damaged. Therefore, the database can be open with the SYSTEM tablespace online and available for use while recovery is completed on the offline tablespace needing recovery (USERS).

The following sections describe the phases of complete media recovery.

Phase 1: Restoration of Backup Datafiles After Disk X has been repaired, the most recent backup files are used to restore only the damaged datafiles USERS1 and USERS2. After restoration, the datafiles of the database exist as illustrated in Figure 24 - 4.

Figure 24 - 4. Phase 1 of Complete Media Recovery

Each datafile header contains the most recent log sequence number being written at the time the datafile was being written. The restored backup files will have earlier log sequence numbers than those of the datafiles that were not affected (not restored) by the disk crash. The control file contains a pointer to the last log sequence number that was written.

Phase 2: Rolling Forward with the Redo Log As complete media recovery proceeds, Oracle applies redo log files (archived and online) to datafiles, as necessary, as illustrated in Figure 24 - 5. Oracle automatically detects when a redo log file does not contain any redo information corresponding to a restored backup datafile. Therefore, Oracle optimizes the recovery process by not attempting to apply the redo log file to the restored datafile.

Figure 24 - 5. Phase 2 of Complete Media Recovery

In this case, the redo log file with the log sequence number of 12 is applied exclusively to USERS1, and the redo log files with log sequence numbers ranging from 13 to 31 are applied to both USERS1 and USERS2. No redo log files are applied to the datafiles that do not require recovery.

There is a flag in the header of the current redo log that indicates if it is the last available redo log file to apply to the restored datafiles.

Phase 3: Rolling Back Using Rollback Segments Once the necessary redo log files have been applied to the damaged datafiles, all uncommitted data that exists as a result of the roll forward in Phase 2 must be removed. This is completed by applying the deferred rollback segment as the tablespace is brought online, as illustrated in Figure 24 - 6.

Figure 24 - 6. Phase 3 of Complete Media Recovery

After Phase 3 is complete, notice how the log sequence number contained in the datafile headers of the previously damaged and restored datafiles, USERS1 and USERS2, has been updated during Phase 2 of the recovery process. The USERS tablespace can now be brought online. Deferred rollback segments are applied to the files of the offline tablespace as it is brought back online. Once the rollback is complete, the datafiles USERS1 and USERS2 exist as they did at the instant before the disk failure. Once this is complete, all data in the tablespace is now consistent and available for use.

Incomplete Media Recovery

In specific situations (for example, the loss of all active online redo log files, or a user error, such as the accidental dropping of an important table), complete media recovery may not be possible or may not be desired. In such situations, incomplete media recovery is performed to reconstruct the damaged database to a transaction consistent state before the media failure or user error.

In most cases, unless desired, incomplete media recovery is not necessary if the online redo log has been mirrored to protect against having a single point of failure.

There are different types of incomplete media recovery that might be used, depending on the situation that requires incomplete media recovery: cancel-based, time-based, and change-based incomplete recovery.

Cancel-Based Recovery

In certain situations, incomplete media recovery must be controlled so that the administrator can cancel the operation at a specific point. Specifically, cancel-based recovery is used when one or more redo log groups (online or archived) have been damaged by a media failure and are not available for required recovery procedures (for example, the online redo log is not mirrored, and the single active online redo log file has been damaged by a disk failure). If one or more redo log groups are not available, the missing redo log groups cannot be applied during recovery procedures. Therefore, media recovery must be controlled so the recovery operation is terminated after the most recent, undamaged redo log group has been applied to the datafiles.

Time-Based and Change-Based Recovery

Incomplete media recovery is desirable if the database administrator would like to recover to a specific point in the past. This might be useful in the following situations:

In both of these cases, the endpoint of incomplete media recovery can be specified by a point in time or a specific system change number (SCN). An SCN is recorded in the redo log, along with the redo entries, each time that a transaction is committed. If a time is given, the database is recovered to the transaction consistent state just before the specified time. If an SCN is given, the database is recovered to the transaction committed just before the specified SCN.

The Mechanisms of Incomplete Database Recovery

Incomplete database recovery proceeds in the same way as complete media recovery, with a few exceptions:

Figure 24 - 7. Effects of Resetting the Log Sequence Number after Incomplete Media Recovery


Contents Index Home Previous Next