04 July, 2015

RMAN -- 4 : Recovering from an Incomplete Restore

What do you do if a RESTORE fails mid-way ?  Do you need to rerun the whole restore ?  If it is a very large database, it could take [many ?] hours.

RMAN is "smart" enough to detect datafiles that have been restored and not re-attempt a restore.

UPDATE 08-Jul-15 :  For a database with Oracle Managed Files  (OMF), please see Oracle Support Note Doc ID 1621319.1  {Thus, see my next blog post on Recovering from an Incomplete Restore with OMF Files}

This example below is for non-OMF files.

Here, I begin a database restore.

 
RMAN> restore controlfile from autobackup;

Starting restore at 04-JUL-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=19 device type=DISK

recovery area destination: /NEW_FS/oracle/FRA
database name (or database unique name) used for search: HEMANTDB
channel ORA_DISK_1: AUTOBACKUP /NEW_FS/oracle/FRA/HEMANTDB/autobackup/2015_07_04/o1_mf_s_884175832_bsgqysyq_.bkp found in the recovery area
AUTOBACKUP search with format "%F" not attempted because DBID was not set
channel ORA_DISK_1: restoring control file from AUTOBACKUP /NEW_FS/oracle/FRA/HEMANTDB/autobackup/2015_07_04/o1_mf_s_884175832_bsgqysyq_.bkp
channel ORA_DISK_1: control file restore from AUTOBACKUP complete
output file name=/home/oracle/app/oracle/oradata/orcl/control01.ctl
output file name=/home/oracle/app/oracle/flash_recovery_area/orcl/control02.ctl
Finished restore at 04-JUL-15

RMAN> 
RMAN> alter database mount;

database mounted
released channel: ORA_DISK_1

RMAN> 
RMAN> restore database;

Starting restore at 04-JUL-15
Starting implicit crosscheck backup at 04-JUL-15
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=19 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=21 device type=DISK
Crosschecked 8 objects
Finished implicit crosscheck backup at 04-JUL-15

Starting implicit crosscheck copy at 04-JUL-15
using channel ORA_DISK_1
using channel ORA_DISK_2
Finished implicit crosscheck copy at 04-JUL-15

searching for all files in the recovery area
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_628_bsgrjztp_.arc
File Name: /NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_632_bsgrk8od_.arc
File Name: /NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_630_bsgrk48j_.arc
File Name: /NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_631_bsgrk49w_.arc
File Name: /NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_629_bsgrk0tw_.arc
File Name: /NEW_FS/oracle/FRA/HEMANTDB/autobackup/2015_07_04/o1_mf_s_884175832_bsgqysyq_.bkp

using channel ORA_DISK_1
using channel ORA_DISK_2

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /home/oracle/app/oracle/oradata/orcl/system01.dbf
channel ORA_DISK_1: reading from backup piece /NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqonjj_.bkp
channel ORA_DISK_2: starting datafile backup set restore
channel ORA_DISK_2: specifying datafile(s) to restore from backup set
channel ORA_DISK_2: restoring datafile 00004 to /home/oracle/app/oracle/oradata/orcl/users01.dbf
channel ORA_DISK_2: reading from backup piece /NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqsccg_.bkp
channel ORA_DISK_2: piece handle=/NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqsccg_.bkp tag=TAG20150704T121859
channel ORA_DISK_2: restored backup piece 1
channel ORA_DISK_2: restore complete, elapsed time: 00:02:34
channel ORA_DISK_2: starting datafile backup set restore
channel ORA_DISK_2: specifying datafile(s) to restore from backup set
channel ORA_DISK_2: restoring datafile 00003 to /home/oracle/app/oracle/oradata/orcl/undotbs01.dbf
channel ORA_DISK_2: reading from backup piece /NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqwt4s_.bkp
channel ORA_DISK_2: piece handle=/NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqwt4s_.bkp tag=TAG20150704T121859
channel ORA_DISK_2: restored backup piece 1
channel ORA_DISK_2: restore complete, elapsed time: 00:00:35
channel ORA_DISK_2: starting datafile backup set restore
channel ORA_DISK_2: specifying datafile(s) to restore from backup set
channel ORA_DISK_2: restoring datafile 00002 to /home/oracle/app/oracle/oradata/orcl/sysaux01.dbf
channel ORA_DISK_2: reading from backup piece /NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqonjl_.bkp
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00601: fatal error in recovery manager
RMAN-03004: fatal error during execution of command
ORA-01092: ORACLE instance terminated. Disconnection forced
ORACLE error from target database: 
ORA-03135: connection lost contact
Process ID: 3777
Session ID: 1 Serial number: 9

[oracle@localhost ~]$ 

After having restored a few datafiles, the restore failed on being disconnected from the database. (The  server or database instance has crashed).  Since the controlfile has been restored, I can bring up the database in MOUNT mode and then re-attempt a RESTORE DATABASE.

[oracle@localhost ~]$ rman target sys/oracle@orcl

Recovery Manager: Release 11.2.0.2.0 - Production on Sat Jul 4 12:56:41 2015

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCL (DBID=1229390655, not open)

RMAN> 
RMAN> restore database;

Starting restore at 04-JUL-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=20 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=21 device type=DISK

skipping datafile 3; already restored to file /home/oracle/app/oracle/oradata/orcl/undotbs01.dbf
skipping datafile 4; already restored to file /home/oracle/app/oracle/oradata/orcl/users01.dbf
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /home/oracle/app/oracle/oradata/orcl/system01.dbf
channel ORA_DISK_1: reading from backup piece /NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqonjj_.bkp
channel ORA_DISK_2: starting datafile backup set restore
channel ORA_DISK_2: specifying datafile(s) to restore from backup set
channel ORA_DISK_2: restoring datafile 00002 to /home/oracle/app/oracle/oradata/orcl/sysaux01.dbf
channel ORA_DISK_2: reading from backup piece /NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqonjl_.bkp
channel ORA_DISK_1: piece handle=/NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqonjj_.bkp tag=TAG20150704T121859
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:02:36
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00010 to /home/oracle/app/oracle/oradata/orcl/APEX_2614203650434107.dbf
channel ORA_DISK_1: reading from backup piece /NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqxovh_.bkp
channel ORA_DISK_1: piece handle=/NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqxovh_.bkp tag=TAG20150704T121859
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00005 to /home/oracle/app/oracle/oradata/orcl/example01.dbf
channel ORA_DISK_1: reading from backup piece /NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqxjv6_.bkp
channel ORA_DISK_1: piece handle=/NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqxjv6_.bkp tag=TAG20150704T121859
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
channel ORA_DISK_2: piece handle=/NEW_FS/oracle/FRA/HEMANTDB/backupset/2015_07_04/o1_mf_nnndf_TAG20150704T121859_bsgqonjl_.bkp tag=TAG20150704T121859
channel ORA_DISK_2: restored backup piece 1
channel ORA_DISK_2: restore complete, elapsed time: 00:04:02
Finished restore at 04-JUL-15

RMAN> 

RMAN detects that datafiles 3 (undotbs01.dbf) and 4 (users01.dbf) had already been restored.
 If you look at the previous RESTORE run, you can see that these were restored by Channel ORA_DISK_2. The first channel ORA_DISK_1 had started restoring system01.dbf but hadn't completed restoring the datafile when the restore crashed. That restore of datafile 1 (system01.dbf) had to be redone.

 (Another thing to note : Oracle doesn't necessarily restore datafiles in the order of file_id (file#) ! There really is no ORDER BY for a RESTORE)

RMAN> recover database;

Starting recover at 04-JUL-15
using channel ORA_DISK_1
using channel ORA_DISK_2

starting media recovery

archived log for thread 1 with sequence 628 is already on disk as file /NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_628_bsgrjztp_.arc
archived log for thread 1 with sequence 629 is already on disk as file /NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_629_bsgrk0tw_.arc
archived log for thread 1 with sequence 630 is already on disk as file /NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_630_bsgrk48j_.arc
archived log for thread 1 with sequence 631 is already on disk as file /NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_631_bsgrk49w_.arc
archived log for thread 1 with sequence 632 is already on disk as file /NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_632_bsgrk8od_.arc
archived log for thread 1 with sequence 633 is already on disk as file /home/oracle/app/oracle/oradata/orcl/redo03.log
archived log file name=/NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_628_bsgrjztp_.arc thread=1 sequence=628
archived log file name=/NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_629_bsgrk0tw_.arc thread=1 sequence=629
archived log file name=/NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_630_bsgrk48j_.arc thread=1 sequence=630
archived log file name=/NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_631_bsgrk49w_.arc thread=1 sequence=631
archived log file name=/NEW_FS/oracle/FRA/HEMANTDB/archivelog/2015_07_04/o1_mf_1_632_bsgrk8od_.arc thread=1 sequence=632
archived log file name=/home/oracle/app/oracle/oradata/orcl/redo03.log thread=1 sequence=633
media recovery complete, elapsed time: 00:00:02
Finished recover at 04-JUL-15

RMAN>
RMAN> alter database open resetlogs;

database opened

RMAN> 


UPDATE 07-Jul-15 :  Also see my earlier (year 2012) post "Datafiles not Restored -- using V$DATAFILE and V$DATAFILE_HEADER"  which also shows retrying a RESTORE DATABASE after a failure of restoring a datafile.  There, a single file in a BackupSet failed to restore.  Oracle didn't continue and try the other datafiles in that BackupSet.  I could either fix the error and retry the entire BackupSet (RESTORE DATABASE would have identified the right BackupSet containing those files) OR I could, as I did in that scenario, individually restore DataFiles from the BackupSet.

It can be a good idea to have your database backup consist of multiple BackupSets, using either multiple CHANNELs or FILESPERSET during the BACKUP.


You could also note, as an aside, that Log Sequence 633 was an online redo log file. RMAN automatically verifies that the online redo log files designated by the controlfile are present and uses them.

.
.
.

1 comment:

Foued said...

Thanks Hemant for sharing.