My Oracle: Restore Standby Database from Tape

My Data Guard database was running on version 11.2.0.3, and the database was around 10 TB. Some time back, the primary database server suffered a hardware failure: the whole server went down and the UNIX admin was not able to bring it back up. This is a very critical, high-SLA database, so management wanted it back up as quickly as possible.

Here is what I did to bring the application back up. I am sharing it in case it is helpful for others.

Environment :

Oracle version 11.2.0.3
Data Guard replication from primary to standby
Flashback feature is not enabled on the database.
Data Guard is running in Maximum performance mode

Here are the major steps :

1. Activate the standby database
2. Point the application to the old standby, which is now the primary
3. Drop the old primary, restore it from tape, and set it up as the standby

Step 1 Activate the standby database.

As a first step, we need to activate the current standby database to bring the application up.

Please refer to How to Activate the Standby database.

Step 2 Point the application to the current primary (the standby activated in step 1).

The application team takes care of this. The DBA can provide the DB/host info in case the application engineer needs it.

Step 3 Rebuild the standby database from tape.

Now the old primary database is useless, and we have to rebuild it from scratch. There are several ways to do that. I restored the database from tape rather than cloning it from the current primary, because I did not want to put additional load on the current primary database.

Step A Disable the archive log delete job

Disable the archive log delete job on the current primary database. We need to apply the archive log files after building the standby database, so I wanted to keep all the archive logs on the server. It is safe to disable the job for now.
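
How the delete job is disabled depends on how it is scheduled, which this post does not cover. As a minimal sketch, assuming the job runs from the oracle user's crontab and that its line contains the text arch_delete (a hypothetical name), the helper below comments that line out of a crontab read from standard input.

```shell
# Read a crontab from stdin and comment out every active line that
# contains the given text. Lines that are already commented are left alone.
# The job name passed in is hypothetical; use the name of your own job.
disable_cron_entry() {
    awk -v pat="$1" '
        index($0, pat) && $0 !~ /^#/ { print "#" $0; next }
        { print }
    '
}
```

For example, crontab -l | disable_cron_entry arch_delete > /tmp/crontab.new writes the edited crontab to a file, which can be reviewed and then installed with crontab /tmp/crontab.new.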

Step B Drop the broken database (the one that crashed)

Once the UNIX SA brings the server up, drop the database so that it can be restored from scratch.

We would not need to restore the database from scratch if flashback were enabled on the database. Flashback was not enabled in my environment, so I ended up restoring the database from scratch.

There are several ways to drop the database. Please refer to How to drop the database if you need more info.

I used RMAN to drop the database. You can use a different approach if you like.

RMAN> connect target

RMAN> startup force mount

RMAN> sql 'alter system enable restricted session';

RMAN> drop database;

Step C Verify the backup list from tape.

Here is a sample command to check the tape backup list.

/usr/openv/netbackup/bin/bplist -B -C hostname.bu -t 4 -l -s 02/22/2015 -e 02/22/2015 -R /

From the output of the above command, we can find the control file, spfile, and database backup pieces.
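
The listing can be long, so it may help to filter it. This is a small sketch, assuming the backup piece names contain _SPFILE_ and _CTRL_ as the pieces restored later in this post do. find_restore_pieces is a hypothetical helper name, and your naming convention may differ.

```shell
# Filter a NetBackup listing (read from stdin) down to the spfile and
# control file backup pieces. The _SPFILE_ and _CTRL_ substrings are an
# assumption taken from the piece names used later in this post.
find_restore_pieces() {
    grep -E '_(SPFILE|CTRL)_'
}
```

The bplist command above can be piped straight into find_restore_pieces to show only the lines needed for Step D and Step E.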

Step D Restore the SPFILE

Restore the spfile from tape using the script below. I masked some values with XXXX for security reasons; please use the names appropriate to your environment. I restored the spfile under the /work area.

run

{

allocate channel t01 device type 'sbt_tape' parms 'ENV=(NB_ORA_CLIENT=XXXXX.bu,NB_ORA_SERV=hcmsnbu1)';

restore spfile to '/work/XXXXX_spfile.ora' from '/XXXXX_SPFILE_20140222_0400_58082_n2p1bbbi_1_1_840281458';

RELEASE CHANNEL t01;

}

Then move the spfile to $ORACLE_HOME/dbs.

Step E Restore the control file

Restore the control file from tape using the script below.

run

{

allocate channel t01 device type 'sbt_tape' parms 'ENV=(NB_ORA_CLIENT=XXXXX.bu,NB_ORA_SERV=hcmsnbu1)';

restore controlfile to '/work/XXXX_control.ctl' from 'XXXXX_CTRL_20140222_0400_58083_n3p1bbdp_1_1_840281529';

RELEASE CHANNEL t01;

}

Copy the control file to the appropriate locations.

cd /work/

cp XXXXXX_control.ctl /data01/oradata/XXXX/control01.ctl

cp XXXXXX_control.ctl /data02/oradata/XXXX/control02.ctl

cp XXXXXX_control.ctl /data03/oradata/XXXX/control03.ctl
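
The three copies can also be made in a loop that checks each copy against the source. This is only a sketch: copy_controlfile is a hypothetical helper, and the destination paths must match the control_files parameter in your spfile.

```shell
# Copy one restored control file to every destination given after the
# source, and confirm each copy is byte-identical to the source.
copy_controlfile() {
    src=$1
    shift
    for dest in "$@"; do
        cp "$src" "$dest" || return 1
        cmp -s "$src" "$dest" || { echo "mismatch: $dest" >&2; return 1; }
    done
}
```

It would be called with the restored file first and the three control file locations after it, in the same order as the cp commands above.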

Step F Start the instance.

Start the instance and mount the database as below.

SQL> startup nomount

SQL> alter database mount;

Step G Restore and recover the database.

This step took me 10 hours. In your environment, the restore time will depend on your database size and the number of CPUs on your host.

Go to the current primary and get the SCN below.

system@XXXXXX> SELECT TO_CHAR(STANDBY_BECAME_PRIMARY_SCN) FROM V$DATABASE;

TO_CHAR(STANDBY_BECAME_PRIMARY_SCN)

----------------------------------------

66132064577

system@XXXXXX>

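Use that SCN plus 1 (66132064578 here) in the SET UNTIL SCN clause of the script below. RMAN recovers up to, but not including, the SCN given in SET UNTIL SCN, so adding 1 brings the restored database exactly to the point where the standby became the primary.

Here is the shell script I used to restore the database. I ran it in the background with nohup. I used 8 channels since my host has 8 CPUs with hyper-threading enabled.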

$ORACLE_HOME/bin/rman msglog=rman_restore_db.log <<EOF

connect target

run

{

allocate channel t01 device type 'sbt_tape' parms 'ENV=(NB_ORA_CLIENT=XXXXXX.bu,NB_ORA_SERV=hcmsnbu1)';

allocate channel t02 device type 'sbt_tape' parms 'ENV=(NB_ORA_CLIENT=XXXXXX.bu,NB_ORA_SERV=hcmsnbu1)';

allocate channel t03 device type 'sbt_tape' parms 'ENV=(NB_ORA_CLIENT=XXXXXX.bu,NB_ORA_SERV=hcmsnbu1)';

allocate channel t04 device type 'sbt_tape' parms 'ENV=(NB_ORA_CLIENT=XXXXXX.bu,NB_ORA_SERV=hcmsnbu1)';

allocate channel t05 device type 'sbt_tape' parms 'ENV=(NB_ORA_CLIENT=XXXXXX.bu,NB_ORA_SERV=hcmsnbu1)';

allocate channel t06 device type 'sbt_tape' parms 'ENV=(NB_ORA_CLIENT=XXXXXX.bu,NB_ORA_SERV=hcmsnbu1)';

allocate channel t07 device type 'sbt_tape' parms 'ENV=(NB_ORA_CLIENT=XXXXXX.bu,NB_ORA_SERV=hcmsnbu1)';

allocate channel t08 device type 'sbt_tape' parms 'ENV=(NB_ORA_CLIENT=XXXXXX.bu,NB_ORA_SERV=hcmsnbu1)';

SET UNTIL SCN 66132064578;

RESTORE DATABASE;

RECOVER DATABASE ;

RELEASE CHANNEL t01;

RELEASE CHANNEL t02;

RELEASE CHANNEL t03;

RELEASE CHANNEL t04;

RELEASE CHANNEL t05;

RELEASE CHANNEL t06;

RELEASE CHANNEL t07;

RELEASE CHANNEL t08;

}

exit

EOF
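
Typing eight nearly identical channel lines by hand is error-prone, so the run block can also be generated. The sketch below is one possible way to do that, not the script used above: build_restore_block is a hypothetical helper that takes the channel count, the NetBackup client and server names, and the STANDBY_BECAME_PRIMARY_SCN value, and prints the block with the SCN already incremented by 1.

```shell
# Print an RMAN restore/recover block with the requested number of tape
# channels. Arguments: channel count, NB_ORA_CLIENT, NB_ORA_SERV, and the
# STANDBY_BECAME_PRIMARY_SCN value read from the current primary.
build_restore_block() {
    channels=$1
    client=$2
    server=$3
    scn=$4
    until_scn=$((scn + 1))   # SET UNTIL SCN stops just before this value
    echo "run"
    echo "{"
    i=1
    while [ "$i" -le "$channels" ]; do
        printf "allocate channel t%02d device type 'sbt_tape' parms 'ENV=(NB_ORA_CLIENT=%s,NB_ORA_SERV=%s)';\n" "$i" "$client" "$server"
        i=$((i + 1))
    done
    echo "SET UNTIL SCN $until_scn;"
    echo "RESTORE DATABASE;"
    echo "RECOVER DATABASE;"
    # Channels allocated inside a run block are released when the block
    # ends, so the explicit RELEASE CHANNEL lines are left out here.
    echo "}"
}
```

Calling build_restore_block 8 XXXXXX.bu hcmsnbu1 66132064577 prints a block equivalent to the one above, which can be placed inside the same here-document.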

Here is the log output from the above restore step.

Recovery Manager: Release 11.2.0.3.0 - Production on Mon Feb 24 14:34:49 2014

RMAN>

connected to target database: XXXXXX (DBID=1081194209, not open)

RMAN> 2> 3> 4> 5> 6> 7> 8> 9> 10> 11> 12> 13> 14> 15> 16> 17> 18> 19> 20> 21> 22>

using target database control file instead of recovery catalog

allocated channel: t01

channel t01: SID=518 device type=SBT_TAPE

channel t01: Veritas NetBackup for Oracle - Release 7.5 (2012060523)

allocated channel: t02

channel t02: SID=604 device type=SBT_TAPE

channel t02: Veritas NetBackup for Oracle - Release 7.5 (2012060523)

allocated channel: t03

channel t03: SID=690 device type=SBT_TAPE

channel t03: Veritas NetBackup for Oracle - Release 7.5 (2012060523)

allocated channel: t04

channel t04: SID=776 device type=SBT_TAPE

channel t04: Veritas NetBackup for Oracle - Release 7.5 (2012060523)

allocated channel: t05

channel t05: SID=862 device type=SBT_TAPE

channel t05: Veritas NetBackup for Oracle - Release 7.5 (2012060523)

allocated channel: t06

channel t06: SID=948 device type=SBT_TAPE

channel t06: Veritas NetBackup for Oracle - Release 7.5 (2012060523)

allocated channel: t07

channel t07: SID=1034 device type=SBT_TAPE

channel t07: Veritas NetBackup for Oracle - Release 7.5 (2012060523)

allocated channel: t08

channel t08: SID=1120 device type=SBT_TAPE

channel t08: Veritas NetBackup for Oracle - Release 7.5 (2012060523)

executing command: SET until clause

Starting restore at 02/24/2014 14:34:53

channel t01: starting datafile backup set restore

channel t01: specifying datafile(s) to restore from backup set

channel t01: restoring datafile 00190 to /data24/oradata/XXXXXX/EDCFDT11_TRANSHISTORY_INDX_F28.dbf

channel t01: reading from backup piece

(More lines skipped due to the volume of the log.)

channel t04: reading from backup piece XXXXXX_AR_20140223_1045_58030_lep1b8li_1_1_840278706

channel t06: piece handle=XXXXXX_AR_20140222_1445_57789_dtp192bh_1_1_840206705 tag=XXXXXX_DG_AR_20140222_1445

channel t06: restored backup piece 1

channel t06: restore complete, elapsed time: 00:03:26

channel t05: piece handle=XXXXXX_AR_20140222_1245_57767_d7p18rah_1_1_840199505 tag=XXXXXX_DG_AR_20140222_1245

channel t05: restored backup piece 1

channel t05: restore complete, elapsed time: 00:03:26

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29305_818382098.arc thread=1 sequence=29305

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29306_818382098.arc thread=1 sequence=29306

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29307_818382098.arc thread=1 sequence=29307

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29308_818382098.arc thread=1 sequence=29308

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29309_818382098.arc thread=1 sequence=29309

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29310_818382098.arc thread=1 sequence=29310

channel t01: piece handle=XXXXXX_AR_20140223_0045_57885_gtp1a5gh_1_1_840242705 tag=XXXXXX_DG_AR_20140223_0045

channel t01: restored backup piece 1

channel t01: restore complete, elapsed time: 00:02:15

channel t03: piece handle=XXXXXX_AR_20140222_2045_57849_fpp19neh_1_1_840228305 tag=XXXXXX_DG_AR_20140222_2045

channel t03: restored backup piece 1

channel t03: restore complete, elapsed time: 00:02:35

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29311_818382098.arc thread=1 sequence=29311

channel t07: piece handle=XXXXXX_AR_20140222_2245_57865_g9p19ufi_1_1_840235506 tag=XXXXXX_DG_AR_20140222_2245

channel t07: restored backup piece 1

channel t07: restore complete, elapsed time: 00:02:45

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29312_818382098.arc thread=1 sequence=29312

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29313_818382098.arc thread=1 sequence=29313

channel t08: piece handle=XXXXXX_AR_20140223_0845_57988_k4p1b1kh_1_1_840271505 tag=XXXXXX_DG_AR_20140223_0845

channel t08: restored backup piece 1

channel t08: restore complete, elapsed time: 00:02:37

channel t02: piece handle=XXXXXX_AR_20140223_0245_57905_hhp1achi_1_1_840249906 tag=XXXXXX_DG_AR_20140223_0245

channel t02: restored backup piece 1

channel t02: restore complete, elapsed time: 00:02:37

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29314_818382098.arc thread=1 sequence=29314

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29315_818382098.arc thread=1 sequence=29315

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29316_818382098.arc thread=1 sequence=29316

channel t04: piece handle=XXXXXX_AR_20140223_1045_58030_lep1b8li_1_1_840278706 tag=XXXXXX_DG_AR_20140223_1045

channel t04: restored backup piece 1

channel t04: restore complete, elapsed time: 00:02:03

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29317_818382098.arc thread=1 sequence=29317

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29318_818382098.arc thread=1 sequence=29318

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29319_818382098.arc thread=1 sequence=29319

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29320_818382098.arc thread=1 sequence=29320

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29321_818382098.arc thread=1 sequence=29321

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29322_818382098.arc thread=1 sequence=29322

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29323_818382098.arc thread=1 sequence=29323

archived log file name=/dbArch/XXXXXX/XXXXXX_1_29324_818382098.arc thread=1 sequence=29324

media recovery complete, elapsed time: 00:05:16

Finished recover at 02/25/2014 02:06:15

released channel: t01

released channel: t02

released channel: t03

released channel: t04

released channel: t05

released channel: t06

released channel: t07

released channel: t08

RMAN>

Recovery Manager complete.
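
Before moving on, it is worth confirming that the log ends cleanly. This is a minimal sketch, assuming the log was written to rman_restore_db.log as in the script above. check_rman_log is a hypothetical helper that fails if the log contains an RMAN-nnnnn or ORA-nnnnn error code or lacks the "media recovery complete" message.

```shell
# Return success only if the RMAN log reports a completed media recovery
# and contains no RMAN-nnnnn or ORA-nnnnn error codes.
check_rman_log() {
    log=$1
    if grep -E -q '(RMAN|ORA)-[0-9]{5}' "$log"; then
        echo "errors found in $log" >&2
        return 1
    fi
    grep -q 'media recovery complete' "$log"
}
```

Running check_rman_log rman_restore_db.log after the restore returns a zero exit status only when both conditions hold, which makes it easy to chain into the next step of a script.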

Step H Convert the database to Standby

ALTER DATABASE CONVERT TO PHYSICAL STANDBY;

Now the database is converted to a standby database. Run the SQL below to check the status.

set linesize 200

col DB_UNIQUE_NAME form a30

col DATABASE_ROLE for a20

col OPEN_MODE for a30

col SWITCHOVER_STATUS for a15

column protection_mode form a30

select DB_UNIQUE_NAME,Database_role,open_mode,switchover_status,protection_mode from v$database;

In my environment, the status shows as below.

database role = Physical Standby
open mode = Mounted
switch over status = sessions active

Step I Start the Standby and Enable the MRP

Enable log shipping on the primary database. In my environment, destination 2 (log_archive_dest_2, controlled by log_archive_dest_state_2) points to the standby database. Open the primary and standby alert logs and tail them in separate PuTTY windows.

Log in to the current primary database and run the command below.

alter system set log_archive_dest_state_2 = 'ENABLE';

Start the restored database in standby mode as below. Log in to the new standby database
and run the commands below. Monitor the alert log while they run.

shutdown immediate;

startup nomount;

alter database mount standby database;

alter database recover managed standby database disconnect from session;
-- Stop here and check the alert logs to make sure the primary is shipping log files to the standby.
-- Wait 5 minutes, then double-check the alert log and make sure there are no issues.

alter database recover managed standby database cancel;

alter database open;

alter database recover managed standby database using current logfile disconnect;

Verify the log shipping and DG replication.
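
One way to verify replication is to compare, on the standby, the highest log sequence received with the highest sequence applied. The sketch below is an assumption about how to check, not part of the original procedure: standby_apply_sql is a hypothetical helper that prints a query against V$ARCHIVED_LOG, restricted to the current incarnation so that logs from before the activation are not counted.

```shell
# Print a query for the standby that reports, per redo thread, the highest
# archived log sequence received and the highest sequence applied in the
# current incarnation. The quoted here-document keeps the shell from
# expanding the $ in the view names.
standby_apply_sql() {
    cat <<'SQL'
set linesize 200
select thread#,
       max(sequence#) as last_received,
       max(case when applied = 'YES' then sequence# end) as last_applied
from v$archived_log
where resetlogs_change# = (select resetlogs_change# from v$database)
group by thread#
order by thread#;
SQL
}
```

The output can be piped to SQL*Plus on the standby, for example standby_apply_sql | sqlplus -s "/ as sysdba". If last_applied keeps pace with last_received while the primary switches logs, replication is working.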

Step J Post Verification.

Check the alert logs on both databases and make sure there are no errors. Re-enable the archive log delete job on the primary database once the standby has caught up with the primary.

Lesson learned: Enabling flashback is highly important in a Data Guard environment. Otherwise, we have to rebuild the database from scratch if the database crashes. I enabled the flashback feature on all my Data Guard environments after going through this costly exercise.

Hope this post helps! Please let me know if you have any questions or comments.


Monday, December 28, 2015
