DOYENSYS Knowledge Portal








Thursday, March 30, 2017

Correcting our RAC Services UNKNOWN / OFFLINE problem


I was having some trouble running my RAC on RHEL4. (I think the problem arose primarily because I gave my virtual machines too little memory.)

When I noticed that I was getting alerts, I looked at the status of my RAC.

[oracle@hpora01 ~]$ cd /u01/app/oracle/product/10.2.0/crs/bin
[oracle@hpora01 bin]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.crmrac.db application    ONLINE    ONLINE    hpora02
ora....c1.inst application    ONLINE    ONLINE    hpora01
ora....c2.inst application    OFFLINE   UNKNOWN   hpora02
ora....serv.cs application    ONLINE    ONLINE    hpora02
ora....ac1.srv application    ONLINE    ONLINE    hpora01
ora....ac2.srv application    ONLINE    OFFLINE
ora....SM1.asm application    ONLINE    ONLINE    hpora01
ora....H4.lsnr application    ONLINE    ONLINE    hpora01
ora....rh4.gsd application    ONLINE    UNKNOWN   hpora01
ora....rh4.ons application    ONLINE    UNKNOWN   hpora01
ora....rh4.vip application    ONLINE    ONLINE    hpora01
ora....SM2.asm application    OFFLINE   UNKNOWN   hpora02
ora....H4.lsnr application    OFFLINE   UNKNOWN   hpora02
ora....rh4.gsd application    ONLINE    UNKNOWN   hpora02
ora....rh4.ons application    OFFLINE   UNKNOWN   hpora02
ora....rh4.vip application    ONLINE    ONLINE    hpora02


As you can see above, some of the applications are UNKNOWN or OFFLINE, neither of which is good for my RAC.
Run without -t, the crs_stat command prints the full names of the applications, which you may need in order to shut some of them down manually before shutting the whole cluster down and restarting it.

[oracle@hpora01 bin]$ crs_stat
NAME=ora.crmrac.db
TYPE=application
TARGET=ONLINE
STATE=ONLINE on hpora02
NAME=ora.crmrac.crmrac1.inst
TYPE=application
TARGET=ONLINE
STATE=ONLINE on hpora01
NAME=ora.crmrac.crmrac2.inst
TYPE=application
TARGET=OFFLINE
STATE=OFFLINE
NAME=ora.crmrac.crmracserv.cs
TYPE=application
TARGET=ONLINE
STATE=ONLINE on hpora02
NAME=ora.crmrac.crmracserv.crmrac1.srv
TYPE=application
TARGET=ONLINE
STATE=ONLINE on hpora01
NAME=ora.crmrac.crmracserv.crmrac2.srv
TYPE=application
TARGET=ONLINE
STATE=OFFLINE
NAME=ora.hpora01.ASM1.asm
TYPE=application
TARGET=ONLINE
STATE=ONLINE on hpora01
NAME=ora.hpora01.LISTENER_hpora01.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on hpora01
NAME=ora.hpora01.gsd
TYPE=application
TARGET=ONLINE
STATE=UNKNOWN on hpora01
NAME=ora.hpora01.ons
TYPE=application
TARGET=ONLINE
STATE=UNKNOWN on hpora01
NAME=ora.hpora01.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on hpora01
NAME=ora.hpora02.ASM2.asm
TYPE=application
TARGET=OFFLINE
STATE=UNKNOWN on hpora02
NAME=ora.hpora02.LISTENER_hpora02.lsnr
TYPE=application
TARGET=OFFLINE
STATE=UNKNOWN on hpora02
NAME=ora.hpora02.gsd
TYPE=application
TARGET=ONLINE
STATE=UNKNOWN on hpora02
NAME=ora.hpora02.ons
TYPE=application
TARGET=OFFLINE
STATE=UNKNOWN on hpora02
NAME=ora.hpora02.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on hpora02
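
With this many resources, it can help to reduce the long listing to the entries that need attention. The following is only a minimal sketch and not part of the original procedure: list_unhealthy is a hypothetical helper name, and it assumes the NAME=/TARGET=/STATE= layout shown above.

```shell
# list_unhealthy (hypothetical helper): read long-form `crs_stat` output on
# stdin and print "name target state" for every resource whose STATE is
# anything other than ONLINE.
list_unhealthy() {
  awk -F= '
    $1 == "NAME"   { name = $2 }
    $1 == "TARGET" { target = $2 }
    $1 == "STATE"  {
      split($2, parts, " ")   # "UNKNOWN on hpora02" -> parts[1] is "UNKNOWN"
      if (parts[1] != "ONLINE") print name, target, parts[1]
    }
  '
}

# Intended usage on a cluster node (not executed here):
#   crs_stat | list_unhealthy
```

The full resource names it prints are the ones crs_stop expects as arguments.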

I could also have attempted to stop them all using crs_stop -all, but it normally throws enough errors to force you to do it manually, one by one.

[oracle@hpora01 bin]$ crs_stop -all
Attempting to stop `ora.hpora01.ons` on member `hpora01`
Attempting to stop `ora.hpora02.ons` on member `hpora02`
`ora.hpora02.ons` on member `hpora02` has experienced an unrecoverable failure.
Human intervention required to resume its availability.
Stop of `ora.hpora01.ons` on member `hpora01` succeeded.
Attempting to stop `ora.hpora01.ASM1.asm` on member `hpora01`
Attempting to stop `ora.crmrac.crmrac2.inst` on member `hpora02`
`ora.crmrac.crmrac2.inst` on member `hpora02` has experienced an unrecoverable failure.
Human intervention required to resume its availability.
Attempting to stop `ora.hpora02.ASM2.asm` on member `hpora02`
`ora.hpora02.ASM2.asm` on member `hpora02` has experienced an unrecoverable failure.
Human intervention required to resume its availability.
Attempting to stop `ora.hpora02.LISTENER_hpora02.lsnr` on member `hpora02`
`ora.hpora02.LISTENER_hpora02.lsnr` on member `hpora02` has experienced an unrecoverable failure.
Human intervention required to resume its availability.
Attempting to stop `ora.crmrac.crmrac2.inst` on member `hpora02`
`ora.crmrac.crmrac2.inst` on member `hpora02` has experienced an unrecoverable failure.
Human intervention required to resume its availability.
Attempting to stop `ora.hpora02.ASM2.asm` on member `hpora02`
`ora.hpora02.ASM2.asm` on member `hpora02` has experienced an unrecoverable failure.
Human intervention required to resume its availability.
Attempting to stop `ora.hpora02.vip` on member `hpora02`
Stop of `ora.hpora02.vip` on member `hpora02` succeeded.
Stop of `ora.hpora01.ASM1.asm` on member `hpora01` succeeded.
Attempting to stop `ora.hpora01.LISTENER_hpora01.lsnr` on member `hpora01`
Stop of `ora.hpora01.LISTENER_hpora01.lsnr` on member `hpora01` succeeded.
Attempting to stop `ora.hpora01.vip` on member `hpora01`
Stop of `ora.hpora01.vip` on member `hpora01` succeeded.
CRS-0216: Could not stop resource 'ora.hpora02.ASM2.asm'.
CRS-0216: Could not stop resource 'ora.hpora02.ons'.
CRS-0216: Could not stop resource 'ora.hpora02.vip'.
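
The CRS-0216 lines at the end of that output name the resources that still need manual attention. As a rough sketch (failed_stops is a hypothetical helper, and it assumes the message format shown above), they can be pulled out of a saved copy of the output like this:

```shell
# failed_stops (hypothetical helper): read `crs_stop -all` output on stdin
# and print the resource name from each CRS-0216 "Could not stop resource"
# line. The name is the text between the single quotes.
failed_stops() {
  awk -F"'" '/^CRS-0216:/ { print $2 }'
}

# Intended usage on a cluster node (not executed here):
#   crs_stop -all 2>&1 | failed_stops
```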


Because crs_stop -all leaves resources behind like this, we will go ahead and do it our way. We need to stop our instances first.

[oracle@hpora01 bin]$ srvctl stop instance -d crmrac -i crmrac1
[oracle@hpora01 bin]$ srvctl stop instance -d crmrac -i crmrac2
Check our status
[oracle@hpora01 bin]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.crmrac.db application    OFFLINE   OFFLINE
ora....c1.inst application    OFFLINE   OFFLINE
ora....c2.inst application    OFFLINE   OFFLINE
ora....serv.cs application    ONLINE    UNKNOWN   hpora02
ora....ac1.srv application    OFFLINE   OFFLINE
ora....ac2.srv application    OFFLINE   OFFLINE
ora....SM1.asm application    OFFLINE   OFFLINE
ora....H4.lsnr application    OFFLINE   OFFLINE
ora....rh4.gsd application    ONLINE    UNKNOWN   hpora01
ora....rh4.ons application    OFFLINE   OFFLINE
ora....rh4.vip application    OFFLINE   OFFLINE
ora....SM2.asm application    OFFLINE   UNKNOWN   hpora02
ora....H4.lsnr application    OFFLINE   UNKNOWN   hpora02
ora....rh4.gsd application    ONLINE    UNKNOWN   hpora02
ora....rh4.ons application    OFFLINE   UNKNOWN   hpora02
ora....rh4.vip application    OFFLINE   OFFLINE

Stop the service

[oracle@hpora01 bin]$ srvctl stop service -d crmrac -s crmracserv
Check status again
[oracle@hpora01 bin]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.crmrac.db application    OFFLINE   OFFLINE
ora....c1.inst application    OFFLINE   OFFLINE
ora....c2.inst application    OFFLINE   OFFLINE
ora....serv.cs application    OFFLINE   OFFLINE
ora....ac1.srv application    OFFLINE   OFFLINE
ora....ac2.srv application    OFFLINE   OFFLINE
ora....SM1.asm application    OFFLINE   OFFLINE
ora....H4.lsnr application    OFFLINE   OFFLINE
ora....rh4.gsd application    ONLINE    UNKNOWN   hpora01
ora....rh4.ons application    OFFLINE   OFFLINE
ora....rh4.vip application    OFFLINE   OFFLINE
ora....SM2.asm application    OFFLINE   UNKNOWN   hpora02
ora....H4.lsnr application    OFFLINE   UNKNOWN   hpora02
ora....rh4.gsd application    ONLINE    UNKNOWN   hpora02
ora....rh4.ons application    OFFLINE   UNKNOWN   hpora02
ora....rh4.vip application    OFFLINE   OFFLINE


OK, so now we need to stop the applications that are still in the UNKNOWN state.

[oracle@hpora01 bin]$ crs_stop ora.hpora01.gsd
Attempting to stop `ora.hpora01.gsd` on member `hpora01`
Stop of `ora.hpora01.gsd` on member `hpora01` succeeded.
[oracle@hpora01 bin]$ crs_stop ora.hpora02.ASM2.asm
Attempting to stop `ora.hpora02.ASM2.asm` on member `hpora02`
Stop of `ora.hpora02.ASM2.asm` on member `hpora02` succeeded.
[oracle@hpora01 bin]$ crs_stop ora.hpora02.LISTENER_hpora02.lsnr
Attempting to stop `ora.hpora02.LISTENER_hpora02.lsnr` on member `hpora02`
Stop of `ora.hpora02.LISTENER_hpora02.lsnr` on member `hpora02` succeeded.
[oracle@hpora01 bin]$ crs_stop ora.hpora02.gsd
Attempting to stop `ora.hpora02.gsd` on member `hpora02`
Stop of `ora.hpora02.gsd` on member `hpora02` succeeded.
[oracle@hpora01 bin]$ crs_stop ora.hpora02.ons
Attempting to stop `ora.hpora02.ons` on member `hpora02`
Stop of `ora.hpora02.ons` on member `hpora02` succeeded.
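
Typing one crs_stop per resource gets tedious on a larger cluster. A minimal sketch of the same one-by-one approach as a loop follows; stop_each and the CRS_STOP override are hypothetical names introduced here for illustration, and the loop does nothing beyond calling crs_stop with a single resource name, as in the commands above.

```shell
# stop_each (hypothetical helper): read resource names on stdin, one per
# line, and stop them one at a time. A failure is reported on stderr and
# the loop carries on; the function returns 1 if any stop failed.
# CRS_STOP is an assumed override so the loop can be dry-run, for example
# with CRS_STOP=echo.
stop_each() {
  stop_cmd=${CRS_STOP:-crs_stop}
  failed=0
  while read -r resource _; do
    [ -n "$resource" ] || continue          # skip blank lines
    # </dev/null keeps the command from consuming the list on stdin
    if ! "$stop_cmd" "$resource" </dev/null; then
      echo "could not stop $resource" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Dry run that only prints the names it would stop:
#   printf '%s\n' ora.hpora01.gsd ora.hpora02.ons | CRS_STOP=echo stop_each
```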

Check status

[oracle@hpora01 bin]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.crmrac.db application    OFFLINE   OFFLINE
ora....c1.inst application    OFFLINE   OFFLINE
ora....c2.inst application    OFFLINE   OFFLINE
ora....serv.cs application    OFFLINE   OFFLINE
ora....ac1.srv application    OFFLINE   OFFLINE
ora....ac2.srv application    OFFLINE   OFFLINE
ora....SM1.asm application    OFFLINE   OFFLINE
ora....H4.lsnr application    OFFLINE   OFFLINE
ora....rh4.gsd application    OFFLINE   OFFLINE
ora....rh4.ons application    OFFLINE   OFFLINE
ora....rh4.vip application    OFFLINE   OFFLINE
ora....SM2.asm application    OFFLINE   OFFLINE
ora....H4.lsnr application    OFFLINE   OFFLINE
ora....rh4.gsd application    OFFLINE   OFFLINE
ora....rh4.ons application    OFFLINE   OFFLINE
ora....rh4.vip application    OFFLINE   OFFLINE

OK, all set. Now let's bring them all online.

[oracle@hpora01 bin]$ crs_start -all
Attempting to start `ora.hpora02.vip` on member `hpora02`
Attempting to start `ora.hpora01.vip` on member `hpora01`
Start of `ora.hpora02.vip` on member `hpora02` succeeded.
Start of `ora.hpora01.vip` on member `hpora01` succeeded.
Attempting to start `ora.hpora01.ASM1.asm` on member `hpora01`
Attempting to start `ora.hpora02.ASM2.asm` on member `hpora02`
Start of `ora.hpora02.ASM2.asm` on member `hpora02` succeeded.
Attempting to start `ora.crmrac.crmrac2.inst` on member `hpora02`
Start of `ora.hpora01.ASM1.asm` on member `hpora01` succeeded.
Attempting to start `ora.crmrac.crmrac1.inst` on member `hpora01`
Start of `ora.crmrac.crmrac2.inst` on member `hpora02` succeeded.
Attempting to start `ora.hpora02.LISTENER_hpora02.lsnr` on member `hpora02`
Start of `ora.crmrac.crmrac1.inst` on member `hpora01` succeeded.
Attempting to start `ora.hpora01.LISTENER_hpora01.lsnr` on member `hpora01`
Start of `ora.hpora02.LISTENER_hpora02.lsnr` on member `hpora02` succeeded.
Start of `ora.hpora01.LISTENER_hpora01.lsnr` on member `hpora01` succeeded.
CRS-1002: Resource 'ora.hpora02.ons' is already running on member 'hpora02'
CRS-1002: Resource 'ora.hpora01.ons' is already running on member 'hpora01'
Attempting to start `ora.crmrac.crmracserv.crmrac1.srv` on member `hpora01`
Attempting to start `ora.hpora01.gsd` on member `hpora01`
Attempting to start `ora.crmrac.db` on member `hpora01`
Attempting to start `ora.crmrac.crmracserv.crmrac2.srv` on member `hpora02`
Attempting to start `ora.crmrac.crmracserv.cs` on member `hpora02`
Attempting to start `ora.hpora02.gsd` on member `hpora02`
Start of `ora.crmrac.crmracserv.crmrac2.srv` on member `hpora02` succeeded.
Start of `ora.crmrac.crmracserv.cs` on member `hpora02` succeeded.
Start of `ora.crmrac.db` on member `hpora01` succeeded.
Start of `ora.hpora02.gsd` on member `hpora02` succeeded.
Start of `ora.hpora01.gsd` on member `hpora01` succeeded.
Start of `ora.crmrac.crmracserv.crmrac1.srv` on member `hpora01` succeeded.
*CRS-0223: Resource 'ora.hpora01.ons' has placement error.
CRS-0223: Resource 'ora.hpora02.ons' has placement error.
*Don’t worry about those errors: the resources just did not report back to us in the sequence in which the clusterware started them, and the status check below shows both ons resources ONLINE.

[oracle@hpora01 bin]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.crmrac.db application    ONLINE    ONLINE    hpora01
ora....c1.inst application    ONLINE    ONLINE    hpora01
ora....c2.inst application    ONLINE    ONLINE    hpora02
ora....serv.cs application    ONLINE    ONLINE    hpora02
ora....ac1.srv application    ONLINE    ONLINE    hpora01
ora....ac2.srv application    ONLINE    ONLINE    hpora02
ora....SM1.asm application    ONLINE    ONLINE    hpora01
ora....H4.lsnr application    ONLINE    ONLINE    hpora01
ora....rh4.gsd application    ONLINE    ONLINE    hpora01
ora....rh4.ons application    ONLINE    ONLINE    hpora01
ora....rh4.vip application    ONLINE    ONLINE    hpora01
ora....SM2.asm application    ONLINE    ONLINE    hpora02
ora....H4.lsnr application    ONLINE    ONLINE    hpora02
ora....rh4.gsd application    ONLINE    ONLINE    hpora02
ora....rh4.ons application    ONLINE    ONLINE    hpora02
ora....rh4.vip application    ONLINE    ONLINE    hpora02
[oracle@hpora01 bin]$
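
Reading sixteen rows by eye is easy to get wrong, so the final check could also be scripted. This is a sketch under the assumption that the tabular layout matches the crs_stat -t output above (name, type, target, state, host); all_online is a hypothetical helper name.

```shell
# all_online (hypothetical helper): read `crs_stat -t` output on stdin.
# Prints "name target state" for every resource row whose Target or State
# is not ONLINE, and exits non-zero if there was at least one such row.
all_online() {
  awk '
    $1 ~ /^ora\./ && !($3 == "ONLINE" && $4 == "ONLINE") {
      bad++
      print $1, $3, $4
    }
    END { exit (bad > 0) }
  '
}

# Intended usage on a cluster node (not executed here):
#   crs_stat -t | all_online && echo "all resources ONLINE"
```

Note that crs_stat -t abbreviates long names, so anything this check flags is best looked up by its full name in the long-form crs_stat output.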

Conclusion

This article has shown how the CRS commands (crs_stat, crs_stop and crs_start), together with srvctl, can be used to troubleshoot and fix RAC resources that are stuck in an UNKNOWN or OFFLINE state.
