TIP 104#: Node Eviction in RAC 11gR2 due to temporary network hiccups on heartbeat communication

In the next couple of months, I will examine different eviction scenarios in RAC 11gR2, and whenever I see strange behavior I will do my best to report it in this blog. So RAC lovers, please stay tuned; here is scenario number 1:

As you know, in 11gR2 Oracle uses the UDP protocol for heartbeats between the nodes.
In this post, I present node eviction scenarios in which UDP communication is blocked between the nodes, and you will see that, depending on where and how UDP is blocked, a different outcome occurs.
The tests were done on a two-node RAC running 11.2.0.3 PSU3 on Linux.

Scenario 1: When UDP communication is blocked on the second node


In this scenario, outgoing UDP traffic for the ocssd process on node2 is blocked.
To do so, we find the UDP port on which ocssd is listening and then disable any outgoing traffic from it:

netstat -a --inet |grep -i udp | grep -i racnode2

udp        0      0 racnode2-priv:14081         *:*                                     
udp        0      0 racnode2-priv:52358         *:*                                     
udp        0      0 racnode2-priv:52242         *:*                                     
udp        0      0 racnode2-priv:42517         *:*  --> ocssd
udp        0      0 racnode2-priv:31126         *:*                                     
udp        0      0 racnode2-priv:60741         *:*  

[root@racnode2 ~]# lsof -i :42517
COMMAND    PID USER   FD   TYPE DEVICE SIZE NODE NAME
ocssd.bin 3005 grid   55u  IPv4  22340       UDP racnode2-priv:42517 


Port 42517 is the port from which ocssd on racnode2 sends its heartbeat.
To break heartbeat communication between node2 and node1, all outgoing traffic from racnode2 for the ocssd process (port 42517) is blocked with this command:

iptables -A OUTPUT -s 192.168.2.152 -p udp --sport 42517 -j DROP

The alert log on node1 shows that node2 is evicted (a rebootless eviction) and then reconfigured and rejoined to the cluster:

[cssd(3015)]CRS-1612:Network communication with node racnode2 (2) missing for 50% of timeout interval.  Removal of this node from cluster in 14.840 seconds
2012-12-31 10:50:54.213
[cssd(3015)]CRS-1611:Network communication with node racnode2 (2) missing for 75% of timeout interval.  Removal of this node from cluster in 6.820 seconds
2012-12-31 10:50:58.232
[cssd(3015)]CRS-1610:Network communication with node racnode2 (2) missing for 90% of timeout interval.  Removal of this node from cluster in 2.810 seconds
2012-12-31 10:51:01.056
[cssd(3015)]CRS-1607:Node racnode2 is being evicted in cluster incarnation 249572820; details at (:CSSNM00007:) in /u01/app/11.2.0/grid/log/racnode1/cssd/ocssd.log.
2012-12-31 10:51:02.584
[cssd(3015)]CRS-1625:Node racnode2, number 2, was manually shut down
2012-12-31 10:51:02.590
[cssd(3015)]CRS-1601:CSSD Reconfiguration complete. Active nodes are racnode1 .
2012-12-31 10:51:02.630
[crsd(3393)]CRS-5504:Node down event reported for node 'racnode2'.
2012-12-31 10:51:05.827
[crsd(3393)]CRS-2773:Server 'racnode2' has been removed from pool 'Generic'.
2012-12-31 10:51:05.829
[crsd(3393)]CRS-2773:Server 'racnode2' has been removed from pool 'ora.orcl'.
2012-12-31 10:51:37.987
[cssd(3015)]CRS-1601:CSSD Reconfiguration complete. Active nodes are racnode1 racnode2 .
2012-12-31 10:52:13.720
[crsd(3393)]CRS-2772:Server 'racnode2' has been assigned to pool 'Generic'.
2012-12-31 10:52:13.720
[crsd(3393)]CRS-2772:Server 'racnode2' has been assigned to pool 'Generic'.
2012-12-31 10:52:13.720
[crsd(3393)]CRS-2772:Server 'racnode2' has been assigned to pool 'ora.orcl'.

The ocssd log on racnode2 has more details; I highlighted a couple of key lines:

2012-12-31 10:51:01.129: [    CSSD][3019058064]###################################
2012-12-31 10:51:01.129: [    CSSD][3019058064]clssscExit: CSSD aborting from thread clssnmvKillBlockThread
2012-12-31 10:51:01.129: [    CSSD][3019058064]###################################
.
.
2012-12-31 10:51:02.559: [    CSSD][3029027728]clssgmClientShutdown: total iocapables 0
2012-12-31 10:51:02.559: [    CSSD][3029027728]clssgmClientShutdown: graceful shutdown completed.
2012-12-31 10:51:02.559: [    CSSD][3029027728]clssnmSendManualShut: Notifying all nodes that this node has been manually shut down
.
.
2012-12-31 10:51:25.352: [    CSSD][3040868032]clssscmain: Starting CSS daemon, version 11.2.0.3.0, in (clustered) mode with uniqueness value 1356979885
2012-12-31 10:51:25.353: [    CSSD][3040868032]clssscmain: Environment is production
.
.
2012-12-31 10:51:26.167: [GIPCHTHR][3024477072] gipchaWorkerCreateInterface: created local interface for node 'racnode2', haName 'CSS_racnode-cluster', inf 'udp://192.168.2.152:29788'

A key point here is that UDP is reconfigured to run on a different port, after which node2 is able to join the cluster and start up all of its resources.
The following netstat also confirms that ocssd.bin now listens on the new port:

[root@racnode2 ~]# netstat -a --inet |grep -i udp | grep -i racnode2
udp        0      0 racnode2-priv:31126         *:*                                     
udp        0      0 racnode2-priv:35489         *:*                                     
udp        0      0 racnode2-priv:38321         *:*                                     
udp        0      0 racnode2-priv:60741         *:*                                     
udp        0      0 racnode2-priv:10321         *:*                                     
udp        0      0 racnode2-priv:29788         *:*   --> new port is created.....
   
[root@racnode2 working]# lsof -i :29788
COMMAND    PID USER   FD   TYPE DEVICE SIZE NODE NAME
ocssd.bin 5919 grid   52u  IPv4 764918       UDP racnode2-priv:29788 

To sum up, the following sequence of events occurs:

  •  UDP communication for the heartbeat is blocked (outgoing UDP on the ocssd port)
  •  Node1 evicts Node2
  •  Node2 is able to stop all I/O-capable resources and, as a result, there is no need to reboot the node (an 11g feature)
  •  Node2 restarts CSSD and reconfigures the UDP port
  •  Node2 is able to rejoin the cluster

This sounds perfect, as node2 is able to recover by itself; the recovery is transparent and straightforward.
Let's see how this failure is recovered when the UDP hiccup occurs on node1 (the master node in a two-node RAC).


Scenario 2: When UDP communication is blocked on the first node

Following the same steps, the UDP port for the heartbeat is found and blocked, as shown below:

bash-3.2$ netstat -a --inet |grep -i udp | grep -i racnode1
udp        0      0 racnode1-priv:36613         *:*   
udp        0      0 racnode1-priv:36892         *:*   
udp        0      0 racnode1-priv:26055         *:*   
udp        0      0 racnode1-priv:13167         *:*   
udp        0      0 racnode1-priv:17914         *:*   
udp        0      0 racnode1-priv:51067         *:*   

[root@racnode1 ~]# lsof -i :36613
COMMAND    PID USER   FD   TYPE DEVICE SIZE NODE NAME
ocssd.bin 3010 grid   55u  IPv4  19676       UDP racnode1-priv:36613 

To block the heartbeat, all outgoing traffic on port 36613 is blocked:
iptables -A OUTPUT -s 192.168.2.151 -p udp --sport 36613 -j DROP

Based on scenario 1, I expected to see the same sequence of events; in other words, I expected node2 to be evicted, reconfigured, and rejoined to the cluster.
However, in this case node2 is evicted and then, as shown below, CSSD hangs while starting up and trying to join the cluster.

[root@racnode1 ~]# crsctl check cluster -all
**************************************************************
racnode1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
racnode2:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
**************************************************************
[root@racnode2 ~]# crsctl stat res -init -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               Abnormal Termination
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE                                                   
ora.crf
      1        ONLINE  ONLINE       racnode2                                     
ora.crsd
      1        ONLINE  OFFLINE                                                   
ora.cssd
      1        ONLINE  OFFLINE                               STARTING      
ora.cssdmonitor
      1        ONLINE  ONLINE       racnode2                                     
ora.ctssd
      1        ONLINE  OFFLINE                                                   
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.drivers.acfs
      1        ONLINE  ONLINE       racnode2                                     
ora.evmd
      1        ONLINE  OFFLINE                                                   
ora.gipcd
      1        ONLINE  ONLINE       racnode2                                     
ora.gpnpd
      1        ONLINE  ONLINE       racnode2                                     
ora.mdnsd
      1        ONLINE  ONLINE       racnode2          

Even unblocking the port by dropping the rule from iptables does not help; CSS on node2 is still unable to join the cluster.

iptables -L      
      
iptables -D OUTPUT -s 192.168.2.151 -p udp --sport 36613 -j DROP


[root@racnode1 ~]# iptables -L      
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination  


After reviewing all the logs (quite lengthy, so I avoid copying them here!), it is seen that ocssd on node2 complains about the network heartbeat and no reconfiguration is attempted; on node1, after UDP was blocked, the interface was disabled and no attempt was made to set up communication differently.
As I mentioned earlier, although the UDP port is unblocked, the following errors are still reported repeatedly on node2 and node1.

Node 2
==========
 [    CSSD][3013077904]clssnmvDHBValidateNcopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 249572810, wrtcnt, 181183, LATS 1471304, lastSeqNo 181182, uniqueness 1356788464, timestamp 1356790643/1482244
 
Node1 
=========
[GIPCHALO][3023862672] gipchaLowerProcessNode: no valid interfaces found to node for 25790 ms, node 0xa062a88 { host 'racnode2', haName 'CSS_racnode-cluster', srcLuid be28e076-9f3aafb1, dstLuid 61a1f895-ba260945 numInf 0, contigSeq 2639, lastAck 2626, lastValidAck 2638, sendSeq [2627 : 2683], createTime 4294328280, sentRegister 1, localMonitor 1, flags 0x2408 }

It turned out that the issue is reported as Bug 14281269: "NODE CAN'T REJOIN THE CLUSTER AFTER A TEMPORARY INTERCONNECT FAILURE - PROBLEM: after an interconnect failure on the first node the second node restarts the clusterware (rebootless restart) as expected, but can't join the cluster again till the interconnect interface of node1 is not shutdown/startup manually".
At the time of posting this, there is no patch available, and the suggested workaround is to bounce the interconnect interface.
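For reference, bouncing the private interconnect interface would look roughly like the sketch below; eth1 is a hypothetical interface name, so substitute your own interconnect NIC:

# hypothetical: eth1 is the private interconnect NIC on node1
ifconfig eth1 down
ifconfig eth1 up
# or, on distributions with network scripts:
# ifdown eth1 && ifup eth1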

In my test, even bouncing node2 (the evicted node) did not help, so I ended up killing the gipc daemon on node1 (the master/surviving node); that did the trick: the whole cluster recovered and node2 was able to join.

[root@racnode1 working]# ps -ef |grep -i gipc
grid      2961     1  0 05:40 ?        00:00:16 /u01/app/11.2.0/grid/bin/gipcd.bin
root      7709  4792  0 06:44 pts/1    00:00:00 grep -i gipc
[root@racnode1 working]# kill -9 2961
[root@racnode1 working]# ps -ef |grep -i gipc
grid      7717     1 15 06:44 ?        00:00:00 /u01/app/11.2.0/grid/bin/gipcd.bin
root      7755  4792  0 06:44 pts/1    00:00:00 grep -i gipc


[/u01/app/11.2.0/grid/bin/oraagent.bin(3528)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/oraagent_grid' disconnected from server. Details at (:CRSAGF00117:) {0:1:5} in /u01/app/11.2.0/grid/log/racnode1/agent/crsd/oraagent_grid/oraagent_grid.log.
2012-12-29 06:44:56.972
[/u01/app/11.2.0/grid/bin/orarootagent.bin(3535)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:2:23} in /u01/app/11.2.0/grid/log/racnode1/agent/crsd/orarootagent_root/orarootagent_root.log.
2012-12-29 06:44:56.974
[/u01/app/11.2.0/grid/bin/oraagent.bin(3741)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/oraagent_oracle' disconnected from server. Details at (:CRSAGF00117:) {0:5:63} in /u01/app/11.2.0/grid/log/racnode1/agent/crsd/oraagent_oracle/oraagent_oracle.log.
2012-12-29 06:44:57.098
[ohasd(2414)]CRS-2765:Resource 'ora.ctssd' has failed on server 'racnode1'.
2012-12-29 06:44:59.141
[ctssd(7732)]CRS-2401:The Cluster Time Synchronization Service started on host racnode1.
2012-12-29 06:44:59.141
[ctssd(7732)]CRS-2407:The new Cluster Time Synchronization Service reference node is host racnode1.
2012-12-29 06:45:01.164
[cssd(3010)]CRS-1601:CSSD Reconfiguration complete. Active nodes are racnode1 racnode2 .
2012-12-29 06:45:02.363
[crsd(7759)]CRS-1012:The OCR service started on node racnode1.
2012-12-29 06:45:03.155
[evmd(7762)]CRS-1401:EVMD started on node racnode1.
2012-12-29 06:45:05.147
[crsd(7759)]CRS-1201:CRSD started on node racnode1.
2012-12-29 06:45:38.798
[crsd(7759)]CRS-2772:Server 'racnode2' has been assigned to pool 'Generic'.
2012-12-29 06:45:38.800
[crsd(7759)]CRS-2772:Server 'racnode2' has been assigned to pool 'ora.orcl'.


===== alert log for node2 =========

2012-12-29 06:39:38.132
[cssd(7700)]CRS-1605:CSSD voting file is online: /dev/sda1; details in /u01/app/11.2.0/grid/log/racnode2/cssd/ocssd.log.
2012-12-29 06:45:01.165
[cssd(7700)]CRS-1601:CSSD Reconfiguration complete. Active nodes are racnode1 racnode2 .
2012-12-29 06:45:03.641
[ctssd(8061)]CRS-2401:The Cluster Time Synchronization Service started on host racnode2.
2012-12-29 06:45:03.641
[ctssd(8061)]CRS-2407:The new Cluster Time Synchronization Service reference node is host racnode1.
2012-12-29 06:45:05.257
[ohasd(2405)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2012-12-29 06:45:16.836
[ctssd(8061)]CRS-2408:The clock on host racnode2 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
2012-12-29 06:45:25.475
[crsd(8199)]CRS-1012:The OCR service started on node racnode2.
2012-12-29 06:45:25.541
[evmd(8079)]CRS-1401:EVMD started on node racnode2.
2012-12-29 06:45:27.331
[crsd(8199)]CRS-1201:CRSD started on node racnode2.
2012-12-29 06:45:35.642
[/u01/app/11.2.0/grid/bin/oraagent.bin(8321)]CRS-5016:Process "/u01/app/11.2.0/grid/opmn/bin/onsctli" spawned by agent "/u01/app/11.2.0/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11.2.0/grid/log/racnode2/agent/crsd/oraagent_grid/oraagent_grid.log"
2012-12-29 06:45:36.181
[/u01/app/11.2.0/grid/bin/oraagent.bin(8347)]CRS-5011:Check of resource "orcl" failed: details at "(:CLSN00007:)" in "/u01/app/11.2.0/grid/log/racnode2/agent/crsd/oraagent_oracle/oraagent_oracle.log"
2012-12-29 06:45:37.301
[/u01/app/11.2.0/grid/bin/oraagent.bin(8321)]CRS-5016:Process "/u01/app/11.2.0/grid/bin/lsnrctl" spawned by agent "/u01/app/11.2.0/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11.2.0/grid/log/racnode2/agent/crsd/oraagent_grid/oraagent_grid.log"

To conclude, in a two-node RAC:

  1. A network hiccup on the heartbeat port of node2 is recovered automatically.
  2. A network hiccup on the heartbeat port of node1 requires manual intervention due to bug 14281269.
  3. Due to several reported bugs, it is recommended to be on at least 11.2.0.3 PSU3; check out the MOS note for other bugs: List of gipc defects that prevent GI from starting/joining after network hiccups [ID 1488378.1]

 




TIP 103# : GoldenGate and statistics

After setting up GoldenGate in your environment, the main question you will most likely face is whether GoldenGate is working and, if so, how much work has been done.
GoldenGate provides the stats command to report the work that has been done; the following shows how to use it properly and effectively.

To get stats from now onwards, follow the steps below:

1. Reset the numbers : stats extract ext_int, reset
2. Apply the changes
3. Retrieve the stats :
                     stats extract ext_int, table xxxx.yyyy, latest
                     stats extract ext_int, totalsonly xxxx.yyyy, latest


Please be careful: if you do not reset the numbers, all reported statistics are measured since the startup of the GoldenGate process. In other words, if the numbers are not reset, the stats commands behave as follows:

stats extract ext_int, totalsonly xxxx.yyyyy   --> since startup of ext_int
stats extract ext_int, daily, table xxxx.yyyy  --> since the start of the day
stats extract ext_int, daily, table owner.*    --> for all tables, since the start of the day


Hope this helps those who may otherwise find the numbers misleading or incorrect.


TIP 102#: GoldenGate replicat performance

One of the nice features of GoldenGate is BATCHSQL, which enables the GoldenGate replicat to organize similar SQL statements into arrays and apply them at once, instead of the normal mode of applying SQL statements one by one. The Oracle documentation says that at around 5K bytes of changed data per row, BATCHSQL loses its benefits.
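For reference, a minimal replicat parameter sketch with both options enabled could look like the following; the group name, login, and MAP statement are hypothetical placeholders rather than my actual test configuration:

REPLICAT rena_in
USERID repuser, PASSWORD ******
ASSUMETARGETDEFS
-- BATCHSQL arrays similar statements; INSERTAPPEND makes inserts use direct path
BATCHSQL
INSERTAPPEND
MAP xxxx.yyyy, TARGET repuser.tbl1;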

The following shows a sample run of BATCHSQL with INSERTAPPEND and how they save resources and improve performance.

Scenario 1 : Small table with massive changes

create table repuser.tbl1 (id number, name varchar2(30));
alter table repuser.tbl1 add constraint uc1 unique (id);


Initial load :

SQL> delete from repuser.tbl1;

0 rows deleted.

SQL> insert into repuser.tbl1 select object_id*-1,object_name from all_objects where object_id is not null;

56135 rows created.

SQL> c/-1/-100000
1* insert into repuser.tbl1 select object_id*-100000,object_name from all_objects where object_id is not null
SQL> /

56135 rows created.

SQL> c/-100000/1
1* insert into repuser.tbl1 select object_id*1,object_name from all_objects where object_id is not null
SQL> /

56135 rows created.

SQL> commit;

Commit complete. 

To measure the impact of BATCHSQL on performance, the changes are captured on Oracle 11203 using GG 11201 and shipped to a target also running Oracle 11203 with GG 11201. The following shows different replicat settings and how each one affects performance.

Replicat setting 1 : Default setting - No BATCHSQL, No INSERTAPPEND


GGSCI (localhost.localdomain) 33> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     RENA_IN     00:22:24      00:00:02    

It takes more than 22 minutes to apply the changes (224540 deletes and 224540 inserts).


The following trace shows that the delete and the insert each ran 224540 times, and that it took 1296.81+27.72=1324.53 seconds of elapsed time to complete the executions of these two statements (overheads and other internal executions are ignored).


DELETE FROM "REPUSER"."TBL1"
WHERE
"ID" = :b0 AND ROWNUM = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 224540   1261.54    1296.81       2979  567004335     238878      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   224541   1261.54    1296.81       2979  567004335     238878      224540

INSERT INTO "REPUSER"."TBL1" ("ID","NAME")
VALUES
(:a0,:a1)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 224540     26.91      27.72          0        787     237105      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   224541     26.91      27.72          0        787     237105      224540



Replicat setting 2 : BATCHSQL, INSERTAPPEND




GGSCI (localhost.localdomain) 23> !
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     RENA_IN     00:07:04      00:00:00    

The same changes took only a bit more than 7 minutes to complete, which is more than 3 times faster than setting 1.


The following shows that the delete and the insert each ran only 450 times, and it took only 452.10+0.71=452.81 seconds of elapsed time to complete the executions of the two statements.
It also shows that GoldenGate runs the insert in direct path by adding hints.


DELETE FROM "REPUSER"."TBL1"
WHERE
"ID" = :b0 AND ROWNUM = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    450    440.00     452.10         12  139050526     238818      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      451    440.00     452.10         12  139050526     238818      224540

INSERT /*+ APPEND APPEND_VALUES */ INTO "REPUSER"."TBL1" ("ID","NAME")
VALUES
(:a0,:a1)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    450      0.60       0.71          0        922       5263      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      451      0.60       0.71          0        922       5263      224540



Replicat setting 3 : BATCHSQL, No INSERTAPPEND




GGSCI (localhost.localdomain) 23> !
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     RENA_IN     00:07:54      00:00:00    

It took around 8 minutes to apply the same changes. The timing is very close to BATCHSQL/INSERTAPPEND, and still almost 3 times faster than traditional GoldenGate.

The trace also shows 450 executions of the delete and 450 executions of the insert for the same amount of change. The elapsed time is 518.87+0.48=519.35 seconds.



DELETE FROM "REPUSER"."TBL1"
WHERE
"ID" = :b0 AND ROWNUM = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    450    509.74     518.87          0  168600692     238673      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      451    509.74     518.87          0  168600692     238673      224540


INSERT INTO "REPUSER"."TBL1" ("ID","NAME")
VALUES
(:a0,:a1)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    450      0.46       0.48          0       2312      11070      224540
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      451      0.46       0.48          0       2312      11070      224540

So far, BATCHSQL and INSERTAPPEND sound promising for massive changes on a small table. In the next post, I will test this feature with a wider table to see how it performs with changes of around 5KB per row.

TIP 101#: GoldenGate and Archivelog cleanup

In earlier versions of GoldenGate, there was always a concern about not cleaning up archivelogs that are still required by GoldenGate (while it runs, or in case it is bounced).
The way I used to handle archivelog cleanup in a GoldenGate environment was to have a script that extracts the sequence holding the recovery checkpoint of each extract and makes sure not to clean up any archivelog at or beyond the minimum of those sequences.
However, in the new version of GoldenGate, Oracle introduced integration of RMAN with GoldenGate, which makes RMAN aware of GoldenGate and the archivelogs it requires.
This feature needs some configuration for classic capture, and it should work automatically for integrated capture (Oracle's newly introduced GoldenGate capture).
I ran a test on integrated capture in 11.2.0.1 on Linux 32-bit, and the following shows that although RMAN is aware of GoldenGate, because of a bug(?), no archivelog generated after the startup of the extract is cleaned up by RMAN (a risk of filling the archivelog destination!).

Please see the following example :

Extract checkpoints


GGSCI (localhost.localdomain) 15> info extract ext_int, showch

EXTRACT    EXT_INT   Last Started 2012-09-15 12:40   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:10 ago)
Log Read Checkpoint  Oracle Integrated Redo Logs
2012-09-15 13:07:26
SCN 0.6199997 (6199997)


Current Checkpoint Detail:

Read Checkpoint #1

Oracle Integrated Redo Log

Startup Checkpoint (starting position in the data source):
Timestamp: 2012-09-15 12:40:41.000000
SCN: Not available

  Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
Timestamp: 2012-09-15 13:07:26.000000
SCN: 0.6199997 (6199997)

Current Checkpoint (position of last record read in the data source):
Timestamp: 2012-09-15 13:07:26.000000
SCN: 0.6199997 (6199997)

Write Checkpoint #1

GGS Log Trail

Current Checkpoint (current write position):
Sequence #: 15
RBA: 1065
Timestamp: 2012-09-15 13:07:30.085648
Extract Trail: ./dirdat/tg

CSN state information:
CRC: D5-D0-B1-2C
CSN: Not available

Header:
Version = 2
Record Source = A
Type = 12
# Input Checkpoints = 1
# Output Checkpoints = 1

File Information:
Block Size = 2048
Max Blocks = 100
Record Length = 2048
Current Offset = 0

Configuration:
Data Source = 3
Transaction Integrity = 1
Task Type = 0

Status:
Start Time = 2012-09-15 12:40:57
Last Update Time = 2012-09-15 13:07:30
Stop Status = A
Last Result = 400


The recovery checkpoint is at SCN 6199997, which according to the following query maps to sequence 664.


select 'arch',sequence#,first_time,next_time,RESETLOGS_TIME from v$archived_log where &your_scn between first_change# and next_change#
union all
select 'log-hist',sequence#,first_time,null,RESETLOGS_TIME from v$log_history where &your_scn between first_change# and next_change#
union all
select 'log-current',sequence#,first_time,next_time,null from v$log where &your_scn between first_change# and next_change#

arch 664 09/15/2012 13:05:41 09/15/2012 13:07:34 07/26/2012 08:03:23
log-hist 664 09/15/2012 13:05:41  07/26/2012 08:03:23
log-current 664 09/15/2012 13:05:41 09/15/2012 13:07:34 


Checking the archive destination

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u02/oracle/app/oracle/product/arch
Oldest online log sequence     663
Next log sequence to archive   665
Current log sequence           665
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, Data Mining and Real Application Testing options
[oracle@localhost ~]$ cd  /u02/oracle/app/oracle/product/arch

[oracle@localhost arch]$ ls -ltr
total 458680
-rw-rw---- 1 oracle oracle 41370624 Sep 12 18:18 1_647_789638603.dbf
-rw-rw---- 1 oracle oracle 44616192 Sep 12 18:27 1_648_789638603.dbf
-rw-rw---- 1 oracle oracle 45473792 Sep 12 18:27 1_649_789638603.dbf
-rw-rw---- 1 oracle oracle 43464192 Sep 12 18:28 1_650_789638603.dbf
-rw-rw---- 1 oracle oracle 45917696 Sep 12 18:28 1_651_789638603.dbf
-rw-rw---- 1 oracle oracle 45963776 Sep 12 18:28 1_652_789638603.dbf
-rw-rw---- 1 oracle oracle 44632576 Sep 12 18:29 1_653_789638603.dbf
-rw-rw---- 1 oracle oracle 41152512 Sep 12 18:29 1_654_789638603.dbf
-rw-rw---- 1 oracle oracle 46688256 Sep 12 18:30 1_655_789638603.dbf
-rw-rw---- 1 oracle oracle 28416512 Sep 15 12:26 1_656_789638603.dbf
-rw-rw---- 1 oracle oracle   103936 Sep 15 12:27 1_657_789638603.dbf
-rw-rw---- 1 oracle oracle 33522176 Sep 15 12:37 1_658_789638603.dbf
-rw-rw---- 1 oracle oracle  1222144 Sep 15 12:49 1_659_789638603.dbf
-rw-rw---- 1 oracle oracle  6498816 Sep 15 13:05 1_660_789638603.dbf
-rw-rw---- 1 oracle oracle     1024 Sep 15 13:05 1_661_789638603.dbf
-rw-rw---- 1 oracle oracle     7680 Sep 15 13:05 1_662_789638603.dbf
-rw-rw---- 1 oracle oracle     2560 Sep 15 13:05 1_663_789638603.dbf
-rw-rw---- 1 oracle oracle    67072 Sep 15 13:07 1_664_789638603.dbf


I expected RMAN to clean up all archivelogs except 664.
Running the RMAN cleanup:


[oracle@localhost arch]$ rman target /
Recovery Manager: Release 11.2.0.3.0 - Production on Sat Sep 15 13:11:29 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
connected to target database: ORCL1123 (DBID=3438744075)
RMAN> delete archivelog all;
.
.
.
Recovery Manager complete.
[oracle@localhost arch]$ ls -ltr
total 68316
-rw-rw---- 1 oracle oracle 28416512 Sep 15 12:26 1_656_789638603.dbf
-rw-rw---- 1 oracle oracle   103936 Sep 15 12:27 1_657_789638603.dbf
-rw-rw---- 1 oracle oracle 33522176 Sep 15 12:37 1_658_789638603.dbf
-rw-rw---- 1 oracle oracle  1222144 Sep 15 12:49 1_659_789638603.dbf
-rw-rw---- 1 oracle oracle  6498816 Sep 15 13:05 1_660_789638603.dbf
-rw-rw---- 1 oracle oracle     1024 Sep 15 13:05 1_661_789638603.dbf
-rw-rw---- 1 oracle oracle     7680 Sep 15 13:05 1_662_789638603.dbf
-rw-rw---- 1 oracle oracle     2560 Sep 15 13:05 1_663_789638603.dbf
-rw-rw---- 1 oracle oracle    67072 Sep 15 13:07 1_664_789638603.dbf



For some reason, RMAN keeps the archivelogs from sequence 656 to 664.
No retention policy for archivelogs is defined.


Checking the capture process started by integrated capture:


select session_restart_scn from v$streams_capture;

6156108


This SCN is the SCN from which LogMiner started.
Based on the following query, this SCN maps to sequence 656.


select 'arch',sequence#,first_time,next_time,RESETLOGS_TIME from v$archived_log where &your_scn between first_change# and next_change#
union all
select 'log-hist',sequence#,first_time,null,RESETLOGS_TIME from v$log_history where &your_scn between first_change# and next_change#
union all
select 'log-current',sequence#,first_time,next_time,null from v$log where &your_scn between first_change# and next_change#


6156108  ==> sequence 656 

'ARCH' SEQUENCE# FIRST_TIME NEXT_TIME RESETLOGS_TIME
arch 656 09/12/2012 18:30:07 09/15/2012 12:26:54 07/26/2012 08:03:23
log-hist 656 09/12/2012 18:30:07  07/26/2012 08:03:23


RMAN keeps all archivelogs from the SCN at which integrated capture was started.
To confirm, I bounced the extract and verified that RMAN then moved to the new SCN from which the extract restarted.


stop extract ext_int
start extract ext_int
select session_restart_scn from v$streams_capture;

6200279


The above SCN maps to sequence 665.
Now, if I run the RMAN archivelog deletion, all archivelogs before sequence 665 are removed.


[oracle@localhost arch]$ ls -ltr
total 69468
-rw-rw---- 1 oracle oracle 28416512 Sep 15 12:26 1_656_789638603.dbf
-rw-rw---- 1 oracle oracle   103936 Sep 15 12:27 1_657_789638603.dbf
-rw-rw---- 1 oracle oracle 33522176 Sep 15 12:37 1_658_789638603.dbf
-rw-rw---- 1 oracle oracle  1222144 Sep 15 12:49 1_659_789638603.dbf
-rw-rw---- 1 oracle oracle  6498816 Sep 15 13:05 1_660_789638603.dbf
-rw-rw---- 1 oracle oracle     1024 Sep 15 13:05 1_661_789638603.dbf
-rw-rw---- 1 oracle oracle     7680 Sep 15 13:05 1_662_789638603.dbf
-rw-rw---- 1 oracle oracle     2560 Sep 15 13:05 1_663_789638603.dbf
-rw-rw---- 1 oracle oracle    67072 Sep 15 13:07 1_664_789638603.dbf
-rw-rw---- 1 oracle oracle  1161728 Sep 15 13:23 1_665_789638603.dbf
-rw-rw---- 1 oracle oracle     3584 Sep 15 13:23 1_666_789638603.dbf
-rw-rw---- 1 oracle oracle     6656 Sep 15 13:23 1_667_789638603.dbf

rman target /
delete archivelog all;


[oracle@localhost arch]$ ls -ltr
total 1152  
-rw-rw---- 1 oracle oracle 1161728 Sep 15 13:23 1_665_789638603.dbf
-rw-rw---- 1 oracle oracle    3584 Sep 15 13:23 1_666_789638603.dbf
-rw-rw---- 1 oracle oracle    6656 Sep 15 13:23 1_667_789638603.dbf



Whether you use integrated capture or classic GoldenGate capture, it is safer to prepare your own archivelog cleanup script based on the recovery checkpoints of all extract processes. Currently, RMAN only cleans up archivelogs up to the SCN at which the extracts were started, so if you want to rely on RMAN cleanup, make sure you plan to bounce the extracts regularly AND watch the archivelog destination closely.
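As a starting point, such a script could map the minimum recovery-checkpoint SCN of all extracts to a log sequence (using the query shown earlier) and then delete only the older sequences; this is a hedged sketch, and the sequence number 663 is a hypothetical example:

-- map the minimum recovery-checkpoint SCN of all extracts to a sequence#
select min(sequence#) from v$archived_log
 where &min_recovery_scn between first_change# and next_change#;

-- then, in RMAN, delete everything older than the sequence still needed (663 is hypothetical)
RMAN> delete noprompt archivelog until sequence 663;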



TIP 100#: OpenFiler and Oracle Virtual Box

If you want to build a lab for Oracle RAC, I am with you; I know the pain, especially if you are trying to build your home lab on Oracle VirtualBox.
One of the first things that comes to mind regarding RAC is shared storage. The options you could consider are Fibre Channel(?), iSCSI, and a network file system.
Given the cost of Fibre Channel and the complexity of maintaining and configuring a network file system, it looks like iSCSI is still one of the leading technologies for shared storage, so it makes sense to use iSCSI to build the shared storage area and keep the lab as close as possible to a real-world setup.
To do so, the nodes need an iSCSI initiator to issue iSCSI requests (initiators are available in Open-iSCSI). For the target, you can use OpenFiler, which simulates a storage array that holds the data we want and can answer requests from the iSCSI initiators (the nodes).
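On the node side, discovering and logging in to the OpenFiler target would look roughly like this; the IP address is a hypothetical example for the OpenFiler box:

# hypothetical OpenFiler address; discover its iSCSI targets, then log in
iscsiadm -m discovery -t sendtargets -p 192.168.1.200
iscsiadm -m node -l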

If you use Oracle VirtualBox like me, the challenge for a quick setup is that neither Oracle nor OpenFiler provides a pre-built appliance for Oracle VirtualBox. Searching the Internet, different sites also suggest installing OpenFiler from its ISO image, which makes the installation longer.
However, since OpenFiler has released a pre-built disk for VMware, a better solution is to download the VMware pre-built OpenFiler appliance and then convert it to a format that Oracle VirtualBox can import. In more detail, the steps are as follows:




1. Download the OpenFiler VMware appliance - Link
2. Install OVFTool from VMware and use the following command to convert the appliance from VMX to OVF (see the example after this list):

ovftool -st=VMX -tt=OVF upload_file_with_full_path target_file_with_full_path

3. Import the OVF file into VirtualBox.
4. Start the newly imported virtual machine.
5. Use the OpenFiler URL to connect to the storage and architect/administer it.
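For example, with hypothetical paths (adjust to wherever you unpacked the appliance):

# hypothetical paths; the source is the .vmx file of the downloaded appliance
ovftool -st=VMX -tt=OVF /vm/openfiler/openfiler.vmx /vm/openfiler/openfiler.ovf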




TIP 99# : Connection to VMs in Oracle Virtual Box

For a couple of months, I have been involved in setting up a lab, and for this I started using Oracle VirtualBox with some pre-built templates to speed up the initial process. It was always in the back of my mind: once it is built, how are users in different offices going to connect and use it? Also, what happens when a lab on my laptop moves from one network to another?
I was considering WebEx as the worst-case fallback until I saw an Oracle article describing the different connection options with which each VM can be set up in VirtualBox. They are listed below, and for my case, in which users want to log in to the VMs via SSH or SQL*Developer, Bridged Networking is the best.
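For reference, switching a VM to bridged mode from the command line would look something like this; "racnode1" and eth0 are hypothetical names for the VM and the host NIC:

# hypothetical VM name and host adapter; attach NIC 1 in bridged mode
VBoxManage modifyvm "racnode1" --nic1 bridged --bridgeadapter1 eth0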

The Oracle article has a diagram for each mode; I list the modes below for ease of reference. If you are interested in more details, please check the link at the end of this post.

NAT, Bridged Networking, Internal Networking, and Host-Only Networking (diagrams for each mode are in the referenced Oracle article).

Reference : https://blogs.oracle.com/fatbloke/entry/networking_in_virtualbox1

TIP 98#: Oracle Restart

I was working on a database maintenance a couple of days ago and, interestingly enough, as shown below, I noticed that the listener was somehow started up again after I killed it...
The environment is an 11202 standalone database on Linux 32-bit using ASM.

[oracle@localhost bin]$ ps -ef |grep tns
oracle 4365 1 0 15:29 ? 00:00:00 /home/oracle/app/oracle/product/11.2.0/grid/bin/tnslsnr LISTENER -inherit
oracle 4370 3513 0 15:29 pts/1 00:00:00 grep tns
[oracle@localhost bin]$ kill 4365
[oracle@localhost bin]$ ps -ef |grep tns
oracle 4373 3513 0 15:29 pts/1 00:00:00 grep tns
[oracle@localhost bin]$ ps -ef |grep tns
oracle 4375 3513 0 15:29 pts/1 00:00:00 grep tns
[oracle@localhost bin]$ ps -ef |grep tns
oracle 4385 1 0 15:30 ? 00:00:00 /home/oracle/app/oracle/product/11.2.0/grid/bin/tnslsnr LISTENER -inherit
oracle 4394 3513 0 15:31 pts/1 00:00:00 grep tns


This turned out to be due to Oracle Restart.
For those of you who are not familiar with it: starting with 11gR2, ASM as well as Oracle Restart are installed out of the Grid Infrastructure home.
Oracle Restart provides managed startup and restart of a single-instance (non-clustered) Oracle database, Oracle ASM instance, services, listener, and any other process running on the server. Oracle Restart is used in standalone (non-clustered) environments only; for RAC, Clusterware provides this functionality.

This explains why the listener was started up automatically. So now that we know why, the question is: how do we stop the different components in 11gR2 for a single-instance DB?

Oracle recommends using srvctl to stop/start the different components.
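For example, a hedged srvctl sketch for an Oracle Restart environment; the database name orcl is a hypothetical placeholder:

# "orcl" is a hypothetical database unique name
srvctl stop listener
srvctl stop database -d orcl
srvctl stop asm
# and the reverse to bring everything back:
srvctl start asm
srvctl start database -d orcl
srvctl start listener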
However, tools such as sqlplus, lsnrctl, and asmcmd are integrated with Oracle Restart; in other words, if a component is stopped/started via these tools, Oracle Restart will not try to intervene.
Also, if there is a case such as maintenance/patching in which the DBA wants to keep some components down and does not want Oracle Restart to intervene, the best option is to disable Oracle Restart before starting the maintenance. To do so, follow the steps below:



-------- To stop/disable Oracle Restart --------
cd $GI_HOME/bin
./crsctl stop has
./crsctl disable has
-------- To start/enable Oracle Restart --------
cd $GI_HOME/bin
./crsctl enable has
./crsctl start has




For more information about Oracle Restart and how to bounce the DB in 11g, please check out: Oracle® Database Administrator's Guide, 11g Release 2 (11.2), Part Number E25494-02



TIP 97#: Apache2 in AIX



I was working for a client to upgrade an existing AS10g installation to Fusion Middleware 11g. As part of the upgrade, Apache is upgraded to a version > 2. Interestingly enough, starting Apache on a port below 1024 threw the following error, even though the usual permission change had been applied.


[2012-05-15T10:27:53.6378-04:00] [OHS] [INCIDENT_ERROR:20] [OHS-9999] [mod_onsint.c] [host_id: comet.isc-seo.upenn.edu] [host
_addr: 165.123.81.34] [pid: 5308510] [tid: 1] [user: root] [VirtualHost: main] (22)Invalid argument: setgid: unable to set group id to Group 4294967295


This turned out to be due to the fact that Apache on a port below 1024 can only be started as a named user/group other than nobody. In other words, check httpd.conf and make sure User and Group are set in the first place, and set to anything but nobody. As I mentioned, Apache in AS10g could still be started without this setting, but not anymore in Fusion 11g!
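For example, a minimal httpd.conf fragment; the user and group names here are hypothetical, so use whichever named OS user/group owns your installation:

# a named user/group; leaving these as "nobody" fails on AIX below port 1024
User  oracle
Group oinstall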

I have not tested whether this is the same on other platforms; please feel free to chime in.

TIP 96#: Good to know/be reminded of (2)

In my effort to share some basics that are often forgotten or missed, today I have the following two items.


Index for FK columns
********************

Do I need an FK index for every single FK column? I would say yes, unless all of the following conditions are met:

1. The primary key in the parent table is not updated.
2. Records in the parent table are not deleted (either with delete cascade or without).
3. The parent and child tables are not accessed together in a join condition.

If at least one of these conditions is not met, the benefit you gain from an indexed FK is much higher than the cost of carrying the index.
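For illustration, a minimal sketch with hypothetical parent/child tables:

-- hypothetical tables: index the FK column on the child side
create table parent (id number primary key);
create table child  (id        number primary key,
                     parent_id number,
                     constraint fk_child_parent
                       foreign key (parent_id) references parent (id));
create index ix_child_parent_id on child (parent_id);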


Out of memory at instance startup
*********************************

The DB box has enough physical memory, but when you start up an instance with an SGA well below the free physical memory, you get an out-of-memory error.
This is most likely due to shmall and shmmax: shmall is the total shared memory (in pages) that can be used on the system, and shmmax is the maximum size in bytes of a single shared memory segment (chunk) that can be allocated.
If shmall is lower than the SGA, you will see the issue.


getconf PAGESIZE
4096
cat /proc/sys/kernel/shmall
4194304

max shared memory = 4194304 x 4096 bytes = 16 GB
The SGA cannot be set higher than 16 GB, even if the box has 32 GB, 64 GB, or much more physical memory.
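To allow a bigger SGA, raise shmall (and shmmax) accordingly. A hedged sketch for a 32 GB SGA with 4 KB pages; size the values for your own box:

# 32 GB / 4096-byte pages = 8388608 pages
sysctl -w kernel.shmall=8388608
# shmmax is in bytes; 32 GB = 34359738368
sysctl -w kernel.shmmax=34359738368
# add both to /etc/sysctl.conf so they survive a reboot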