How to avoid using a third NFS-mounted voting file when using Oracle ASM to mirror to a redundant disk array / SAN… maybe.

The Aim and the Issue

We want to use ASM to mirror data between two SANs so that a cluster can survive the loss of one.  Normal redundancy and two failure groups will achieve this for everything except voting disks, which need an odd number of failure groups.  Also, more than half of the voting files must remain accessible for the cluster nodes to remain up.
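The majority rule can be checked with a little arithmetic.  The following is a minimal sketch in Python, not anything Oracle provides; the function names are made up for illustration.  It enumerates every way of placing three voting files on two SANs and tests whether a majority survives the loss of either SAN.

```python
def has_majority(accessible, configured):
    """True when strictly more than half of the configured voting files are accessible."""
    return 2 * accessible > configured


def survives_any_san_loss(files_per_san):
    """files_per_san: number of voting files on each SAN, e.g. (2, 1).

    Returns True only if losing any single SAN still leaves a majority.
    """
    total = sum(files_per_san)
    return all(has_majority(total - lost, total) for lost in files_per_san)


if __name__ == "__main__":
    # Every split of three voting files across two SANs.
    for placement in [(3, 0), (2, 1), (1, 2), (0, 3)]:
        print(placement, survives_any_san_loss(placement))
    # Three independent locations, one file each (the NFS-style layout).
    print((1, 1, 1), survives_any_san_loss((1, 1, 1)))
```

With a static placement, no split across two SANs survives the loss of the SAN holding the larger share, whereas three separate locations do.  That is why a third location is normally suggested, and why the approach described here depends on a voting file being relocated after the failure.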

The only suggestion I could find from Oracle was to use an NFS mount from another server for the third voting file (even though Oracle also say that this is only allowed for extended clusters).  I can understand why a voting file in three sites is required for extended clusters, but when our aim is only to protect against SAN failure and all the equipment is stored in the same server room, do we really have to use NFS mounts for the third voting file?

The Possible Solution

With a bit of black magic, I might have found a way to avoid using an NFS voting disk when mirroring across arrays with ASM in 11.2.0.2.  I cannot be sure because the Solaris administrators couldn’t simulate a SAN failure to my satisfaction.

SAN Failure Simulation

We used dd to write over the raw LUNs, but the cluster ran all night without noticing.  The voting file location had already been determined (AU 64) and was written to directly, without needing any ASM operations, re-opening the device or reading headers.  Removing the device files or changing their file permissions had no effect either.  In this version of Oracle, the cluster processes don’t appear to re-read the voting file data regularly, not even when CRS on a node is stopped or when the voting files are relocated.  (I expect voting file data to be read when the network heartbeat fails or CRS starts, but I didn’t test this.)

Our alternative method was to use the SAN management software to un-map the LUNs from the servers, which did cause errors.  But, even though we un-mapped both LUNs for the same SAN in the same second, there was a gap of 8-9 seconds between the I/O errors reported for each LUN in the cluster logs.  (The initial errors for one of the two un-mapped LUNs were ‘retryable’ due to LUN reconfiguration.)  The hot spare disk was utilised 3 seconds after the first failure, so I don’t know if this was a fair test.
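To show why the timing matters, here is a rough Python sketch that replays the failure as a timeline and applies the "more than half" rule after each step.  It is a simplified model under stated assumptions, not a description of how CSS actually behaves: the event offsets (0, 3 and 9 seconds) are rounded from the figures in the paragraph above, the names replay, STAGGERED and SIMULTANEOUS are invented for illustration, and a relocation is modelled as simply swapping a failed voting file for a new one on the hot spare.

```python
from itertools import groupby

VOTING_FILES = ["VOTETEST_01", "VOTETEST_02", "VOTETEST_04"]

# What the test showed: the two LUNs errored several seconds apart.
STAGGERED = [(0, "fail", "VOTETEST_04"),
             (3, "relocate", "VOTETEST_03"),
             (9, "fail", "VOTETEST_02")]

# Hypothetical worst case: both LUNs error before any relocation.
SIMULTANEOUS = [(0, "fail", "VOTETEST_04"),
                (0, "fail", "VOTETEST_02"),
                (3, "relocate", "VOTETEST_03")]


def replay(voting_files, events):
    """Return [(time, accessible, configured, majority_ok), ...] per timestamp."""
    configured = set(voting_files)
    failed = set()
    history = []
    for when, batch in groupby(sorted(events), key=lambda e: e[0]):
        for _, action, disk in batch:
            if action == "fail":
                failed.add(disk)
            elif action == "relocate":
                lost = sorted(configured & failed)
                if lost:  # replace one failed voting file with the spare
                    configured.discard(lost[0])
                    configured.add(disk)
        accessible = len(configured - failed)
        history.append((when, accessible, len(configured),
                        2 * accessible > len(configured)))
    return history


if __name__ == "__main__":
    for name, events in [("staggered", STAGGERED), ("simultaneous", SIMULTANEOUS)]:
        print(name, replay(VOTING_FILES, events))
```

In the staggered case the model never drops below two accessible voting files out of three.  In the simultaneous case it drops to one out of three until the relocation happens, and whether the cluster would ride out that window is exactly what this test could not establish.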

Overview

  • Four disks in separate failgroups, two on each SAN, in one normal redundancy diskgroup that is dedicated to voting files.  (OCR, ASM spfile and datafiles on separate disk groups).
  • Configure the disk partnership so that each disk is mirrored only to the two disks/failgroups on the other SAN.  (This required the temporary use of an ‘underscore parameter’ and a carefully ordered create diskgroup statement.)
  • In our test, three disks had voting files and PSTs (partnership status tables) on them, and the fourth was ‘empty’ – a hot spare.  (Verified using Oracle’s kfed tool).
  • We unmapped the two disks on the same SAN that stored VF and PST data to simulate a SAN failure.
  • ASM quickly copied the surviving PST and VF to the hot spare disk automatically.  No outage of the cluster.  The diskgroup stayed online.
  • After a short time the un-mapped ASM disks were cleanly expelled from the disk group automatically.

I’ve sent my solution to Oracle Support for comment, but they have been slow to respond.

The Details

ASM Disk Partnership

ASM redundancy is implemented by mirroring primary extents to disks in different failure groups.  Simply mirroring one disk to another might cause hot spots, so each disk is partnered with as many other disks as possible, up to a maximum of 8 other disks (in 11.2.0.2).  This limit reduces the chances of data loss when two disks fail, but only where there are more than 8 disks in other failure groups.  It might be better if the number of partners scaled up to a maximum of 8, depending on the number of disks available.

This default strategy causes a problem for the design of the voting disk group.  We want to use four disks/failure groups, two on each SAN.  The default mirroring strategy will mirror the extents on each disk to the other three disks.  Losing one SAN could cause both the primary and mirror extents to be lost, and so the disk group would be forced offline.  The cluster would stay up without the voting disk group mounted, as long as two voting files are directly accessible.  However, I want the surviving voting file to be able to be copied to the hot spare if two voting disks are lost.  Also, the partnership status table (PST) needs to have a quorum of 2/3 available, so it has to be able to automatically recover to the hot spare disk too.

What we have to do is ensure the failure groups on the same SAN will never be partners.  This would usually be accomplished by only having two failure groups, one for each SAN, but in this case we need more than two for the voting files.

We can set an undocumented Oracle initialisation parameter temporarily, to limit the number of partners each disk uses for mirroring.  I will set it to 1 in this example, but the actual minimum will be 2 when we have more than two failgroups in a disk group.

alter system set "_asm_partner_target_disk_part"=1 scope=memory;

Next, we need to create the disk group so that failure groups for SAN1 are partnered to SAN2 failure groups. Until Oracle allows control over this, we’ll have to rely on the order we add disks to achieve this.

Disks are always numbered in the order of the create disk group command.  You may need to experiment a little to find the pattern in each case, but in my tests the disks were always partnered as follows:

disk 0 - 1,2
disk 1 - 0,3
disk 2 - 0,3
disk 3 - 1,2

create diskgroup VOTETEST normal redundancy
failgroup fg1_SAN1 disk '/asm/disks/VOTETEST_01'
failgroup fg4_SAN2 disk '/asm/disks/VOTETEST_04'
failgroup fg2_SAN2 disk '/asm/disks/VOTETEST_02'
failgroup fg3_SAN1 disk '/asm/disks/VOTETEST_03'
attribute 'COMPATIBLE.ASM' = '11.2.0.0.0';
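Before relying on a layout like this, it is worth checking that no disk ended up partnered with a disk on its own SAN.  The sketch below is a small Python check, assuming the disk numbering follows the order of the create diskgroup statement above and using the partner lists from my tests.  The names SAN_OF_DISK, PARTNERS, same_san_partnerships and is_symmetric are made up for illustration.

```python
# Disk number -> SAN, in the order the disks appear in the create diskgroup statement.
SAN_OF_DISK = {0: "SAN1", 1: "SAN2", 2: "SAN2", 3: "SAN1"}

# Partner lists as observed in the tests described in this post.
PARTNERS = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}


def same_san_partnerships(partners, san_of_disk):
    """Return sorted (disk, partner) pairs that share a SAN; an empty list is the goal."""
    return sorted(
        (disk, partner)
        for disk, plist in partners.items()
        for partner in plist
        if san_of_disk[disk] == san_of_disk[partner]
    )


def is_symmetric(partners):
    """True when every partnership is recorded in both directions."""
    return all(disk in partners[p] for disk, plist in partners.items() for p in plist)


if __name__ == "__main__":
    print("same-SAN partnerships:", same_san_partnerships(PARTNERS, SAN_OF_DISK))
    print("symmetric:", is_symmetric(PARTNERS))
```

If the default strategy had partnered every disk with the other three, the same check would report the pairs (0, 3), (1, 2), (2, 1) and (3, 0), which are the partnerships this design is trying to avoid.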

                      Primary                                       Mirror
                      Fail                                          Fail
GROUP_NAME            Group          PDsk PRIMARY_DISK_PATH         Group          MDsk MIRROR_DISK_PATH
--------------------- -------------- ---- ------------------------- -------------- ---- -------------------------
VOTETEST              FG1_SAN1          0 /asm/disks/VOTETEST_01    FG2_SAN2          2 /asm/disks/VOTETEST_02
                                                                    FG4_SAN2          1 /asm/disks/VOTETEST_04
                      FG2_SAN2          2 /asm/disks/VOTETEST_02    FG1_SAN1          0 /asm/disks/VOTETEST_01
                                                                    FG3_SAN1          3 /asm/disks/VOTETEST_03
                      FG3_SAN1          3 /asm/disks/VOTETEST_03    FG2_SAN2          2 /asm/disks/VOTETEST_02
                                                                    FG4_SAN2          1 /asm/disks/VOTETEST_04
                      FG4_SAN2          1 /asm/disks/VOTETEST_04    FG1_SAN1          0 /asm/disks/VOTETEST_01
                                                                    FG3_SAN1          3 /asm/disks/VOTETEST_03

GRP Dsk MOUNT_S HEADER_STATU MODE_ST STATE    FAILGROUP                  LABEL                            READ_ERRS WRITE_ERRS V
--- --- ------- ------------ ------- -------- -------------------------- ------------------------------- ---------- ---------- -
  4   0 CACHED  MEMBER       ONLINE  NORMAL   FG1_SAN1                                                            0          0 Y
      2 CACHED  MEMBER       ONLINE  NORMAL   FG2_SAN2                                                            0          0 Y
      3 CACHED  MEMBER       ONLINE  NORMAL   FG3_SAN1                                                            0          0 N
      1 CACHED  MEMBER       ONLINE  NORMAL   FG4_SAN2                                                            0          0 Y

                                           Fail                                                           Repair
                                           Group                                                           Timer
GRP Dsk REDUNDA FAILGROUP                  Type    PATH                              PRODUCT                 Sec CREATE_DA MOUNT_DAT
--- --- ------- -------------------------- ------- --------------------------------- -------------------- ------ --------- ---------
  4   0 UNKNOWN FG1_SAN1                   REGULAR /asm/disks/VOTETEST_01                                      0 20-JUN-11 20-JUN-11
      2 UNKNOWN FG2_SAN2                   REGULAR /asm/disks/VOTETEST_02                                      0 20-JUN-11 20-JUN-11
      3 UNKNOWN FG3_SAN1                   REGULAR /asm/disks/VOTETEST_03                                      0 20-JUN-11 20-JUN-11
      1 UNKNOWN FG4_SAN2                   REGULAR /asm/disks/VOTETEST_04                                      0 20-JUN-11 20-JUN-11

                                                                                                       OS Total  Free Sectr
GRP Dsk FAILGROUP                  PATH                              NAME                              GB    GB    GB  Size
--- --- -------------------------- --------------------------------- ------------------------------ ----- ----- ----- -----
  4   0 FG1_SAN1                   /asm/disks/VOTETEST_01            VOTETEST_0000                      1     1     1   512
      1 FG4_SAN2                   /asm/disks/VOTETEST_04            VOTETEST_0001                      1     1     1   512
      2 FG2_SAN2                   /asm/disks/VOTETEST_02            VOTETEST_0002                      1     1     1   512
      3 FG3_SAN1                   /asm/disks/VOTETEST_03            VOTETEST_0003                      1     1     1   512

                                    Sectr  Block       AU                    Offl             Database
GRP Disk Group                      Size   Size     Size STATE       TYPE   Dsks COMPATIBILI Compatibili V
--- ------------------------------ ----- ------ -------- ----------- ------ ---- ----------- ----------- -
  4 VOTETEST                         512   4096  1048576 MOUNTED     NORMAL    0 11.2.0.0.0  10.1.0.0.0  Y

Tue Jun 21 10:26:57 BST 2011
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   27d754d38a234f4bbf44d8afdbb204ff (/asm/disks/VOTETEST_01) [VOTETEST]
 2. ONLINE   0d1766b0f3c34fe0bff3bd901ef4327b (/asm/disks/VOTETEST_04) [VOTETEST]
 3. ONLINE   3184ff5f43784f76bf6a9a934ca2ae80 (/asm/disks/VOTETEST_02) [VOTETEST]
Located 3 voting disk(s).
$ kfed dev=/asm/disks/VOTETEST_04 op=read aun=1 blkn=1 | grep  type
kfbh.type:                           17 ; 0x002: KFBTYP_PST_META
$ kfed dev=/asm/disks/VOTETEST_03 op=read aun=1 blkn=1 | grep  type
kfbh.type:                           13 ; 0x002: KFBTYP_PST_NONE
$ kfed dev=/asm/disks/VOTETEST_02 op=read aun=1 blkn=1 | grep  type
kfbh.type:                           17 ; 0x002: KFBTYP_PST_META
$ kfed dev=/asm/disks/VOTETEST_01 op=read aun=1 blkn=1 | grep  type
kfbh.type:                           17 ; 0x002: KFBTYP_PST_META
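
Reading the kfed output by eye is fine for four disks, but the same check can be scripted.  The following Python sketch only parses text in the format shown above; it does not run kfed itself.  It assumes, as I do in this post, that a block type of KFBTYP_PST_META means the disk holds a PST copy and KFBTYP_PST_NONE means it does not.  The names pst_block_type, has_pst and SAMPLE are made up for illustration.

```python
import re

# Matches lines such as: "kfbh.type:      17 ; 0x002: KFBTYP_PST_META"
KFBH_TYPE = re.compile(r"^kfbh\.type:\s+(\d+)\s*;\s*0x[0-9a-fA-F]+:\s*(\w+)")


def pst_block_type(kfed_line):
    """Return the symbolic block type from a kfbh.type line, or None if it does not match."""
    match = KFBH_TYPE.match(kfed_line.strip())
    return match.group(2) if match else None


def has_pst(kfed_line):
    """True when the line reports a PST metadata block (assumed to mean a PST copy)."""
    return pst_block_type(kfed_line) == "KFBTYP_PST_META"


# The grep output captured above, keyed by device path.
SAMPLE = {
    "/asm/disks/VOTETEST_04": "kfbh.type:                           17 ; 0x002: KFBTYP_PST_META",
    "/asm/disks/VOTETEST_03": "kfbh.type:                           13 ; 0x002: KFBTYP_PST_NONE",
    "/asm/disks/VOTETEST_02": "kfbh.type:                           17 ; 0x002: KFBTYP_PST_META",
    "/asm/disks/VOTETEST_01": "kfbh.type:                           17 ; 0x002: KFBTYP_PST_META",
}

if __name__ == "__main__":
    for path in sorted(SAMPLE):
        print(path, "PST" if has_pst(SAMPLE[path]) else "no PST")
```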

So:

Disk                     SAN  VF  PST
/asm/disks/VOTETEST_01   1    Y   Y
/asm/disks/VOTETEST_02   2    Y   Y
/asm/disks/VOTETEST_03   1    N   N
/asm/disks/VOTETEST_04   2    Y   Y

After the SAN Failure

GRP Dsk MOUNT_S HEADER_STATU MODE_ST STATE    FAILGROUP                  LABEL                            READ_ERRS WRITE_ERRS V
--- --- ------- ------------ ------- -------- -------------------------- ------------------------------- ---------- ---------- -
  4   0 CACHED  MEMBER       ONLINE  NORMAL   FG1_SAN1                                                            0          0 Y
      2 MISSING UNKNOWN      OFFLINE NORMAL   FG2_SAN2                                                                         N
      3 CACHED  MEMBER       ONLINE  NORMAL   FG3_SAN1                                                            0          0 Y
      1 MISSING UNKNOWN      OFFLINE NORMAL   FG4_SAN2                                                                         N

                                           Fail                                                           Repair
                                           Group                                                           Timer
GRP Dsk REDUNDA FAILGROUP                  Type    PATH                              PRODUCT                 Sec CREATE_DA MOUNT_DAT
--- --- ------- -------------------------- ------- --------------------------------- -------------------- ------ --------- ---------
  4   0 UNKNOWN FG1_SAN1                   REGULAR /asm/disks/VOTETEST_01                                      0 20-JUN-11 20-JUN-11
      2 UNKNOWN FG2_SAN2                   REGULAR                                                             0 20-JUN-11 20-JUN-11
      3 UNKNOWN FG3_SAN1                   REGULAR /asm/disks/VOTETEST_03                                      0 20-JUN-11 20-JUN-11
      1 UNKNOWN FG4_SAN2                   REGULAR                                                             0 20-JUN-11 20-JUN-11

                                                                                                       OS Total  Free Sectr
GRP Dsk FAILGROUP                  PATH                              NAME                              GB    GB    GB  Size
--- --- -------------------------- --------------------------------- ------------------------------ ----- ----- ----- -----
  4   0 FG1_SAN1                   /asm/disks/VOTETEST_01            VOTETEST_0000                      1     1     1   512
      1 FG4_SAN2                                                     VOTETEST_0001                      0     1     1     0
      2 FG2_SAN2                                                     VOTETEST_0002                      0     1     1     0
      3 FG3_SAN1                   /asm/disks/VOTETEST_03            VOTETEST_0003                      1     1     1   512

                                   Sectr  Block       AU                    Offl             Database
GRP Disk Group                      Size   Size     Size STATE       TYPE   Dsks COMPATIBILI Compatibili V
--- ------------------------------ ----- ------ -------- ----------- ------ ---- ----------- ----------- -
  4 VOTETEST                         512   4096  1048576 MOUNTED     NORMAL    2 11.2.0.0.0  10.1.0.0.0  Y

                      Primary                                       Mirror
                      Fail                                          Fail
GROUP_NAME            Group          PDsk PRIMARY_DISK_PATH         Group          MDsk MIRROR_DISK_PATH
--------------------- -------------- ---- ------------------------- -------------- ---- -------------------------
VOTETEST              FG1_SAN1          0 /asm/disks/VOTETEST_01    FG2_SAN2          2
                                                                    FG4_SAN2          1
                      FG2_SAN2          2                           FG1_SAN1          0 /asm/disks/VOTETEST_01
                                                                    FG3_SAN1          3 /asm/disks/VOTETEST_03
                      FG3_SAN1          3 /asm/disks/VOTETEST_03    FG4_SAN2          1
                                                                    FG2_SAN2          2
                      FG4_SAN2          1                           FG1_SAN1          0 /asm/disks/VOTETEST_01
                                                                    FG3_SAN1          3 /asm/disks/VOTETEST_03

After a bit longer…

GRP Dsk MOUNT_S HEADER_STATU MODE_ST STATE    FAILGROUP                  LABEL                            READ_ERRS WRITE_ERRS V
--- --- ------- ------------ ------- -------- -------------------------- ------------------------------- ---------- ---------- -
  4   0 CACHED  MEMBER       ONLINE  NORMAL   FG1_SAN1                                                            0          0 Y
      3 CACHED  MEMBER       ONLINE  NORMAL   FG3_SAN1                                                            0          0 Y

                                           Fail                                                           Repair
                                           Group                                                           Timer
GRP Dsk REDUNDA FAILGROUP                  Type    PATH                              PRODUCT                 Sec CREATE_DA MOUNT_DAT
--- --- ------- -------------------------- ------- --------------------------------- -------------------- ------ --------- ---------
  4   0 UNKNOWN FG1_SAN1                   REGULAR /asm/disks/VOTETEST_01                                      0 20-JUN-11 20-JUN-11
      3 UNKNOWN FG3_SAN1                   REGULAR /asm/disks/VOTETEST_03                                      0 20-JUN-11 20-JUN-11

                                                                                                       OS Total  Free Sectr
GRP Dsk FAILGROUP                  PATH                              NAME                              GB    GB    GB  Size
--- --- -------------------------- --------------------------------- ------------------------------ ----- ----- ----- -----
  4   0 FG1_SAN1                   /asm/disks/VOTETEST_01            VOTETEST_0000                      1     1     1   512
      3 FG3_SAN1                   /asm/disks/VOTETEST_03            VOTETEST_0003                      1     1     1   512

                                   Sectr  Block       AU                    Offl             Database
GRP Disk Group                      Size   Size     Size STATE       TYPE   Dsks COMPATIBILI Compatibili V
--- ------------------------------ ----- ------ -------- ----------- ------ ---- ----------- ----------- -
  4 VOTETEST                         512   4096  1048576 MOUNTED     NORMAL    0 11.2.0.0.0  10.1.0.0.0  Y

                      Primary                                       Mirror
                      Fail                                          Fail
GROUP_NAME            Group          PDsk PRIMARY_DISK_PATH         Group          MDsk MIRROR_DISK_PATH
--------------------- -------------- ---- ------------------------- -------------- ---- -------------------------
VOTETEST              FG1_SAN1          0 /asm/disks/VOTETEST_01    FG3_SAN1          3 /asm/disks/VOTETEST_03
                      FG3_SAN1          3 /asm/disks/VOTETEST_03    FG1_SAN1          0 /asm/disks/VOTETEST_01
Tue Jun 21 10:30:09 BST 2011
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   27d754d38a234f4bbf44d8afdbb204ff (/asm/disks/VOTETEST_01) [VOTETEST]
 2. ONLINE   0d543a81bb704f76bf1632902fd9fb1b (/asm/disks/VOTETEST_03) [VOTETEST]
Located 2 voting disk(s).

The Logs

tail -999f alert_*log
Tue Jun 21 10:28:48 2011
Errors in file /data/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_gmon_24680.trc:
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 4096
WARNING: Write Failed. group:4 disk:1 AU:1 offset:1044480 size:4096
WARNING: disk 1.4050638309 (VOTETEST_0001) not responding to heart beat
Tue Jun 21 10:28:48 2011
NOTE: process 5474 initiating offline of disk 1.4050638309 (VOTETEST_0001) with mask 0x7e in group 4
NOTE: checking PST: grp = 4
GMON checking disk modes for group 4 at 114 for pid 28, osid 5474
NOTE: group VOTETEST: updated PST location: disk 0000 (PST copy 0)
NOTE: group VOTETEST: updated PST location: disk 0002 (PST copy 1)
NOTE: group VOTETEST: updated PST location: disk 0003 (PST copy 2)
NOTE: checking PST for grp 4 done.
WARNING: Disk VOTETEST_0001 in mode 0x7f is now being offlined
WARNING: Disk VOTETEST_0001 in mode 0x7f is now being taken offline
NOTE: initiating PST update: grp = 4, dsk = 1/0xf16fd5e5, mode = 0x15
GMON updating disk modes for group 4 at 115 for pid 28, osid 5474
NOTE: PST update grp = 4 completed successfully
NOTE: initiating PST update: grp = 4, dsk = 1/0xf16fd5e5, mode = 0x1
GMON updating disk modes for group 4 at 116 for pid 28, osid 5474
NOTE: cache closing disk 1 of grp 4: VOTETEST_0001
NOTE: PST update grp = 4 completed successfully
Tue Jun 21 10:28:51 2011
NOTE: Attempting voting file refresh on diskgroup VOTETEST
NOTE: Voting file relocation is required in diskgroup VOTETEST
NOTE: Attempting voting file relocation on diskgroup VOTETEST
NOTE: voting file allocation on grp 4 disk VOTETEST_0003
Errors in file /data/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_gmon_24680.trc:
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 4096
WARNING: Write Failed. group:4 disk:2 AU:1 offset:1044480 size:4096
WARNING: disk 2.4050638307 (VOTETEST_0002) not responding to heart beat
NOTE: process 5474 initiating offline of disk 2.4050638307 (VOTETEST_0002) with mask 0x7e in group 4
NOTE: checking PST: grp = 4
GMON checking disk modes for group 4 at 117 for pid 28, osid 5474
NOTE: group VOTETEST: updated PST location: disk 0000 (PST copy 0)
NOTE: group VOTETEST: updated PST location: disk 0003 (PST copy 1)
NOTE: checking PST for grp 4 done.
WARNING: Disk VOTETEST_0002 in mode 0x7f is now being offlined
WARNING: Disk VOTETEST_0002 in mode 0x7f is now being taken offline
NOTE: initiating PST update: grp = 4, dsk = 2/0xf16fd5e3, mode = 0x15
GMON updating disk modes for group 4 at 118 for pid 28, osid 5474
NOTE: group VOTETEST: updated PST location: disk 0000 (PST copy 0)
NOTE: group VOTETEST: updated PST location: disk 0003 (PST copy 1)
NOTE: group VOTETEST: updated PST location: disk 0000 (PST copy 0)
NOTE: group VOTETEST: updated PST location: disk 0003 (PST copy 1)
NOTE: PST update grp = 4 completed successfully
NOTE: initiating PST update: grp = 4, dsk = 2/0xf16fd5e3, mode = 0x1
GMON updating disk modes for group 4 at 119 for pid 28, osid 5474
NOTE: group VOTETEST: updated PST location: disk 0000 (PST copy 0)
NOTE: group VOTETEST: updated PST location: disk 0003 (PST copy 1)
NOTE: group VOTETEST: updated PST location: disk 0000 (PST copy 0)
NOTE: group VOTETEST: updated PST location: disk 0003 (PST copy 1)
NOTE: cache closing disk 2 of grp 4: VOTETEST_0002
NOTE: PST update grp = 4 completed successfully
NOTE: Attempting voting file refresh on diskgroup VOTETEST
NOTE: Voting file relocation is required in diskgroup VOTETEST
NOTE: Attempting voting file relocation on diskgroup VOTETEST
Tue Jun 21 10:30:55 2011
WARNING: PST-initiated drop of 2 disk(s) in group 4(.1625236994))
SQL> alter diskgroup VOTETEST drop disk VOTETEST_0001 force, VOTETEST_0002 force /* ASM SERVER */
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=4
Tue Jun 21 10:30:56 2011
GMON updating for reconfiguration, group 4 at 120 for pid 28, osid 5474
NOTE: group VOTETEST: updated PST location: disk 0000 (PST copy 0)
NOTE: group VOTETEST: updated PST location: disk 0003 (PST copy 1)
NOTE: group VOTETEST: updated PST location: disk 0000 (PST copy 0)
NOTE: group VOTETEST: updated PST location: disk 0003 (PST copy 1)
NOTE: group 4 PST updated.
Tue Jun 21 10:30:57 2011
NOTE: membership refresh pending for group 4/0x60df2602 (VOTETEST)
GMON querying group 4 at 121 for pid 18, osid 24678
NOTE: group VOTETEST: updated PST location: disk 0000 (PST copy 0)
NOTE: group VOTETEST: updated PST location: disk 0003 (PST copy 1)
SUCCESS: refreshed membership for 4/0x60df2602 (VOTETEST)
NOTE: starting rebalance of group 4/0x60df2602 (VOTETEST) at power 1
SUCCESS: alter diskgroup VOTETEST drop disk VOTETEST_0001 force, VOTETEST_0002 force /* ASM SERVER */
SUCCESS: PST-initiated drop disk in group 4(1625236994))
Starting background process ARB0
Tue Jun 21 10:30:59 2011
ARB0 started with pid=32, OS id=5583
NOTE: assigning ARB0 to group 4/0x60df2602 (VOTETEST) with 1 parallel I/O

[grid@dice]$ tail -f ocssd.log
2011-06-21 10:28:46.895: [    CSSD][39]clssnmSendingThread: sending status msg to all nodes
2011-06-21 10:28:46.895: [    CSSD][39]clssnmSendingThread: sent 4 status msgs to all nodes
2011-06-21 10:28:47.369: [   SKGFD][47]ERROR: -9(Error 27072, OS Error (SVR4 Error: 5: I/O error
Additional information: 4
Additional information: 131088
Additional information: -1)
)
2011-06-21 10:28:47.369: [    CSSD][47](:CSSNM00060:)clssnmvReadBlocks: read failed at offset 16 of /asm/disks/VOTETEST_04
2011-06-21 10:28:47.369: [    CSSD][47]clssnmvDiskAvailabilityChange: voting file /asm/disks/VOTETEST_04 now offline
2011-06-21 10:28:47.369: [    CSSD][47]clssnmvScanCompletions: completed 1 items
2011-06-21 10:28:48.005: [   SKGFD][45]Lib :UFS:: closing handle 101190d90 for disk :/asm/disks/VOTETEST_04:
....
2011-06-21 10:28:52.005: [    CSSD][45]clssnmvDiskOpen: Opening /asm/disks/VOTETEST_04
2011-06-21 10:28:52.005: [   SKGFD][45]ERROR: -9(Error 27041, OS Error (SVR4 Error: 5: I/O error
Additional information: 9)
)
2011-06-21 10:28:52.006: [    CSSD][45]clssnmvGetDiskHandle: Unable to open disk /asm/disks/VOTETEST_04
2011-06-21 10:28:52.006: [    CSSD][45]clssnmvDiskOpen:failed to open /asm/disks/VOTETEST_04
....
2011-06-21 10:28:52.536: [    CSSD][5]  Listing unique IDs for 3 voting files:
2011-06-21 10:28:52.536: [    CSSD][5]    voting file 1: 27d754d3-8a234f4b-bf44d8af-dbb204ff
2011-06-21 10:28:52.536: [    CSSD][5]    voting file 2: 3184ff5f-43784f76-bf6a9a93-4ca2ae80
2011-06-21 10:28:52.536: [    CSSD][5]    voting file 3: 0d543a81-bb704f76-bf163290-2fd9fb1b
2011-06-21 10:28:52.537: [    CSSD][5]clssnmSendVFDiscover: Sending discover voting files request
2011-06-21 10:28:52.537: [    CSSD][41]clssnmHandleVFDiscover: Processing voting file discovery  requested by node dice, number 1
2011-06-21 10:28:52.682: [    CSSD][6]clssnmReadDiscoveryProfile: voting file discovery string(/asm/disks)
2011-06-21 10:28:52.682: [    CSSD][6]clssnmvDDiscThread: using discovery string /asm/disks for voting file add
....
2011-06-21 10:28:52.959: [    CSSD][6]clssnmvDiskVerify: Successful discovery of 1 disks
2011-06-21 10:28:52.959: [    CSSD][6]clssnmCompleteRmtDiscoveryReq: Completing voting file discovery  requested by node dice, number 1
2011-06-21 10:28:52.959: [    CSSD][6]clssnmSendDiscoverAck: Discovery complete, notifying requestor node dice
2011-06-21 10:28:52.960: [    CSSD][41]clssnmvDiskStateChange: state from discovered to pending disk /asm/disks/VOTETEST_03
....
2011-06-21 10:28:52.966: [    CSSD][41]clssnmvDiskAvailabilityChange: voting file /asm/disks/VOTETEST_03 now online
2011-06-21 10:28:52.967: [   SKGFD][41]Lib :UFS:: closing handle 101190910 for disk :/asm/disks/VOTETEST_03:
2011-06-21 10:28:52.967: [    CSSD][57]clssnmvWorkerThread: spawned for disk /asm/disks/VOTETEST_03
2011-06-21 10:28:52.967: [    CSSD][41]clssnmHandleVFDiscoverAck: Copying lease blocks to new voting files

+ASM1_gmon_24680.trc

POST res = 12
=============== PST ====================
grpNum:    4
state:     1
callCnt:   110
(lockvalue) valid=0 version=1.1 ndisks=0 flags=0x2 from inst=1 (I am 1) version=3
--------------- HDR --------------------
next:    1
last:    1
pst count:       3
pst locations:   0  1  2
dta size:        4
version:         1
ASM version:     186646528 = 11.2.0.0.0
--------------- LOC MAP ----------------
0: dirty 0       cur_loc: 0      stable_loc: 0
1: dirty 0       cur_loc: 1      stable_loc: 1
--------------- DTA --------------------
0: sts v v(rw) p(rw) a(x) d(x) fg# = 1 addTs = 1839798475 parts: 2 (amp) 1 (amp)
1: sts v v(rw) p(rw) a(x) d(x) fg# = 2 addTs = 1839798475 parts: 0 (amp) 3 (amp)
2: sts v v(rw) p(rw) a(x) d(x) fg# = 3 addTs = 1839798475 parts: 0 (amp) 3 (amp)
3: sts v v(rw) p(rw) a(x) d(x) fg# = 4 addTs = 1839798475 parts: 1 (amp) 2 (amp)
--------------- HBEAT ------------------
state=3, inst=1, ts=32955025.2783754240, rnd=4252265020.1374678618.872105960.2928318545.
--------------- KFGRP ------------------
kfgrp: VOTETEST number: 4/1625236994 type: 2 compat: 11.2.0.0.0 dbcompat:1.0.0.0.1
timestamp: 28071436 state: 4 flags: 10 gpnlist: 8a8fa928 8a8fa928
KFGPN at 8a8fa868 in dependent chain
  kfdsk:38cbb20f8
  disk: VOTETEST_0000 num: 0/4050638306 grp: 4/1625236994 compat: 11.2.0.0.0 dbcompat:10.1.0.0.0
  fg: FG1_SAN1 path: /asm/disks/VOTETEST_01
  mnt: O hdr: M mode: v v(rw) p(rw) a(x) d(x) sta: N flg: 1001
    kfts: 2011/06/20 17:35:11.946000
    kfts: 2011/06/20 17:41:30.201000
  pcnt: 2 (2 1)
    kfkid: 38cbcb968, kfknm: , status: IDENTIFIED
      fob: (KSFD)8e00c488, magic: bebe ausize: 1048576
    kfdds: dn=0 inc=4050638306 dsk=38cbb20f8 usrp=0
      kfkds ffffffff7bb01798, kfkid 38cbcb968, magic abbe, libnum 0, bpau 2048, fob 38e01f488
  kfdsk:38cbb2b00
  disk: VOTETEST_0001 num: 1/4050638309 grp: 4/1625236994 compat: 11.2.0.0.0 dbcompat:10.1.0.0.0
  fg: FG4_SAN2 path: /asm/disks/VOTETEST_04
  mnt: O hdr: M mode: v v(rw) p(rw) a(x) d(x) sta: N flg: 1001
    kfts: 2011/06/20 17:35:11.946000
    kfts: 2011/06/20 17:41:30.201000
  pcnt: 2 (0 3)
    kfkid: 38cbcc538, kfknm: , status: IDENTIFIED
      fob: (KSFD)8e0200f0, magic: bebe ausize: 1048576
    kfdds: dn=1 inc=4050638309 dsk=38cbb2b00 usrp=0
      kfkds ffffffff7bb016f8, kfkid 38cbcc538, magic abbe, libnum 0, bpau 2048, fob 38e020718
  kfdsk:38cbb2448
  disk: VOTETEST_0002 num: 2/4050638307 grp: 4/1625236994 compat: 11.2.0.0.0 dbcompat:10.1.0.0.0
  fg: FG2_SAN2 path: /asm/disks/VOTETEST_02
  mnt: O hdr: M mode: v v(rw) p(rw) a(x) d(x) sta: N flg: 1001
    kfts: 2011/06/20 17:35:11.946000
    kfts: 2011/06/20 17:41:30.201000
  pcnt: 2 (0 3)
    kfkid: 38cbcbd50, kfknm: , status: IDENTIFIED
      fob: (KSFD)8e017878, magic: bebe ausize: 1048576
    kfdds: dn=2 inc=4050638307 dsk=38cbb2448 usrp=0
      kfkds ffffffff7bb01658, kfkid 38cbcbd50, magic abbe, libnum 0, bpau 2048, fob 38e022628
  kfdsk:38cbb27b0
  disk: VOTETEST_0003 num: 3/4050638308 grp: 4/1625236994 compat: 11.2.0.0.0 dbcompat:10.1.0.0.0
  fg: FG3_SAN1 path: /asm/disks/VOTETEST_03
  mnt: O hdr: M mode: v v(rw) p(rw) a(x) d(x) sta: N flg: 1001
    kfts: 2011/06/20 17:35:11.946000
    kfts: 2011/06/20 17:41:30.201000
  pcnt: 2 (1 2)
    kfkid: 38cbcc150, kfknm: , status: IDENTIFIED
      fob: (KSFD)8e01ee48, magic: bebe ausize: 1048576
    kfdds: dn=3 inc=4050638308 dsk=38cbb27b0 usrp=0
      kfkds ffffffff7bb015b8, kfkid 38cbcc150, magic abbe, libnum 0, bpau 2048, fob 38e022c50

*** 2011-06-20 17:41:36.822
GMON querying group 4 at 112 for pid 27, osid 26501

*** 2011-06-20 17:41:39.064
GMON querying group 4 at 113 for pid 18, osid 24678
ORA-27063: number of bytes read/written is incorrect

*** 2011-06-21 10:28:48.690
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 4096
WARNING: Write Failed. group:4 disk:1 AU:1 offset:1044480 size:4096
path:/asm/disks/VOTETEST_04
         incarnation:0xf16fd5e5 synchronous result:'I/O error'
         subsys:System iop:0xffffffff7a370260 bufp:0xffffffff7a36ee00 osderr:0x0 osderr1:0x0
WARNING: disk 1.4050638309 (VOTETEST_0001) not responding to heart beat
NOTE: Set to be offline flag for disk VOTETEST_0001 only locally: flag 0x3211
----- Abridged Call Stack Trace -----
ksedsts()+1208<-kfdpGc_doTobeoflnAsync()+40<-kfdpGc_checkTobeofln()+660<.....

----- End of Abridged Call Stack Trace -----
GMON checking disk modes for group 4 at 114 for pid 28, osid 5474
  dsk = 1/0xf16fd5e5, mode = 0x1
InvalLck (group 4) upgraded to X
NOTE: GMON selects PST disk VOTETEST_0003 in failgroup FG3_SAN1
InvalLck (group 4) downgraded to S
PRE
=============== PST ====================
grpNum:    4
state:     1
callCnt:   114
(lockvalue) valid=1 version=1.1 ndisks=3 flags=0x0 from inst=1 (I am 1) version=2
(lockvalue) dsks: 0 2 3
--------------- HDR --------------------
next:    2
last:    2
pst count:       3
pst locations:   0  2  3
dta size:        4
version:         1
ASM version:     186646528 = 11.2.0.0.0
.....
*** 2011-06-21 10:28:58.010
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 4096
WARNING: Write Failed. group:4 disk:2 AU:1 offset:1044480 size:4096
path:/asm/disks/VOTETEST_02
         incarnation:0xf16fd5e3 synchronous result:'I/O error'
         subsys:System iop:0xffffffff7a370260 bufp:0xffffffff7a36ee00 osderr:0x0 osderr1:0x0
WARNING: disk 2.4050638307 (VOTETEST_0002) not responding to heart beat
NOTE: Set to be offline flag for disk VOTETEST_0002 only locally: flag 0x3211
----- Abridged Call Stack Trace -----
ksedsts()+1208<-kfdpGc_doTobeoflnAsync()+40<-kfdpGc_checkTobeofln()+660<-kfdpGc_timeout()+4<-kf......

----- End of Abridged Call Stack Trace -----
GMON checking disk modes for group 4 at 117 for pid 28, osid 5474
  dsk = 2/0xf16fd5e3, mode = 0x1
InvalLck (group 4) upgraded to X
InvalLck (group 4) downgraded to S
PRE
=============== PST ====================
grpNum:    4
state:     1
callCnt:   117
(lockvalue) valid=1 version=1.1 ndisks=2 flags=0x0 from inst=1 (I am 1) version=7
(lockvalue) dsks: 0 3
--------------- HDR --------------------
next:    7
last:    7
pst count:       2
pst locations:   0  3
dta size:        4
version:         1
ASM version:     186646528 = 11.2.0.0.0