You will end up performing storage cell rescue under the following situations:

  • Improper Battery Replacement
  • Improper Card Seating
  • Card Damage During Battery Replacement
  • Corrupted Root File System

In this article we will demonstrate the step-by-step process to rescue an Exadata Storage Cell (storage server).
 
Open a browser and enter the ILOM hostname or IP address of the storage cell you want to rescue:
https://dm01cel02-ilom.netsoftmate.com
 
Enter the root credentials

 
On the left pane under “Remote Control”, click “Redirection”. Select “Use video redirection” and click the “Launch Remote Console” button

 
Click OK
 
 Click OK

 
Click Continue

 
Click Run

 
Click Continue (the dialog flags this as not recommended, but it is required to proceed)

 
From the ILOM video console we can see that the root file system cannot be mounted due to corruption and that the cell will reboot again in 60 seconds

 
On the left pane under “Host Management”, click “Power Control”. From the drop-down list, select “Power Cycle”

 
Click Save

 
Click OK

 
Rebooting in progress

 
The server is now rebooting

 
 
Immediately press Ctrl+S on the keyboard

 
Select the “CELL_USB_BOOT_CELLBOOT_usb_in_rescue_mode” boot option

 
At this point, we have to continue the rescue process using the serial ILOM console

 
As root, ssh to the storage cell ILOM and start the serial console, as sketched below
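For example, a minimal sketch. The ILOM hostname below follows this article’s naming pattern and the compute node prompt is illustrative; substitute your own addresses. start /SP/console is the standard ILOM command to attach to the host serial console:

[root@dm01db01 ~]# ssh root@dm01cel02-ilom
Password: *******

-> start /SP/console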

 
Enter r and hit return

 
Enter y and hit return

 
Enter the rescue password sos1exadata. Enter n and hit return

 
Enter the root user password 

 
We are now in rescue mode. At this point, verify that there are no remaining file system issues (a sketch follows below), fix any other issues you find, and consult Oracle Support if required
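As a minimal sketch, assuming the active system and software partitions are /dev/md5 and /dev/md7 (as reported by imageinfo later in this article), you can check them with fsck; verify the actual device names on your own cell first:

fsck -y /dev/md5
fsck -y /dev/md7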
 
Reboot the server again to complete the rescue process

 
Hit return

 
The server is powered off

 
Power on the server using the web ILOM as shown below, or from the ILOM CLI as sketched next
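If you prefer the command line, the server can also be powered on from the same serial ILOM session with the standard ILOM power-on command:

-> start /SYS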

 
The rescue process is complete and we get the root login prompt

 
 
Log in to the server as the root user and perform the post-rescue steps

  
Verify the image version of the storage cell

 
 
Post Storage Cell Rescue steps:
 
[root@dm01cel02 ~]# imageinfo

Kernel version: 4.1.12-94.8.4.el6uek.x86_64 #2 SMP Sat May 5 16:14:51 PDT 2018 x86_64
Cell version: OSS_18.1.7.0.0AUG_LINUX.X64_180821
Cell rpm version: cell-18.1.7.0.0_LINUX.X64_180821-1.x86_64

Active image version: 18.1.7.0.0.180821
Active image kernel version: 4.1.12-94.8.4.el6uek
Active image activated: 2019-03-17 03:27:41 -0500
Active image status: success
Active system partition on device: /dev/md5
Active software partition on device: /dev/md7

Cell boot usb partition: /dev/sdm1
Cell boot usb version: 18.1.7.0.0.180821

Inactive image version: undefined
Rollback to the inactive partitions: Impossible


CellCLI> import celldisk all force
No cell disks qualified for this import operation

CellCLI> list physicaldisk
         12:0            PST0XV          normal
         12:1            PZNDSV          normal
         12:2            PT5Z4V          normal
         12:3            PU3XLV          normal
         12:4            PYAKLV          normal
         12:5            PV828V          normal
         12:6            PZE5NV          normal
         12:7            PYV0YV          normal
         12:8            PZKUXV          normal
         12:9            PYD86V          normal
         12:10           PZL15V          normal
         12:11           PZPLAV          normal
         FLASH_1_1       S2T7NCAHA00958  normal
         FLASH_2_1       S2T7NCAHA00986  normal
         FLASH_4_1       S2T7NCAHA00956  normal
         FLASH_5_1       S2T7NCAHA00947  normal

CellCLI> list celldisk
         CD_00_dm01cel02        normal
         CD_01_dm01cel02        normal
         CD_02_dm01cel02        normal
         CD_03_dm01cel02        normal
         CD_04_dm01cel02        normal
         CD_05_dm01cel02        normal
         CD_06_dm01cel02        normal
         CD_07_dm01cel02        normal
         CD_08_dm01cel02        normal
         CD_09_dm01cel02        normal
         CD_10_dm01cel02        normal
         CD_11_dm01cel02        normal
         FD_00_dm01cel02        normal
         FD_01_dm01cel02        normal
         FD_02_dm01cel02        normal
         FD_03_dm01cel02        normal

CellCLI> list griddisk
         DATA_DM01_CD_00_dm01cel02     active
         DATA_DM01_CD_01_dm01cel02     active
         DATA_DM01_CD_02_dm01cel02     active
         DATA_DM01_CD_03_dm01cel02     active
         DATA_DM01_CD_04_dm01cel02     active
         DATA_DM01_CD_05_dm01cel02     active
         DATA_DM01_CD_06_dm01cel02     active
         DATA_DM01_CD_07_dm01cel02     active
         DATA_DM01_CD_08_dm01cel02     active
         DATA_DM01_CD_09_dm01cel02     active
         DATA_DM01_CD_10_dm01cel02     active
         DATA_DM01_CD_11_dm01cel02     active
         DBFS_DG_CD_02_dm01cel02       active
         DBFS_DG_CD_03_dm01cel02       active
         DBFS_DG_CD_04_dm01cel02       active
         DBFS_DG_CD_05_dm01cel02       active
         DBFS_DG_CD_06_dm01cel02       active
         DBFS_DG_CD_07_dm01cel02       active
         DBFS_DG_CD_08_dm01cel02       active
         DBFS_DG_CD_09_dm01cel02       active
         DBFS_DG_CD_10_dm01cel02       active
         DBFS_DG_CD_11_dm01cel02       active
         RECO_DM01_CD_00_dm01cel02     active
         RECO_DM01_CD_01_dm01cel02     active
         RECO_DM01_CD_02_dm01cel02     active
         RECO_DM01_CD_03_dm01cel02     active
         RECO_DM01_CD_04_dm01cel02     active
         RECO_DM01_CD_05_dm01cel02     active
         RECO_DM01_CD_06_dm01cel02     active
         RECO_DM01_CD_07_dm01cel02     active
         RECO_DM01_CD_08_dm01cel02     active
         RECO_DM01_CD_09_dm01cel02     active
         RECO_DM01_CD_10_dm01cel02     active
         RECO_DM01_CD_11_dm01cel02     active


[root@dm01cel02 ~]# cellcli -e list flashcache detail
         name:                   dm01cel02_FLASHCACHE
         cellDisk:               FD_03_dm01cel02,FD_01_dm01cel02,FD_02_dm01cel02,FD_00_dm01cel02
         creationTime:           2019-03-17T03:19:43-05:00
         degradedCelldisks:
         effectiveCacheSize:     11.64312744140625T
         id:                     574c3bd1-7a35-42ba-a03b-75f3a93edac7
         size:                   11.64312744140625T
         status:                 normal

[root@dm01cel02 ~]# cellcli -e list flashlog detail
         name:                   dm01cel02_FLASHLOG
         cellDisk:               FD_03_dm01cel02,FD_00_dm01cel02,FD_01_dm01cel02,FD_02_dm01cel02
         creationTime:           2019-03-17T03:19:43-05:00
         degradedCelldisks:
         effectiveSize:          512M
         efficiency:             100.0
         id:                     73cd8288-c6d8-42c3-95a1-97ce287cf7d0
         size:                   512M
         status:                 normal
 
SQL> select a.name,b.path,b.state,b.mode_status,b.failgroup
    from v$asm_diskgroup a, v$asm_disk b
    where a.group_number=b.group_number
    and b.failgroup='dm01cel02'
    order by 2,1;

no rows selected


SQL> alter diskgroup DBFS_DG add disk 'o/192.168.1.1;192.168.1.2/DBFS_DG_*_dm01cel02' force;

Diskgroup altered.

 
SQL> alter diskgroup DATA_DM01 add disk 'o/192.168.1.1;192.168.1.2/DATA_DM01_*_dm01cel02' force;

Diskgroup altered.

 
SQL> alter diskgroup RECO_DM01 add disk 'o/192.168.1.1;192.168.1.2/RECO_DM01_*_dm01cel02' force;

Diskgroup altered.

 
SQL> select * from v$asm_operation;

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- ----------- --------------------------------------------
           1 REBAL RUN           4          4     204367    3521267      13041         254
           3 REBAL WAIT          4


 
SQL> select * from v$asm_operation;

no rows selected


SQL> col path for a70
SQL> set lines 200
SQL> set pages 200
SQL> select a.name,b.path,b.state,b.mode_status,b.failgroup
  2  from v$asm_diskgroup a, v$asm_disk b
  3  where a.group_number=b.group_number
  4  and b.failgroup='dm01cel02'
  5  order by 2,1;

NAME                           PATH                                                                   STATE    MODE_ST FAILGROUP
------------------------------ ---------------------------------------------------------------------- -------- ------- ------------------------------
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_00_dm01cel02              NORMAL   ONLINE  dm01cel02
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_01_dm01cel02              NORMAL   ONLINE  dm01cel02
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_02_dm01cel02              NORMAL   ONLINE  dm01cel02
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_03_dm01cel02              NORMAL   ONLINE  dm01cel02
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_04_dm01cel02              NORMAL   ONLINE  dm01cel02
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_05_dm01cel02              NORMAL   ONLINE  dm01cel02
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_06_dm01cel02              NORMAL   ONLINE  dm01cel02
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_07_dm01cel02              NORMAL   ONLINE  dm01cel02
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_08_dm01cel02              NORMAL   ONLINE  dm01cel02
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_09_dm01cel02              NORMAL   ONLINE  dm01cel02
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_10_dm01cel02              NORMAL   ONLINE  dm01cel02
DATA_DM01                     o/192.168.1.1;192.168.1.2/DATA_DM01_CD_11_dm01cel02              NORMAL   ONLINE  dm01cel02
DBFS_DG                        o/192.168.1.1;192.168.1.2/DBFS_DG_CD_02_dm01cel02                 NORMAL   ONLINE  dm01cel02
DBFS_DG                        o/192.168.1.1;192.168.1.2/DBFS_DG_CD_03_dm01cel02                 NORMAL   ONLINE  dm01cel02
DBFS_DG                        o/192.168.1.1;192.168.1.2/DBFS_DG_CD_04_dm01cel02                 NORMAL   ONLINE  dm01cel02
DBFS_DG                        o/192.168.1.1;192.168.1.2/DBFS_DG_CD_05_dm01cel02                 NORMAL   ONLINE  dm01cel02
DBFS_DG                        o/192.168.1.1;192.168.1.2/DBFS_DG_CD_06_dm01cel02                 NORMAL   ONLINE  dm01cel02
DBFS_DG                        o/192.168.1.1;192.168.1.2/DBFS_DG_CD_07_dm01cel02                 NORMAL   ONLINE  dm01cel02
DBFS_DG                        o/192.168.1.1;192.168.1.2/DBFS_DG_CD_08_dm01cel02                 NORMAL   ONLINE  dm01cel02
DBFS_DG                        o/192.168.1.1;192.168.1.2/DBFS_DG_CD_09_dm01cel02                 NORMAL   ONLINE  dm01cel02
DBFS_DG                        o/192.168.1.1;192.168.1.2/DBFS_DG_CD_10_dm01cel02                 NORMAL   ONLINE  dm01cel02
DBFS_DG                        o/192.168.1.1;192.168.1.2/DBFS_DG_CD_11_dm01cel02                 NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_00_dm01cel02              NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_01_dm01cel02              NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_02_dm01cel02              NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_03_dm01cel02              NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_04_dm01cel02              NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_05_dm01cel02              NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_06_dm01cel02              NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_07_dm01cel02              NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_08_dm01cel02              NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_09_dm01cel02              NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_10_dm01cel02              NORMAL   ONLINE  dm01cel02
RECO_DM01                     o/192.168.1.1;192.168.1.2/RECO_DM01_CD_11_dm01cel02              NORMAL   ONLINE  dm01cel02

34 rows selected.

 
 
Conclusion
 
In this article we have demonstrated the step-by-step procedure to perform a storage cell rescue. You may have to perform a storage cell rescue for multiple reasons, such as a corrupted root file system, a kernel panic, a server rebooting continuously, and so on. With the help of the CELLBOOT USB drive, one can perform the storage cell rescue very easily.
 

We had a failed hard disk on an Exadata Storage cell X6-2, so we scheduled an Oracle Field Engineer to replace the bad disk. The Oracle Field Engineer came onsite and replaced the faulty hard disk. After the replacement we found that the physical disk and LUN were created successfully, but the cell disk and grid disks were not created automatically. Normally, when a hard disk is replaced, the LUN, cell disk, and grid disks are created automatically and the grid disks are added to the ASM disk groups without any manual intervention. In some odd cases the cell disk and grid disks are not created automatically; in those cases you must manually create the cell disk, create the grid disks with the proper sizes, and add them to the ASM disk groups.

In this article we will demonstrate how to create the Cell Disk and the Grid Disks manually and add them to the respective ASM Disk Groups.

Environment

  • Exadata X6-2 Elastic Configuration
  • 4 Compute nodes and 6 Storage cells
  • Hard Disk Size: 8TB
  • 3 ASM Disk Groups: DATA, RECO & DBFS_DG
  • Total Number of Grid Disks: DATA – 72, RECO – 72 & DBFS_DG – 60 (12 hard disks per cell × 6 cells = 72; DBFS_DG grid disks exist only on disks 2–11 of each cell, hence 60)

Here the disk in location 8:5 went bad and was replaced.

Before Replacing Hard Disk:

CellCLI> list physicaldisk
         8:0             PYJZKV                  normal
         8:1             PMU3LV                  normal
         8:2             P1Y2KV                  normal
         8:3             PYH48V                  normal
         8:4             PY7MAV                  normal
         8:5             PPZ47V                  not present
         8:6             PEJKHR                  normal
         8:7             PY4XSV                  normal
         8:8             PYL00V                  normal
         8:9             PV5RGV                  normal
         8:10            PSU26V                  normal
         8:11            PY522V                  normal
         FLASH_1_1       CVMD522500AG1P6NGN      normal
         FLASH_2_1       CVMD522401AC1P6NGN      normal
         FLASH_4_1       CVMD522500AC1P6NGN      normal
         FLASH_5_1       CVMD5230000Y1P6NGN      normal

CellCLI> list lun
         0_0     0_0     normal
         0_1     0_1     normal
         0_2     0_2     normal
         0_3     0_3     normal
         0_4     0_4     normal
         0_5     0_5     not present
         0_6     0_6     normal
         0_7     0_7     normal
         0_8     0_8     normal
         0_9     0_9     normal
         0_10    0_10    normal
         0_11    0_11    normal
         1_1     1_1     normal
         2_1     2_1     normal
         4_1     4_1     normal
         5_1     5_1     normal

After replacing Hard Disk:

CellCLI> list physicaldisk
         8:0             PYJZKV                  normal
         8:1             PMU3LV                  normal
         8:2             P1Y2KV                  normal
         8:3             PYH48V                  normal
         8:4             PY7MAV                  normal
         8:5             PPZ47V                  normal
         8:6             PEJKHR                  normal
         8:7             PY4XSV                  normal
         8:8             PYL00V                  normal
         8:9             PV5RGV                  normal
         8:10            PSU26V                  normal
         8:11            PY522V                  normal
         FLASH_1_1       CVMD522500AG1P6NGN      normal
         FLASH_2_1       CVMD522401AC1P6NGN      normal
         FLASH_4_1       CVMD522500AC1P6NGN      normal
         FLASH_5_1       CVMD5230000Y1P6NGN      normal

CellCLI> list lun
         0_0     0_0     normal
         0_1     0_1     normal
         0_2     0_2     normal
         0_3     0_3     normal
         0_4     0_4     normal
         0_5     0_5     normal
         0_6     0_6     normal
         0_7     0_7     normal
         0_8     0_8     normal
         0_9     0_9     normal
         0_10    0_10    normal
         0_11    0_11    normal
         1_1     1_1     normal
         2_1     2_1     normal
         4_1     4_1     normal
         5_1     5_1     normal

[root@dm01cel03 ~]# cellcli -e list physicaldisk 8:5 detail
         name:                   8:5
         deviceId:               21
         deviceName:             /dev/sdf
         diskType:               HardDisk
         enclosureDeviceId:      8
         errOtherCount:          0
         luns:                   0_5
         makeModel:              “HGST    H7280A520SUN8.0T”
         physicalFirmware:       PD51
         physicalInsertTime:     2018-05-18T10:52:29-05:00
         physicalInterface:      sas
         physicalSerial:         PPZ47V
         physicalSize:           7.1536639072000980377197265625T
         slotNumber:             5
         status:                 normal

[root@dm01cel03 ~]# cellcli -e list celldisk where lun=0_5 detail


[root@dm01cel03 ~]# cellcli -e list griddisk where cellDisk=CD_05_dm01cel03 attributes name,status
DATA_CD_05_dm01cel03 not present
DBFS_DG_CD_05_dm01cel03 not present
RECO_CD_05_dm01cel03 not present

[root@dm01cel03 ~]# cellcli -e list griddisk where celldisk=CD_05_dm01cel03 detail
         name:                   DATA_CD_05_dm01cel03
         availableTo:
         cachingPolicy:          default
         cellDisk:               CD_05_dm01cel03
         comment:                “Cluster dm01-cluster diskgroup DATA”
         creationTime:           2016-03-29T20:25:56-05:00
         diskType:               HardDisk
         errorCount:             0
         id:                     db221d77-25b0-4f9e-af6f-95e1c3134af5
         size:                   5.6953125T
         status:                 not present

         name:                   DBFS_DG_CD_05_dm01cel03
         availableTo:
         cachingPolicy:          default
         cellDisk:               CD_05_dm01cel03
         comment:                “Cluster dm01-cluster diskgroup DBFS_DG”
         creationTime:           2016-03-29T20:25:53-05:00
         diskType:               HardDisk
         errorCount:             0
         id:                     216fbec9-6ed4-4ef6-a0d4-d09517906fd5
         size:                   33.796875G
         status:                 not present

         name:                   RECO_CD_05_dm01cel03
         availableTo:
         cachingPolicy:          none
         cellDisk:               CD_05_dm01cel03
         comment:                “Cluster dm01-cluster diskgroup RECO”
         creationTime:           2016-03-29T20:25:58-05:00
         diskType:               HardDisk
         errorCount:             0
         id:                     e8ca6943-0ddd-48ab-b890-e14bbf4e591c
         size:                   1.42388916015625T
         status:                 not present

We can clearly see that the Grid Disks are not present, so we have to create them manually.

Steps to create Celldisk, Griddisks and add them to ASM Disk Group


  • List Cell Disks

[root@dm01cel03 ~]# cellcli -e list celldisk
         CD_00_dm01cel03         normal
         CD_01_dm01cel03         normal
         CD_02_dm01cel03         normal
         CD_03_dm01cel03         normal
         CD_04_dm01cel03         normal
         CD_05_dm01cel03         not present
         CD_06_dm01cel03         normal
         CD_07_dm01cel03         normal
         CD_08_dm01cel03         normal
         CD_09_dm01cel03         normal
         CD_10_dm01cel03         normal
         CD_11_dm01cel03         normal
         FD_00_dm01cel03         normal
         FD_01_dm01cel03         normal
         FD_02_dm01cel03         normal
         FD_03_dm01cel03         normal

  • List Grid Disks

[root@dm01cel03 ~]# cellcli -e list griddisk
         DATA_CD_00_dm01cel03       active
         DATA_CD_01_dm01cel03       active
         DATA_CD_02_dm01cel03       active
         DATA_CD_03_dm01cel03       active
         DATA_CD_04_dm01cel03       active
         DATA_CD_05_dm01cel03       not present
         DATA_CD_06_dm01cel03       active
         DATA_CD_07_dm01cel03       active
         DATA_CD_08_dm01cel03       active
         DATA_CD_09_dm01cel03       active
         DATA_CD_10_dm01cel03       active
         DATA_CD_11_dm01cel03       active
         DBFS_DG_CD_02_dm01cel03    active
         DBFS_DG_CD_03_dm01cel03    active
         DBFS_DG_CD_04_dm01cel03    active
         DBFS_DG_CD_05_dm01cel03    not present
         DBFS_DG_CD_06_dm01cel03    active
         DBFS_DG_CD_07_dm01cel03    active
         DBFS_DG_CD_08_dm01cel03    active
         DBFS_DG_CD_09_dm01cel03    active
         DBFS_DG_CD_10_dm01cel03    active
         DBFS_DG_CD_11_dm01cel03    active
         RECO_CD_00_dm01cel03       active
         RECO_CD_01_dm01cel03       active
         RECO_CD_02_dm01cel03       active
         RECO_CD_03_dm01cel03       active
         RECO_CD_04_dm01cel03       active
         RECO_CD_05_dm01cel03       not present
         RECO_CD_06_dm01cel03       active
         RECO_CD_07_dm01cel03       active
         RECO_CD_08_dm01cel03       active
         RECO_CD_09_dm01cel03       active
         RECO_CD_10_dm01cel03       active
         RECO_CD_11_dm01cel03       active

  • List Physical Disk details

[root@dm01cel03 ~]# cellcli -e list physicaldisk where physicalSerial=PPZ47V detail
         name:                   8:5
         deviceId:               21
         deviceName:             /dev/sdf
         diskType:               HardDisk
         enclosureDeviceId:      8
         errOtherCount:          0
         luns:                   0_5
         makeModel:              “HGST    H7280A520SUN8.0T”
         physicalFirmware:       PD51
         physicalInsertTime:     2018-05-18T10:52:29-05:00
         physicalInterface:      sas
         physicalSerial:         PPZ47V
         physicalSize:           7.1536639072000980377197265625T
         slotNumber:             5
         status:                 normal

  • Let’s try to create the Cell Disk

[root@dm01cel03 ~]# cellcli -e create celldisk CD_05_dm01cel03 lun=0_5

CELL-02526: Pre-existing cell disk: CD_05_dm01cel03

It says the Cell Disk already exists.
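To see why, check the cell disk’s status; its earlier listing showed it as “not present”:

[root@dm01cel03 ~]# cellcli -e list celldisk CD_05_dm01cel03 attributes name,status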

  • Let’s try to create the Grid Disk. To create the Grid Disk with proper size, get the Grid Disk size from a good Cell Disk as shown below.

[root@dm01cel03 ~]# cellcli -e list griddisk where celldisk=CD_07_dm01cel03 attributes name,size,offset
         DATA_CD_07_dm01cel03       5.6953125T              32M
         DBFS_DG_CD_07_dm01cel03         33.796875G         7.1192474365234375T
         RECO_CD_07_dm01cel03       1.42388916015625T       5.6953582763671875T

  • Now create the Grid Disks. Create them in the same order in which they are laid out on the good cell disk (DATA first, then RECO, then DBFS_DG) so that each grid disk lands at the matching offset.

[root@dm01cel03 ~]# cellcli -e create griddisk DATA_CD_05_dm01cel03 celldisk=CD_05_dm01cel03,size=5.6953125T

CELL-02701: Cannot create grid disk on cell disk CD_05_dm01cel03 because its status is not normal.

It looks like we can’t create the Grid Disk while the Cell Disk is in this state. We will now drop the Cell Disk and recreate it.

  • Drop Cell Disk

CellCLI> drop celldisk CD_05_dm01cel03 force
CellDisk CD_05_dm01cel03 successfully dropped

  • Create Cell Disk

CellCLI> create celldisk CD_05_dm01cel03 lun=0_5
CellDisk CD_05_dm01cel03 successfully created

  • Create Grid Disks with proper sizes

CellCLI> create griddisk DATA_CD_05_dm01cel03 celldisk=CD_05_dm01cel03,size=5.6953125T
GridDisk DATA_CD_05_dm01cel03 successfully created

CellCLI> create griddisk RECO_CD_05_dm01cel03 celldisk=CD_05_dm01cel03,size=1.42388916015625T
GridDisk RECO_CD_05_dm01cel03 successfully created

CellCLI> create griddisk DBFS_DG_CD_05_dm01cel03 celldisk=CD_05_dm01cel03,size=33.796875G
GridDisk DBFS_DG_CD_05_dm01cel03 successfully created

  • List Grid Disks

CellCLI> list griddisk where celldisk=CD_05_dm01cel03 attributes name,size,offset
         DATA_CD_05_dm01cel03       5.6953125T              32M
         DBFS_DG_CD_05_dm01cel03         33.796875G              7.1192474365234375T
         RECO_CD_05_dm01cel03       1.42388916015625T       5.6953582763671875T

CellCLI> list griddisk
         DATA_CD_00_dm01cel03       active
         DATA_CD_01_dm01cel03       active
         DATA_CD_02_dm01cel03       active
         DATA_CD_03_dm01cel03       active
         DATA_CD_04_dm01cel03       active
         DATA_CD_05_dm01cel03       active
         DATA_CD_06_dm01cel03       active
         DATA_CD_07_dm01cel03       active
         DATA_CD_08_dm01cel03       active
         DATA_CD_09_dm01cel03       active
         DATA_CD_10_dm01cel03       active
         DATA_CD_11_dm01cel03       active
         DBFS_DG_CD_02_dm01cel03    active
         DBFS_DG_CD_03_dm01cel03    active
         DBFS_DG_CD_04_dm01cel03    active
         DBFS_DG_CD_05_dm01cel03    active
         DBFS_DG_CD_06_dm01cel03    active
         DBFS_DG_CD_07_dm01cel03    active
         DBFS_DG_CD_08_dm01cel03    active
         DBFS_DG_CD_09_dm01cel03    active
         DBFS_DG_CD_10_dm01cel03    active
         DBFS_DG_CD_11_dm01cel03    active
         RECO_CD_00_dm01cel03       active
         RECO_CD_01_dm01cel03       active
         RECO_CD_02_dm01cel03       active
         RECO_CD_03_dm01cel03       active
         RECO_CD_04_dm01cel03       active
         RECO_CD_05_dm01cel03       active
         RECO_CD_06_dm01cel03       active
         RECO_CD_07_dm01cel03       active
         RECO_CD_08_dm01cel03       active
         RECO_CD_09_dm01cel03       active
         RECO_CD_10_dm01cel03       active
         RECO_CD_11_dm01cel03       active

The Grid Disks now show active. We can go ahead and add them to the ASM Disk Groups manually by connecting to the ASM instance.


  • Log in to the +ASM1 instance and add the new disks. Set the rebalance power higher (11) to perform a faster rebalance operation.

dm01db01-orcldb1 {/home/oracle}:. oraenv
ORACLE_SID = [orcldb1] ? +ASM1
The Oracle base remains unchanged with value /u01/app/oracle
dm01db01-+ASM1 {/home/oracle}:sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Wed May 23 09:30:13 2018
Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 – 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> alter diskgroup DATA add failgroup dm01CEL03 disk 'o/192.168.10.1;192.168.10.2/DATA_CD_05_dm01cel03' name DATA_CD_05_dm01cel03 rebalance power 11;

Diskgroup altered.

SQL> alter diskgroup RECO add failgroup dm01CEL03 disk 'o/192.168.10.1;192.168.10.2/RECO_CD_05_dm01cel03' name RECO_CD_05_dm01cel03 rebalance power 11;

Diskgroup altered.

SQL> alter diskgroup DBFS_DG add failgroup dm01CEL03 disk 'o/192.168.10.1;192.168.10.2/DBFS_DG_CD_05_dm01cel03' name DBFS_DG_CD_05_dm01cel03 rebalance power 11;

Diskgroup altered.

SQL> select a.name,a.total_mb,a.free_mb,a.type,
    decode(a.type,'NORMAL',a.total_mb/2,'HIGH',a.total_mb/3) avail_mb,
    decode(a.type,'NORMAL',a.free_mb/2,'HIGH',a.free_mb/3) usable_mb,
    count(b.path) cell_disks from v$asm_diskgroup a, v$asm_disk b
    where a.group_number=b.group_number group by a.name,a.total_mb,a.free_mb,a.type,
    decode(a.type,'NORMAL',a.total_mb/2,'HIGH',a.total_mb/3),
    decode(a.type,'NORMAL',a.free_mb/2,'HIGH',a.free_mb/3)
   order by 2,1;

                Total MB    Free MB           Total MB    Free MB
Disk Group           Raw        Raw TYPE        Usable     Usable CELL_DISKS
------------ ----------- ---------- ------ ----------- ---------- ----------
DBFS_DG          2076480    2074688 NORMAL     1038240    1037344         60
RECO           107500032   57573496 HIGH      35833344   19191165         72
DATA           429981696  282905064 HIGH     143327232   94301688         72

SQL> select * from v$asm_operation;

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- ----------- --------------------------------------------
           1 REBAL RUN          11         11      85992    6697959      11260         587
           3 REBAL WAIT         11


SQL> select * from gv$asm_operation;

no rows selected


Conclusion

In this article we have learned how to create the cell disk and grid disks manually and add the newly created grid disks to the ASM disk groups. When a hard disk is replaced, the LUN, cell disk, and grid disks are normally created automatically and the grid disks are added to the ASM disk groups without any manual intervention. In the odd case where the cell disk and grid disks are not created automatically, you must create them manually and add them to the ASM disk groups.


While working on Exadata Storage cell patching, the patching failed due to a failed internal USB drive on one of the storage cells.

Oracle uses an internal USB drive to back up the Exadata Storage cell software automatically, so we don’t have to back up the storage cell manually; a quick check is sketched below.
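As a quick sanity check, the CELLBOOT USB device and its software version are reported by imageinfo, as seen in the first article (the “Cell boot usb partition” and “Cell boot usb version” lines):

[root@dm01cel01 ~]# imageinfo | grep -i usb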

In this article I will demonstrate how to replace a failed USB drive in an Exadata Storage cell.

  • You will receive an automated SMTP alert (if configured) similar to the one below.



  • You can also use the following command to check for USB drive failure

[root@dm01cel01 ~]# cellcli -e list alerthistory
         1_1     2018-04-10T18:25:42-05:00       warning         “Internal USB status is not present.  Affected USB Slots : 0”

  • You can also use the following ILOM command to check for USB drive failure

[root@dm01cel01 ~]# ssh dm01cel01-ilom
Password: *******

Oracle(R) Integrated Lights Out Manager

Version 3.2.10.22.a r121524

Copyright (c) 2017, Oracle and/or its affiliates. All rights reserved.

Warning: HTTPS certificate is set to factory default.

Hostname: dm01cel01-ilom

-> show /SYS/MB/USB0


  • Open an SR with Oracle (ASR may have generated one already)
  • Upload the sundiag.sh output and an ILOM snapshot to the SR for investigation
  • Oracle confirms that the USB drive is faulted
  • Oracle opens a Field task
  • Oracle dispatch team contacts the SR owner with the hardware dispatch details
  • Confirm the Hardware replacement schedule over email and/or SR
  • Schedule the Hardware replacement
  • Oracle FE arrives at the data center with the new USB drive
  • Shut down the storage cell by following the steps from the MOS note below; a condensed sketch follows the note reference

Steps to shut down or reboot an Exadata storage cell without affecting ASM (Doc ID 1188080.1)
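A condensed sketch of the sequence from that note (follow the note itself for the authoritative steps): confirm that ASM can tolerate taking the cell’s disks offline, inactivate the grid disks, then power the cell off.

[root@dm01cel01 ~]# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
[root@dm01cel01 ~]# cellcli -e alter griddisk all inactive
[root@dm01cel01 ~]# shutdown -h now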
  • Oracle FE replaces the faulty USB drive and powers up the storage cell
  • Confirm that the USB drive is good

-> show /SYS/MB/USB0

 /SYS/MB/USB0

    Targets:

    Properties:
        type = USB Port
        fault_state = OK
        clear_fault_action = (none)

    Commands:
        cd
        set
        show

->



[root@dm01cel01 ~]# cellcli -e list alerthistory

         1_2     2018-04-11T02:45:49-05:00       clear           “Internal USB status is back to normal.  Affected USB Slots : 0”

  • You will receive an automated SMTP alert (if configured) similar to the one below, confirming that the USB status is back to normal




Conclusion

In this article we have learned how to replace a faulty USB drive in an Exadata Storage cell. Oracle uses the internal USB drive to back up the Exadata Storage cell automatically, so we don’t have to back up the storage cell manually.
