I was changing the passwords for the administrative user accounts on all Exadata components when I ran into a strange issue: the root password on the InfiniBand switches could not be changed from the command line. We tried a couple of different command-line methods and all of them failed. The cause could be a bug, a firmware issue, or something else.

In this article we demonstrate how to change the root password on an Exadata InfiniBand switch using the Browser User Interface (BUI).

Issue 1: Using passwd command

I first tried to change the root password by running the passwd command through dcli. This method assumes SSH equivalence is set up from compute node 1. As shown below, the command failed with a message saying to use the ILOM shell. In the past I have used the same command successfully to change the root password on IB switches.

[root@dm01db01 ~]#  dcli -g ibswitch_group -l root "echo welcome1 | passwd --stdin root"
dm01sw-ibb01: This command should not be used for ILOM users.
dm01sw-ibb01: Please use ILOM shell to handle password for this user.
dm01sw-ibb01: Example:
dm01sw-ibb01: -> set /SP/users/root password
dm01sw-ibb01:
dm01sw-iba01: This command should not be used for ILOM users.
dm01sw-iba01: Please use ILOM shell to handle password for this user.
dm01sw-iba01: Example:
dm01sw-iba01: -> set /SP/users/root password
dm01sw-iba01:


So I decided to log in to the IB switch directly and run the passwd command there instead of through dcli. The passwd command failed again with the same error.

[root@dm01sw-iba01 ~]# ssh dm01sw-ibb01
You are now logged in to the root shell.
It is recommended to use ILOM shell instead of root shell.
All usage should be restricted to documented commands and documented
config files.
To view the list of documented commands, use "help" at linux prompt.

[root@dm01sw-ibb01 ~]# hostname
dm01sw-ibb01

[root@dm01sw-ibb01 ~]# passwd root
This command should not be used for ILOM users.
Please use ILOM shell to handle password for this user.
Example:
   -> set /SP/users/root password




Issue 2: Using ILOM Shell

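Because the passwd command asked for the ILOM shell, I switched to the ilom-admin user on the IB switch and ran the password change command there. The command failed at the ILOM prompt as well, this time with a syntax error. Note that the command as typed below does not name a property, which is what the usage message asks for; the form suggested by the passwd output above is set /SP/users/root password.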

[root@dm01sw-iba01 ~]# su - ilom-admin

Oracle(R) Integrated Lights Out Manager
Version 2.2.7-1 ILOM 3.2.6 r118629
Copyright (c) 2017, Oracle and/or its affiliates. All rights reserved.
Warning: HTTPS certificate is set to factory default.
 

Hostname: dm01sw-iba01

-> set /SP/users/root welcome1
set: Invalid command syntax
Usage: set [-script] [target] <property>=<value> [<property>=<value>…]



 

Solution: Using Browser User Interface

I decided to use the BUI to change the password instead.

Steps:

  • Open a browser and enter the IB switch hostname or IP address
https://dm01sw-ibb01.netsoftmate.com
  • Accept the security warning and proceed to connect to the IB switch
  • Enter the username and password to log in to the IB switch

  • The Summary page is displayed

  • In the left pane, expand ILOM Administration and select User Management

  • Click User Accounts, select the root user, and click the Edit button

  • Enter the new password, confirm it, and click the Save button to change the password

  • To verify the new password, open a PuTTY session and ssh to the IB switch using the new password
[root@dm01db01 ~]# ssh dm01sw-ibb01
Password:
You are now logged in to the root shell.
It is recommended to use ILOM shell instead of root shell.
All usage should be restricted to documented commands and documented
config files.
To view the list of documented commands, use "help" at linux prompt.

[root@dm01sw-ibb01 ~]# hostname
dm01sw-ibb01



Conclusion

In this article we have learned how to change the root password on an InfiniBand switch using the Browser User Interface when the command-line methods do not work.

Introduction

We had a fan failure (FAN2) on an Exadata InfiniBand switch and scheduled the replacement of the faulty hardware with Oracle. The Oracle Field Engineer came to the customer data center and replaced the faulty fan. The replacement was successful, but the fault was not cleared automatically: the fan was still marked as faulted in both the InfiniBand BUI and the CLI.

From Infiniband Browser User Interface



In this article we will demonstrate how to clear the fault on Infiniband Switch after hardware replacement.


  • Log in to the InfiniBand switch as the root user using PuTTY and check the switch health. The output below shows that all the fans are good.
[root@dm01sw-iba01 ~]# env_test
Environment test started:
Starting Environment Daemon test:
Environment daemon running
Environment Daemon test returned OK
Starting Voltage test:
Voltage ECB OK
Measured 3.3V Main = 3.28 V
Measured 3.3V Standby = 3.39 V
Measured 12V = 11.97 V
Measured 5V = 5.02 V
Measured VBAT = 3.14 V
Measured 2.5V = 2.49 V
Measured 1.8V = 1.79 V
Measured I4 1.2V = 1.22 V
Voltage test returned OK
Starting PSU test:
PSU 0 present OK
PSU 1 present OK
PSU test returned OK
Starting Temperature test:
Back temperature 40
Front temperature 41
SP temperature 57
Switch temperature 55, maxtemperature 59
Temperature test returned OK
Starting FAN test:
Fan 0 not present
Fan 1 running at rpm 17004
Fan 2 running at rpm 15696
Fan 3 running at rpm 17004
Fan 4 not present
FAN test returned OK
Starting Connector test:
Connector test returned OK
Starting Onboard ibdevice test:
Switch OK
All Internal ibdevices OK
Onboard ibdevice test returned OK
Starting SSD test:
SSD test returned OK
Starting Auto-link-disable test:
Auto-link-disable test returned OK
Environment test PASSED

  • Check the fan speed. The fans look good.
[root@dm01sw-iba01 ~]# getfanspeed
Fan 0 not present
Fan 1 running at rpm 17004
Fan 2 running at rpm 15478
Fan 3 running at rpm 17004
Fan 4 not present
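
When several switches need checking, reading the rpm values by eye gets tedious. The sketch below shows one way the getfanspeed output could be summarized with awk. It works on a copy of the output shown above, and the MIN_RPM threshold is a made-up value for illustration, not an Oracle-documented limit.

```shell
#!/bin/bash
# Sample getfanspeed output copied from the listing above; on a switch this
# could instead be captured with: fan_output=$(getfanspeed)
fan_output='Fan 0 not present
Fan 1 running at rpm 17004
Fan 2 running at rpm 15478
Fan 3 running at rpm 17004
Fan 4 not present'

# MIN_RPM is a hypothetical threshold for illustration, not an Oracle limit.
MIN_RPM=10000

# Print one line per present fan: "<fan number> <rpm> <OK|LOW>".
fan_report=$(printf '%s\n' "$fan_output" |
    awk -v min="$MIN_RPM" '$3 == "running" {
        status = ($NF + 0 < min) ? "LOW" : "OK"
        print $2, $NF, status
    }')

printf '%s\n' "$fan_report"
```

Fans reported as "not present" are skipped, so only the populated slots appear in the report.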


  • Switch to the ilom-admin user
[root@dm01sw-iba01 ~]# su - ilom-admin

Oracle(R) Integrated Lights Out Manager

Version 2.2.9-3 ILOM 3.2.11 r124039

Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved.

Warning: HTTPS certificate is set to factory default.

Hostname: dm01sw-iba01.netsoftmate.com

->


  • Check the fault table for any faulted components. FAN2 is still reported as Faulted even though the fan was replaced with a new one.
-> show / -a -l 4 -o table fault_state
Target                                  | Property                                     | Value
----------------------------------------+----------------------------------------------+--------------------------------------------------------------------
/SYS                                    | fault_state                                  | OK
/SYS/MB                                 | fault_state                                  | OK
/SYS/PSU0                               | fault_state                                  | OK
/SYS/PSU1                               | fault_state                                  | OK
/SYS/FAN1                               | fault_state                                  | OK
/SYS/FAN2                               | fault_state                                  | Faulted
/SYS/FAN3                               | fault_state                                  | OK

->
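
If the table output is saved to a text file, for example from a terminal log, the faulted targets can be picked out with a short script. This is only a sketch: it assumes the "Target | Property | Value" layout shown above and runs against a pasted sample rather than a live ILOM session.

```shell
#!/bin/bash
# Sample rows in the same "Target | Property | Value" layout as the table
# above; saved ILOM output could be fed in instead of this sample.
fault_table='Target                                  | Property                                     | Value
/SYS                                    | fault_state                                  | OK
/SYS/PSU0                               | fault_state                                  | OK
/SYS/FAN1                               | fault_state                                  | OK
/SYS/FAN2                               | fault_state                                  | Faulted
/SYS/FAN3                               | fault_state                                  | OK'

# List every target whose fault_state is anything other than OK.
faulted=$(printf '%s\n' "$fault_table" |
    awk -F'|' '$2 ~ /fault_state/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $1)   # trim the target column
        gsub(/^[ \t]+|[ \t]+$/, "", $3)   # trim the value column
        if ($3 != "OK") print $1
    }')

printf 'Faulted targets: %s\n' "${faulted:-none}"
```

The header and separator rows are ignored because their second column does not contain fault_state.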


  • You can also execute the below command to identify the fault
-> show -d targets /SP/faultmgmt

 /SP/faultmgmt
    Targets:
        shell
        0 (/SYS/FAN2)


  • Clear the fault as shown below
-> set /SYS/FAN2 clear_fault_action=true
Are you sure you want to clear /SYS/FAN2 (y/n)? y
Set 'clear_fault_action' to 'true'


  • Verify the fault is cleared
-> show / -a -l 4 -o table fault_state
Target                                  | Property                                     | Value
----------------------------------------+----------------------------------------------+--------------------------------------------------------------------
/SYS                                    | fault_state                                  | OK
/SYS/MB                                 | fault_state                                  | OK
/SYS/PSU0                               | fault_state                                  | OK
/SYS/PSU1                               | fault_state                                  | OK
/SYS/FAN1                               | fault_state                                  | OK
/SYS/FAN2                               | fault_state                                  | OK
/SYS/FAN3                               | fault_state                                  | OK

-> show -d targets /SP/faultmgmt

 /SP/faultmgmt
    Targets:
        shell


  • Verify from the InfiniBand BUI


Conclusion

In this article we have learned how to identify a fault and clear it manually on an Exadata InfiniBand switch. The ILOM commands come in handy for clearing the fault. You can also clear the fault using the Browser User Interface (BUI).

Oracle provides the “Exachk” utility to conduct a comprehensive health check on Oracle SuperCluster, validating hardware, firmware, and configuration. Exachk is available for Oracle Engineered Systems such as Exadata (V2 and above), Exalogic, Exalytics, SuperCluster, MiniCluster, ZDLRA, and Big Data.

When Exachk is run from the primary LDOM as user ‘root’, it discovers the components and runs the following checks:
  • Configuration checks for Compute nodes, Storage cells and InfiniBand Switches
  • Grid Infrastructure, Database and ASM and Operating System software checks

When Exachk is run in a Database zone or Virtualized environment it will collect data for:
  • All RAC nodes
  • All database instances
  • Grid Infrastructure

You can also run Exachk on a specific component such as:
  • Database Servers
  • Storage Cells
  • Infiniband Switches
  • Grid Infrastructure, Database & ASM and so on

It is recommended to run Exachk as the root user with SSH equivalence set up across the SuperCluster. However, you can also run Exachk as an ordinary user, without root SSH equivalence.

It is recommended to run the latest Exachk in the following situations:
  • Monthly
  • Before any planned maintenance activity
  • Immediately after completion of planned maintenance activity
  • Immediately after an outage or incident

Exachk Binary and output file location:
  • Default Exachk Location: /opt/oracle.SupportTools/exachk
  • Default Exachk Output Location: /opt/oracle.SupportTools/exachk


Courtesy Oracle


Steps to Deploy and Execute Exachk utility on SuperCluster


  • Download Latest Exachk Utility
You can download the latest Exachk from MOS note 1070954.1

  • Download deploy_exachk.sh script to deploy and install Exachk in all Primary LDOM and in each Zone

  • Back up the existing exachk directory, then copy the downloaded Exachk utility and deploy_exachk.sh into /opt/oracle.SupportTools
# cd /opt/oracle.SupportTools
# mv exachk Exachk-bkp

  • Deploy Exachk as follows
# cd /opt/oracle.SupportTools/
# ./deploy_exachk.sh exachk.zip
# ls -ltr
# cd exachk
# ls -l exachk

As of writing the latest Exachk available is 18.2.0_20180518

  • Verify Exachk Version on LDOM
# cd /opt/oracle.SupportTools/exachk
# ./exachk -v

  • Verify the Exachk version on all zones in an LDOM
# zoneadm list | grep -v global > zone_list
# hostname >> zone_list
# /opt/oracle.supercluster/bin/dcli -g zone_list -l root /opt/oracle.SupportTools/exachk/exachk -v

Note: root RSA keys should be set up for SSH

  • Execute Exachk on Primary LDOM or Global Zone
# cd /opt/oracle.SupportTools/exachk
# ./exachk

  • Execute Exachk in a non-global (local) zone
Log in to the non-global zone using zlogin and execute the following commands

# zlogin <hostname>
# cd /opt/oracle.SupportTools/exachk
# ./exachk

Important Note: In zones there is currently an issue with discovery, and so one must set the RAT_ORACLE_HOME and RAT_GRID_HOME environment variables in some cases.
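
As a rough illustration of the note above, the two variables can be exported before starting Exachk in the zone. The home paths in this sketch are hypothetical placeholders, not values from this environment; replace them with the actual database and Grid Infrastructure homes.

```shell
#!/bin/bash
# Both paths are hypothetical examples; substitute the real Oracle Database
# home and Grid Infrastructure home for the zone.
export RAT_ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1
export RAT_GRID_HOME=/u01/app/12.1.0.2/grid

EXACHK_DIR=/opt/oracle.SupportTools/exachk

# Run exachk only if it is deployed at the location used in this article.
if [ -x "$EXACHK_DIR/exachk" ]; then
    (cd "$EXACHK_DIR" && ./exachk)
else
    echo "exachk not found in $EXACHK_DIR; variables are set for a later run"
fi
```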


Conclusion
In this article we have learned to perform an Oracle SuperCluster stack health check using the Exachk utility. Exachk is available for Oracle Engineered Systems such as Exadata (V2 and above), Exalogic, Exalytics, SuperCluster, MiniCluster, ZDLRA, and Big Data.


Oracle released Exachk 18c on May 18th, 2018. Let’s quickly check whether running Exachk 18c differs from running Exachk 12c.

Download latest Exachk 18c utility from MOS note:
Oracle Exadata Database Machine exachk or HealthCheck (Doc ID 1070954.1)

Changes in Exachk 18.2 can be found at:
https://docs.oracle.com/cd/E96145_01/OEXUG/changes-in-this-release-18-2-0.htm#OEXUG-GUID-88FCFBC6-C647-47D3-898C-F4C712117B8B

Steps to Execute Exachk 18c on Exadata Database Machine


Download the latest Exachk from MOS note. Here I am downloading Exachk 18c.

Download Completed

Using WinSCP copy the exachk.zip file to Exadata Compute node



Copy completed. List the Exachk file on Compute node

Unzip the Exachk zip file

Verify Exachk version

Execute the Exachk health check

Exachk execution completed

Review the Exachk report and take necessary action
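
The steps above can be sketched as a short script. This is an approximation, not a copy of the original session: the staging directory is a hypothetical location, and the script only prints the commands unless DRY_RUN=0 is set. The exachk -v and plain exachk invocations are the same ones used for SuperCluster earlier on this page.

```shell
#!/bin/bash
# Hypothetical staging directory; adjust to wherever exachk.zip was copied.
STAGE_DIR="${STAGE_DIR:-/root/exachk_18c}"

# run: print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

exachk_steps() {
    run cd "$STAGE_DIR"        # go to the directory holding the zip
    run ls -l exachk.zip       # list the copied file
    run unzip -o exachk.zip    # unzip the Exachk zip file
    run ./exachk -v            # verify the Exachk version
    run ./exachk               # execute the health check
}

exachk_steps
```

Running it with DRY_RUN=0 on the compute node executes the same five commands in order.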



Conclusion
In this article we have learned how to execute an Oracle Exadata Database Machine health check using Exachk 18c. Using Exachk 18c is no different from using its previous releases.


Suppose you want to execute operating system or Exadata commands on multiple Exadata compute nodes and storage cells in parallel. To accomplish this you must set up passwordless SSH across the compute nodes and storage cells.

If SSH equivalence is not set up and you execute a dcli command, you will see messages like the following. The host key confirmations and password prompts mean that SSH equivalence is not configured.

[root@dm01db01 ~]# dcli -g dbs_group -l root 'uptime'
The authenticity of host 'dm01db03 (10.10.10.195)' can't be established.
RSA key fingerprint is 40:81:3c:6d:ef:e7:1f:d7:a0:df:eb:f5:ea:92:a5:db.
Are you sure you want to continue connecting (yes/no)? The authenticity of host 'dm01db05 (10.10.10.197)' can't be established.
RSA key fingerprint is 1b:95:47:0b:92:b4:13:9f:55:b7:a3:2a:56:27:9f:1c.
Are you sure you want to continue connecting (yes/no)? The authenticity of host 'dm01db02 (10.10.10.194)' can't be established.
RSA key fingerprint is e1:0d:90:46:16:88:74:01:02:5a:11:90:63:b1:6b:1c.
Are you sure you want to continue connecting (yes/no)? The authenticity of host 'dm01db01 (10.10.10.193)' can't be established.
RSA key fingerprint is 2b:6f:43:4b:86:29:bb:ed:a6:03:c5:34:75:cf:45:34.
Are you sure you want to continue connecting (yes/no)? The authenticity of host 'dm01db04 (10.10.10.196)' can't be established.
RSA key fingerprint is 44:a7:ad:65:c3:1c:fb:0b:0b:28:2c:b6:a5:f3:59:99.
Are you sure you want to continue connecting (yes/no)? The authenticity of host 'dm01db07 (10.10.10.199)' can't be established.
RSA key fingerprint is 25:5f:9a:e6:a4:7a:13:ba:e2:e7:7d:2e:79:53:49:2b.
Are you sure you want to continue connecting (yes/no)? root@dm01db06's password: root@dm01db08's password:
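
A non-interactive way to find out which hosts still prompt is to probe each one with ssh in batch mode. The function below is a minimal sketch: check_ssh_eq is a name made up for this example, and it assumes a group file with one host name per line, like the ones used with dcli.

```shell
#!/bin/bash
# check_ssh_eq GROUP_FILE [USER]   (hypothetical helper, not an Oracle tool)
# Prints "OK <host>" or "FAIL <host>" for each host in the group file and
# returns non-zero if any host cannot be reached without a prompt.
check_ssh_eq() {
    local group_file=$1 user=${2:-root} host rc=0
    while read -r host; do
        [ -z "$host" ] && continue
        # BatchMode=yes makes ssh fail instead of prompting for a password
        # or host key confirmation; -n keeps ssh from reading the host list.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true 2>/dev/null; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            rc=1
        fi
    done < "$group_file"
    return $rc
}
```

For example, check_ssh_eq dbs_group root would print one line per compute node. Any FAIL line indicates a host where equivalence still needs to be configured.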

In this article we will demonstrate how to setup SSH equivalence on Exadata Database Machine.


Steps to Setup SSH Equivalence

1. Create the following files if they do not exist

[root@dm01db08 ~]# cat dbs_group
dm01db01
dm01db02
dm01db03
dm01db04
dm01db05
dm01db06
dm01db07
dm01db08

[root@dm01db08 ~]# cat cell_group
dm01cel01
dm01cel02
dm01cel03
dm01cel04
dm01cel05
dm01cel06
dm01cel07

[root@dm01db08 ~]# cat all_group
dm01db01
dm01db02
dm01db03
dm01db04
dm01db05
dm01db06
dm01db07
dm01db08
dm01cel01
dm01cel02
dm01cel03
dm01cel04
dm01cel05
dm01cel06
dm01cel07
dm01sw-iba01
dm01sw-ibb01
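
Rather than typing the host lists by hand, the three files can be generated. This sketch assumes the same naming pattern and host counts as the listings above (eight compute nodes, seven cells, two switches). GROUP_DIR is a variable introduced here for illustration.

```shell
#!/bin/bash
# Output directory for the group files; the current directory by default.
GROUP_DIR="${GROUP_DIR:-.}"

# Compute nodes dm01db01..dm01db08 and storage cells dm01cel01..dm01cel07,
# matching the host names listed above.
printf 'dm01db%02d\n'  $(seq 1 8) > "$GROUP_DIR/dbs_group"
printf 'dm01cel%02d\n' $(seq 1 7) > "$GROUP_DIR/cell_group"

# all_group = compute nodes + cells + the two InfiniBand switches.
{
    cat "$GROUP_DIR/dbs_group" "$GROUP_DIR/cell_group"
    printf '%s\n' dm01sw-iba01 dm01sw-ibb01
} > "$GROUP_DIR/all_group"

# Show the number of hosts in each file.
wc -l "$GROUP_DIR/dbs_group" "$GROUP_DIR/cell_group" "$GROUP_DIR/all_group"
```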

2. Navigate to Support directory on Compute node 1 as shown below

[root@dm01db01 ~]# cd /opt/oracle.SupportTools/

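3. Oracle provides a script, *setup_ssh_eq.sh*, to configure SSH equivalence across Exadata components. Execute the script as shown below; here we are setting up SSH equivalence for the root user. In this run the keys had already been pushed to the hosts, so dcli reports "ssh key already exists" for each one.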

[root@dm01db01 oracle.SupportTools]# ./setup_ssh_eq.sh ~/all_group root welcome1
/root/.ssh/id_dsa already exists.
Overwrite (y/n)?
/root/.ssh/id_rsa already exists.
Overwrite (y/n)?
spawn dcli -c dm01db01 -l root -k
dm01db01: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01db02 -l root -k
dm01db02: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01db03 -l root -k
dm01db03: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01db04 -l root -k
dm01db04: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01db05 -l root -k
dm01db05: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01db06 -l root -k
dm01db06: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01db07 -l root -k
dm01db07: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01db08 -l root -k
dm01db08: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01cel01 -l root -k
dm01cel01: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01cel02 -l root -k
dm01cel02: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01cel03 -l root -k
dm01cel03: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01cel04 -l root -k
dm01cel04: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01cel05 -l root -k
dm01cel05: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01cel06 -l root -k
dm01cel06: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01cel07 -l root -k
dm01cel07: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01sw-iba01 -l root -k
dm01sw-iba01: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""
spawn dcli -c dm01sw-ibb01 -l root -k
dm01sw-ibb01: ssh key already exists
expect: spawn id exp4 not open
    while executing
"expect "*?assword:*""

4. Verify SSH equivalence is working fine

[root@dm01db08 ~]# dcli -g ~/all_group -l root 'uptime'
dm01db01: 09:16:41 up 21 days, 15:47,  1 user,  load average: 1.80, 3.02, 3.35
dm01db02: 09:16:41 up 21 days, 15:38,  0 users,  load average: 2.93, 2.44, 2.37
dm01db03: 09:16:41 up 21 days, 15:19,  0 users,  load average: 2.16, 2.27, 2.77
dm01db04: 09:16:41 up 21 days, 15:12,  0 users,  load average: 4.07, 4.33, 4.14
dm01db05: 09:16:41 up 21 days, 15:09,  0 users,  load average: 2.45, 2.82, 2.75
dm01db06: 09:16:41 up 21 days, 15:06,  0 users,  load average: 1.70, 2.04, 2.60
dm01db07: 09:16:41 up 21 days, 15:02,  0 users,  load average: 6.39, 4.46, 4.20
dm01db08: 09:16:41 up 21 days, 14:59,  1 user,  load average: 1.66, 1.81, 1.97
dm01cel01: 09:16:41 up 203 days, 19:00,  0 users,  load average: 1.40, 1.97, 2.21
dm01cel02: 09:16:41 up 203 days, 18:59,  0 users,  load average: 1.52, 2.08, 2.38
dm01cel03: 09:16:41 up 203 days, 18:59,  0 users,  load average: 1.00, 1.71, 2.02
dm01cel04: 09:16:41 up 203 days, 18:59,  0 users,  load average: 1.08, 1.59, 1.92
dm01cel05: 09:16:41 up 203 days, 18:59,  0 users,  load average: 1.24, 1.53, 1.82
dm01cel06: 09:16:41 up 203 days, 18:59,  0 users,  load average: 1.09, 1.60, 1.96
dm01cel07: 09:16:41 up 203 days, 19:00,  0 users,  load average: 1.01, 1.37, 1.60
dm01sw-iba01: 09:16:42 up 539 days,  6:21,  0 users,  load average: 0.79, 0.99, 1.07
dm01sw-ibb01: 14:49:54 up 539 days,  9:43,  0 users,  load average: 1.26, 1.44, 1.41

[root@dm01db08 ~]# dcli -g dbs_group -l root 'imageinfo | grep "Image version"'
dm01db01: Image version: 12.1.2.3.6.170713
dm01db02: Image version: 12.1.2.3.6.170713
dm01db03: Image version: 12.1.2.3.6.170713
dm01db04: Image version: 12.1.2.3.6.170713
dm01db05: Image version: 12.1.2.3.6.170713
dm01db06: Image version: 12.1.2.3.6.170713
dm01db07: Image version: 12.1.2.3.6.170713
dm01db08: Image version: 12.1.2.3.6.170713



[root@dm01db08 ~]# dcli -g cell_group -l root 'imageinfo | grep "Active image version"'
dm01cel01: Active image version: 12.1.2.3.6.170713
dm01cel02: Active image version: 12.1.2.3.6.170713
dm01cel03: Active image version: 12.1.2.3.6.170713
dm01cel04: Active image version: 12.1.2.3.6.170713
dm01cel05: Active image version: 12.1.2.3.6.170713
dm01cel06: Active image version: 12.1.2.3.6.170713
dm01cel07: Active image version: 12.1.2.3.6.170713
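
Once dcli works without prompts, its output is easy to post-process. As a small illustration, the sketch below checks whether every node reports the same image version. It runs on a pasted sample in the "host: Image version: <version>" format shown above rather than on live dcli output.

```shell
#!/bin/bash
# Sample dcli output in the "host: Image version: <version>" format shown
# above; in practice this would come from the dcli command itself.
dcli_output='dm01db01: Image version: 12.1.2.3.6.170713
dm01db02: Image version: 12.1.2.3.6.170713
dm01db03: Image version: 12.1.2.3.6.170713'

# The version is the last field on each line; collect the distinct values.
versions=$(printf '%s\n' "$dcli_output" | awk '{print $NF}' | sort -u)
count=$(printf '%s\n' "$versions" | wc -l)

if [ "$count" -eq 1 ]; then
    echo "All nodes report image version $versions"
else
    echo "Image version mismatch:"
    printf '%s\n' "$dcli_output"
fi
```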

[root@dm01db08 ~]# ssh dm01sw-iba01 version
SUN DCS 36p version: 2.1.8-1
Build time: Sep 18 2015 10:26:47
SP board info:
Manufacturing Date: 2015.05.13
Serial Number: "NCDKO0980"
Hardware Revision: 0x0200
Firmware Revision: 0x0000
BIOS version: SUN0R100
BIOS date: 06/22/2010

Conclusion

In this article we have learned how to configure SSH equivalence on Exadata Database Machine. The setup_ssh_eq.sh script makes it very easy to set up SSH equivalence.
