Exadata Diagnostic Utilities

Overview

As we know, the Exadata Database Machine is a combination of hardware and software. Over time, either can fail or cause performance issues. In my experience, I have seen hardware failures during the Exadata install and also immediately after the installation completed. The newer generations of Exadata machines are more stable, so you might see fewer hardware failures.

When you work with Oracle Support on hardware, software or performance issues, they will ask you to run the following diagnostic utilities and upload the diagnostic data.

Examples of hardware, software and performance issues are as follows:
  • Hardware failure: Hard disk, Flash disk, motherboard, processor, DIMM and so on
  • Software issues: Operating system, firmware, Oracle software and so on
  • Performance issues: Operating system and database

In this article I will demonstrate how to execute these utilities with a live example.

Diagnostic Utilities at a Glance

Utility Name     Description
SOSREPORT        Collects detailed information about the hardware and configuration of an Oracle Linux server.
SUNDIAG          Gathers hardware-related information.
ILOM SNAPSHOT    Gathers hardware-related information.
EXAWATCHER       Collects system data and provides reporting utilities. This information is mostly used for troubleshooting OS or performance issues.


Now let’s take a look at these utilities in a little more detail.

SOSREPORT UTILITY

The SOSREPORT utility collects detailed information about the hardware and configuration of an Oracle Linux server.

Steps to run SOSREPORT:
  • Log in as the root user to the compute node or storage cell on which you want to run SOSREPORT (example: dm01db01)

[root@dm01db01 ~]# id
uid=0(root) gid=0(root) groups=0(root), 1(bin), 2(daemon), 3(sys), 4(adm), 6(disk), 10(wheel)
  • You will find the sosreport utility under /usr/sbin. You can also use the Linux command “locate” to search for the utility.

[root@dm01db01 ~]# locate sosreport
/usr/sbin/sosreport
  • Execute the sosreport utility at the shell as follows

[root@dm01db01 ~]# /usr/sbin/sosreport
The utility prompts you for input as it runs:
a. Press ENTER to continue, or CTRL-C to quit.
Press ENTER on your keyboard.

b. Please enter your first initial and last name [dm01db01]:
Press ENTER to accept the default or enter a value of your choice.

c. Please enter the case number that you are generating this report for:
Enter the SR number.

The run then takes a while (approximately 5-6 minutes) and generates a compressed archive file in the /tmp directory.
  • Use WinSCP or a similar utility to copy the output file to your desktop
  • Upload the output file to Oracle Support for review.

Sample SOSREPORT Run:

[root@dm01db01 ~]# locate sosreport
/usr/sbin/sosreport

[root@dm01db01 ~]# /usr/sbin/sosreport

sosreport (version 2.2)

   This command will collect diagnostic and configuration
information from this Oracle Linux system and installed
applications.

  An archive containing the collected information will be
generated in /tmp and may be provided to a Oracle USA
support representative.

  Any information provided to Oracle USA will be treated
in accordance with the published support policies at:

   https://linux.oracle.com/

  The generated archive may contain data considered
sensitive and its content should be reviewed by the
originating organization before being passed to any third
party.

  No changes will be made to system configuration.

Press ENTER to continue, or CTRL-C to quit.

Please enter your first initial and last name [dm01db01]:
Please enter the case number that you are generating this report for [None]:  3-1386095xxxx

  Running plugins. Please wait …

  Completed [66/66] …
Creating compressed archive…

Your sosreport has been generated and saved in:
  /tmp/sosreport-dm01db01.3-1386095-20161230023010-9f83.tar.xz

The md5sum is: b1ccc01a773cbd36d463ba07b57c9f83

Please send this file to your support representative.
[root@dm01db01 ~]#
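
After copying the archive off the server, it can be worth confirming that it still matches the md5sum printed at the end of the run. The following is a minimal sketch, assuming GNU coreutils md5sum is available on the machine where you check the file; the function name verify_md5 is made up for illustration and is not part of sosreport.

```shell
#!/bin/bash
# verify_md5 FILE EXPECTED_MD5
# Hypothetical helper (not part of sosreport): compares the md5sum of FILE
# with the checksum that sosreport printed at the end of its run.
verify_md5() {
    local file="$1" expected="$2" actual
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 2; }
    # md5sum prints "<hash>  <filename>"; keep only the hash
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}
```

With the sample run above, the call would be verify_md5 /tmp/sosreport-dm01db01.3-1386095-20161230023010-9f83.tar.xz b1ccc01a773cbd36d463ba07b57c9f83.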


SUNDIAG UTILITY

The SUNDIAG utility gathers hardware-related information. Oracle Support uses this diagnostic data to assess a hardware failure.

Steps to run SUNDIAG report:

Follow the steps listed below to run the sundiag.sh utility.
  • Log in as the root user to the compute node or storage cell on which you want to run SUNDIAG (example: dm01cel01)
  • You will find the sundiag utility under /opt/oracle.SupportTools. You can also use the Linux command “locate” to search for the utility.

[root@dm01cel01 ~]# locate sundiag
/opt/oracle.SupportTools/sundiag.sh
  • Run the sundiag.sh utility

 [root@dm01cel01 ~]# /opt/oracle.SupportTools/sundiag.sh
  • Use WinSCP or a similar utility to copy the output file to your desktop
  • Upload the output file to Oracle Support for review.

Sample sundiag Run:

[root@dm01cel01 ~]# locate sundiag
/opt/oracle.SupportTools/sundiag.sh

 [root@dm01cel01 ~]# /opt/oracle.SupportTools/sundiag.sh

Oracle Exadata Database Machine – Diagnostics Collection Tool

Last alert date is beyond 7 days. Skipping OSW/Metrics collection
Gathering Linux information

Skipping collection of OSWatcher/ExaWatcher logs, Cell Metrics and Traces
Skipping ILOM collection. Use the ilom or snapshot options, or login to ILOM
over the network and run Snapshot separately if necessary.

/var/log/exadatatmp/sundiag_dm01cel01_1605NM70AD_2016_12_30_02_30
Gathering Cell information

==============================================================================
Done. The report files are bzip2 compressed in /var/log/exadatatmp/sundiag_dm01cel01_1605NM70AD_2016_12_30_02_30.tar.bz2
==============================================================================

If you read the output carefully, you will see that this sundiag run skipped the ILOM and ExaWatcher data. That is why we run separate utilities to gather that data.
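
For the copy step, the archive path can be picked out of the sundiag output instead of being retyped. This is a rough sketch that relies only on the "Done. The report files are ..." line shown in the sample run; the function name sundiag_archive_path is hypothetical, and the message format may differ in other sundiag versions.

```shell
#!/bin/bash
# sundiag_archive_path: reads saved sundiag.sh output on stdin and prints the
# archive path from the final "Done. The report files are ... compressed in <path>" line.
# Hypothetical helper; it assumes the message format seen in the sample run.
sundiag_archive_path() {
    sed -n 's/^Done\. The report files are .* compressed in \(.*\)$/\1/p' | tail -n 1
}
```

For example, if the tool's output was saved to a file of your choosing with tee, feeding that file to sundiag_archive_path on stdin would print the .tar.bz2 path to pass to your copy utility.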

ILOM SNAPSHOT

Oracle Support requires this data to troubleshoot a hardware issue.
Follow the steps listed below to run the snapshot utility from the GUI.

Using GUI Interface:
  • For ILOM 2.x and 3.0
  • Open a web browser (use something other than Internet Explorer) and enter the following address

Note:  You may see complaints about security – ignore or override – click I understand the risks/Add exception/Confirm Security Exception
  • Enter root as User Name and its password and click on Log In. This will take you to the Home screen.

  • Select Maintenance -> Snapshot.  (ILOM 2.x and 3.0)

 The Service Snapshot Utility page appears.
ILOM 2.x and 3.0 will look similar to this:
  • From the above Screen, Select Data Set “Normal”, Select Transfer Method as “Browser” and Click “Run”.

Normal – Specifies that ILOM, operating system, and hardware information is collected.
The download file will be saved according to your browser settings.
Important Note:  Do not enable the option ‘Collect Only Log Files from Data Set’.  Doing so will limit the snapshot to a much smaller subset of log files.
  • In the dialog box, specify the directory to which to save the file and the file name.

Click OK.
The file is saved to the specified directory.
  • For ILOM version 3.1

ILOM 3.1 is the version shipped with X3/X4 Exadata. Its interface is laid out a little differently.
  • Open a web browser (use something other than Internet Explorer) and enter the following address

Note:  You may see complaints about security – ignore or override – click I understand the risks/Add exception/Confirm Security Exception.
  • Enter root as User Name and its password and click on Log In. This will take you to the Home Screen.

  • Select ILOM Administration -> Maintenance -> Snapshot (ILOM 3.1)

The Service Snapshot Utility page appears.
ILOM 3.1 will look similar to this:
  • Select Data Set “Normal”, Select Transfer Method as “Browser” and Click “Run”.

Normal – Specifies that ILOM, operating system, and hardware information is collected.
The download file will be saved according to your browser settings.
Important Note:  Do not enable the option ‘Collect Only Log Files from Data Set’.  Doing so will limit the snapshot to a much smaller subset of log files.
  • In the dialog box, specify the directory to which to save the file and the file name.

Click OK.
The file is saved to the specified directory.

Using CLI Interface:

Follow the steps listed below to run the snapshot utility from the command line.

  • Log in to the ILOM CLI interface

[root@dm01db01 ~]# ssh dm01db01-ilom
Password:

  • You will see a similar output

Oracle(R) Integrated Lights Out Manager
Version 3.0.16.15.j r101695
Copyright (c) 2015, Oracle and/or its affiliates. All rights reserved.
->
  • At the ‘->’ prompt, type the following command:

-> set /SP/diag/snapshot dataset=normal
Set ‘dataset’ to ‘normal’
  • Type the following command:

-> set /SP/diag/snapshot dump_uri=sftp://root:welcome@10.10.10.51/tmp
Set ‘dump_uri’ to ‘sftp://root:welcome@10.10.10.51/tmp’
  • Next cd to the snapshot directory and view the status:

-> cd /SP/diag/snapshot
/SP/diag/snapshot

-> show

 /SP/diag/snapshot
    Targets:

    Properties:
        dataset = normal
        dump_uri = (Cannot show property)
        encrypt_output = false
        result = Running

    Commands:
        cd
        set
        show

->

Wait for the snapshot process to complete. It may take several minutes.
Continue to check until the status shows ‘Snapshot Complete’.
Do not use, access, view, copy or move the snapshot file until it has completed.

-> show

 /SP/diag/snapshot
    Targets:

    Properties:
        dataset = normal
        dump_uri = (Cannot show property)
        encrypt_output = false
        result = Collecting data into sftp://root:*****@10.10.10.51/tmp/dm01db01-ilom_10.10.23.56_2016-12-29T08-44-09.zip
Snapshot Complete.
Done.

    Commands:
        cd
        set
        show
  • You can now exit the CLI interface and find your snapshot in the directory you specified.

-> exit
Connection to dm01db01-ilom closed.
  • The file name will look similar to this example:

 dm01db01-ilom_10.10.10.56_2016-12-29T08-44-09.zip
Do not rename the snapshot file.
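
If you save the text of each "show" listing while waiting, the completion check described above can be reduced to a small classifier. This is only a sketch built on the two messages visible in the transcripts ("result = Running" and "Snapshot Complete"); the function name snapshot_state is invented here, and other ILOM versions may word their status differently.

```shell
#!/bin/bash
# snapshot_state: reads the text of an ILOM "show" listing for /SP/diag/snapshot
# on stdin and prints one of: complete, running, unknown.
# Hypothetical helper; it only recognises the messages seen in the transcripts above.
snapshot_state() {
    local text
    text=$(cat)
    # Check for completion first: the final listing still contains "Collecting data"
    if printf '%s\n' "$text" | grep -q 'Snapshot Complete'; then
        echo complete
    elif printf '%s\n' "$text" | grep -Eq 'result = (Running|Collecting data)'; then
        echo running
    else
        echo unknown
    fi
}
```

Only once the state is complete is it safe to copy or open the snapshot file.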

EXAWATCHER

The /opt/oracle.ExaWatcher directory contains the Oracle ExaWatcher system data gathering and reporting utilities. This information is mostly used for troubleshooting OS or performance issues.

Steps for ExaWatcher collection:
  • Navigate to the ExaWatcher directory and execute the GetExaWatcherResults.sh script.

[root@dm01db01 ~]#  cd /opt/oracle.ExaWatcher/

[root@dm01db01 oracle.ExaWatcher]# ls -ltr GetExaWatcherResults.sh
-rwx------ 1 root root 21012 Oct 21  2015 GetExaWatcherResults.sh

[root@dm01db01 oracle.ExaWatcher]# ./GetExaWatcherResults.sh -h
Usage:
  ./GetExaWatcherResults.sh {--from $FromTime [--to $ToTime] | --at $AtTime [--range $Hours]}
                                               [--archivedir $ArchiveDir]
                                               [--scp $UserName@SrvName]
                                               [--filter $SamplerName]
                                               [--resultdir $ResultDir]

  • To collect from/to a certain date and time:

# ./GetExaWatcherResults.sh --from 07/31/2015_00:00:00 --to 07/31/2015_23:00:00
Time format: mm/dd/yyyy_hh:mm:ss

Default output location: /opt/oracle.ExaWatcher/archive/ExtractedResults

[root@dm01cel07 ExtractedResults]# cd /opt/oracle.ExaWatcher/archive/ExtractedResults
  • To collect for a time range around a point in time. In this case, we are collecting for 4 hours before and after 13:00:

# ./GetExaWatcherResults.sh --at 08/05/2015_13:00:00 --range 4
Time format: mm/dd/yyyy_hh:mm:ss
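
To see which window "4 hours before and after 13:00" describes, the start and end times can be computed and printed in the same timestamp form as the command lines above. This is a minimal sketch, assuming GNU date (as found on Oracle Linux); the function name exaw_window is hypothetical and the script does not call ExaWatcher itself.

```shell
#!/bin/bash
# exaw_window AT HOURS
# Prints the start and end of the window AT-HOURS .. AT+HOURS, one per line,
# in the mm/dd/yyyy_hh:mm:ss form used on the command lines above.
# AT is given as "YYYY-MM-DD HH:MM:SS". Assumes GNU date; times are handled as
# UTC so that no daylight-saving shift enters the arithmetic.
exaw_window() {
    local at="$1" hours="$2" epoch fmt='+%m/%d/%Y_%H:%M:%S'
    epoch=$(date -u -d "$at" +%s) || return 1
    date -u -d "@$((epoch - hours * 3600))" "$fmt"
    date -u -d "@$((epoch + hours * 3600))" "$fmt"
}

exaw_window "2015-08-05 13:00:00" 4
# prints:
# 08/05/2015_09:00:00
# 08/05/2015_17:00:00
```

The two printed values are the ones you would pass as the from and to times to cover the same 09:00 to 17:00 span explicitly.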

The default archive directory is /opt/oracle.ExaWatcher/archive/ExtractedResults; however, you can change this using the [-d|--archivedir] flag.
Example of changing the archive location to /tmp/ExaWatcherArchive:

# ./GetExaWatcherResults.sh --from 01/25/2014_13:00:00 --to 01/25/2014_14:00:00 --archivedir /tmp/ExaWatcherArchive
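
Because every timestamp in these examples has the same shape (two-digit month and day, four-digit year, an underscore, then hours, minutes and seconds), a quick shape check before running the script can catch typing mistakes. The sketch below checks the layout only, not whether the date is a real calendar date; valid_exaw_time is a made-up name and not part of ExaWatcher.

```shell
#!/bin/bash
# valid_exaw_time STAMP
# Succeeds (exit 0) if STAMP has the shape mm/dd/yyyy_hh:mm:ss seen in the
# examples above. Hypothetical helper: it validates layout only, not the calendar.
valid_exaw_time() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9]{2}/[0-9]{2}/[0-9]{4}_[0-9]{2}:[0-9]{2}:[0-9]{2}$'
}

# Example: refuse to continue when a timestamp is malformed
from="01/25/2014_13:00:00"
if valid_exaw_time "$from"; then
    echo "timestamp looks well-formed: $from"
else
    echo "bad timestamp: $from" >&2
fi
```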


Conclusion
In this article we have seen different Exadata diagnostic utilities and how to execute them to collect the diagnostic data. These utilities are used on a daily basis to assess the hardware, software and performance issues.