  • Exadata Diagnostic Utilities

    Overview
    An Exadata Database Machine is a combination of hardware and software, and
    over time either can fail or cause performance problems. In my experience,
    hardware failures have occurred both during the Exadata installation and
    immediately after it completed. The newer generations of Exadata machines
    are more stable, so you may see fewer hardware failures.

    When you work with Oracle Support on hardware, software, or performance
    issues, they will ask you to run the following diagnostic utilities and
    upload the diagnostic data.
    Examples of hardware, software, and performance issues are as follows:
    • Hardware failure: hard disk, flash disk, motherboard, processor, DIMM, and so on
    • Software issues: operating system, firmware, Oracle software, and so on
    • Performance issues: operating system and database

    In this article I will demonstrate how to run these utilities, with a live
    example of each.

    Diagnostic Utilities at a Glance

    Utility Name     Description
    SOSREPORT        Collects detailed information about the hardware and
                     configuration of an Oracle Linux server.
    SUNDIAG          Gathers hardware-related information.
    ILOM SNAPSHOT    Gathers hardware-related information.
    EXAWATCHER       System data gathering and reporting utilities. The data is
                     mostly used for troubleshooting OS or performance issues.



    Now let’s take a look at these utilities in a little more detail.

    SOSREPORT UTILITY

    The SOSREPORT utility collects detailed information about the hardware and
    configuration of an Oracle Linux server.
    Steps to run SOSREPORT:
    • Log in as the root user to the compute node or storage cell on which you
      want to run SOSREPORT (example: dm01db01).

    [root@dm01db01 ~]# id
    uid=0(root) gid=0(root) groups=0(root), 1(bin), 2(daemon), 3(sys), 4(adm), 6(disk), 10(wheel)
    • You will find the sosreport utility under /usr/sbin. You can also use the
      Linux command “locate” to search for the utility.

    [root@dm01db01 ~]# locate sosreport
    /usr/sbin/sosreport
    • Execute the sosreport utility at the shell as follows:

    [root@dm01db01 ~]# /usr/sbin/sosreport
    The utility prompts you for input as it runs:
    a. Press ENTER to continue, or CTRL-C to quit.
       Press ENTER on your keyboard.
    b. Please enter your first initial and last name [dm01db01]:
       Press ENTER to accept the default, or enter a value of your choice.
    c. Please enter the case number that you are generating this report for:
       Enter the SR number.
    The utility then runs for a while (approximately 5-6 minutes) and generates
    a compressed archive file in the /tmp directory.
    • Use WinSCP or a similar utility to copy the output file to your desktop.
    • Upload the output file to Oracle Support for review.

    Sample SOSREPORT Run:
    [root@dm01db01 ~]# locate sosreport
    /usr/sbin/sosreport
    [root@dm01db01 ~]# /usr/sbin/sosreport
    sosreport (version 2.2)
    This command will collect diagnostic and configuration information from
    this Oracle Linux system and installed applications.
    An archive containing the collected information will be generated in
    /tmp and may be provided to a Oracle USA support representative.
    Any information provided to Oracle USA will be treated in accordance
    with the published support policies at:
       https://linux.oracle.com/
    The generated archive may contain data considered sensitive and its
    content should be reviewed by the originating organization before being
    passed to any third party.
    No changes will be made to system configuration.
    Press ENTER to continue, or CTRL-C to quit.
    Please enter your first initial and last name [dm01db01]:
    Please enter the case number that you are generating this report for [None]:  3-1386095xxxx
      Running plugins. Please wait …
      Completed [66/66] …
    Creating compressed archive…
    Your sosreport has been generated and saved in:
      /tmp/sosreport-dm01db01.3-1386095-20161230023010-9f83.tar.xz
    The md5sum is: b1ccc01a773cbd36d463ba07b57c9f83
    Please send this file to your support representative.
    [root@dm01db01 ~]#
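
    After copying the archive off the server, it can be worth confirming that
    the copy still matches the md5sum that sosreport printed. The following is
    a minimal sketch and not part of sosreport itself: the function name
    verify_md5 is hypothetical, and it assumes the standard md5sum command is
    available on the machine where you check the file.

```shell
#!/bin/bash
# verify_md5 FILE EXPECTED_MD5
# Hypothetical helper (not shipped with sosreport): returns 0 when the MD5
# checksum of FILE equals EXPECTED_MD5, and non-zero otherwise.
verify_md5() {
    local file="$1" expected="$2" actual
    # md5sum prints "<checksum>  <file>"; keep only the checksum field.
    actual=$(md5sum "$file" 2>/dev/null | awk '{print $1}')
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

    For the sample run above, the call would look like
    verify_md5 /tmp/sosreport-dm01db01.3-1386095-20161230023010-9f83.tar.xz b1ccc01a773cbd36d463ba07b57c9f83 && echo "checksum OK"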



    SUNDIAG UTILITY
    The SUNDIAG utility gathers hardware-related information. Oracle Support
    uses this diagnostic data to assess a hardware failure.
    Steps to run the SUNDIAG report:
    Follow the steps listed below to run the sundiag.sh utility.
    • Log in as the root user to the compute node or storage cell on which you
      want to run SUNDIAG (example: dm01cel01).
    • You will find the sundiag utility under /opt/oracle.SupportTools. You can
      also use the Linux command “locate” to search for the utility.

    [root@dm01cel01 ~]# locate sundiag
    /opt/oracle.SupportTools/sundiag.sh
    • Run the sundiag.sh utility:

    [root@dm01cel01 ~]# /opt/oracle.SupportTools/sundiag.sh
    • Use WinSCP or a similar utility to copy the output file to your desktop.
    • Upload the output file to Oracle Support for review.

    Sample sundiag Run:
    [root@dm01cel01 ~]# locate sundiag
    /opt/oracle.SupportTools/sundiag.sh
    [root@dm01cel01 ~]# /opt/oracle.SupportTools/sundiag.sh
    Oracle Exadata Database Machine - Diagnostics Collection Tool
    Last alert date is beyond 7 days. Skipping OSW/Metrics collection
    Gathering Linux information
    Skipping collection of OSWatcher/ExaWatcher logs, Cell Metrics and Traces
    Skipping ILOM collection. Use the ilom or snapshot options, or login to ILOM
    over the network and run Snapshot separately if necessary.
    /var/log/exadatatmp/sundiag_dm01cel01_1605NM70AD_2016_12_30_02_30
    Gathering Cell information
    ==============================================================================
    Done.
    The report files are bzip2 compressed in
    /var/log/exadatatmp/sundiag_dm01cel01_1605NM70AD_2016_12_30_02_30.tar.bz2
    ==============================================================================
    If you read the output carefully, you will see that sundiag skipped both the
    ILOM data and the ExaWatcher data in this run. That is why we run separate
    utilities to gather that data.
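
    If sundiag has been run several times, it helps to pick out the newest
    archive before copying it. The sketch below is only an illustration: the
    function name latest_sundiag is hypothetical, and the default directory
    /var/log/exadatatmp is taken from the sample output above, so it may differ
    on other image versions.

```shell
#!/bin/bash
# latest_sundiag [DIR]
# Hypothetical helper: prints the most recently modified sundiag archive in
# DIR. The default DIR is an assumption based on the sample run above.
latest_sundiag() {
    local dir="${1:-/var/log/exadatatmp}"
    # ls -t lists newest first; the error for "no match" is suppressed, so the
    # function prints nothing when no archive exists.
    ls -t "$dir"/sundiag_*.tar.bz2 2>/dev/null | head -n 1
}
```

    Calling latest_sundiag with no argument on the storage cell prints the full
    path of the archive to copy, or nothing if no archive is present.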

    ILOM SNAPSHOT
    Oracle Support requires this data to troubleshoot a hardware issue.
    Follow the steps listed below to run the snapshot utility in the GUI.
    Using GUI Interface:
    • For ILOM 2.x and 3.0
    • Open a web browser (use something other than Internet Explorer) and enter
      the ILOM address of the server.

    Note: You may see complaints about security. Ignore or override them by
    clicking I understand the risks / Add exception / Confirm Security Exception.
    • Enter root as the User Name, enter its password, and click Log In. This
      takes you to the Home screen.

    • Select Maintenance -> Snapshot. (ILOM 2.x and 3.0)

    The Service Snapshot Utility page appears.
    • On that page, select Data Set “Normal”, select Transfer Method “Browser”,
      and click “Run”.

    Normal specifies that ILOM, operating system, and hardware information is
    collected.
    The download file will be saved according to your browser settings.
    Important Note: Do not enable the option ‘Collect Only Log Files from Data
    Set’. Doing so limits the snapshot to a much smaller subset of log files.
    • In the dialog box, specify the directory in which to save the file and
      the file name.

    Click OK.
    The file is saved to the specified directory.
    • For ILOM version 3.1

    ILOM 3.1 is the latest version shipped with X3/X4 Exadata. Its interface is
    laid out slightly differently.
    • Open a web browser (use something other than Internet Explorer) and enter
      the ILOM address of the server.

    Note: You may see complaints about security. Ignore or override them by
    clicking I understand the risks / Add exception / Confirm Security Exception.
    • Enter root as the User Name, enter its password, and click Log In. This
      takes you to the Home screen.

    • Select ILOM Administration -> Maintenance -> Snapshot. (ILOM 3.1)

    The Service Snapshot Utility page appears.
    • On that page, select Data Set “Normal”, select Transfer Method “Browser”,
      and click “Run”.

    Normal specifies that ILOM, operating system, and hardware information is
    collected.
    The download file will be saved according to your browser settings.
    Important Note: Do not enable the option ‘Collect Only Log Files from Data
    Set’. Doing so limits the snapshot to a much smaller subset of log files.
    • In the dialog box, specify the directory in which to save the file and
      the file name.

    Click OK.
    The file is saved to the specified directory.


    • Using CLI

    Follow the steps listed below to run the snapshot utility from the command
    line.

    • Log in to the ILOM CLI interface

    [root@dm01db01 ~]# ssh dm01db01-ilom
    Password:
    • You will see output similar to this:

    Oracle(R) Integrated Lights Out Manager
    Version 3.0.16.15.j r101695
    Copyright (c) 2015, Oracle and/or its affiliates. All rights reserved.
    ->
    • At the ‘->’ prompt, type the following command:

    -> set /SP/diag/snapshot dataset=normal
    Set ‘dataset’ to ‘normal’
    • Type the following command to set the destination for the snapshot file:

    -> set /SP/diag/snapshot dump_uri=sftp://root:welcome@10.10.10.51/tmp
    Set ‘dump_uri’ to ‘sftp://root:welcome@10.10.10.51/tmp’
    • Next, cd to the snapshot directory and view the status:

    -> cd /SP/diag/snapshot
    /SP/diag/snapshot
    -> show
     /SP/diag/snapshot
        Targets:
        Properties:
            dataset = normal
            dump_uri = (Cannot show property)
            encrypt_output = false
            result = Running
        Commands:
            cd
            set
            show
    ->
    Wait for the snapshot process to complete. It may take several minutes.
    Keep checking until the result shows ‘Snapshot Complete’.
    Do not use, access, view, copy, or move the snapshot file until the
    snapshot has completed.
    -> show
     /SP/diag/snapshot
        Targets:
        Properties:
            dataset = normal
            dump_uri = (Cannot show property)
            encrypt_output = false
            result = Collecting data into sftp://root:*****@10.10.10.51/tmp/dm01db01-ilom_10.10.23.56_2016-12-29T08-44-09.zip
    Snapshot Complete.
    Done.
        Commands:
            cd
            set
            show
    • You can now exit the CLI interface and find your snapshot in the
      directory you specified.

    -> exit
    Connection to dm01db01-ilom closed.
    • The file name will look similar to this example:

     dm01db01-ilom_10.10.10.56_2016-12-29T08-44-09.zip
    Do not rename the snapshot file.
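
    The "keep checking until complete" step can be expressed as a small check
    on the text of the show output. This is a sketch under assumptions: the
    function name snapshot_complete is hypothetical, and it only inspects text
    that you have saved or pasted from the ILOM CLI; it does not talk to the
    ILOM itself.

```shell
#!/bin/bash
# snapshot_complete
# Hypothetical helper: reads the text of an ILOM "show" of /SP/diag/snapshot
# on standard input and succeeds only if it contains "Snapshot Complete",
# the marker seen in the sample output above.
snapshot_complete() {
    grep -q 'Snapshot Complete'
}
```

    With the show output saved in a file (show_output.txt is an illustrative
    name), the check would be
    snapshot_complete < show_output.txt && echo "snapshot finished"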

    Exawatcher
    The /opt/oracle.ExaWatcher directory contains the Oracle ExaWatcher system
    data gathering and reporting utilities. This information is mostly used for
    troubleshooting OS or performance issues.
    Steps for ExaWatcher collection:
    • Navigate to the ExaWatcher directory and execute the GetExaWatcherResults.sh script.

    [root@dm01db01 ~]#  cd /opt/oracle.ExaWatcher/

    [root@dm01db01 oracle.ExaWatcher]# ls -ltr GetExaWatcherResults.sh
    -rwx------ 1 root root 21012 Oct 21  2015 GetExaWatcherResults.sh

    [root@dm01db01 oracle.ExaWatcher]# ./GetExaWatcherResults.sh -h
    Usage:
    ./GetExaWatcherResults.sh {--from $FromTime [--to $ToTime] | --at $AtTime [--range $Hours]}
                              [--archivedir $ArchiveDir]
                              [--scp $UserName@SrvName]
                              [--filter $SamplerName]
                              [--resultdir $ResultDir]
    • To collect from/to a certain date and time (the time format is
      mm/dd/yyyy_hh:mi:ss, that is month/day/year_hour:minute:second):

    # ./GetExaWatcherResults.sh --from 07/31/2015_00:00:00 --to 07/31/2015_23:00:00

    Default output location:
    /opt/oracle.ExaWatcher/archive/ExtractedResults
    [root@dm01cel07 ExtractedResults]# cd /opt/oracle.ExaWatcher/archive/ExtractedResults
    • To collect for a time range around a given time. In this case, we are
      collecting for 4 hours before and after 13:00 (same time format,
      mm/dd/yyyy_hh:mi:ss):

    # ./GetExaWatcherResults.sh --at 08/05/2015_13:00:00 --range 4
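
    To see which window an --at/--range pair covers, the same interval can be
    written out as explicit --from and --to values. The sketch below assumes
    that --range counts hours on both sides of the --at time, as described
    above; the function name exa_window is hypothetical, and it relies on GNU
    date (the -d option), which is what Oracle Linux provides.

```shell
#!/bin/bash
# exa_window "YYYY-MM-DD HH:MM:SS" HOURS
# Hypothetical helper: prints --from/--to arguments in the
# mm/dd/yyyy_hh:mi:ss format for HOURS before and after the given time.
exa_window() {
    local centre="$1" hours="$2" epoch fmt='+%m/%d/%Y_%H:%M:%S'
    # Do the arithmetic in UTC so daylight-saving shifts cannot skew it.
    epoch=$(TZ=UTC date -d "$centre" +%s) || return 1
    printf -- '--from %s --to %s\n' \
        "$(TZ=UTC date -d "@$((epoch - hours * 3600))" "$fmt")" \
        "$(TZ=UTC date -d "@$((epoch + hours * 3600))" "$fmt")"
}
```

    For the example above, exa_window "2015-08-05 13:00:00" 4 prints
    --from 08/05/2015_09:00:00 --to 08/05/2015_17:00:00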

    The default archive directory is
    /opt/oracle.ExaWatcher/archive/ExtractedResults; however, you can change
    this using the [-d|--archivedir] flag.
    Example of changing the archive location to /tmp/ExaWatcherArchive:
    # ./GetExaWatcherResults.sh --from 01/25/2014_13:00:00 --to 01/25/2014_14:00:00 --archivedir /tmp/ExaWatcherArchive

    Conclusion
    In this article we have seen the different Exadata diagnostic utilities and
    how to run them to collect diagnostic data. These utilities are used on a
    daily basis to assess hardware, software, and performance issues.