How to Perform a Remote Power Cycle on an NVIDIA QM9700 Switch

1. Purpose

To perform a remote reboot of an NVIDIA QM9700 switch using the NVIDIA Web GUI.
If the remote reboot does not resolve the issue, a physical power cycle should be carried out onsite as per OEM recommendations.

2. Scope

This MOP applies to:

  • NVIDIA QM9700 Switch

  • All connected nodes

3. Impact Analysis

  • During the reboot or power cycle, links to all nodes connected to the switch will be unavailable.

  • The InfiniBand fabric can reroute traffic automatically; however, application-level disruption cannot be completely avoided.

  • Running workloads may experience interruptions or slight performance degradation.

4. Pre-Checks

  1. Confirm that no high-priority or long-running jobs are active on the affected nodes.

  2. Notify all stakeholders and users about the maintenance window.

  3. Ensure access to the NVIDIA QM9700 Web GUI.

  4. Verify backup access through Out-Of-Band (OOB) management.

  5. Confirm onsite team availability in case physical intervention becomes necessary.

  6. Ensure a backup of the switch configuration is available.

5. Tools & Requirements

For Remote Reboot

  • NVIDIA QM9700 Web GUI credentials

  • Stable management network connectivity

6. Method of Procedure

Step 1: Remote Reboot via NVIDIA QM9700 Web GUI

Procedure

  1. Log in to the NVIDIA QM9700 Web GUI using the management IP address.

  2. Navigate to: System -> Reboot.

  3. Confirm that the switch system health is good and that no internal processes are in progress.

  4. Click Reboot and confirm the action when prompted.

  5. Wait 5 to 10 minutes for the switch to fully reboot.

  6. Monitor the switch status as it comes back up.

  7. Verify that all fabric ports come back online.
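
The wait-and-monitor steps above can be partly automated from a management workstation. The following is a minimal sketch, not an NVIDIA-provided tool: it only polls whether the Web GUI port accepts TCP connections again after the reboot. The management IP (192.0.2.10), the HTTPS port (443), and the function names are assumptions chosen for illustration; the 600-second limit mirrors the upper end of the 5 to 10 minute wait. A successful connection only shows that the management interface is reachable, so fabric ports and logs still need to be checked in the Web GUI.

```python
import socket
import time


def tcp_port_open(host, port, connect_timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_for_switch(host, port=443, max_wait_s=600, interval_s=15,
                    probe=tcp_port_open, sleep=time.sleep):
    """Poll until the probe succeeds or max_wait_s has elapsed.

    Returns True as soon as the management port answers, False on timeout.
    `probe` and `sleep` are injectable so the loop can be tested offline.
    """
    deadline = time.monotonic() + max_wait_s
    while True:
        if probe(host, port):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Example call (hypothetical management IP, replace with the real one):
#   if wait_for_switch("192.0.2.10"):
#       print("Web GUI port is reachable; continue with port and log checks.")
#   else:
#       print("Switch did not respond within 10 minutes; escalate to onsite team.")
```

If the function returns False, the switch did not answer within the wait window, which is the point at which the onsite physical power cycle described in the Purpose section would be considered.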

Success Criteria

  • Switch is fully operational after reboot.

  • No new errors appear in the switch logs.
