How to Perform a Remote Power Cycle on an NVIDIA QM9700 Switch
1. Purpose
To perform a remote reboot of an NVIDIA QM9700 switch using the switch's Web GUI.
If the remote reboot does not resolve the issue, a physical power cycle should be carried out onsite as per OEM recommendations.
2. Scope
This MOP applies to:
NVIDIA QM9700 Switch
All connected nodes
3. Impact Analysis
During the reboot or power cycle, the links to all nodes connected to the switch will be unavailable.
The InfiniBand fabric can reroute traffic automatically; however, application-level disruption cannot be completely avoided.
Running workloads may experience interruptions or slight performance degradation.
4. Pre-Checks
Confirm no high-priority or long-running jobs are active on affected nodes
Notify all stakeholders and users about the maintenance window
Ensure access to the NVIDIA QM9700 Web GUI
Verify backup access through Out-Of-Band (OOB) management
Confirm onsite team availability in case physical intervention becomes necessary
Ensure a switch configuration backup is available
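The two access pre-checks above (Web GUI and OOB management) can be scripted as a quick reachability test before the maintenance window. The following is a minimal sketch, not an NVIDIA tool. It assumes the Web GUI is served over HTTPS on TCP port 443 and that OOB access is SSH on TCP port 22. The addresses, the port choices, and the names `tcp_reachable`, `precheck` and `TARGETS` are all hypothetical and must be adapted to the actual environment.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def precheck(targets, probe=tcp_reachable):
    """Probe every named target and return {name: reachable}."""
    return {name: probe(host, port) for name, (host, port) in targets.items()}


# Hypothetical targets using documentation-only addresses (192.0.2.0/24).
# Ports are assumptions: 443 for the Web GUI, 22 for SSH over OOB.
TARGETS = {
    "web_gui": ("192.0.2.10", 443),
    "oob_ssh": ("192.0.2.11", 22),
}

# Example call (commented out so the sketch performs no network I/O on import):
# for name, ok in precheck(TARGETS).items():
#     print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

A successful TCP connection only shows that the management path is open; it does not confirm that login credentials work, so a manual login is still needed.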
For Remote Reboot
6. Method of Procedure
Step 1: Remote Reboot via NVIDIA QM9700 Web GUI
Procedure
Log in to the NVIDIA QM9700 Web GUI using the management IP address.
Navigate to: System -> Reboot.
Confirm that the switch system health is good and that no internal processes are in progress.
Click Reboot and confirm the action when prompted.
Wait 5 to 10 minutes for the switch to reboot fully.
Monitor the switch status as it comes back up.
Verify that all fabric ports come back online.
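The waiting and monitoring steps above can be sketched as a polling loop that watches the management interface go down and come back. This is an illustrative sketch under stated assumptions, not part of the vendor procedure. It assumes the Web GUI listens on TCP port 443, the address is a placeholder, and `tcp_reachable` and `wait_for_reboot` are hypothetical helper names. The default limit of 600 seconds mirrors the 10-minute upper bound given in the procedure.

```python
import socket
import time


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_reboot(probe, max_wait=600.0, interval=15.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll probe() until it has failed at least once and then succeeds again.

    Returns the elapsed seconds when the switch answered again, or None if
    that did not happen within max_wait (escalate to the onsite team).
    """
    start = clock()
    seen_down = False
    while clock() - start <= max_wait:
        if not probe():
            seen_down = True          # the switch has gone down for the reboot
        elif seen_down:
            return clock() - start    # it was down and now answers again
        sleep(interval)
    return None


# Example call with a hypothetical management address (commented out so the
# sketch performs no network I/O on import):
# elapsed = wait_for_reboot(lambda: tcp_reachable("192.0.2.10", 443))
# print("switch is back" if elapsed is not None else "timed out")
```

Requiring one failed probe before accepting a successful one avoids reporting success when the reboot never actually started. The check covers only the management interface; fabric port state still has to be verified separately, as the last step describes.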
Success Criteria