How to Perform a Remote Power Cycle on an NVIDIA QM9700 Switch
1. Purpose
To perform a remote reboot of an NVIDIA QM9700 switch using the NVIDIA Web GUI.
If the remote reboot does not resolve the issue, a physical power cycle should be carried out onsite as per OEM recommendations.
2. Scope
This MOP applies to:
NVIDIA QM9700 Switch
All connected nodes
3. Impact Analysis
During the reboot or power cycle, links to all nodes connected to the switch will be unavailable.
The InfiniBand fabric can reroute traffic automatically; however, application-level disruption cannot be completely avoided.
Running workloads may experience interruptions or slight performance degradation.
4. Pre-Checks
Confirm that no high-priority or long-running jobs are active on the affected nodes
Notify all stakeholders and users about the maintenance window
Ensure access to the NVIDIA QM9700 Web GUI
Verify backup access through Out-Of-Band (OOB) management
Confirm onsite team availability in case physical intervention becomes necessary
Ensure a backup of the switch configuration is available
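The two access pre-checks above can be scripted as a quick reachability test before the maintenance window opens. The following is a minimal sketch, not an NVIDIA tool. It assumes the Web GUI is served over HTTPS on TCP port 443 and that the OOB path is reached over SSH on TCP port 22. The addresses in the usage comment are placeholders, and `is_reachable` and `check_access` are hypothetical helper names.

```python
import socket


def is_reachable(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_access(targets, probe=is_reachable):
    """Probe every target and return {label: True/False}.

    targets maps a label to a (host, port) tuple. The probe argument is
    injectable so the logic can be exercised without touching the network.
    """
    return {label: probe(host, port) for label, (host, port) in targets.items()}


# Example usage (placeholder addresses and assumed ports; replace with site values):
# results = check_access({
#     "web_gui": ("192.0.2.10", 443),  # assumption: Web GUI over HTTPS
#     "oob": ("192.0.2.11", 22),       # assumption: OOB access over SSH
# })
# for label, ok in results.items():
#     print(f"{label}: {'reachable' if ok else 'UNREACHABLE'}")
```

A successful TCP connection only shows that the port answers; it does not confirm that login credentials work, so a manual login test is still advisable.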
For Remote Reboot
6. Method of Procedure
Step 1: Remote Reboot via NVIDIA QM9700 Web GUI
Procedure
Log in to the NVIDIA QM9700 Web GUI using the management IP address.
Navigate to: System -> Reboot.
Confirm that the switch is healthy and that no internal processes are in progress.
Click Reboot and confirm the action when prompted.
Wait 5 to 10 minutes for the switch to reboot fully.
Monitor the switch status.
Verify that all fabric ports come back online.
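The waiting step in the procedure above can be automated by polling the management interface until it answers again. The sketch below is illustrative only. It assumes the Web GUI listens on TCP port 443, uses the upper bound of the 5 to 10 minute window (600 seconds) as the default limit, and introduces the hypothetical helpers `tcp_probe` and `wait_until_up`. It reports only that the management interface is reachable; fabric port status must still be verified separately.

```python
import socket
import time


def tcp_probe(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def wait_until_up(host, port, max_wait_s=600, interval_s=15,
                  probe=tcp_probe, sleep=time.sleep, clock=time.monotonic):
    """Poll host:port until the probe succeeds or max_wait_s has elapsed.

    Returns the elapsed time in seconds on success, or None on timeout.
    probe, sleep, and clock are injectable so the loop can be tested
    without a real switch or real delays.
    """
    start = clock()
    while True:
        if probe(host, port):
            return clock() - start
        if clock() - start >= max_wait_s:
            return None
        sleep(interval_s)


# Example usage (placeholder address; start polling after the switch has gone down):
# elapsed = wait_until_up("192.0.2.10", 443)
# if elapsed is None:
#     print("Switch did not return within 10 minutes; escalate to the onsite team.")
# else:
#     print(f"Management interface reachable after {elapsed:.0f} s.")
```

Start the poll only after the reboot has been confirmed, because the interface may still answer for a short time before the switch actually goes down.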
Success Criteria