Fixing "Cable Data Invalid EEPROM" Error on NVIDIA QM9700 InfiniBand Switch


Issue

On NVIDIA QM9700 InfiniBand switches, some ports may appear down and report a "Cable Data Invalid EEPROM" error.

This issue is often caused by outdated CPLD firmware and can be resolved by updating the CPLD version on the switch.

Root Cause

The EEPROM error is typically due to compatibility or communication issues between the switch and the optical cables/transceivers. A CPLD firmware update resolves low-level hardware interface bugs related to EEPROM communication.

Solution: Update CPLD Firmware


Requirements

·       Linux server with SSH access to the switch

·       CPLD update tool archive (updateswitchcpld_v3.1.tgz)

·       Switch IP address (management or IPoIB)

·       Admin username and password

Step-by-Step Procedure

1. Transfer and Extract the Update Tool

Copy updateswitchcpld_v3.1.tgz to a Linux host that has network connectivity to the InfiniBand switch, then extract it and change into the extracted directory:

tar xzf updateswitchcpld_v3.1.tgz
cd updateswitchcpld_v3.1

2. Run the CPLD Update Tool

Use the following command to start the CPLD update process:
./updateswitchcpld --managed -t <SWITCH_IP> -u <ADMIN_USERNAME> -p <PASSWORD> --os mlnx-os

Example:

./updateswitchcpld --managed -t 10.152.15.55 -u admin -p <password> --os mlnx-os

·       The switch reboots during the CPLD update, so its ports will be down until it comes back up.

·       Save the switch configuration (configuration write) before running the update command.
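
Typing the password directly on the command line leaves it in the shell history. The wrapper below is a minimal sketch of one way to avoid that, assuming the tool accepts exactly the flags shown above. The names run_cpld_update, SWITCH_PASSWORD and CPLD_TOOL are illustrative and not part of the NVIDIA tool. The password is still passed to the tool as an argument, so it may remain visible in the process list while the update runs.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the CPLD update command from this article.
# run_cpld_update, SWITCH_PASSWORD and CPLD_TOOL are illustrative names,
# not part of the NVIDIA tool.

run_cpld_update() {
    local switch_ip="${1:-}"
    local user="${2:-}"
    # Assumed override hook for the tool path; defaults to the extracted binary.
    local tool="${CPLD_TOOL:-./updateswitchcpld}"

    if [ -z "$switch_ip" ] || [ -z "$user" ]; then
        echo "usage: run_cpld_update <SWITCH_IP> <ADMIN_USERNAME>" >&2
        return 2
    fi

    # Prompt for the password with terminal echo disabled, unless it is already set.
    if [ -z "${SWITCH_PASSWORD:-}" ]; then
        printf 'Password for %s@%s: ' "$user" "$switch_ip" >&2
        stty -echo
        read -r SWITCH_PASSWORD
        stty echo
        echo >&2
    fi

    # Same flags as the update command documented in this article.
    "$tool" --managed -t "$switch_ip" -u "$user" -p "$SWITCH_PASSWORD" --os mlnx-os
}
```

After sourcing the file from inside the extracted directory, the update could be started with: run_cpld_update 10.152.15.55 admin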

3. OS Version Compatibility Note

There is a known compatibility issue with MLNX-OS version 3.11.4002 when running the CPLD update script. If you are using this version:

·       Temporarily switch to a compatible version, such as 3.12.1000 (upgrade) or 3.11.4000 (downgrade)

·       Run the CPLD update

·       Revert to the original version if needed

Switch OS Upgrade/Downgrade Commands

show images
image boot next <image_name>
configuration write
reload
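
The version check in this step can be expressed as a small pre-flight guard on the Linux host. This is a sketch under the assumption that the MLNX-OS version string has already been read from the switch by hand (for example from the output of show version). check_os_version is an illustrative helper, not part of MLNX-OS or the CPLD tool, and it knows only the one problem release named in this article.

```shell
#!/usr/bin/env bash
# Sketch only: check_os_version is an illustrative helper.
# The version below is the one this article names as problematic.

KNOWN_PROBLEM_VERSION="3.11.4002"

check_os_version() {
    local version="${1:-}"

    if [ -z "$version" ]; then
        echo "usage: check_os_version <MLNX_OS_VERSION>" >&2
        return 2
    fi

    # Exact string match against the single known-problem release.
    if [ "$version" = "$KNOWN_PROBLEM_VERSION" ]; then
        echo "BLOCKED: $version has a known issue with the CPLD update script"
        return 1
    fi

    echo "OK: $version is not the known-problem release"
    return 0
}
```

A non-zero return status can be used to stop a script before it reaches the update command, for example: check_os_version 3.11.4002 || exit 1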

4. Confirm CPLD Upgrade

After the switch has rebooted, check the CPLD version:

./updateswitchcpld --managed -t <SWITCH_IP> -u <ADMIN_USERNAME> -p <PASSWORD> --os mlnx-os --verbose --check_cpld

Post-Update

·       Ensure ports previously showing “EEPROM” errors are now active.

·       Validate the cables using ibdiagnet or the switch web UI/CLI.
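
For the ibdiagnet check, a minimal sketch is shown below. It assumes ibdiagnet is installed on a host attached to the fabric and that the installed version supports the --get_cable_info option, which should be confirmed with ibdiagnet --help. validate_cables and IBDIAGNET are illustrative names introduced here.

```shell
#!/usr/bin/env bash
# Sketch only: validate_cables and IBDIAGNET are illustrative names.

validate_cables() {
    # Assumed override hook for the binary name or path.
    local tool="${IBDIAGNET:-ibdiagnet}"

    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found in PATH; run this on a host with the InfiniBand diagnostic tools installed" >&2
        return 127
    fi

    # Ask ibdiagnet to query cable/transceiver information across the fabric.
    # Confirm this option exists in the installed version before relying on it.
    "$tool" --get_cable_info
}
```

Ports that previously reported the EEPROM error should then be reviewed in the ibdiagnet report and compared against the port state shown in the switch CLI or web UI.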