Mapping mlx5_x Names to Actual InfiniBand Ports

Purpose

In large GPU/AI clusters, each compute node often has multiple InfiniBand (IB) Host Channel Adapters (HCAs). On Linux, Mellanox/NVIDIA HCAs that use the mlx5 driver appear as mlx5_x devices (mlx5_0, mlx5_1, and so on).
To troubleshoot connectivity or performance issues, you need to correlate the following for each port:

  • mlx5_x (Linux device name)

  • Port GUID (from ibstat)

  • Linux netdev interface (from ip a)

  • Link speed (from ibstat or mlxlink)

  • Connected switch and port (from ibnetdiscover or subnet manager topology dump)

Step 1 – Get Linux Device Mapping

Check available Mellanox devices:

ls /sys/class/infiniband

Example output:

mlx5_0 mlx5_1 mlx5_2 mlx5_3

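Each entry is one HCA device as seen by the kernel. On most current adapters a device exposes a single port (port 1), but check the ports/ subdirectory of each device rather than assuming this.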
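A minimal sketch for checking how many ports each device exposes, assuming the standard sysfs layout /sys/class/infiniband/<device>/ports/<n>. The function name list_ib_ports and its optional root-directory argument are illustrative and not part of any existing tool:

```shell
# Hypothetical helper: print "<device> <port number>" for every IB port.
# The optional argument overrides the sysfs root so the function can be
# tried against a test directory.
list_ib_ports() (
    root="${1:-/sys/class/infiniband}"
    for dev in "$root"/*; do
        [ -d "$dev/ports" ] || continue
        for port in "$dev"/ports/*; do
            [ -d "$port" ] || continue
            echo "$(basename "$dev") $(basename "$port")"
        done
    done
)
```

Calling list_ib_ports with no argument reads the live sysfs tree and prints nothing if no IB devices are present.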

Step 2 – Get Port GUIDs (ibstat)

Run the following for each HCA:

ibstat mlx5_0

Look for Port GUID:

Port GUID: 0x248a070300abcd01

This uniquely identifies the IB port in the fabric.
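To collect the GUIDs for all devices in one pass, the ibstat output can be filtered with awk. This is a sketch that assumes the usual ibstat layout, where each device starts with an unindented CA 'name' line and each port block contains a Port GUID: line. The function name ibstat_port_guids is illustrative:

```shell
# Hypothetical helper: read ibstat-style text on stdin and print
# "<device> <port GUID>" for every "Port GUID:" line.
ibstat_port_guids() {
    awk '
        # Unindented device header; the name sits between single quotes (octal 047).
        /^CA / { split($0, q, "\047"); dev = q[2] }
        # The GUID is the last field on the "Port GUID:" line.
        /Port GUID:/ { print dev, $NF }
    '
}
```

Typical use would be: ibstat | ibstat_port_guids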

Step 3 – Map to Linux netdev (ip a)

List the ibX (IPoIB) interfaces:

ip -o link show | grep ib

Sample:

5: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 ...
6: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 ...

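Each netdev (e.g., ib0) corresponds to one mlx5_x device.
Cross-check by listing the netdev directory under the device in sysfs (the entries are directories, so use ls rather than cat):

ls /sys/class/infiniband/mlx5_0/device/net/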
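The same cross-check can be repeated for every device with a short loop. This sketch assumes the sysfs layout used above; the function name ib_netdevs and its optional root-directory argument are illustrative:

```shell
# Hypothetical helper: print "<device> <netdev>" for every netdev that
# sysfs lists under an IB device.
ib_netdevs() (
    root="${1:-/sys/class/infiniband}"
    for dev in "$root"/*; do
        for net in "$dev"/device/net/*; do
            [ -e "$net" ] || continue
            echo "$(basename "$dev") $(basename "$net")"
        done
    done
)
```

If a device shares its PCI function with more than one netdev, every netdev of that function is printed for it, so multiple lines per device are possible.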

Step 4 – Check Link Speed (ibstat or mlxlink)

Use either:

ibstat mlx5_0

or

mlxlink -d mlx5_0 -p 1

Look for:

Link layer: InfiniBand
State: Active
Rate: 400 Gb/sec

The expected rate depends on the link generation: 100 Gb/s for EDR, 200 Gb/s for HDR, and 400 Gb/s for NDR (4x links).
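As an alternative to reading each port by hand, the rate is also exposed in sysfs under ports/<n>/rate. The sketch below assumes that file starts with the numeric rate in Gb/s (for example "400 Gb/sec (4X NDR)"). The function name check_ib_rates and the BELOW_EXPECTED marker are illustrative:

```shell
# Hypothetical helper: print "<device> <port> <rate>" for every port and
# mark ports whose rate is below the expected value (in Gb/s).
# Usage: check_ib_rates <expected_gbps> [sysfs_root]
check_ib_rates() (
    expected="$1"
    root="${2:-/sys/class/infiniband}"
    for rate_file in "$root"/*/ports/*/rate; do
        [ -r "$rate_file" ] || continue
        port_dir=$(dirname "$rate_file")
        port=$(basename "$port_dir")
        dev=$(basename "$(dirname "$(dirname "$port_dir")")")
        # The first field of the rate file is the numeric rate.
        awk -v dev="$dev" -v port="$port" -v want="$expected" '
            { flag = ($1 + 0 < want + 0) ? " BELOW_EXPECTED" : ""
              print dev, port, $1 "G" flag }
        ' "$rate_file"
    done
)
```

For example, check_ib_rates 400 would flag any port that negotiated less than 400 Gb/s.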

Step 5 – Find Connected Switch Port

To identify which switch and port the HCA is connected to:

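  1. Run topology discovery on the fabric and search for the port GUID. ibnetdiscover prints port GUIDs as bare hex, so drop the 0x prefix from the value reported by ibstat:

    ibnetdiscover | grep <Port GUID>

    or

    ibdiagnet -v | grep <Port GUID>

    ibdiagnet writes much of its result to report files, so if nothing matches on standard output, search those files instead.
  2. The matching entry identifies both ends of the link. Schematically (the exact layout differs per tool):

    CA: mlx5_0 (GUID 0x248a070300abcd01) port 1 <--> SwitchX-2 (GUID 0x248a070400123456) port 12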

This gives the node HCA → switch port mapping.
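The lookup can be scripted against saved ibnetdiscover output. This sketch assumes the CA-side line format [port](portguid) "S-switchguid"[switchport] with the port GUID printed without 0x and without leading zeros. Verify this against the output of your installed version. The function name find_switch_port is illustrative:

```shell
# Hypothetical helper: read ibnetdiscover-style text on stdin and print
# "<HCA port> <switch node GUID> <switch port>" for the given port GUID.
# Usage: ibnetdiscover | find_switch_port 0x248a070300abcd01
find_switch_port() (
    # Strip the 0x prefix and any leading zeros to match the assumed format.
    guid=$(printf '%s' "$1" | sed 's/^0[xX]//; s/^0*//')
    # Match only CA-side lines: [port](guid) followed by "S-<switch guid>"[port].
    sed -n "s/^\[\([0-9]*\)](${guid})[[:space:]]*\"S-\([0-9a-fA-F]*\)\"\[\([0-9]*\)].*/\1 0x\2 \3/p"
)
```

Lines from the switch side of the same link do not start with the [port](guid) pattern and are skipped, so each link is reported once.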

Example Table

mlx5_x   Port GUID (ibstat)   Linux netdev (ip a)   Speed     Connected Switch:Port
mlx5_0   0x248a070300abcd01   ib0                   400G      SW1:12
mlx5_1   0x248a070300abcd02   ib1                   400G      SW2:14
mlx5_2   0x248a070300abcd03   ib2                   200G ⚠    SW3:18
mlx5_3   0x248a070300abcd04   ib3                   200G ⚠    SW4:20

⚠️ Indicates ports running below expected speed.
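The host-side columns of such a table can be generated from sysfs alone. This is a sketch under several assumptions: the link layer is InfiniBand (so the lower 64 bits of GID index 0 are the port GUID), only port 1 of each device is of interest, and the rate file starts with the numeric rate. The function name ib_mapping_table is illustrative. The switch column still has to come from a fabric query:

```shell
# Hypothetical helper: print a device / port GUID / netdev / rate table
# for port 1 of every IB device found under the sysfs root.
ib_mapping_table() (
    root="${1:-/sys/class/infiniband}"
    printf '%-8s %-20s %-8s %s\n' DEVICE PORT_GUID NETDEV RATE
    for dev in "$root"/*; do
        [ -r "$dev/ports/1/gids/0" ] || continue
        # GID 0 is <64-bit subnet prefix>:<64-bit port GUID>; keep the last four groups.
        guid=$(awk -F: '{ print "0x" $5 $6 $7 $8 }' "$dev/ports/1/gids/0")
        netdev=$(ls "$dev/device/net" 2>/dev/null | head -n 1)
        rate=$(awk '{ print $1 "G" }' "$dev/ports/1/rate" 2>/dev/null)
        printf '%-8s %-20s %-8s %s\n' \
            "$(basename "$dev")" "$guid" "${netdev:--}" "${rate:--}"
    done
)
```

Missing netdev or rate entries are shown as "-" so that incomplete rows stand out.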


Summary

  • mlx5_x = Linux HCA device name.

  • Use ibstat to fetch Port GUID and speed.

  • Use ip a or /sys/class/infiniband to map to Linux netdev (ib0, ib1, etc.).

  • Use ibnetdiscover / ibdiagnet to identify the connected switch and port.

  • Build a mapping table for each node to simplify troubleshooting.