Enabling and Validating RoCE (RDMA over Converged Ethernet) on SONiC

 Purpose

This guide explains how to enable RoCE on a SONiC switch, verify RDMA interfaces on connected compute nodes, and validate performance using RDMA bandwidth tests.

 1. Enable RoCE on SONiC Switch

Step 1.1 — Enable Priority Flow Control (PFC)

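PFC provides the lossless Ethernet behavior that RoCE requires. The roce enable command applies the default RoCEv2 QoS configuration, including the PFC priorities, and reboots the switch the first time it is applied.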

sonic-cli
configure terminal
# Enable RoCE with default settings
sonic(config)# roce enable
  force-defaults  Clear any previous applied QOS buffer and force Initialize RoCEv2 default buffer configuration
  pfc-priority    RoCEv2 buffer configuration based on the PFC priorities
  <cr>

sonic(config)# roce enable
This command will also restart the node after saving all configurations,if ROCE is configured first time or force-default. [Proceed y/N]: y
sonic(config)# Waiting for the reboot operation to complete

Step 1.4 — Verify RoCE-Ready QoS Configuration

Check that the ROCE QoS maps and PFC priorities are applied to the port, and review the PFC watchdog settings:

show qos interface Ethernet 0

Expected output:

sonic# show qos interface Ethernet 0
          scheduler policy: ROCE
          dscp-tc-map: ROCE
          dot1p-tc-map: ROCE
          tc-queue-map: ROCE
          tc-pg-map: ROCE
          pfc-priority-queue-map: ROCE
          pfc-priority-pg-map: ROCE
          pfc-asymmetric: off
          pfc-priority  : 3,4
          PFC Watchdog
            Status            : on
            Action            : drop
            Detection Time    : 200ms
            Restoration Time  : 400ms

Result: SONiC switch ports are now RoCE-ready.
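
The check above can also be scripted when many ports need to be verified. The following is a minimal sketch, not an official tool: it assumes the output of show qos interface has been saved to a text file or is piped in on stdin, and it assumes the expected PFC priorities are 3,4 as in the sample output. The function name check_roce_qos is hypothetical.

```shell
#!/bin/sh
# Sketch: validate captured "show qos interface" output read from stdin.
# Assumption: the expected PFC priorities default to "3,4" (from the sample
# output in this guide); pass a different value as the first argument.
check_roce_qos() {
    expected_prio="${1:-3,4}"
    awk -v want="$expected_prio" '
        {
            # Split each line on its first colon into key and value.
            idx = index($0, ":")
            if (idx == 0) next
            key = substr($0, 1, idx - 1)
            val = substr($0, idx + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            # Every QoS map and the scheduler policy should be ROCE.
            if (key ~ /-map$/ || key == "scheduler policy") {
                seen++
                if (val != "ROCE") { print "FAIL: " key " is " val; bad = 1 }
            }
            if (key == "pfc-priority") {
                prio_seen = 1
                if (val != want) { print "FAIL: pfc-priority is " val; bad = 1 }
            }
        }
        END {
            if (!seen)      { print "FAIL: no QoS map lines found"; bad = 1 }
            if (!prio_seen) { print "FAIL: pfc-priority line not found"; bad = 1 }
            if (!bad) print "OK"
            exit bad + 0
        }'
}
```

Example usage, assuming the output was saved to a file named qos_eth0.txt (a placeholder name): check_roce_qos < qos_eth0.txt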

 2. Verify RDMA on Compute Nodes

Run these commands on each Ubuntu compute node.

Step 2.1 — Check RDMA Interfaces

rdma link show

Expected output:

link rocep65s0f0/1 state ACTIVE physical_state LINK_UP netdev enp65s0f0np0
link rocep65s0f1/1 state ACTIVE physical_state LINK_UP netdev enp65s0f1np1

Then check the port state and rate:

ibstat

Expected output (relevant fields):

Link layer: Ethernet
Rate: 200
State: Active

Result: The RDMA (RoCE) links are active and running in Ethernet mode at 200 Gb/s.
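
To repeat the link check across several nodes, the rdma link show output can be parsed with a small helper. This is a sketch only: it assumes the line format shown in the expected output above, and the function name check_rdma_links is hypothetical.

```shell
#!/bin/sh
# Sketch: read "rdma link show" output on stdin and report any link that is
# not in state ACTIVE with physical_state LINK_UP.
check_rdma_links() {
    awk '
        $1 == "link" {
            total++
            state = ""; phys = ""
            # Keywords are followed by their values on the same line.
            for (i = 2; i < NF; i++) {
                if ($i == "state") state = $(i + 1)
                if ($i == "physical_state") phys = $(i + 1)
            }
            if (state != "ACTIVE" || phys != "LINK_UP") {
                print "DOWN: " $2 " state=" state " physical_state=" phys
                bad = 1
            }
        }
        END {
            if (!total) { print "FAIL: no RDMA links found"; bad = 1 }
            if (!bad) print "OK: " total " link(s) active"
            exit bad + 0
        }'
}
```

Example usage on a compute node: rdma link show | check_rdma_links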

 3. Perform RoCE Bandwidth Test Between Nodes

Test Setup

VLAN      Purpose  Node 1 IP  Node 2 IP  Interface (Node 1 / Node 2)
VLAN 100  HSN-A    10.1.1.11  10.1.1.12  rocep65s0f1 / rocep1s0f0
VLAN 200  HSN-B    10.1.2.11  10.1.2.12  rocep65s0f0 / rocep1s0f1

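Make sure the RDMA utilities are installed on both nodes (on Ubuntu, the infiniband-diags package provides ibstat, used in Step 2.1):

sudo apt install -y rdma-core ibverbs-utils infiniband-diags perftest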

Step 3.1 — Run Bandwidth Test (HSN-A / VLAN 100)

On Node 1 (Server)

ib_write_bw -d rocep65s0f1 -F --report_gbits

On Node 2 (Client)

ib_write_bw -d rocep1s0f0 -F --report_gbits 10.1.1.11

Expected output:

---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
 65536      5000             185.05             185.03             0.352922
---------------------------------------------------------------------------------------

Result: The test confirms an average throughput of about 185 Gb/s on the 200G RoCE link.
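
For repeated runs, the pass/fail decision can be automated by extracting the BW average column from the ib_write_bw result row. The sketch below assumes the column layout shown in the expected output above. The minimum threshold is a value you choose; the 180 Gb/s used in the usage line is an illustrative assumption, not a vendor figure. The function name check_bw is hypothetical.

```shell
#!/bin/sh
# Sketch: read ib_write_bw output on stdin and compare the
# "BW average[Gb/sec]" value (4th column of the result row) with a minimum.
# Usage: check_bw MIN_GBPS
check_bw() {
    min_gbps="$1"
    awk -v min="$min_gbps" '
        # The result row starts with a numeric byte count and iteration count.
        NF >= 5 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $4 ~ /^[0-9.]+$/ {
            avg = $4; found = 1
        }
        END {
            if (!found) { print "FAIL: no result row found"; exit 1 }
            if (avg + 0 >= min + 0) { print "PASS: " avg " Gb/s"; exit 0 }
            print "FAIL: " avg " Gb/s is below " min " Gb/s"; exit 1
        }'
}
```

Example usage on Node 2 (Client): ib_write_bw -d rocep1s0f0 -F --report_gbits 10.1.1.11 | check_bw 180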


 4. Validate Performance Counters


Check RDMA statistics:

rdma statistic show
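
The statistics output can be long, so it may help to print only the counters that are non-zero. The sketch below assumes each output line has the form "link DEV/PORT name value name value ...". Counter names vary by NIC vendor and driver, so the filter pattern is left to the caller. The function name nonzero_counters and the pattern in the usage line are hypothetical.

```shell
#!/bin/sh
# Sketch: read "rdma statistic show" output on stdin and print non-zero
# counters as "DEV/PORT name value".
# Assumption: lines look like "link DEV/PORT name value name value ...".
# Usage: nonzero_counters [REGEX]   (REGEX filters counter names; default: all)
nonzero_counters() {
    pattern="${1:-.}"
    awk -v pat="$pattern" '
        $1 == "link" {
            # Fields 3, 5, 7, ... are names; the following field is the value.
            for (i = 3; i < NF; i += 2)
                if ($i ~ pat && $(i + 1) + 0 != 0)
                    print $2, $i, $(i + 1)
        }'
}
```

Example usage, showing only counters whose names contain "err" or "out_of": rdma statistic show | nonzero_counters 'err|out_of'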

