rdma_cm

Establishes communication over RDMA transports.

Syntax

#include <rdma/rdma_cma.h>

Description

Establishes communication over RDMA transports.

Notes:
  • The RDMA CM is a communication manager (CM) used to set up reliable, connected, and unreliable datagram data transfers. It provides an RDMA transport-neutral interface for establishing connections. The API concepts are based on sockets, but adapted for queue pair (QP) based semantics: communication must take place over a specific RDMA device, and data transfers are message-based.
  • The RDMA CM can control both the QP and the communication management (that is, connection setup and teardown) functions of an RDMA API, or only the communication management. It works in conjunction with the verbs API that is defined by the libibverbs library. The libibverbs library provides the underlying interfaces needed to send and receive data.
  • The RDMA CM can operate asynchronously or synchronously. The mode of operation is controlled by the rdma_cm event channel parameter in specific calls. If an event channel is provided, an rdma_cm identifier reports its event data (for example, the result of establishing a connection) on that channel. If a channel is not provided, all rdma_cm operations for the selected rdma_cm identifier block until they complete.

RDMA verbs

The rdma_cm manager supports all verbs that are available through the libibverbs library and interfaces. It also provides wrapper functions for commonly used verbs. The abstracted verb calls are:

rdma_reg_msgs
Registers an array of buffers for sending and receiving.
rdma_reg_read
Registers a buffer for RDMA read operations.
rdma_reg_write
Registers a buffer for RDMA write operations.
rdma_dereg_mr
Deregisters a memory region.
rdma_post_recv
Posts a buffer to receive a message.
rdma_post_send
Posts a buffer to send a message.
rdma_post_read
Posts an RDMA read operation to read data into a buffer.
rdma_post_write
Posts an RDMA write operation to send data from a buffer.
rdma_post_recvv
Posts a vector of buffers to receive a message.
rdma_post_sendv
Posts a vector of buffers to send a message.
rdma_post_readv
Posts a vector of buffers to receive an RDMA read.
rdma_post_writev
Posts a vector of buffers to send an RDMA write.
rdma_post_ud_send
Posts a buffer to send a message on a UD QP.
rdma_get_send_comp
Gets completion status for a send or RDMA operation.
rdma_get_recv_comp
Gets information about a completed receive.

Examples

  1. Client operation
    This section gives an overview of the basic operation for the active, or client, side of communication. The flow is for asynchronous operation and shows the low-level calls. For synchronous operation, the calls to rdma_create_event_channel, rdma_get_cm_event, rdma_ack_cm_event, and rdma_destroy_event_channel are eliminated. Abstracted calls, such as rdma_create_ep, combine several of these calls under a single API. A general connection flow includes the following calls:
    rdma_getaddrinfo
    Retrieves address information of the destination.
    rdma_create_event_channel
    Creates a channel to receive events.
    rdma_create_id
    Allocates an rdma_cm_id identifier. The identifier is conceptually similar to a socket.
    rdma_resolve_addr
    Obtains a local RDMA device to reach the remote address.
    rdma_get_cm_event
    Waits for the RDMA_CM_EVENT_ADDR_RESOLVED event.
    rdma_ack_cm_event
    Acknowledges an event.
    rdma_create_qp
    Allocates a queue pair (QP) for the communication.
    rdma_resolve_route
    Determines the route to the remote address.
    rdma_get_cm_event
    Waits for the RDMA_CM_EVENT_ROUTE_RESOLVED event.
    rdma_ack_cm_event
    Acknowledges an event.
    rdma_connect
    Connects to the remote server.
    rdma_get_cm_event
    Waits for the RDMA_CM_EVENT_ESTABLISHED event.
    rdma_ack_cm_event
    Acknowledges an event.
    Perform data transfers over the connection. Then tear down the connection with the following calls:
    rdma_disconnect
    Tears down the connection.
    rdma_get_cm_event
    Waits for an RDMA_CM_EVENT_DISCONNECTED event.
    rdma_ack_cm_event
    Acknowledges an event.
    rdma_destroy_qp
    Destroys the QP.
    rdma_destroy_id
    Releases the rdma_cm_id identifier.
    rdma_destroy_event_channel
    Releases the event channel.

    A nearly identical process is used to set up unreliable datagram (UD) communication between nodes. No actual connection is formed between the queue pairs, so disconnection is not required. This example shows the client initiating the disconnect, but either side of a connection can initiate it.

  2. Server operation
    This section gives a general overview of the basic operation for the passive, or server, side of communication. A general connection flow includes the following calls:
    rdma_create_event_channel
    Creates a channel to receive events.
    rdma_create_id
    Allocates an rdma_cm_id identifier. The identifier is conceptually similar to a socket.
    rdma_bind_addr
    Sets the local port number to listen on.
    rdma_listen
    Begins to listen for connection requests.
    rdma_get_cm_event
    Waits for the RDMA_CM_EVENT_CONNECT_REQUEST event, which carries a new rdma_cm_id identifier.
    rdma_create_qp
    Allocates a QP for the communication on the new rdma_cm_id identifier.
    rdma_accept
    Accepts the connection request.
    rdma_ack_cm_event
    Acknowledges an event.
    rdma_get_cm_event
    Waits for the RDMA_CM_EVENT_ESTABLISHED event.
    rdma_ack_cm_event
    Acknowledges an event.
    Perform data transfers over the connection. Then tear down the connection with the following calls:
    rdma_get_cm_event
    Waits for an RDMA_CM_EVENT_DISCONNECTED event.
    rdma_ack_cm_event
    Acknowledges an event.
    rdma_disconnect
    Tears down the connection.
    rdma_destroy_qp
    Destroys the QP.
    rdma_destroy_id
    Releases the connected rdma_cm_id identifier.
    rdma_destroy_id
    Releases the listening rdma_cm_id identifier.
    rdma_destroy_event_channel
    Releases the event channel.

Exit Status

= 0
Success
= -1
Error. See errno for more details.

Most librdmacm functions return 0 to indicate success and -1 to indicate failure. If a function operates asynchronously, a return value of 0 means that the operation started successfully. The operation can still complete in error, so you must check the status of the related event. If the return value is -1, errno contains additional information about the failure.

Note: Earlier versions of the library returned -errno and did not set errno in some cases related to the ENOMEM, ENODEV, ENODATA, EINVAL, and EADDRNOTAVAIL codes. Applications that check these codes and must remain compatible with earlier library versions must manually set errno to the negative of the return code if the return code is < -1.
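The compatibility rule in the note can be sketched as a small wrapper. The name normalize_rdma_ret is hypothetical and is not part of librdmacm. The wrapper only applies the rule stated above: a return code below -1 is treated as -errno.

```c
#include <errno.h>

/*
 * Hypothetical helper (not part of librdmacm): converts the return code
 * of a librdmacm call into the current convention, so that callers can
 * always test for -1 and then read errno.
 */
int normalize_rdma_ret(int ret)
{
    if (ret < -1) {
        /* Earlier library behavior: the return value is -errno. */
        errno = -ret;
        return -1;
    }
    /* 0 (success) and -1 (errno already set) pass through unchanged. */
    return ret;
}
```

A call site would wrap the library call, for example normalize_rdma_ret(rdma_resolve_route(id, timeout_ms)), and then handle a result of -1 by inspecting errno. Here id and timeout_ms stand for values that the application already holds.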