InfiniBand and Remote DMA (RDMA) Interfaces
Introduction and Overview
TBD
InfiniBand core interfaces
- struct iwpm_nlmsg_request *iwpm_get_nlmsg_request(__u32 nlmsg_seq, u8 nl_client, gfp_t gfp)
Allocate and initialize netlink message request
Parameters
__u32 nlmsg_seq: Sequence number of the netlink message
u8 nl_client: The index of the netlink client
gfp_t gfp: Indicates how the memory for the request should be allocated
Description
Returns the newly allocated netlink request object if successful, otherwise returns NULL
Parameters
struct kref *kref: Holds reference of netlink message request
- struct iwpm_nlmsg_request *iwpm_find_nlmsg_request(__u32 echo_seq)
Find netlink message request in the request list
Parameters
__u32 echo_seq: Sequence number of the netlink request to find
Description
Returns the found netlink message request, or NULL if it is not found
- int iwpm_wait_complete_req(struct iwpm_nlmsg_request *nlmsg_request)
Block while servicing the netlink request
Parameters
struct iwpm_nlmsg_request *nlmsg_request: Netlink message request to service
Description
Wakes up after the request is completed or expired. Returns 0 if the request is complete without error.
- int iwpm_get_nlmsg_seq(void)
Get the sequence number for a netlink message to send to the port mapper
Parameters
void: no arguments
Description
Returns the sequence number for the netlink message.
- void iwpm_add_remote_info(struct iwpm_remote_info *reminfo)
Add remote address info of the connecting peer to the remote info hash table
Parameters
struct iwpm_remote_info *reminfo: The remote info to be added
- u32 iwpm_check_registration(u8 nl_client, u32 reg)
Check if the client registration matches the given one
Parameters
u8 nl_client: The index of the netlink client
u32 reg: The given registration type to compare with
Description
Call iwpm_register_pid() to register a client. Returns true if the client registration matches reg, otherwise returns false.
- void iwpm_set_registration(u8 nl_client, u32 reg)
Set the client registration
Parameters
u8 nl_client: The index of the netlink client
u32 reg: Registration type to set
- u32 iwpm_get_registration(u8 nl_client)
Get the client registration
Parameters
u8 nl_client: The index of the netlink client
Description
Returns the client registration type
- int iwpm_send_mapinfo(u8 nl_client, int iwpm_pid)
Send local and mapped IPv4/IPv6 address info of a client to the user space port mapper
Parameters
u8 nl_client: The index of the netlink client
int iwpm_pid: The pid of the user space port mapper
Description
If successful, returns the number of sent mapping info records
- int iwpm_mapinfo_available(void)
Check if any mapping info records are available in the hash table
Parameters
void: no arguments
Description
Returns 1 if mapping information is available, otherwise returns 0
- int iwpm_compare_sockaddr(struct sockaddr_storage *a_sockaddr, struct sockaddr_storage *b_sockaddr)
Compare two sockaddr storage structs
Parameters
struct sockaddr_storage *a_sockaddr: first sockaddr to compare
struct sockaddr_storage *b_sockaddr: second sockaddr to compare
Return
0 if they are holding the same ip/tcp address info, otherwise returns 1
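The comparison rule above (0 for the same IP address and TCP port, 1 otherwise) can be modelled in userspace as a minimal sketch. The names compare_sockaddr_sketch() and make_ipv4_sketch() are hypothetical helpers invented for this illustration. This is not the kernel implementation, and treating an unknown address family as a mismatch is an assumption.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical userspace model of the documented return convention:
 * 0 when both structs hold the same family, IP address and TCP port,
 * 1 otherwise (unknown families are treated as a mismatch here).
 */
static int compare_sockaddr_sketch(const struct sockaddr_storage *a,
                                   const struct sockaddr_storage *b)
{
    if (a->ss_family != b->ss_family)
        return 1;

    if (a->ss_family == AF_INET) {
        const struct sockaddr_in *a4 = (const struct sockaddr_in *)a;
        const struct sockaddr_in *b4 = (const struct sockaddr_in *)b;

        return !(a4->sin_addr.s_addr == b4->sin_addr.s_addr &&
                 a4->sin_port == b4->sin_port);
    }

    if (a->ss_family == AF_INET6) {
        const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)a;
        const struct sockaddr_in6 *b6 = (const struct sockaddr_in6 *)b;

        return !(memcmp(&a6->sin6_addr, &b6->sin6_addr,
                        sizeof(a6->sin6_addr)) == 0 &&
                 a6->sin6_port == b6->sin6_port);
    }

    return 1;
}

/* Hypothetical helper: build an IPv4 address/port pair for the example. */
static struct sockaddr_storage make_ipv4_sketch(const char *ip,
                                                unsigned short port)
{
    struct sockaddr_storage ss;
    struct sockaddr_in *sin = (struct sockaddr_in *)&ss;
    int rc;

    memset(&ss, 0, sizeof(ss));
    sin->sin_family = AF_INET;
    sin->sin_port = htons(port);
    rc = inet_pton(AF_INET, ip, &sin->sin_addr);
    assert(rc == 1); /* the example only passes valid dotted-quad strings */
    (void)rc;
    return ss;
}
```

Two values built with make_ipv4_sketch("192.0.2.1", 4791) compare as 0, while changing only the port makes the comparison return 1.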
- int iwpm_validate_nlmsg_attr(struct nlattr *nltb[], int nla_count)
Check for NULL netlink attributes
Parameters
struct nlattr *nltb[]: Holds address of each netlink message attributes
int nla_count: Number of netlink message attributes
Description
Returns error if any of the nla_count attributes is NULL
- struct sk_buff *iwpm_create_nlmsg(u32 nl_op, struct nlmsghdr **nlh, int nl_client)
Allocate skb and form a netlink message
Parameters
u32 nl_op: Netlink message opcode
struct nlmsghdr **nlh: Holds address of the netlink message header in skb
int nl_client: The index of the netlink client
Description
Returns the newly allocated skb, or NULL if the tailroom of the skb is insufficient to store the message header and payload
- int iwpm_parse_nlmsg(struct netlink_callback *cb, int policy_max, const struct nla_policy *nlmsg_policy, struct nlattr *nltb[], const char *msg_type)
Validate and parse the received netlink message
Parameters
struct netlink_callback *cb: Netlink callback structure
int policy_max: Maximum attribute type to be expected
const struct nla_policy *nlmsg_policy: Validation policy
struct nlattr *nltb[]: Array to store policy_max parsed elements
const char *msg_type: Type of netlink message
Description
Returns 0 on success or a negative error code
- void iwpm_print_sockaddr(struct sockaddr_storage *sockaddr, char *msg)
Print IPv4/IPv6 address and TCP port
Parameters
struct sockaddr_storage *sockaddr: Socket address to print
char *msg: Message to print
- int iwpm_send_hello(u8 nl_client, int iwpm_pid, u16 abi_version)
Send hello response to iwpmd
Parameters
u8 nl_client: The index of the netlink client
int iwpm_pid: The pid of the user space port mapper
u16 abi_version: The kernel’s abi_version
Description
Returns 0 on success or a negative error code
- int ib_process_cq_direct(struct ib_cq *cq, int budget)
process a CQ in caller context
Parameters
struct ib_cq *cq: CQ to process
int budget: number of CQEs to poll for
Description
This function is used to process all outstanding CQ entries. It does not offload CQ processing to a different context and does not ask for completion interrupts from the HCA. Using direct processing on CQ with non IB_POLL_DIRECT type may trigger concurrent processing.
Note
Do not pass -1 as budget unless it is guaranteed that the number of completions that will be processed is small.
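To illustrate why an unlimited budget deserves care, here is a minimal userspace model of budgeted polling. The type cq_model and the function process_cq_direct_sketch() are hypothetical and only mimic the budget semantics described above. The real function polls hardware completions and its internal batching is not modelled.

```c
#include <assert.h>

/* Hypothetical stand-in for a CQ: only counts pending completions. */
struct cq_model {
    int pending;
};

/*
 * Handle up to 'budget' pending completions in the caller's context.
 * A budget of -1 is modelled as "no limit", so the loop runs until the
 * queue is empty, which is why the note asks for a small completion count.
 * Returns the number of completions handled.
 */
static int process_cq_direct_sketch(struct cq_model *cq, int budget)
{
    int completed = 0;

    assert(budget >= -1);
    while (cq->pending > 0 && (budget == -1 || completed < budget)) {
        cq->pending--; /* stand-in for invoking one completion handler */
        completed++;
    }
    return completed;
}
```

With ten pending completions, a budget of 4 handles four entries and leaves six, and a following call with -1 drains the remaining six.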
- struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private, int nr_cqe, int comp_vector, enum ib_poll_context poll_ctx, const char *caller)
allocate a completion queue
Parameters
struct ib_device *dev: device to allocate the CQ for
void *private: driver private data, accessible from cq->cq_context
int nr_cqe: number of CQEs to allocate
int comp_vector: HCA completion vectors for this CQ
enum ib_poll_context poll_ctx: context to poll the CQ from.
const char *caller: module owner name.
Description
This is the proper interface to allocate a CQ for in-kernel users. A CQ allocated with this interface will automatically be polled from the specified context. The ULP must use wr->wr_cqe instead of wr->wr_id to use this CQ abstraction.
- struct ib_cq *__ib_alloc_cq_any(struct ib_device *dev, void *private, int nr_cqe, enum ib_poll_context poll_ctx, const char *caller)
allocate a completion queue
Parameters
struct ib_device *dev: device to allocate the CQ for
void *private: driver private data, accessible from cq->cq_context
int nr_cqe: number of CQEs to allocate
enum ib_poll_context poll_ctx: context to poll the CQ from
const char *caller: module owner name
Description
Attempt to spread ULP Completion Queues over each device’s interrupt vectors. A simple best-effort mechanism is used.
- void ib_free_cq(struct ib_cq *cq)
free a completion queue
Parameters
struct ib_cq *cq: completion queue to free.
- struct ib_cq *ib_cq_pool_get(struct ib_device *dev, unsigned int nr_cqe, int comp_vector_hint, enum ib_poll_context poll_ctx)
Find the least used completion queue that matches a given cpu hint (or least used for wild card affinity) and fits nr_cqe.
Parameters
struct ib_device *dev: rdma device
unsigned int nr_cqe: number of needed cqe entries
int comp_vector_hint: completion vector hint (-1) for the driver to assign a comp vector based on internal counter
enum ib_poll_context poll_ctx: cq polling context
Description
Finds a cq that satisfies the comp_vector_hint and nr_cqe requirements and claims entries in it for us. In case there is no available cq, allocates a new cq with the requirements and adds it to the device pool. IB_POLL_DIRECT cannot be used for shared cqs so it is not a valid value for poll_ctx.
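The selection rule described above can be sketched as a small userspace model. The struct cq_sketch and the function cq_pool_pick_sketch() are hypothetical. The model treats a negative hint as a wildcard and picks the least used fitting CQ, which follows the summary text but may not match the kernel's actual search. Allocation of a new CQ when nothing fits is only signalled with -1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a pooled CQ: vector, capacity and claimed entries. */
struct cq_sketch {
    int comp_vector;
    unsigned int cqe;      /* total entries the CQ can hold */
    unsigned int cqe_used; /* entries already claimed by users */
};

/*
 * Pick the least used CQ that matches vector_hint (negative = wildcard in
 * this model) and still has room for nr_cqe entries, then claim them.
 * Returns the index of the chosen CQ, or -1 when none fits; at that point
 * the text above says a new CQ would be allocated and added to the pool.
 */
static int cq_pool_pick_sketch(struct cq_sketch *pool, size_t n,
                               unsigned int nr_cqe, int vector_hint)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        const struct cq_sketch *cq = &pool[i];

        assert(cq->cqe_used <= cq->cqe);
        if (vector_hint >= 0 && cq->comp_vector != vector_hint)
            continue; /* wrong completion vector */
        if (nr_cqe > cq->cqe - cq->cqe_used)
            continue; /* not enough free entries */
        if (best < 0 || cq->cqe_used < pool[best].cqe_used)
            best = (int)i;
    }

    if (best >= 0)
        pool[best].cqe_used += nr_cqe; /* claim the entries */
    return best;
}
```

The matching counterpart in this model would subtract nr_cqe from cqe_used again, which corresponds to returning the CQ with ib_cq_pool_put().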
- void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe)
Return a CQ taken from a shared pool.
Parameters
struct ib_cq *cq: The CQ to return.
unsigned int nr_cqe: The max number of cqes that the user had requested.
- int ib_cm_listen(struct ib_cm_id *cm_id, __be64 service_id)
Initiates listening on the specified service ID for connection and service ID resolution requests.
Parameters
struct ib_cm_id *cm_id: Connection identifier associated with the listen request.
__be64 service_id: Service identifier matched against incoming connection and service ID resolution requests. The service ID should be specified in network-byte order. If set to IB_CM_ASSIGN_SERVICE_ID, the CM will assign a service ID to the caller.
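The network-byte-order requirement for service_id can be illustrated in userspace with htobe64() from glibc's endian.h. The helper names below are hypothetical. Kernel code would use cpu_to_be64() for the same conversion, so this is only a sketch of the byte layout, not of the CM API.

```c
#define _DEFAULT_SOURCE /* exposes htobe64()/be64toh() in glibc's endian.h */
#include <assert.h>
#include <endian.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: convert a host-order service ID into the big-endian
 * (network) representation that a __be64 parameter expects.
 */
static uint64_t service_id_to_network_sketch(uint64_t host_id)
{
    return htobe64(host_id);
}

/* Copy the in-memory bytes of the converted ID so the layout is visible. */
static void service_id_bytes_sketch(uint64_t host_id, unsigned char out[8])
{
    uint64_t be = service_id_to_network_sketch(host_id);

    assert(out != NULL);
    memcpy(out, &be, sizeof(be));
}
```

After the conversion the most significant byte of the ID comes first in memory on any host, which is what matching against IDs received from the wire relies on.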
- struct ib_cm_id *ib_cm_insert_listen(struct ib_device *device, ib_cm_handler cm_handler, __be64 service_id)
Create a new listening ib_cm_id and listen on the given service ID.
Parameters
struct ib_device *device: Device associated with the cm_id. All related communication will be associated with the specified device.
ib_cm_handler cm_handler: Callback invoked to notify the user of CM events.
__be64 service_id: Service identifier matched against incoming connection and service ID resolution requests. The service ID should be specified in network-byte order. If set to IB_CM_ASSIGN_SERVICE_ID, the CM will assign a service ID to the caller.
Description
If there’s an existing ID listening on that same device and service ID, return it.
Callers should call ib_destroy_cm_id when done with the listener ID.
- int rdma_rw_ctx_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u32 port_num, struct scatterlist *sg, u32 sg_cnt, u32 sg_offset, u64 remote_addr, u32 rkey, enum dma_data_direction dir)
initialize a RDMA READ/WRITE context
Parameters
struct rdma_rw_ctx *ctx: context to initialize
struct ib_qp *qp: queue pair to operate on
u32 port_num: port num to which the connection is bound
struct scatterlist *sg: scatterlist to READ/WRITE from/to
u32 sg_cnt: number of entries in sg
u32 sg_offset: current byte offset into sg
u64 remote_addr: remote address to read/write (relative to rkey)
u32 rkey: remote key to operate on
enum dma_data_direction dir: DMA_TO_DEVICE for RDMA WRITE, DMA_FROM_DEVICE for RDMA READ
Description
Returns the number of WQEs that will be needed on the workqueue if successful, or a negative error code.
- int rdma_rw_ctx_signature_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u32 port_num, struct scatterlist *sg, u32 sg_cnt, struct scatterlist *prot_sg, u32 prot_sg_cnt, struct ib_sig_attrs *sig_attrs, u64 remote_addr, u32 rkey, enum dma_data_direction dir)
initialize a RW context with signature offload
Parameters
struct rdma_rw_ctx *ctx: context to initialize
struct ib_qp *qp: queue pair to operate on
u32 port_num: port num to which the connection is bound
struct scatterlist *sg: scatterlist to READ/WRITE from/to
u32 sg_cnt: number of entries in sg
struct scatterlist *prot_sg: scatterlist to READ/WRITE protection information from/to
u32 prot_sg_cnt: number of entries in prot_sg
struct ib_sig_attrs *sig_attrs: signature offloading algorithms
u64 remote_addr: remote address to read/write (relative to rkey)
u32 rkey: remote key to operate on
enum dma_data_direction dir: DMA_TO_DEVICE for RDMA WRITE, DMA_FROM_DEVICE for RDMA READ
Description
Returns the number of WQEs that will be needed on the workqueue if successful, or a negative error code.
- struct ib_send_wr *rdma_rw_ctx_wrs(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u32 port_num, struct ib_cqe *cqe, struct ib_send_wr *chain_wr)
return chain of WRs for a RDMA READ or WRITE operation
Parameters
struct rdma_rw_ctx *ctx: context to operate on
struct ib_qp *qp: queue pair to operate on
u32 port_num: port num to which the connection is bound
struct ib_cqe *cqe: completion queue entry for the last WR
struct ib_send_wr *chain_wr: WR to append to the posted chain
Description
Return the WR chain for the set of RDMA READ/WRITE operations described by ctx, as well as any memory registration operations needed. If chain_wr is non-NULL the WR it points to will be appended to the chain of WRs posted. If chain_wr is not set cqe must be set so that the caller gets a completion notification.
- int rdma_rw_ctx_post(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u32 port_num, struct ib_cqe *cqe, struct ib_send_wr *chain_wr)
post a RDMA READ or RDMA WRITE operation
Parameters
struct rdma_rw_ctx *ctx: context to operate on
struct ib_qp *qp: queue pair to operate on
u32 port_num: port num to which the connection is bound
struct ib_cqe *cqe: completion queue entry for the last WR
struct ib_send_wr *chain_wr: WR to append to the posted chain
Description
Post the set of RDMA READ/WRITE operations described by ctx, as well as any memory registration operations needed. If chain_wr is non-NULL the WR it points to will be appended to the chain of WRs posted. If chain_wr is not set cqe must be set so that the caller gets a completion notification.
- void rdma_rw_ctx_destroy(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u32 port_num, struct scatterlist *sg, u32 sg_cnt, enum dma_data_direction dir)
release all resources allocated by rdma_rw_ctx_init
Parameters
struct rdma_rw_ctx *ctx: context to release
struct ib_qp *qp: queue pair to operate on
u32 port_num: port num to which the connection is bound
struct scatterlist *sg: scatterlist that was used for the READ/WRITE
u32 sg_cnt: number of entries in sg
enum dma_data_direction dir: DMA_TO_DEVICE for RDMA WRITE, DMA_FROM_DEVICE for RDMA READ
- void rdma_rw_ctx_destroy_signature(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u32 port_num, struct scatterlist *sg, u32 sg_cnt, struct scatterlist *prot_sg, u32 prot_sg_cnt, enum dma_data_direction dir)
release all resources allocated by rdma_rw_ctx_signature_init
Parameters
struct rdma_rw_ctx *ctx: context to release
struct ib_qp *qp: queue pair to operate on
u32 port_num: port num to which the connection is bound
struct scatterlist *sg: scatterlist that was used for the READ/WRITE
u32 sg_cnt: number of entries in sg
struct scatterlist *prot_sg: scatterlist that was used for the READ/WRITE of the PI
u32 prot_sg_cnt: number of entries in prot_sg
enum dma_data_direction dir: DMA_TO_DEVICE for RDMA WRITE, DMA_FROM_DEVICE for RDMA READ
- unsigned int rdma_rw_mr_factor(struct ib_device *device, u32 port_num, unsigned int maxpages)
return number of MRs required for a payload
Parameters
struct ib_device *device: device handling the connection
u32 port_num: port num to which the connection is bound
unsigned int maxpages: maximum payload pages per rdma_rw_ctx
Description
Returns the number of MRs the device requires to move maxpayload bytes. The returned value is used during transport creation to compute max_rdma_ctxts and the size of the transport’s Send and Send Completion Queues.
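The arithmetic behind an MR count of this kind can be sketched as a ceiling division. The function mr_factor_sketch() and its pages_per_mr parameter are hypothetical. How the device's per-MR page limit is obtained is not described in this section, so it is passed in as an assumed input.

```c
#include <assert.h>

/*
 * Hypothetical model: number of MRs needed to cover maxpages when a single
 * MR can map at most pages_per_mr pages (an assumed, device-specific value).
 * Written as quotient plus remainder check to avoid overflow in the sum.
 */
static unsigned int mr_factor_sketch(unsigned int maxpages,
                                     unsigned int pages_per_mr)
{
    assert(pages_per_mr > 0);
    return maxpages / pages_per_mr + (maxpages % pages_per_mr != 0);
}
```

For example, with an assumed limit of 64 pages per MR, a 256-page payload needs 4 MRs and a 257-page payload needs 5. A transport could multiply this factor by its number of concurrent contexts when sizing its queues.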
- bool rdma_dev_access_netns(const struct ib_device *dev, const struct net *net)
Return whether an rdma device can be accessed from a specified net namespace or not.
Parameters
const struct ib_device *dev: Pointer to rdma device which needs to be checked
const struct net *net: Pointer to net namespace for which access is to be checked
Description
When the rdma device is in shared mode, it ignores the net namespace. When the rdma device is exclusive to a net namespace, rdma device net namespace is checked against the specified one.
- bool rdma_dev_has_raw_cap(const struct ib_device *dev)
Returns whether a specified rdma device has CAP_NET_RAW capability or not.
Parameters
const struct ib_device *dev: Pointer to rdma device whose capability is to be checked
Description
Returns true if a rdma device’s owning user namespace has CAP_NET_RAW capability, otherwise false. When the rdma subsystem is in legacy shared network namespace mode, the default net namespace is considered.
- void ib_device_put(struct ib_device *device)
Release IB device reference
Parameters
struct ib_device *device: device whose reference is to be released
Description
ib_device_put() releases a reference to the IB device to allow it to be unregistered and eventually freed.
- struct ib_device *ib_device_get_by_name(const char *name, enum rdma_driver_id driver_id)
Find an IB device by name
Parameters
const char *name: The name to look for
enum rdma_driver_id driver_id: The driver ID that must match (RDMA_DRIVER_UNKNOWN matches all)
Description
Find and hold an ib_device by its name. The caller must call ib_device_put() on the returned pointer.
Parameters
size_t size: size of structure to allocate
struct net *net: network namespace device should be located in, namespace must stay valid until ib_register_device() is completed.
Description
Low-level drivers should use ib_alloc_device() to allocate struct ib_device. size is the size of the structure to be allocated, including any private data used by the low-level driver. ib_dealloc_device() must be used to free structures allocated with ib_alloc_device().
- void ib_dealloc_device(struct ib_device *device)
free an IB device struct
Parameters
struct ib_device *device: structure to free
Description
Free a structure allocated with ib_alloc_device().
- const struct ib_port_immutable *ib_port_immutable_read(struct ib_device *dev, unsigned int port)
Read rdma port’s immutable data
Parameters
struct ib_device *dev: IB device
unsigned int port: port number whose immutable data to read. It starts with index 1 and is valid up to and including rdma_end_port().
- int ib_register_device(struct ib_device *device, const char *name, struct device *dma_device)
Register an IB device with IB core
Parameters
struct ib_device *device: Device to register
const char *name: unique string device name. This may include a ‘%’ which will cause a unique index to be added to the passed device name.
struct device *dma_device: pointer to a DMA-capable device. If NULL, then the IB device will be used. In this case the caller should fully setup the ibdev for DMA. This usually means using dma_virt_ops.
Description
Low-level drivers use ib_register_device() to register their devices with the IB core. All registered clients will receive a callback for each device that is added. device must be allocated with ib_alloc_device().
If the driver uses ops.dealloc_driver and calls any ib_unregister_device() asynchronously then the device pointer may become freed as soon as this function returns.
- void ib_unregister_device(struct ib_device *ib_dev)
Unregister an IB device
Parameters
struct ib_device *ib_dev: The device to unregister
Description
Unregister an IB device. All clients will receive a remove callback.
Callers should call this routine only once, and protect against races with registration. Typically it should only be called as part of a remove callback in an implementation of driver core’s struct device_driver and related.
If ops.dealloc_driver is used then ib_dev will be freed upon return from this function.
- void ib_unregister_device_and_put(struct ib_device *ib_dev)
Unregister a device while holding a ‘get’
Parameters
struct ib_device *ib_dev: The device to unregister
Description
This is the same as ib_unregister_device(), except it includes an internal ib_device_put() that should match a ‘get’ obtained by the caller.
It is safe to call this routine concurrently from multiple threads while holding the ‘get’. When the function returns the device is fully unregistered.
Drivers using this flow MUST use the driver_unregister callback to clean up their resources associated with the device and dealloc it.
- void ib_unregister_driver(enum rdma_driver_id driver_id)
Unregister all IB devices for a driver
Parameters
enum rdma_driver_id driver_id: The driver to unregister
Description
This implements a fence for device unregistration. It only returns once all devices associated with the driver_id have fully completed their unregistration and returned from ib_unregister_device*().
If devices are not yet unregistered it goes ahead and starts unregistering them.
This does not block creation of new devices with the given driver_id; that is the responsibility of the caller.
- void ib_unregister_device_queued(struct ib_device *ib_dev)
Unregister a device using a work queue
Parameters
struct ib_device *ib_dev: The device to unregister
Description
This schedules an asynchronous unregistration using a WQ for the device. A driver should use this to avoid holding locks while doing unregistration, such as holding the RTNL lock.
Drivers using this API must use ib_unregister_driver before module unload to ensure that all scheduled unregistrations have completed.
- int ib_register_client(struct ib_client *client)
Register an IB client
Parameters
struct ib_client *client: Client to register
Description
Upper level users of the IB drivers can use ib_register_client() to register callbacks for IB device addition and removal. When an IB device is added, each registered client’s add method will be called (in the order the clients were registered), and when a device is removed, each client’s remove method will be called (in the reverse order that clients were registered). In addition, when ib_register_client() is called, the client will receive an add callback for all devices already registered.
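The callback ordering described above (add in registration order, remove in reverse order) can be modelled with a small userspace sketch. All names here (registry_sketch, register_client_sketch(), device_added_sketch(), device_removed_sketch()) are hypothetical. The model records callbacks in an event log instead of calling real client methods and does not cover locking or devices that were already registered.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CLIENTS_SKETCH 8
#define MAX_EVENTS_SKETCH  32

/* Hypothetical registry: client ids in registration order plus an event log. */
struct registry_sketch {
    int clients[MAX_CLIENTS_SKETCH];
    size_t nr_clients;
    int events[MAX_EVENTS_SKETCH]; /* +id = add callback, -id = remove callback */
    size_t nr_events;
};

static void log_event_sketch(struct registry_sketch *r, int ev)
{
    assert(r->nr_events < MAX_EVENTS_SKETCH);
    r->events[r->nr_events++] = ev;
}

/* Append a client; ids must be positive so the sign can encode add/remove. */
static void register_client_sketch(struct registry_sketch *r, int id)
{
    assert(id > 0 && r->nr_clients < MAX_CLIENTS_SKETCH);
    r->clients[r->nr_clients++] = id;
}

/* Device addition: notify clients in the order they were registered. */
static void device_added_sketch(struct registry_sketch *r)
{
    for (size_t i = 0; i < r->nr_clients; i++)
        log_event_sketch(r, r->clients[i]);
}

/* Device removal: notify clients in reverse registration order. */
static void device_removed_sketch(struct registry_sketch *r)
{
    for (size_t i = r->nr_clients; i > 0; i--)
        log_event_sketch(r, -r->clients[i - 1]);
}
```

Registering clients 1, 2 and 3 and then adding and removing one device yields the log 1, 2, 3, -3, -2, -1. The reverse order on removal lets a later client that depends on an earlier one tear down first.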
- void ib_unregister_client(struct ib_client *client)
Unregister an IB client
Parameters
struct ib_client *client: Client to unregister
Description
Upper level users use ib_unregister_client() to remove their client registration. When ib_unregister_client() is called, the client will receive a remove callback for each IB device still registered.
This is a full fence, once it returns no client callbacks will be called, or are running in another thread.
- void ib_set_client_data(struct ib_device *device, struct ib_client *client, void *data)
Set IB client context
Parameters
struct ib_device *device: Device to set context for
struct ib_client *client: Client to set context for
void *data: Context to set
Description
ib_set_client_data() sets client context data that can be retrieved with ib_get_client_data(). This can only be called while the client is registered to the device, once the ib_client remove() callback returns this cannot be called.
- void ib_register_event_handler(struct ib_event_handler *event_handler)
Register an IB event handler
Parameters
struct ib_event_handler *event_handler: Handler to register
Description
ib_register_event_handler() registers an event handler that will be called back when asynchronous IB events occur (as defined in chapter 11 of the InfiniBand Architecture Specification). This callback occurs in workqueue context.
- void ib_unregister_event_handler(struct ib_event_handler *event_handler)
Unregister an event handler
Parameters
struct ib_event_handler *event_handler: Handler to unregister
Description
Unregister an event handler registered with ib_register_event_handler().
- int ib_query_port(struct ib_device *device, u32 port_num, struct ib_port_attr *port_attr)
Query IB port attributes
Parameters
struct ib_device *device: Device to query
u32 port_num: Port number to query
struct ib_port_attr *port_attr: Port attributes
Description
ib_query_port() returns the attributes of a port through the port_attr pointer.
- int ib_device_set_netdev(struct ib_device *ib_dev, struct net_device *ndev, u32 port)
Associate the ib_dev with an underlying net_device
Parameters
struct ib_device *ib_dev: Device to modify
struct net_device *ndev: net_device to affiliate, may be NULL
u32 port: IB port the net_device is connected to
Description
Drivers should use this to link the ib_device to a netdev so the netdev shows up in interfaces like ib_enum_roce_netdev. Only one netdev may be affiliated with any port.
The caller must ensure that the given ndev is not unregistered or unregistering, and that either the ib_device is unregistered or ib_device_set_netdev() is called with NULL when the ndev sends a NETDEV_UNREGISTER event.
- int ib_query_netdev_port(struct ib_device *ibdev, struct net_device *ndev, u32 *port)
Query the port number of a net_device associated with an ibdev
Parameters
struct ib_device *ibdev: IB device
struct net_device *ndev: Network device
u32 *port: IB port the net_device is connected to
- struct ib_device *ib_device_get_by_netdev(struct net_device *ndev, enum rdma_driver_id driver_id)
Find an IB device associated with a netdev
Parameters
struct net_device *ndev: netdev to locate
enum rdma_driver_id driver_id: The driver ID that must match (RDMA_DRIVER_UNKNOWN matches all)
Description
Find and hold an ib_device that is associated with a netdev via ib_device_set_netdev(). The caller must call ib_device_put() on the returned pointer.
- int ib_query_pkey(struct ib_device *device, u32 port_num, u16 index, u16 *pkey)
Get P_Key table entry
Parameters
struct ib_device *device: Device to query
u32 port_num: Port number to query
u16 index: P_Key table index to query
u16 *pkey: Returned P_Key
Description
ib_query_pkey() fetches the specified P_Key table entry.
- int ib_modify_device(struct ib_device *device, int device_modify_mask, struct ib_device_modify *device_modify)
Change IB device attributes
Parameters
struct ib_device *device: Device to modify
int device_modify_mask: Mask of attributes to change
struct ib_device_modify *device_modify: New attribute values
Description
ib_modify_device() changes a device’s attributes as specified by the device_modify_mask and device_modify structure.
- int ib_modify_port(struct ib_device *device, u32 port_num, int port_modify_mask, struct ib_port_modify *port_modify)
Modifies the attributes for the specified port.
Parameters
struct ib_device *device: The device to modify.
u32 port_num: The number of the port to modify.
int port_modify_mask: Mask used to specify which attributes of the port to change.
struct ib_port_modify *port_modify: New attribute values for the port.
Description
ib_modify_port() changes a port’s attributes as specified by the port_modify_mask and port_modify structure.
- int ib_find_gid(struct ib_device *device, union ib_gid *gid, u32 *port_num, u16 *index)
Returns the port number and GID table index where a specified GID value occurs. It searches only for the IB link layer.
Parameters
struct ib_device *device: The device to query.
union ib_gid *gid: The GID value to search for.
u32 *port_num: The port number of the device where the GID value was found.
u16 *index: The index into the GID table where the GID was found. This parameter may be NULL.
- int ib_find_pkey(struct ib_device *device, u32 port_num, u16 pkey, u16 *index)
Returns the PKey table index where a specified PKey value occurs.
Parameters
struct ib_device *device: The device to query.
u32 port_num: The port number of the device to search for the PKey.
u16 pkey: The PKey value to search for.
u16 *index: The index into the PKey table where the PKey was found.
- struct net_device *ib_get_net_dev_by_params(struct ib_device *dev, u32 port, u16 pkey, const union ib_gid *gid, const struct sockaddr *addr)
Return the appropriate net_dev for a received CM request
Parameters
struct ib_device *dev: An RDMA device on which the request has been received.
u32 port: Port number on the RDMA device.
u16 pkey: The Pkey the request came on.
const union ib_gid *gid: A GID that the net_dev uses to communicate.
const struct sockaddr *addr: Contains the IP address that the request specified as its destination.
- struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags, const char *caller)
Allocates an unused protection domain.
Parameters
struct ib_device *device: The device on which to allocate the protection domain.
unsigned int flags: protection domain flags
const char *caller: caller’s build-time module name
Description
A protection domain object provides an association between QPs, shared receive queues, address handles, memory regions, and memory windows.
Every PD has a local_dma_lkey which can be used as the lkey value for local memory operations.
- int ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata)
Deallocates a protection domain.
Parameters
struct ib_pd *pd: The protection domain to deallocate.
struct ib_udata *udata: Valid user data or NULL for kernel object
Description
It is an error to call this function while any resources in the pd still exist. The caller is responsible to synchronously destroy them and guarantee no new allocations will happen.
- void rdma_copy_ah_attr(struct rdma_ah_attr *dest, const struct rdma_ah_attr *src)
Copy rdma ah attribute from source to destination.
Parameters
struct rdma_ah_attr *dest: Pointer to destination ah_attr. The contents of the destination are assumed to be invalid and its attributes are overwritten.
const struct rdma_ah_attr *src: Pointer to source ah_attr.
- void rdma_replace_ah_attr(struct rdma_ah_attr *old, const struct rdma_ah_attr *new)
Replace valid ah_attr with new one.
Parameters
struct rdma_ah_attr *old: Pointer to existing ah_attr which needs to be replaced. old is assumed to be valid or zero’d
const struct rdma_ah_attr *new: Pointer to the new ah_attr.
Description
rdma_replace_ah_attr() first releases any reference in the old ah_attr if the old ah_attr is valid; after that it copies the new attribute and holds the reference to the replaced ah_attr.
- void rdma_move_ah_attr(struct rdma_ah_attr *dest, struct rdma_ah_attr *src)
Move ah_attr pointed by source to destination.
Parameters
struct rdma_ah_attr *dest: Pointer to destination ah_attr to copy to. dest is assumed to be valid or zero’d
struct rdma_ah_attr *src: Pointer to the new ah_attr.
Description
rdma_move_ah_attr() first releases any reference in the destination ah_attr if it is valid. This also transfers ownership of internal references from src to dest, making src invalid in the process. No new reference of the src ah_attr is taken.
- struct ib_ah *rdma_create_ah(struct ib_pd *pd, struct rdma_ah_attr *ah_attr, u32 flags)
Creates an address handle for the given address vector.
Parameters
struct ib_pd *pd: The protection domain associated with the address handle.
struct rdma_ah_attr *ah_attr: The attributes of the address vector.
u32 flags: Create address handle flags (see enum rdma_create_ah_flags).
Description
It returns 0 on success and returns appropriate error code on error. The address handle is used to reference a local or global destination in all UD QP post sends.
- struct ib_ah *rdma_create_user_ah(struct ib_pd *pd, struct rdma_ah_attr *ah_attr, struct ib_udata *udata)
Creates an address handle for the given address vector. It resolves destination mac address for ah attribute of RoCE type.
Parameters
struct ib_pd *pd: The protection domain associated with the address handle.
struct rdma_ah_attr *ah_attr: The attributes of the address vector.
struct ib_udata *udata: pointer to the user’s input/output buffer information needed by the provider driver.
Description
It returns 0 on success and returns appropriate error code on error. The address handle is used to reference a local or global destination in all UD QP post sends.
- void rdma_move_grh_sgid_attr(struct rdma_ah_attr *attr, union ib_gid *dgid, u32 flow_label, u8 hop_limit, u8 traffic_class, const struct ib_gid_attr *sgid_attr)
Sets the sgid attribute of GRH, taking ownership of the reference
Parameters
struct rdma_ah_attr *attr: Pointer to AH attribute structure
union ib_gid *dgid: Destination GID
u32 flow_label: Flow label
u8 hop_limit: Hop limit
u8 traffic_class: traffic class
const struct ib_gid_attr *sgid_attr: Pointer to SGID attribute
Description
This takes ownership of the sgid_attr reference. The caller must ensure rdma_destroy_ah_attr() is called before destroying the rdma_ah_attr after calling this function.
- void rdma_destroy_ah_attr(struct rdma_ah_attr *ah_attr)
Release reference to SGID attribute of ah attribute.
Parameters
struct rdma_ah_attr *ah_attr: Pointer to ah attribute
Description
Release reference to the SGID attribute of the ah attribute if it is non NULL. It is safe to call this multiple times, and safe to call it on a zero initialized ah_attr.
- struct ib_srq *ib_create_srq_user(struct ib_pd *pd, struct ib_srq_init_attr *srq_init_attr, struct ib_usrq_object *uobject, struct ib_udata *udata)
Creates a SRQ associated with the specified protection domain.
Parameters
struct ib_pd *pd: The protection domain associated with the SRQ.
struct ib_srq_init_attr *srq_init_attr: A list of initial attributes required to create the SRQ. If SRQ creation succeeds, then the attributes are updated to the actual capabilities of the created SRQ.
struct ib_usrq_object *uobject: uobject pointer if this is not a kernel SRQ
struct ib_udata *udata: udata pointer if this is not a kernel SRQ
Description
srq_attr->max_wr and srq_attr->max_sge are read to determine the requested size of the SRQ, and set to the actual values allocated on return. If ib_create_srq() succeeds, then max_wr and max_sge will always be at least as large as the requested values.
- struct ib_qp *ib_create_qp_user(struct ib_device *dev, struct ib_pd *pd, struct ib_qp_init_attr *attr, struct ib_udata *udata, struct ib_uqp_object *uobj, const char *caller)
Creates a QP associated with the specified protection domain.
Parameters
struct ib_device *dev: IB device
struct ib_pd *pd: The protection domain associated with the QP.
struct ib_qp_init_attr *attr: A list of initial attributes required to create the QP. If QP creation succeeds, then the attributes are updated to the actual capabilities of the created QP.
struct ib_udata *udata: User data
struct ib_uqp_object *uobj: uverbs object
const char *caller: caller’s build-time module name
- intib_modify_qp_with_udata(structib_qp*ib_qp,structib_qp_attr*attr,intattr_mask,structib_udata*udata)¶
Modifies the attributes for the specified QP.
Parameters
structib_qp*ib_qpThe QP to modify.
structib_qp_attr*attrOn input, specifies the QP attributes to modify. On output, the current values of selected QP attributes are returned.
intattr_maskA bit-mask used to specify which attributes of the QP are being modified.
structib_udata*udatapointer to user’s input output buffer information
Description
Returns 0 on success and an appropriate error code on error.
- structib_mr*ib_alloc_mr(structib_pd*pd,enumib_mr_typemr_type,u32max_num_sg)¶
Allocates a memory region
Parameters
structib_pd*pdprotection domain associated with the region
enumib_mr_typemr_typememory region type
u32max_num_sgmaximum sg entries available for registration.
Notes
Memory registration page/sg lists must not exceed max_num_sg. For mr_type IB_MR_TYPE_MEM_REG, the total length cannot exceed max_num_sg * used_page_size.
- structib_mr*ib_alloc_mr_integrity(structib_pd*pd,u32max_num_data_sg,u32max_num_meta_sg)¶
Allocates an integrity memory region
Parameters
structib_pd*pdprotection domain associated with the region
u32max_num_data_sgmaximum data sg entries available for registration
u32max_num_meta_sgmaximum metadata sg entries available for registration
Notes
Memory registration page/sg lists must not exceed max_num_data_sg; also, the integrity page/sg lists must not exceed max_num_meta_sg.
- structib_xrcd*ib_alloc_xrcd_user(structib_device*device,structinode*inode,structib_udata*udata)¶
Allocates an XRC domain.
Parameters
structib_device*deviceThe device on which to allocate the XRC domain.
structinode*inodeinode to connect XRCD
structib_udata*udataValid user data or NULL for kernel object
- intib_dealloc_xrcd_user(structib_xrcd*xrcd,structib_udata*udata)¶
Deallocates an XRC domain.
Parameters
structib_xrcd*xrcdThe XRC domain to deallocate.
structib_udata*udataValid user data or NULL for kernel object
- structib_wq*ib_create_wq(structib_pd*pd,structib_wq_init_attr*wq_attr)¶
Creates a WQ associated with the specified protection domain.
Parameters
structib_pd*pdThe protection domain associated with the WQ.
structib_wq_init_attr*wq_attrA list of initial attributes required to create the WQ. If WQ creation succeeds, then the attributes are updated to the actual capabilities of the created WQ.
Description
wq_attr->max_wr and wq_attr->max_sge determine the requested size of the WQ, and are set to the actual values allocated on return. If ib_create_wq() succeeds, then max_wr and max_sge will always be at least as large as the requested values.
- intib_destroy_wq_user(structib_wq*wq,structib_udata*udata)¶
Destroys the specified user WQ.
Parameters
structib_wq*wqThe WQ to destroy.
structib_udata*udataValid user data
- intib_map_mr_sg_pi(structib_mr*mr,structscatterlist*data_sg,intdata_sg_nents,unsignedint*data_sg_offset,structscatterlist*meta_sg,intmeta_sg_nents,unsignedint*meta_sg_offset,unsignedintpage_size)¶
Map the dma mapped SG lists for PI (protection information) and set an appropriate memory region for registration.
Parameters
structib_mr*mrmemory region
structscatterlist*data_sgdma mapped scatterlist for data
intdata_sg_nentsnumber of entries in data_sg
unsignedint*data_sg_offsetoffset in bytes into data_sg
structscatterlist*meta_sgdma mapped scatterlist for metadata
intmeta_sg_nentsnumber of entries in meta_sg
unsignedint*meta_sg_offsetoffset in bytes into meta_sg
unsignedintpage_sizepage vector desired page size
Description
Constraints:
The MR must be allocated with type IB_MR_TYPE_INTEGRITY.
After this completes successfully, the memory region is ready for registration.
Return
0 on success.
- intib_map_mr_sg(structib_mr*mr,structscatterlist*sg,intsg_nents,unsignedint*sg_offset,unsignedintpage_size)¶
Map the largest prefix of a dma mapped SG list and set it in the memory region.
Parameters
structib_mr*mrmemory region
structscatterlist*sgdma mapped scatterlist
intsg_nentsnumber of entries in sg
unsignedint*sg_offsetoffset in bytes into sg
unsignedintpage_sizepage vector desired page size
Description
Constraints:
The first sg element is allowed to have an offset.
Each sg element must either be aligned to page_size or virtually contiguous to the previous element. In case an sg element has a non-contiguous offset, the mapping prefix will not include it.
The last sg element is allowed to have length less than page_size.
If the total byte length of the sg_nents entries exceeds the mr max_num_sg * page_size, then only max_num_sg entries will be mapped.
If the MR was allocated with type IB_MR_TYPE_SG_GAPS, none of these constraints holds and the page_size argument is ignored.
Returns the number of sg elements that were mapped to the memory region.
After this completes successfully, the memory region is ready for registration.
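The alignment constraints above can be illustrated with a simplified user-space model. This is a sketch, not the kernel's code: model_sg and model_mappable_prefix() are hypothetical names, and the model deliberately ignores sg_offset, the max_num_sg cap, and the IB_MR_TYPE_SG_GAPS case. It only counts how many leading elements satisfy the "page aligned or contiguous with the previous element" rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one DMA-mapped sg element. */
struct model_sg {
    uint64_t dma_addr;
    uint32_t len;
};

/*
 * Count the leading elements that satisfy the alignment rules for
 * page_size (assumed to be a power of two). Element 0 may start at any
 * offset. Element i > 0 is accepted when it starts exactly where element
 * i-1 ended, or when element i-1 ended on a page boundary and element i
 * starts on one. The last accepted element may end mid-page.
 */
static int model_mappable_prefix(const struct model_sg *sg, int nents,
                                 uint64_t page_size)
{
    uint64_t mask = page_size - 1;
    int i;

    assert(page_size != 0 && (page_size & mask) == 0);

    for (i = 1; i < nents; i++) {
        uint64_t prev_end = sg[i - 1].dma_addr + sg[i - 1].len;
        int contiguous = (sg[i].dma_addr == prev_end);
        int aligned = ((prev_end & mask) == 0) &&
                      ((sg[i].dma_addr & mask) == 0);

        if (!contiguous && !aligned)
            break;
    }
    return nents > 0 ? i : 0;
}
```

In this model a list whose third element ends mid-page, followed by an element that is neither contiguous nor page aligned, yields a prefix of three elements; a caller would then map the remainder with a second MR.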
- intib_sg_to_pages(structib_mr*mr,structscatterlist*sgl,intsg_nents,unsignedint*sg_offset_p,int(*set_page)(structib_mr*,u64))¶
Convert the largest prefix of a sg list to a page vector
Parameters
structib_mr*mrmemory region
structscatterlist*sgldma mapped scatterlist
intsg_nentsnumber of entries in sg
unsignedint*sg_offset_pIN: start offset in bytes into sg. OUT: offset in bytes for element n of the sg of the first byte that has not been processed, where n is the return value of this function.
int(*set_page)(structib_mr*,u64)driver page assignment function pointer
Description
Core service helper for drivers to convert the largest prefix of a given sg list to a page vector. The sg list prefix converted is the prefix that meets the requirements of ib_map_mr_sg.
Returns the number of sg elements that were assigned to a page vector.
- voidib_drain_sq(structib_qp*qp)¶
Block until all SQ CQEs have been consumed by the application.
Parameters
structib_qp*qpqueue pair to drain
Description
If the device has a provider-specific drain function, then call that. Otherwise call the generic drain function __ib_drain_sq().
The caller must:
ensure there is room in the CQ and SQ for the drain work request and completion.
allocate the CQ using ib_alloc_cq().
ensure that there are no other contexts that are posting WRs concurrently. Otherwise the drain is not guaranteed.
- voidib_drain_rq(structib_qp*qp)¶
Block until all RQ CQEs have been consumed by the application.
Parameters
structib_qp*qpqueue pair to drain
Description
If the device has a provider-specific drain function, then call that. Otherwise call the generic drain function __ib_drain_rq().
The caller must:
ensure there is room in the CQ and RQ for the drain work request and completion.
allocate the CQ using ib_alloc_cq().
ensure that there are no other contexts that are posting WRs concurrently. Otherwise the drain is not guaranteed.
- voidib_drain_qp(structib_qp*qp)¶
Block until all CQEs have been consumed by the application on both the RQ and SQ.
Parameters
structib_qp*qpqueue pair to drain
Description
The caller must:
ensure there is room in the CQ(s), SQ, and RQ for drain work requests and completions.
allocate the CQs using ib_alloc_cq().
ensure that there are no other contexts that are posting WRs concurrently. Otherwise the drain is not guaranteed.
- structrdma_hw_stats*rdma_alloc_hw_stats_struct(conststructrdma_stat_desc*descs,intnum_counters,unsignedlonglifespan)¶
Helper function to allocate dynamic struct for the drivers.
Parameters
conststructrdma_stat_desc*descsarray of static descriptors
intnum_countersnumber of elements in array
unsignedlonglifespanmilliseconds between updates
- voidrdma_free_hw_stats_struct(structrdma_hw_stats*stats)¶
Helper function to release rdma_hw_stats
Parameters
structrdma_hw_stats*statsstatistics to release
- voidib_pack(conststructib_field*desc,intdesc_len,void*structure,void*buf)¶
Pack a structure into a buffer
Parameters
conststructib_field*descArray of structure field descriptions
intdesc_lenNumber of entries in desc
void*structureStructure to pack from
void*bufBuffer to pack into
Description
ib_pack() packs a list of structure fields into a buffer, controlled by the array of fields in desc.
- voidib_unpack(conststructib_field*desc,intdesc_len,void*buf,void*structure)¶
Unpack a buffer into a structure
Parameters
conststructib_field*descArray of structure field descriptions
intdesc_lenNumber of entries in desc
void*bufBuffer to unpack from
void*structureStructure to unpack into
Description
ib_unpack() unpacks a list of structure fields from a buffer, controlled by the array of fields in desc.
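Descriptor-driven packing and unpacking can be sketched in user space as follows. Everything here is an assumption made for illustration: model_field is a hypothetical descriptor that does not match the layout of struct ib_field, and the model only handles 32-bit members of up to 32 bits, written MSB first. It shows the general idea of a table of field descriptions driving both directions of the conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor: where a field lives in the struct and buffer. */
struct model_field {
    size_t struct_offset;  /* byte offset of a uint32_t member */
    unsigned bit_offset;   /* first bit in the buffer; bit 0 is the MSB of byte 0 */
    unsigned bit_width;    /* 1..32 */
};

/* Copy each described field from the structure into the buffer, MSB first. */
static void model_pack(const struct model_field *desc, int desc_len,
                       const void *structure, uint8_t *buf)
{
    for (int i = 0; i < desc_len; i++) {
        uint32_t val;

        memcpy(&val, (const char *)structure + desc[i].struct_offset,
               sizeof(val));
        for (unsigned b = 0; b < desc[i].bit_width; b++) {
            unsigned pos = desc[i].bit_offset + b;
            unsigned bit = (val >> (desc[i].bit_width - 1 - b)) & 1u;
            uint8_t m = (uint8_t)(0x80u >> (pos % 8));

            if (bit)
                buf[pos / 8] |= m;
            else
                buf[pos / 8] &= (uint8_t)~m;
        }
    }
}

/* Inverse of model_pack(): rebuild each described field from the buffer. */
static void model_unpack(const struct model_field *desc, int desc_len,
                         const uint8_t *buf, void *structure)
{
    for (int i = 0; i < desc_len; i++) {
        uint32_t val = 0;

        for (unsigned b = 0; b < desc[i].bit_width; b++) {
            unsigned pos = desc[i].bit_offset + b;

            val = (val << 1) | ((buf[pos / 8] >> (7 - pos % 8)) & 1u);
        }
        memcpy((char *)structure + desc[i].struct_offset, &val, sizeof(val));
    }
}
```

Using one descriptor table for both directions keeps the wire layout defined in a single place, which is the design the desc parameter suggests.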
- voidib_sa_cancel_query(intid,structib_sa_query*query)¶
try to cancel an SA query
Parameters
intidID of query to cancel
structib_sa_query*queryquery pointer to cancel
Description
Try to cancel an SA query. If the id and query don’t match up or the query has already completed, nothing is done. Otherwise the query is canceled and will complete with a status of -EINTR.
- intib_init_ah_attr_from_path(structib_device*device,u32port_num,structsa_path_rec*rec,structrdma_ah_attr*ah_attr,conststructib_gid_attr*gid_attr)¶
Initialize address handle attributes based on an SA path record.
Parameters
structib_device*deviceDevice associated with the ah attributes initialization.
u32port_numPort on the specified device.
structsa_path_rec*recpath record entry to use for ah attributes initialization.
structrdma_ah_attr*ah_attraddress handle attributes to initialize from the path record.
conststructib_gid_attr*gid_attrSGID attribute to consider during initialization.
Description
When ib_init_ah_attr_from_path() returns success, (a) for the IB link layer, ah_attr optionally contains a reference to the SGID attribute when GRH is present; (b) for the RoCE link layer, it contains a reference to the SGID attribute. The user must invoke rdma_destroy_ah_attr() to release the reference to SGID attributes which are initialized using ib_init_ah_attr_from_path().
- intib_sa_path_rec_get(structib_sa_client*client,structib_device*device,u32port_num,structsa_path_rec*rec,ib_sa_comp_maskcomp_mask,unsignedlongtimeout_ms,gfp_tgfp_mask,void(*callback)(intstatus,structsa_path_rec*resp,unsignedintnum_paths,void*context),void*context,structib_sa_query**sa_query)¶
Start a Path get query
Parameters
structib_sa_client*clientSA client
structib_device*devicedevice to send query on
u32port_numport number to send query on
structsa_path_rec*recPath Record to send in query
ib_sa_comp_maskcomp_maskcomponent mask to send in query
unsignedlongtimeout_mstime to wait for response
gfp_tgfp_maskGFP mask to use for internal allocations
void(*callback)(intstatus,structsa_path_rec*resp,unsignedintnum_paths,void*context)function called when query completes, times out or is canceled
void*contextopaque user context passed to callback
structib_sa_query**sa_queryquery context, used to cancel query
Description
Send a Path Record Get query to the SA to look up a path. The callback function will be called when the query completes (or fails); status is 0 for a successful response, -EINTR if the query is canceled, -ETIMEDOUT if the query timed out, or -EIO if an error occurred sending the query. The resp parameter of the callback is only valid if status is 0.
If the return value of ib_sa_path_rec_get() is negative, it is an error code. Otherwise it is a query ID that can be used to cancel the query.
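The status and return-value conventions described here can be summarized in a few user-space helpers. These are hypothetical illustrations (the model_ names are not kernel functions); they only encode the rules stated in the description: which status values a callback may see, when resp may be dereferenced, and how to tell an error from a query ID.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: describe the status passed to a query callback. */
static const char *model_sa_status_str(int status)
{
    switch (status) {
    case 0:
        return "ok";
    case -EINTR:
        return "canceled";
    case -ETIMEDOUT:
        return "timed out";
    case -EIO:
        return "send error";
    default:
        return "unknown";
    }
}

/* The resp argument of the callback is only meaningful when status is 0. */
static int model_resp_is_valid(int status)
{
    return status == 0;
}

/* A negative return from the query call is an error; otherwise a query ID. */
static int model_is_query_id(int ret)
{
    return ret >= 0;
}
```

A callback written against these rules would check model_resp_is_valid(status) before touching resp, and a caller would keep the return value for a later cancel only when model_is_query_id() holds.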
- intib_sa_service_rec_get(structib_sa_client*client,structib_device*device,u32port_num,structsa_service_rec*rec,ib_sa_comp_maskcomp_mask,unsignedlongtimeout_ms,gfp_tgfp_mask,void(*callback)(intstatus,structsa_service_rec*resp,unsignedintnum_services,void*context),void*context,structib_sa_query**sa_query)¶
Start a Service get query
Parameters
structib_sa_client*clientSA client
structib_device*devicedevice to send query on
u32port_numport number to send query on
structsa_service_rec*recService Record to send in query
ib_sa_comp_maskcomp_maskcomponent mask to send in query
unsignedlongtimeout_mstime to wait for response
gfp_tgfp_maskGFP mask to use for internal allocations
void(*callback)(intstatus,structsa_service_rec*resp,unsignedintnum_services,void*context)function called when query completes, times out or is canceled
void*contextopaque user context passed to callback
structib_sa_query**sa_queryquery context, used to cancel query
Description
Send a Service Record Get query to the SA to look up a service record. The callback function will be called when the query completes (or fails); status is 0 for a successful response, -EINTR if the query is canceled, -ETIMEDOUT if the query timed out, or -EIO if an error occurred sending the query. The resp parameter of the callback is only valid if status is 0.
If the return value of ib_sa_service_rec_get() is negative, it is an error code. Otherwise it is a query ID that can be used to cancel the query.
- intib_ud_header_init(intpayload_bytes,intlrh_present,inteth_present,intvlan_present,intgrh_present,intip_version,intudp_present,intimmediate_present,structib_ud_header*header)¶
Initialize UD header structure
Parameters
intpayload_bytesLength of packet payload
intlrh_presentspecify if LRH is present
inteth_presentspecify if Eth header is present
intvlan_presentpacket is tagged vlan
intgrh_presentGRH flag (if non-zero, GRH will be included)
intip_versionif non-zero, IP header, V4 or V6, will be included
intudp_presentif non-zero, UDP header will be included
intimmediate_presentspecify if immediate data is present
structib_ud_header*headerStructure to initialize
- intib_ud_header_pack(structib_ud_header*header,void*buf)¶
Pack UD header struct into wire format
Parameters
structib_ud_header*headerUD header struct
void*bufBuffer to pack into
Description
ib_ud_header_pack() packs the UD header structure header into wire format in the buffer buf.
- unsignedlongib_umem_find_best_pgsz(structib_umem*umem,unsignedlongpgsz_bitmap,unsignedlongvirt)¶
Find best HW page size to use for this MR
Parameters
structib_umem*umemumem struct
unsignedlongpgsz_bitmapbitmap of HW supported page sizes
unsignedlongvirtIOVA
Description
This helper is intended for HW that supports multiple page sizes but can do only a single page size in an MR.
Returns 0 if the umem requires page sizes not supported by the driver to be mapped. Drivers always supporting PAGE_SIZE or smaller will never see a 0 result.
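The idea of choosing a single page size for a whole MR can be sketched with a simplified user-space model. This is an assumption-laden illustration, not the kernel's algorithm: model_block and model_best_pgsz() are hypothetical, and the model treats the umem as a plain array of DMA blocks. A page size is accepted when every block has the same in-page offset in DMA and IOVA space and every interior block boundary falls on a page boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one physically contiguous DMA block. */
struct model_block {
    uint64_t dma_addr;
    uint64_t len;
};

/*
 * Return the largest page size from pgsz_bitmap that can map all blocks
 * starting at IOVA virt, or 0 if no supported size works.
 */
static uint64_t model_best_pgsz(const struct model_block *blk, size_t n,
                                uint64_t pgsz_bitmap, uint64_t virt)
{
    uint64_t mask = 0;
    uint64_t va = virt;

    for (size_t i = 0; i < n; i++) {
        /* Low bits that differ between DMA address and IOVA rule out sizes. */
        mask |= blk[i].dma_addr ^ va;
        va += blk[i].len;
        /* Interior boundaries must be aligned to the chosen page size. */
        if (i + 1 < n)
            mask |= va;
    }

    /* Try the supported sizes from largest to smallest. */
    for (int bit = 63; bit >= 0; bit--) {
        uint64_t pgsz = (uint64_t)1 << bit;

        if ((pgsz_bitmap & pgsz) && (mask & (pgsz - 1)) == 0)
            return pgsz;
    }
    return 0;
}
```

With two 64 KiB blocks and a bitmap offering 4 KiB, 64 KiB and 2 MiB, the model selects 64 KiB; if the bitmap offers only 2 MiB it returns 0, matching the "page sizes not supported by the driver" case in the description.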
- structib_umem*ib_umem_get(structib_device*device,unsignedlongaddr,size_tsize,intaccess)¶
Pin and DMA map userspace memory.
Parameters
structib_device*deviceIB device to connect UMEM
unsignedlongaddruserspace virtual address to start at
size_tsizelength of region to pin
intaccessIB_ACCESS_xxx flags for memory being pinned
- voidib_umem_release(structib_umem*umem)¶
release memory pinned with ib_umem_get
Parameters
structib_umem*umemumem struct to release
- structib_umem_odp*ib_umem_odp_alloc_implicit(structib_device*device,intaccess)¶
Allocate a parent implicit ODP umem
Parameters
structib_device*deviceIB device to create UMEM
intaccessib_reg_mr access flags
Description
Implicit ODP umems do not have a VA range and do not have any page lists. They exist only to hold the per_mm reference to help the driver create children umems.
- structib_umem_odp*ib_umem_odp_alloc_child(structib_umem_odp*root,unsignedlongaddr,size_tsize,conststructmmu_interval_notifier_ops*ops)¶
Allocate a child ODP umem under an implicit parent ODP umem
Parameters
structib_umem_odp*rootThe parent umem enclosing the child. This must be allocated using ib_alloc_implicit_odp_umem()
unsignedlongaddrThe starting userspace VA
size_tsizeThe length of the userspace VA
conststructmmu_interval_notifier_ops*opsMMU interval ops, currently only invalidate
- structib_umem_odp*ib_umem_odp_get(structib_device*device,unsignedlongaddr,size_tsize,intaccess,conststructmmu_interval_notifier_ops*ops)¶
Create a umem_odp for a userspace va
Parameters
structib_device*deviceIB device struct to get UMEM
unsignedlongaddruserspace virtual address to start at
size_tsizelength of region to pin
intaccessIB_ACCESS_xxx flags for memory being pinned
conststructmmu_interval_notifier_ops*opsMMU interval ops, currently only invalidate
Description
The driver should use this when the access flags indicate ODP memory. It avoids pinning; instead, it stores the mm for future page fault handling in conjunction with MMU notifiers.
- intib_umem_odp_map_dma_and_lock(structib_umem_odp*umem_odp,u64user_virt,u64bcnt,u64access_mask,boolfault)¶
DMA map userspace memory in an ODP MR and lock it.
Parameters
structib_umem_odp*umem_odpthe umem to map and pin
u64user_virtthe address from which we need to map.
u64bcntthe minimal number of bytes to pin and map. The mapping might be bigger due to alignment, and may also be smaller in case of an error pinning or mapping a page. The actual number of pages mapped is returned in the return value.
u64access_maskbit mask of the requested access permissions for the given range.
boolfaultis faulting required for the given range
Description
Maps the range passed in the argument to DMA addresses. Upon success the ODP MR will be locked to let the caller complete its device page table update.
Returns the number of pages mapped on success, or a negative error code on failure.
RDMA Verbs transport library¶
- intrvt_fast_reg_mr(structrvt_qp*qp,structib_mr*ibmr,u32key,intaccess)¶
fast register physical MR
Parameters
structrvt_qp*qpthe queue pair where the work request comes from
structib_mr*ibmrthe memory region to be registered
u32keyupdated key for this memory region
intaccessaccess flags for this memory region
Description
Returns 0 on success.
- intrvt_invalidate_rkey(structrvt_qp*qp,u32rkey)¶
invalidate an MR rkey
Parameters
structrvt_qp*qpqueue pair associated with the invalidate op
u32rkeyrkey to invalidate
Description
Returns 0 on success.
- intrvt_lkey_ok(structrvt_lkey_table*rkt,structrvt_pd*pd,structrvt_sge*isge,structrvt_sge*last_sge,structib_sge*sge,intacc)¶
check IB SGE for validity and initialize
Parameters
structrvt_lkey_table*rkttable containing lkey to check SGE against
structrvt_pd*pdprotection domain
structrvt_sge*isgeoutgoing internal SGE
structrvt_sge*last_sgelast outgoing SGE written
structib_sge*sgeSGE to check
intaccaccess flags
Description
Check the IB SGE for validity and initialize our internal version of it.
Increments the reference count when a new sge is stored.
Return
0 if compressed, 1 if added, otherwise returns -errno.
- intrvt_rkey_ok(structrvt_qp*qp,structrvt_sge*sge,u32len,u64vaddr,u32rkey,intacc)¶
check the IB virtual address, length, and RKEY
Parameters
structrvt_qp*qpqp for validation
structrvt_sge*sgeSGE state
u32lenlength of data
u64vaddrvirtual address to place data
u32rkeyrkey to check
intaccaccess flags
Return
1 if successful, otherwise 0.
Description
increments the reference count upon success
- __be32rvt_compute_aeth(structrvt_qp*qp)¶
compute the AETH (syndrome + MSN)
Parameters
structrvt_qp*qpthe queue pair to compute the AETH for
Description
Returns the AETH.
- voidrvt_get_credit(structrvt_qp*qp,u32aeth)¶
flush the send work queue of a QP
Parameters
structrvt_qp*qpthe qp whose send work queue to flush
u32aeththe Acknowledge Extended Transport Header
Description
The QP s_lock should be held.
- u32rvt_restart_sge(structrvt_sge_state*ss,structrvt_swqe*wqe,u32len)¶
rewind the sge state for a wqe
Parameters
structrvt_sge_state*ssthe sge state pointer
structrvt_swqe*wqethe wqe to rewind
u32lenthe data length from the start of the wqe in bytes
Description
Returns the remaining data length.
- intrvt_check_ah(structib_device*ibdev,structrdma_ah_attr*ah_attr)¶
validate the attributes of AH
Parameters
structib_device*ibdevthe ib device
structrdma_ah_attr*ah_attrthe attributes of the AH
Description
If the driver supports a more detailed check_ah function, call back to it; otherwise just check the basics.
Return
0 on success
- structrvt_dev_info*rvt_alloc_device(size_tsize,intnports)¶
allocate rdi
Parameters
size_tsizehow big of a structure to allocate
intnportsnumber of ports to allocate array slots for
Description
Use IB core device alloc to allocate space for the rdi, which is assumed to be inside of the ib_device. Any extra space that drivers require should be included in size.
We also allocate a port array based on the number of ports.
Return
pointer to allocated rdi
- voidrvt_dealloc_device(structrvt_dev_info*rdi)¶
deallocate rdi
Parameters
structrvt_dev_info*rdistructure to free
Description
Free a structure allocated with rvt_alloc_device()
- intrvt_register_device(structrvt_dev_info*rdi)¶
register a driver
Parameters
structrvt_dev_info*rdimain dev structure for all of rdmavt operations
Description
It is up to drivers to allocate the rdi and fill in the appropriate information.
Return
0 on success otherwise an errno.
- voidrvt_unregister_device(structrvt_dev_info*rdi)¶
remove a driver
Parameters
structrvt_dev_info*rdirvt dev struct
- intrvt_init_port(structrvt_dev_info*rdi,structrvt_ibport*port,intport_index,u16*pkey_table)¶
init internal data for driver port
Parameters
structrvt_dev_info*rdirvt_dev_info struct
structrvt_ibport*portrvt port
intport_index0 based index of ports, different from IB core port num
u16*pkey_tablepkey_table for port
Description
Keep track of a list of ports. No need to have a detach port. They persist until the driver goes away.
Return
always 0
- boolrvt_cq_enter(structrvt_cq*cq,structib_wc*entry,boolsolicited)¶
add a new entry to the completion queue
Parameters
structrvt_cq*cqcompletion queue
structib_wc*entrywork completion entry to add
boolsolicitedtrue if entry is solicited
Description
This may be called with qp->s_lock held.
Return
true on success, false if the cq is full.
- intrvt_error_qp(structrvt_qp*qp,enumib_wc_statuserr)¶
put a QP into the error state
Parameters
structrvt_qp*qpthe QP to put into the error state
enumib_wc_statuserrthe receive completion error to signal if a RWQE is active
Description
Flushes both send and receive work queues.
Return
true if the last WQE event should be generated. The QP r_lock and s_lock should be held and interrupts disabled. If we are already in the error state, just return.
- intrvt_get_rwqe(structrvt_qp*qp,boolwr_id_only)¶
copy the next RWQE into the QP’s RWQE
Parameters
structrvt_qp*qpthe QP
boolwr_id_onlyupdate qp->r_wr_id only, not qp->r_sge
Description
Return -1 if there is a local error, 0 if no RWQE is available, otherwise return 1.
Can be called from interrupt level.
- voidrvt_comm_est(structrvt_qp*qp)¶
handle trap with QP established
Parameters
structrvt_qp*qpthe QP
- voidrvt_add_rnr_timer(structrvt_qp*qp,u32aeth)¶
add/start an rnr timer on the QP
Parameters
structrvt_qp*qpthe QP
u32aethaeth of RNR timeout, simulated aeth for loopback
- voidrvt_stop_rc_timers(structrvt_qp*qp)¶
stop all timers
Parameters
structrvt_qp*qpthe QP
Description
Stop any pending timers.
- voidrvt_del_timers_sync(structrvt_qp*qp)¶
wait for any timeout routines to exit
Parameters
structrvt_qp*qpthe QP
- structrvt_qp_iter*rvt_qp_iter_init(structrvt_dev_info*rdi,u64v,void(*cb)(structrvt_qp*qp,u64v))¶
initialize for QP iteration
Parameters
structrvt_dev_info*rdirvt devinfo
u64vu64 value
void(*cb)(structrvt_qp*qp,u64v)user-defined callback
Description
This returns an iterator suitable for iterating QPs in the system.
The cb is a user-defined callback and v is a 64-bit value passed to and relevant for processing in the cb. An example use case would be to alter QP processing based on criteria not part of the rvt_qp.
Use cases that require memory allocation to succeed must preallocate appropriately.
Return
a pointer to an rvt_qp_iter or NULL
- intrvt_qp_iter_next(structrvt_qp_iter*iter)¶
return the next QP in iter
Parameters
structrvt_qp_iter*iterthe iterator
Description
Fine grained QP iterator suitable for use with debugfs seq_file mechanisms.
Updates iter->qp with the current QP when the return value is 0.
Return
0 if iter->qp is valid, 1 if there are no more QPs
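The return convention (0 while iter->qp is valid, 1 once the QPs are exhausted) leads to a simple loop shape. The following user-space model is a sketch only: model_qp, model_iter and both functions are hypothetical stand-ins that walk a plain array instead of the driver's QP table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the QP and the iterator. */
struct model_qp {
    int qpn;
};

struct model_iter {
    const struct model_qp *table;  /* backing array of QPs */
    size_t count;                  /* number of entries in table */
    size_t next;                   /* index of the next entry to visit */
    const struct model_qp *qp;     /* current QP, valid after a 0 return */
};

/* Return 0 and set iter->qp when a QP is available, 1 when exhausted. */
static int model_iter_next(struct model_iter *iter)
{
    if (iter->next >= iter->count) {
        iter->qp = NULL;
        return 1;
    }
    iter->qp = &iter->table[iter->next++];
    return 0;
}

/* Typical consumer loop: keep going while the iterator returns 0. */
static int model_sum_qpns(struct model_iter *iter)
{
    int sum = 0;

    while (model_iter_next(iter) == 0)
        sum += iter->qp->qpn;
    return sum;
}
```

The consumer never reads iter->qp after a non-zero return, which is the discipline the "Updates iter->qp ... when the return value is 0" sentence implies.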
- voidrvt_qp_iter(structrvt_dev_info*rdi,u64v,void(*cb)(structrvt_qp*qp,u64v))¶
iterate all QPs
Parameters
structrvt_dev_info*rdirvt devinfo
u64va 64-bit value
void(*cb)(structrvt_qp*qp,u64v)a callback
Description
This provides a way for iterating all QPs.
The cb is a user-defined callback and v is a 64-bit value passed to and relevant for processing in the cb. An example use case would be to alter QP processing based on criteria not part of the rvt_qp.
The code has an internal iterator to simplify non seq_file use cases.
- voidrvt_copy_sge(structrvt_qp*qp,structrvt_sge_state*ss,void*data,u32length,boolrelease,boolcopy_last)¶
copy data to SGE memory
Parameters
structrvt_qp*qpassociated QP
structrvt_sge_state*ssthe SGE state
void*datathe data to copy
u32lengththe length of the data
boolreleaseboolean to release MR
boolcopy_lastdo a separate copy of the last 8 bytes
- voidrvt_ruc_loopback(structrvt_qp*sqp)¶
handle UC and RC loopback requests
Parameters
structrvt_qp*sqpthe sending QP
Description
This is called from rvt_do_send() to forward a WQE addressed to the same HFI. Note that although we are single threaded due to the send engine, we still have to protect against post_send(). We don’t have to worry about receive interrupts since this is a connected protocol and all packets will pass through here.
- structrvt_mcast*rvt_mcast_find(structrvt_ibport*ibp,unionib_gid*mgid,u16lid)¶
search the global table for the given multicast GID/LID
Parameters
structrvt_ibport*ibpthe IB port structure
unionib_gid*mgidthe multicast GID to search for
u16lidthe multicast LID portion of the multicast address (host order)
NOTE
It is valid to have 1 MLID with multiple MGIDs. It is not valid to have 1 MGID with multiple MLIDs.
Description
The caller is responsible for decrementing the reference count if found.
Return
NULL if not found.
Upper Layer Protocols¶
iSCSI Extensions for RDMA (iSER)¶
- structiser_data_buf¶
iSER data buffer
Definition:
struct iser_data_buf {
    struct scatterlist *sg;
    int size;
    unsigned long data_len;
    int dma_nents;
};
Members
sgpointer to the sg list
sizenum entries of this sg
data_lentotal buffer byte len
dma_nentsreturned by dma_map_sg
- structiser_mem_reg¶
iSER memory registration info
Definition:
struct iser_mem_reg {
    struct ib_sge sge;
    u32 rkey;
    struct iser_fr_desc *desc;
};
Members
sgememory region sg element
rkeymemory region remote key
descpointer to fast registration context
- structiser_tx_desc¶
iSER TX descriptor
Definition:
struct iser_tx_desc {
    struct iser_ctrl iser_header;
    struct iscsi_hdr iscsi_header;
    enum iser_desc_type type;
    u64 dma_addr;
    struct ib_sge tx_sg[2];
    int num_sge;
    struct ib_cqe cqe;
    bool mapped;
    struct ib_reg_wr reg_wr;
    struct ib_send_wr send_wr;
    struct ib_send_wr inv_wr;
};
Members
iser_headeriser header
iscsi_headeriscsi header
typecommand/control/dataout
dma_addrheader buffer dma_address
tx_sgsg[0] points to the iser/iscsi headers; sg[1] optionally points to one of immediate data, unsolicited data-out, or control
num_sgenumber sges used on this TX task
cqecompletion handler
mappedIs the task header mapped
reg_wrregistration WR
send_wrsend WR
inv_wrinvalidate WR
- structiser_rx_desc¶
iSER RX descriptor
Definition:
struct iser_rx_desc {
    struct iser_ctrl iser_header;
    struct iscsi_hdr iscsi_header;
    char data[ISER_RECV_DATA_SEG_LEN];
    u64 dma_addr;
    struct ib_sge rx_sg;
    struct ib_cqe cqe;
    char pad[ISER_RX_PAD_SIZE];
};
Members
iser_headeriser header
iscsi_headeriscsi header
datareceived data segment
dma_addrreceive buffer dma address
rx_sgib_sge of receive buffer
cqecompletion handler
padfor sense data TODO: Modify to maximum sense length supported
- structiser_login_desc¶
iSER login descriptor
Definition:
struct iser_login_desc {
    void *req;
    void *rsp;
    u64 req_dma;
    u64 rsp_dma;
    struct ib_sge sge;
    struct ib_cqe cqe;
};
Members
reqpointer to login request buffer
rsppointer to login response buffer
req_dmaDMA address of login request buffer
rsp_dmaDMA address of login response buffer
sgeIB sge for login post recv
cqecompletion handler
- structiser_device¶
iSER device handle
Definition:
struct iser_device {
    struct ib_device *ib_device;
    struct ib_pd *pd;
    struct ib_event_handler event_handler;
    struct list_head ig_list;
    int refcount;
};
Members
ib_deviceRDMA device
pdProtection Domain for this device
event_handlerIB events handle routine
ig_listentry in devices list
refcountReference counter, dominated by open iser connections
- structiser_reg_resources¶
Fast registration resources
Definition:
struct iser_reg_resources {
    struct ib_mr *mr;
    struct ib_mr *sig_mr;
};
Members
mrmemory region
sig_mrsignature memory region
- structiser_fr_desc¶
Fast registration descriptor
Definition:
struct iser_fr_desc {
    struct list_head list;
    struct iser_reg_resources rsc;
    bool sig_protected;
    struct list_head all_list;
};
Members
listentry in connection fastreg pool
rscdata buffer registration resources
sig_protectedis region protected indicator
all_listfirst and last list members
- structiser_fr_pool¶
connection fast registration pool
Definition:
struct iser_fr_pool {
    struct list_head list;
    spinlock_t lock;
    int size;
    struct list_head all_list;
};
Members
listlist of fastreg descriptors
lockprotects fastreg pool
sizesize of the pool
all_listfirst and last list members
- structib_conn¶
Infiniband related objects
Definition:
struct ib_conn {
    struct rdma_cm_id *cma_id;
    struct ib_qp *qp;
    struct ib_cq *cq;
    u32 cq_size;
    struct iser_device *device;
    struct iser_fr_pool fr_pool;
    bool pi_support;
    struct ib_cqe reg_cqe;
};
Members
cma_idrdma_cm connection manager handle
qpConnection Queue-pair
cqConnection completion queue
cq_sizeThe number of max outstanding completions
devicereference to iser device
fr_poolconnection fast registration pool
pi_supportIndicate device T10-PI support
reg_cqecompletion handler
- structiser_conn¶
iSER connection context
Definition:
struct iser_conn {
    struct ib_conn ib_conn;
    struct iscsi_conn *iscsi_conn;
    struct iscsi_endpoint *ep;
    enum iser_conn_state state;
    unsigned qp_max_recv_dtos;
    u16 max_cmds;
    char name[ISER_OBJECT_NAME_SIZE];
    struct work_struct release_work;
    struct mutex state_mutex;
    struct completion stop_completion;
    struct completion ib_completion;
    struct completion up_completion;
    struct list_head conn_list;
    struct iser_login_desc login_desc;
    struct iser_rx_desc *rx_descs;
    u32 num_rx_descs;
    unsigned short scsi_sg_tablesize;
    unsigned short pages_per_mr;
    bool snd_w_inv;
};
Members
ib_connconnection RDMA resources
iscsi_connlink to matching iscsi connection
eptransport handle
stateconnection logical state
qp_max_recv_dtosmaximum number of data outs, corresponds to max number of post recvs
max_cmdsmaximum cmds allowed for this connection
nameconnection peer portal
release_workdeferred work for release job
state_mutexprotects iser connection state
stop_completionconn_stop completion
ib_completionRDMA cleanup completion
up_completionconnection establishment completed (state is ISER_CONN_UP)
conn_listentry in ig conn list
login_desclogin descriptor
rx_descsrx buffers array (cyclic buffer)
num_rx_descsnumber of rx descriptors
scsi_sg_tablesizescsi host sg_tablesize
pages_per_mrmaximum pages available for registration
snd_w_invconnection uses remote invalidation
- structiscsi_iser_task¶
iser task context
Definition:
struct iscsi_iser_task {
    struct iser_tx_desc desc;
    struct iser_conn *iser_conn;
    enum iser_task_status status;
    struct scsi_cmnd *sc;
    int command_sent;
    int dir[ISER_DIRS_NUM];
    struct iser_mem_reg rdma_reg[ISER_DIRS_NUM];
    struct iser_data_buf data[ISER_DIRS_NUM];
    struct iser_data_buf prot[ISER_DIRS_NUM];
};
Members
descTX descriptor
iser_connlink to iser connection
statuscurrent task status
sclink to scsi command
command_sentindicate if command was sent
diriser data direction
rdma_regtask rdma registration desc
dataiser data buffer desc
protiser protection buffer desc
- structiser_global¶
iSER global context
Definition:
struct iser_global {
    struct mutex device_list_mutex;
    struct list_head device_list;
    struct mutex connlist_mutex;
    struct list_head connlist;
    struct kmem_cache *desc_cache;
};
Members
device_list_mutexprotects device_list
device_listiser devices global list
connlist_mutexprotects connlist
connlistiser connections global list
desc_cachekmem cache for tx dataout
- intiscsi_iser_pdu_alloc(structiscsi_task*task,uint8_topcode)¶
allocate an iscsi-iser PDU
Parameters
structiscsi_task*taskiscsi task
uint8_topcodeiscsi command opcode
Description
Notes: This routine can’t fail; it just assigns the iscsi task hdr and max hdr size.
- intiser_initialize_task_headers(structiscsi_task*task,structiser_tx_desc*tx_desc)¶
Initialize task headers
Parameters
structiscsi_task*taskiscsi task
structiser_tx_desc*tx_desciser tx descriptor
Notes
This routine may race with iser teardown flow for scsi error handling TMFs. So for TMF we should acquire the state mutex to avoid dereferencing the IB device which may have already been terminated.
- intiscsi_iser_task_init(structiscsi_task*task)¶
Initialize iscsi-iser task
Parameters
structiscsi_task*taskiscsi task
Description
Initialize the task for the scsi command or mgmt command.
Return
Returns zero on success or -ENOMEM when failing to init task headers (dma mapping error).
- intiscsi_iser_mtask_xmit(structiscsi_conn*conn,structiscsi_task*task)¶
xmit management (immediate) task
Parameters
structiscsi_conn*conniscsi connection
structiscsi_task*tasktask management task
Notes
The function can return -EAGAIN, in which case the caller must call it again later or recover. A return code of ‘0’ means successful xmit.
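The -EAGAIN contract described in the notes can be illustrated with a small userspace sketch. The names `xmit_fn`, `xmit_with_retry` and `flaky_xmit` are hypothetical and are not part of the iSER driver; a real caller would normally requeue the task rather than loop.

```c
#include <errno.h>

/* Hypothetical transmit callback: returns 0 on success, -EAGAIN when the
 * caller should try again later, or another negative errno on failure. */
typedef int (*xmit_fn)(void *ctx);

/* Call an xmit-style function again while it reports -EAGAIN, up to
 * max_tries attempts. Returns the last return code seen. */
static int xmit_with_retry(xmit_fn xmit, void *ctx, int max_tries)
{
    int rc = -EAGAIN;

    for (int i = 0; i < max_tries && rc == -EAGAIN; i++)
        rc = xmit(ctx);
    return rc;
}

/* Illustrative callback: reports -EAGAIN until *budget drops to zero. */
static int flaky_xmit(void *ctx)
{
    int *budget = ctx;

    if (*budget > 0) {
        (*budget)--;
        return -EAGAIN;
    }
    return 0;
}
```

With a budget of two, the third attempt succeeds; if the attempt limit is reached first, the caller still sees -EAGAIN and has to recover.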
- intiscsi_iser_task_xmit(structiscsi_task*task)¶
xmit iscsi-iser task
Parameters
structiscsi_task*taskiscsi task
Return
zero on success or escalates $error on failure.
- voidiscsi_iser_cleanup_task(structiscsi_task*task)¶
cleanup an iscsi-iser task
Parameters
structiscsi_task*taskiscsi task
Notes
- In case the RDMA device is already NULL (might have
been removed in DEVICE_REMOVAL CM event), it will bail out without doing dma unmapping.
- u8iscsi_iser_check_protection(structiscsi_task*task,sector_t*sector)¶
check protection information status of task.
Parameters
structiscsi_task*taskiscsi task
sector_t*sectorerror sector if exists (output)
Return
zero if no data-integrity errors have occurred
0x1: data-integrity error occurred in the guard-block
0x2: data-integrity error occurred in the reference tag
0x3: data-integrity error occurred in the application tag
Description
In addition the error sector is marked.
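As a minimal sketch of how a caller might interpret the status byte returned by iscsi_iser_check_protection(), the following maps each documented value to a description. The enum and helper names are illustrative assumptions; the driver itself returns the raw values.

```c
#include <stdint.h>

/* Illustrative names for the documented status values (not driver symbols). */
enum prot_status {
    PROT_OK        = 0x0,
    PROT_GUARD_ERR = 0x1,
    PROT_REF_ERR   = 0x2,
    PROT_APP_ERR   = 0x3,
};

/* Translate a protection status byte into a human-readable string. */
static const char *prot_status_str(uint8_t status)
{
    switch (status) {
    case PROT_OK:
        return "no data-integrity error";
    case PROT_GUARD_ERR:
        return "guard-block error";
    case PROT_REF_ERR:
        return "reference tag error";
    case PROT_APP_ERR:
        return "application tag error";
    default:
        return "unknown status";
    }
}
```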
- structiscsi_cls_conn*iscsi_iser_conn_create(structiscsi_cls_session*cls_session,uint32_tconn_idx)¶
create a new iscsi-iser connection
Parameters
structiscsi_cls_session*cls_sessioniscsi class session
uint32_tconn_idxconnection index within the session (for MCS)
Return
iscsi_cls_conn when iscsi_conn_setup succeeds or NULL otherwise.
- intiscsi_iser_conn_bind(structiscsi_cls_session*cls_session,structiscsi_cls_conn*cls_conn,uint64_ttransport_eph,intis_leading)¶
bind iscsi and iser connection structures
Parameters
structiscsi_cls_session*cls_sessioniscsi class session
structiscsi_cls_conn*cls_conniscsi class connection
uint64_ttransport_ephtransport end-point handle
intis_leadingindicate if this is the session leading connection (MCS)
Return
zero on success, $error if iscsi_conn_bind fails and -EINVAL in case the end-point doesn’t exist anymore or the iser connection state is not UP (teardown already started).
- intiscsi_iser_conn_start(structiscsi_cls_conn*cls_conn)¶
start iscsi-iser connection
Parameters
structiscsi_cls_conn*cls_conniscsi class connection
Notes
- Here iser initializes (or re-initializes) stop_completion as
from this point iscsi must call conn_stop in session/connection teardown so iser transport must wait for it.
- voidiscsi_iser_conn_stop(structiscsi_cls_conn*cls_conn,intflag)¶
stop iscsi-iser connection
Parameters
structiscsi_cls_conn*cls_conniscsi class connection
intflagindicate if recover or terminate (passed as is)
Notes
- Calling iscsi_conn_stop might theoretically race with
DEVICE_REMOVAL event and dereference a previously freed RDMA device handle, so we call it under the iser state lock to protect against this kind of race.
- voidiscsi_iser_session_destroy(structiscsi_cls_session*cls_session)¶
destroy iscsi-iser session
Parameters
structiscsi_cls_session*cls_sessioniscsi class session
Description
Removes and frees the iscsi host.
- structiscsi_cls_session*iscsi_iser_session_create(structiscsi_endpoint*ep,uint16_tcmds_max,uint16_tqdepth,uint32_tinitial_cmdsn)¶
create an iscsi-iser session
Parameters
structiscsi_endpoint*episcsi end-point handle
uint16_tcmds_maxmaximum commands in this session
uint16_tqdepthsession command queue depth
uint32_tinitial_cmdsninitiator command sequence number
Description
Allocates and adds a scsi host, exposes DIF support if it exists, and sets up an iscsi session.
- structiscsi_endpoint*iscsi_iser_ep_connect(structScsi_Host*shost,structsockaddr*dst_addr,intnon_blocking)¶
Initiate iSER connection establishment
Parameters
structScsi_Host*shostscsi_host
structsockaddr*dst_addrdestination address
intnon_blockingindicate if routine can block
Description
Allocate an iscsi endpoint, an iser_conn structure and bind them. After that start RDMA connection establishment via rdma_cm. We don’t allocate iser_conn embedded in iscsi_endpoint since in teardown the endpoint will be destroyed at ep_disconnect while iser_conn will clean up its resources asynchronously.
Return
iscsi_endpoint created by iscsi layer or ERR_PTR(error) if it fails.
- intiscsi_iser_ep_poll(structiscsi_endpoint*ep,inttimeout_ms)¶
poll for iser connection establishment to complete
Parameters
structiscsi_endpoint*episcsi endpoint (created at ep_connect)
inttimeout_mspolling timeout allowed in ms.
Description
This routine boils down to waiting for up_completion signaling that cma_id got CONNECTED event.
Return
1 if succeeded in connection establishment, 0 if timeout expired (libiscsi retry will kick in) or -1 if interrupted by signal or more likely iser connection state transitioned to TERMINATING or DOWN during the wait period.
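The return-value rules above can be sketched as a pure function. This is an assumption-laden illustration, not the driver code: `wait_rc` is assumed to follow the convention of an interruptible timed wait (positive when the completion fired, zero on timeout, negative when interrupted), and the `conn_state` enum and `ep_poll_result` are hypothetical names.

```c
/* Hypothetical connection states, modelled on the states named above. */
enum conn_state {
    CONN_INIT,
    CONN_PENDING,
    CONN_UP,
    CONN_TERMINATING,
    CONN_DOWN,
};

/* Map the outcome of the wait and the connection state to the documented
 * return value: 1 on success, 0 on timeout, -1 otherwise. */
static int ep_poll_result(long wait_rc, enum conn_state state)
{
    if (wait_rc < 0)
        return -1;      /* interrupted by a signal */
    if (wait_rc == 0) {
        /* Timed out: report failure if teardown already started. */
        if (state == CONN_TERMINATING || state == CONN_DOWN)
            return -1;
        return 0;       /* plain timeout, the caller may poll again */
    }
    return 1;           /* up_completion fired, connection established */
}
```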
- voidiscsi_iser_ep_disconnect(structiscsi_endpoint*ep)¶
Initiate connection teardown process
Parameters
structiscsi_endpoint*episcsi endpoint handle
Description
This routine is not blocked by iser and RDMA termination process completion as we queue a deferred work for iser/RDMA destruction and cleanup, or actually call it immediately in case we didn’t pass the iscsi conn bind/start stage, thus it is safe.
- intiser_send_command(structiscsi_conn*conn,structiscsi_task*task)¶
send command PDU
Parameters
structiscsi_conn*connlink to matching iscsi connection
structiscsi_task*taskSCSI command task
- intiser_send_data_out(structiscsi_conn*conn,structiscsi_task*task,structiscsi_data*hdr)¶
send data out PDU
Parameters
structiscsi_conn*connlink to matching iscsi connection
structiscsi_task*taskSCSI command task
structiscsi_data*hdrpointer to the LLD’s iSCSI message header
- intiser_alloc_fastreg_pool(structib_conn*ib_conn,unsignedcmds_max,unsignedintsize)¶
Creates pool of fast_reg descriptors for fast registration work requests.
Parameters
structib_conn*ib_connconnection RDMA resources
unsignedcmds_maxmax number of SCSI commands for this connection
unsignedintsizemax number of pages per map request
Return
0 on success, or errno code on failure
Parameters
structib_conn*ib_connconnection RDMA resources
Parameters
structiser_conn*iser_conniser connection struct
booldestroyindicator if we need to try to release the iser device and memory regions pool (only iscsi shutdown and DEVICE_REMOVAL will use this).
Description
This routine is called with the iser state mutex held so the cm_id removal is out of here. It is safe to be invoked multiple times.
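One common way to make a release routine safe to invoke multiple times is to clear each handle after freeing it, so later calls find nothing left to do. The userspace sketch below illustrates that pattern only; `struct conn_res` and `conn_res_release` are hypothetical, and plain `malloc`/`free` stand in for the RDMA resources.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-connection resources. */
struct conn_res {
    void *qp;       /* always released */
    void *device;   /* released only when destroy is true */
};

/* Release resources; repeated calls are harmless because every handle is
 * reset to NULL once it has been freed. */
static void conn_res_release(struct conn_res *res, bool destroy)
{
    if (res->qp) {
        free(res->qp);
        res->qp = NULL;
    }
    if (destroy && res->device) {
        free(res->device);
        res->device = NULL;
    }
}
```

Calling `conn_res_release()` twice with `destroy` set to false, and then twice with it set to true, frees each allocation exactly once.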
- voidiser_conn_release(structiser_conn*iser_conn)¶
Frees all conn objects and deallocs conn descriptor
Parameters
structiser_conn*iser_conniSER connection context
- intiser_conn_terminate(structiser_conn*iser_conn)¶
triggers start of the disconnect procedures and waits for them to be done
Parameters
structiser_conn*iser_conniSER connection context
Description
Called with state mutex held
- intiser_post_send(structib_conn*ib_conn,structiser_tx_desc*tx_desc)¶
Initiate a Send DTO operation
Parameters
structib_conn*ib_connconnection RDMA resources
structiser_tx_desc*tx_desciSER TX descriptor
Return
0 on success, -1 on failure
Omni-Path (OPA) Virtual NIC support¶
- structopa_vnic_ctrl_port¶
OPA virtual NIC control port
Definition:
struct opa_vnic_ctrl_port {
    struct ib_device *ibdev;
    struct opa_vnic_ctrl_ops *ops;
    u8 num_ports;
};
Members
ibdevpointer to ib device
opsopa vnic control operations
num_portsnumber of opa ports
- structopa_vnic_adapter¶
OPA VNIC netdev private data structure
Definition:
struct opa_vnic_adapter {
    struct net_device *netdev;
    struct ib_device *ibdev;
    struct opa_vnic_ctrl_port *cport;
    const struct net_device_ops *rn_ops;
    u8 port_num;
    u8 vport_num;
    struct mutex lock;
    struct __opa_veswport_info info;
    u8 vema_mac_addr[ETH_ALEN];
    u32 umac_hash;
    u32 mmac_hash;
    struct hlist_head *mactbl;
    struct mutex mactbl_lock;
    spinlock_t stats_lock;
    u8 flow_tbl[OPA_VNIC_FLOW_TBL_SIZE];
    unsigned long trap_timeout;
    u8 trap_count;
};
Members
netdevpointer to associated netdev
ibdevib device
cportpointer to opa vnic control port
rn_opsrdma netdev’s net_device_ops
port_numOPA port number
vport_numvesw port number
lockadapter lock
infovirtual ethernet switch port information
vema_mac_addrmac address configured by vema
umac_hashunicast maclist hash
mmac_hashmulticast maclist hash
mactblhash table of MAC entries
mactbl_lockmac table lock
stats_lockstatistics lock
flow_tblflow to default port redirection table
trap_timeouttrap timeout
trap_countno. of traps allowed within timeout period
- structopa_vnic_mac_tbl_node¶
OPA VNIC mac table node
Definition:
struct opa_vnic_mac_tbl_node {
    struct hlist_node hlist;
    u16 index;
    struct __opa_vnic_mactable_entry entry;
};
Members
hlisthash list handle
indexindex of entry in the mac table
entryentry in the table
- structopa_vesw_info¶
OPA vnic switch information
Definition:
struct opa_vesw_info {
    __be16 fabric_id;
    __be16 vesw_id;
    u8 rsvd0[6];
    __be16 def_port_mask;
    u8 rsvd1[2];
    __be16 pkey;
    u8 rsvd2[4];
    __be32 u_mcast_dlid;
    __be32 u_ucast_dlid[OPA_VESW_MAX_NUM_DEF_PORT];
    __be32 rc;
    u8 rsvd3[56];
    __be16 eth_mtu;
    u8 rsvd4[2];
};
Members
fabric_id10-bit fabric id
vesw_id12-bit virtual ethernet switch id
rsvd0reserved bytes
def_port_maskbitmask of default ports
rsvd1reserved bytes
pkeypartition key
rsvd2reserved bytes
u_mcast_dlidunknown multicast dlid
u_ucast_dlidarray of unknown unicast dlids
rcrouting control
rsvd3reserved bytes
eth_mtuEthernet MTU
rsvd4reserved bytes
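The fabric_id and vesw_id members are narrower (10 and 12 bits) than the big-endian 16-bit fields that carry them. The sketch below shows one way to decode such a field from wire bytes. It assumes the value occupies the low-order bits of the field, which the text above does not state, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Decode a big-endian 16-bit value from two wire bytes. */
static uint16_t be16_decode(const uint8_t b[2])
{
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Assumption: the 10-bit fabric id sits in the low-order bits. */
static uint16_t fabric_id_from_wire(const uint8_t b[2])
{
    return be16_decode(b) & 0x03ff;
}

/* Assumption: the 12-bit vesw id sits in the low-order bits. */
static uint16_t vesw_id_from_wire(const uint8_t b[2])
{
    return be16_decode(b) & 0x0fff;
}
```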
- structopa_per_veswport_info¶
OPA vnic per port information
Definition:
struct opa_per_veswport_info {
    __be32 port_num;
    u8 eth_link_status;
    u8 rsvd0[3];
    u8 base_mac_addr[ETH_ALEN];
    u8 config_state;
    u8 oper_state;
    __be16 max_mac_tbl_ent;
    __be16 max_smac_ent;
    __be32 mac_tbl_digest;
    u8 rsvd1[4];
    __be32 encap_slid;
    u8 pcp_to_sc_uc[OPA_VNIC_MAX_NUM_PCP];
    u8 pcp_to_vl_uc[OPA_VNIC_MAX_NUM_PCP];
    u8 pcp_to_sc_mc[OPA_VNIC_MAX_NUM_PCP];
    u8 pcp_to_vl_mc[OPA_VNIC_MAX_NUM_PCP];
    u8 non_vlan_sc_uc;
    u8 non_vlan_vl_uc;
    u8 non_vlan_sc_mc;
    u8 non_vlan_vl_mc;
    u8 rsvd2[48];
    __be16 uc_macs_gen_count;
    __be16 mc_macs_gen_count;
    u8 rsvd3[8];
};
Members
port_numport number
eth_link_statuscurrent ethernet link state
rsvd0reserved bytes
base_mac_addrbase mac address
config_stateconfigured port state
oper_stateoperational port state
max_mac_tbl_entmax number of mac table entries
max_smac_entmax smac entries in mac table
mac_tbl_digestmac table digest
rsvd1reserved bytes
encap_slidbase slid for the port
pcp_to_sc_ucsc by pcp index for unicast ethernet packets
pcp_to_vl_ucvl by pcp index for unicast ethernet packets
pcp_to_sc_mcsc by pcp index for multicast ethernet packets
pcp_to_vl_mcvl by pcp index for multicast ethernet packets
non_vlan_sc_ucsc for non-vlan unicast ethernet packets
non_vlan_vl_ucvl for non-vlan unicast ethernet packets
non_vlan_sc_mcsc for non-vlan multicast ethernet packets
non_vlan_vl_mcvl for non-vlan multicast ethernet packets
rsvd2reserved bytes
uc_macs_gen_countgeneration count for unicast macs list
mc_macs_gen_countgeneration count for multicast macs list
rsvd3reserved bytes
- structopa_veswport_info¶
OPA vnic port information
Definition:
struct opa_veswport_info {
    struct opa_vesw_info vesw;
    struct opa_per_veswport_info vport;
};
Members
veswOPA vnic switch information
vportOPA vnic per port information
Description
On host, each of the virtual ethernet ports belongs to a different virtual ethernet switch.
- structopa_veswport_mactable_entry¶
single entry in the forwarding table
Definition:
struct opa_veswport_mactable_entry {
    u8 mac_addr[ETH_ALEN];
    u8 mac_addr_mask[ETH_ALEN];
    __be32 dlid_sd;
};
Members
mac_addrMAC address
mac_addr_maskMAC address bit mask
dlid_sdMatching DLID and side data
Description
On the host each virtual ethernet port will have a forwarding table. These tables are used to map a MAC to a LID and other data. For more details see struct opa_veswport_mactable_entries. This is the structure of a single mactable entry.
- structopa_veswport_mactable¶
Forwarding table array
Definition:
struct opa_veswport_mactable {
    __be16 offset;
    __be16 num_entries;
    __be32 mac_tbl_digest;
    struct opa_veswport_mactable_entry tbl_entries[];
};
Members
offsetmac table starting offset
num_entriesNumber of entries to get or set
mac_tbl_digestmac table digest
tbl_entriesArray of table entries
Description
The EM sends down this structure in a MAD indicating the starting offset in the forwarding table that this entry is to be loaded into and the number of entries that this MAD instance contains.
The mac_tbl_digest has been added to this MAD structure. It will be set by the EM and it will be used by the EM to check if there are any discrepancies with this value and the value maintained by the EM in the case of VNIC port being deleted or unloaded. A new instantiation of a VNIC will always have a value of zero. This value is stored as part of the vnic adapter structure and will be accessed by the GET and SET routines for both the mactable entries and the veswport info.
- structopa_veswport_summary_counters¶
summary counters
Definition:
struct opa_veswport_summary_counters {
    __be16 vp_instance;
    __be16 vesw_id;
    __be32 veswport_num;
    __be64 tx_errors;
    __be64 rx_errors;
    __be64 tx_packets;
    __be64 rx_packets;
    __be64 tx_bytes;
    __be64 rx_bytes;
    __be64 tx_unicast;
    __be64 tx_mcastbcast;
    __be64 tx_untagged;
    __be64 tx_vlan;
    __be64 tx_64_size;
    __be64 tx_65_127;
    __be64 tx_128_255;
    __be64 tx_256_511;
    __be64 tx_512_1023;
    __be64 tx_1024_1518;
    __be64 tx_1519_max;
    __be64 rx_unicast;
    __be64 rx_mcastbcast;
    __be64 rx_untagged;
    __be64 rx_vlan;
    __be64 rx_64_size;
    __be64 rx_65_127;
    __be64 rx_128_255;
    __be64 rx_256_511;
    __be64 rx_512_1023;
    __be64 rx_1024_1518;
    __be64 rx_1519_max;
    __be64 reserved[16];
};
Members
vp_instancevport instance on the OPA port
vesw_idvirtual ethernet switch id
veswport_numvirtual ethernet switch port number
tx_errorstransmit errors
rx_errorsreceive errors
tx_packetstransmit packets
rx_packetsreceive packets
tx_bytestransmit bytes
rx_bytesreceive bytes
tx_unicastunicast packets transmitted
tx_mcastbcastmulticast/broadcast packets transmitted
tx_untaggednon-vlan packets transmitted
tx_vlanvlan packets transmitted
tx_64_sizetransmit packet length is 64 bytes
tx_65_127transmit packet length is >= 65 and <= 127 bytes
tx_128_255transmit packet length is >= 128 and <= 255 bytes
tx_256_511transmit packet length is >= 256 and <= 511 bytes
tx_512_1023transmit packet length is >= 512 and <= 1023 bytes
tx_1024_1518transmit packet length is >= 1024 and <= 1518 bytes
tx_1519_maxtransmit packet length >= 1519 bytes
rx_unicastunicast packets received
rx_mcastbcastmulticast/broadcast packets received
rx_untaggednon-vlan packets received
rx_vlanvlan packets received
rx_64_sizereceived packet length is 64 bytes
rx_65_127received packet length is >= 65 and <= 127 bytes
rx_128_255received packet length is >= 128 and <= 255 bytes
rx_256_511received packet length is >= 256 and <= 511 bytes
rx_512_1023received packet length is >= 512 and <= 1023 bytes
rx_1024_1518received packet length is >= 1024 and <= 1518 bytes
rx_1519_maxreceived packet length >= 1519 bytes
reservedreserved bytes
Description
All the above are counters of corresponding conditions.
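The size-based counters form a histogram of packet lengths. A minimal sketch of the bucket selection follows; the enum and function are hypothetical, the upper bound of each bucket is taken as inclusive (as the counter names suggest), and packets shorter than 64 bytes are placed in the first bucket purely as an assumption of this sketch.

```c
#include <stddef.h>

/* Illustrative bucket indices mirroring the tx_ and rx_ size counters. */
enum len_bucket {
    LEN_64,
    LEN_65_127,
    LEN_128_255,
    LEN_256_511,
    LEN_512_1023,
    LEN_1024_1518,
    LEN_1519_MAX,
};

/* Pick the histogram bucket for a packet of len bytes. */
static enum len_bucket len_to_bucket(size_t len)
{
    if (len <= 64)
        return LEN_64;          /* assumption: runts counted here too */
    if (len <= 127)
        return LEN_65_127;
    if (len <= 255)
        return LEN_128_255;
    if (len <= 511)
        return LEN_256_511;
    if (len <= 1023)
        return LEN_512_1023;
    if (len <= 1518)
        return LEN_1024_1518;
    return LEN_1519_MAX;
}
```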
- structopa_veswport_error_counters¶
error counters
Definition:
struct opa_veswport_error_counters {
    __be16 vp_instance;
    __be16 vesw_id;
    __be32 veswport_num;
    __be64 tx_errors;
    __be64 rx_errors;
    __be64 rsvd0;
    __be64 tx_smac_filt;
    __be64 rsvd1;
    __be64 rsvd2;
    __be64 rsvd3;
    __be64 tx_dlid_zero;
    __be64 rsvd4;
    __be64 tx_logic;
    __be64 rsvd5;
    __be64 tx_drop_state;
    __be64 rx_bad_veswid;
    __be64 rsvd6;
    __be64 rx_runt;
    __be64 rx_oversize;
    __be64 rsvd7;
    __be64 rx_eth_down;
    __be64 rx_drop_state;
    __be64 rx_logic;
    __be64 rsvd8;
    __be64 rsvd9[16];
};
Members
vp_instancevport instance on the OPA port
vesw_idvirtual ethernet switch id
veswport_numvirtual ethernet switch port number
tx_errorstransmit errors
rx_errorsreceive errors
rsvd0reserved bytes
tx_smac_filtsmac filter errors
rsvd1reserved bytes
rsvd2reserved bytes
rsvd3reserved bytes
tx_dlid_zerotransmit packets with invalid dlid
rsvd4reserved bytes
tx_logicother transmit errors
rsvd5reserved bytes
tx_drop_statepacket transmission in non-forward port state
rx_bad_veswidreceived packet with invalid vesw id
rsvd6reserved bytes
rx_runtreceived ethernet packet with length < 64 bytes
rx_oversizereceived ethernet packet with length > MTU size
rsvd7reserved bytes
rx_eth_downreceived packets when interface is down
rx_drop_statereceived packets in non-forwarding port state
rx_logicother receive errors
rsvd8reserved bytes
rsvd9reserved bytes
Description
All the above are counters of corresponding error conditions.
- structopa_veswport_trap¶
Trap message sent to EM by VNIC
Definition:
struct opa_veswport_trap {
    __be16 fabric_id;
    __be16 veswid;
    __be32 veswportnum;
    __be16 opaportnum;
    u8 veswportindex;
    u8 opcode;
    __be32 reserved;
};
Members
fabric_id10 bit fabric id
veswid12 bit virtual ethernet switch id
veswportnumlogical port number on the Virtual switch
opaportnumphysical port num (redundant on host)
veswportindexswitch port index on opa port 0 based
opcodeoperation
reserved32 bit for alignment
Description
The VNIC will send trap messages to the Ethernet manager to inform it about changes to the VNIC config, behaviour etc. This is the format of the trap payload.
- structopa_vnic_iface_mac_entry¶
single entry in the mac list
Definition:
struct opa_vnic_iface_mac_entry {
    u8 mac_addr[ETH_ALEN];
};
Members
mac_addrMAC address
- structopa_veswport_iface_macs¶
Msg to set globally administered MAC
Definition:
struct opa_veswport_iface_macs {
    __be16 start_idx;
    __be16 num_macs_in_msg;
    __be16 tot_macs_in_lst;
    __be16 gen_count;
    struct opa_vnic_iface_mac_entry entry[];
};
Members
start_idxposition of first entry (0 based)
num_macs_in_msgnumber of MACs in this message
tot_macs_in_lstThe total number of MACs the agent has
gen_countgen_count to indicate change
entryThe mac list entry
Description
Same attribute IDs and attribute modifiers as in locally administered addresses used to set globally administered addresses.
- structopa_vnic_vema_mad¶
Generic VEMA MAD
Definition:
struct opa_vnic_vema_mad {
    struct ib_mad_hdr mad_hdr;
    struct ib_rmpp_hdr rmpp_hdr;
    u8 reserved;
    u8 oui[3];
    u8 data[OPA_VNIC_EMA_DATA];
};
Members
mad_hdrGeneric MAD header
rmpp_hdrRMPP header for vendor specific MADs
reservedreserved bytes
ouiUnique org identifier
dataMAD data
- structopa_vnic_notice_attr¶
Generic Notice MAD
Definition:
struct opa_vnic_notice_attr {
    u8 gen_type;
    u8 oui_1;
    u8 oui_2;
    u8 oui_3;
    __be16 trap_num;
    __be16 toggle_count;
    __be32 issuer_lid;
    __be32 reserved;
    u8 issuer_gid[16];
    u8 raw_data[64];
};
Members
gen_typeGeneric/Specific bit and type of notice
oui_1Vendor ID byte 1
oui_2Vendor ID byte 2
oui_3Vendor ID byte 3
trap_numTrap number
toggle_countNotice toggle bit and count value
issuer_lidTrap issuer’s lid
reservedreserved bytes
issuer_gidIssuer GID (only if Report method)
raw_dataTrap message body
- structopa_vnic_vema_mad_trap¶
Generic VEMA MAD Trap
Definition:
struct opa_vnic_vema_mad_trap {
    struct ib_mad_hdr mad_hdr;
    struct ib_rmpp_hdr rmpp_hdr;
    u8 reserved;
    u8 oui[3];
    struct opa_vnic_notice_attr notice;
};
Members
mad_hdrGeneric MAD header
rmpp_hdrRMPP header for vendor specific MADs
reservedreserved bytes
ouiUnique org identifier
noticeNotice structure
- voidopa_vnic_vema_report_event(structopa_vnic_adapter*adapter,u8event)¶
send trap to report the specified event
Parameters
structopa_vnic_adapter*adaptervnic port adapter
u8eventevent to be reported
Description
This function calls vema api to send a trap for the given event.
- voidopa_vnic_get_summary_counters(structopa_vnic_adapter*adapter,structopa_veswport_summary_counters*cntrs)¶
get summary counters
Parameters
structopa_vnic_adapter*adaptervnic port adapter
structopa_veswport_summary_counters*cntrspointer to destination summary counters structure
Description
This function populates the summary counters that are maintained by the given adapter to the destination address provided.
- voidopa_vnic_get_error_counters(structopa_vnic_adapter*adapter,structopa_veswport_error_counters*cntrs)¶
get error counters
Parameters
structopa_vnic_adapter*adaptervnic port adapter
structopa_veswport_error_counters*cntrspointer to destination error counters structure
Description
This function populates the error counters that are maintained by the given adapter to the destination address provided.
- voidopa_vnic_get_vesw_info(structopa_vnic_adapter*adapter,structopa_vesw_info*info)¶
Get the vesw information
Parameters
structopa_vnic_adapter*adaptervnic port adapter
structopa_vesw_info*infopointer to destination vesw info structure
Description
This function copies the vesw info that is maintained by the given adapter to the destination address provided.
- voidopa_vnic_set_vesw_info(structopa_vnic_adapter*adapter,structopa_vesw_info*info)¶
Set the vesw information
Parameters
structopa_vnic_adapter*adaptervnic port adapter
structopa_vesw_info*infopointer to vesw info structure
Description
This function updates the vesw info that is maintained by the given adapter with the vesw info provided. Reserved fields are stored and returned back to EM as is.
- voidopa_vnic_get_per_veswport_info(structopa_vnic_adapter*adapter,structopa_per_veswport_info*info)¶
Get the vesw per port information
Parameters
structopa_vnic_adapter*adaptervnic port adapter
structopa_per_veswport_info*infopointer to destination vport info structure
Description
This function copies the vesw per port info that is maintained by the given adapter to the destination address provided. Note that the read only fields are not copied.
- voidopa_vnic_set_per_veswport_info(structopa_vnic_adapter*adapter,structopa_per_veswport_info*info)¶
Set vesw per port information
Parameters
structopa_vnic_adapter*adaptervnic port adapter
structopa_per_veswport_info*infopointer to vport info structure
Description
This function updates the vesw per port info that is maintained by the given adapter with the vesw per port info provided. Reserved fields are stored and returned back to EM as is.
- voidopa_vnic_query_mcast_macs(structopa_vnic_adapter*adapter,structopa_veswport_iface_macs*macs)¶
query multicast mac list
Parameters
structopa_vnic_adapter*adaptervnic port adapter
structopa_veswport_iface_macs*macspointer to mac list
Description
This function populates the provided mac list with the configured multicast addresses in the adapter.
- voidopa_vnic_query_ucast_macs(structopa_vnic_adapter*adapter,structopa_veswport_iface_macs*macs)¶
query unicast mac list
Parameters
structopa_vnic_adapter*adaptervnic port adapter
structopa_veswport_iface_macs*macspointer to mac list
Description
This function populates the provided mac list with the configured unicast addresses in the adapter.
- structopa_vnic_vema_port¶
VNIC VEMA port details
Definition:
struct opa_vnic_vema_port {
    struct opa_vnic_ctrl_port *cport;
    struct ib_mad_agent *mad_agent;
    struct opa_class_port_info class_port_info;
    u64 tid;
    u8 port_num;
    struct xarray vports;
    struct ib_event_handler event_handler;
    struct mutex lock;
};
Members
cportpointer to port
mad_agentpointer to mad agent for port
class_port_infoClass port info information.
tidTransaction id
port_numOPA port number
vportsvnic ports
event_handlerib event handler
lockadapter interface lock
- u8vema_get_vport_num(structopa_vnic_vema_mad*recvd_mad)¶
Get the vnic from the mad
Parameters
structopa_vnic_vema_mad*recvd_madReceived mad
Return
returns value of the vnic port number
- structopa_vnic_adapter*vema_get_vport_adapter(structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_port*port)¶
Get vnic port adapter from recvd mad
Parameters
structopa_vnic_vema_mad*recvd_madreceived mad
structopa_vnic_vema_port*portptr to port struct on which MAD was recvd
Return
vnic adapter
- boolvema_mac_tbl_req_ok(structopa_veswport_mactable*mac_tbl)¶
Check if mac request has correct values
Parameters
structopa_veswport_mactable*mac_tblmac table
Description
This function checks for the validity of the offset and number ofentries required.
Return
true if offset and num_entries are valid
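A check of this kind can be sketched in userspace as follows. This is not the kernel implementation: the wire-header struct, the helper names, and both limits (`max_per_mad`, the number of entries that fit in one MAD, and `table_size`, the total number of table slots) are hypothetical parameters chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wire view of the fixed header of a mactable request. */
struct mactable_hdr_wire {
    uint8_t offset[2];          /* big-endian starting offset */
    uint8_t num_entries[2];     /* big-endian entry count */
    uint8_t mac_tbl_digest[4];
};

/* Decode a big-endian 16-bit value from two wire bytes. */
static uint16_t rd_be16(const uint8_t b[2])
{
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Accept the request only if the entry count fits in one MAD and the
 * range [offset, offset + num_entries) stays inside the table. */
static bool mactable_req_ok(const struct mactable_hdr_wire *hdr,
                            uint32_t max_per_mad, uint32_t table_size)
{
    uint32_t offset = rd_be16(hdr->offset);
    uint32_t num = rd_be16(hdr->num_entries);

    return num <= max_per_mad && offset + num <= table_size;
}
```

Widening both values to 32 bits before adding them avoids wrap-around when the offset and count are close to the 16-bit maximum.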
- structopa_vnic_adapter*vema_add_vport(structopa_vnic_vema_port*port,u8vport_num)¶
Add a new vnic port
Parameters
structopa_vnic_vema_port*portptr to opa_vnic_vema_port struct
u8vport_numvnic port number (to be added)
Description
Return a pointer to the vnic adapter structure
- voidvema_get_class_port_info(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad)¶
Get class info for port
Parameters
structopa_vnic_vema_port*portPort on which MAD was received
structopa_vnic_vema_mad*recvd_madpointer to the received mad
structopa_vnic_vema_mad*rsp_madpointer to response mad
Description
This function copies the latest class port info value set for the port and stores it for generating traps
- voidvema_set_class_port_info(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad)¶
Set class info for port
Parameters
structopa_vnic_vema_port*portPort on which MAD was received
structopa_vnic_vema_mad*recvd_madpointer to the received mad
structopa_vnic_vema_mad*rsp_madpointer to response mad
Description
This function updates the port class info for the specific vnic and sets up the response mad data
- voidvema_get_veswport_info(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad)¶
Get veswport info
Parameters
structopa_vnic_vema_port*portsource port on which MAD was received
structopa_vnic_vema_mad*recvd_madpointer to the received mad
structopa_vnic_vema_mad*rsp_madpointer to response mad
- voidvema_set_veswport_info(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad)¶
Set veswport info
Parameters
structopa_vnic_vema_port*portsource port on which MAD was received
structopa_vnic_vema_mad*recvd_madpointer to the received mad
structopa_vnic_vema_mad*rsp_madpointer to response mad
Description
This function gets the port class info for vnic
- voidvema_get_mac_entries(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad)¶
Get MAC entries in VNIC MAC table
Parameters
structopa_vnic_vema_port*portsource port on which MAD was received
structopa_vnic_vema_mad*recvd_madpointer to the received mad
structopa_vnic_vema_mad*rsp_madpointer to response mad
Description
This function gets the MAC entries that are programmed into the VNIC MAC forwarding table. It checks for the validity of the index into the MAC table and the number of entries that are to be retrieved.
- voidvema_set_mac_entries(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad)¶
Set MAC entries in VNIC MAC table
Parameters
structopa_vnic_vema_port*portsource port on which MAD was received
structopa_vnic_vema_mad*recvd_madpointer to the received mad
structopa_vnic_vema_mad*rsp_madpointer to response mad
Description
This function sets the MAC entries in the VNIC forwarding table. It checks for the validity of the index and the number of forwarding table entries to be programmed.
- voidvema_set_delete_vesw(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad)¶
Reset VESW info to POD values
Parameters
structopa_vnic_vema_port*portsource port on which MAD was received
structopa_vnic_vema_mad*recvd_madpointer to the received mad
structopa_vnic_vema_mad*rsp_madpointer to response mad
Description
This function clears all the fields of veswport info for the requested vesw and sets them back to the power-on default values. It does not delete the vesw.
- voidvema_get_mac_list(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad,u16attr_id)¶
Get the unicast/multicast macs.
Parameters
structopa_vnic_vema_port*portsource port on which MAD was received
structopa_vnic_vema_mad*recvd_madReceived mad contains fields to set vnic parameters
structopa_vnic_vema_mad*rsp_madResponse mad to be built
u16attr_idAttribute ID indicating multicast or unicast mac list
- voidvema_get_summary_counters(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad)¶
Gets summary counters.
Parameters
structopa_vnic_vema_port*portsource port on which MAD was received
structopa_vnic_vema_mad*recvd_madReceived mad contains fields to set vnic parameters
structopa_vnic_vema_mad*rsp_madResponse mad to be built
- voidvema_get_error_counters(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad)¶
Gets error counters.
Parameters
structopa_vnic_vema_port*portsource port on which MAD was received
structopa_vnic_vema_mad*recvd_madReceived mad contains fields to set vnic parameters
structopa_vnic_vema_mad*rsp_madResponse mad to be built
- voidvema_get(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad)¶
Process received get MAD
Parameters
structopa_vnic_vema_port*portsource port on which MAD was received
structopa_vnic_vema_mad*recvd_madReceived mad
structopa_vnic_vema_mad*rsp_madResponse mad to be built
- voidvema_set(structopa_vnic_vema_port*port,structopa_vnic_vema_mad*recvd_mad,structopa_vnic_vema_mad*rsp_mad)¶
Process received set MAD
Parameters
structopa_vnic_vema_port*portsource port on which MAD was received
structopa_vnic_vema_mad*recvd_madReceived mad contains fields to set vnic parameters
structopa_vnic_vema_mad*rsp_madResponse mad to be built
- voidvema_send(structib_mad_agent*mad_agent,structib_mad_send_wc*mad_wc)¶
Send handler for VEMA MAD agent
Parameters
structib_mad_agent*mad_agentpointer to the mad agent
structib_mad_send_wc*mad_wcpointer to mad send work completion information
Description
Free all the data structures associated with the sent MAD
- voidvema_recv(structib_mad_agent*mad_agent,structib_mad_send_buf*send_buf,structib_mad_recv_wc*mad_wc)¶
Recv handler for VEMA MAD agent
Parameters
structib_mad_agent*mad_agentpointer to the mad agent
structib_mad_send_buf*send_bufSend buffer if found, else NULL
structib_mad_recv_wc*mad_wcpointer to mad receive work completion information
Description
Handle only set and get methods and respond to other methods as unsupported. Allocate response buffer and address handle for the response MAD.
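The method dispatch described here can be sketched as a small classifier. The method codes 0x01 (Get) and 0x02 (Set) are the values used in the InfiniBand MAD header; the macro names, the enum and the function are local, hypothetical definitions for this sketch rather than kernel symbols.

```c
#include <stdint.h>

/* Local copies of the Get and Set MAD method codes, for illustration. */
#define SKETCH_MAD_METHOD_GET 0x01
#define SKETCH_MAD_METHOD_SET 0x02

/* What a receive handler would do with an incoming MAD. */
enum vema_action {
    VEMA_DO_GET,
    VEMA_DO_SET,
    VEMA_REJECT_UNSUPPORTED,
};

/* Only Get and Set are processed; every other method is answered as
 * unsupported. */
static enum vema_action classify_method(uint8_t method)
{
    switch (method) {
    case SKETCH_MAD_METHOD_GET:
        return VEMA_DO_GET;
    case SKETCH_MAD_METHOD_SET:
        return VEMA_DO_SET;
    default:
        return VEMA_REJECT_UNSUPPORTED;
    }
}
```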
- structopa_vnic_vema_port*vema_get_port(structopa_vnic_ctrl_port*cport,u8port_num)¶
Gets the opa_vnic_vema_port
Parameters
structopa_vnic_ctrl_port*cportpointer to control dev
u8port_numPort number
Description
This function loops through the ports and returns the opa_vnic_vema port structure that is associated with the OPA port number
Return
ptr to requested opa_vnic_vema_port structure if success, NULL if not
- voidopa_vnic_vema_send_trap(structopa_vnic_adapter*adapter,struct__opa_veswport_trap*data,u32lid)¶
This function sends a trap to the EM
Parameters
structopa_vnic_adapter*adapterpointer to vnic adapter
struct__opa_veswport_trap*datapointer to trap data filled by calling function
u32lidissuer’s lid (encap_slid from vesw_port_info)
Description
This function is called from the VNIC driver to send a trap if there is something the EM should be notified about. These events currently are
1) UNICAST INTERFACE MACADDRESS changes
2) MULTICAST INTERFACE MACADDRESS changes
3) ETHERNET LINK STATUS changes
While allocating the send mad the remote site qpn used is 1 as this is the well known QP.
- voidvema_unregister(structopa_vnic_ctrl_port*cport)¶
Unregisters agent
Parameters
structopa_vnic_ctrl_port*cportpointer to control port
Description
This deletes the registration by VEMA for MADs
- intvema_register(structopa_vnic_ctrl_port*cport)¶
Registers agent
Parameters
structopa_vnic_ctrl_port*cportpointer to control port
Description
This function registers the handlers for the VEMA MADs
Return
Returns 0 on success, non-zero otherwise.
- voidopa_vnic_ctrl_config_dev(structopa_vnic_ctrl_port*cport,boolen)¶
This function sends a trap to the EM by way of ib_modify_port to indicate support for ethernet on the fabric.
Parameters
structopa_vnic_ctrl_port*cportpointer to control port
boolenenable or disable ethernet on fabric support
- intopa_vnic_vema_add_one(structib_device*device)¶
Handle new ib device
Parameters
structib_device*deviceib device pointer
Description
Allocate the vnic control port and initialize it.
- voidopa_vnic_vema_rem_one(structib_device*device,void*client_data)¶
Handle ib device removal
Parameters
structib_device*deviceib device pointer
void*client_dataib client data
Description
Uninitialize and free the vnic control port.
InfiniBand SCSI RDMA protocol target support¶
- enumsrpt_command_state¶
SCSI command state managed by SRPT
Constants
SRPT_STATE_NEWNew command arrived and is being processed.
SRPT_STATE_NEED_DATAProcessing a write or bidir command and waiting for data arrival.
SRPT_STATE_DATA_INData for the write or bidir command arrived and is being processed.
SRPT_STATE_CMD_RSP_SENTSRP_RSP for SRP_CMD has been sent.
SRPT_STATE_MGMTProcessing a SCSI task management command.
SRPT_STATE_MGMT_RSP_SENTSRP_RSP for SRP_TSK_MGMT has been sent.
SRPT_STATE_DONECommand processing finished successfully, command processing has been aborted or command processing failed.
- structsrpt_ioctx¶
shared SRPT I/O context information
Definition:
struct srpt_ioctx {
    struct ib_cqe cqe;
    void *buf;
    dma_addr_t dma;
    uint32_t offset;
    uint32_t index;
};
Members
cqeCompletion queue element.
bufPointer to the buffer.
dmaDMA address of the buffer.
offsetOffset of the first byte in buf and dma that is actually used.
indexIndex of the I/O context in its ioctx_ring array.
- structsrpt_recv_ioctx¶
SRPT receive I/O context
Definition:
struct srpt_recv_ioctx {
    struct srpt_ioctx ioctx;
    struct list_head wait_list;
    int byte_len;
};
Members
ioctxSee above.
wait_listNode for insertion in srpt_rdma_ch.cmd_wait_list.
byte_lenNumber of bytes in ioctx.buf.
- structsrpt_send_ioctx¶
SRPT send I/O context
Definition:
struct srpt_send_ioctx {
    struct srpt_ioctx ioctx;
    struct srpt_rdma_ch *ch;
    struct srpt_recv_ioctx *recv_ioctx;
    struct srpt_rw_ctx s_rw_ctx;
    struct srpt_rw_ctx *rw_ctxs;
    struct scatterlist imm_sg;
    struct ib_cqe rdma_cqe;
    enum srpt_command_state state;
    struct se_cmd cmd;
    u8 n_rdma;
    u8 n_rw_ctx;
    bool queue_status_only;
    u8 sense_data[TRANSPORT_SENSE_BUFFER];
};
Members
ioctxSee above.
chChannel pointer.
recv_ioctxReceive I/O context associated with this send I/O context. Only used for processing immediate data.
s_rw_ctxrw_ctxs points here if only a single rw_ctx is needed.
rw_ctxsRDMA read/write contexts.
imm_sgScatterlist for immediate data.
rdma_cqeRDMA completion queue element.
stateI/O context state.
cmdTarget core command data structure.
n_rdmaNumber of work requests needed to transfer this ioctx.
n_rw_ctxSize of rw_ctxs array.
queue_status_onlySend a SCSI status back to the initiator but no data.
sense_dataSense data to be sent to the initiator.
- enumrdma_ch_state¶
SRP channel state
Constants
CH_CONNECTINGQP is in RTR state; waiting for RTU.
CH_LIVEQP is in RTS state.
CH_DISCONNECTINGDREQ has been sent and waiting for DREP or DREQ has been received.
CH_DRAININGDREP has been received or waiting for DREP timed out and last work request has been queued.
CH_DISCONNECTEDLast completion has been received.
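The states above describe a one-way progression from connecting to disconnected. A minimal user-space sketch of a forward-only state update follows. It assumes that the enumerators are ordered as listed and that a transition is accepted only when it moves forward; both points are assumptions of this sketch, and the names ch_sketch and ch_set_state() are hypothetical. The driver protects the state with a spinlock, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors enum rdma_ch_state; the numeric ordering is an assumption. */
enum ch_state_sketch {
	CH_CONNECTING,
	CH_LIVE,
	CH_DISCONNECTING,
	CH_DRAINING,
	CH_DISCONNECTED,
};

struct ch_sketch {
	enum ch_state_sketch state;
};

/*
 * Move the channel to 'next' only if that is a forward step.
 * Returns true if the state changed.
 */
static bool ch_set_state(struct ch_sketch *ch, enum ch_state_sketch next)
{
	assert(ch);
	if (next <= ch->state)
		return false;
	ch->state = next;
	return true;
}
```

Rejecting backward steps means that a late event, such as an RTU arriving after a disconnect has started, cannot revive a channel that is already being torn down.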
- structsrpt_rdma_ch¶
RDMA channel
Definition:
struct srpt_rdma_ch {
    struct srpt_nexus *nexus;
    struct ib_qp *qp;
    union {
        struct {
            struct ib_cm_id *cm_id;
        } ib_cm;
        struct {
            struct rdma_cm_id *cm_id;
        } rdma_cm;
    };
    struct ib_cq *cq;
    u32 cq_size;
    struct ib_cqe zw_cqe;
    struct rcu_head rcu;
    struct kref kref;
    struct completion *closed;
    int rq_size;
    u32 max_rsp_size;
    atomic_t sq_wr_avail;
    struct srpt_port *sport;
    int max_ti_iu_len;
    atomic_t req_lim;
    atomic_t req_lim_delta;
    u16 imm_data_offset;
    spinlock_t spinlock;
    enum rdma_ch_state state;
    struct kmem_cache *rsp_buf_cache;
    struct srpt_send_ioctx **ioctx_ring;
    struct kmem_cache *req_buf_cache;
    struct srpt_recv_ioctx **ioctx_recv_ring;
    struct list_head list;
    struct list_head cmd_wait_list;
    uint16_t pkey;
    bool using_rdma_cm;
    bool processing_wait_list;
    struct se_session *sess;
    u8 sess_name[40];
    struct work_struct release_work;
};
Members
nexusI_T nexus this channel is associated with.
qpIB queue pair used for communicating over this channel.
{unnamed_union}anonymous
ib_cmSee below.
ib_cm.cm_idIB CM ID associated with the channel.
rdma_cmSee below.
rdma_cm.cm_idRDMA CM ID associated with the channel.
cqIB completion queue for this channel.
cq_sizeNumber of CQEs in cq.
zw_cqeZero-length write CQE.
rcuRCU head.
krefkref for this channel.
closedCompletion object that will be signaled as soon as a new channel object with the same identity can be created.
rq_sizeIB receive queue size.
max_rsp_sizeMaximum size of an RSP response message in bytes.
sq_wr_availnumber of work requests available in the send queue.
sportpointer to the information of the HCA port used by this channel.
max_ti_iu_lenmaximum target-to-initiator information unit length.
req_limrequest limit: maximum number of requests that may be sent by the initiator without having received a response.
req_lim_deltaNumber of credits not yet sent back to the initiator.
imm_data_offsetOffset from start of SRP_CMD for immediate data.
spinlockProtects free_list and state.
statechannel state. See also enum rdma_ch_state.
rsp_buf_cachekmem_cache for ioctx_ring.
ioctx_ringSend ring.
req_buf_cachekmem_cache forioctx_recv_ring.
ioctx_recv_ringReceive I/O context ring.
listNode in srpt_nexus.ch_list.
cmd_wait_listList of SCSI commands that arrived before the RTU event. This list contains struct srpt_ioctx elements and is protected against concurrent modification by the cm_id spinlock.
pkeyP_Key of the IB partition for this SRP channel.
using_rdma_cmWhether the RDMA/CM or IB/CM is used for this channel.
processing_wait_listWhether or not cmd_wait_list is being processed.
sessSession information associated with this SRP channel.
sess_nameSession name.
release_workAllows scheduling of srpt_release_channel().
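The req_lim_delta member counts credits that have not yet been reported back to the initiator. One way to picture such accounting is sketched below with C11 atomics. The sketch assumes that every response returns one credit for the request it answers plus any credits deferred earlier; the names credit_sketch, defer_credit() and take_credits() are hypothetical and are not driver functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct credit_sketch {
	atomic_int req_lim_delta;	/* credits not yet sent back */
};

/* A request finished without a response: keep its credit for later. */
static void defer_credit(struct credit_sketch *ch)
{
	assert(ch);
	atomic_fetch_add(&ch->req_lim_delta, 1);
}

/*
 * Number of credits to report in the next response: one for the request
 * being answered plus everything deferred so far. The deferred count is
 * reset atomically so that no credit is reported twice.
 */
static int32_t take_credits(struct credit_sketch *ch)
{
	assert(ch);
	return 1 + atomic_exchange(&ch->req_lim_delta, 0);
}
```

Using an atomic exchange rather than a separate read and reset keeps the count consistent when responses are built concurrently.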
- structsrpt_nexus¶
I_T nexus
Definition:
struct srpt_nexus {
    struct rcu_head rcu;
    struct list_head entry;
    struct list_head ch_list;
    u8 i_port_id[16];
    u8 t_port_id[16];
};
Members
rcuRCU head for this data structure.
entrysrpt_port.nexus_list list node.
ch_liststruct srpt_rdma_ch list. Protected by srpt_port.mutex.
i_port_id128-bit initiator port identifier copied from SRP_LOGIN_REQ.
t_port_id128-bit target port identifier copied from SRP_LOGIN_REQ.
- structsrpt_port_attrib¶
attributes for SRPT port
Definition:
struct srpt_port_attrib {
    u32 srp_max_rdma_size;
    u32 srp_max_rsp_size;
    u32 srp_sq_size;
    bool use_srq;
};
Members
srp_max_rdma_sizeMaximum size of SRP RDMA transfers for new connections.
srp_max_rsp_sizeMaximum size of SRP response messages in bytes.
srp_sq_sizeShared receive queue (SRQ) size.
use_srqWhether or not to use SRQ.
- structsrpt_tpg¶
information about a single “target portal group”
Definition:
struct srpt_tpg {
    struct list_head entry;
    struct srpt_port_id *sport_id;
    struct se_portal_group tpg;
};
Members
entryEntry in sport_id->tpg_list.
sport_idPort name this TPG is associated with.
tpgLIO TPG data structure.
Description
Zero or more target portal groups are associated with each port name (srpt_port_id). With each TPG an ACL list is associated.
- structsrpt_port_id¶
LIO RDMA port information
Definition:
struct srpt_port_id {
    struct mutex mutex;
    struct list_head tpg_list;
    struct se_wwn wwn;
    char name[64];
};
Members
mutexProtects tpg_list changes.
tpg_listTPGs associated with the RDMA port name.
wwnWWN associated with the RDMA port name.
nameASCII representation of the port name.
Description
Multiple sysfs directories can be associated with a single RDMA port. This data structure represents a single (port, name) pair.
- structsrpt_port¶
SRPT RDMA port information
Definition:
struct srpt_port {
    struct srpt_device *sdev;
    struct ib_mad_agent *mad_agent;
    bool enabled;
    u8 port;
    u32 sm_lid;
    u32 lid;
    union ib_gid gid;
    struct work_struct work;
    char guid_name[64];
    struct srpt_port_id *guid_id;
    char gid_name[64];
    struct srpt_port_id *gid_id;
    struct srpt_port_attrib port_attrib;
    atomic_t refcount;
    struct completion *freed_channels;
    struct mutex mutex;
    struct list_head nexus_list;
};
Members
sdevbackpointer to the HCA information.
mad_agentper-port management datagram processing information.
enabledWhether or not this target port is enabled.
portone-based port number.
sm_lidcached value of the port’s sm_lid.
lidcached value of the port’s lid.
gidcached value of the port’s gid.
workwork structure for refreshing the aforementioned cached values.
guid_nameport name in GUID format.
guid_idLIO target port information for the port name in GUID format.
gid_nameport name in GID format.
gid_idLIO target port information for the port name in GID format.
port_attribPort attributes that can be accessed through configfs.
refcountNumber of objects associated with this port.
freed_channelsCompletion that will be signaled once refcount becomes 0.
mutexProtects nexus_list.
nexus_listNexus list. See also srpt_nexus.entry.
- structsrpt_device¶
information associated by SRPT with a single HCA
Definition:
struct srpt_device {
    struct kref refcnt;
    struct ib_device *device;
    struct ib_pd *pd;
    u32 lkey;
    struct ib_srq *srq;
    struct ib_cm_id *cm_id;
    int srq_size;
    struct mutex sdev_mutex;
    bool use_srq;
    struct kmem_cache *req_buf_cache;
    struct srpt_recv_ioctx **ioctx_ring;
    struct ib_event_handler event_handler;
    struct list_head list;
    struct srpt_port port[];
};
Members
refcntReference count for this device.
deviceBackpointer to the struct ib_device managed by the IB core.
pdIB protection domain.
lkeyL_Key (local key) with write access to all local memory.
srqPer-HCA SRQ (shared receive queue).
cm_idConnection identifier.
srq_sizeSRQ size.
sdev_mutexSerializes use_srq changes.
use_srqWhether or not to use SRQ.
req_buf_cachekmem_cache for ioctx_ring buffers.
ioctx_ringPer-HCA SRQ.
event_handlerPer-HCA asynchronous IB event handler.
listNode in srpt_dev_list.
portInformation about the ports owned by this HCA.
- voidsrpt_event_handler(structib_event_handler*handler,structib_event*event)¶
asynchronous IB event callback function
Parameters
structib_event_handler*handlerIB event handler registered by ib_register_event_handler().
structib_event*eventDescription of the event that occurred.
Description
Callback function called by the InfiniBand core when an asynchronous IB event occurs. This callback may occur in interrupt context. See also section 11.5.2, Set Asynchronous Event Handler in the InfiniBand Architecture Specification.
- voidsrpt_srq_event(structib_event*event,void*ctx)¶
SRQ event callback function
Parameters
structib_event*eventDescription of the event that occurred.
void*ctxContext pointer specified at SRQ creation time.
- voidsrpt_qp_event(structib_event*event,void*ptr)¶
QP event callback function
Parameters
structib_event*eventDescription of the event that occurred.
void*ptrSRPT RDMA channel.
- voidsrpt_set_ioc(u8*c_list,u32slot,u8value)¶
initialize an IOUnitInfo structure
Parameters
u8*c_listcontroller list.
u32slotone-based slot number.
u8valuefour-bit value.
Description
Copies the lowest four bits of value in element slot of the array of four-bit elements called c_list (controller list). The index slot is one-based.
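The packing of four-bit elements with a one-based index can be sketched as follows. The nibble order is an assumption of this sketch: odd slots are stored in the upper four bits of a byte and even slots in the lower four bits. The name set_nibble_slot() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Store the low four bits of 'value' in one-based 'slot' of a packed
 * array of four-bit elements, leaving the neighbouring element intact.
 * Assumption: odd slots use the upper nibble, even slots the lower one.
 */
static void set_nibble_slot(uint8_t *c_list, uint32_t slot, uint8_t value)
{
	uint32_t byte;

	assert(c_list && slot >= 1);
	byte = (slot - 1) / 2;	/* two slots share one byte */
	if (slot & 1)
		c_list[byte] = (uint8_t)(((value & 0x0f) << 4) |
					 (c_list[byte] & 0x0f));
	else
		c_list[byte] = (uint8_t)((c_list[byte] & 0xf0) |
					 (value & 0x0f));
}
```

Because the index is one-based, slot 1 and slot 2 both land in c_list[0], slot 3 and slot 4 in c_list[1], and so on.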
- voidsrpt_get_class_port_info(structib_dm_mad*mad)¶
copy ClassPortInfo to a management datagram
Parameters
structib_dm_mad*madDatagram that will be sent as response to DM_ATTR_CLASS_PORT_INFO.
Description
See also section 16.3.3.1 ClassPortInfo in the InfiniBand Architecture Specification.
- voidsrpt_get_iou(structib_dm_mad*mad)¶
write IOUnitInfo to a management datagram
Parameters
structib_dm_mad*madDatagram that will be sent as response to DM_ATTR_IOU_INFO.
Description
See also section 16.3.3.3 IOUnitInfo in the InfiniBand Architecture Specification. See also section B.7, table B.6 in the SRP r16a document.
- voidsrpt_get_ioc(structsrpt_port*sport,u32slot,structib_dm_mad*mad)¶
write IOControllerProfile to a management datagram
Parameters
structsrpt_port*sportHCA port through which the MAD has been received.
u32slotSlot number specified in DM_ATTR_IOC_PROFILE query.
structib_dm_mad*madDatagram that will be sent as response to DM_ATTR_IOC_PROFILE.
Description
See also section 16.3.3.4 IOControllerProfile in the InfiniBand Architecture Specification. See also section B.7, table B.7 in the SRP r16a document.
- voidsrpt_get_svc_entries(u64ioc_guid,u16slot,u8hi,u8lo,structib_dm_mad*mad)¶
write ServiceEntries to a management datagram
Parameters
u64ioc_guidI/O controller GUID to use in reply.
u16slotI/O controller number.
u8hiEnd of the range of service entries to be specified in the reply.
u8loStart of the range of service entries to be specified in the reply.
structib_dm_mad*madDatagram that will be sent as response to DM_ATTR_SVC_ENTRIES.
Description
See also section 16.3.3.5 ServiceEntries in the InfiniBand Architecture Specification. See also section B.7, table B.8 in the SRP r16a document.
- voidsrpt_mgmt_method_get(structsrpt_port*sp,structib_mad*rq_mad,structib_dm_mad*rsp_mad)¶
process a received management datagram
Parameters
structsrpt_port*spHCA port through which the MAD has been received.
structib_mad*rq_madreceived MAD.
structib_dm_mad*rsp_madresponse MAD.
- voidsrpt_mad_send_handler(structib_mad_agent*mad_agent,structib_mad_send_wc*mad_wc)¶
MAD send completion callback
Parameters
structib_mad_agent*mad_agentReturn value of ib_register_mad_agent().
structib_mad_send_wc*mad_wcWork completion reporting that the MAD has been sent.
- voidsrpt_mad_recv_handler(structib_mad_agent*mad_agent,structib_mad_send_buf*send_buf,structib_mad_recv_wc*mad_wc)¶
MAD reception callback function
Parameters
structib_mad_agent*mad_agentReturn value of ib_register_mad_agent().
structib_mad_send_buf*send_bufNot used.
structib_mad_recv_wc*mad_wcWork completion reporting that a MAD has been received.
Parameters
structsrpt_port*sportSRPT HCA port.
Description
Enable InfiniBand management datagram processing, update the cached sm_lid, lid and gid values, and register a callback function for processing MADs on the specified port.
Note
It is safe to call this function more than once for the same port.
- voidsrpt_unregister_mad_agent(structsrpt_device*sdev,intport_cnt)¶
unregister MAD callback functions
Parameters
structsrpt_device*sdevSRPT HCA pointer.
intport_cntnumber of ports with registered MAD
Note
It is safe to call this function more than once for the same device.
- structsrpt_ioctx*srpt_alloc_ioctx(structsrpt_device*sdev,intioctx_size,structkmem_cache*buf_cache,enumdma_data_directiondir)¶
allocate a SRPT I/O context structure
Parameters
structsrpt_device*sdevSRPT HCA pointer.
intioctx_sizeI/O context size.
structkmem_cache*buf_cacheI/O buffer cache.
enumdma_data_directiondirDMA data direction.
- voidsrpt_free_ioctx(structsrpt_device*sdev,structsrpt_ioctx*ioctx,structkmem_cache*buf_cache,enumdma_data_directiondir)¶
free a SRPT I/O context structure
Parameters
structsrpt_device*sdevSRPT HCA pointer.
structsrpt_ioctx*ioctxI/O context pointer.
structkmem_cache*buf_cacheI/O buffer cache.
enumdma_data_directiondirDMA data direction.
- structsrpt_ioctx**srpt_alloc_ioctx_ring(structsrpt_device*sdev,intring_size,intioctx_size,structkmem_cache*buf_cache,intalignment_offset,enumdma_data_directiondir)¶
allocate a ring of SRPT I/O context structures
Parameters
structsrpt_device*sdevDevice to allocate the I/O context ring for.
intring_sizeNumber of elements in the I/O context ring.
intioctx_sizeI/O context size.
structkmem_cache*buf_cacheI/O buffer cache.
intalignment_offsetOffset in each ring buffer at which the SRP information unit starts.
enumdma_data_directiondirDMA data direction.
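Allocating a ring of I/O contexts amounts to allocating an array of pointers, filling in each element, and undoing everything if any step fails. The user-space sketch below illustrates that pattern only. It replaces the kmem_cache and DMA mapping steps with plain malloc(), and the names ioctx_sketch, alloc_ring() and free_ring() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct ioctx_sketch {
	void *buf;		/* stands in for the DMA-mapped buffer */
	uint32_t offset;	/* first byte of buf that is actually used */
	uint32_t index;		/* position of this context in the ring */
};

/* Free a ring; tolerates partially initialized rings and NULL. */
static void free_ring(struct ioctx_sketch **ring, int ring_size)
{
	int i;

	if (!ring)
		return;
	for (i = 0; i < ring_size; i++) {
		if (ring[i]) {
			free(ring[i]->buf);
			free(ring[i]);
		}
	}
	free(ring);
}

/* Allocate ring_size contexts, each with a buf_size byte buffer. */
static struct ioctx_sketch **alloc_ring(int ring_size, size_t buf_size,
					uint32_t alignment_offset)
{
	struct ioctx_sketch **ring;
	int i;

	assert(ring_size > 0 && buf_size > alignment_offset);
	ring = calloc((size_t)ring_size, sizeof(*ring));
	if (!ring)
		return NULL;
	for (i = 0; i < ring_size; i++) {
		ring[i] = calloc(1, sizeof(*ring[i]));
		if (!ring[i])
			goto err;
		ring[i]->buf = malloc(buf_size);
		if (!ring[i]->buf)
			goto err;
		ring[i]->index = (uint32_t)i;
		ring[i]->offset = alignment_offset;
	}
	return ring;

err:
	/* Unused entries are NULL thanks to calloc(), so this is safe. */
	free_ring(ring, ring_size);
	return NULL;
}
```

Storing the index in each element lets a completion handler that only has a pointer to the context find its slot in the ring without searching.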
- voidsrpt_free_ioctx_ring(structsrpt_ioctx**ioctx_ring,structsrpt_device*sdev,intring_size,structkmem_cache*buf_cache,enumdma_data_directiondir)¶
free the ring of SRPT I/O context structures
Parameters
structsrpt_ioctx**ioctx_ringI/O context ring to be freed.
structsrpt_device*sdevSRPT HCA pointer.
intring_sizeNumber of ring elements.
structkmem_cache*buf_cacheI/O buffer cache.
enumdma_data_directiondirDMA data direction.
- enumsrpt_command_statesrpt_set_cmd_state(structsrpt_send_ioctx*ioctx,enumsrpt_command_statenew)¶
set the state of a SCSI command
Parameters
structsrpt_send_ioctx*ioctxSend I/O context.
enumsrpt_command_statenewNew I/O context state.
Description
Does not modify the state of aborted commands. Returns the previous command state.
- boolsrpt_test_and_set_cmd_state(structsrpt_send_ioctx*ioctx,enumsrpt_command_stateold,enumsrpt_command_statenew)¶
test and set the state of a command
Parameters
structsrpt_send_ioctx*ioctxSend I/O context.
enumsrpt_command_stateoldCurrent I/O context state.
enumsrpt_command_statenewNew I/O context state.
Description
Returns true if and only if the previous command state was equal to ‘old’.
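The two state helpers documented above can be modelled in user space as follows. The sketch assumes that an aborted command is one whose state is already the DONE state, as the description of SRPT_STATE_DONE suggests, and it leaves out the locking a real driver needs. All names ending in _sketch are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors enum srpt_command_state with shortened names. */
enum cmd_state_sketch {
	ST_NEW,
	ST_NEED_DATA,
	ST_DATA_IN,
	ST_CMD_RSP_SENT,
	ST_MGMT,
	ST_MGMT_RSP_SENT,
	ST_DONE,
};

struct send_ioctx_sketch {
	enum cmd_state_sketch state;
};

/* Set a new state unless the command is already done; return the old one. */
static enum cmd_state_sketch
set_cmd_state_sketch(struct send_ioctx_sketch *ioctx,
		     enum cmd_state_sketch next)
{
	enum cmd_state_sketch prev;

	assert(ioctx);
	prev = ioctx->state;
	if (prev != ST_DONE)
		ioctx->state = next;
	return prev;
}

/* Change the state to 'next' only if it currently equals 'old'. */
static bool
test_and_set_cmd_state_sketch(struct send_ioctx_sketch *ioctx,
			      enum cmd_state_sketch old,
			      enum cmd_state_sketch next)
{
	assert(ioctx);
	if (ioctx->state != old)
		return false;
	ioctx->state = next;
	return true;
}
```

The test-and-set variant lets two code paths race for the same transition while guaranteeing that only one of them sees true and proceeds.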
- intsrpt_post_recv(structsrpt_device*sdev,structsrpt_rdma_ch*ch,structsrpt_recv_ioctx*ioctx)¶
post an IB receive request
Parameters
structsrpt_device*sdevSRPT HCA pointer.
structsrpt_rdma_ch*chSRPT RDMA channel.
structsrpt_recv_ioctx*ioctxReceive I/O context pointer.
- intsrpt_zerolength_write(structsrpt_rdma_ch*ch)¶
perform a zero-length RDMA write
Parameters
structsrpt_rdma_ch*chSRPT RDMA channel.
Description
A quote from the InfiniBand specification: C9-88: For an HCA responder using Reliable Connection service, for each zero-length RDMA READ or WRITE request, the R_Key shall not be validated, even if the request includes Immediate data.
- intsrpt_get_desc_tbl(structsrpt_recv_ioctx*recv_ioctx,structsrpt_send_ioctx*ioctx,structsrp_cmd*srp_cmd,enumdma_data_direction*dir,structscatterlist**sg,unsignedint*sg_cnt,u64*data_len,u16imm_data_offset)¶
parse the data descriptors of a SRP_CMD request
Parameters
structsrpt_recv_ioctx*recv_ioctxI/O context associated with the received command srp_cmd.
structsrpt_send_ioctx*ioctxI/O context that will be used for responding to the initiator.
structsrp_cmd*srp_cmdPointer to the SRP_CMD request data.
enumdma_data_direction*dirPointer to the variable to which the transfer direction will be written.
structscatterlist**sg[out] scatterlist for the parsed SRP_CMD.
unsignedint*sg_cnt[out] length of sg.
u64*data_lenPointer to the variable to which the total data length of all descriptors in the SRP_CMD request will be written.
u16imm_data_offset[in] Offset in SRP_CMD requests at which immediate data starts.
Description
This function initializes ioctx->nrbuf and ioctx->r_bufs.
Returns -EINVAL when the SRP_CMD request contains inconsistent descriptors; -ENOMEM when memory allocation fails and zero upon success.
- intsrpt_init_ch_qp(structsrpt_rdma_ch*ch,structib_qp*qp)¶
initialize queue pair attributes
Parameters
structsrpt_rdma_ch*chSRPT RDMA channel.
structib_qp*qpQueue pair pointer.
Description
Initializes the attributes of queue pair ‘qp’ by allowing local write, remote read and remote write. Also transitions ‘qp’ to state IB_QPS_INIT.
- intsrpt_ch_qp_rtr(structsrpt_rdma_ch*ch,structib_qp*qp)¶
change the state of a channel to ‘ready to receive’ (RTR)
Parameters
structsrpt_rdma_ch*chchannel of the queue pair.
structib_qp*qpqueue pair to change the state of.
Description
Returns zero upon success and a negative value upon failure.
Note
Currently a struct ib_qp_attr takes 136 bytes on a 64-bit system. If this structure ever becomes larger, it might be necessary to allocate it dynamically instead of on the stack.
- intsrpt_ch_qp_rts(structsrpt_rdma_ch*ch,structib_qp*qp)¶
change the state of a channel to ‘ready to send’ (RTS)
Parameters
structsrpt_rdma_ch*chchannel of the queue pair.
structib_qp*qpqueue pair to change the state of.
Description
Returns zero upon success and a negative value upon failure.
Note
Currently a struct ib_qp_attr takes 136 bytes on a 64-bit system. If this structure ever becomes larger, it might be necessary to allocate it dynamically instead of on the stack.
- intsrpt_ch_qp_err(structsrpt_rdma_ch*ch)¶
set the channel queue pair state to ‘error’
Parameters
structsrpt_rdma_ch*chSRPT RDMA channel.
- structsrpt_send_ioctx*srpt_get_send_ioctx(structsrpt_rdma_ch*ch)¶
obtain an I/O context for sending to the initiator
Parameters
structsrpt_rdma_ch*chSRPT RDMA channel.
- intsrpt_abort_cmd(structsrpt_send_ioctx*ioctx)¶
abort a SCSI command
Parameters
structsrpt_send_ioctx*ioctxI/O context associated with the SCSI command.
- voidsrpt_rdma_read_done(structib_cq*cq,structib_wc*wc)¶
RDMA read completion callback
Parameters
structib_cq*cqCompletion queue.
structib_wc*wcWork completion.
Description
XXX: what is now target_execute_cmd used to be asynchronous, and unmapping the data that has been transferred via IB RDMA had to be postponed until the check_stop_free() callback. None of this is necessary anymore and needs to be cleaned up.
- intsrpt_build_cmd_rsp(structsrpt_rdma_ch*ch,structsrpt_send_ioctx*ioctx,u64tag,intstatus)¶
build a SRP_RSP response
Parameters
structsrpt_rdma_ch*chRDMA channel through which the request has been received.
structsrpt_send_ioctx*ioctxI/O context associated with the SRP_CMD request. The response will be built in the buffer ioctx->buf points at and hence this function will overwrite the request data.
u64tagtag of the request for which this response is being generated.
intstatusvalue for the STATUS field of the SRP_RSP information unit.
Description
Returns the size in bytes of the SRP_RSP response.
An SRP_RSP response contains a SCSI status or service response. See also section 6.9 in the SRP r16a document for the format of an SRP_RSP response. See also SPC-2 for more information about sense data.
- intsrpt_build_tskmgmt_rsp(structsrpt_rdma_ch*ch,structsrpt_send_ioctx*ioctx,u8rsp_code,u64tag)¶
build a task management response
Parameters
structsrpt_rdma_ch*chRDMA channel through which the request has been received.
structsrpt_send_ioctx*ioctxI/O context in which the SRP_RSP response will be built.
u8rsp_codeRSP_CODE that will be stored in the response.
u64tagTag of the request for which this response is being generated.
Description
Returns the size in bytes of the SRP_RSP response.
An SRP_RSP response contains a SCSI status or service response. See also section 6.9 in the SRP r16a document for the format of an SRP_RSP response.
- voidsrpt_handle_cmd(structsrpt_rdma_ch*ch,structsrpt_recv_ioctx*recv_ioctx,structsrpt_send_ioctx*send_ioctx)¶
process a SRP_CMD information unit
Parameters
structsrpt_rdma_ch*chSRPT RDMA channel.
structsrpt_recv_ioctx*recv_ioctxReceive I/O context.
structsrpt_send_ioctx*send_ioctxSend I/O context.
- voidsrpt_handle_tsk_mgmt(structsrpt_rdma_ch*ch,structsrpt_recv_ioctx*recv_ioctx,structsrpt_send_ioctx*send_ioctx)¶
process a SRP_TSK_MGMT information unit
Parameters
structsrpt_rdma_ch*chSRPT RDMA channel.
structsrpt_recv_ioctx*recv_ioctxReceive I/O context.
structsrpt_send_ioctx*send_ioctxSend I/O context.
Description
Returns 0 if and only if the request will be processed by the target core.
For more information about SRP_TSK_MGMT information units, see also section 6.7 in the SRP r16a document.
- boolsrpt_handle_new_iu(structsrpt_rdma_ch*ch,structsrpt_recv_ioctx*recv_ioctx)¶
process a newly received information unit
Parameters
structsrpt_rdma_ch*chRDMA channel through which the information unit has been received.
structsrpt_recv_ioctx*recv_ioctxReceive I/O context associated with the information unit.
- voidsrpt_send_done(structib_cq*cq,structib_wc*wc)¶
send completion callback
Parameters
structib_cq*cqCompletion queue.
structib_wc*wcWork completion.
Note
Although this has not yet been observed during tests, at least in theory it is possible that the srpt_get_send_ioctx() call invoked by srpt_handle_new_iu() fails. This is possible because the req_lim_delta value in each response is set to one, and it is possible that this response makes the initiator send a new request before the send completion for that response has been processed. This could happen, for example, if the call to srpt_put_send_ioctx() is delayed because of a higher-priority interrupt or if IB retransmission causes generation of the send completion to be delayed. Incoming information units for which srpt_get_send_ioctx() fails are queued on cmd_wait_list. The code below processes these delayed requests one at a time.
- intsrpt_create_ch_ib(structsrpt_rdma_ch*ch)¶
create receive and send completion queues
Parameters
structsrpt_rdma_ch*chSRPT RDMA channel.
- boolsrpt_close_ch(structsrpt_rdma_ch*ch)¶
close a RDMA channel
Parameters
structsrpt_rdma_ch*chSRPT RDMA channel.
Description
Make sure all resources associated with the channel will be deallocated at an appropriate time.
Returns true if and only if the channel state has been modified into CH_DRAINING.
- intsrpt_cm_req_recv(structsrpt_device*constsdev,structib_cm_id*ib_cm_id,structrdma_cm_id*rdma_cm_id,u8port_num,__be16pkey,conststructsrp_login_req*req,constchar*src_addr)¶
process the event IB_CM_REQ_RECEIVED
Parameters
structsrpt_device*constsdevHCA through which the login request was received.
structib_cm_id*ib_cm_idIB/CM connection identifier in case of IB/CM.
structrdma_cm_id*rdma_cm_idRDMA/CM connection identifier in case of RDMA/CM.
u8port_numPort through which the REQ message was received.
__be16pkeyP_Key of the incoming connection.
conststructsrp_login_req*reqSRP login request.
constchar*src_addrGID (IB/CM) or IP address (RDMA/CM) of the port that submitted the login request.
Description
Ownership of the cm_id is transferred to the target session if this function returns zero. Otherwise the caller remains the owner of cm_id.
- voidsrpt_cm_rtu_recv(structsrpt_rdma_ch*ch)¶
process an IB_CM_RTU_RECEIVED or USER_ESTABLISHED event
Parameters
structsrpt_rdma_ch*chSRPT RDMA channel.
Description
An RTU (ready to use) message indicates that the connection has been established and that the recipient may begin transmitting.
- intsrpt_cm_handler(structib_cm_id*cm_id,conststructib_cm_event*event)¶
IB connection manager callback function
Parameters
structib_cm_id*cm_idIB/CM connection identifier.
conststructib_cm_event*eventIB/CM event.
Description
A non-zero return value will cause the caller to destroy the CM ID.
Note
srpt_cm_handler() must only return a non-zero value when transferring ownership of the cm_id to a channel by srpt_cm_req_recv() failed. Returning a non-zero value in any other case will trigger a race with the ib_destroy_cm_id() call in srpt_release_channel().
- voidsrpt_queue_response(structse_cmd*cmd)¶
transmit the response to a SCSI command
Parameters
structse_cmd*cmdSCSI target command.
Description
Callback function called by the TCM core. Must not block since it can be invoked in the context of the IB completion handler.
Parameters
structsrpt_port*sportSRPT HCA port.
- structport_and_port_idsrpt_lookup_port(constchar*name)¶
Look up an RDMA port by name
Parameters
constchar*nameASCII port name
Description
Increments the RDMA port reference count if an RDMA port pointer is returned. The caller must drop that reference count by calling srpt_port_put_ref().
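The contract described above, where a successful lookup takes a reference that the caller must drop, can be sketched in user space as follows. This is a single-threaded model with a plain integer counter. The names port_sketch, lookup_port_sketch() and port_put_ref_sketch() are hypothetical, and the port names in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct port_sketch {
	const char *name;	/* ASCII port name */
	int refcount;		/* references handed out by lookups */
};

/* Find a port by name; take a reference only when a port is returned. */
static struct port_sketch *
lookup_port_sketch(struct port_sketch *ports, size_t n, const char *name)
{
	size_t i;

	assert(ports && name);
	for (i = 0; i < n; i++) {
		if (strcmp(ports[i].name, name) == 0) {
			ports[i].refcount++;
			return &ports[i];
		}
	}
	return NULL;
}

/* Drop a reference obtained from lookup_port_sketch(). */
static void port_put_ref_sketch(struct port_sketch *port)
{
	assert(port && port->refcount > 0);
	port->refcount--;
}
```

A caller pairs every successful lookup with exactly one put, for example by looking up "port-b", using the returned pointer, and then calling port_put_ref_sketch() on it. A failed lookup returns NULL and leaves all counters unchanged.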
- intsrpt_add_one(structib_device*device)¶
InfiniBand device addition callback function
Parameters
structib_device*deviceDescribes a HCA.
- voidsrpt_remove_one(structib_device*device,void*client_data)¶
InfiniBand device removal callback function
Parameters
structib_device*deviceDescribes a HCA.
void*client_dataThe value passed as the third argument to ib_set_client_data().
- voidsrpt_close_session(structse_session*se_sess)¶
forcibly close a session
Parameters
structse_session*se_sessSCSI target session.
Description
Callback function invoked by the TCM core to clean up sessions associated with a node ACL when the user invokes rmdir /sys/kernel/config/target/$driver/$port/$tpg/acls/$i_port_id
- intsrpt_parse_i_port_id(u8i_port_id[16],constchar*name)¶
parse an initiator port ID
Parameters
u8i_port_id[16]Binary 128-bit port ID.
constchar*nameASCII representation of a 128-bit initiator port ID.
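Converting an ASCII port ID into its 16-byte binary form can be sketched as shown below. The accepted input format is an assumption of this sketch: an optional "0x" prefix followed by an even number of hexadecimal digits, at most 32, with shorter inputs padded with leading zero bytes. The names hex_val() and parse_port_id_sketch() are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Value of one hexadecimal digit, or -1 if 'c' is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = (char)tolower((unsigned char)c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Parse 'name' into a 16-byte port ID, most significant byte first.
 * Returns 0 on success and -1 on malformed input.
 */
static int parse_port_id_sketch(uint8_t id[16], const char *name)
{
	size_t len, i, pad;

	assert(id && name);
	if (name[0] == '0' && (name[1] == 'x' || name[1] == 'X'))
		name += 2;
	len = strlen(name);
	if (len == 0 || len > 32 || (len & 1))
		return -1;
	pad = 16 - len / 2;		/* number of leading zero bytes */
	memset(id, 0, pad);
	for (i = 0; i < len / 2; i++) {
		int hi = hex_val(name[2 * i]);
		int lo = hex_val(name[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		id[pad + i] = (uint8_t)((hi << 4) | lo);
	}
	return 0;
}
```

On failure the contents of id are unspecified, so callers should only use the buffer after a return value of 0.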
- structse_portal_group*srpt_make_tpg(structse_wwn*wwn,constchar*name)¶
configfs callback invoked for mkdir /sys/kernel/config/target/$driver/$port/$tpg
Parameters
structse_wwn*wwnCorresponds to $driver/$port.
constchar*name$tpg.
- voidsrpt_drop_tpg(structse_portal_group*tpg)¶
configfs callback invoked for rmdir /sys/kernel/config/target/$driver/$port/$tpg
Parameters
structse_portal_group*tpgTarget portal group to deregister.
- structse_wwn*srpt_make_tport(structtarget_fabric_configfs*tf,structconfig_group*group,constchar*name)¶
configfs callback invoked for mkdir /sys/kernel/config/target/$driver/$port
Parameters
structtarget_fabric_configfs*tfNot used.
structconfig_group*groupNot used.
constchar*name$port.
- voidsrpt_drop_tport(structse_wwn*wwn)¶
configfs callback invoked for rmdir /sys/kernel/config/target/$driver/$port
Parameters
structse_wwn*wwn$port.
- intsrpt_init_module(void)¶
kernel module initialization
Parameters
voidno arguments
Note
Since ib_register_client() registers callback functions, and since at least one of these callback functions (srpt_add_one()) calls target core functions, this driver must be registered with the target core before ib_register_client() is called.
iSCSI Extensions for RDMA (iSER) target support¶
- voidisert_conn_terminate(structisert_conn*isert_conn)¶
Initiate connection termination
Parameters
structisert_conn*isert_connisert connection struct
Notes
In case the connection state is BOUND, move state to TERMINATING and start teardown sequence (rdma_disconnect). In case the connection state is UP, complete flush as well.
This routine must be called with mutex held. Thus it is safe to call multiple times.
- voidisert_put_unsol_pending_cmds(structiscsit_conn*conn)¶
Drop commands waiting for unsolicited dataout
Parameters
structiscsit_conn*conniscsi connection
Description
We might still have commands that are waiting for unsolicited dataout messages. We must put the extra reference on those before blocking on the target_wait_for_session_cmds