Management Component Transport Protocol (MCTP)
net/mctp/ contains protocol support for MCTP, as defined by DMTF standard DSP0236. Physical interface drivers (“bindings” in the specification) are provided in drivers/net/mctp/.
The core code provides a socket-based interface to send and receive MCTP messages, through an AF_MCTP, SOCK_DGRAM socket.
Structure: interfaces & networks
The kernel models the local MCTP topology through two items: interfaces and networks.
An interface (or “link”) is an instance of an MCTP physical transport binding (as defined by DSP0236, section 3.2.47), likely connected to a specific hardware device. This is represented as a struct net_device.
A network defines a unique address space for MCTP endpoints by endpoint-ID (described by DSP0236, section 3.2.31). A network has a user-visible identifier to allow references from userspace. Route definitions are specific to one network.
Interfaces are associated with one network. A network may be associated with one or more interfaces.
If multiple networks are present, each may contain endpoint IDs (EIDs) that are also present on other networks.
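Because an EID is only unique within its network, an endpoint is fully identified by the (network, EID) pair. A minimal sketch of that idea follows; struct mctp_endpoint and mctp_endpoint_equal() are hypothetical names used for illustration only, not kernel or uapi definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical illustration type: not part of the kernel or its uapi. */
struct mctp_endpoint {
	unsigned int net;	/* user-visible network identifier */
	uint8_t eid;		/* endpoint ID, unique only within 'net' */
};

/* Two endpoints are the same only if both the network and the EID match. */
bool mctp_endpoint_equal(const struct mctp_endpoint *a,
			 const struct mctp_endpoint *b)
{
	return a->net == b->net && a->eid == b->eid;
}
```

With this model, EID 8 on network 1 and EID 8 on network 2 compare as different endpoints.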
Sockets API
Protocol definitions
MCTP uses AF_MCTP / PF_MCTP for the address- and protocol- families. Since MCTP is message-based, only SOCK_DGRAM sockets are supported.
    int sd = socket(AF_MCTP, SOCK_DGRAM, 0);
The only (current) value for the protocol argument is 0.
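In practice the socket() call can fail, for example with EAFNOSUPPORT on a kernel built without MCTP support, so callers will usually want to check the result. A minimal sketch, assuming a hypothetical wrapper named mctp_open_socket(); the AF_MCTP fallback value of 45 is only needed with C library headers that predate the address family.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Assumption: fallback for C library headers that do not define AF_MCTP;
 * 45 is the value used by the Linux kernel. */
#ifndef AF_MCTP
#define AF_MCTP 45
#endif

/* Hypothetical wrapper: returns a socket descriptor, or -1 with errno set. */
int mctp_open_socket(void)
{
	int sd = socket(AF_MCTP, SOCK_DGRAM, 0);

	if (sd < 0)
		perror("socket(AF_MCTP)");
	return sd;
}
```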
As with all socket address families, source and destination addresses are specified with a sockaddr type, with a single-byte endpoint address:
    typedef __u8 mctp_eid_t;

    struct mctp_addr {
            mctp_eid_t s_addr;
    };

    struct sockaddr_mctp {
            __kernel_sa_family_t smctp_family;
            unsigned int         smctp_network;
            struct mctp_addr     smctp_addr;
            __u8                 smctp_type;
            __u8                 smctp_tag;
    };

    #define MCTP_NET_ANY  0x0
    #define MCTP_ADDR_ANY 0xff
Syscall behaviour
The following sections describe the MCTP-specific behaviours of the standard socket system calls. These behaviours have been chosen to map closely to the existing sockets APIs.
bind() : set local socket address
Sockets that receive incoming request packets will bind to a local address, using the bind() syscall.
    struct sockaddr_mctp addr;

    addr.smctp_family = AF_MCTP;
    addr.smctp_network = MCTP_NET_ANY;
    addr.smctp_addr.s_addr = MCTP_ADDR_ANY;
    addr.smctp_type = MCTP_TYPE_PLDM;
    addr.smctp_tag = MCTP_TAG_OWNER;

    int rc = bind(sd, (struct sockaddr *)&addr, sizeof(addr));
This establishes the local address of the socket. Incoming MCTP messages that match the network, address, and message type will be received by this socket. The reference to ‘incoming’ is important here; a bound socket will only receive messages with the TO bit set, to indicate an incoming request message, rather than a response.
The smctp_tag value will configure the tags accepted from the remote side of this socket. Given the above, the only valid value is MCTP_TAG_OWNER, which will result in remotely “owned” tags being routed to this socket. Since MCTP_TAG_OWNER is set, the 3 least-significant bits of smctp_tag are not used; callers must set them to zero.
A smctp_network value of MCTP_NET_ANY will configure the socket to receive incoming packets from any locally-connected network. A specific network value will cause the socket to only receive incoming messages from that network.
The smctp_addr field specifies a local address to bind to. A value of MCTP_ADDR_ANY configures the socket to receive messages addressed to any local destination EID.
The smctp_type field specifies which message types to receive. Only the lower 7 bits of the type are matched on incoming messages (i.e., the most-significant IC bit is not part of the match). This results in the socket receiving packets with and without a message integrity check footer.
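The matching rules described above can be summarised in a few lines of C. This is an illustrative model only, not the kernel's implementation: struct bound_addr and bind_matches() are hypothetical names, and the two wildcard values are repeated from the definitions earlier in this document.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wildcard values as given in the protocol definitions section. */
#ifndef MCTP_NET_ANY
#define MCTP_NET_ANY	0x0
#endif
#ifndef MCTP_ADDR_ANY
#define MCTP_ADDR_ANY	0xff
#endif

/* Hypothetical model of a bound socket's address; not a kernel structure. */
struct bound_addr {
	unsigned int net;
	uint8_t eid;
	uint8_t type;
};

/* Illustrative only: would an incoming message be delivered to 'b'? */
bool bind_matches(const struct bound_addr *b, unsigned int net,
		  uint8_t dest_eid, uint8_t msg_type_byte, bool tag_owner)
{
	if (!tag_owner)		/* TO bit clear: a response, never matched */
		return false;
	if (b->net != MCTP_NET_ANY && b->net != net)
		return false;
	if (b->eid != MCTP_ADDR_ANY && b->eid != dest_eid)
		return false;
	/* Compare the lower 7 bits only; bit 7 is the IC bit. */
	return (b->type & 0x7f) == (msg_type_byte & 0x7f);
}
```

For a socket bound to type 0x01 with both wildcards, message type bytes 0x01 and 0x81 both match, while a message with the TO bit clear does not.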
sendto(), sendmsg(), send() : transmit an MCTP message
An MCTP message is transmitted using one of the sendto(), sendmsg() or send() syscalls. Using sendto() as the primary example:
    struct sockaddr_mctp addr;
    char buf[14];
    ssize_t len;

    /* set message destination */
    addr.smctp_family = AF_MCTP;
    addr.smctp_network = 0;
    addr.smctp_addr.s_addr = 8;
    addr.smctp_tag = MCTP_TAG_OWNER;
    addr.smctp_type = MCTP_TYPE_ECHO;

    /* arbitrary message to send, with message-type header */
    buf[0] = MCTP_TYPE_ECHO;
    memcpy(buf + 1, "hello, world!", sizeof(buf) - 1);

    /* sendto() takes a generic struct sockaddr pointer */
    len = sendto(sd, buf, sizeof(buf), 0,
                 (struct sockaddr *)&addr, sizeof(addr));
The network and address fields of addr define the remote address to send to. If smctp_tag has MCTP_TAG_OWNER set, the kernel will ignore any bits set in MCTP_TAG_VALUE, and generate a tag value suitable for the destination EID. If MCTP_TAG_OWNER is not set, the message will be sent with the tag value as specified. If a tag value cannot be allocated, the system call will report an errno of EAGAIN.
The application must provide the message type byte as the first byte of the message buffer passed to sendto(). If a message integrity check is to be included in the transmitted message, it must also be provided in the message buffer, and the most-significant bit of the message type byte must be 1.
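The buffer layout described above can be sketched as a small helper. mctp_build_msg() and MCTP_MSG_IC_BIT are hypothetical names introduced for illustration; the helper does not compute an integrity check, it only sets the IC bit when the caller states that the body already ends with one.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name for the most-significant bit of the type byte. */
#define MCTP_MSG_IC_BIT	0x80

/*
 * Hypothetical helper: writes the type byte followed by 'body' into 'buf'.
 * When has_ic is non-zero, 'body' must already end with the integrity check.
 * Returns the total message length, or 0 if 'buf' is too small.
 */
size_t mctp_build_msg(uint8_t *buf, size_t buflen, uint8_t type, int has_ic,
		      const void *body, size_t body_len)
{
	if (buflen < 1 || body_len > buflen - 1)
		return 0;

	buf[0] = (uint8_t)((type & 0x7f) | (has_ic ? MCTP_MSG_IC_BIT : 0));
	memcpy(buf + 1, body, body_len);
	return body_len + 1;
}
```

The returned length is the value that would be passed as the length argument to sendto().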
The sendmsg() system call allows a more compact argument interface, and the message buffer to be specified as a scatter-gather list. At present no ancillary message types (used for the msg_control data passed to sendmsg()) are defined.
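One use of the scatter-gather form is to keep the message type byte separate from the message body, avoiding a copy into a combined buffer. The sketch below uses only the standard sendmsg() interface; send_type_and_body() is a hypothetical helper, and dst is assumed to point to a populated struct sockaddr_mctp when used on an MCTP socket.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: sends one message made of a type byte and a separate
 * body buffer. 'dst' and 'dstlen' are passed through as the destination
 * address (msg_name / msg_namelen).
 */
ssize_t send_type_and_body(int sd, const struct sockaddr *dst, socklen_t dstlen,
			   uint8_t type, const void *body, size_t body_len)
{
	struct iovec iov[2];
	struct msghdr msg;

	iov[0].iov_base = &type;		/* first byte: message type */
	iov[0].iov_len = sizeof(type);
	iov[1].iov_base = (void *)body;		/* remainder: message body */
	iov[1].iov_len = body_len;

	memset(&msg, 0, sizeof(msg));
	msg.msg_name = (void *)dst;
	msg.msg_namelen = dstlen;
	msg.msg_iov = iov;
	msg.msg_iovlen = 2;
	/* No ancillary data: msg_control stays NULL. */

	return sendmsg(sd, &msg, 0);
}
```

The receiver sees a single message: the type byte followed by the body.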
Transmitting a message on an unconnected socket with MCTP_TAG_OWNER specified will cause an allocation of a tag, if no valid tag is already allocated for that destination. The (destination-eid,tag) tuple acts as an implicit local socket address, to allow the socket to receive responses to this outgoing message. If any previous allocation has been performed (for a different remote EID), that allocation is lost.
Sockets will only receive responses to requests they have sent (with TO=1) and may only respond (with TO=0) to requests they have received.
recvfrom(), recvmsg(), recv() : receive an MCTP message
An MCTP message can be received by an application using one of the recvfrom(), recvmsg(), or recv() system calls. Using recvfrom() as the primary example:
    struct sockaddr_mctp addr;
    socklen_t addrlen;
    char buf[14];
    ssize_t len;

    addrlen = sizeof(addr);

    len = recvfrom(sd, buf, sizeof(buf), 0,
                   (struct sockaddr *)&addr, &addrlen);

    /* We can expect addr to describe an MCTP address */
    assert(addrlen >= sizeof(addr));
    assert(addr.smctp_family == AF_MCTP);

    printf("received %zd bytes from remote EID %d\n",
           len, addr.smctp_addr.s_addr);
The address argument to recvfrom and recvmsg is populated with the remote address of the incoming message, including tag value (this will be needed in order to reply to the message).
The first byte of the message buffer will contain the message type byte. If an integrity check follows the message, it will be included in the received buffer.
The recv() system call behaves in a similar way, but does not provide a remote address to the application. Therefore, it is only useful if the remote address is already known, or the message does not require a reply.
Like the send calls, sockets will only receive responses to requests they have sent (TO=1) and may only respond (TO=0) to requests they have received.
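When replying to a received request, the tag from the received address is reused with the TO bit cleared. A minimal sketch, assuming a hypothetical helper named mctp_reply_tag(); the bit values are assumptions mirroring <linux/mctp.h> and are only defined here if that header has not been included.

```c
#include <stdint.h>

/* Assumption: values mirror <linux/mctp.h>; prefer that header if present. */
#ifndef MCTP_TAG_OWNER
#define MCTP_TAG_OWNER	0x08	/* the TO bit */
#endif

/*
 * Hypothetical helper: tag to place in smctp_tag when replying to a request
 * that was received with tag 'rx_tag'. The tag value is kept and the TO bit
 * is cleared.
 */
uint8_t mctp_reply_tag(uint8_t rx_tag)
{
	return (uint8_t)(rx_tag & ~MCTP_TAG_OWNER);
}
```

A responder would copy the address filled in by recvfrom(), apply this to its smctp_tag member, and pass the result to sendto().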
ioctl(SIOCMCTPALLOCTAG) and ioctl(SIOCMCTPDROPTAG)
These ioctls give applications more control over MCTP message tags, by allocating (and dropping) tag values explicitly, rather than the kernel automatically allocating a per-message tag at sendmsg() time.
In general, you will only need to use these ioctls if your MCTP protocol does not fit the usual request/response model. For example, if you need to persist tags across multiple requests, or a request may generate more than one response. In these cases, the ioctls allow you to decouple the tag allocation (and release) from individual message send and receive operations.
Both ioctls are passed a pointer to a struct mctp_ioc_tag_ctl:
    struct mctp_ioc_tag_ctl {
            mctp_eid_t peer_addr;
            __u8       tag;
            __u16      flags;
    };
SIOCMCTPALLOCTAG allocates a tag for a specific peer, which an application can use in future sendmsg() calls. The application populates the peer_addr member with the remote EID. Other fields must be zero.
On return, the tag member will be populated with the allocated tag value. The allocated tag will have the following tag bits set:
MCTP_TAG_OWNER: it only makes sense to allocate tags if you’re the tag owner
MCTP_TAG_PREALLOC: to indicate to sendmsg() that this is a preallocated tag.
... and the actual tag value, within the least-significant three bits (MCTP_TAG_MASK). Note that zero is a valid tag value.
The tag value should be used as-is for the smctp_tag member of struct sockaddr_mctp.
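Although the returned tag is passed back unchanged, it can be useful to decode it for logging or sanity checks. A minimal sketch of the bit layout described above; tag_is_preallocated() and tag_value() are hypothetical helpers, and the numeric bit values are assumptions mirroring <linux/mctp.h>.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: values mirror <linux/mctp.h>; prefer that header if present. */
#ifndef MCTP_TAG_MASK
#define MCTP_TAG_MASK		0x07
#endif
#ifndef MCTP_TAG_OWNER
#define MCTP_TAG_OWNER		0x08
#endif
#ifndef MCTP_TAG_PREALLOC
#define MCTP_TAG_PREALLOC	0x10
#endif

/* True if 'tag' has both flag bits expected of an allocated tag. */
bool tag_is_preallocated(uint8_t tag)
{
	const uint8_t want = MCTP_TAG_OWNER | MCTP_TAG_PREALLOC;

	return (tag & want) == want;
}

/* The 3-bit tag value; note that 0 is a valid value. */
uint8_t tag_value(uint8_t tag)
{
	return tag & MCTP_TAG_MASK;
}
```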
SIOCMCTPDROPTAG releases a tag that has been previously allocated by a SIOCMCTPALLOCTAG ioctl. The peer_addr must be the same as used for the allocation, and the tag value must match exactly the tag returned from the allocation (including the MCTP_TAG_OWNER and MCTP_TAG_PREALLOC bits). The flags field must be zero.
Kernel internals
There are a few possible packet flows in the MCTP stack:
local TX to remote endpoint, message <= MTU:
    sendmsg()
     -> mctp_local_output()
        : route lookup
        -> rt->output() (== mctp_route_output)
           -> dev_queue_xmit()
local TX to remote endpoint, message > MTU:
    sendmsg()
    -> mctp_local_output()
       -> mctp_do_fragment_route()
          : creates packet-sized skbs. For each new skb:
          -> rt->output() (== mctp_route_output)
             -> dev_queue_xmit()
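As a rough guide to how many packet-sized skbs the fragmentation step produces, the sketch below computes the packet count for a message. It is an illustration under stated assumptions, not the kernel's code: mctp_packet_count() is a hypothetical helper, the MCTP transport header is assumed to be 4 bytes, and the MTU is assumed to count that header plus the payload.

```c
#include <stddef.h>

/* Assumption: 4-byte MCTP transport header on every packet. */
#define MCTP_HDR_LEN	4

/*
 * Hypothetical helper: number of packets needed for a message of 'msg_len'
 * bytes, assuming 'mtu' covers the transport header plus payload.
 * Returns 0 if the MTU leaves no room for payload.
 */
size_t mctp_packet_count(size_t msg_len, size_t mtu)
{
	size_t payload;

	if (mtu <= MCTP_HDR_LEN)
		return 0;

	payload = mtu - MCTP_HDR_LEN;
	/* Round up: a partial final packet still counts as a packet. */
	return (msg_len + payload - 1) / payload;
}
```

Under these assumptions, a 200-byte message over a 68-byte MTU is carried in four packets of up to 64 payload bytes each.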
remote TX to local endpoint, single-packet message:
    mctp_pkttype_receive()
    : route lookup
    -> rt->output() (== mctp_route_input)
       : sk_key lookup
       -> sock_queue_rcv_skb()
remote TX to local endpoint, multiple-packet message:
    mctp_pkttype_receive()
    : route lookup
    -> rt->output() (== mctp_route_input)
       : sk_key lookup
       : stores skb in struct sk_key->reasm_head

    mctp_pkttype_receive()
    : route lookup
    -> rt->output() (== mctp_route_input)
       : sk_key lookup
       : finds existing reassembly in sk_key->reasm_head
       : appends new fragment
       -> sock_queue_rcv_skb()
Key refcounts
Keys are referenced by:
an skb: during route output, stored in skb->cb.
netns and sock lists.
Keys can be associated with a device, in which case they hold a reference to the dev (set through key->dev, counted through dev->key_count). Multiple keys can reference the device.