Multipath TCP (MPTCP)
Introduction
Multipath TCP, or MPTCP, is an extension to standard TCP and is described in RFC 8684 (MPTCPv1). It allows a device to use multiple interfaces at once to send and receive TCP packets over a single MPTCP connection. MPTCP can aggregate the bandwidth of multiple interfaces or prefer the one with the lowest latency. It also allows failover: if one path goes down, the traffic is seamlessly reinjected on the other paths.
For more details about Multipath TCP in the Linux kernel, please see the official website: mptcp.dev.
Use cases
Compared to TCP, the ability of MPTCP to use multiple paths in parallel enables new use cases:
Seamless handovers: switching from one path to another while preserving established connections, e.g. for mobility use cases such as smartphones.
Best network selection: using the “best” available path depending on some conditions, e.g. latency, losses, cost, bandwidth, etc.
Network aggregation: using multiple paths at the same time to achieve a higher throughput, e.g. to combine fixed and mobile networks to send files faster.
Concepts
Technically, when a new socket is created with the IPPROTO_MPTCP protocol (Linux-specific), a subflow (or path) is created. This subflow consists of a regular TCP connection that is used to transmit data through one interface. Additional subflows can be negotiated later between the hosts. For the remote host to be able to detect the use of MPTCP, a new field is added to the TCP option field of the underlying TCP subflow. This field contains, amongst other things, an MP_CAPABLE option that tells the other host to use MPTCP if it is supported. If the remote host or any middlebox in between does not support it, the returned SYN+ACK packet will not contain MPTCP options in the TCP option field. In that case, the connection will be “downgraded” to plain TCP, and it will continue with a single path.
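As a rough illustration of the detection step described above, the sketch below walks a raw TCP option field and reports the subtype of the first MPTCP option it finds. It relies on the option kind 30 and the MP_CAPABLE subtype 0 defined in RFC 8684. The helper name find_mptcp_subtype() and the constant names are hypothetical and are not part of any kernel or libc API; this is not how the kernel itself parses options.

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_KIND_EOL        0    /* end of option list */
#define OPT_KIND_NOP        1    /* one-byte padding option */
#define OPT_KIND_MPTCP      30   /* TCP option kind used by MPTCP (RFC 8684) */
#define MPTCP_SUB_CAPABLE   0x0  /* MP_CAPABLE subtype (RFC 8684) */

/*
 * Hypothetical helper: scan a TCP option field and return the subtype of
 * the first MPTCP option (upper 4 bits of its third byte), or -1 if no
 * well-formed MPTCP option is present.
 */
int find_mptcp_subtype(const uint8_t *opts, size_t len)
{
    size_t i = 0;

    while (i < len) {
        uint8_t kind = opts[i];
        uint8_t optlen;

        if (kind == OPT_KIND_EOL)
            break;
        if (kind == OPT_KIND_NOP) {
            i++;
            continue;
        }
        if (i + 1 >= len)
            break;                      /* length byte is missing */
        optlen = opts[i + 1];
        if (optlen < 2 || i + optlen > len)
            break;                      /* malformed or truncated option */
        if (kind == OPT_KIND_MPTCP && optlen >= 3)
            return opts[i + 2] >> 4;    /* subtype lives in the high nibble */
        i += optlen;
    }
    return -1;
}
```

A SYN+ACK whose options make this helper return -1 corresponds to the "downgraded" case: the peer or a middlebox did not echo any MPTCP option.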
This behavior is made possible by two internal components: the path manager and the packet scheduler.
Path Manager
The Path Manager is in charge of subflows, from creation to deletion, and also of address announcements. Typically, it is the client side that initiates subflows, and the server side that announces additional addresses via the ADD_ADDR and REMOVE_ADDR options.
Path managers are controlled by the net.mptcp.path_manager sysctl knob; see MPTCP Sysfs variables. There are two types: the in-kernel one (kernel), where the same rules are applied to all connections (see: ip mptcp); and the userspace one (userspace), controlled by a userspace daemon (e.g. mptcpd), where different rules can be applied for each connection. The path managers can be controlled via a Netlink API; see Family mptcp_pm netlink specification.
To be able to use multiple IP addresses on a host to create multiple subflows (paths), the default in-kernel MPTCP path manager needs to know which IP addresses can be used. This can be configured with ip mptcp endpoint, for example.
Packet Scheduler
The Packet Scheduler is in charge of selecting which available subflow(s) to use to send the next data packet. It can decide to maximize the use of the available bandwidth, to pick only the path with the lowest latency, or to apply any other policy depending on the configuration.
Packet schedulers are controlled by the net.mptcp.scheduler sysctl knob; see MPTCP Sysfs variables.
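The path manager and scheduler knobs named above can also be inspected from a program. The following is a minimal sketch, assuming the usual mapping of sysctl names to files under /proc/sys (so net.mptcp.scheduler becomes /proc/sys/net/mptcp/scheduler). The helper name read_mptcp_sysctl() is hypothetical, and a knob may be absent on kernels that do not provide it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the first line of /proc/sys/net/mptcp/<knob>
 * into buf, without the trailing newline.
 * Returns 0 on success, -1 if the knob cannot be read.
 */
int read_mptcp_sysctl(const char *knob, char *buf, size_t size)
{
    char path[128];
    FILE *f;
    int n;

    if (size == 0)
        return -1;

    n = snprintf(path, sizeof(path), "/proc/sys/net/mptcp/%s", knob);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;                  /* knob name too long */

    f = fopen(path, "r");
    if (!f)
        return -1;                  /* knob missing or not readable */

    if (!fgets(buf, (int)size, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0'; /* strip the trailing newline */
    return 0;
}
```

For example, read_mptcp_sysctl("scheduler", buf, sizeof(buf)) would fill buf with the name of the active packet scheduler on a kernel that exposes this knob.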
Sockets API
Creating MPTCP sockets
On Linux, MPTCP can be used by selecting MPTCP instead of TCP when creating the socket:
int sd = socket(AF_INET(6), SOCK_STREAM, IPPROTO_MPTCP);
Note that IPPROTO_MPTCP is defined as 262.
If MPTCP is not supported, errno will be set to:
EINVAL (Invalid argument): MPTCP is not available, on kernels < 5.6.
EPROTONOSUPPORT (Protocol not supported): MPTCP has not been compiled, on kernels >= 5.6.
ENOPROTOOPT (Protocol not available): MPTCP has been disabled using the net.mptcp.enabled sysctl knob; see MPTCP Sysfs variables.
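One way an application might handle these error codes is to retry with plain TCP when MPTCP is unavailable. The sketch below assumes that such a silent fallback is acceptable for the application; the helper name open_stream_socket() is hypothetical. The fallback definition of IPPROTO_MPTCP uses the value 262 given above, for libc headers that do not define it yet.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262   /* value given in the text; older libc headers lack it */
#endif

/*
 * Hypothetical helper: try to create an MPTCP socket and fall back to TCP
 * if the kernel reports that MPTCP is not supported or is disabled.
 * On success, returns the descriptor and sets *is_mptcp to 1 or 0.
 * On failure, returns -1 with errno set by socket().
 */
int open_stream_socket(int family, int *is_mptcp)
{
    int sd = socket(family, SOCK_STREAM, IPPROTO_MPTCP);

    if (sd >= 0) {
        *is_mptcp = 1;
        return sd;
    }

    /* Only the three "MPTCP not usable" errors listed above trigger a fallback. */
    if (errno != EINVAL && errno != EPROTONOSUPPORT && errno != ENOPROTOOPT)
        return -1;

    *is_mptcp = 0;
    return socket(family, SOCK_STREAM, IPPROTO_TCP);
}
```

Any other error (for example running out of file descriptors) is returned to the caller unchanged, since retrying with TCP would not help in that case.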
MPTCP is therefore opt-in: applications need to explicitly request it. Note that applications can be forced to use MPTCP with different techniques, e.g. LD_PRELOAD (see mptcpize), eBPF (see mptcpify), SystemTAP, GODEBUG (GODEBUG=multipathtcp=1), etc.
Switching to IPPROTO_MPTCP instead of IPPROTO_TCP should be as transparent as possible for userspace applications.
Socket options
MPTCP supports most socket options handled by TCP. Some less common options might not be supported, but contributions are welcome.
Generally, the same value is propagated to all subflows, including the ones created after the calls to setsockopt(). eBPF can be used to set different values per subflow.
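From the application's point of view, setting an option on an MPTCP socket looks the same as on a TCP socket: a single setsockopt() call on the userspace-facing descriptor. The sketch below uses SO_KEEPALIVE purely as an example and assumes the running kernel supports that option on MPTCP sockets; the helper name enable_keepalive() is hypothetical.

```c
#include <sys/socket.h>

/*
 * Hypothetical helper: enable keepalive probes on a socket.
 * When sd is an MPTCP socket, the call is made once on the MPTCP
 * descriptor; propagating the value to subflows is left to the kernel.
 * Returns 0 on success, -1 on error (errno set by setsockopt()).
 */
int enable_keepalive(int sd)
{
    int on = 1;

    return setsockopt(sd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}
```

If a given option is not supported on MPTCP sockets, the call is expected to fail with an error rather than be silently ignored, so checking the return value is worthwhile.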
There are some MPTCP-specific socket options at the SOL_MPTCP (284) level to retrieve info. They fill the optval buffer of the getsockopt() system call:
MPTCP_INFO: Uses struct mptcp_info.
MPTCP_TCPINFO: Uses struct mptcp_subflow_data, followed by an array of struct tcp_info.
MPTCP_SUBFLOW_ADDRS: Uses struct mptcp_subflow_data, followed by an array of struct mptcp_subflow_addrs.
MPTCP_FULL_INFO: Uses struct mptcp_full_info, with one pointer to an array of struct mptcp_subflow_info (including the struct mptcp_subflow_addrs), and one pointer to an array of struct tcp_info, followed by the content of struct mptcp_info.
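As an illustration of the simplest option in the list above, the sketch below queries MPTCP_INFO and returns the mptcpi_subflows counter of struct mptcp_info. It assumes kernel headers that ship <linux/mptcp.h>. The fallback definitions are for older headers: 284 is the SOL_MPTCP value given above, and 1 for MPTCP_INFO is an assumption based on the kernel's UAPI header. The helper name mptcp_info_subflows() is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/mptcp.h>    /* struct mptcp_info */

#ifndef SOL_MPTCP
#define SOL_MPTCP 284       /* value given in the text */
#endif
#ifndef MPTCP_INFO
#define MPTCP_INFO 1        /* assumed value, for headers that lack it */
#endif

/*
 * Hypothetical helper: return the mptcpi_subflows counter reported by the
 * kernel for an MPTCP socket, or -1 if the query fails (for instance when
 * sd is not an MPTCP socket).
 */
int mptcp_info_subflows(int sd)
{
    struct mptcp_info info;
    socklen_t len = sizeof(info);

    memset(&info, 0, sizeof(info));
    if (getsockopt(sd, SOL_MPTCP, MPTCP_INFO, &info, &len) < 0)
        return -1;

    return info.mptcpi_subflows;
}
```

The other three options follow the same getsockopt() pattern but need larger buffers, because they return one entry per subflow.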
Note that at the TCP level, the TCP_IS_MPTCP socket option can be used to know if MPTCP is currently being used: the value will be set to 1 if it is.
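A minimal sketch of such a check is shown below. The fallback value 43 for TCP_IS_MPTCP is an assumption taken from the kernel's <linux/tcp.h>, for libc headers that do not define the option yet; the helper name socket_uses_mptcp() is hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#ifndef TCP_IS_MPTCP
#define TCP_IS_MPTCP 43     /* assumed value from <linux/tcp.h>, for older libc headers */
#endif

/*
 * Hypothetical helper: report whether MPTCP is in use on a socket.
 * Returns 1 if MPTCP is being used, 0 if it is not, and -1 if the
 * query fails (e.g. the kernel does not know this option).
 */
int socket_uses_mptcp(int sd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(sd, IPPROTO_TCP, TCP_IS_MPTCP, &val, &len) < 0)
        return -1;

    return val ? 1 : 0;
}
```

A server could call this on a descriptor returned by accept() to see whether a given client negotiated MPTCP or connected with plain TCP.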
Design choices
A new socket type has been added for MPTCP for the userspace-facing socket. The kernel is in charge of creating subflow sockets: they are TCP sockets where the behavior is modified using TCP-ULP.
MPTCP listen sockets will create “plain” accepted TCP sockets if the connection request from the client didn’t ask for MPTCP, making the performance impact minimal when MPTCP is enabled by default.