IPVLAN Driver HOWTO
- Initial Release:
Mahesh Bandewar <maheshb AT google.com>
1. Introduction:
This driver is conceptually very similar to the macvlan driver, with one major exception: it uses L3 for mux-ing / demux-ing among slaves. Because of this property, the master device shares its L2 with its slave devices. I developed this driver in conjunction with network namespaces and am not sure whether there is a use case outside of them.
2. Building and Installation:
In order to build the driver, please select the config item CONFIG_IPVLAN. The driver can be built into the kernel (CONFIG_IPVLAN=y) or as a module (CONFIG_IPVLAN=m).
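A quick way to see which of the two options a given kernel was built with is to inspect its config file. The following is a minimal sketch, not part of the driver: the function name `ipvlan_config_state` is hypothetical, and the `/boot/config-$(uname -r)` path is an assumption that holds on many distributions but not all.

```shell
#!/bin/sh
# Report how IPVLAN is configured in a kernel config file.
# Usage: ipvlan_config_state <path-to-kernel-config>
# (ipvlan_config_state is an illustrative helper, not an existing tool.)
ipvlan_config_state() {
    config_file=$1
    if [ ! -r "$config_file" ]; then
        # Config file missing or unreadable: nothing can be concluded.
        echo "unknown"
        return 1
    fi
    if grep -q '^CONFIG_IPVLAN=y$' "$config_file"; then
        echo "built-in"
    elif grep -q '^CONFIG_IPVLAN=m$' "$config_file"; then
        echo "module"
    else
        echo "disabled"
    fi
}

# Example call. The path below is an assumption; it varies by distribution.
ipvlan_config_state "/boot/config-$(uname -r)" || true
```

If the result is "module", the driver is typically loaded with `modprobe ipvlan` before or when the first IPvlan link is created.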
3. Configuration:
There are no module parameters for this driver; it is configured using the IProute2 ip utility.
    ip link add link <master> name <slave> type ipvlan [ mode MODE ] [ FLAGS ]
        where
            MODE: l3 (default) | l3s | l2
            FLAGS: bridge (default) | private | vepa
e.g.
The following will create an IPvlan link with eth0 as master in L3 bridge mode:
    bash# ip link add link eth0 name ipvl0 type ipvlan
This command will create an IPvlan link in L2 bridge mode:
    bash# ip link add link eth0 name ipvl0 type ipvlan mode l2 bridge
This command will create an IPvlan device in L2 private mode:
    bash# ip link add link eth0 name ipvlan type ipvlan mode l2 private
This command will create an IPvlan device in L2 vepa mode:
    bash# ip link add link eth0 name ipvlan type ipvlan mode l2 vepa
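The syntax above can be wrapped in a small helper that fills in the documented defaults and rejects values outside the listed MODE and FLAGS sets. This is only a sketch: the function name `ipvlan_add_cmd` is hypothetical, and it prints the command instead of running it so that it can be tried without root privileges.

```shell
#!/bin/sh
# Print (do not run) the "ip link add" command for an IPvlan slave.
# Usage: ipvlan_add_cmd <master> <slave> [mode] [flag]
# (ipvlan_add_cmd is an illustrative helper, not part of iproute2.)
ipvlan_add_cmd() {
    if [ $# -lt 2 ]; then
        echo "usage: ipvlan_add_cmd <master> <slave> [mode] [flag]" >&2
        return 2
    fi
    master=$1
    slave=$2
    mode=${3:-l3}        # l3 is the documented default mode
    flag=${4:-bridge}    # bridge is the documented default flag

    # Accept only the modes and flags listed in the syntax above.
    case $mode in
        l2|l3|l3s) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    case $flag in
        bridge|private|vepa) ;;
        *) echo "invalid flag: $flag" >&2; return 1 ;;
    esac

    echo "ip link add link $master name $slave type ipvlan mode $mode $flag"
}

# Examples: defaults only, then an explicit L2 private port.
ipvlan_add_cmd eth0 ipvl0
ipvlan_add_cmd eth0 ipvl1 l2 private
```

Spelling out the defaults in the printed command is a design choice made here for readability; omitting `mode` and the flag yields the same L3 bridge configuration.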
4. Operating modes:
IPvlan has two basic modes of operation: L2 and L3 (L3S, described in 4.3, is a variant of L3). For a given master device, you can select one of these modes, and all slaves on that master will operate in the same (selected) mode. The RX mode is almost identical, except that in L3 mode the slaves won’t receive any multicast / broadcast traffic. L3 mode is more restrictive since routing is controlled from the other (mostly) default namespace.
4.1 L2 mode:
In this mode TX processing happens on the stack instance attached to the slave device, and packets are switched and queued to the master device to send out. In this mode the slaves will RX/TX multicast and broadcast (if applicable) as well.
4.2 L3 mode:
In this mode TX processing up to L3 happens on the stack instance attached to the slave device, and packets are then switched to the stack instance of the master device for L2 processing; routing from that instance is used before packets are queued on the outbound device. In this mode the slaves will neither receive nor send multicast / broadcast traffic.
4.3 L3S mode:
This is very similar to the L3 mode except that iptables (conn-tracking) works in this mode, and hence it is L3-symmetric (L3s). This mode has slightly lower performance, but that shouldn’t matter since you are choosing it over plain L3 mode to make conn-tracking work.
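The trade-offs between the three modes described above can be condensed into a small decision helper. This is a simplified sketch under stated assumptions: the function name `ipvlan_pick_mode` and its yes/no arguments are hypothetical, and it gives the multicast / broadcast requirement priority, since only L2 mode is described as carrying that traffic on the slaves.

```shell
#!/bin/sh
# Pick an IPvlan mode from two yes/no requirements.
# Usage: ipvlan_pick_mode <needs_mcast: yes|no> <needs_conntrack: yes|no>
# (ipvlan_pick_mode is an illustrative helper, not an existing tool.)
ipvlan_pick_mode() {
    needs_mcast=$1
    needs_conntrack=$2
    if [ "$needs_mcast" = yes ]; then
        # Only L2 mode lets slaves send and receive multicast / broadcast.
        echo l2
    elif [ "$needs_conntrack" = yes ]; then
        # L3S trades a little performance for working conn-tracking.
        echo l3s
    else
        # Plain L3 is the default mode.
        echo l3
    fi
}

# Example: a slave that needs conn-tracking but no multicast.
ipvlan_pick_mode no yes
```

Because all slaves on a master share one mode, the answer should reflect the most demanding slave on that master.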
5. Mode flags:
At this time the following mode flags are available:
5.1 bridge:
This is the default option. To configure the IPvlan port in this mode, the user can either add this option on the command line or specify nothing. This is the traditional mode where slaves can cross-talk among themselves in addition to talking through the master device.
5.2 private:
If this option is added to the command line, the port is set in private mode, i.e. the port won’t allow cross communication between slaves.
5.3 vepa:
If this option is added to the command line, the port is set in VEPA mode, i.e. the port will offload switching functionality to the external entity as described in 802.1Qbg.
Note: VEPA mode in IPvlan has limitations. IPvlan uses the MAC address of the master device, so packets emitted in this mode for the adjacent neighbor will have the same source and destination MAC. This will make the switch / router send a redirect message.
6. What to choose (macvlan vs. ipvlan)?
These two devices are very similar in many regards, and the specific use case could very well define which device to choose. If one of the following situations defines your use case, then you can choose to use ipvlan:
(a) The Linux host that is connected to the external switch / router has a policy configured that allows only one MAC per port.
(b) The number of virtual devices created on a master exceeds the MAC capacity, which puts the NIC in promiscuous mode, and degraded performance is a concern.
(c) The slave device is to be put into a hostile / untrusted network namespace where L2 on the slave could be changed / misused.
7. Example configuration:
+=============================================================+
|  Host: host1                                                |
|                                                             |
|   +----------------------+      +----------------------+    |
|   |   NS:ns0             |      |  NS:ns1              |    |
|   |                      |      |                      |    |
|   |                      |      |                      |    |
|   |        ipvl0         |      |         ipvl1        |    |
|   +----------#-----------+      +-----------#----------+    |
|              #                              #               |
|              ################################               |
|                              # eth0                         |
+==============================#==============================+
Create two network namespaces - ns0, ns1:
    ip netns add ns0
    ip netns add ns1
Create two ipvlan slaves on eth0 (master device):
    ip link add link eth0 ipvl0 type ipvlan mode l2
    ip link add link eth0 ipvl1 type ipvlan mode l2
Assign slaves to the respective network namespaces:
    ip link set dev ipvl0 netns ns0
    ip link set dev ipvl1 netns ns1
Now switch to each namespace (ns0 or ns1) to configure its slave device:
For ns0:
    (1) ip netns exec ns0 bash
    (2) ip link set dev ipvl0 up
    (3) ip link set dev lo up
    (4) ip -4 addr add 127.0.0.1 dev lo
    (5) ip -4 addr add $IPADDR dev ipvl0
    (6) ip -4 route add default via $ROUTER dev ipvl0
For ns1:
    (1) ip netns exec ns1 bash
    (2) ip link set dev ipvl1 up
    (3) ip link set dev lo up
    (4) ip -4 addr add 127.0.0.1 dev lo
    (5) ip -4 addr add $IPADDR dev ipvl1
    (6) ip -4 route add default via $ROUTER dev ipvl1
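The steps above can also be run non-interactively by prefixing each per-namespace command with `ip netns exec <ns>` instead of opening a shell in the namespace. The following is a sketch under stated assumptions: the `setup_ns` function and the `MASTER` and `APPLY` variables are hypothetical, and the 192.0.2.x addresses are documentation-range placeholders standing in for $IPADDR and $ROUTER. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of the example configuration as one script.
# Commands are printed unless APPLY=1 is set (then run them as root).
# MASTER, APPLY and setup_ns are illustrative names, not existing tools.
MASTER=${MASTER:-eth0}
if [ "${APPLY:-0}" = 1 ]; then RUN=""; else RUN="echo"; fi

# Usage: setup_ns <namespace> <slave-dev> <ip-address> <router>
setup_ns() {
    ns=$1
    dev=$2
    addr=$3
    router=$4
    $RUN ip netns add "$ns"
    $RUN ip link add link "$MASTER" name "$dev" type ipvlan mode l2
    $RUN ip link set dev "$dev" netns "$ns"
    # Steps (2)-(6) from the example, executed inside the namespace.
    $RUN ip netns exec "$ns" ip link set dev "$dev" up
    $RUN ip netns exec "$ns" ip link set dev lo up
    $RUN ip netns exec "$ns" ip -4 addr add 127.0.0.1 dev lo
    $RUN ip netns exec "$ns" ip -4 addr add "$addr" dev "$dev"
    $RUN ip netns exec "$ns" ip -4 route add default via "$router" dev "$dev"
}

# Placeholder addresses (assumptions); substitute real values.
setup_ns ns0 ipvl0 192.0.2.10/24 192.0.2.1
setup_ns ns1 ipvl1 192.0.2.11/24 192.0.2.1
```

Printing by default makes it possible to review the exact command sequence before applying it to a live host.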