| Netfilter | |
|---|---|
| Initial release | 26 August 1999 (Linux 2.3.15) |
| Stable release | |
| Written in | C |
| Operating system | Linux |
| Type | |
| License | GNU GPL |
| Website | netfilter |
Netfilter is a framework provided by the Linux kernel that allows various networking-related operations to be implemented in the form of customized handlers. Netfilter offers various functions and operations for packet filtering, network address translation, and port translation, which provide the functionality required for directing packets through a network and prohibiting packets from reaching sensitive locations within a network.
Netfilter represents a set of hooks inside the Linux kernel, allowing specific kernel modules to register callback functions with the kernel's networking stack. Those functions, usually applied to the traffic in the form of filtering and modification rules, are called for every packet that traverses the respective hook within the networking stack.[2]
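The hook-and-callback idea can be pictured with a small user-space simulation. This is only a sketch, not kernel code: `HookRegistry`, `register`, `traverse`, the priority convention (lower value runs first), and the `ACCEPT`/`DROP` strings are hypothetical names chosen for illustration.

```python
from collections import defaultdict

ACCEPT, DROP = "ACCEPT", "DROP"


class HookRegistry:
    """Toy model of hook points that hold prioritised callbacks."""

    def __init__(self):
        # hook name -> list of (priority, callback), kept sorted by priority
        self._hooks = defaultdict(list)

    def register(self, hook, priority, callback):
        self._hooks[hook].append((priority, callback))
        self._hooks[hook].sort(key=lambda entry: entry[0])

    def traverse(self, hook, packet):
        """Call every callback registered on the hook; stop at the first DROP."""
        for _, callback in self._hooks[hook]:
            if callback(packet) == DROP:
                return DROP
        return ACCEPT


seen = []


def log_packet(packet):
    # Records the destination port and lets the packet continue.
    seen.append(packet["dport"])
    return ACCEPT


def block_telnet(packet):
    return DROP if packet["dport"] == 23 else ACCEPT


registry = HookRegistry()
registry.register("INPUT", 10, block_telnet)
registry.register("INPUT", 0, log_packet)  # lower value, so it runs first

verdict_ssh = registry.traverse("INPUT", {"dport": 22})
verdict_telnet = registry.traverse("INPUT", {"dport": 23})
```

Every packet passed to `traverse` visits each registered callback in priority order, which mirrors the statement that the callbacks are called for every packet traversing the respective hook.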

Rusty Russell started the netfilter/iptables project in 1998; he had also authored the project's predecessor, ipchains. As the project grew, he founded the Netfilter Core Team (or simply coreteam) in 1999. The software they produced (called netfilter hereafter) uses the GNU General Public License (GPL), and on 26 August 1999 it was merged into version 2.3.15 of the Linux kernel mainline and thus was included in the 2.4.0 stable version.[3]
In August 2003, Harald Welte became chairman of the coreteam. In April 2004, following a crackdown by the project on those distributing the project's software embedded in routers without complying with the GPL, a German court granted Welte a historic injunction against Sitecom Germany, which refused to follow the GPL's terms (see GPL-related disputes). In September 2007, Patrick McHardy, who had led development in the preceding years, was elected as the new chairman of the coreteam.
Prior to iptables, the predominant software packages for creating Linux firewalls were ipchains in Linux kernel 2.2.x and ipfwadm in Linux kernel 2.0.x,[3] which in turn was based on BSD's ipfw. Both ipchains and ipfwadm alter the networking code so they can manipulate packets, as the Linux kernel lacked a general packet control framework until the introduction of Netfilter.
Whereas ipchains and ipfwadm combine packet filtering and NAT (particularly three specific kinds of NAT, called masquerading, port forwarding, and redirection), Netfilter separates packet operations into multiple parts, described below. Each connects to the Netfilter hooks at different points to access packets. The connection tracking and NAT subsystems are more general and more powerful than the rudimentary versions within ipchains and ipfwadm.
In 2017, IPv4 and IPv6 flow offload infrastructure was added, allowing a speedup of software flow table forwarding and hardware offload support.[4][5]

The kernel modules named ip_tables, ip6_tables, arp_tables (the underscore is part of the name), and ebtables comprise the legacy packet filtering portion of the Netfilter hook system. They provide a table-based system for defining firewall rules that can filter or transform packets. The tables can be administered through the user-space tools iptables, ip6tables, arptables, and ebtables. Notice that although both the kernel modules and userspace utilities have similar names, each of them is a different entity with different functionality.
Each table is actually its own hook, and each table was introduced to serve a specific purpose. As far as Netfilter is concerned, it runs a particular table in a specific order with respect to other tables. Any table can call itself and it also can execute its own rules, which enables possibilities for additional processing and iteration.
Rules are organized into chains, or in other words, "chains of rules". These chains are named with predefined titles, including INPUT, OUTPUT and FORWARD. These chain titles help describe the origin in the Netfilter stack. Packet reception, for example, falls into PREROUTING, while INPUT represents locally delivered data, and forwarded traffic falls into the FORWARD chain. Locally generated output passes through the OUTPUT chain, and packets to be sent out are in the POSTROUTING chain.
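The relationship between a packet's origin and the chains it passes through can be summarised in a few lines. The sketch below is a simplification under stated assumptions: `chains_traversed` and its two boolean parameters are hypothetical, and traffic that a host sends to itself over loopback is ignored.

```python
def chains_traversed(locally_generated, destined_for_local_host):
    """Return the built-in chains a packet visits, in order (simplified)."""
    if locally_generated:
        # Created by a local process and sent out.
        return ["OUTPUT", "POSTROUTING"]
    if destined_for_local_host:
        # Received from the network and delivered locally.
        return ["PREROUTING", "INPUT"]
    # Received from the network and routed onwards.
    return ["PREROUTING", "FORWARD", "POSTROUTING"]


incoming = chains_traversed(locally_generated=False, destined_for_local_host=True)
forwarded = chains_traversed(locally_generated=False, destined_for_local_host=False)
outgoing = chains_traversed(locally_generated=True, destined_for_local_host=False)
```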
Netfilter modules not organized into tables (see below) are capable of checking for the origin to select their mode of operation.
The modules that provide the individual tables are:
- iptable_raw module
- iptable_mangle module
- iptable_nat module
- iptable_filter module
- security_filter module: used for Mandatory Access Control (MAC) networking rules, such as those enabled by the SECMARK and CONNSECMARK targets. (These so-called "targets" refer to Security-Enhanced Linux markers.) Mandatory Access Control is implemented by Linux Security Modules such as SELinux. The security table is called following the call of the filter table, allowing any Discretionary Access Control (DAC) rules in the filter table to take effect before any MAC rules. This table provides the following built-in chains: INPUT (for packets coming into the computer itself), OUTPUT (for altering locally-generated packets before routing), and FORWARD (for altering packets being routed through the computer).

nftables is the new packet-filtering portion of Netfilter. nft is the new userspace utility that replaces iptables, ip6tables, arptables and ebtables.
The nftables kernel engine adds a simple virtual machine into the Linux kernel, which is able to execute bytecode to inspect a network packet and make decisions on how that packet should be handled. The operations implemented by this virtual machine are intentionally made basic: it can get data from the packet itself, have a look at the associated metadata (inbound interface, for example), and manage connection tracking data. Arithmetic, bitwise and comparison operators can be used for making decisions based on that data. The virtual machine is also capable of manipulating sets of data (typically IP addresses), allowing multiple comparison operations to be replaced with a single set lookup.[6]
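A toy interpreter makes the idea of such a virtual machine concrete. The sketch below is not the nftables instruction set: the opcode names (`load_payload`, `load_meta`, `bitand`, `cmp_eq`, `lookup`, `verdict`), the register numbering, and the dictionary-based packet are all assumptions made for illustration.

```python
def run_rule(program, packet, meta):
    """Run one rule; return a verdict string, or None if the rule did not match."""
    registers = {}
    for op, *args in program:
        if op == "load_payload":      # copy a packet field into a register
            field, reg = args
            registers[reg] = packet[field]
        elif op == "load_meta":       # copy packet metadata into a register
            key, reg = args
            registers[reg] = meta[key]
        elif op == "bitand":          # bitwise operation on a register
            reg, mask = args
            registers[reg] &= mask
        elif op == "cmp_eq":          # a failed comparison ends the rule
            reg, value = args
            if registers[reg] != value:
                return None
        elif op == "lookup":          # one set lookup replaces many comparisons
            reg, members = args
            if registers[reg] not in members:
                return None
        elif op == "verdict":
            return args[0]
        else:
            raise ValueError(f"unknown op: {op}")
    return None


blocked = frozenset({"192.0.2.1", "192.0.2.7", "198.51.100.9"})
program = [
    ("load_meta", "iif", 1),
    ("cmp_eq", 1, "eth0"),
    ("load_payload", "saddr", 2),
    ("lookup", 2, blocked),
    ("verdict", "drop"),
]

hit = run_rule(program, {"saddr": "192.0.2.7"}, {"iif": "eth0"})
miss = run_rule(program, {"saddr": "203.0.113.5"}, {"iif": "eth0"})
```

Because the interpreter only knows about generic loads, comparisons and lookups, the same engine can evaluate rules for any protocol whose fields can be loaded into a register.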
This is in contrast to the legacy Xtables (iptables, etc.) code, which has protocol awareness so deeply built into the code that it has had to be replicated four times (for IPv4, IPv6, ARP, and Ethernet bridging), as the firewall engines are too protocol-specific to be used in a generic manner.[6] The main advantages over iptables are simplification of the Linux kernel ABI, reduction of code duplication, improved error reporting, and more efficient execution, storage, and incremental, atomic changes of filtering rules.
The nf_defrag_ipv4 module will defragment IPv4 packets before they reach Netfilter's connection tracking (nf_conntrack_ipv4 module). This is necessary for the in-kernel connection tracking and NAT helper modules (which are a form of "mini-ALGs") that only work reliably on entire packets, not necessarily on fragments.
The IPv6 defragmenter is not a module in its own right, but is integrated into the nf_conntrack_ipv6 module.
One of the important features built on top of the Netfilter framework is connection tracking.[7] Connection tracking allows the kernel to keep track of all logical network connections or sessions, and thereby relate all of the packets which may make up that connection. NAT relies on this information to translate all related packets in the same way, and iptables can use this information to act as a stateful firewall.
The connection state, however, is completely independent of any upper-level state, such as TCP's or SCTP's state. Part of the reason for this is that when merely forwarding packets, i.e. no local delivery, the TCP engine may not necessarily be invoked at all. Even connectionless-mode transmissions such as UDP, IPsec (AH/ESP), GRE and other tunneling protocols have, at least, a pseudo connection state. The heuristic for such protocols is often based upon a preset timeout value for inactivity, after whose expiration a Netfilter connection is dropped.
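The inactivity heuristic for connectionless protocols can be sketched as follows. The class name `PseudoConnection`, its methods, and the 30-second timeout are illustrative assumptions, and time is passed in explicitly as plain numbers rather than read from a clock.

```python
class PseudoConnection:
    """Toy entry for a connectionless flow, kept alive only by traffic."""

    def __init__(self, timeout_s, now):
        self.timeout_s = timeout_s
        self.last_seen = now

    def packet_seen(self, now):
        # Any packet of the flow refreshes the inactivity timer.
        self.last_seen = now

    def expired(self, now):
        return now - self.last_seen >= self.timeout_s


flow = PseudoConnection(timeout_s=30, now=0)
flow.packet_seen(now=20)
alive_at_40 = not flow.expired(now=40)  # only 20 s of silence so far
gone_at_50 = flow.expired(now=50)       # 30 s of silence: entry is dropped
```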
Each Netfilter connection is uniquely identified by a (layer-3 protocol, source address, destination address, layer-4 protocol, layer-4 key) tuple. The layer-4 key depends on the transport protocol; for TCP/UDP it is the port numbers, for tunnels it can be their tunnel ID, but otherwise is just zero, as if it were not part of the tuple. To be able to inspect the TCP port in all cases, packets will be mandatorily defragmented.
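The tuple described above maps naturally onto a small record type. In this sketch, `ConnTuple`, `make_tuple`, the dictionary packet layout, and the `tunnel_id` field used for GRE are hypothetical; only the choice of layer-4 key (ports for TCP/UDP, a tunnel ID for tunnels, zero otherwise) follows the text.

```python
from typing import NamedTuple


class ConnTuple(NamedTuple):
    l3proto: str
    src: str
    dst: str
    l4proto: str
    l4key: tuple


def make_tuple(pkt):
    """Build the identifying tuple for a packet given as a plain dict."""
    l4 = pkt["l4proto"]
    if l4 in ("tcp", "udp"):
        key = (pkt["sport"], pkt["dport"])   # port numbers
    elif l4 == "gre":
        key = (pkt.get("tunnel_id", 0),)     # tunnel ID, if any
    else:
        key = (0,)                           # effectively not part of the tuple
    return ConnTuple(pkt["l3proto"], pkt["src"], pkt["dst"], l4, key)


base = {"l3proto": "ipv4", "src": "10.0.0.2", "dst": "192.0.2.10"}
tcp_a = make_tuple({**base, "l4proto": "tcp", "sport": 40000, "dport": 80})
tcp_b = make_tuple({**base, "l4proto": "tcp", "sport": 40000, "dport": 80})
tcp_c = make_tuple({**base, "l4proto": "tcp", "sport": 40001, "dport": 80})
icmp = make_tuple({**base, "l4proto": "icmp"})

# Tuples are hashable, so they can index a connection table directly.
table = {tcp_a: "connection 1"}
```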
Netfilter connections can be manipulated with the user-space tool conntrack.
iptables can check a connection's information, such as its state and status, to make packet filtering rules more powerful and easier to manage. The most common states are:
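- NEW: the packet is trying to create a new connection
- ESTABLISHED: the packet is part of an already existing connection
- RELATED: the packet is starting a new connection that is associated with an existing one, for example when the nf_conntrack_ftp module sees an FTP "PASV" command
- INVALID: the packet could not be associated with any known connection
- UNTRACKED: the packet has been exempted from connection tracking

A normal example would be that the first packet the conntrack subsystem sees will be classified "new", the reply would be classified "established" and an ICMP error would be "related". An ICMP error packet which did not match any known connection would be "invalid".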
Through the use of plugin modules, connection tracking can be given knowledge of application-layer protocols and thus understand that two or more distinct connections are "related". For example, consider the FTP protocol. A control connection is established, but whenever data is transferred, a separate connection is established to transfer it. When the nf_conntrack_ftp module is loaded, the first packet of an FTP data connection will be classified as "related" instead of "new", as it is logically part of an existing connection.
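The classification of packets into states, including the effect of a helper, can be imitated with a toy tracker. Everything here is a simplification: `ToyConntrack`, `expect`, and `classify` are hypothetical names, flows are reduced to `(src, sport, dst, dport)` tuples, and the `starts_connection` flag is a stand-in for the protocol checks a real tracker would perform.

```python
class ToyConntrack:
    def __init__(self):
        self.connections = {}  # original-direction tuple -> reply seen?
        self.expected = set()  # tuples announced in advance by a helper

    @staticmethod
    def reverse(t):
        src, sport, dst, dport = t
        return (dst, dport, src, sport)

    def expect(self, t):
        """What a helper does after parsing, for example, an FTP command."""
        self.expected.add(t)

    def classify(self, t, starts_connection=True):
        if t in self.connections:
            # Original direction: "new" until a reply has been seen.
            return "ESTABLISHED" if self.connections[t] else "NEW"
        if self.reverse(t) in self.connections:
            self.connections[self.reverse(t)] = True
            return "ESTABLISHED"
        if t in self.expected:
            self.expected.discard(t)
            self.connections[t] = False
            return "RELATED"
        if starts_connection:
            self.connections[t] = False
            return "NEW"
        return "INVALID"


ct = ToyConntrack()
control = ("10.0.0.2", 40000, "192.0.2.10", 21)
first = ct.classify(control)
reply = ct.classify(ToyConntrack.reverse(control))

data = ("10.0.0.2", 40001, "192.0.2.10", 50000)
ct.expect(data)  # the helper registers the upcoming data connection
related = ct.classify(data)

stray = ct.classify(("10.0.0.9", 1234, "192.0.2.10", 80), starts_connection=False)
```

Without the call to `expect`, the first packet of the data connection would simply be classified as "NEW", which is the difference the helper module makes.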
The helpers only inspect one packet at a time, so if vital information for connection tracking is split across two packets, either due to IP fragmentation or TCP segmentation, the helper will not necessarily recognize patterns and therefore not perform its operation. IP fragmentation is dealt with by the connection tracking subsystem, which requires defragmentation, though TCP segmentation is not handled. In the case of FTP, segmentation is deemed not to happen "near" a command like PASV with standard segment sizes, so it is not dealt with in Netfilter either.
Each connection has a set of original addresses and reply addresses, which initially start out the same. NAT in Netfilter is implemented by simply changing the reply address, and where desired, port. When packets are received, their connection tuple will also be compared against the reply address pair (and ports). Being fragment-free is also a requirement for NAT. (If need be, IPv4 packets may be refragmented by the normal, non-Netfilter, IPv4 stack.)
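The idea that NAT amounts to changing the reply side of a connection can be sketched with two tuples per connection. This is a model under assumptions, not the kernel's data structure: `Connection`, `new_connection`, `apply_snat`, and `translate` are hypothetical, and the reply tuple is represented as the original addresses with source and destination swapped.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    original: tuple  # (src, sport, dst, dport) as sent by the initiator
    reply: tuple     # (src, sport, dst, dport) expected on reply packets


def new_connection(src, sport, dst, dport):
    # Before NAT, the reply side uses the same addresses, swapped.
    return Connection((src, sport, dst, dport), (dst, dport, src, sport))


def apply_snat(conn, nat_addr, nat_port):
    """Source NAT: replies will be addressed to the NAT address and port."""
    r_src, r_sport, _, _ = conn.reply
    conn.reply = (r_src, r_sport, nat_addr, nat_port)


def translate(conn, pkt):
    """Rewrite a packet tuple according to the direction it matches."""
    if pkt == conn.original:
        # Outgoing packet: becomes the mirror image of the reply tuple.
        r_src, r_sport, r_dst, r_dport = conn.reply
        return (r_dst, r_dport, r_src, r_sport)
    if pkt == conn.reply:
        # Reply packet: becomes the mirror image of the original tuple.
        o_src, o_sport, o_dst, o_dport = conn.original
        return (o_dst, o_dport, o_src, o_sport)
    return None  # does not belong to this connection


conn = new_connection("10.0.0.2", 40000, "198.51.100.5", 80)
apply_snat(conn, "203.0.113.1", 61000)

sent = translate(conn, ("10.0.0.2", 40000, "198.51.100.5", 80))
received = translate(conn, ("198.51.100.5", 80, "203.0.113.1", 61000))
```

Only the reply tuple was modified, yet both directions are translated consistently, because each packet is matched against the original and the reply tuple in turn.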
Similar to connection tracking helpers, NAT helpers will do a packet inspection and substitute original addresses by reply addresses in the payload.
Though they are not kernel modules that make use of Netfilter code directly, the Netfilter project hosts a few more noteworthy pieces of software.
conntrack-tools is a set of user-space tools for Linux that allow system administrators to interact with the Connection Tracking entries and tables. The package includes the conntrackd daemon and the command line interface conntrack. The userspace daemon conntrackd can be used to enable high availability cluster-based stateful firewalls and collect statistics of the stateful firewall use. The command line interface conntrack provides a more flexible interface to the connection tracking system than the obsolete /proc/net/nf_conntrack.
Unlike other extensions such as Connection Tracking, ipset[8] is more related to iptables than it is to the core Netfilter code. ipset does not make use of Netfilter hooks, for instance, but actually provides an iptables module to match and do minimal modifications (set/clear) to IP sets.
The user-space tool called ipset is used to set up, maintain and inspect so-called "IP sets" in the Linux kernel. An IP set usually contains a set of IP addresses, but can also contain sets of other network numbers, depending on its "type". These sets are much more lookup-efficient than bare iptables rules, but of course may come with a greater memory footprint. Different storage algorithms (for the data structures in memory) are provided in ipset for the user to select an optimum solution.
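The difference in lookup cost can be illustrated by comparing a linear scan over one-address rules with a single hash-set membership test. This is only an analogy in user space: `linear_match` and the generated address range are hypothetical, and the hash set merely stands in for one of the possible storage algorithms.

```python
import ipaddress

# 1000 consecutive addresses starting at 10.0.0.0, as strings.
start = ipaddress.ip_address("10.0.0.0")
addresses = [str(start + i) for i in range(1000)]

rules = list(addresses)  # one "rule" per address, checked in order
ip_set = set(addresses)  # the same addresses held in a hash set


def linear_match(rule_list, addr):
    """Scan rules in order; return (matched, number of comparisons made)."""
    comparisons = 0
    for rule_addr in rule_list:
        comparisons += 1
        if rule_addr == addr:
            return True, comparisons
    return False, comparisons


last = addresses[-1]
matched, cost = linear_match(rules, last)  # worst case: every rule is checked
in_set = last in ip_set                    # one hash lookup, independent of size
```

The set trades memory for speed, which matches the remark about a potentially greater memory footprint.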
Any entry in one set can be bound to another set, allowing for sophisticated matching operations. A set can only be removed (destroyed) if there are no iptables rules or other sets referring to it.
The SYNPROXY target makes handling of large SYN floods possible without the large performance penalties imposed by the connection tracking in such cases. By redirecting initial SYN requests to the SYNPROXY target, connections are not registered within the connection tracking until they reach a validated final ACK state, freeing up connection tracking from accounting for large numbers of potentially invalid connections. This way, huge SYN floods can be handled in an effective way.[9]
On 3 November 2013, SYN proxy functionality was merged into Netfilter, with the release of version 3.12 of the Linux kernel mainline.[10][11]
ulogd is a user-space daemon to receive and log packets and event notifications from the Netfilter subsystems. ip_tables can deliver packets via the userspace queueing mechanism to it, and connection tracking can interact with ulogd to exchange further information about packets or events (such as connection teardown, NAT setup).
Netfilter also provides a set of libraries having libnetfilter as a prefix of their names, that can be used to perform different tasks from the userspace. These libraries are released under the GNU GPL version 2. Specifically, they are the following:
- libnetfilter_queue: based on libnfnetlink
- libnetfilter_conntrack: based on libnfnetlink
- libnetfilter_log: based on libnfnetlink
- libnl-3-netfilter: part of the libnl project[12]
- libiptc: not based on any netlink library, and its API is internally used by the iptables utilities
- libipset: based on libmnl

The Netfilter project organizes an annual meeting for developers, which is used to discuss ongoing research and development efforts. The 2018 Netfilter workshop took place in Berlin, Germany, in June 2018.[13]