CROSS-REFERENCE TO RELATED APPLICATIONS
This patent application is related to co-pending U.S. patent application entitled “METHOD AND APPARATUS FOR DEFLECTING FLOODING ATTACKS” by Thomas A. Maufer and Sameer Nanda, filed Dec. 31, 2002, application Ser. No. 10/334,656, assigned to the same assignee as this patent application, which is incorporated by reference as though fully set forth herein.
This patent application is related to co-pending patent application entitled “METHOD AND APPARATUS FOR PERFORMING NETWORK PROCESSING FUNCTIONS” by Robert A. Alfieri, Gary D. Hicok, Paul J. Sidenblad, filed Dec. 13, 2002, application Ser. No. 10/319,791, assigned to the same assignee as this patent application, which is incorporated by reference as though fully set forth herein.
This patent application is related to co-pending U.S. patent application entitled “NETWORK LEVEL PROTOCOL NEGOTIATION AND OPERATION” by Robert A. Alfieri, filed Sep. 23, 2002, application Ser. No. 10/253,362, assigned to the same assignee as this patent application, which is incorporated by reference as though fully set forth herein.
This patent application is related to co-pending U.S. patent application entitled “METHOD AND APPARATUS FOR SECURITY PROTOCOL AND ADDRESS TRANSLATION INTEGRATION” by Thomas A. Maufer, Sameer Nanda, and Paul J. Sidenblad, filed Jun. 13, 2002, application Ser. No. 10/172,352, assigned to the same assignee as this patent application, which is incorporated by reference as though fully set forth herein.
FIELD OF THE INVENTION
One or more aspects of the invention generally relate to data structures for network protocol processing and, more particularly, to cross-linked tables for network protocol processing, including state tracking.
BACKGROUND OF THE INVENTION
The Internet remains a growing public network. Many companies rely on communication over the Internet using Internet Protocol (“IP”) to facilitate their business endeavors. For security in communication over the Internet, a computer may be configured to track and screen communications. This configuration is known as a “firewall,” and one or more of its actions may be referred to as “firewalling.”
In a “stateful firewall,” a set of values uniquely identifying each existing connection (“state of each active connection”) is maintained, subject to deactivation or disconnection. Conventionally, five values are used to form such a set. These five values are sometimes collectively referred to as a “five-tuple” entry. A five-tuple entry includes respective values for IP Source Address, IP Destination Address, IP Protocol, Transport Layer Source Port (“Source Port”), and Transport Layer Destination Port (“Destination Port”). Examples of IP Protocols include User Datagram Protocol (“UDP”) and Transmission Control Protocol (“TCP”). In a UDP or TCP packet, the IP Source and Destination Addresses are in the IP packet header, along with an IP Protocol value indicating whether the packet is a UDP or TCP packet. Source and Destination Ports are in the UDP or TCP header, respectively. For clarity, a TCP packet is described below, though it will be apparent that a UDP packet may be used.
In a connection using TCP (“a TCP connection”), namely, where TCP packets are exchanged, there is a received packet (“an inbound packet”) and a sent packet (“an outbound packet”). Notably, five-tuple entries for inbound and outbound packets are the same except that Source and Destination Addresses are reversed, and Source and Destination Ports are reversed. Of course, IP Protocol is the same in both of these related five-tuple entries.
In a stateful firewall, a data structure, such as an array, may have respective columns indexed to five-tuple categories of information where each row represents an active connection. Additional columns may be used depending on the level of detail used to evaluate each connection. Such a data structure may be referred to as a “table,” indicating a tabularized form of information whether or not headings are used. Five-tuple entries for inbound and outbound packets are stored in a connection table. Connection table stored five-tuple entries are used to compare against five-tuples of inbound and outbound packets to determine whether or not the packets are for use with an existing connection.
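By way of illustration only, the five-tuple comparison described above may be sketched in Python as follows. The names FiveTuple, reverse and ConnectionTable are hypothetical and do not appear in this application. The sketch assumes a simple list of rows that is scanned linearly for each packet, which is one possible reading of the array described above and not a definitive implementation.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """One five-tuple entry; the field names are illustrative only."""
    src_addr: str
    dst_addr: str
    protocol: int  # IP Protocol number: 6 is TCP, 17 is UDP
    src_port: int
    dst_port: int


def reverse(t: FiveTuple) -> FiveTuple:
    """Swap addresses and ports; IP Protocol stays the same."""
    return FiveTuple(t.dst_addr, t.src_addr, t.protocol, t.dst_port, t.src_port)


class ConnectionTable:
    """Each row is one five-tuple; both directions of a connection are stored."""

    def __init__(self):
        self._rows = []

    def add_connection(self, outbound: FiveTuple) -> None:
        # Store the outbound entry and the related inbound entry.
        self._rows.append(outbound)
        self._rows.append(reverse(outbound))

    def is_existing(self, packet_tuple: FiveTuple) -> bool:
        # Linear comparison of the packet's five-tuple against every stored row.
        return any(row == packet_tuple for row in self._rows)
```

In this sketch, every lookup compares all five values against potentially every row, which shows why a per-packet table lookup consumes computational cycles.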
When Network Address Translation (“NAT”) is employed, five-tuple information is stored to indicate Public IP Address and Public Transport Layer Port (“Public Port”) of a NAT configured device (“gateway”). The term “Public” is used to indicate that the address and port of the gateway are accessible from outside a local network associated with the gateway. The term “Remote” is used to indicate a device outside of a local network of the gateway. Notably, the gateway device may be a separate computer or installed in a “Local” computer. The term “Local” refers to a device on a local network of the gateway. For NAT, in instances of inbound packets to a NAT gateway, a five-tuple entry includes: an IP Source Address (“Remote IP Address”); an IP Destination Address (“Public IP Address”); a Source Port (“Remote Source Port”); a Destination Port (“Public Destination Port”); and IP Protocol. For NAT, in instances of outbound packets to a NAT gateway, a five-tuple entry includes: an IP Source Address (“Local IP Address”); an IP Destination Address (“Remote IP Address”); a Source Port (“Local Source Port”); a Destination Port (“Remote Destination Port”); and IP Protocol.
When an inbound packet having a five-tuple from a Remote device is received by a gateway where the five-tuple matches one stored in a NAT table, the gateway translates such an inbound packet for routing. Using the above described convention, after translation the five-tuple includes: IP Source Address (“Remote IP Address”); IP Destination Address (“Local IP Address”); Source Port (“Remote Source Port”); Destination Port (“Local Destination Port”); and IP Protocol. This is because a packet from a Remote device is sent to a gateway using Public information, which, after being found to be part of an active connection, is used for address translation for routing to a Local device.
When an outbound packet having a five-tuple from a Local device is received by a gateway where the five-tuple matches one stored in a NAT table, the gateway translates such an outbound packet for routing. Using the above described convention, after translation the five-tuple includes: IP Source Address (“Public IP Address”); IP Destination Address (“Remote IP Address”); Source Port (“Public Source Port”); Destination Port (“Remote Destination Port”); and IP Protocol. For clarity, the terms Remote, Local and Public are used below whether or not NAT is being used.
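By way of illustration only, the inbound and outbound translations described above may be sketched in Python as follows. The names FiveTuple, NatEntry, translate_outbound and translate_inbound are hypothetical, and the sketch assumes that a NAT table entry holds the Remote, Public and Local values for one connection side by side.

```python
from typing import List, NamedTuple, Optional


class FiveTuple(NamedTuple):
    src_addr: str
    dst_addr: str
    protocol: int
    src_port: int
    dst_port: int


class NatEntry(NamedTuple):
    """Hypothetical NAT table row holding all three address/port pairs."""
    protocol: int
    remote_addr: str
    remote_port: int
    public_addr: str
    public_port: int
    local_addr: str
    local_port: int


def translate_outbound(pkt: FiveTuple, entries: List[NatEntry]) -> Optional[FiveTuple]:
    """Local source becomes Public source; Remote destination is unchanged."""
    for e in entries:
        if pkt == FiveTuple(e.local_addr, e.remote_addr, e.protocol,
                            e.local_port, e.remote_port):
            return FiveTuple(e.public_addr, e.remote_addr, e.protocol,
                             e.public_port, e.remote_port)
    return None  # no matching entry


def translate_inbound(pkt: FiveTuple, entries: List[NatEntry]) -> Optional[FiveTuple]:
    """Public destination becomes Local destination; Remote source is unchanged."""
    for e in entries:
        if pkt == FiveTuple(e.remote_addr, e.public_addr, e.protocol,
                            e.remote_port, e.public_port):
            return FiveTuple(e.remote_addr, e.local_addr, e.protocol,
                             e.remote_port, e.local_port)
    return None  # no matching entry
```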
Furthermore, to enhance firewalling security, encrypted information may be established for a connection. Examples of protocols for enhanced security on the Internet include Point-to-Point Tunneling Protocol (“PPTP”) and a set of protocols known collectively as Internet Protocol Security (“IPSec”). However, fragmentation of IP packets has been used to defeat firewalls, such as in the so-called “ping-of-death,” “wedge” and “tiny fragment” attacks. IP version 4 (“IPv4”) supports header structures allowing fragmentation of IP packets. Notably, a fragmented packet (“fragment”) may be fragmented further, and there is no requirement that fragments arrive in order, or even that they arrive at all. In many stateless firewalls, fragments are summarily processed by dropping them. However, fragments are useful when an intermediate router has to forward a packet that is larger than the maximum transmission unit (“MTU”) of an outgoing interface (“OIF”). Thus, by dropping fragments, information may be lost. Examples of stateless firewalls may be found integrated in low-end home gateway routers. In higher-end standalone or integrated stateful firewalls, more states are added to verify authenticity of a fragment. This approach facilitates use of devices with significant embedded memory limitations, using less memory than a fragment buffering and reassembly approach.
Accordingly, it would be desirable to have a stateful firewall that buffers and reassembles fragments.
It should be appreciated that, whether or not NAT is used, a table lookup is done for each packet. Thus, computational cycles are spent for each lookup and comparison of each five-tuple entry. Accordingly, a reduction in computational cycles for packet processing would be useful and desirable.
SUMMARY OF THE INVENTION
An aspect of the invention is a method for creating data structures for firewalling and network address translating. The method comprises: instantiating a first data structure and a second data structure; populating the first data structure with state information for a packet; populating the second data structure with packet information for the packet; and cross-linking the first data structure and the second data structure, where the cross-linking includes generating an index for the packet information; and storing in the first data structure the index in association with the state information.
Another aspect of the invention is a method for creating data structures for physical layer addressing. The method comprises: instantiating a first, a second and a third data structure; populating the first data structure with state information; populating the second data structure with network address translation information; populating the third data structure with interface information; and cross-linking the first data structure and the second data structure to the third data structure, the cross-linking including: generating an index for the interface information; and storing the index in the first data structure in association with the state information and in the second data structure in association with the network address translation information.
Another aspect of the invention is a method for security protocol support. The method comprises: creating a table, the table including a first, a second and a third assigned data space; populating the first assigned data space to indicate that a security protocol is being used; populating the second assigned data space with a first portion of a security protocol string; and populating the third assigned data space with a second portion of the security protocol string.
Another aspect of the invention is a method for creating at least one data structure. The method comprises: determining if a firewall is activated; determining if a network address translator is activated; and creating the at least one data structure responsive to one of: the firewall and the network address translator being activated; the firewall being activated and the network address translator not being activated; and the firewall not being activated and the network address translator being activated.
Another aspect of the invention is a data structure for routing packets. The data structure comprises: an Internet Protocol destination address data space for storing Internet Protocol destination addresses; an Internet Protocol source address data space for storing Internet Protocol source addresses; and an address resolution table index data space for storing indices to an address resolution table, where the address resolution table includes a media access control address data space for storing media access control addresses.
Another aspect of the invention is a method of forming hashing table chains. The method comprises: obtaining a first connection hash value, the first connection hash value pointing to a first slot in the hashing table; obtaining a second connection hash value, the second connection hash value pointing to the first slot in the hashing table; assigning the second connection hash value to a second slot in the hashing table; pointing the first slot toward the second slot; obtaining a third connection hash value, the third connection hash value pointing to the second slot in the hashing table; moving contents of the second slot to a third slot in the hashing table; and assigning the third connection hash value to the second slot in the hashing table.
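By way of illustration only, the chain-forming steps recited above may be sketched in Python as follows. The class ChainedTable and its methods are hypothetical. The sketch makes several assumptions not stated above: integer keys stand in for connection hash values, a modulo operation stands in for the hash function, a free slot is found by scanning forward from the occupied slot, and the pointer of the predecessor slot is repaired when a slot's contents are moved.

```python
class ChainedTable:
    """Hypothetical hashing table whose collision chains live in the table itself."""

    def __init__(self, size: int):
        self.size = size
        self.keys = [None] * size  # slot contents
        self.next = [None] * size  # pointer from a slot toward the next slot in its chain

    def home(self, key: int) -> int:
        # Stand-in hash function (assumption): the slot a key points to.
        return key % self.size

    def _free_slot(self, start: int) -> int:
        # Assumed policy: scan forward from the occupied slot, wrapping around.
        for step in range(1, self.size):
            slot = (start + step) % self.size
            if self.keys[slot] is None:
                return slot
        raise MemoryError("hashing table is full")

    def insert(self, key: int) -> int:
        h = self.home(key)
        if self.keys[h] is None:
            self.keys[h] = key
            return h
        occupant_home = self.home(self.keys[h])
        if occupant_home == h:
            # Collision with the chain head: use another slot and link the chain to it.
            tail = h
            while self.next[tail] is not None:
                tail = self.next[tail]
            slot = self._free_slot(h)
            self.keys[slot] = key
            self.next[tail] = slot
            return slot
        # The occupant belongs to a different chain: move its contents to a free slot,
        # repair the pointer of its predecessor, and give this slot to the new key.
        slot = self._free_slot(h)
        self.keys[slot], self.next[slot] = self.keys[h], self.next[h]
        pred = occupant_home
        while self.next[pred] != h:
            pred = self.next[pred]
        self.next[pred] = slot
        self.keys[h], self.next[h] = key, None
        return h

    def find(self, key: int):
        slot = self.home(key)
        if self.keys[slot] is None or self.home(self.keys[slot]) != slot:
            return None  # no chain starts at this slot
        while slot is not None:
            if self.keys[slot] == key:
                return slot
            slot = self.next[slot]
        return None
```

With a table of eight slots, inserting keys 1, 9 and 2 in that order mirrors the recited steps: key 9 collides at slot 1 and is assigned to slot 2, and key 2 then displaces key 9 from slot 2 into slot 3.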
Another aspect of the invention is a method for tracking packet states, comprising: initiating tracking of state from a first CLOSED state; from the first CLOSED state, tracking transition to a LISTEN state or a SYN-SENT state; from the LISTEN state, tracking transition to one of the first CLOSED state, a SYN-RCVD state or the SYN-SENT state; from the SYN-RCVD state, tracking transition to either a first hardware state or a SYN-RCVD-SYN-SENT state; from the SYN-SENT state, tracking transition to either a second hardware state or the SYN-RCVD-SYN-SENT state; from the SYN-RCVD-SYN-SENT state, tracking transition to either a first SYN-RCVD-SYN-SENT-ACK state or a second SYN-RCVD-SYN-SENT-ACK state; and from either the first SYN-RCVD-SYN-SENT-ACK state or the second SYN-RCVD-SYN-SENT-ACK state, tracking transition to a third hardware state.
Another aspect of the invention is an apparatus for tracking packet states, comprising: means for initiating tracking of state from a first CLOSED state; means for tracking software states for packets; and means for tracking hardware states for the packets. The means for tracking software states for tracking the packets to one of a first, a second and a third hardware state: the first hardware state being a SYN-RCVD-SYN-ACK-SENT state, the second hardware state being SYN-SENT-SYN-ACK-RCVD state, and the third hardware state being a connection-established state. The means for tracking hardware states including: means for tracking transition to the connection-established state from the SYN-RCVD-SYN-ACK-SENT state; means for tracking transition to the connection-established state from the SYN-SENT-SYN-ACK-RCVD state; means for tracking transition to a first FIN-WAIT state from the SYN-RCVD-SYN-ACK-SENT state, the SYN-SENT-SYN-ACK-RCVD state or the connection-established state; and means for tracking transition to a CLOSE-WAIT-FIN state from the SYN-RCVD-SYN-ACK-SENT state, the SYN-SENT-SYN-ACK-RCVD state or the connection-established state.
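By way of illustration only, the state transitions recited in the two preceding aspects may be sketched in Python as a transition table. The names ALLOWED, HARDWARE_STATES and advance are hypothetical, as are the "-1" and "-2" suffixes used to label the first and second states of a kind. The sketch assumes that only the permitted next states are checked, since the packet events that trigger each transition are not recited above.

```python
# Permitted next states for each tracked state, taken from the aspects above.
ALLOWED = {
    "CLOSED": {"LISTEN", "SYN-SENT"},
    "LISTEN": {"CLOSED", "SYN-RCVD", "SYN-SENT"},
    "SYN-RCVD": {"SYN-RCVD-SYN-ACK-SENT", "SYN-RCVD-SYN-SENT"},
    "SYN-SENT": {"SYN-SENT-SYN-ACK-RCVD", "SYN-RCVD-SYN-SENT"},
    "SYN-RCVD-SYN-SENT": {"SYN-RCVD-SYN-SENT-ACK-1", "SYN-RCVD-SYN-SENT-ACK-2"},
    "SYN-RCVD-SYN-SENT-ACK-1": {"ESTABLISHED"},
    "SYN-RCVD-SYN-SENT-ACK-2": {"ESTABLISHED"},
    "SYN-RCVD-SYN-ACK-SENT": {"ESTABLISHED", "FIN-WAIT-1", "CLOSE-WAIT-FIN"},
    "SYN-SENT-SYN-ACK-RCVD": {"ESTABLISHED", "FIN-WAIT-1", "CLOSE-WAIT-FIN"},
    "ESTABLISHED": {"FIN-WAIT-1", "CLOSE-WAIT-FIN"},
}

# The first, second and third hardware states named above.
HARDWARE_STATES = {"SYN-RCVD-SYN-ACK-SENT", "SYN-SENT-SYN-ACK-RCVD", "ESTABLISHED"}


def advance(current: str, requested: str) -> str:
    """Return the new state, or raise ValueError if the transition is not tracked."""
    if requested not in ALLOWED.get(current, set()):
        raise ValueError(f"transition {current} -> {requested} is not tracked")
    return requested
```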
An aspect of the invention is a method for network protocol processing. The method comprises: obtaining a packet for network address translation, the packet having a media access control header; obtaining information, including the media access control header, from the packet; parsing out the information into one or more data structures; determining if a network processing unit is in a pass-through mode responsive to the media access control header; and responsive to the network processing unit not being in the pass-through mode: determining whether multicast or broadcast is active, and determining whether a protocol type for the packet is supported by the network processing unit.
BRIEF DESCRIPTION OF THE DRAWINGS
Accompanying drawing(s) show exemplary embodiment(s) in accordance with one or more aspects of the invention; however, the accompanying drawing(s) should not be taken to limit the invention to the embodiment(s) shown, but are for explanation and understanding only.
FIG. 1 is a block diagram of an exemplary embodiment of an address translation flow.
FIGS. 2A-1, 2A-2, 2B-1, 2B-2, 2C, 2D-1 and 2D-2 are respective flow diagrams of respective exemplary embodiments of portions of the address translation flow of FIG. 1.
FIGS. 3A, 3B and 3C are flow diagrams of respective exemplary embodiments of bridging and routing flows.
FIG. 4A is a flow diagram of an exemplary embodiment of a Network Address Translation (“NAT”) filtering flow.
FIG. 4B is a flow diagram of an exemplary alternative embodiment of a portion of the NAT filtering flow of FIG. 4A.
FIGS. 5A and 5B are flow diagrams of respective exemplary embodiments of outbound filtering flows.
FIGS. 6, 7, 8 and 9A are table entry diagrams for respective exemplary embodiments of tables in which information may be stored.
FIG. 9B is a flow diagram depicting an exemplary embodiment of a state table creation flow.
FIG. 10 is a state transition diagram of an exemplary embodiment of a state tracking flow.
FIG. 11 is a flow diagram of an exemplary embodiment of a portion of a data structure population flow.
FIG. 12A is a block diagram of an exemplary embodiment of a network processor unit (“NPU”).
FIG. 12B is a flow diagram of an exemplary embodiment of a packet processing flow for processing bursts of packets.
FIG. 13 is a block diagram of an exemplary embodiment of a computer system.
FIG. 14 is a block diagram of an exemplary embodiment of a network.
FIGS. 15A and 15B are block diagrams depicting exemplary embodiments of respective tables indexed by hash function output values.
FIG. 16 is a flow diagram of an exemplary embodiment of a fragment processing flow.
FIG. 17 is a block diagram of an exemplary embodiment of a buffer stack.
DETAILED DESCRIPTION OF THE DRAWINGS
In the following description, numerous specific details are set forth to provide a more thorough understanding of aspects of the invention as described with respect to exemplary embodiments herein. However, it will be apparent to one of skill in the art that one or more aspects of the invention may be practiced without one or more of these specific details. In other instances, well-known features have not been described for purposes of clarity.
FIG. 1 is a block diagram of an exemplary embodiment of an address translation flow 100. Address translation flow 100 includes respective flows or subroutines. A packet is interrogated with packet interrogation flow 120. Output from packet interrogation flow 120 is sent to Network Processor Unit (“NPU”) Mode A flow 140. Output from NPU Mode A flow 140 is sent to NPU Mode B flow 160. Output from NPU Mode B flow 160 is sent to compose packet flow 180. It should be understood that address translation flow 100 may be instantiated in hardware, software or a combination of hardware and software. For clarity, address translation flow 100 is described as an implementation of a combination of hardware and software.
FIGS. 2A-1, 2A-2, 2B-1, 2B-2, 2C and 2D-1 (singly and collectively “FIG. 2”) are respective flow diagrams of respective exemplary embodiments of portions of address translation flow 100 of FIG. 1. A packet 101 of a transmission is received. Multiple packets corresponding to multiple connections may be processed at a time for address translation flow 100 of FIG. 1, though for purposes of clarity processing of a single packet is described. This is consistent with how packets are received for a connection, namely, one packet at a time. Notably, if a plurality of packets is received in a short span of time, such packets may be buffered as described below with respect to an NPU. Furthermore, FIG. 2 for the most part is described with respect to an address translator portion of an NPU, and accordingly receiving and transmitting is often done with reference to information going to and from, respectively, the address translator portion.
At 102, a determination is made as to whether address translation is supported in hardware, such as with an address translator in an NPU. If hardware does not support address translation, a received packet is sent to software providing at least a portion of NPU functionality (“NPUsoft”) with error condition (“E”) 103. NPUsoft represents handling of a packet as embodied in software. For clarity, NPUsoft activity in instances is not described in any detail because either such processing follows from description of the hardware implementation or such processing is conventional. If, however, hardware does support address translation, then at 104 a determination is optionally made as to whether an audit mode is in an active state. Notably, an audit mode is generally for testing, and thus need not be employed in a tested product. If an audit mode is in an active state, then a determination is made at 105 as to whether packet 101 is a “re-inserted” packet. By “re-inserted” packet, it is meant a packet moved out of address translation flow 100 with respect to hardware processing for processing by software, NPUsoft, prior to being re-inserted back into address translation flow 100.
If at 105, packet 101 is not a re-inserted packet, namely, this is the first time packet 101 has been partially processed by address translation flow 100, then packet 101 is sent to NPUsoft with error condition 106. This allows packet 101 to be tested, such as by a host computer system programmed with NPUsoft, prior to further processing in hardware for compatibility with such hardware. If, however, packet 101 has previously been partially processed with address translation flow 100 or an audit mode is not active, then at 107 a determination is made as to whether information may be obtained from packet 101. If information may be readily obtained from packet 101, then such information is processed at 107. At 107, a packet is broken out into a data structure for parsing information into distinct fields, such as for a table. This alternate representation of a packet may be done in software for purposes of building tables of information. Tables that may be used for FIG. 2 are described in additional detail with reference to FIGS. 6, 7, 8 and 9A.
FIGS. 6, 7, 8 and 9A are table entry diagrams for respective exemplary embodiments of tables in which information, such as from a packet 101 of FIG. 2, may be stored. FIG. 11 is a high-level flow diagram of an exemplary embodiment for a data structure population flow 850. Data structure population flow 850 is described with simultaneous reference to FIGS. 2, 6, 7, 8 and 9A.
If packet 101 is an inbound or outbound packet from which information may be obtained, then at 811 packet information, such as a five-tuple, is obtained. Additionally, interface information relative to packet 101, such as Media Access Control (“MAC”) information, may be obtained at 811.
At 812, respective indices are generated using packet information obtained at 811. At 813, packet information, interface information and indices are stored in data structures. Examples of data structures are Connection Table (“CT”) 600, or if NAT is being used, NAT Table (“NT”) 700. Interface information is stored in Address Resolution Table (“ART”) 800. For example, an index generated from five-tuple information is stored in either CT 600 or NT 700 for cross-linking such tables, as described below in additional detail. As another example, an index generated from an entry in ART 800, such as by hashing all or a portion of an entry of interface information, is stored in CT 600, or in NT 700 if NAT is being used, for cross-linking with ART 800, as described below in additional detail. Additionally, such an ART index may be stored in ART 800 to avoid recalculation of such an index, for example when updating an auxiliary Canonical Frame Header (“xCFH”) of packet 101 for broadcasting, as described below in additional detail. A CFH is a data structure, separate from packet 101, that travels with packet 101, where data for a CFH is derived from packet 101, as described below in additional detail. Moreover, an ART index from such interface information is stored in Routing Table (“RT”) 900 for cross-linking with ART 800, as described below in additional detail.
It should be noted that CT 600, NT 700 and RT 900 are linked to ART 800 via ART index 601. Thus, CT 600, NT 700 and RT 900 are somewhat dependent on ART 800. For example, there may be one or more than one CT entry linked to the same ART entry. It should be further noted that CT 600 is linked to NT 700 via NT index 606, and that NT 700 is linked to CT 600 via CT index 706. Thus, CT 600 and NT 700 are cross-linked.
Rather than having one large state table or other data structure for CT 600 and NT 700 information, two linked state tables are used to conserve memory. For example, if NAT is not being used, whether supported or not, many entries in a single state table may be left blank. Accordingly, by populating a smaller table with higher usage efficiency, memory usage is reduced over use of a larger table with lower usage efficiency. It should be understood that one or more of state tables 600, 700, 800, and 900 may be combined; however, for purposes of clarity, separate state tables 600, 700, 800, and 900 are described. Notably, CT 600 and NT 700 may be created according to whether firewalling or NAT is active. Referring to FIG. 9B, there is shown a flow diagram depicting an exemplary embodiment of a state table creation flow 910. State table creation flow 910 may, for example, be instantiated in software for creation of state tables 600, 700, 800, and 900. A command, such as create data structures 903, may be initiated as part of a startup mode. At 904, it is determined whether NAT is active. If NAT is active, CT 600 and NT 700 are created to allow for NAT, and CT 600 is used for tracking TCP state. Notably, when NAT is active, TCP state is tracked whether or not firewalling is active. State table creation flow 910 then returns at 906. If NAT is not active, then at 905 CT 600 is created if firewalling is active for tracking TCP state information. If the firewall is not active, CT 600 is not created. At 906, state table creation flow 910 returns.
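By way of illustration only, the decisions of the state table creation flow described above may be sketched in Python as follows. The function name create_data_structures and the use of dictionaries for the tables are hypothetical. The sketch assumes that only creation of the CT and the NT depends on these decisions, since creation of the other tables is not tied to a decision above.

```python
def create_data_structures(nat_active: bool, firewall_active: bool) -> dict:
    """Return the state tables created for the given configuration."""
    tables = {}
    if nat_active:
        # NAT active: both tables are created; the CT also tracks TCP state.
        tables["CT"] = {}
        tables["NT"] = {}
    elif firewall_active:
        # NAT inactive but firewalling active: only the CT is created.
        tables["CT"] = {}
    # Neither active: no CT or NT is created.
    return tables
```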
It should be understood that for NAT to take place, a packet needs to be in compliance with a NAT protocol. Accordingly, if a packet were not in compliance, such a communication would fail. Thus, to reduce or avoid firewall processing of invalid packets, FIG. 2 is described as being done in front of a firewall. Thus, firewall policies may be instantiated in an address translator portion of an NPU, as described below in additional detail, and pointing from an NT 700 to a firewall data structure may be postponed until confirmation that such a packet is in compliance for NAT. This is more evident with respect to the description of output filtering below.
For a non-NAT connection, information stored in CT 600 generally includes an IP Protocol 607, a Remote IP Address 602, a Remote Port 605, a Local IP Address 604 and a Local Port 607. For a NAT connection, information stored in NT 700 typically includes an IP Protocol 607, a Remote IP Address 602, a Remote Port 605, a Public IP Address 704 and a Public Port 707. Notably, an inbound or outbound packet is either a remote or local packet, and thus entries for such packets may be “remote five-tuple” and “local five-tuple” entries for inbound or outbound packets. Thus, it should be appreciated that for inbound packets: Remote IP Addresses 602 are IP source addresses; Public IP Addresses 704 and Local IP Addresses 604 are IP destination addresses; Remote Ports 605 are source ports; and Public Ports 707 and Local Ports 607 are destination ports. Furthermore, it should be appreciated that for outbound packets: Local IP Addresses 604 and Public IP Addresses 704 are IP source addresses; Remote IP Addresses 602 are IP destination addresses; Local Ports 607 and Public Ports 707 are source ports; and Remote Ports 605 are destination ports. There are some exceptions to this for handling security protocol packets.
Additionally, for a stateful firewall, at least a TCP state 609 for each connection may be stored in CT 600. Other known attributes, such as sequence numbers, acknowledgment numbers, and window size, among other known state variables, may be stored in CT 600. These other attributes may be associated with a five-tuple entry. For example, a Sequence Number 610 for each inbound and outbound packet may additionally be stored in CT 600. Notably, TCP State together with other state attributes may be stored only in CT 600 even though NAT is being used. Recall from above that two smaller tables are used rather than a single large table. Accordingly, attributes for stateful firewalling may be stored in one location, namely, CT 600.
If a secure connection has been established, such as with IPSec or PPTP, then a portion of an inbound five-tuple, whether non-NAT or NAT, may be encoded. Accordingly, either a Security Parameters Index (“SPI”) for IPSec or a Generic Routing Encapsulation (“GRE”) Call Identification (“GRE Call ID”) 603 for PPTP may be stored in CT 600, or in NT 700 if NAT is being used. However, encryption, decryption, compression or decompression may be done in a sequence processor portion of an NPU, and thus packet 101 is presumed to be in a non-encrypted and non-compressed state for FIG. 2, to the extent a security protocol allows. Because some information of a security protocol is encrypted, other information is used instead. For example, in CT 600 or NT 700, SPI/GRE Call ID 603 is a flag used to indicate that no security protocol is being used or that either an SPI or GRE Call ID is being used, where the index or call identification is actually broken out into two portions, one portion of which is stored in a remote port data space and another portion of which is stored in a public port data space. NAT may be used with a security protocol; an example of use of IPSec and NAT is described in co-pending U.S. patent application entitled “METHOD AND APPARATUS FOR SECURITY PROTOCOL AND ADDRESS TRANSLATION INTEGRATION” by Thomas A. Maufer, Sameer Nanda, and Paul J. Sidenblad, filed Jun. 13, 2002, application Ser. No. 10/172,352, which is incorporated by reference as though fully set forth herein.
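By way of illustration only, breaking an SPI out into two portions may be sketched in Python as follows. An IPSec SPI is a 32-bit value, and a transport layer port data space holds 16 bits. The function names, the flag values, and the choice of placing the upper 16 bits in the remote port data space and the lower 16 bits in the public port data space are all assumptions made for this sketch and are not stated above.

```python
# Hypothetical flag values for the SPI/GRE Call ID field.
NO_SECURITY, USES_SPI, USES_GRE_CALL_ID = 0, 1, 2


def pack_spi(spi: int) -> dict:
    """Split a 32-bit SPI across two 16-bit port data spaces (assumed layout)."""
    if not 0 <= spi <= 0xFFFFFFFF:
        raise ValueError("an SPI is a 32-bit value")
    return {
        "spi_gre_flag": USES_SPI,
        "remote_port": spi >> 16,     # upper 16 bits (assumption)
        "public_port": spi & 0xFFFF,  # lower 16 bits (assumption)
    }


def unpack_spi(entry: dict) -> int:
    """Rejoin the two portions into the original SPI."""
    return (entry["remote_port"] << 16) | entry["public_port"]
```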
Indices are computed for each CT 600 entry, each NT 700 entry and each ART 800 entry. A CT Index 706 and an ART Index 601 are stored in NT 700. An NT Index 606 and an ART Index 601 are stored in CT 600. An ART Index 601 is stored in RT 900. Indices are computed by hashing values for an entry, for example for a five-tuple entry in CT 600 or NT 700. A hash of an entry or portion thereof represents an index to that entry in that table. For example, a hash of a five-tuple forming a portion of an entry may be used as an index to the entry. Indices are stored in tables in association with a corresponding entry. Accordingly, tables are cross-linked through such indices, except for ART 800, which does not need to be cross-linked.
Computational cycles are expended for an initial table lookup. However, by creating and storing table entry indices, entries are cross-linked. For example, each NT entry is cross-linked with a corresponding CT entry, and each CT entry is cross-linked with a corresponding NT entry. Following a link to a corresponding entry in another table is less computationally intensive than looking up an entry by checking for matches of a plurality of values, such as a five-tuple, each time a table is accessed. Additionally, by storing a hash of an entry, re-computation of such a hash is avoided thereby reducing use of computational resources.
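By way of illustration only, cross-linking of table entries through stored indices may be sketched in Python as follows. The names index_of, add_nat_connection and TABLE_SIZE are hypothetical, dictionaries stand in for the tables, and zlib.crc32 is used merely as a stand-in hash because no particular hash function is named above. Handling of two entries that hash to the same index is omitted from this sketch.

```python
import zlib

TABLE_SIZE = 1024  # hypothetical number of index values


def index_of(fields) -> int:
    """Stand-in hash of an entry's values, reduced to a table index."""
    return zlib.crc32(repr(tuple(fields)).encode()) % TABLE_SIZE


ct, nt, art = {}, {}, {}  # CT, NT and ART, each keyed by its own index


def add_nat_connection(local_tuple, public_tuple, mac, vlan_id):
    """Create CT, NT and ART entries for a first packet and cross-link them."""
    ct_index = index_of(local_tuple)
    nt_index = index_of(public_tuple)
    art_index = index_of((mac, vlan_id))
    art[art_index] = {"mac": mac, "vlan_id": vlan_id}
    # The CT entry stores the NT index and the ART index with its state.
    ct[ct_index] = {"five_tuple": local_tuple, "nt_index": nt_index,
                    "art_index": art_index, "tcp_state": "CLOSED"}
    # The NT entry stores the CT index and the ART index.
    nt[nt_index] = {"five_tuple": public_tuple, "ct_index": ct_index,
                    "art_index": art_index}
    return ct_index, nt_index
```

In this sketch, once an NT entry has been found for a packet, the corresponding CT entry and ART entry are reached by following the stored indices rather than by hashing or comparing five-tuples again.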
CT 600 and NT 700 each store links to ART 800 via ART index 601. ART index 601 is a hash of an entry in ART 800. In this manner, CT 600 and NT 700 are respectively cross-linked with ART 800. ART 800 stores information associated with delivery of packet 101, namely, a MAC address and other MAC-layer attributes. For example, a MAC Address 801, a Virtual Local Area Network (LAN) Identification (ID) 802 and an Interface Mask 803 may be stored in ART 800. MAC Address 801 is a next destination address for packet 101, which may be a next hop final destination or a next hop toward the final destination address.
It is less computationally intensive to follow a link corresponding to an ART entry than to hash a packet's destination address, such as an IP Destination Address 901. By storing an ART index 601 for each ART 800 entry in RT 900 along with an IP Destination Address 901, a MAC Address 801, as well as other MAC-layer attributes, from ART 800 is linked to such IP Destination Address 901. Thus, it should be appreciated that once a match to an index is found in CT 600 or NT 700, an ART index may be obtained leading to a next hop IP Destination Address 901 or MAC Address 801. Thus, once entries for packet and interface information are instantiated for a first packet of a connection, all subsequent packets may be processed by hashing information for matching an index. Hashes for indices 601, 606 and 706 may be done responsive to initialization of an associated state table entry for a first packet sent with respect to a connection. By saving computed indices for a connection, with a single hash for each subsequent packet for such connection, translation or forwarding data for each subsequent packet may be found by linking to an appropriate table entry, using subsequent packet hashing. Notably, RT 900 may be used when a routing only condition exists. Thus, if one or both of firewalling and network address translating (“NAT'ing”) is done, then RT 900 may be bypassed as CT 600 and NT 700 are linked with ART 800 via ART Index 601.
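By way of illustration only, obtaining a MAC address for a routing only condition by following an ART index stored in the RT may be sketched in Python as follows. The function name next_hop_mac and the example addresses are hypothetical, and the sketch assumes an exact match on the IP destination address as a simplification.

```python
# Hypothetical ART keyed by ART index, and RT keyed by IP destination address.
art = {7: {"mac": "00:11:22:33:44:55", "vlan_id": 10, "interface_mask": 0b0001}}
rt = {"198.51.100.7": {"art_index": 7}}


def next_hop_mac(dst_ip: str, rt: dict, art: dict):
    """Follow the ART index stored with the destination address to a MAC address."""
    route = rt.get(dst_ip)
    if route is None:
        return None  # no routing entry for this destination
    return art[route["art_index"]]["mac"]
```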
Accordingly, performance in packet processing is enhanced, and thus throughput is increased. Furthermore, as described with respect to use of a parallel data structure, namely, an xCFH data structure that travels withpacket101, indices are embedded to further enhance packet forwarding, namely, routing or bridging.
It is possible that a same hash results from two or more respective entries. Accordingly, as a failsafe measure, after an entry has been accessed by finding a match of a hash of a received packet as an index in a state table, a comparison of such currently received packet's information to packet information for a previously received packet for a connection stored in such state table may be made. For example, a comparison of five-tuples may be done responsive to a match of such a hash of a received packet to a stored index. Though this adds additional overhead, it is still less computationally intensive for example than comparing what potentially may be an entire table of five-tuples to a five-tuple of a currently received packet. Moreover, by having separate tables, fewer entries within a table need to be checked for matches. Furthermore, hash function output values, as described below, may be employed as table indices.
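The failsafe comparison may be sketched as follows. The bucket count is deliberately tiny so that collisions are easy to produce, and the names `bucket` and `lookup`, as well as the CRC-32 hash, are hypothetical stand-ins rather than details of the described embodiments.

```python
import zlib
from typing import Dict, Optional, Tuple

# (source address, source port, destination address, destination port, protocol)
FiveTuple = Tuple[str, int, str, int, int]

NUM_BUCKETS = 4  # deliberately tiny so that two five-tuples readily collide


def bucket(five_tuple: FiveTuple) -> int:
    """Illustrative hash only; CRC-32 stands in for the actual hash function."""
    return zlib.crc32(repr(five_tuple).encode()) % NUM_BUCKETS


def lookup(table: Dict[int, Tuple[FiveTuple, str]],
           five_tuple: FiveTuple) -> Optional[str]:
    """Return stored state only if the hash matches AND the five-tuples match."""
    entry = table.get(bucket(five_tuple))
    if entry is None:
        return None                      # no entry at this index
    stored_tuple, state = entry
    if stored_tuple != five_tuple:
        return None                      # same hash, different connection
    return state


known = ("10.0.0.1", 1000, "10.0.0.2", 80, 6)
table = {bucket(known): (known, "ESTABLISHED")}

# Search for a different five-tuple that hashes to the same index.
collider = next(
    ("10.0.0.1", port, "10.0.0.2", 80, 6)
    for port in range(1001, 2000)
    if bucket(("10.0.0.1", port, "10.0.0.2", 80, 6)) == bucket(known)
)
```

Only one stored five-tuple is compared per lookup, which is the saving over scanning a table of five-tuples.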
With the above-described context, the remainder of address translation flow 100 with respect to FIG. 2 is described.
Returning to FIG. 2, if, at 107, information cannot readily be obtained from packet 101, then at 114 a determination is made as to whether such a packet 101 may continue to be processed with packet interrogation flow 120. For example, at 108 a check may be made as to whether packet 101 is part of a non-data over-the-air (“wireless”) frame. Notably, the packet interrogator is described in terms of a parser of wireless data frames and not as being configured to parse non-data wireless frames, as described below in additional detail. If packet 101 is part of a non-data wireless frame, then packet 101 is sent to NPUsoft with error condition 110. If packet 101 is not part of a non-data wireless frame, then at 109 a check may be made to determine if a frame used for sending packet 101 was insufficient, namely, too short. If such a frame was too short, then packet 101 is sent to NPUsoft with error condition 111. If the frame was not too short, then at 113 a check for another abort code may be made. If another abort code is found for packet 101, then packet 101 is sent to NPUsoft with an error condition for such other abort code, for example error condition 112. Notably, error conditions 111 and 112 may lead to dropping packet 101 if too incomplete to process. Recall, packet 101 is now formatted in a data structure that is more readily parsed.
Alternatively, a packet interrogation flow alternative is shown in FIG. 2A-2, where wireless flow 118 includes checking for a non-data wireless frame 108. However, if it is not a non-data wireless frame at 108, then packet 101 is sent to NPUsoft with error condition 110. If it is a non-data wireless frame at 108, then at 115 it is determined if the frame came from the host computer or device. If the frame came from the host at 115, then packet 101 is transmitted at 157. Otherwise, if the frame did not come from the host as detected at 115, packet 101 is sent to NPUsoft with error condition 117. In other words, if a non-data wireless frame came from wire, it is not put back on wire. Thus, non-data wireless frames are only transmitted by an address translation (“AT”) subunit of an NPU if they came from a host device, where firewalling and NAT are bypassed for such transmission. However, alternatively, the packet interrogator may be configured to parse non-data wireless packets, and state could be tracked for such parsed non-data wireless packets.
With continuing reference to FIG. 2, at 116, a determination is made as to whether an NPU is in a Pass-through Mode A. Pass-through Mode A is a pass-through mode with frame conversion only, which may be determined from a MAC header of packet 101. Thus, if only a frame conversion is needed for bridging, a significant portion of address translation flow 100 may be bypassed. If an NPU is in Pass-through Mode A, then at 156 packet 101 is composed, namely, header format is converted, for example from an IP format to an Ethernet format. Such composed packet 158 is transmitted at 156, for example to a firewall module of an address translator or a sequence processor portion of such an NPU.
If Pass-through Mode A is not invoked, then at 121 a determination is made as to whether multicast reception is active on an Incoming Interface (“IIF”) for a group of listeners of a multicast. If multicast reception is not active, then packet 101 is sent to NPUsoft with an error condition, for example error condition 122.
At 123, a data link layer (“layer-2”) validity check is done. A layer-2 validity check determines whether a MAC source address is a multicast MAC address and whether there is a length error for a frame used for such a MAC address. Additionally, a layer-2 validity check may involve checking whether a report, which may be termed a “cracker report,” generated as a result of obtaining information at 107 indicated an error in an xCFH for packet 101. If at 123, packet 101 is found to be invalid as a result of a layer-2 validity check, then packet 101 is sent to NPUsoft with an error condition, for example error condition 124.
A packet 101 determined to be valid at 123 is checked at 125 for packet protocol type and protocol support on the IIF. If IP protocol of packet 101 is not supported by a network processing unit, then packet 101 is sent to NPUsoft with an error condition, for example error condition 126.
Alternatively, an NPU mode A flow 140A may be used. Referring to FIG. 2B-2, for an NPU not in Pass-through Mode A, at 128 a determination is made as to whether a frame for packet 101 is a multicast or broadcast frame. If such a frame is a multicast or broadcast frame, then a check for multicast or broadcast active on the IIF is made at 121, as previously described. Otherwise, if such a frame is not a multicast or broadcast frame, then at 125 a check for a supported protocol is made as previously described. Notably, layer-2 validity checking is not done here in this alternative, as layer-2 validity checking may be moved and done with layer-3 validity checking as described below with respect to filtering.
If IP protocol of packet 101 is supported, then at 127 it is determined whether an NPU is in Pass-through Mode B. Pass-through Mode B is a pass-through mode with firewalling only. This may be determined by accessing a data structure, such as a table, indicating whether firewalling-only has been activated for packet 101. If such an NPU is in Pass-through Mode B, a check is made at 153 of FIG. 2C to determine if packet 101 is a non-IP protocol packet. If, however, such an NPU is not in Pass-through Mode B, then other processing occurs prior to checking whether packet 101 is a non-IP protocol packet.
Referring to FIG. 2, and in particular FIG. 2C, at 131, optionally a hash of interface information of packet 101 is done; otherwise a lookup is done by comparing MAC source addresses. If a hash is done, the result may be stored as an ART Index in an xCFH for data path flow with packet 101. Assuming a hash is not done, a check is made to determine if a MAC source address for a frame obtained from packet 101 is in an ART, such as ART 800 of FIG. 8. If a MAC source address for packet 101 is not in ART 800, for example if packet 101 is an initial packet of a connection, then packet 101 is sent to NPUsoft with an error condition, for example error condition 132, meaning that ART entries need to be built for this packet 101. In addition to such an error message, the hash of packet 101, if done, optionally may be sent to NPUsoft. NPUsoft may use packet 101 for bridge learning and optionally for IEEE 802 authorization. However, NPUsoft may determine that packet 101 is to be dropped. If, however, a MAC source address for packet 101 is in ART 800, then at 131 such MAC source address is looked up from ART 800.
At 134, control bits may be read from an ART entry associated with a MAC source address looked up at 131. Control bits provide flags responsive to events, for example as indicated with respect to error conditions for invoking NPUsoft. If control bits cannot be read, then packet 101 is sent to NPUsoft with an error condition, for example error condition 135. If control bits are read at 134, then at 136 a determination is made as to whether the IIF is running NAT. Additionally, at 136, a check may be made to determine if the frame has an IP packet 101. If NAT is running, then at 137 inbound NAT filtering is done, and at 139 a check is made as to whether a frame used for packet 101 is a broadcast or multicast frame. Notably, bridging and routing may be bypassed if NAT is running. This is because an ART Index providing a pointer to table entries is embedded in an xCFH traveling with packet 101. If NAT is not running at 136, then bridging and routing is done at 138A, and at 139 a check is made as to whether a frame used for packet 101 is a broadcast or multicast frame.
If, at 139, either a multicast or broadcast frame is being used, then at 141 a check for hardware support for multicast or broadcast frame replication is made responsive to frame type. If multicast or broadcast support is found to be lacking at 141, then packet 101 is sent to NPUsoft with an error condition, for example error condition 142. If such support in hardware exists, then at 143 a check is made to determine if expansion or skipping for multicast or broadcast, depending on frame type, includes any disallowed outgoing interface (“OIF”) for a group of listeners. If one or more disallowed OIFs are included, then packet 101 is sent to NPUsoft with an error condition, for example error condition 144. Error condition 144 means that multicasting or broadcasting is not supported or that packet 101 is invalid with respect to multicasting or broadcasting. Accordingly, packet 101 may be dropped. If, however, no disallowed OIF is included as determined at 143, or neither a multicast nor a broadcast frame is used as determined at 139, then at 145 a check is made to determine if the OIF equals the IIF for packet 101. Notably, steps 146 may be moved to a routing and bridging flow, as described below in additional detail. If the IIF and the OIF are equal, then an interface mask, such as interface mask 803 of FIG. 8, is for an IEEE 802.11 interface and then packet 101 is sent to NPUsoft with an error condition, for example error condition 147, for processing by NPUsoft or dropping. If, however, the IIF and the OIF are not equal, then at 148 a check for IP protocol type of the OIF is made. At 148, it is determined whether the IP protocol type is supported on the OIF. If the IP protocol type is not supported on the OIF, then packet 101 is sent to NPUsoft with an error condition, for example error condition 149, for processing by NPUsoft or dropping.
If the IP protocol type is supported on the OIF as determined at 148, then at 151 it is determined whether broadcasting or multicasting is invoked for the OIF. Notably, determining whether broadcasting or multicasting of packets being sent out via the OIF is permitted at 151 is optional here, and may be done in a routing and bridging flow as described below. If broadcasting or multicasting is not invoked for the OIF, then packet 101 is sent to NPUsoft with an error condition, for example error condition 152, for processing by NPUsoft or dropping. If, however, broadcasting or multicasting is invoked for the OIF, responsive to frame type, or if an NPU is in Pass-through Mode B, a check is made at 153 to determine if packet 101 is a non-IP protocol packet.
Referring to FIG. 2, and in particular FIG. 2D-1, if packet 101 is of a non-IP protocol type at 153, then packet 101 is sent without outbound filtering, where at 156 packet 101 is composed to produce composed packet 158 and processed further as previously described. If, however, packet 101 is not of a non-IP protocol type, then at 154 it is determined whether the OIF and the IIF are both trusted or both not trusted, for example by processing a trust bit for each through an XOR gate. If both are trusted or both are not trusted, then composition of packet 101 takes place as previously described. If, however, one of the IIF or the OIF is trusted and the other one of the OIF and the IIF is not trusted, then at 155 outbound filtering is done. After outbound filtering, packet 101 is composed at 156 as previously described.
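The trust comparison at 154 may be sketched in software as an exclusive-OR of two trust bits. The function name `needs_outbound_filtering` is hypothetical and is used only for illustration.

```python
def needs_outbound_filtering(iif_trusted: bool, oif_trusted: bool) -> bool:
    """XOR of the trust bits: true only when exactly one interface is trusted."""
    return iif_trusted ^ oif_trusted


# Enumerate all four combinations of (IIF trusted, OIF trusted).
decisions = {
    (iif, oif): needs_outbound_filtering(iif, oif)
    for iif in (False, True)
    for oif in (False, True)
}
```

When the result is false, the packet goes directly to composition; when it is true, the packet crosses a trust boundary and outbound filtering is done first.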
Alternatively, with reference to FIG. 2D-2, an alternative compose packet flow 180A is shown. As much of compose packet flow 180A is the same as that of compose packet flow 180 of FIG. 2D-1, it is not repeated. If it is determined at 154 that the OIF and the IIF are not both trusted or untrusted, then at 201 it is determined whether packet 101 is an IP version six (“IPv6”) packet or IPv6 site boundary enforcement is active. If packet 101 is not an IPv6 packet or IPv6 site boundary enforcement is not active, then outbound filtering takes place at 155. Otherwise, at 202 a determination is made as to whether a site prefix in a destination address for IPv6 packet 101 is the same as the OIF's site prefix. If the two prefixes are not the same, packet 101 is sent to NPUsoft with an error condition or dropped at 203. Otherwise, packet 101 is sent to 155 for outbound filtering.
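The site prefix comparison at 202 may be sketched with Python's standard `ipaddress` module. The function name `same_site`, the prefix length, and the documentation-range addresses are assumptions made for illustration; how a site prefix is configured for an OIF is not specified here.

```python
import ipaddress


def same_site(destination: str, oif_site_prefix: str) -> bool:
    """True if the destination address falls within the OIF's site prefix."""
    return ipaddress.IPv6Address(destination) in ipaddress.IPv6Network(oif_site_prefix)


# Hypothetical site prefix for the outgoing interface (documentation address range).
OIF_SITE_PREFIX = "2001:db8:1::/48"

inside = same_site("2001:db8:1::5", OIF_SITE_PREFIX)   # same site: proceed to 155
outside = same_site("2001:db8:2::5", OIF_SITE_PREFIX)  # different site: error or drop at 203
```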
FIG. 3A is a flow diagram of an exemplary embodiment of a bridging and routing flow 138A. Recall from above, an ART entry hash optionally may have been done for packet 101, and such an ART index may be traveling with packet 101 via an xCFH. Thus, for each ART and RT lookup, such an ART index may be used. However, it may be that an ART entry hash for packet 101 has not been done.
Bridging and routing flow 138A is initiated at 301. At 302, a determination is made as to whether a MAC destination address of packet 101 matches an interface, such as IIF or OIF.
If a MAC destination address matches an interface for routing of packet 101, then at 303 a determination is made as to whether packet 101 contains a routable IP protocol, such as whether packet 101 is an IPv4 or IPv6 packet. If packet 101 does not contain a routable IP protocol, then packet 101 is sent to NPUsoft with an error condition, for example error condition 304, for processing by NPUsoft or dropping. If, however, packet 101 contains a routable IP protocol, such as IP version 4 (“IPv4”) or IP version 6 (“IPv6”), then at 306 a determination is made as to whether routing is supported in hardware. If routing is not supported in hardware, then packet 101 is sent to NPUsoft with an error condition, for example error condition 307, for routing by NPUsoft as described below for example with respect to one or more of instantiations 314, 316 and 318.
At 314, a network layer (“layer-3”) validity check is done, and an xCFH is marked to indicate this check has been done. If packet 101 is found to be invalid with respect to a layer-3 validity check at 314, then packet 101 is sent to NPUsoft with an error condition, for example error condition 315, for processing by NPUsoft or dropping.
If network layer validity is established, then at 316, IP options are checked, and an xCFH of packet 101 is marked to indicate that IP options have been checked. If IP options are unsupported or invalid, then packet 101 is sent to NPUsoft with an error condition, for example error condition 317, for processing by NPUsoft or dropping. If, however, all IP options are supported and valid, at 318 RT 900 is accessed looking for a match of an IP destination address for packet 101 as an entry in RT 900. If no match is found, then packet 101 is sent to NPUsoft with an error condition, for example error condition 319, for processing by NPUsoft, such as with a general routing table (“GRT”) lookup. If, however, an IP Destination Address 901 is found in RT 900 matching an IP destination address of packet 101, an ART Index 601 stored in RT 900 in association with such IP Destination Address 901 is added to an xCFH of packet 101, and then routing and bridging flow 138A returns to address translation flow 100 at 399. Additionally, the TTL in the xCFH may be decremented. Notably, it should be appreciated that RT 900 is a compact routing table as compared with conventional routing tables. This compact nature of RT 900 facilitates using exact-match comparison of the packet's IP destination address against all the entries in RT 900, instead of a “longest” match (i.e., a longest-match algorithm for finding the GRT entry with the greatest number of most-significant bits (“MSBs”) in common with the packet's IP destination address). Furthermore, if an exact match is found in RT 900, then all information for a next hop header is available. Accordingly, a next hop header may be built without having to resort to a GRT lookup. Alternatively, a MAC destination address search may be done in ART 800 for an exact match, and if an exact match is not found, the MAC destination address is stored in an xCFH of packet 101 and in RT 900.
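The difference between an exact-match lookup in a compact routing table and a longest-match lookup in a general routing table may be sketched as follows. The dictionary layouts, the function names `rt_lookup` and `grt_longest_match`, and the next-hop labels are assumptions made for illustration; they do not describe the actual table formats of RT 900 or a GRT.

```python
import ipaddress
from typing import Dict, Optional


def rt_lookup(rt: Dict[str, int], ip_dst: str) -> Optional[int]:
    """Exact match: a destination address maps directly to an ART index."""
    return rt.get(ip_dst)


def grt_longest_match(grt: Dict[str, str], ip_dst: str) -> Optional[str]:
    """Longest match: choose the matching prefix with the most leading bits."""
    addr = ipaddress.ip_address(ip_dst)
    best_len, best_hop = -1, None
    for prefix, next_hop in grt.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best_hop = net.prefixlen, next_hop
    return best_hop


rt = {"203.0.113.5": 7}                                     # hypothetical compact table
grt = {"0.0.0.0/0": "gw-default", "203.0.113.0/24": "gw-a"}  # hypothetical general table

hit = rt_lookup(rt, "203.0.113.5")           # ART index available immediately
miss = rt_lookup(rt, "203.0.113.6")          # None: fall back to software
fallback = grt_longest_match(grt, "203.0.113.6")
```

The exact-match path is a single keyed lookup, whereas the longest-match path must consider every candidate prefix, which is why a miss is handed to software in this sketch.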
If, however, at 302 a MAC destination address does not match an interface for bridging of packet 101, then at 305 a determination is made as to whether IP multicast routing is invoked and whether packet 101 is an IP multicast packet and not a broadcast packet. If both IP multicast routing is invoked and packet 101 is a multicast packet, then at 308 a determination is made as to whether an IP Source Address 902 in RT 900 matches an IP source address of packet 101. If no match of the IP source address is found, then packet 101 is sent to NPUsoft with an error condition, for example error condition 313, for processing by NPUsoft or dropping. If a match of the IP source address is found, then packet 101 is processed further as previously described starting at 314.
If, however, at 305, either IP multicast routing is not invoked or packet 101 is not a multicast packet, then checks for broadcasting of packet 101 are done beginning at 309 with a determination of whether bridging is supported in hardware. If bridging is not supported in hardware, then packet 101 is sent to NPUsoft with an error condition, for example error condition 310, for processing by NPUsoft as described below for example with respect to instantiation 311.
If bridging is supported in hardware, then at 311 ART 800 is accessed looking for a match of a MAC destination address for packet 101 as an entry in ART 800. If no match is found, then packet 101 is sent to NPUsoft with an error condition, for example error condition 312, for processing or dropping by NPUsoft. If, however, a MAC Destination Address 801 is found in ART 800 matching a MAC destination address of packet 101, an ART Index 601 stored in ART 800 in association with such MAC Destination Address 801 is added to an xCFH of packet 101, and then routing and bridging flow 138A returns to address translation flow 100 at 399.
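The bridging lookup at 311 may be sketched as a keyed search of an ART by MAC destination address, with the resulting index recorded in a per-packet structure. Representing the xCFH as a dictionary, the key name `art_index`, and the function name `bridge_lookup` are assumptions made for illustration.

```python
from typing import Dict, Optional

ERROR_312 = 312  # error condition reported to software when no ART match is found


def bridge_lookup(art_by_mac: Dict[str, int], xcfh: dict,
                  mac_dst: str) -> Optional[int]:
    """Store the ART index in the xCFH on a match; otherwise return an error code."""
    art_index = art_by_mac.get(mac_dst.lower())
    if art_index is None:
        return ERROR_312
    xcfh["art_index"] = art_index  # the index now travels with the packet
    return None


art_by_mac = {"00:11:22:33:44:55": 3}  # hypothetical ART contents
xcfh_known: dict = {}
xcfh_unknown: dict = {}
known_result = bridge_lookup(art_by_mac, xcfh_known, "00:11:22:33:44:55")
unknown_result = bridge_lookup(art_by_mac, xcfh_unknown, "66:77:88:99:aa:bb")
```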
FIG. 3B is a flow diagram of an exemplary embodiment of a bridging and routing flow 138B. Much of bridging and routing flow 138B is the same as previously described bridging and routing flow 138A, and thus is not repeated. At 301 bridging and routing flow is initiated. At 334, a layer-2 validity check is done, and an xCFH of a packet is marked to indicate such check was done. Layer-2 validity checks may include: whether a MAC source address is a non-unicast source address; whether there is a length error in the MAC frame; and whether the cracker report indicates an error in the xCFH. If layer-2 for such a packet, for example, packet 101, is invalid, then an error condition 335 is sent to NPUsoft. Notably, operation 334 need not be done here, if previously done as part of NPU Mode A flow 140. If layer-2 for packet 101 is valid, then operation 302 for a MAC address match is done.
If there is no match at 302, at 324 a check as to whether packet 101 is a unicast or broadcast packet is made. If packet 101 is a unicast or broadcast packet, then previously described operations 309 and 311 may be done. Otherwise, at 325 it is determined whether this multicast frame, by process of elimination, has an IP packet. If there is no IP packet, then previously described operations 309 and 311 may be done. Otherwise, at 326 it is determined whether packet 101 is a valid IP multicast frame and packet. If packet 101 is found not to be valid at 326, then an error condition 329 is sent to NPUsoft. Otherwise, at 327 it is determined if multicast routing is active. If not active, then previously described operations 309 and 311 may be done. If multicast routing is active, then at 328 operation 308 is done with one addition, namely, storing a reverse path forwarding interface (“RFPi”). An RFPi is an interface on which a packet from a source of the packet would be expected to arrive, determined for example by looking up a source's IP address in a routing table and seeing if the interface on which the packet arrived was indeed the same interface that the router would use if it had to send a packet in the direction of the source of the packet that arrived. At 314, layer-3 validity is checked as previously described.
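The reverse path forwarding comparison described above may be sketched as follows. The route table layout, the interface names, and the function names `interface_toward` and `rpf_ok` are hypothetical and serve only to illustrate the check.

```python
import ipaddress
from typing import Dict, Optional


def interface_toward(routes: Dict[str, str], address: str) -> Optional[str]:
    """Interface the router would use to send toward `address` (longest match)."""
    ip = ipaddress.ip_address(address)
    best_len, best_if = -1, None
    for prefix, interface in routes.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and net.prefixlen > best_len:
            best_len, best_if = net.prefixlen, interface
    return best_if


def rpf_ok(routes: Dict[str, str], ip_src: str, arrival_interface: str) -> bool:
    """Pass only if the packet arrived on the interface that leads back to its source."""
    return interface_toward(routes, ip_src) == arrival_interface


routes = {"10.1.0.0/16": "eth0", "10.2.0.0/16": "eth1"}  # hypothetical routing table
expected_path = rpf_ok(routes, "10.1.5.9", "eth0")
unexpected_path = rpf_ok(routes, "10.1.5.9", "eth1")
```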
If there is a MAC address match at 302, operations 303, 306 and 314 may be done. From operation 314, an optional check to determine if packet 101 has any IP options may be made at 334. If there are no IP options, then operation 318 is done as previously described. If there are one or more IP options, then operations 316 and 318 may be done as previously described.
Notably, all broadcast frames for flow 138B are processed on the bridging path. Furthermore, ART entries may be set up such that NPUsoft gets a copy of each broadcast frame.
FIG. 3C is a flow diagram of an exemplary embodiment of a bridging and routing flow 138C. Much of bridging and routing flow 138C is the same as previously described bridging and routing flows 138A and 138B, and thus is not repeated. Routing and bridging flow 138C is initiated at 301. At 354, layer-2 and layer-3 validity checks are done. If layer-2 or layer-3 is invalid, an error condition 355 is sent to NPUsoft. Otherwise, at 344 it is determined whether the frame is a broadcast frame. If the frame for packet 101 is a broadcast frame, then at 347 it is determined whether the Operating System (“OS”) is to process packet 101. If the OS is not to process packet 101, an error condition 349 is sent to NPUsoft. If, however, the OS is to process packet 101, then packet 101 is forwarded to an IP stack of the host device.
If, however, at 344 it is determined that a frame for packet 101 is not a broadcast frame, then at 345 it is determined whether the frame is a multicast frame. If it is determined that the frame is a multicast frame, then at 348 it is determined whether the OS is to process the packet. If the OS is to process packet 101, then packet 101 is forwarded to an IP stack of the host device. If the OS is not to process packet 101, then previously described operation 327 is done, except that if multicast routing is not active an error condition 351 is sent to NPUsoft. If multicast routing is active, then operation 308 is done. If a source address is found from operation 308, then at 356 it is determined whether unicast routing is supported in hardware, such as an NPU.
However, if at 345 it is determined that a frame for packet 101 is not a multicast frame, then at 346 it is determined whether a MAC destination address for the frame matches a MAC address of an IIF for packet 101. If there is no address match, then previously described operations 309 and 311 may be done. If there is an address match, then at 352 it is determined whether a protocol for packet 101 is routable on such an IIF. If this protocol is not a routable protocol for this IIF, then an error condition 353 is sent to NPUsoft. If this protocol is a routable protocol for this IIF, then at 356 it is determined whether unicast routing is supported in hardware, such as an NPU.
From 356, if it is determined that unicast routing is not supported in hardware, then an error condition 354 is sent to NPUsoft. Otherwise, previously described operations 316 and 318 may be done.
As mentioned above with reference to FIG. 2C, operations 146 could be incorporated into bridging and routing flow 138C, where operations 344 and 345 in combination provide operation 139 of FIG. 2C. Additionally, instead of having the OS process packet 101, operations 141 and 143 may be done.
FIG. 4A is a flow diagram of an exemplary embodiment of an inbound NAT filtering flow 137. Inbound NAT filtering flow 137 is initiated at 401. At 402, a check for hardware support for NAT is made. If no such support is available, then packet 101 is sent to NPUsoft with an error condition, for example error condition 403, for processing by NPUsoft as described below. If, however, NAT processing is supported in hardware, then at 404 a layer-3 validity check is done. Notably, if layer-2 validity checking is not done as part of NPU mode A flow 140, then layer-2 validity is also checked at 404. For clarity, it is assumed that only layer-3 validity is checked at 404, though both layer-2 and layer-3 validity may be checked at 404 where both need to be valid to pass or where if one is invalid, an error condition indicating which or both of layers-2 and -3 is invalid is sent. If the layer-3 validity check comes back with an invalid condition, then packet 101 is sent to NPUsoft with an error condition, for example error condition 405, for processing or dropping by NPUsoft as an invalid packet. If layer-3 is valid, then at 406 an IP options check is done. If one or more IP options are unsupported or invalid, then packet 101 is sent to NPUsoft with an error condition, for example error condition 407, for processing by NPUsoft as having one or more unsupported or invalid IP options. If all IP options are supported and valid, then at 408 a check is made to determine if packet 101 is an IP fragment, namely, from a fragmented packet. If packet 101 is a fragment, then packet 101 is sent to NPUsoft with an error condition, for example error condition 409, for processing by NPUsoft. Notably, NPUsoft may employ “fragment absorption,” where received fragment packets are all collected and reassembled, where possible, before being forwarded, as described below.
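The ordered checks at 402 through 408 may be sketched as a sequence of early exits, each returning the error condition that would accompany the packet to software. The `PacketInfo` fields and the function name `inbound_nat_prechecks` are assumptions made for illustration; the returned numbers simply reuse the example error condition numerals given above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PacketInfo:
    l3_valid: bool        # result of the layer-3 validity check (404)
    ip_options_ok: bool   # all IP options supported and valid (406)
    is_fragment: bool     # packet is an IP fragment (408)


def inbound_nat_prechecks(pkt: PacketInfo, hw_nat_supported: bool) -> Optional[int]:
    """Return the first failing error condition, or None to continue in hardware."""
    if not hw_nat_supported:
        return 403
    if not pkt.l3_valid:
        return 405
    if not pkt.ip_options_ok:
        return 407
    if pkt.is_fragment:
        return 409
    return None


clean = inbound_nat_prechecks(PacketInfo(True, True, False), hw_nat_supported=True)
fragment = inbound_nat_prechecks(PacketInfo(True, True, True), hw_nat_supported=True)
```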
If packet 101 is not a fragment, then it is determined what type of packet it is for further processing. If packet 101 is a TCP packet as found at 410, then at 411 it is determined if packet 101 is for a new connection. For example, if TCP state has synchronize (“SYN”) equal to one, then this is for a new connection. If packet 101 is for a new connection, then packet 101 is sent to NPUsoft with an error condition, for example error condition 412, for processing by NPUsoft. Thus, NPUsoft will use information from packet 101 to build an entry in CT 600 and NT 700 prior to returning packet 101 to address translation flow 100.
If packet 101 is not for a new connection, or if at 410 packet 101 is found not to be a TCP packet but at 413 is found to be a UDP packet, then at 414, NT 700 is accessed to look up an inbound five-tuple for packet 101. A hash of a five-tuple of packet 101 is done prior to this lookup. For example, during building of entries in CT 600 and in NT 700 for this packet 101 or a previous packet 101 for the same connection, a hash of a five-tuple may be stored in CT 600 and in NT 700 in association with such a five-tuple for cross-linking tables CT 600 and NT 700. Recall, packet 101 may be a remote or local inbound packet to the NPU. If the five-tuple for packet 101 is not in NT 700, then packet 101 is sent to NPUsoft with an error condition, for example error condition 415, for processing to build an entry in CT 600 and NT 700 prior to returning packet 101 to address translation flow 100, or for dropping by NPUsoft. If, however, the five-tuple for packet 101 is in NT 700, then at 414 a CT Index hashed from such a five-tuple of packet 101 is stored in an xCFH of packet 101. Processing of packet 101 proceeds at 416. At 416, an NT Index obtained from CT 600 in association with a five-tuple entry matching that of packet 101 is stored in an xCFH of packet 101. This lookup in CT 600 is done with the recently obtained CT Index added to an xCFH of packet 101. As mentioned above, such an NT Index and a CT Index may be from a hash done in hardware or with NPUsoft when building a respective entry in NT 700 and CT 600 for a prior packet of this connection for packet 101. Furthermore, it should be appreciated that for NAT, translation is done by a gateway device between a remote computer and a local computer. Thus, to obtain an address and port number of a local computer for NAT, CT 600 is used, and to obtain an address and port number of a gateway device, NT 700 is used.
If packet 101 is not found to be a UDP packet at 413 but is found to be a GRE packet at 417, then at 418, NT 700 is accessed to look up an inbound “five-tuple” for packet 101. Here, “five-tuple” is meant to include a GRE Call ID split into two data spaces, turning a four-tuple into a pseudo-five-tuple. Thus, a GRE Call ID is used in part for this lookup. Recall, packet 101 may be a remote or local inbound packet to the NPU. If the five-tuple for packet 101 is not in NT 700, then packet 101 is sent to NPUsoft with an error condition, for example error condition 419, for processing or dropping by NPUsoft. If, however, the five-tuple for packet 101 is in NT 700, then at 418 a CT Index hashed from such a five-tuple is obtained from NT 700 and is stored in an xCFH of packet 101. Processing of packet 101 proceeds at 416. At 416, an NT Index obtained from CT 600 in association with a five-tuple entry matching that of packet 101 is stored in an xCFH of packet 101. This lookup in CT 600 may be done using the recently obtained CT Index stored in an xCFH of packet 101. As mentioned above, such an NT Index and a CT Index may be from a hash done in hardware or with NPUsoft when building a respective entry in NT 700 and CT 600 for a prior packet of this connection.
If packet 101 is not found to be a GRE packet at 417 but is found to be an IPSec packet at 420, then at 421, NT 700 is accessed to look up an inbound “five-tuple” for packet 101. Here, “five-tuple” is meant to include an SPI split into two data spaces, turning a four-tuple into a pseudo-five-tuple. Thus, an SPI is used in part for this lookup. Recall, packet 101 may be a remote or local inbound packet to the NPU. If the five-tuple for packet 101 is not in NT 700, then packet 101 is sent to NPUsoft with an error condition, for example error condition 422, for processing to build an entry in CT 600 and NT 700 prior to returning packet 101 to address translation flow 100, or for dropping by NPUsoft. If, however, the five-tuple for packet 101 is in NT 700, then at 421 a CT Index hashed from such a five-tuple and looked up in NT 700 is stored in an xCFH of packet 101. Processing of packet 101 proceeds at 416. At 416, an NT Index obtained from CT 600 in association with a five-tuple entry matching that of packet 101 is stored in an xCFH of packet 101. This lookup in CT 600 may be done using the recently obtained CT Index stored in an xCFH of packet 101. As mentioned above, such an NT Index and a CT Index may be from a hash done in hardware or with NPUsoft when building a respective entry in NT 700 and CT 600 for a prior packet of this connection.
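One way a pseudo-five-tuple might be formed from an SPI is sketched below. The description above does not state how the SPI is split; this sketch assumes that the 32-bit SPI is divided into two 16-bit halves that occupy the positions a TCP or UDP five-tuple uses for source and destination ports. The function name `spi_pseudo_five_tuple` is likewise hypothetical.

```python
from typing import Tuple

ESP_PROTOCOL = 50  # IP protocol number for IPSec ESP


def spi_pseudo_five_tuple(ip_src: str, ip_dst: str, protocol: int,
                          spi: int) -> Tuple[str, int, str, int, int]:
    """Assumed split: upper and lower 16 bits of the SPI fill the two port slots."""
    upper = (spi >> 16) & 0xFFFF
    lower = spi & 0xFFFF
    return (ip_src, upper, ip_dst, lower, protocol)


pseudo = spi_pseudo_five_tuple("203.0.113.5", "198.51.100.1",
                               ESP_PROTOCOL, 0x12345678)
```

Shaping the key this way would allow the same table layout and hash to serve TCP, UDP and IPSec entries alike, which is the apparent purpose of the split.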
If packet 101 is not found to be an IPSec packet at 420 or an Internet Control Message Protocol (ICMP) packet at 423, then packet 101 is sent to NPUsoft with an error condition, for example error condition 424, for processing to build an entry in CT 600 and NT 700 prior to returning packet 101 to address translation flow 100, or for dropping by NPUsoft. If packet 101 is not found to be an IPSec packet at 420 but is found to be an ICMP packet at 423, then at 425 a check is made to determine if packet 101 is on a list of supported ICMP packet types stored in memory, such as ICMP version 4 (“ICMPv4”) and ICMP version 6 (“ICMPv6”). If packet 101 type is not on the list of supported ICMP packet types, then packet 101 is sent to NPUsoft with an error condition, for example error condition 426, for processing or dropping by NPUsoft. If packet 101 type is on the list of supported ICMP packet types, then processing of packet 101 proceeds at 427.
At 427, from 425 or from 416, an ART Index is stored in an xCFH of packet 101. The ART Index is obtained from CT 600 or NT 700, using a CT Index or NT Index, respectively, from an xCFH of packet 101 or is obtained from a five-tuple entry matching that of packet 101 in one of CT 600 or NT 700 for ICMP packets. At 428, inbound NAT filtering flow 137 returns to address translation flow 100. Notably, a hash for generating an ART Index may be of an entry or portion thereof in ART 800, and such a hash may be done when building an entry for packet 101 or a prior packet 101 for the same connection in ART 800.
Notably, a hash function computes a hash value based on a packet's five-tuple information, and this hash value is used as an index to NT 700. The same hash function is used for creating NT and CT indices. However, input to the hash function is not the same for creating a CT index as it is for creating an NT index. In other words, an NT index uses public address information as part of the hash function input, and a CT index uses local address information as part of the hash function input instead of the public address information. However, a CT index may be created from local address information and stored in place of an NT index in CT 600 when NAT is not active.
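As a minimal sketch of the preceding point, the following uses one hash function for both tables and varies only its input. The choice of CRC-32, the table size, and the names `tuple_hash`, `nt_index`, and `ct_index` are assumptions made for illustration and are not taken from the described embodiment.

```python
import zlib

TABLE_SIZE = 1024  # assumed number of slots; the text does not give a size


def tuple_hash(proto, addr_a, port_a, addr_b, port_b):
    """Single shared hash over five-tuple fields (CRC-32 is an assumption)."""
    key = f"{proto}|{addr_a}|{port_a}|{addr_b}|{port_b}".encode()
    return zlib.crc32(key) % TABLE_SIZE


def nt_index(proto, public_addr, public_port, remote_addr, remote_port):
    # NT index: public address information is part of the hash input.
    return tuple_hash(proto, public_addr, public_port, remote_addr, remote_port)


def ct_index(proto, local_addr, local_port, remote_addr, remote_port):
    # CT index: local address information replaces the public information.
    return tuple_hash(proto, local_addr, local_port, remote_addr, remote_port)


nt = nt_index("tcp", "203.0.113.7", 40000, "198.51.100.9", 80)
ct = ct_index("tcp", "192.168.0.5", 1234, "198.51.100.9", 80)
```

Because only the input differs, the two indices for one connection are generally unrelated values, which is why each table stores the other table's index rather than recomputing it.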
FIG. 4B is a flow diagram of an exemplary alternative embodiment of a portion of NAT filtering flow 137 of FIG. 4A. Rather than obtaining and storing a CT index at 414, 418 and 421 as in FIG. 4A, no CT index is obtained and stored at corresponding blocks 444, 448 and 441. Rather, at 428, CT, NT and ART indices are obtained and stored in CFHs of packet 101. Additionally, “Time To Live” (“TTL”) in CFH is decremented at 432. Another difference from the flow of FIG. 4A is that rather than obtaining and storing an ART index for an ICMP packet on the list at 425, an error or state condition is sent at 431 to NPUsoft. As the remainder of FIGS. 4A and 4B are the same, the description is not repeated.
FIG. 5A is a flow diagram of an exemplary embodiment of an outbound filtering flow 155. Much of outbound filtering flow 155 is similar to inbound NAT filtering flow 137, and thus is not repeated here. Outbound filtering flow 155 is initiated at 501. At 502, a check for hardware support for firewall processing is made. If no such support is available, then packet 101 is sent to NPUsoft with an error condition, for example error condition 503, for processing by NPUsoft as described below.
If, however, firewall processing is supported in hardware, then at 504 a check is made to determine if packet 101 is an IP fragment, namely, from a fragmented packet. If packet 101 is a fragment, then packet 101 is sent to NPUsoft with an error condition, for example error condition 505, for processing by NPUsoft. Notably, NPUsoft may employ “fragment absorption,” where received fragment packets are all collected and reassembled, where possible, before being forwarded, as described below.
If, however, packet 101 is not an IP fragment, then at 529 a check is made to determine if the IIF for packet 101 was running NAT. If the IIF was running NAT, then at 516 an NT Index is obtained from an xCFH of packet 101 to find a five-tuple in NT 700. Alternatively, a CT Index may be obtained from an xCFH of packet 101 to obtain a five-tuple from CT 600, if stored therein. After which, processing of packet 101 continues at 531, as described below.
If, however, at 529, the IIF of packet 101 was not running NAT, then at 506 a layer-3 validity check is done. Notably, if layer-2 validity checking is not done as part of NPUmode A flow 140, then layer-2 validity is also checked at 506. For clarity, it is assumed that only layer-3 validity is checked at 506, though both layer-2 and layer-3 validity may be checked at 506, where both need to be valid to pass or where, if one is invalid, an error condition indicating which or both of layers-2 and -3 is invalid is sent. If the layer-3 validity check comes back with an invalid condition, then packet 101 is sent to NPUsoft with an error condition, for example error condition 507, for processing or dropping by NPUsoft as an invalid packet. If layer-3 is valid, then at 508 an IP options check is done. If one or more IP options are unsupported or invalid, then packet 101 is sent to NPUsoft with an error condition, for example error condition 509, for processing by NPUsoft as having one or more unsupported or invalid IP options.
If all IP options are supported and valid at 508, then a check is made at 510 to determine if packet 101 is a TCP packet. If packet 101 is determined to be a TCP packet, then at 511 it is determined if packet 101 is for a new connection (i.e., SYN equal to 1). If packet 101 is for a new TCP connection or new “handshake,” then packet 101 is sent to NPUsoft with an error condition, for example error condition 512, for processing to build an entry in CT 600 prior to returning packet 101 to address translation flow 100. If packet 101 is not for a new TCP connection, or if at 510 packet 101 is found not to be a TCP packet but at 513 is found to be a UDP packet, then at 514 a check for an NT Index, such as from a prior hash of a five-tuple for packet 101 or a prior packet for the same connection, is made by doing a CT 600 lookup for an outbound five-tuple matching the five-tuple of packet 101. Recall, packet 101 may be a remote or local outbound packet to the NPU. If the five-tuple for packet 101 is not in CT 600, then packet 101 is sent to NPUsoft with an error condition, for example error condition 539, for processing to build an entry in CT 600 prior to returning packet 101 to address translation flow 100, or for dropping by NPUsoft. If, however, the five-tuple for packet 101 is in CT 600, then at 514 an NT Index hashed from such a five-tuple is stored in an xCFH of packet 101, provided such an NT Index is present in CT 600. Notably, if a firewalling-only mode is being used, namely, a mode without any NAT, then no NT index will be present in CT 600. Processing of packet 101 proceeds at 531.
At 531, a check is made to determine or confirm (as it may have previously been determined at 510 that packet 101 is a TCP packet), as applicable, if packet 101 is a TCP packet and if packet 101 has a TCP state error. A TCP state error results when the state of a packet does not match the state of a connection associated with the packet. Notably, the check at 531 is inapplicable to UDP packets, as they just flow through 531. Furthermore, TCP state tracking as described below, or a subset thereof, may be used for TCP state error check 531. If packet 101 is a TCP packet and has a TCP state error, then packet 101 is sent to NPUsoft with an error condition, for example error condition 515, for processing or dropping by NPUsoft. If, however, at 531 either packet 101 is not a TCP packet or does not have a TCP state error, then processing of packet 101 proceeds at 532, as described below.
If packet 101 is not found to be a UDP packet at 513 but is found to be a GRE packet at 517, then at 518, CT 600 is accessed with a five-tuple from packet 101 to look up an outbound five-tuple for packet 101. Recall, packet 101 may be a remote or local outbound packet to the NPU, and part of the five-tuple is a GRE Call ID. If the five-tuple for packet 101 is not in CT 600, then packet 101 is sent to NPUsoft with an error condition, for example error condition 519, for processing to build an entry in CT 600 prior to returning packet 101 to address translation flow 100, or for dropping by NPUsoft. If, however, the five-tuple for packet 101 is in CT 600, then at 518 an NT Index, hashed from such a five-tuple, is obtained from CT 600 if present and is stored in an xCFH of packet 101. Processing of packet 101 proceeds at 532, as described below.
If packet 101 is not found to be a GRE packet at 517 but is found to be an IPSec packet at 520, then at 521, CT 600 is accessed with a five-tuple of packet 101 to look up an outbound five-tuple for packet 101. Recall, packet 101 may be a remote or local outbound packet to the NPU, and part of the five-tuple is an SPI. If the five-tuple for packet 101 is not in CT 600, then packet 101 is sent to NPUsoft with an error condition, for example error condition 522, for processing to build an entry in CT 600 prior to returning packet 101 to address translation flow 100, or for dropping by NPUsoft. If, however, the five-tuple for packet 101 is in CT 600, then at 521 an NT Index, hashed from such a five-tuple, is obtained from CT 600 if present and is stored in an xCFH of packet 101. Processing of packet 101 proceeds at 532, as described below.
At 532, a check is made to determine if the OIF of packet 101 is running NAT. If the OIF of packet 101 is not running NAT, at 528 outbound filtering flow 155 returns to address translation flow 100. If, however, the OIF of packet 101 is running NAT, then at 527 an entry in NT 700 is accessed using an NT Index obtained from an xCFH of packet 101. After which, at 528 outbound filtering flow 155 returns to address translation flow 100.
If packet 101 is not found to be an IPSec packet at 520 or an ICMP packet at 523, then packet 101 is sent to NPUsoft with an error condition, for example error condition 524, for processing or dropping by NPUsoft. If packet 101 is not found to be an IPSec packet at 520 but is found to be an ICMP packet at 523, then at 525 a check is made to determine if packet 101 is on a list of supported ICMP packet types stored in memory, such as ICMPv4 and ICMPv6. If packet 101 type is not on the list of supported ICMP packet types, then packet 101 is sent, for example to NPUsoft, with an error condition, for example error condition 526, for allowing such a packet to pass through or to be dropped. Notably, if an ICMP packet type is not on the list, the default may be to drop the packet or to allow the packet to pass through the NPU, which outcome may be dependent on the type of ICMP packet. If packet 101 type is on the list of supported ICMP packet types, at 528 outbound filtering flow 155 returns to address translation flow 100.
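The supported-type check at 525 and its type-dependent default can be sketched as follows. The particular type numbers placed on each list (echo reply 0, echo request 8, destination unreachable 3) and the name `icmp_action` are assumptions chosen for illustration; the text does not say which types an embodiment supports.

```python
# Assumed example of a supported-type list loaded by a driver.
SUPPORTED_ICMP_TYPES = {0, 8}   # echo reply, echo request

# Assumed example of unlisted types that default to pass-through.
PASS_BY_DEFAULT = {3}           # destination unreachable


def icmp_action(icmp_type):
    """Return 'continue' for a supported type, else a per-type default."""
    if icmp_type in SUPPORTED_ICMP_TYPES:
        return "continue"       # flow returns to address translation
    # Not on the list: outcome depends on the type of ICMP packet.
    return "pass" if icmp_type in PASS_BY_DEFAULT else "drop"
```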
Notably, by using indices stored in an xCFH of a packet, information is handed down from inbound filtering to outbound filtering. This is particularly useful when NAT is being used, where outbound filtering is substantially simplified by having access to an index to NT 700. Furthermore, it should be appreciated that ordering of the steps may be altered. For example, a check for an ICMP packet type at 423 or 523 may be done prior to checking for any other packet type. However, as NAT inbound and outbound filtering is not supported for ICMP error packet payloads, doing ICMP toward the end makes sense.
FIG. 5B is a flow diagram of an exemplary embodiment of an outbound filtering flow 155A. Much of outbound filtering flow 155A is similar to outbound filtering flow 155, and thus much is not repeated here. Outbound filtering is initiated at 501. At 502, it is determined if firewall processing is supported in hardware. At operation 529, it is determined whether an IIF is running NAT. If the IIF is not running NAT, then operations 506, 508 and 504 are done as previously described. Packet processing operations for non-NAT inbound filtering are the same as for outbound filtering flow 155, except that output from operation 514 for a match is processed differently, namely, if a match is found in CT 600 at 514, then at 581 it is determined whether packet 101 is a TCP packet. Additional details are provided below for post operation 581 processing.
If, at 529, an IIF is running NAT, then at 566 CT and NT indices are obtained from CFHs for packet 101. At 567, packet 101 is translated from a local or private address to a gateway or public address using information obtained from CT 600 and NT 700 lookups using CT and NT indices to obtain local, public and remote address information. At 581, it is determined whether packet 101 is a TCP packet.
If, at 581, packet 101 is found not to be a TCP packet, then at 532 it is determined if an OIF is running NAT. If the OIF is running NAT, then at 586 a five-tuple is looked up using an NT index from a CFH of packet 101 to do the NT 700 lookup. A CT index is obtained from NT 700 during the NT index lookup and stored in a CFH for packet 101, if not already present in the CFH for packet 101. At 587, packet 101 is translated from a local or private address to a gateway or public address using information obtained from CT 600 and NT 700 lookups using CT and NT indices to obtain local, public and remote address information. After which, outbound filtering flow 155A returns at 528. Additionally, if the OIF is not running NAT, then outbound filtering flow 155A returns at 528.
If, however, at 581 packet 101 is found to be a TCP packet, then at 582 TCP options are checked. If TCP options are not okay, an error condition 585 is sent to NPUsoft. If TCP options are okay, then at 583 a check is made for a TCP state error. If there is a TCP state error, an error condition 584 is sent to NPUsoft. If there is no TCP state error, then a check for the OIF running NAT at 532 is made as previously described.
FIG. 10 is a state transition diagram of an exemplary embodiment of a state tracking flow 531 for tracking a packet. State tracking flow 531 is described in terms of well-known code bits for TCP state transitions, such as Synchronize (“SYN”), Acknowledge (“ACK”), Reset (“RST”), Finished (“FIN”), and Received (“RCVD”), among others. State tracking flow 531 may be implemented in hardware or software, including a combination thereof, in the form of a state machine. State tracking flow 531 starts in a CLOSED state 998, from which a passive open causes a transition to LISTEN state 903 or from which a sent SYN causes a transition to SYN-SENT state 904. From LISTEN state 903, transitioning to CLOSED state 998 occurs responsive to an age out or close condition. Notably, an RST received (not shown) within a valid receive window or sent out within a valid send window causes a transition to a CLOSED state 998 or 999.
From LISTEN state 903, transitioning to SYN-RCVD state 905 occurs responsive to a received SYN. From LISTEN state 903, transitioning to SYN-SENT state 904 occurs responsive to a sent SYN.
From SYN-RCVD state 905, transitioning to SYN-RCVD-SYN-SENT state 906 occurs responsive to a sent SYN, and transitioning to SYN-RCVD-SYN-ACK-SENT state 912 occurs responsive to a sent SYN-ACK.
From SYN-SENT state 904, transitioning to SYN-RCVD-SYN-SENT state 906 occurs responsive to a received SYN, and transitioning to SYN-SENT-SYN-ACK-RCVD state 913 occurs responsive to a received SYN-ACK.
From SYN-RCVD-SYN-SENT state 906, transitioning to SYN-RCVD-SYN-SENT1 state 907 occurs responsive to a sent SYN-ACK, and transitioning to SYN-RCVD-SYN-SENT2 state 908 occurs responsive to a received SYN-ACK.
From SYN-RCVD-SYN-SENT1 state 907, transitioning to a connection ESTABLISHED state 909 occurs responsive to a received SYN-ACK. From SYN-RCVD-SYN-SENT2 state 908, transitioning to ESTABLISHED state 909 occurs responsive to a sent SYN-ACK.
From SYN-RCVD-SYN-ACK-SENT state 912, transitioning to ESTABLISHED state 909 occurs responsive to a received ACK of a SYN. From SYN-SENT-SYN-ACK-RCVD state 913, transitioning to ESTABLISHED state 909 occurs responsive to a sent ACK of a SYN.
From ESTABLISHED state 909, SYN-RCVD-SYN-ACK-SENT state 912 or SYN-SENT-SYN-ACK-RCVD state 913, transitioning to FIN-WAIT1 state 914 occurs responsive to a sent FIN. From ESTABLISHED state 909, SYN-RCVD-SYN-ACK-SENT state 912 or SYN-SENT-SYN-ACK-RCVD state 913, transitioning to CLOSE-WAIT-FIN state 915 occurs responsive to a received FIN.
From FIN-WAIT1 state 914, transitioning to CLOSING-FIN state 917 occurs responsive to a received FIN, transitioning to FIN-WAIT2 state 916 occurs responsive to a received ACK of a FIN, and transitioning to FIN-WAIT2-FIN state 921 occurs responsive to a received FIN and a received ACK of the FIN in the same packet.
From CLOSE-WAIT-FIN state 915, transitioning to CLOSING-FIN state 917 occurs responsive to a sent FIN, transitioning to CLOSE-WAIT state 918 occurs responsive to a sent ACK of a FIN, and transitioning to LAST-ACK state 923 occurs responsive to a sent FIN and a sent ACK of the FIN in the same packet.
From FIN-WAIT2 state 916, transitioning to FIN-WAIT2-FIN state 921 occurs responsive to a received FIN. From CLOSE-WAIT state 918, transitioning to LAST-ACK state 923 occurs responsive to a sent FIN.
From CLOSING-FIN state 917, transitioning to FIN-WAIT2-FIN state 921 occurs responsive to a received ACK of a FIN, and transitioning to CLOSING state 922 occurs responsive to a sent ACK of a FIN.
From CLOSING state 922, transitioning to TIME-WAIT state 924 occurs responsive to a received ACK of a FIN. From FIN-WAIT2-FIN state 921, transitioning to TIME-WAIT state 924 occurs responsive to a sent ACK of a FIN.
From LAST-ACK state 923, transitioning to CLOSED state 999 occurs responsive to a received ACK of a FIN. From TIME-WAIT state 924, transitioning to CLOSED state 999 occurs responsive to a timed out condition.
For a hardware and software embodiment, CLOSED states 998 and 999 are hardware and software states. States within dashed-box 997 are software states, and states within dashed-box 996 are hardware states.
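A state machine of this kind can be sketched as a lookup table keyed by current state and observed event. The table below covers only a subset of the transitions described for FIG. 10 (the non-simultaneous open and close paths), so in this sketch an omitted transition is reported the same way as a genuine state error. The event strings and the names `TRANSITIONS`, `step`, and `run` are assumptions for illustration.

```python
# (current state, event) -> next state; subset of the FIG. 10 transitions.
TRANSITIONS = {
    ("CLOSED", "passive open"): "LISTEN",
    ("CLOSED", "sent SYN"): "SYN-SENT",
    ("LISTEN", "rcvd SYN"): "SYN-RCVD",
    ("LISTEN", "sent SYN"): "SYN-SENT",
    ("SYN-RCVD", "sent SYN-ACK"): "SYN-RCVD-SYN-ACK-SENT",
    ("SYN-SENT", "rcvd SYN-ACK"): "SYN-SENT-SYN-ACK-RCVD",
    ("SYN-RCVD-SYN-ACK-SENT", "rcvd ACK"): "ESTABLISHED",
    ("SYN-SENT-SYN-ACK-RCVD", "sent ACK"): "ESTABLISHED",
    ("ESTABLISHED", "sent FIN"): "FIN-WAIT1",
    ("ESTABLISHED", "rcvd FIN"): "CLOSE-WAIT-FIN",
    ("FIN-WAIT1", "rcvd ACK"): "FIN-WAIT2",
    ("FIN-WAIT2", "rcvd FIN"): "FIN-WAIT2-FIN",
    ("FIN-WAIT2-FIN", "sent ACK"): "TIME-WAIT",
    ("TIME-WAIT", "timeout"): "CLOSED",
    ("CLOSE-WAIT-FIN", "sent ACK"): "CLOSE-WAIT",
    ("CLOSE-WAIT", "sent FIN"): "LAST-ACK",
    ("LAST-ACK", "rcvd ACK"): "CLOSED",
}


def step(state, event):
    """Return the next state, or None to signal a TCP state error."""
    return TRANSITIONS.get((state, event))


def run(events, state="CLOSED"):
    """Apply events in order; stop with None at the first state error."""
    for event in events:
        state = step(state, event)
        if state is None:
            return None
    return state


active_open = run(["sent SYN", "rcvd SYN-ACK", "sent ACK"])
bad_sequence = run(["sent SYN", "sent FIN"])
```

Here `active_open` ends in `"ESTABLISHED"`, while `bad_sequence` yields `None` because no listed transition leaves SYN-SENT on a sent FIN.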
Referring to FIG. 12A, there is shown a block diagram of an exemplary embodiment of an NPU 1070. NPU 1070 comprises MAC interface (“MI”) 1010, sequence processor 1020, address translator 1030, hostMAC 1040, and front end 1050. NPU 1070 micro architecture uses a hardwired pipeline without a central processing unit (“CPU”) core. A network driver program, including a software or data portion of address translation flow 100, may be stored in system memory. Such a network driver program and NPU 1070 communicate with one another using commands via push buffers (“PBs”), namely, a command buffer going from software to NPU or NPU to software, as described in additional detail below.
Input from MAC layer 1097 and output to MAC layer or host bus 1098 may be in a form compatible with one or more of Ethernet 10/100/1000 mega-bits-per-second (“Mbps”) (“IEEE 802.3”) for local area network (“LAN”) connectivity, Home Phoneline Network Alliance (“HomePNA” or “HPNA”), wireless local area network (“WLAN”) (“IEEE 802.11”), and a digital signal processor (“DSP”) MAC layer, among others. Though a personal computer workstation embodiment is described herein, it should be understood that NPU 1070 may be used in other known devices for network connectivity, including, but not limited to, routers, switches, gateways, and the like. Furthermore, a host or local bus may be a Fast Peripheral Component Interconnect (“FPCI”) bus; however, other buses, whether directly accessed or coupled to a host bus, include, but are not limited to, Peripheral Component Interconnect (“PCI”), 3GIO, Video Electronic Standards Association (“VESA”), VersaModule Eurocard (“VME”), Vestigial Side Band (“VSB”), Accelerated Graphics Port (“AGP”), Intelligent I/O (“I2O”), Small Computer System Interface (“SCSI”), Fiber Channel, Universal Serial Bus (“USB”), IEEE 1394 (sometimes referred to as “Firewire,” “iLink” and “Lynx”), Personal Computer Memory Card International Association (“PCMCIA”), and the like.
NPU 1070 receives a frame input from MAC layer 1097. This frame flows through NPU 1070's pipeline, starting with MAC interface 1010. MAC interface 1010 receives one or more frame inputs 1011. MAC interface 1010 is coupled to front end 1050 for access to memory 1052 via memory arbiter 1051. Notably, memory 1052 may be memory local to NPU 1070 or system memory of a host system. Frame inputs 1011 are processed in part by placing them into staging buffers in cache memory 1013. If capacity of staging buffers is exceeded or downstream NPU 1070 pipeline is blocked, spill over frames are queued in memory 1052.
Frame inputs 1011 each have a respective CFH added to the beginning of a frame to indicate its type and input MAC index. Notably, handling of frame inputs 1011 can depend at least in part on frame type. For example, WLAN management frames and like frame types have their CFH marked for being passed directly to HostMAC 1040, while other frames are passed to sequence processor 1020.
For purposes of clarity of explanation, processing of one frame through NPU 1070 pipeline will be described, though it should be understood that multiple frames may be pipeline-processed through NPU 1070. Lookup tables in memory 1052 may include state tables 600, 700, 800, and 900, as described above, as well as a list of supported ICMP types 1071. Supported ICMP types may be loaded from a network driver program. Sequence processor 1020 on an inbound side may include a decapsulation module 1021, a validation module 1022 and a security module 1023A.
Address translator 1030 provides NAT for converting public IP addresses to private IP addresses. However, if a packet is from a LAN, then conventionally no address translation is done. Rather, NAT is done for a packet communicated over a wide area network (“WAN”), including, but not limited to, a portion of the Internet. Security modules for incoming and outgoing packets 1023A and 1023B, respectively, may be instantiated in sequence processor 1020. For example, IPSec may be used with NAT as described in a co-pending U.S. patent application entitled “METHOD AND APPARATUS FOR SECURITY PROTOCOL AND ADDRESS TRANSLATION INTEGRATION” by Thomas A. Maufer, Sameer Nanda, and Paul J. Sidenblad, filed Jun. 13, 2002, application Ser. No. 10/172,352, which is incorporated by reference as though fully set forth herein.
Bridging and routing module 1032 includes multicast expansioning. After a lookup in CT 600 or NT 700, a routing table lookup from memory 1052 is done for an Address Resolution Protocol (“ARP”) table 702 to convert an IP address for a packet into a physical address. Moreover, if more than one output MAC address is specified, then multicast expansioning is done. Notably, at this point a packet may be output for use by a host computer user. Routing from address translator 1030 for a packet may be for sending such a packet.
In addition to NAT, firewalling may be done with NAT output, firewall screening and flow classification module 1033, namely, review header fields, classify packets in lookup tables in cache memory, mark CFH with per-MAC output first-in first-out (“FIFO”) index, new priority, and a new ToS, among other previously described events.
Packets processed on an outbound side of sequence processor 1020 may be processed through one or more of fragment module 1027, security module 1023B and encapsulation module 1028. One or more packets are provided as multiple frames for each packet from sequence processor 1020 to MAC interface 1010 as frame output 1012. MAC interface 1010 writes a frame from sequence processor 1020 to one or more staging buffers in cache memory 1013. If MAC interface 1010 does not have priority to do such writing to cache memory 1013 due to flow scheduling, such frame is spilled over to memory 1052. Frame output 1012, once scheduled, is output to output MAC layer or host bus 1098.
NPU 1070 may form a portion of an intelligent network interface (sometimes referred to as a “network interface card” or “NIC”), and thus NPU 1070 may be used to do computationally intensive network stack operations rather than using a host CPU. This frees up a host CPU for other activities. Additionally, a privileged and command engine 1053 may be included with FE 1050 and coupled to a host via an input/output (“I/O”) interface 1099 for direct access to and from NPU 1070 by a host system. Other details regarding NPU 1070 may be found in the co-pending patent application entitled “METHOD AND APPARATUS FOR PERFORMING NETWORK PROCESSING FUNCTIONS” by Robert A. Alfieri, Gary D. Hicok, Paul J. Sidenblad, filed Dec. 13, 2002, application Ser. No. 10/319,791, assigned to the same assignee as this patent application, which is incorporated by reference as though fully set forth herein.
Notably, memory 1013 may be coupled to frame input 1011 for buffering packets for a respective connection. For example, in Voice-Over IP (“VOIP”), UDP is used to send many packets at a time. VOIP is a low latency application where packets are order specific. Accordingly, memory 1013 can buffer overflow packets, and increment counter 1043 via count signal 1044. As packets are processed out of memory 1013, count signal 1044 is used to decrement counter 1043. When counter 1043 is down to zero, as indicated by total signal 1045, then all packets in memory 1013 for a connection have been sent out of memory 1013. Notably, multiple counts may be maintained for supporting multiple connections.
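The counting scheme can be illustrated with a small software model that keeps one count per connection, increments it when a packet is buffered, and decrements it when a packet is released in order. This is only a sketch of the bookkeeping; the class name `BurstBuffer` and its methods are assumptions, and the described counter 1043 and signals 1044 and 1045 are hardware elements.

```python
from collections import defaultdict, deque


class BurstBuffer:
    """Per-connection FIFO buffers with a count per connection."""

    def __init__(self):
        self.queues = defaultdict(deque)   # connection -> buffered packets
        self.counts = defaultdict(int)     # connection -> outstanding count

    def buffer(self, conn, packet):
        self.queues[conn].append(packet)
        self.counts[conn] += 1             # analogous to a count-up signal

    def release(self, conn):
        packet = self.queues[conn].popleft()   # oldest first keeps order
        self.counts[conn] -= 1             # analogous to a count-down signal
        return packet

    def drained(self, conn):
        # Analogous to a total signal indicating the count reached zero.
        return self.counts[conn] == 0


buf = BurstBuffer()
for seq in (1, 2, 3):
    buf.buffer("voip", seq)
released = [buf.release("voip") for _ in range(3)]
```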
Referring to FIG. 12B, there is shown a flow diagram of an exemplary embodiment of a packet processing flow 1080 for processing bursts of packets. A VOIP session may generate an exemplary burst of UDP packets. With continued reference to FIG. 12B and renewed reference to FIGS. 6 and 12A, processing of packets when bursts of packets are received is further described.
Packets 1060 are serially received at 1061 to an NPU, such as NPU 1070. At 1061, packets 1060 are buffered into memory 1013 and a counter 1043 is incremented for each packet buffered. At 1062, each received packet is checked for an entry in CT 600.
If at 1063 it is determined that no entry in CT 600 exists for a packet, then such a packet is sent to NPUsoft at 1064. Notably, a CT index may be obtained from an xCFH or CFH for this lookup. At 1065, a packet to be processed with NPUsoft is buffered, and NPUsoft builds a CT entry for such a packet. Notably, though separate buffers are described for a software portion 1082 and a hardware portion 1081, a single buffer may be used for both. Notably, a first packet, for example for a VOIP connection, may be used to build such a CT entry, and subsequent packets for such VOIP connection would therefore not need to have another CT entry built. If, however, at 1063 it is determined that an entry for such a packet is in CT 600, then at 1073 it is determined if such a CT entry has a ready status flag set. If at 1073 it is determined that a ready status flag is not set, then such a packet is sent to NPUsoft at 1079.
Suppose that packets 1 through N, for N a positive integer, are buffered at 1074. Notably, if a ready status flag is not in place for subsequently received packets N+1, and so on, such packets are sent to NPUsoft for processing, until all packets buffered at 1074 have been cleared, as described below in additional detail.
A first packet of the sequence is obtained for processing at 1066, followed by a second packet of the sequence, and so on and so forth. This is because UDP packets, such as VOIP packets, may need to be played back in sequence. At 1067, a processed packet is sent to an NPU at 1072. Notably, NPUsoft may fully process a packet or leave some portion of packet processing for an NPU. However, in this embodiment, the NPU processes the packet in its entirety. If the NPUsoft submitted a packet to hardware, a hash of the packet's five-tuple would lead to a CT entry that was marked as “not ready,” and such a packet would come right back to the software. Accordingly, the NPUsoft completely processes each of these packets and sends them out marked such that they bypass the NPU. In other embodiments, the NPUsoft may be able to process packets sufficiently to create sufficient CT or NT state so that such processed packets may then be re-submitted to the NPU to complete the processing. At 1068, NPUsoft checks for another packet in the buffer to process. If there is another packet to process, then at 1069 such other packet is obtained from buffer memory for processing.
If there are no more packets to process at 1068, then a ready status flag is set at 1071 for an associated CT entry, such as for a VOIP connection. Accordingly, subsequently received packets will have a CT entry at 1063 and a ready status flag set at 1073, and thus such packets will be processed by NPU at 1075.
After a packet is processed at 1075, it is forwarded along from NPU at 1072 as a processed packet 1076. As each packet is forwarded, such packet is removed from buffer memory 1013 and counter 1043 is decremented. Once all packets sent for processing by NPUsoft are processed, counter 1043 is zeroed as indicated by total count signal 1045. Thus, NPU 1070 will know when all packets, such as for such a VOIP connection, sent to NPUsoft have been completely processed, and will know when all packets in buffer memory 1013 have been processed.
Thus, it should be understood that a state is created in software for hardware to process packets. However, this state is not activated for use until all packets sent to software have been processed out of software. Once all such packets have been processed out of software, hardware may be used for real-time traffic. Though entries for CT 600 may pass from NPUsoft to NPU 1070 for writing to CT 600, tables may be created in software and maintained by software.
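The ordering guarantee provided by the ready status flag can be sketched in software as follows. Packets for a connection go to a software queue until that queue has been drained, and only then is the entry marked ready so that later packets take the hardware path. The dictionary `ct`, the queue `soft_queue`, and the functions `receive` and `software_drain` are assumptions for illustration and do not model the actual tables or buffers.

```python
from collections import deque

ct = {}               # connection key -> {"ready": bool}; stand-in for a CT
soft_queue = deque()  # packets awaiting software processing, in arrival order
out = []              # (path, packet) pairs in forwarding order


def receive(conn, packet):
    entry = ct.get(conn)
    if entry is None:
        # No entry: software builds one, initially marked not ready.
        ct[conn] = {"ready": False}
        soft_queue.append((conn, packet))
    elif not entry["ready"]:
        # Entry exists but is not ready: keep sending to software.
        soft_queue.append((conn, packet))
    else:
        out.append(("hw", packet))


def software_drain():
    """Process every queued packet in order, then set the ready flags."""
    drained = set()
    while soft_queue:
        conn, packet = soft_queue.popleft()
        out.append(("sw", packet))
        drained.add(conn)
    for conn in drained:
        ct[conn]["ready"] = True


receive("voip", 1)
receive("voip", 2)    # arrives before the entry is ready
software_drain()
receive("voip", 3)    # now takes the hardware path
```

Because the flag is set only after the queue is empty, packet 3 cannot overtake packets 1 and 2, which is the property that matters for order-sensitive traffic.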
Referring to FIG. 13, there is shown a block diagram of an exemplary embodiment of a computer system 1000 having an NPU 1070. Computer system 1000 comprises CPU 1001, system memory 1003, a variety of support circuits 1006, I/O interface 1002, and media communications processor (“MCP”) 1004, all of which are coupled via a plurality of buses. MCP 1004 includes NPU 1070. MCP 1004 may be coupled for I/O from/to a network 1005. CPU 1001 may be any type of microprocessor known in the art. Support circuits 1006 for computer system 1000 include conventional cache, power supplies, clock circuits, data registers, I/O interfaces, and the like. Memory 1003 may be directly coupled to CPU 1001 or coupled through I/O interface 1002, and I/O interface 1002 may be coupled to a conventional keyboard, network, mouse, display, printer, and interface circuitry adapted to receive and transmit data, such as data files and the like.
Memory 1003 may store all or portions of one or more programs or data to implement processes in accordance with one or more aspects of the invention, including a network driver program 1007 having at least a portion of address translation flow 100. Network driver program 1007 may include NPUsoft programming. Additionally, those skilled in the art will appreciate that one or more aspects of the invention may be implemented in hardware, software, or a combination of hardware and software. Such implementations may include a number of processors independently executing various programs and dedicated hardware, such as application specific integrated circuits (“ASICs”).
Computer system 1000 may be programmed with an operating system, which may be OS/2, Java Virtual Machine, Linux, Solaris, Unix, Windows, Windows95, Windows98, Windows NT, Windows2000, WindowsME, and WindowsXP, among other known platforms. At least a portion of an operating system may be disposed in memory 1003. Memory 1003 may include one or more of the following: random access memory, read only memory, magneto-resistive read/write memory, optical read/write memory, cache memory, magnetic read/write memory, and the like, as well as signal-bearing media as described below.
One or more aspects of the invention are implemented as program products for use with computer system 1000. Program(s) of the program product defines functions of embodiments in accordance with one or more aspects of the invention and can be contained on a variety of signal-bearing media, such as computer-readable media having code, which include, but are not limited to: (i) information permanently stored on non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM or DVD-RAM disks readable by a CD-ROM drive or a DVD drive); (ii) alterable information stored on writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive or read/writable CD or read/writable DVD); or (iii) information conveyed to a computer by a communications medium, such as through a computer or telephone network, including wireless communications. The latter embodiment specifically includes information downloaded from the Internet and other networks. Such signal-bearing media, when carrying computer-readable instructions that direct functions of one or more aspects of the invention, represent embodiments of the invention.
Referring to FIG. 14, there is shown a block diagram of an exemplary embodiment of a network 1005. Network 1005 includes computer system 1000 coupled to local network nodes via LAN 1102 and coupled to remote network node 1104 via WAN 1103. It should be appreciated that an NPU configured with address translation as described above may be used as a gateway, having firewalling or NAT'ing, to nodes on a LAN 1102. Additionally, computer system 1000 may be a remote node 1104 with stand-alone firewalling or NAT'ing.
FIGS. 15A and 15B are block diagrams depicting exemplary embodiments of respective tables indexed by hash function output values 1110A and 1110B. In tables indexed by hash function output values 1110A, a first connection 1111-1 is established, and a hash value of a five-tuple for connection 1111-1 hashes to slot 1 in such tables indexed by hash function output values 1110A. Suppose slot 1 is empty, so slot 1 is used for connection 1111-1. Then, suppose a second connection 1111-2 is established, and a hash value of a five-tuple for connection 1111-2 also hashes to slot 1. Notably, hash chains are used to account for instances where multiple n-tuples, such as five-tuples, hash to the same value. As slot 1 is already occupied with connection 1111-1, a collision occurs. Accordingly, an empty location in tables indexed by hash function output values 1110A is found for connection 1111-2, which in this exemplary embodiment is slot 2. So, slot 2 is used for connection 1111-2. Now, suppose a third connection 1111-3 is established which hashes to slot 2. Accordingly, an empty location in tables indexed by hash function output values 1110A is found for connection 1111-3, which in this exemplary embodiment is slot 3. However, this creates a chain of length three. In other words, to get to connection 1111-3, a chain from connection 1111-1 to 1111-2 to 1111-3 is used, where at each link the packet's five-tuple is compared with the five-tuple stored in that table entry matching the hash value that was computed using the five-tuple as input. Notably, though this is a hash chain of length three, it actually is an intermingling of two hash chains; namely, a chain pointing to slot 1 with slot 1 pointing to slot 2, and a chain with slot 2 pointing to slot 3.
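The intermingling described for FIG. 15A can be reproduced with a small model in which a colliding entry is placed in a free slot and linked from the tail of the chain that starts at its hashed slot. How the free slot is found is not stated in the text; the forward scan used here, the explicit hash values passed by the caller, and the class name `ChainedTable` are assumptions for illustration.

```python
class ChainedTable:
    """Slots hold a key plus a link to the next slot in the chain."""

    def __init__(self, size=8):
        self.keys = [None] * size
        self.next = [None] * size

    def _free_slot(self, start):
        # Assumed policy: scan forward from `start`, wrapping around.
        n = len(self.keys)
        for i in range(1, n):
            slot = (start + i) % n
            if self.keys[slot] is None:
                return slot
        raise RuntimeError("table full")

    def insert(self, key, h):
        if self.keys[h] is None:
            self.keys[h] = key
            return h
        tail = h
        while self.next[tail] is not None:   # walk to the end of the chain
            tail = self.next[tail]
        slot = self._free_slot(tail)
        self.keys[slot] = key
        self.next[tail] = slot               # link new entry from the tail
        return slot

    def lookup(self, key, h):
        """Return (slot, entries compared); slot is None when not found."""
        slot, compared = h, 0
        while slot is not None and self.keys[slot] is not None:
            compared += 1
            if self.keys[slot] == key:
                return slot, compared
            slot = self.next[slot]
        return None, compared


table = ChainedTable()
table.insert("conn-1", 1)   # hashes to slot 1, which is empty
table.insert("conn-2", 1)   # collides in slot 1, placed in slot 2
table.insert("conn-3", 2)   # hashes to occupied slot 2, placed in slot 3
```

After these inserts slot 1 links to slot 2 and slot 2 links to slot 3, so a lookup that starts at slot 1 may compare against three entries even though only two of them hashed to slot 1.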
An alternative embodiment, tables indexed by hash function output values 1110B, is depicted in FIG. 15B. So, if a connection 1111-3 is established and hashes to slot 2, and slot 2 is found to be occupied, an empty slot in tables indexed by hash function output values 1110B is found, which in this exemplary embodiment is slot 3. Instead of putting connection 1111-3 into slot 3 and pointing slot 2 to slot 3, as done in tables indexed by hash function output values 1110A, contents of slot 2 are moved to slot 3. Moving contents of slot 2 to slot 3 empties slot 2 for connection 1111-3, as shown. As connection 1111-3 hashed to slot 2, no other slot points to slot 2, and slot 1 now points to slot 3. Thus, there are two chains, namely, one of length two and one of length one. This reduces the length of a hash chain over that shown in FIG. 15A. Other advantages include improved performance with respect to the length of hash chain that has to be followed prior to arriving at a target connection, and reduced or eliminated intermingling of hash chains. With respect to the last advantage, only one chain is needed to get to any of connections 1111 in tables indexed by hash function output values 1110B, namely, a chain pointing to slot 1 with slot 1 pointing to slot 3, or a chain pointing to slot 2.
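By way of illustration only, the relocation behavior described for tables indexed by hash function output values 1110B may be sketched as follows. The names, the table size, and the forward linear scan for an empty slot are assumptions of this sketch and are not taken from the description. The sketch assumes that an occupant is relocated only when it did not itself hash to the contested slot, which is the case shown in FIG. 15B.

```python
# Illustrative sketch only; all names here are hypothetical.
from typing import Callable, List, Optional, Tuple

FiveTuple = Tuple[str, int, str, int, int]  # src addr, src port, dst addr, dst port, protocol


class RelocatingTable:
    """FIG. 15B style: an occupant that did not hash to a slot is moved out
    so that the newly hashed entry can take its own slot."""

    def __init__(self, size: int, hash_fn: Callable[[FiveTuple], int]) -> None:
        self.keys: List[Optional[FiveTuple]] = [None] * size
        self.next: List[Optional[int]] = [None] * size
        self.hash_fn = hash_fn

    def _find_empty(self, start: int) -> int:
        # Assumption: scan forward from the contested slot, wrapping around.
        size = len(self.keys)
        for offset in range(1, size):
            index = (start + offset) % size
            if self.keys[index] is None:
                return index
        raise RuntimeError("table full")

    def insert(self, key: FiveTuple) -> int:
        home = self.hash_fn(key)
        if self.keys[home] is None:
            self.keys[home] = key
            return home
        free = self._find_empty(home)
        occupant_home = self.hash_fn(self.keys[home])
        if occupant_home != home:
            # Occupant belongs to another chain: find the slot pointing at it,
            # move the occupant to the free slot, and repoint that slot.
            prev = occupant_home
            while self.next[prev] != home:
                prev = self.next[prev]
            self.keys[free], self.next[free] = self.keys[home], self.next[home]
            self.next[prev] = free
            self.keys[home], self.next[home] = key, None
            return home
        # Occupant hashed here too: append to the end of this slot's own chain.
        tail = home
        while self.next[tail] is not None:
            tail = self.next[tail]
        self.keys[free] = key
        self.next[tail] = free
        return free

    def chain_length(self, start: int) -> int:
        length, index = 0, start
        while index is not None and self.keys[index] is not None:
            length += 1
            index = self.next[index]
        return length


c1 = ("10.0.0.1", 1024, "192.0.2.1", 80, 6)
c2 = ("10.0.0.2", 1025, "192.0.2.1", 80, 6)
c3 = ("10.0.0.3", 1026, "192.0.2.1", 80, 6)
home_slot = {c1: 1, c2: 1, c3: 2}  # stand-in for a real hash function
table = RelocatingTable(8, lambda key: home_slot[key])
placed = [table.insert(c) for c in (c1, c2, c3)]
```

With the same three connections as in the FIG. 15A example, connection 1111-2 ends up in slot 3, connection 1111-3 takes slot 2, and the two chains have lengths two and one. The trade-off in this sketch is extra work at insertion time, namely a walk to find the predecessor slot and a copy of one entry, in exchange for shorter lookups.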
FIG. 16 is a flow diagram of an exemplary embodiment of a fragment processing flow 1200. At 408 or 504, an associated error condition is identified for processing with software at 1201, as follows. So, if an IP fragment is received, then at 1201, IP information, for example IP packet identification (conventionally a 16-bit packet identifier) and IP source and destination addresses, is obtained from such a fragment. At 1201, a check is made to determine if another IP fragment, based on such IP information obtained, is already stored in buffer space, such as in memory 1013 or 1003. If there is no match of IP information, then at 1202 buffer space is reserved, for example in memory 1013 or 1003. Additionally, a timer is started at 1203 responsive to a first received fragment for a fragmented packet. If there is a match at 1201, then at 1205 a checksum for a received fragment, namely, a checksum for a packet undergoing re-assembly, is obtained and compared against a checksum of another fragment. If the obtained checksum is invalid, then such a fragment is dropped at 1206.
If a checksum for a fragment is valid at 1205, or if the fragment is a first fragment received for a fragmented packet, then at 1204 such a fragment is buffered or otherwise stored, such as in memory 1013 or 1003. Accordingly, if IP information for this fragment matches that of a previously buffered fragment, then this newly received fragment is buffered in association with a buffer stack for a fragmented packet already in process for reassembly. This may be a physical or a logical association in memory for association on a fragmented packet basis. FIG. 17 is a block diagram of an exemplary embodiment of a buffer stack 1230. If, however, IP information for this fragment does not match that of a previously buffered fragment, then such fragment is buffered at 1204 in newly reserved buffer space as reserved at 1202.
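By way of illustration only, the match, reserve, and buffer operations at 1201 through 1206 may be sketched in software as follows. The names are hypothetical, and the checksum comparison at 1205 is reduced to a `checksum_ok` flag supplied by the caller, which is an assumption of this sketch rather than the described comparison against another fragment.

```python
# Illustrative sketch only; names and the checksum_ok flag are hypothetical.
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

StackKey = Tuple[str, str, int]  # IP source address, IP destination address, IP identification


@dataclass
class BufferStack:
    started: float  # time at which the timer of 1203 was started
    fragments: List[bytes] = field(default_factory=list)


stacks: Dict[StackKey, BufferStack] = {}


def buffer_fragment(src: str, dst: str, ip_id: int, payload: bytes,
                    checksum_ok: bool = True,
                    now: Optional[float] = None) -> Optional[BufferStack]:
    """Buffer one fragment; return its buffer stack, or None if it is dropped."""
    now = time.monotonic() if now is None else now
    key = (src, dst, ip_id)
    stack = stacks.get(key)               # match check at 1201
    if stack is None:
        stack = BufferStack(started=now)  # reserve at 1202, start timer at 1203
        stacks[key] = stack
    elif not checksum_ok:                 # check at 1205
        return None                       # drop at 1206
    stack.fragments.append(payload)       # buffer at 1204
    return stack


first = buffer_fragment("10.0.0.1", "10.0.0.2", 42, b"aaa", now=0.0)
second = buffer_fragment("10.0.0.1", "10.0.0.2", 42, b"bbb", now=0.1)
dropped = buffer_fragment("10.0.0.1", "10.0.0.2", 42, b"zzz", checksum_ok=False, now=0.2)
other = buffer_fragment("10.0.0.1", "10.0.0.2", 43, b"ccc", now=0.3)
```

Here the first two fragments share a buffer stack because their IP information matches, the third is dropped, and the fourth, having a different IP identification, causes new buffer space to be reserved.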
At 1207, packet and packet fragment identifiers associated with such a received fragment are obtained therefrom. At 1208, a fragment is sorted according to packet identifier and packet fragment identifier. In other words, buffered fragments are sorted into a bin for packet of origin, and then within that bin such fragments are sorted responsive to fragment number. Notably, a later arriving fragment may have a same fragment number as a previously arrived fragment, and thus the later received fragment overwrites the previously received fragment. Furthermore, fragments may not be received in the order in which they were generated. The numerical association of packet identifier to fragment, and likewise the numerical association of packet fragment identifier to fragment, may be a physical or a logical ordering within memory.
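By way of illustration only, the sorting at 1208 may be sketched as a logical ordering in memory, as follows. The function and variable names are hypothetical, and a nested dictionary is merely one assumed way of forming bins.

```python
# Illustrative sketch only; all names here are hypothetical.
from collections import defaultdict
from typing import DefaultDict, Dict, List

# packet identifier -> {fragment number -> fragment payload}
bins: DefaultDict[int, Dict[int, bytes]] = defaultdict(dict)


def sort_fragment(packet_id: int, fragment_no: int, payload: bytes) -> None:
    """Place a fragment in the bin for its packet of origin; a later fragment
    with the same fragment number overwrites the earlier one."""
    bins[packet_id][fragment_no] = payload


def ordered_fragments(packet_id: int) -> List[bytes]:
    """Return the fragments of one packet ordered by fragment number."""
    packet_bin = bins[packet_id]
    return [packet_bin[number] for number in sorted(packet_bin)]


# Fragments arrive out of order, interleaved across packets, with one duplicate.
arrivals = [(7, 3, b"C"), (7, 1, b"A"), (9, 1, b"X"), (7, 1, b"A2")]
for packet_id, fragment_no, payload in arrivals:
    sort_fragment(packet_id, fragment_no, payload)
```

In this sketch the ordering is computed when the bin is read rather than maintained on every arrival, which keeps the per-fragment work small.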
At 1215, an optional check is made to determine if a threshold communication length for a summation of all packets in a buffer stack has been exceeded. If a communication length threshold has been exceeded, then the buffer stack is cleared at 1213; otherwise, processing continues at 1209.
At 1209, a buffer stack is checked to determine if any fragments for a fragmented packet have as yet not been buffered. For example, in buffer stack 1230, fragment 2 is as yet not buffered. The number of fragments a packet may have is indicated by fragment N, for N a positive integer, and is dependent upon what protocol is being used, such as IPv4 or IPv6. If a fragment is missing, then at 1212 it is determined whether a buffer stack has timed out based on when the timer was started at 1203 for a first fragment for such a buffer stack. If a buffer stack has timed out, then at 1213 the buffer is cleared, meaning all fragments in such buffer are dropped. If, however, a buffer stack has not timed out, then at 1214 a set time interval is used as a wait period before checking again at 1209 as to whether any fragments are still missing. Such a wait period will depend on implementation and availability of memory. Also, the number of fragments received at a destination is dependent upon likelihood of routing through an interface not able to handle full size packets.
If, however, at 1209 no fragments for a fragmented packet are missing from a buffer stack, then at 1210 such fragments are assembled into a single packet, namely, a reassembled packet. At 1211, such a reassembled packet is re-inserted into the above-described process, such as a packet 101 into packet interrogation flow 120 for further processing, including any firewalling. Thus, it should be appreciated that packet fragment assembly is done prior to screening, namely, in front of a firewall.
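By way of illustration only, the decisions at 1215, 1209, 1212, and 1210 may be sketched as a single polling function, as follows. The names are hypothetical. The sketch assumes fragments are numbered 1 through N and that N is already known to the caller; how N is learned depends on the protocol and is not modeled here.

```python
# Illustrative sketch only; all names here are hypothetical.
from typing import Dict, Optional, Tuple


def poll_stack(fragments: Dict[int, bytes], total: int, started: float,
               now: float, timeout: float,
               max_length: Optional[int] = None) -> Tuple[str, Optional[bytes]]:
    """Return ("reassembled", packet), ("wait", None), or ("cleared", None).

    fragments maps fragment number (1..total) to fragment payload.
    """
    # Optional length threshold check (1215); clear the stack if exceeded (1213).
    if max_length is not None and sum(len(p) for p in fragments.values()) > max_length:
        fragments.clear()
        return "cleared", None
    # Check for fragments not yet buffered (1209).
    missing = [n for n in range(1, total + 1) if n not in fragments]
    if not missing:
        packet = b"".join(fragments[n] for n in range(1, total + 1))  # assemble (1210)
        return "reassembled", packet
    # A fragment is missing: check the timer started for the first fragment (1212).
    if now - started > timeout:
        fragments.clear()  # drop all fragments in the buffer (1213)
        return "cleared", None
    return "wait", None    # caller waits a set interval and polls again (1214)


stack = {1: b"AB", 3: b"EF"}          # fragment 2 not yet buffered
waiting = poll_stack(stack, total=3, started=0.0, now=1.0, timeout=30.0)
stack[2] = b"CD"
done = poll_stack(stack, total=3, started=0.0, now=2.0, timeout=30.0)

stale = {1: b"AB"}
expired = poll_stack(stale, total=3, started=0.0, now=31.0, timeout=30.0)
```

A caller receiving "reassembled" would then re-insert the returned packet into the packet interrogation flow, and a caller receiving "wait" would poll again after the wait period.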
Notably, though IP fragment flow has been described in terms of software, it may be instantiated in hardware or both hardware and software. For example, hardware includes combinatorial logic forming a portion of an NPU. Hardware may have a performance advantage over software but at additional cost. Furthermore, while a personal computer environment has been described, a dedicated firewall computer may be used. Additionally, one or more aspects may be employed in a personal data assistant (PDA), a web-enabled phone, and other devices used for Internet communication.
Accordingly, it is worth mentioning that if NAT is used, NAT need be done only once per packet. This is facilitated by having NAT proximal to front end packet processing. Furthermore, it should be appreciated that by doing NAT, an implicit routing table lookup is done.
Additionally, it should be appreciated that if firewalling is used, firewalling need be done only once per packet. This is facilitated by having firewalling proximal to back end packet processing.
While the foregoing describes exemplary embodiment(s) in accordance with one or more aspects of the invention, other and further embodiment(s) in accordance with the one or more aspects of the invention may be devised without departing from the scope thereof, which is determined by the claim(s) that follow and equivalents thereof. For example, it is not necessary to incorporate an NPU as described, as a software embodiment may be used. Furthermore, the NPU architecture described herein is not the only architecture that may be used. Additionally, rather than a personal computer, a firewall computing device may be used. Claim(s) listing steps do not imply any order of the steps. Trademarks are the property of their respective owners.