CROSS-REFERENCE TO RELATED APPLICATIONS The present application is related to the following copending U.S. patent applications:
U.S. patent application, Ser. No. ______ (Attorney Docket No. RPS920050059US1/3485P), entitled “Host Ethernet Adapter for Networking Offload in Server Environment”, filed on even date herewith and assigned to the assignee of the present invention.
U.S. patent application, Ser. No. ______ (Attorney Docket No. RPS920050060US1/3486P), entitled “Method and System for Accommodating Several Ethernet Ports and a Wrap Transmitted Flow Handled by a Simplified Frame-By-Frame Upper Structure”, filed on even date herewith and assigned to the assignee of the present invention.
U.S. patent application, Ser. No. ______ (Attorney Docket No. RPS920050061US1/3487P), entitled “Method and Apparatus for Providing a Network Connection Table”, filed on even date herewith and assigned to the assignee of the present invention.
U.S. patent application, Ser. No. ______ (Attorney Docket No. RPS920050062US1/3488P), entitled “Network Communications for Operating System Partitions”, filed on even date herewith and assigned to the assignee of the present invention.
U.S. patent application, Ser. No. ______ (Attorney Docket No. RPS920050073US1/3502P), entitled “Configurable Ports for a Host Ethernet Adapter”, filed on even date herewith and assigned to the assignee of the present invention.
U.S. patent application, Ser. No. ______ (Attorney Docket No. RPS920050074US1/3503P), entitled “System and Method for Parsing, Filtering, and Computing the Checksum in a Host Ethernet Adapter (HEA)”, filed on even date herewith and assigned to the assignee of the present invention.
U.S. patent application, Ser. No. ______ (Attorney Docket No. RPS920050075US1/3504P), entitled “System and Method for a Method for Reducing Latency in a Host Ethernet Adapter (HEA)”, filed on even date herewith and assigned to the assignee of the present invention.
U.S. patent application, Ser. No. ______ (Attorney Docket No. RPS920050082US1/3512P), entitled “Method and System for Performing a Packet Header Lookup”, filed on even date herewith and assigned to the assignee of the present invention.
U.S. patent application, Ser. No. ______ (Attorney Docket No. RPS920050089US1/3516P), entitled “System and Method for Computing a Blind Checksum in a Host Ethernet Adapter (HEA)”, filed on even date herewith and assigned to the assignee of the present invention.
FIELD OF THE INVENTION The present invention relates to network transmissions for computer devices, and more particularly to error detection using checksums in network transmissions.
BACKGROUND OF THE INVENTION Computer systems communicate over networks by establishing and using network connections. When providing secure network connections, one concern is that data may be lost during network transmission, and various techniques are used to ensure that data is not lost in transit. An additional concern is the risk that errors will be introduced in data from transmission over the network. One technique for detecting network transmission errors in data uses what is known as a checksum. The basic technique of a checksum is to take a string of data bytes (or other unit of storage) and add them together, then send this sum with the data stream and have the receiver check the sum using the same method used to create the sum. If the receiver's calculated sum matches the sum in the data stream, then no errors have been introduced during transmission.
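The basic technique described above can be sketched as follows. This is a minimal illustration only; the function names and the 16-bit width are assumptions chosen for the example and are not taken from any particular protocol.

```python
def simple_checksum(data: bytes, bits: int = 16) -> int:
    """Add all bytes together and keep the low `bits` bits of the sum."""
    return sum(data) & ((1 << bits) - 1)


def send(data: bytes):
    """Sender side: return the data together with its sum."""
    return data, simple_checksum(data)


def receive(data: bytes, received_sum: int) -> bool:
    """Receiver side: recompute the sum with the same method and compare."""
    return simple_checksum(data) == received_sum


payload, checksum = send(b"hello")
intact = receive(payload, checksum)        # same data, same sum
corrupted = receive(b"hellp", checksum)    # one byte changed in transit
```

A matching sum indicates that no error was detected; a plain additive sum cannot detect every error, for example bytes delivered in a different order produce the same sum.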
For example, the Transmission Control Protocol (TCP) provides basic protection against errors in transmission by including a 16-bit Checksum field in the header of each data packet. In TCP, a standard algorithm is used to calculate the checksum which is slightly different than a conventional checksum algorithm. Instead of computing the checksum over only the actual data fields of the TCP segment, a 12-byte TCP pseudo header is created prior to checksum calculation. This header includes information taken from fields in both the TCP header and the IP datagram into which the TCP segment will be encapsulated. The TCP pseudo header includes a source Internet Protocol (IP) address of the originator (taken from the IP header), destination IP address of the intended recipient (taken from the IP header), a reserved field, a protocol field for specifying the protocol used, and a TCP length field specifying the length of the TCP segment including header and data (body) (which is calculated by the originator). The formed pseudo header is placed in a buffer, followed by the TCP segment, and the checksum is computed over this set of data (pseudo header plus TCP segment). The value of the checksum is placed into the Checksum field of the TCP header, and the pseudo header is discarded, since it is not an actual part of the packet and is not transmitted.
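The pseudo header calculation described above can be sketched as follows for IPv4. The helper names are assumptions made for this example; the segment passed in is assumed to have its Checksum field (bytes 16 and 17 of the TCP header) set to zero, and protocol number 6 identifies TCP.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of big-endian halfwords (zero padded)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                      # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum over the 12-byte pseudo header followed by the TCP segment."""
    # source address, destination address, reserved byte, protocol, TCP length
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF
```

The pseudo header exists only inside `tcp_checksum`; it is never part of the bytes that would be transmitted.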
The packet is transmitted over the network, and the receiver performs the same calculation by forming the pseudo header and performing the checksum (ignoring the Checksum value in the header field to replicate the original condition). If there is a mismatch between its calculation and the value in the Checksum field, this indicates that an error of some sort occurred, and the packet can then be discarded or the error noted. The checksum thus protects against errors in the TCP segment fields and against incorrect segment delivery (if there is a mismatch in the Source or Destination Address), incorrect protocol, and incorrect segment length.
TCP checksum generation is often performed in hardware to achieve faster performance. Since the TCP checksum field is in the packet header, which is transmitted before the packet data, true “on-the-fly” checksum generation is not provided. The packet data must be stored in transmission data buffers, and only once the entire packet has been stored can the checksum be generated and placed in the header; only then can transmission of the packet begin.
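The receiver-side check can be sketched in the same way. The function names are assumptions for this example, IPv4 addresses are assumed, and the Checksum field is assumed to sit at bytes 16 and 17 of the TCP header.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of big-endian halfwords (zero padded)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def verify_tcp_segment(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
    """Recompute the checksum as the sender did and compare with the field."""
    received = int.from_bytes(segment[16:18], "big")
    # Ignore the transmitted Checksum value to replicate the sender's condition.
    zeroed = segment[:16] + b"\x00\x00" + segment[18:]
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    expected = ~ones_complement_sum(pseudo + zeroed) & 0xFFFF
    return received == expected
```

Because the source and destination addresses enter the sum through the pseudo header, a segment delivered to the wrong address fails this check even if its bytes are intact.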
One problem with this packet transmission method is that it requires random access to the data buffers storing the packet, so that the checksum field can be accessed and written to with the determined checksum value. This random access capability to a large output buffer adds expense to the system and is slow and sequential, adding latency to the system. Furthermore, the determination of the checksum value can add processing time to the transmission process, since the checksum value is determined only after the packet is fully stored in the data buffer.
In addition, portions of the packets to be transmitted are often stored in different configurations. For example, the header of a packet might be stored in an area of memory easily retrieved with a descriptor, but the body of the packet may be stored elsewhere in memory. Or the header may be stored with the body. However, there are no existing methods to efficiently combine parts of a packet and handle the different packet storage configurations for checksum determination when transmitting.
Accordingly, what is needed is an apparatus and method for providing a network transmission mechanism that can efficiently process checksums for outgoing packets with minimal access to storage buffers, much-reduced latency, and efficient handling of packet storage configurations. The present invention addresses such a need.
SUMMARY OF THE INVENTION The invention of the present application relates to providing a checksum in a network transmission. In one aspect of the invention, a method for determining a checksum for a packet to be transmitted on a network includes retrieving packet information from a storage device, the packet information to be included in the packet to be transmitted. A blind checksum value is determined based on the retrieved packet information, and the blind checksum value is adjusted to a protocol checksum based on descriptor information describing the structure of the packet. The protocol checksum is inserted in the packet before the packet is transmitted.
In another aspect of the invention, an apparatus for determining a checksum for a packet to be transmitted on a network includes a memory access unit that retrieves packet information from a storage device, the packet information to be included in the packet to be transmitted. An accumulator determines a blind checksum value based on the retrieved packet information, and a transmission unit adjusts the blind checksum value to a protocol checksum based on descriptor information describing the structure of the packet. The transmission unit inserts the protocol checksum in the packet and outputs the packet for transmission on the network.
The present invention provides a method and apparatus that allows efficient checksum determination and transmission of packets having a checksum by determining a blind checksum, if necessary, and adjusting that blind checksum on-the-fly and without random accesses to data buffers. The invention allows low-latency checksum determination and packet transmission and flexibility in handling multiple packet information storage configurations.
BRIEF DESCRIPTION OF THE FIGURES FIG. 1 is a block diagram of an example of a system suitable for use with the present invention;
FIGS. 2a, 2b, and 2c are block diagrams illustrating different packet transmission systems that process packets and output them on a connection line to a network;
FIG. 3 is a block diagram illustrating a packet transmission system of the present invention which can handle the different situations illustrated in FIGS. 2a-2c;
FIG. 4 is a flow diagram illustrating a method of the present invention for providing a checksum for a packet to be transmitted over a network; and
FIGS. 5a and 5b are flow diagrams illustrating a method of the present invention for adjusting and determining a checksum value.
DETAILED DESCRIPTION The present invention relates to network transmissions for computer devices, and more particularly to error detection using checksums in network transmissions. The following description is presented to enable one of ordinary skill in the art to make and use the invention and is provided in the context of a patent application and its requirements. Various modifications to the preferred embodiment and the generic principles and features described herein will be readily apparent to those skilled in the art. Thus, the present invention is not intended to be limited to the embodiment shown but is to be accorded the widest scope consistent with the principles and features described herein.
The present invention is mainly described in terms of systems provided in particular implementations. However, one of ordinary skill in the art will readily recognize that this method and system will operate effectively in other implementations. For example, the system architectures and network configurations usable with the present invention can take a number of different forms. The present invention will also be described in the context of particular methods having certain steps. However, the method and system operate effectively for other methods having different and/or additional steps not inconsistent with the present invention.
To more particularly describe the features of the present invention, please refer to FIGS. 1 through 5b in conjunction with the discussion below. The present invention is described in the context of a TCP/IP protocol network system; however, other protocols (such as User Datagram Protocol (UDP), etc.) and configurations can be used in other embodiments.
FIG. 1 is a block diagram of a system 10 suitable for use with the present invention. System 10 is a computer system such as a server, mainframe, desktop client computer, workstation, or other computer or electronic device. System 10 can communicate with various other computer systems 12 over a network 14. System 10 can run one or more applications 16, which are processes running on the system. For example, an application 16 can provide data that is to be sent out over the network to one or more computer systems 12.
System 10 includes one or more TCP/IP stacks 20 to manage network communications to and from the system 10. The stack 20 is process code in an operating system or user software space running on the server which handles the incoming and outgoing data to and from the network connections 14. The TCP/IP stack can establish a new network connection to a server application 16, provide packets to an existing connection, etc. Each application 16 may have one or more connections. For example, the stack 20 (and/or other components) can provide data from application 16 as TCP/IP network packets to destination ports and addresses over the network 14. The stack 20 also receives network packets provided by other computer devices 12 on the network and provides the packets to the application 16 via network connections. The TCP/IP stack 20 can access main storage 22 of the system 10 to store packets intended to be sent out on the network 14. Storage 22 can be any suitable type of memory provided on a computer system, as is well known.
One or more device drivers 18 run on the system 10 and can interface with the main storage 22 as well as the network adapter 24. The device driver 18 can, for example, write a packet descriptor into the main storage 22 with some parameters given by the TCP/IP stack 20 or even by an application 16. The packet descriptor (“descriptor”) includes characteristics for a packet to be transmitted and instructions as to what actions to take for the packet, which may be stored elsewhere in main storage 22, such as in buffers, and a pointer or address of where portions of the packet are stored in main storage 22, if applicable.
A network adapter 24 is used to provide the physical connection point between the system 10 and other computer systems 12 connected to the network 14. Adapter 24 can be, for example, a hardware network interface adapter. The adapter can take packets provided in main storage 22 and process them for transmission on the network, and send out the packets, as well as receive packets from the network 14. For example, in the present invention, the adapter 24 includes processing components so that the adapter can determine a TCP checksum that is added to each TCP/IP data packet before it is output on the network 14. This is detailed below with respect to FIGS. 2a-c and FIG. 3.
A checksum value is added to the header of each TCP packet using a particular method so that a receiver of the packet can use the same checksum method to generate its own checksum value and compare with the checksum value in the packet to determine if the packet has any transmission errors. A “packet”, as referenced herein, is an entire TCP/IP packet, including an Ethernet header (including a Media Access Control (MAC) address), followed by an Internet Protocol (IP) header, followed by a TCP header, and followed by a body (data). The TCP segment is the TCP header and the body. The terms apply analogously to other protocols used in other embodiments, such as UDP.
A packet for transmission may be split such that different parts of the packet are stored in different areas of main storage 22 by the stack 20. Three examples that show how a packet intended for transmission may be distributed in main storage 22, and that show different systems for an adapter 24 for processing those packet distributions, are described below with reference to FIGS. 2a, 2b, and 2c.
FIG. 2a is a block diagram illustrating a packet transmission system 50 including main storage 22 and a hardware adapter 52 that processes packets and outputs them on a connection line 56. In the example of FIG. 2a, a descriptor 58 has been written into the main storage 22 by a device driver or other program running on the system 10 which governs the sending out of a packet (e.g., with some parameters given by the TCP/IP stack 20 or even by an application 16). The descriptor 58 includes instructions as to what actions to take for an associated packet stored elsewhere in main storage 22, and a pointer or address of where the body of the packet is stored in main storage 22.
Header 60 is the header for the packet associated with the descriptor 58. The header includes the standard TCP and IP header information, as well as other header information such as Ethernet MAC address, if applicable. The header was placed in the main buffer of main storage 22 by the TCP/IP stack 20, while the Ethernet header, if present, was placed by an Ethernet device driver. Body 62 is stored in the buffer in the same partition of main storage 22 as the header 60 and is the data for the packet associated with the descriptor 58. Body 62 was placed in main storage by an application 16 running on the system or was copied there by the TCP/IP stack 20.
When transmitting the packet, a Tx processor 64 on the adapter 52 retrieves the descriptor 58 from an area of main storage 22 (a “descriptor area” which is the area where the descriptor is stored). The Tx processor 64 then instructs a Direct Memory Access (DMA) unit 66 to retrieve the packet header 60 and body 62 from the buffer of main storage 22 at an address indicated by the retrieved descriptor 58. The DMA unit 66 provides the retrieved packet header and body to the data buffers 68 of the adapter 52. Tx processor 64 retrieves the header and body from data buffers 68 using random access, e.g., one byte at a time, and determines a checksum. The checksum is calculated by creating a TCP pseudo header that includes parts of the IP header within the header 60 (a source IP address, destination IP address, and protocol), and a TCP segment length (calculated by processor 64 or received as information in the descriptor 58). The checksum is calculated based on the TCP segment and pseudo header. The Tx processor then writes the checksum value back to the data buffers 68 in the TCP checksum field of the header 60. The header and body are then output from the data buffers 68 on the network connection line 56 to a destination on the network.
FIG. 2b is a block diagram illustrating a packet transmission system 100 including main storage 22 and a hardware adapter 104 that processes packets and outputs them on a connection line 106. In the example of FIG. 2b, a descriptor 108 has been written into a descriptor area of main storage 22 by a device driver or other program running on the system 10. Similar to the example of FIG. 2a, the descriptor 108 includes instructions as to what actions to take for an associated packet stored elsewhere in main storage 22, and a pointer or address of where the body of the packet is stored in main storage 22.
In this example, the header 110 of the packet associated with the descriptor 108 is stored in the same descriptor area and block of main storage 22 as the descriptor 108. For example, the header 110 can directly follow the descriptor 108 in memory. This storage scheme may have occurred due to particular circumstances and efficiencies in the system 10. For example, the protocol stack 20 and device driver may coordinate to store the descriptor and header in the same block of main storage, for efficiency. Or, there may be advantages for keeping application data (the body) separate from the header, e.g., in the case that the application wishes to eliminate the copy of data by the TCP/IP stack between application buffers and operating system kernel buffers (a.k.a. a “zero-copy”). The body 112 of the packet is stored in the buffer of main storage 22 at some location and memory block different than descriptor 108 and header 110.
When transmitting the packet, a Tx processor 114 on the adapter 104 retrieves the descriptor 108 and header 110 from main storage 22. The Tx processor 114 then instructs a DMA unit 116 to retrieve the packet body 112 from main storage 22 at an address indicated by the descriptor 108. The DMA unit 116 provides the body 112 to the data buffers 118 of the adapter 104. The Tx processor 114 retrieves the body 112 from the data buffers 118 using random access, e.g., one byte at a time, and determines a checksum. As in the example of FIG. 2a, the checksum is calculated over the TCP segment of the packet and a created pseudo header. The Tx processor stores the checksum value in the header 110 and the header 110 is sent to a merge block 120 from Tx processor 114. The data buffers 118 are instructed to send the body 112 to merge block 120 after the header 110, so that the body is merged with the header and the full packet is output on connection line 106. This system 100 requires less access and writing to the data buffers 118 than the system 50 of FIG. 2a and thus can be more efficient (and a merge unit is typically more efficient than writing to data buffers 118).
FIG. 2c is a block diagram illustrating a packet transmission system 150 including main storage 22 and a hardware adapter 154 that processes data packets and outputs them on a connection line 156. In this system, the descriptor 158, the associated packet header 160, and the associated packet body 162 are stored in the descriptor area of main storage 22, e.g., contiguously. This storage configuration is made possible if the body 162 is small enough in size so that it does not need to be stored and processed using larger data buffers 168. When transmitting the packet, the descriptor 158, header 160, and body 162 are retrieved by the Tx processor 164 of the adapter 154, which has all the information it needs to calculate a checksum as explained above, add the checksum to the appropriate field of the header 160, and output the header 160 and body 162 as a packet on connection line 156. This example and system do not require use of a DMA unit 166 or data buffers 168 on the adapter 154, and do not require address translation to read the packet information from the buffer of main storage 22, and thus are the most efficient of the examples of FIGS. 2a-2c.
FIG. 3 is a block diagram illustrating a packet transmission system 200 of the present invention. Advantages of system 200 include its ability to handle any of the three cases illustrated in FIGS. 2a-c and its on-the-fly checksum accumulation and checksum adjustment, which provide faster, more efficient packet transmission. Furthermore, it requires no random access to the data buffers and thus no additional logic and latency for that function.
System 200 includes main storage 22 and a network hardware adapter 204, which includes components that process packets to be sent out and output the packets on a connection line 206 of the network.
Main storage 22 stores the parts of the packet that are to be transmitted. The possible storage locations for parts of the packet are all illustrated in FIG. 3, to indicate that all of these cases can be handled by system 200. Thus, a descriptor 208 is stored in main storage 22, by a device driver or other program running on the system 10 which is sending out a packet, in a descriptor area. In some cases, the header 210 of the packet is stored in the same descriptor area and block of memory as the descriptor 208, as described above with reference to FIG. 2b. And in some cases, the body 212 (if it is small enough) is stored in the same descriptor area and block of memory with the descriptor 208 and header 210, as described above with reference to FIG. 2c (also, the IP header portion of the header 210 could be stored with the descriptor 208 and the TCP header portion of header 210 stored with the body 212). In other cases, the header 210 and the body 212, or just the body 212, are stored in a buffer area of main storage 22 different than the area storing descriptor 208, as described above with reference to FIGS. 2a and 2b.
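The storage cases above can be modeled with a small data structure. This is a sketch only: the class, its field names, and the classification function are hypothetical and are not the descriptor format of the described embodiment.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Descriptor:
    """Hypothetical model of a descriptor and what is stored alongside it."""
    immediate: bytes = b""              # packet information stored in the descriptor area
    header_in_immediate: bool = False   # True when the header sits with the descriptor
    # (address, length) of each region of packet information in the buffer area
    buffer_regions: List[Tuple[int, int]] = field(default_factory=list)


def storage_case(d: Descriptor) -> str:
    """Name the figure whose storage layout the descriptor corresponds to."""
    if not d.buffer_regions:
        return "FIG. 2c"    # header and body both stored with the descriptor
    if d.header_in_immediate:
        return "FIG. 2b"    # header with the descriptor, body in the buffer area
    return "FIG. 2a"        # header and body both in the buffer area
```

In this model, `len(d.immediate)` is the number of packet bytes that never pass through a DMA retrieval.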
The processing system 200 on adapter 204 includes a Tx processor 216 that controls the packet transmission functions of the system 200. Tx processor 216 retrieves descriptor 208 and other packet information (if present) from main storage 22, and sends appropriate instructions and data to the components of the system 200 to control their operation, including a DMA unit 218, data buffers 220, a merge unit 222, and an XCS unit 224. The term “packet information,” as used herein, refers to a packet header 210 and/or body 212.
DMA unit 218 retrieves the header 210 and body 212 of the packet, if stored in a different area of main storage than the descriptor 208. In the present invention, an accumulator 219 has been added to the DMA unit 218 to perform a blind checksum for received data on-the-fly (as it is being retrieved), and provides the blind checksum value to Tx processor 216, which sends the blind checksum value to the XCS unit 224. The merge unit 222 merges any packet information provided by the Tx processor (such as header 210 in some cases) with any packet information received from the data buffers 220. The data buffers 220 are large enough to hold the maximum size of packet usable with the system 200 and computer system 10.
Transmit Checksum (XCS) unit 224 of the present invention is provided after merge unit 222 and receives the packet information merged by that unit. The XCS unit performs an adjustment and correction of the blind checksum value provided by the Tx processor 216 to create a standard TCP checksum value. The XCS unit is given other information from the Tx processor 216 to facilitate the adjustment process. The XCS unit can adjust the blind checksum as the packet header is received, and outputs the packet on line 206 and out to the network 14 on-the-fly after the adjustment of the blind checksum is performed based on the header of the packet.
It should be noted that embodiments of the system 200 of the present invention can perform multiple DMA actions by DMA unit 218. For example, there can be “gather” descriptors in descriptor 208, which are different DMA instructions, each pointing to different areas of memory. This can cause several DMA operations to happen sequentially by the DMA unit 218. This can be appropriate, for example, in a case in which packet information for a packet, such as the body, is split across many different non-contiguous areas of main storage 22; each DMA action can retrieve packet information from a different area of main storage.
In some embodiments, there can be multiple Tx processors 216 and DMA units 218 working in parallel to speed the processing of packets.
FIG. 4 is a flow diagram illustrating a method 250 of the present invention for providing a checksum for a packet to be transmitted over a network. This process is implemented preferably by the network adapter 24 in hardware.
The process begins at 252, and in step 254, a descriptor 208 is retrieved from the descriptor area of main storage 22 by the Tx processor 216. Any packet information stored in the descriptor area which may be stored with the descriptor (in appropriate cases) is also retrieved, such as header 210, or header 210 and body 212, of the packet. In step 256, the process checks whether there is packet information in the main buffer of main storage 22 to be loaded. In some cases, the entire body 212 and header 210 were loaded to the Tx processor 216 with the descriptor 208 in step 254, and thus no packet information need be loaded from the main buffer. In some other cases, the body 212 and/or the header 210 are stored in the main buffer. In yet other cases, part of the body 212 is stored with the descriptor 208, and part of the body 212 is stored in the main buffer.
If no packet information need be loaded from the main buffer, then the process continues to step 258, in which the packet information and other information needed to determine the TCP checksum (explained in greater detail with respect to step 270) is provided to the XCS unit 224 from the Tx processor 216. The packet information can be sent via the merge unit 222, without any actual merging occurring, and the other information can be directly provided to the XCS unit, or in other embodiments the other information can be sent via the merge unit 222 with the packet. In next step 260, the XCS unit calculates a TCP checksum from the received information. The checksum is calculated using a pseudo header based on the information in the IP header and from the descriptor, and the checksum is calculated based on the TCP segment and pseudo header. An example method for calculating the checksum is described below with reference to FIGS. 5a and 5b. It should be noted that in this case, the checksum is being created rather than adjusted/corrected, since no blind checksum was previously determined. The process then continues to step 272, described below.
If the packet body was not retrieved with descriptor 208 as checked in step 256, then the process continues to step 262, in which the DMA unit 218 is instructed by the Tx processor 216 to load the packet information from the buffer of main storage 22 at an address indicated by address or pointer information in the retrieved descriptor 208, and a blind checksum is calculated. In some cases, the header 210 and the body 212 are retrieved as packet information from main storage 22 in this step, while in other cases, just the body 212 is retrieved.
The blind checksum of step 262 is calculated by an accumulator 219 which can be included in the DMA unit 218. It is a “blind” checksum in the sense that the accumulator does not follow established rules to create a TCP checksum, i.e., with a pseudo header; it instead accumulates a sum simply based on the values of successive normally aligned halfwords (2-byte strings of bits) of packet information retrieved from the buffer of main storage 22 (storage units other than halfwords can be processed in other embodiments). All the halfwords transferred from main storage 22 are included, from the start of the packet up to the end of the IP payload, i.e., including the Ethernet header, the IP header, the TCP header, and the body of the packet that are included as the payload of the surrounding IP packet information. Not included are the Ethernet padding bytes (if any) or the 4-byte Ethernet Cyclical Redundancy Check (CRC), a.k.a. Frame Check Sequence (FCS), at the end. If the packet information starts or ends on an odd boundary in memory, as indicated by the addresses where it is stored, then the packet information is padded with a zero appropriately to allow the packet to properly align on an even boundary.
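A blind accumulator of this kind can be sketched as follows. Two points are assumptions of the sketch rather than statements from the description: halfwords are added with one's-complement (end-around carry) addition so that the result can later be adjusted toward a TCP checksum, and the data is modeled as a byte stream in packet order, so only a trailing odd byte is zero padded. The class and method names are hypothetical.

```python
class BlindAccumulator:
    """Sums big-endian halfwords of whatever bytes pass through, in order."""

    def __init__(self) -> None:
        self.total = 0
        self._pending = b""     # odd byte carried over between chunks

    @staticmethod
    def _fold(total: int) -> int:
        while total >> 16:      # end-around carry
            total = (total & 0xFFFF) + (total >> 16)
        return total

    def update(self, chunk: bytes) -> None:
        """Accumulate one retrieved chunk without interpreting its contents."""
        data = self._pending + chunk
        even_len = len(data) & ~1
        for i in range(0, even_len, 2):
            self.total += (data[i] << 8) | data[i + 1]
        self.total = self._fold(self.total)
        self._pending = data[even_len:]

    def value(self) -> int:
        """Blind checksum so far; a trailing odd byte is padded with zero."""
        total = self.total
        if self._pending:
            total += self._pending[0] << 8
        return self._fold(total)


acc = BlindAccumulator()
for chunk in (b"\x01\x02\x03", b"\x04\x05"):    # e.g. two sequential retrievals
    acc.update(chunk)
blind = acc.value()
```

Because `update` never looks at header fields, the same accumulator works whether the retrieved bytes contain the full packet or only the body.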
In next step 264, the buffer packet information from main storage 22 is stored in the data buffers 220 (this actually can occur as each portion of the packet information is being retrieved by the DMA unit). In addition, once the packet information from main storage 22 is fully retrieved, the accumulated blind checksum value is sent to the Tx processor 216. In step 266, it is checked whether the header 210 was with the body 212 in the packet information retrieved from the buffer of main storage 22 in step 262. In one case, the header 210 was retrieved with the body 212 from the buffer area of main storage 22 and both header and body are stored in data buffers 220; if this is the case, the process continues to step 270, described below, where both header and body are sent via the merge unit 222 to the XCS unit 224.
In the other case, the header 210 is not retrieved with the body 212, since the header 210 was retrieved by the Tx processor 216 in step 254 with the descriptor 208 from the descriptor area of main storage 22. If this is the case, then the process continues to step 268, in which the header from the Tx processor 216 and the body from the data buffers 220 are merged at the merge unit 222. For example, the Tx processor 216 sends the header 210 to the merge unit 222, and then instructs that the body 212 in the data buffers 220 be sent to the merge unit 222 to be placed after the header 210 so that the header is merged with the body to create the full packet. The process then continues to step 270.
In step 270, the packet and other information are sent from or via the merge unit 222 to the XCS unit 224. The other information is directly sent from the Tx processor 216 to the XCS unit 224. In other embodiments, the other information can be sent with the packet via the merge unit 222, e.g., prepended to the packet as a “sticker” which is later removed by the XCS unit; such an embodiment requires no direct connections of Tx processor and XCS unit. The other information includes the blind checksum value and an immediate data length (IMMLEN) value that indicates the number of bytes of the packet that have not been included in the blind checksum of step 262 (known from the descriptor 208).
Depending on the parsing abilities of the XCS unit in different embodiments, the other information may also include an IP start offset indicating the offset in halfwords (or other storage unit) at which the IP header begins from the beginning of the packet, a TCP start offset indicating the offset at which the TCP segment begins from the beginning of the packet, and a TCP checksum offset indicating the offset in halfwords at which the TCP checksum field begins. For example, in the described embodiment, the XCS unit 224 does not parse this offset information from the packet, but instead receives it directly from the Tx processor 216, where the Tx processor retrieved it from the descriptor 208. In a different embodiment, the XCS unit 224 can parse the packet to determine these offsets, e.g., start at the beginning of the packet at, e.g., Ethernet packet information, and continue field by field to the IP header start, the TCP header start, the TCP checksum field start, etc.
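The field-by-field parsing alternative can be sketched in Python as follows. This is an illustrative sketch only and not the described hardware: it assumes an untagged Ethernet II frame carrying IPv4, and the function name find_offsets and the returned key names are hypothetical.

```python
import struct

ETH_HEADER_LEN = 14       # assumption: untagged Ethernet II header (no VLAN tag)
ETHERTYPE_IPV4 = 0x0800
TCP_CHECKSUM_FIELD = 16   # byte offset of the checksum field inside a TCP header


def find_offsets(packet: bytes) -> dict:
    """Walk the packet field by field and return offsets in halfwords."""
    (ethertype,) = struct.unpack_from("!H", packet, 12)
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("this sketch only handles untagged IPv4 frames")
    ip_start = ETH_HEADER_LEN
    # Low nibble of the first IP byte is the header length in 32-bit words.
    ihl_words = packet[ip_start] & 0x0F
    tcp_start = ip_start + ihl_words * 4
    tcp_checksum = tcp_start + TCP_CHECKSUM_FIELD
    return {
        "ip_start_hw": ip_start // 2,
        "tcp_start_hw": tcp_start // 2,
        "tcp_checksum_hw": tcp_checksum // 2,
    }
```

In the described embodiment these values instead come from the descriptor 208 via the Tx processor 216, so no such parsing is needed in the XCS unit.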
The XCS unit 224 can include a small buffer (e.g., 256 bytes) allowing random access to the checksum field (which could end up at virtually any location in the buffer due to various Ethernet, IP, and TCP header lengths and previous packet sizes which may still be in the buffer). The buffer is used to store the packet (or a portion thereof) until the checksum is fully determined.
In next step 272, the XCS unit 224 adjusts the blind checksum value to correct this checksum so that it corresponds to a TCP checksum. The XCS unit determines whether to adjust the blind checksum value based on examined halfwords in the packet. The XCS unit performs this function on the fly, as each halfword is being received; thus, step 272 is preferably integrated with step 274 as the packet information is being received. A method that the XCS unit 224 can use to adjust the blind checksum value, and achieve the TCP checksum, is described in greater detail below with respect to FIGS. 5a and 5b.
In next step 274, the XCS unit 224 places the determined TCP checksum in the TCP header of the packet, using the TCP checksum field offset received as other information in step 270 or 258, and sends the packet out on the line 206 to the network 14. Once the XCS unit has examined enough halfwords to have fully adjusted the checksum, the TCP checksum value is placed in the checksum field and transmission of the packet begins, and all remaining halfwords of the packet received at the XCS unit 224 can be output on line 206 on the fly, as they are received; this is a significant advantage of the present invention. The process is then complete at 276. It should be noted that there is no need to discard a created pseudo header, because a pseudo header is never separately created; rather, it is included in the determined checksum as part of the adjustment process. Another significant advantage of the invention is that the random access needed is limited to only the small size of the buffer in the XCS unit (e.g., 256 bytes), instead of the large size of the data buffers (e.g., 9 kilobytes for a “jumbo” frame buffer). Random access capability is more expensive to set up for larger buffers. In addition, in the present invention, each byte of the packet does not have to be read from a “distance” and the checksum need not be calculated after the packet is received; rather, the checksum is accumulated as the packet passes, both in the accumulator 219 and in the XCS unit 224, which is much more efficient.
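The idea of accumulating a checksum as data passes, and of later correcting it by adding or subtracting halfwords, can be illustrated with 16-bit one's complement arithmetic, which is the arithmetic the TCP checksum is defined in. The following Python sketch is only a software model under that assumption; the names ones_add, ones_sub and BlindAccumulator are hypothetical and do not come from the described hardware.

```python
def ones_add(a: int, b: int) -> int:
    """16-bit one's complement addition with end-around carry."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def ones_sub(a: int, b: int) -> int:
    """Subtract b by adding its one's complement."""
    return ones_add(a, ~b & 0xFFFF)


class BlindAccumulator:
    """Running sum over every halfword that passes, with no knowledge of fields."""

    def __init__(self) -> None:
        self.total = 0
        self._pending = None  # leftover byte when a chunk ends mid-halfword

    def feed(self, chunk: bytes) -> None:
        data = chunk if self._pending is None else bytes([self._pending]) + chunk
        self._pending = None
        if len(data) % 2:
            self._pending = data[-1]
            data = data[:-1]
        for i in range(0, len(data), 2):
            self.total = ones_add(self.total, (data[i] << 8) | data[i + 1])

    def value(self) -> int:
        # A trailing odd byte is treated as the upper byte of a zero-padded halfword.
        if self._pending is not None:
            return ones_add(self.total, self._pending << 8)
        return self.total
```

Because the sum does not depend on how the data is split into portions, the value can be accumulated while each portion is retrieved, and a halfword that should not have been counted can be removed afterwards with ones_sub.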
FIGS. 5a and 5b are flow diagrams illustrating a method detailing a particular implementation of step 272 of method 250 of FIG. 4, in which the XCS unit 224 adjusts the blind checksum value to achieve a TCP checksum (or calculates the TCP checksum directly without using a blind checksum, in some cases). The method of FIGS. 5a and 5b is just one example of how the final checksum determination can be implemented.
The process begins at 282 in FIG. 5a, and in step 284, a halfword pointer (HP) is moved to the next halfword (HW) in the received packet information. The first time step 284 is performed, the pointer starts at the first halfword of the packet (offset zero). As detailed above for FIG. 4, the XCS unit 224 receives the blind checksum value from the Tx processor 216, and the packet information that is to be transmitted via the merge unit 222. As each halfword is received by the XCS unit, it moves its halfword pointer to that received halfword, to achieve on-the-fly examination and adjustment of the blind checksum value. As explained above, this method assumes the XCS unit received other information from the Tx processor 216, such as the various offsets required in order to know the positions of various fields in the packet and to store the checksum in the TCP header. E.g., the XCS unit knows where the IP header starts due to having received an IP header offset, and knows where the TCP header starts by having received a TCP header offset. Alternatively, the XCS unit can determine or parse these offsets by moving the pointer a known, standard number of bits or bytes through the packet information for each field and offset. Based on these offsets, the XCS unit can calculate the offsets of other fields, such as the IP SA and DA offsets, needed for the checksum determination.
As described above, one of the other values received from the Tx processor 216 is the IMMLEN value, which is the immediate data length for the packet, i.e., the number of bytes in the packet that have not been included in the blind checksum, as known from the descriptor 208. Thus, if IMMLEN equals zero, this indicates that all of the packet halfwords were included in the blind checksum, and that if any adjustment is required for the current halfword, it will be to subtract it from the blind checksum. Likewise, if IMMLEN is not zero (i.e., positive), then some of the packet halfwords were not included in the blind checksum, and subtraction and/or addition of halfwords may be needed for correction. In the described embodiment, it is assumed that IMMLEN is either zero or greater than the IP packet start; this restriction reduces hardware implementation complexity. Other embodiments can use methods to avoid this restriction.
In next step 286, the process checks whether the halfword pointer is at the IP source address (SA) or the IP destination address (DA) stored in the IP header of the packet. This information typically is specified in full words, so there are two halfwords provided for each address. If the pointer is at a halfword for one of these addresses, then the process checks in step 288 whether IMMLEN is equal to zero. If so, then as indicated in step 290 the checksum is not adjusted, and the process returns to step 284 to move to and examine the next halfword. If IMMLEN is not equal to zero, then this IP address information was not included in the blind checksum, and in step 292 the current halfword is added to the blind checksum. This is because the IP source and destination addresses are required for TCP checksum determination, i.e., these addresses are included in the pseudo header used in TCP checksum determination. The process then returns to step 284.
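For reference, the conventional way of determining a TCP checksum over IPv4, which the adjustment process reproduces without building a separate pseudo header, can be sketched in Python as below. The pseudo header layout (source address, destination address, a zero byte, the protocol byte, and the TCP segment length) follows the standard TCP definition; the function names are hypothetical, and the sketch assumes the checksum field in the segment is zero while the sum is computed.

```python
import struct

TCP_PROTOCOL = 6


def ones_sum(data: bytes, total: int = 0) -> int:
    """16-bit one's complement sum of data, zero-padded to a whole halfword."""
    if len(data) % 2:
        data += b"\x00"
    for (halfword,) in struct.iter_unpack("!H", data):
        total += halfword
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_pseudo_header(src: bytes, dst: bytes, tcp_length: int) -> bytes:
    """Source address, destination address, zero byte, protocol, TCP length."""
    return src + dst + struct.pack("!BBH", 0, TCP_PROTOCOL, tcp_length)


def tcp_checksum(src: bytes, dst: bytes, segment: bytes) -> int:
    """Checksum over pseudo header plus segment (checksum field assumed zero)."""
    total = ones_sum(tcp_pseudo_header(src, dst, len(segment)))
    total = ones_sum(segment, total)
    return ~total & 0xFFFF
```

A receiver that sums the pseudo header and the segment with the checksum filled in obtains 0xFFFF, which is a convenient self-check for the sketch.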
If the check of step 286 is negative, then in step 294 the process checks whether the halfword pointer is at a halfword describing the protocol field and time-to-live (TTL) field of the IP header of the packet. Each of these fields is a byte, and thus the halfword would include both fields. If the pointer is at this halfword, then in step 296 the process checks whether IMMLEN is equal to zero (i.e., whether this halfword is already included in the blind checksum). If so, in step 298 the TTL portion (byte) of the halfword is subtracted out, since the TTL field is not needed in the pseudo header to determine the TCP checksum; this leaves the protocol field, which is needed in the pseudo header. The process then returns to step 284 to move to and examine the next halfword. If IMMLEN is not equal to zero, then this halfword was not included in the blind checksum, and in step 300 the protocol portion of the halfword (byte) is added to the blind checksum, since the protocol field is needed in the pseudo header. The process then returns to step 284.
If the check of step 294 is negative, then in step 302 the process checks whether the halfword pointer is at a halfword describing the Version field (half byte), IP Header Length (HL) field (half byte), and Type of Service (TOS) field (1 byte) of the IP header of the packet. If the pointer is at this halfword, then in step 304 the process checks whether IMMLEN is equal to zero. If so, in step 306 this halfword is subtracted from the blind checksum, since these fields are not needed in the TCP pseudo header. In addition, the IP Header Length field is saved in a hardware latch (or other convenient storage), since this field is needed for step 314 (described below). The process then returns to step 284. If IMMLEN is not equal to zero, then the current halfword was not included in the blind checksum, and in step 308 the IP Header Length field is subtracted from the blind checksum. This is in anticipation of step 316 (described below), in which the total length is added; since the required pseudo header length field is the TCP segment length (IP total length minus IP header length), the IP Header Length can be subtracted out now. The process then returns to step 284. Note that this is an advantage of the present invention: a normally unused adder cycle, in which the XCS unit would do nothing, is instead efficiently utilized to make an adjustment to the blind checksum in anticipation of other adjustments, thus saving additional operations at the time of those later adjustments and avoiding the use of additional halfword adders.
If the check of step 302 is negative, then in step 310 the process checks whether the halfword pointer is at a halfword describing the IP Total Length field of the packet in the IP header of the packet. If the pointer is at this halfword, then in step 312 the process checks whether IMMLEN is equal to zero and this halfword is included in the blind checksum. If so, in step 314 the IP Header Length field saved in a latch in step 306 is subtracted from the blind checksum, since the pseudo header needs the TCP segment length (i.e., IP total length minus IP header length) to determine the TCP checksum. The process then returns to step 284. If IMMLEN is not equal to zero, then the current halfword was not included in the blind checksum, and in step 316 the current halfword is added to the blind checksum. This creates the desired TCP segment length in the pseudo header since the IP Header Length was subtracted in step 308. The process then returns to step 284.
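The two-step length handling for the case where IMMLEN is not zero (subtract the header length early in step 308, add the total length later in step 316) can be checked with a few lines of Python. This is a minimal arithmetic sketch, assuming 16-bit one's complement arithmetic and assuming the IP Header Length field, which counts 32-bit words, is converted to bytes before it is subtracted; the function names are hypothetical.

```python
def ones_add(a: int, b: int) -> int:
    """16-bit one's complement addition with end-around carry."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def ones_sub(a: int, b: int) -> int:
    """Subtract b by adding its one's complement."""
    return ones_add(a, ~b & 0xFFFF)


def length_in_two_steps(checksum: int, ihl_words: int, total_length: int) -> int:
    """Model of step 308 followed by step 316."""
    checksum = ones_sub(checksum, ihl_words * 4)   # step 308: header length out
    return ones_add(checksum, total_length)        # step 316: total length in


def length_in_one_step(checksum: int, ihl_words: int, total_length: int) -> int:
    """Adding the TCP segment length directly, for comparison."""
    return ones_add(checksum, total_length - ihl_words * 4)
```

For a running value of 0x1234, a 20-byte IP header (ihl_words of 5) and a total length of 1500, both functions return 0x17FC, so the early subtraction costs no accuracy and uses an otherwise idle adder cycle.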
If the check of step 310 is negative, then in step 318 the process checks whether the halfword pointer is less than or equal to the IP header end, i.e., whether the pointer is at a halfword in the IP header that is not covered by the steps described above. If the pointer is at such a halfword location in the packet, then in step 320 the process checks whether IMMLEN is equal to zero and this halfword is included in the blind checksum. If so, in step 322 the current halfword is subtracted from the blind checksum value, since the pseudo header or TCP checksum does not need any other IP header halfwords or fields except those described in the steps above. The process then returns to step 284. If IMMLEN is not equal to zero, then the current halfword was not included in the blind checksum, and as indicated in step 324, the checksum is not adjusted. The process then returns to step 284.
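The IMMLEN equal to zero branches of FIG. 5a can be modeled in software as follows. This Python sketch is only an illustration under stated assumptions: an untagged Ethernet II frame with IPv4, a TCP checksum field that is zero in the buffer, no Ethernet padding, 16-bit one's complement arithmetic, and the IP Header Length converted from words to bytes. All function names are hypothetical.

```python
import struct


def ones_add(a: int, b: int) -> int:
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def ones_sub(a: int, b: int) -> int:
    return ones_add(a, ~b & 0xFFFF)


def halfwords(data: bytes) -> list:
    if len(data) % 2:
        data += b"\x00"
    return [hw for (hw,) in struct.iter_unpack("!H", data)]


def blind_sum(data: bytes) -> int:
    """Sum of every halfword, with no regard to field boundaries."""
    total = 0
    for hw in halfwords(data):
        total = ones_add(total, hw)
    return total


def adjust_immlen_zero(packet: bytes, blind: int, ip_start_hw: int,
                       tcp_start_hw: int) -> int:
    """Walk the halfwords up to the IP header end and correct the blind sum."""
    saved_ihl_bytes = 0
    total = blind
    for hp, hw in enumerate(halfwords(packet)[:tcp_start_hw]):
        rel = hp - ip_start_hw          # halfword index inside the IP header
        if rel in (6, 7, 8, 9):         # IP SA / DA: keep as is (step 290)
            continue
        if rel == 4:                    # TTL and protocol: remove TTL (step 298)
            total = ones_sub(total, hw & 0xFF00)
        elif rel == 0:                  # version, HL, TOS: remove, latch HL (step 306)
            total = ones_sub(total, hw)
            saved_ihl_bytes = ((hw >> 8) & 0x0F) * 4
        elif rel == 1:                  # total length: remove header length (step 314)
            total = ones_sub(total, saved_ihl_bytes)
        else:                           # Ethernet header, other IP fields (step 322)
            total = ones_sub(total, hw)
    return total


def reference_sum(packet: bytes, ip_start_hw: int, tcp_start_hw: int) -> int:
    """Conventional sum over an explicit pseudo header plus the TCP segment."""
    ip = packet[ip_start_hw * 2:]
    segment = packet[tcp_start_hw * 2:]
    pseudo = ip[12:20] + struct.pack("!BBH", 0, ip[9], len(segment))
    return blind_sum(pseudo + segment)
```

Under these assumptions the adjusted value equals the conventional pseudo header sum, and its one's complement is the value that would be placed in the TCP checksum field.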
If the check of step 318 is negative, then the process continues to step 326, as detailed in FIG. 5b. The remaining steps include situations where the halfword pointer is pointed to halfwords in the TCP header, the TCP payload, and after the TCP payload.
In step 326, the process checks whether the halfword pointer is equal to the TCP checksum field. If so, in step 328, the process checks whether the halfword pointer is equal to IMMLEN, i.e., whether the current halfword has an offset equal to the number of bytes not included in the blind checksum. If so, then it indicates that the halfword pointer is at the halfword at a boundary, and that the checksum field is at that boundary. In step 330 the process checks whether IMMLEN is on an even-numbered address boundary. If so, a halfword is subtracted out in step 332, and the process returns to step 284. If on an odd boundary, then only the lower byte of the halfword is subtracted out in step 334, since the upper byte is already included in the blind checksum. The process then returns to step 284. If the pointer is not equal to IMMLEN at step 328, then in step 336 the process checks whether the pointer is greater than IMMLEN. If so, a halfword is subtracted out in step 338, and the process returns to step 284. If not, the checksum is not adjusted as indicated in step 340, and the process returns to step 284.
In step 326, the process checks whether the halfword pointer is equal to IMMLEN, i.e., whether the current halfword has an offset equal to the number of bytes not included in the blind checksum. If so, then it indicates that the halfword pointer is at the halfword at a boundary, such as the boundary between the IP header and the TCP header, or the boundary at the end of the packet; the location of the boundary depends on how much of the packet was included in the blind checksum. In step 328 the process checks whether IMMLEN is on an odd-numbered address boundary. If so, then in step 330 only the upper byte of the current halfword is added to the blind checksum value, since the lower byte belongs to the TCP header, for example, and was already included in the blind checksum value. The blind checksum value has thus been fully adjusted to conform to a TCP checksum, and the process is then complete at 338. If the pointer is at an even-numbered boundary, then no additional bytes need be added, and the blind checksum value is not corrected as indicated in step 332 (e.g., a zero can be added). The adjustment/correction process of the blind checksum is then over as indicated at 338, resulting in a TCP checksum value.
If the check of step 326 is negative, then the process continues to step 334, in which the process checks whether the halfword pointer is greater than IMMLEN, i.e., whether the current halfword is at a halfword already included in the blind checksum or is after the end of the TCP payload. If so, then remaining halfwords are already included in the blind checksum value, and the checksum is not adjusted as indicated in step 336. The adjustment process of the blind checksum to a TCP checksum value is then over as indicated at 338. If the halfword pointer is not greater than IMMLEN at step 334, then it is less than IMMLEN and pointing to a halfword that was not included in the blind checksum. Thus, in step 340, the current halfword is added to the blind checksum, and the process then returns to step 284 for the next halfword. For example, step 340 can add halfwords to the checksum from the TCP header or body which were not included in the blind checksum (or if a blind checksum was never created, e.g., the body 212 was provided directly to Tx processor 216).
It should be noted that the process of FIGS. 5a and 5b assumes that the header 210 is not stored so that it is split between the descriptor area of main storage 22 and the buffer area of main storage, i.e., that the header data remains contiguous. This is because this process assumes that the IMMLEN value, retrieved from the descriptor 208, does not fall within the Ethernet or IP header, i.e., it is either zero or greater than the start offset of the IP header.
Although the present invention has been described in accordance with the embodiments shown, one of ordinary skill in the art will readily recognize that there could be variations to the embodiments and those variations would be within the spirit and scope of the present invention. Accordingly, many modifications may be made by one of ordinary skill in the art without departing from the spirit and scope of the appended claims.