Disclosure of Invention
The invention aims to provide an efficient cache management system for an Ethernet switch so as to overcome the defects of the prior art.
In order to achieve this purpose, the invention adopts the following technical scheme:
an efficient cache management system for an Ethernet switch comprises a cache management module, a queue management module, an ingress control module, an egress control module, a QoS control module and a register module;
the cache management module is used for allocating and releasing the cache addresses of the data packets received from and transmitted to the MAC, passing the allocated data packets to the queue management module, and discarding the data packets to be discarded so as to release their cache space;
the queue management module is used for enqueuing and dequeuing the data packets allocated by the cache management module;
the ingress control module is used for controlling and monitoring the cache usage of the CoS queues of the ingress ports;
the egress control module is used for counting egress cache usage to realize egress flow control;
the QoS control module classifies the received data packets by flow, puts the data packets into different CoS queues according to the classification result, and realizes flow control and congestion handling for ports and queues;
the register module is used for configuring the cache management unit.
Further, the packets to be discarded include packets for which no forwarding port can be found, jumbo frames exceeding the specified length that the port cannot receive, and packets given a discard flag by other control logic.
Further, the data cache space of the cache management module stores the data packets to be forwarded; when a receiving port receives a data packet, the cache management module allocates corresponding space for it, generates descriptor information at the same time, and sends the descriptor information to the queue management module; if the data packet is to be discarded, it is discarded and the allocated storage space is released.
Furthermore, the cache of the cache management module takes CELL as a minimum unit, and the size of each CELL is 128 bytes.
Further, the egress side of the queue management module organizes its output queues using a two-layer linked list structure, wherein the first layer is a transmit queue linked list and the second layer is a buffer mark linked list.
Further, when a packet is stored, the descriptor management module writes the packet's descriptor into the transmit descriptor queue of a port; when the switch controller reads the packet's descriptor from the transmit descriptor queue, it reads the packet data from the data buffer according to the fields of the descriptor and transmits the data from the corresponding port.
Further, the queue management module comprises a cache request module, a descriptor write request control module, a descriptor read request control module, a descriptor management module, a descriptor cache module, a CELL write management module and a data packet write management module;
the cache request module is used for recording the address allocated to the data packet by the cache management module;
the descriptor management module is used for writing the information recorded by the cache request module into a corresponding descriptor queue;
the descriptor write request control module requests the corresponding transmit port to perform a descriptor queue write operation according to the result of forwarding control;
the descriptor read request control module is used for judging whether a transmit port can forward data and for issuing a descriptor queue read request;
the descriptor caching module is used for storing descriptor linked list information;
the CELL write management module is used for recording the write state of the CELL currently requesting enqueue and preparing for the port's next CELL write;
the data packet write management module is used for recording the write state of the data packet currently requesting enqueue and preparing for the port's next data packet write.
Further, if the cache occupied by a CoS queue of an ingress port exceeds a threshold, a flow control message is generated; when the opposite end processes the flow control message, it stops sending data packets.
Further, the ingress control module divides the cache space into a port guaranteed space, a shared space and a headroom space, wherein the port guaranteed space provides a minimum guaranteed available space for each port; the shared space provides shared cache space for a port when its minimum guaranteed space is insufficient; and the headroom space provides some extra buffering capacity when the minimum guaranteed space and the shared cache space are both insufficient.
Further, the register module is used for configuring the cache management unit, including the cache space, the CoS queues, and the thresholds of the flow control and shaping functions in the ingress control, egress control and QoS control modules.
Compared with the prior art, the invention has the following beneficial technical effects:
the invention relates to an efficient cache management system for an Ethernet switch. The cache space of the cache management module is partitioned and organized into queues by the queue management module; the ingress control module controls and monitors the cache usage of the CoS queues of the ingress ports; received data packets are classified by flow and placed into different CoS queues according to the classification result, realizing flow control and congestion handling for ports and queues as well as reasonable, dynamically managed use of the cache; and the egress queues are managed through a double linked list technique.
Furthermore, the cache space is divided into a port guaranteed space, a shared space and a headroom space. The port guaranteed space reserves a minimum cache space for each port that other ports cannot occupy, and a port may use the shared cache only after its guaranteed space is used up, which ensures reasonable use of the cache space by every port. The headroom space stores the small amount of data that may still arrive after the switch sends a flow control signal when the shared cache is exhausted, ensuring that no packets are lost.
Furthermore, the cache management unit realizes functions such as flow classification, traffic policing, shaping and congestion management, achieving efficient non-blocking forwarding of data and guaranteeing quality of service.
Furthermore, incoming messages are counted independently at the ingress ports and the egress ports, and flow control and congestion handling are realized by applying thresholds, leaky buckets, WRED, and per-queue classification policies.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings:
as shown in fig. 1, an efficient cache management system for an ethernet switch includes a cache management module 1, a queue management module 2, an ingress control module 3, an egress control module 4, a QoS control module 5, and a register module 6;
the buffer management module 1 is used for allocating and releasing buffer space.
The data cache space of the cache management module 1 stores the data packets to be forwarded. When a receiving port receives a data packet, the cache management module 1 allocates corresponding space for it, generates descriptor information at the same time, and sends the descriptor information to the queue management module. If no forwarding port can be found for the packet, if it is a jumbo frame exceeding the specified length that the port cannot receive, or if other control logic gives a discard flag, the packet is discarded and the allocated memory is released. When a data packet satisfies the forwarding condition, the queue management module takes the packet out of the buffer space of the buffer management module 1 through the descriptor information and forwards it, and the corresponding space is released to become new free space.
The cache of the cache management module 1 takes the CELL as its minimum unit, and the size of each CELL is 128 bytes. The data cache space of the cache management module 1 is organized with a linked list structure, where the linked list is a singly linked list restricted to deletion at the head node and insertion at the tail node. The free part of the data buffer space is organized as a singly linked list in units of CELLs: each time space is allocated, a node is deleted from the head of the list; each time space is released, a node is inserted at the tail of the list.
The singly linked list is implemented with a 16384 x 14-bit on-chip RAM: the depth 16384 corresponds to the maximum number of free pages, i.e. the entire data buffer space; the content of each 14-bit word is the address of the next word in the control RAM; and the position of each entry in this RAM corresponds to the position of one page in the buffer space. In addition to the control RAM, two 14-bit wide registers record the positions of the head and tail pointers of the singly linked list, denoted Head and Tail respectively.
The free page linked list needs an initialization process before use. After initialization, Head contains 0, word 0 of the control RAM contains 1, word 1 contains 2, and so on up to word 16382, which contains 16383; Tail contains 16383.
The release of buffer space falls into two cases. In one case, the receiving port parses a received data packet, finds that it does not meet the forwarding condition, and the space allocated to that packet must be released. In the other case, the buffer space occupied by a data packet must be released after every sending port has finished forwarding it. Note that when forwarding multicast or broadcast packets, the space may be freed only when all forwarding ports have finished transmitting the data.
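For illustration, the free page linked list can be modeled in software as follows. This is a minimal C sketch under the stated parameters (16384 CELLs, 14-bit next pointers, Head and Tail registers); the function and variable names are illustrative assumptions, since the actual design is a hardware structure.

    #include <stdint.h>
    #include <assert.h>

    #define NUM_CELLS 16384u             /* depth of the control RAM, one entry per CELL */

    static uint16_t next_ram[NUM_CELLS]; /* models the 16384 x 14-bit control RAM */
    static uint16_t head, tail;          /* models the Head and Tail registers */
    static uint32_t free_count;

    /* Initialization: word i points to word i + 1; Head = 0, Tail = 16383. */
    void freelist_init(void) {
        for (uint32_t i = 0; i + 1 < NUM_CELLS; i++)
            next_ram[i] = (uint16_t)(i + 1);
        head = 0;
        tail = NUM_CELLS - 1;
        free_count = NUM_CELLS;
    }

    /* Allocate one CELL: delete the node at the head of the list. */
    int freelist_alloc(uint16_t *cell) {
        if (free_count == 0)
            return -1;                   /* no free page available */
        *cell = head;
        head = next_ram[head];
        free_count--;
        return 0;
    }

    /* Release one CELL: insert a node at the tail of the list. */
    void freelist_free(uint16_t cell) {
        assert(cell < NUM_CELLS);
        if (free_count == 0)
            head = cell;                 /* list was empty; cell becomes the only node */
        else
            next_ram[tail] = cell;
        tail = cell;
        free_count++;
    }

For multicast and broadcast packets, freelist_free would be invoked only after the last forwarding port finishes, as tracked by the 5-bit counter described further below.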
The queue management module 2 organizes the output queues at the egress using a two-layer linked list structure. The first layer is a transmit queue linked list and the second layer is a buffer mark linked list: the transmit queue linked list guarantees per-port packet priority order, and for each packet the buffer mark linked list guarantees that the order of the pages in the cache corresponds to that packet.
Each output port supports a maximum of 24 transmit queues to guarantee quality of service, and all transmit queues share one transmit queue linked list. A transmit queue is maintained as a linked list in which each node is a pointer to a packet buffer identifier. Each buffer identifier includes packet information and a pointer to the next packet identifier, and each cache identifier is associated with one page allocated in the data cache. Packets larger than 128 bytes require multiple cache identifiers.
When a packet is stored, the descriptor management module writes the packet's descriptor into the transmit descriptor queue of a certain port; that is, the stored packet is bound to the corresponding transmit port. When the switch controller reads the packet's descriptor from the transmit port's descriptor queue, it can read the packet data from the data cache according to the fields of the descriptor and send it out from the corresponding port.
The storage location in the data cache of a packet received by a port is arbitrary: one packet may occupy one CELL or multiple CELLs, and those CELLs are not necessarily contiguous. The descriptor structure therefore has to maintain associations both within a data packet and between data packets.
Accordingly, a descriptor records not only the storage location of a packet but also the association information between the CELLs of the packet and between packets. The first layer, the transmit queue linked list, guarantees the priority order of each port's packets, so it records the first address of the packet and other related information. The second layer, the buffer mark linked list, guarantees the order of the buffer pages corresponding to each packet and records the CELL addresses and other information of the data packet.
The descriptor linked list is associated with the data buffer through address pointers. When a data packet is forwarded, the information of the descriptor linked list is written into the descriptor queue of the transmit port, and when a data packet needs to be forwarded to several ports, the descriptor pointers of those ports can point to the same buffer area, which saves storage space.
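The two-layer structure can be pictured with the following C type sketch. All field names here are illustrative assumptions; the text does not enumerate the actual descriptor fields.

    #include <stdint.h>

    /* Second layer (buffer mark linked list): one node per CELL of a packet,
     * preserving the order of the pages that make up the packet. */
    typedef struct {
        uint16_t cell_addr;   /* address of this CELL in the data buffer */
        uint16_t next_cell;   /* next CELL of the same packet */
        uint8_t  sof, eof;    /* start-of-frame / end-of-frame flags */
    } cell_desc_t;

    /* First layer (transmit queue linked list): one node per packet,
     * preserving per-port, per-CoS packet order. */
    typedef struct {
        uint16_t first_cell;  /* head of this packet's buffer mark list */
        uint16_t next_pkt;    /* next packet in the same transmit queue */
        uint16_t pkt_len;     /* packet length in bytes */
        uint8_t  cos;         /* CoS queue number (up to 24 on a high speed port) */
    } pkt_desc_t;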
In the design there are 30 transmit ports, so the on-chip descriptor queue space is divided into 30 port queues, each of which can hold 16383 descriptors. If port priorities are configured, a transmit port can support at most 24 transmit queues; each packet is placed into the corresponding transmit queue according to its priority, and all queues of a port share that port's descriptor queue.
For multicast transmission, when the memory controller receives a packet end signal from an input port, it prepares to bind the currently stored packet to the descriptor queues of the transmit ports. If a multicast transmission is encountered during descriptor binding, that is, a packet is to be output from multiple transmit ports, the relevant attributes of the packet must be bound to all of those transmit port descriptor queues. In the control RAM of the buffer space, the number of port queues bound to a packet is counted by a 5-bit counter. When descriptor binding is performed in the descriptor queue management, a descriptor is written into the corresponding transmit port queue for the current packet once the packet end flag is received and the switch controller confirms that the current packet can be forwarded. When the switch controller finishes sending the packet on a certain port, the packet descriptor must be released: the corresponding data cache space is found from the frame address in the descriptor, the 5-bit count in the corresponding control RAM is decremented by 1, and then the descriptor is released.
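The 5-bit port counter behaves as sketched below in C; the helper names are assumed for illustration.

    #include <stdint.h>
    #include <stdbool.h>
    #include <assert.h>

    #define NUM_BUFFERS 16384u          /* one counter per buffer area in the control RAM */

    static uint8_t refcnt[NUM_BUFFERS]; /* models the 5-bit per-buffer counter */

    /* Binding: one increment for every transmit port descriptor queue the
     * packet is written into (up to 30 ports, which fits in 5 bits). */
    void buffer_bind_port(uint16_t frame_addr) {
        assert(refcnt[frame_addr] < 30);
        refcnt[frame_addr]++;
    }

    /* Called when one port finishes sending the packet; returns true when
     * the last bound port is done and the data buffer may be released. */
    bool buffer_unbind_port(uint16_t frame_addr) {
        assert(refcnt[frame_addr] > 0);
        return --refcnt[frame_addr] == 0;
    }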
As shown in fig. 2, the queue management module 2 includes a buffer request module 7, a descriptor write request control module 8, a descriptor read request control module 9, a descriptor management module 10, a descriptor buffer module 11, a CELL write management module 12, and a packet write management module 13;
the buffer request module 7 is configured to record the addresses allocated to the data packets by the buffer management module, including all CELL addresses of each data packet together with related CELL information such as SOF, EOF and BE; the descriptor management module 10 is configured to write the information recorded by the buffer request module 7 into the corresponding descriptor queue, thereby realizing the association of the data packets.
The descriptor write request control module 8 requests the corresponding transmit port to perform the descriptor queue write operation according to the result of forwarding control.
The descriptor read request control module 9 is configured to judge whether a transmit port can forward data and to issue a descriptor queue read request; when a packet completes the forwarding handshake with the descriptor management module and the send-complete flag is given, it requests the cache management module 1 to release the cache space.
The descriptor management module is the core module of queue management; it mainly performs write control and read control of the descriptor queues.
The descriptor buffer module 11 is used for storing descriptor linked list information. The buffer is updated during enqueuing and dequeuing of packets.
The CELL write management module is used for recording the write state of the CELL currently requesting enqueue and preparing for the port's next CELL write.
The data packet write management module is used for recording the write state of the data packet currently requesting enqueue and preparing for the port's next data packet write.
The transmit queue write control arbitrates among the write requests of the 30 port queues, performs the requested packet enqueue operation for the arbitrated port, and updates the corresponding queue pointers and descriptors. If port 2 issues a request first, the state machine jumps to the previous-address state of port 2. If the request is the first CELL request of a data packet, the frame linked list must be updated according to the configured queue mode and the packet's queue number: the allocated address is written into the next-frame-address field of the entry pointed to by the tail pointer of the frame linked list, and the next-page address field in the descriptor pointed to by the tail pointer of the buffer mark linked list must be updated as well. If the request is not the first CELL request of a frame, only the buffer mark linked list descriptor needs updating. After these operations, the state machine jumps to the current-address state of port 2; in this state the tail pointer and descriptor queue of the corresponding queue are updated according to the configured queue mode and the requested queue number. If the request is the first request of the packet, the descriptors and tail pointers of both the transmit queue linked list and the buffer mark linked list must be updated; otherwise only the descriptor entries and tail pointer of the buffer mark linked list need updating.
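A heavily simplified C sketch of this enqueue path follows, reusing the descriptor types sketched earlier. It collapses the two-state machine into a single function and omits the 30-port arbitration and the queue-mode selection; all names are illustrative assumptions.

    #include <stdint.h>
    #include <stdbool.h>

    /* Per-(port, queue) state: tail pointers into both linked lists. */
    typedef struct {
        uint16_t pkt_tail;    /* tail of the transmit queue linked list */
        uint16_t cell_tail;   /* last CELL written for the current tail packet */
        bool     empty;       /* no packet enqueued yet */
    } tx_queue_t;

    /* Enqueue one CELL just written into the data buffer at address addr;
     * pkt_desc_t / cell_desc_t are the types from the earlier sketch. */
    void enqueue_cell(tx_queue_t *q, pkt_desc_t pkt_ram[], cell_desc_t cell_ram[],
                      uint16_t addr, bool first_cell_of_pkt) {
        if (first_cell_of_pkt) {
            /* first CELL of a packet: extend the frame (transmit queue) list */
            if (!q->empty)
                pkt_ram[q->pkt_tail].next_pkt = addr;
            pkt_ram[addr].first_cell = addr;
            q->pkt_tail = addr;
            q->empty = false;
        } else {
            /* continuation CELL: extend only the buffer mark linked list */
            cell_ram[q->cell_tail].next_cell = addr;
        }
        q->cell_tail = addr;  /* the new CELL is now the packet's last CELL */
    }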
The transmit queue read (send) state machine describes how the transmit queues of a port send. A scheduler monitors the bandwidth usage of the CoS queues of each egress port; the monitoring mechanism classifies the CoS queues into different scheduling groups, and bandwidth is monitored independently for each CoS queue. Minimum bandwidth monitoring provides a minimum bandwidth guarantee to each egress port's CoS queues, and maximum bandwidth monitoring enforces a maximum bandwidth limit on them. Both are realized through a leaky bucket mechanism.
A normal port has only 8 standard CoS queues, all scheduled by the S2 scheduler. A high speed port can support 24 CoS queues: the first 8 CoS queues are scheduled by the S2 scheduler and the other 16 by the S1 scheduler, which guarantees the quality of service of the high speed ports.
The egress queue scheduling supports four CoS queue scheduling algorithms, selectable by the user as needed: strict priority (SP), round-robin (RR), weighted round-robin (WRR), and weighted deficit round-robin (WDRR).
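Of the four algorithms, WDRR is the least obvious; a generic C sketch of a deficit based pick is given below. This is textbook weighted deficit round-robin under assumed helper names, not the exact hardware scheduler.

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_COS 8   /* standard CoS queues of a normal port */

    typedef struct {
        uint32_t weight;    /* quantum of credit added per scan round */
        uint32_t deficit;   /* accumulated credit in bytes */
    } wdrr_q_t;

    /* Returns the CoS queue allowed to send its head packet, or -1 if all
     * queues are empty. head_len(i) is an assumed helper returning the head
     * packet length of queue i, or 0 when queue i is empty. Every weight is
     * assumed to be at least 1 so the loop terminates. */
    int wdrr_pick(wdrr_q_t q[NUM_COS], uint32_t (*head_len)(int)) {
        for (;;) {
            bool any_backlogged = false;
            for (int i = 0; i < NUM_COS; i++) {
                uint32_t len = head_len(i);
                if (len == 0) { q[i].deficit = 0; continue; } /* empty: reset credit */
                any_backlogged = true;
                q[i].deficit += q[i].weight;     /* accrue this round's quantum */
                if (q[i].deficit >= len) {
                    q[i].deficit -= len;         /* spend credit on the packet */
                    return i;
                }
            }
            if (!any_backlogged)
                return -1;
        }
    }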
The ingress control module 3 is used for controlling and monitoring the cache usage of the CoS queues of the ingress ports. If a threshold is exceeded, a flow control message is generated; when the opposite end processes the flow control message, it stops sending data packets, relieving the pressure at the ingress.
The buffer space is divided into three parts: the port guaranteed space, the shared space, and the headroom space.
The port guaranteed space provides each port with its minimum guaranteed available space. It is set through register configuration; there is only one set of configuration registers, so the space is the same for all ports. The shared space provides shared cache space for a port when its minimum guaranteed space is insufficient; it comprises the total shared cache space and the maximum shared cache space available to each port, and every port may select either dynamic configuration or a static threshold. The headroom space provides some extra cache capacity when the minimum guaranteed space and the shared cache space are both insufficient. The priority-group headroom stores the flow control frames and the part of the messages that the opposite end may still send before it stops sending; the global headroom sets aside part of the buffer space as shared headroom for all ports when headroom is not allocated to each port individually. If this space is used, each port is allowed to store only one packet in it.
The caching rule is that when a data packet is received, the port guaranteed space is used first, then the shared space, and finally the headroom space; after a data packet is sent, the headroom space is released first, then the shared space, and finally the port guaranteed space.
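The caching rule can be summarized by the following C sketch of per-port accounting in CELL units. The structure and names are assumptions for illustration; bookkeeping of the global shared pool and of per-priority-group headroom is omitted.

    #include <stdint.h>
    #include <stdbool.h>

    typedef struct {
        uint32_t guar_used,   guar_limit;    /* port guaranteed space */
        uint32_t shared_used, shared_limit;  /* this port's shared space limit */
        uint32_t head_used,   head_limit;    /* headroom space */
    } port_acct_t;

    /* On receive: use guaranteed space first, then shared, then headroom. */
    bool admit_cell(port_acct_t *p) {
        if (p->guar_used < p->guar_limit)     { p->guar_used++;   return true; }
        if (p->shared_used < p->shared_limit) { p->shared_used++; return true; }
        if (p->head_used < p->head_limit)     { p->head_used++;   return true; }
        return false;                          /* no space anywhere: drop */
    }

    /* On transmit: release headroom first, then shared, then guaranteed. */
    void release_cell(port_acct_t *p) {
        if (p->head_used > 0)        p->head_used--;
        else if (p->shared_used > 0) p->shared_used--;
        else                         p->guar_used--;
    }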
Each port is provided with an independent leaky bucket mechanism that monitors cache usage and triggers flow control at the ingress port to realize ingress traffic shaping. BUCKET_COUNT represents the number of tokens currently in the bucket and is 0 initially; when a message arrives, it is converted into the corresponding number of tokens according to its size in bytes, and these are added to the bucket. Every T_REFRESH period, REFRESH_COUNT tokens are removed from the bucket (BUCKET_COUNT -= REFRESH_COUNT). The granularity of each token is selected by METER_GRANULARITY. When BUCKET_COUNT reaches the DISCARD_THD threshold, the MMU tells the port to discard arriving packets.
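Using the register names above, the per-port meter behaves as in this C sketch (the software form and the round-up rounding choice are assumptions):

    #include <stdint.h>
    #include <stdbool.h>

    typedef struct {
        uint32_t bucket_count;        /* BUCKET_COUNT: tokens currently in the bucket */
        uint32_t refresh_count;       /* REFRESH_COUNT: tokens drained per period */
        uint32_t meter_granularity;   /* METER_GRANULARITY: bytes per token */
        uint32_t discard_thd;         /* DISCARD_THD: discard threshold */
    } leaky_bucket_t;

    /* Called once every T_REFRESH period: drain tokens from the bucket. */
    void lb_refresh(leaky_bucket_t *lb) {
        lb->bucket_count = (lb->bucket_count > lb->refresh_count)
                         ? lb->bucket_count - lb->refresh_count : 0;
    }

    /* Called per arriving message: convert its byte size into tokens and add
     * them; returns true if the MMU should tell the port to discard packets. */
    bool lb_on_packet(leaky_bucket_t *lb, uint32_t pkt_bytes) {
        uint32_t tokens = (pkt_bytes + lb->meter_granularity - 1)
                        / lb->meter_granularity;  /* round up to whole tokens */
        lb->bucket_count += tokens;
        return lb->bucket_count >= lb->discard_thd;
    }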
The egress control module 4 is used for controlling maximum throughput.
The principle of ingress control is to avoid packet loss as far as possible; the principle of egress control is to achieve the highest possible throughput.
Egress control is realized by setting thresholds for each port. Egress ports are associated with CoS queues, each of which has its own thresholds that determine which packets enter the queues and which packets destined for the port are discarded. Like ingress control, egress control has two cache resources, a minimum guaranteed space and a shared space. When a data packet is sent, the shared space is released preferentially, reducing the impact on the cache use of other ports.
The QoS control module 5 performs flow classification on the received data packets, puts the data packets into different CoS queues according to the classification result, and implements flow control and congestion handling for ports and queues.
The QoS control module supports both traditional flow control and service-related flow control. Traditional flow control back-pressures the whole port by means of PAUSE frames. Service-related flow control is refined to each CoS queue within a port, and the flow control of each CoS queue is independent, which allows fine-grained flow control between CoS queues. For example, a high priority flow can be controlled so that it loses no packets, while a lower threshold is set for a low priority flow, which is simply discarded without flow control once the threshold is exceeded.
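The contrast between lossless and lossy handling per CoS queue can be sketched as follows in C; the thresholds, the hysteresis, and all names are assumptions for illustration.

    #include <stdint.h>
    #include <stdbool.h>

    typedef struct {
        uint32_t depth;       /* current cache use of this CoS queue, in CELLs */
        uint32_t fc_on_thd;   /* assert flow control at or above this depth */
        uint32_t fc_off_thd;  /* release flow control below this depth */
        uint32_t drop_thd;    /* lossy queues: discard above this depth */
        bool     lossless;    /* true: flow control; false: threshold discard */
        bool     paused;      /* drives generation of flow control frames */
    } cos_fc_t;

    /* Returns true if the arriving packet should be discarded. */
    bool cos_on_arrival(cos_fc_t *q, uint32_t cells) {
        if (!q->lossless && q->depth + cells > q->drop_thd)
            return true;                 /* low priority: discard, no flow control */
        q->depth += cells;
        if (q->lossless && q->depth >= q->fc_on_thd)
            q->paused = true;            /* ask the opposite end to stop sending */
        return false;
    }

    void cos_on_departure(cos_fc_t *q, uint32_t cells) {
        q->depth -= cells;
        if (q->paused && q->depth < q->fc_off_thd)
            q->paused = false;           /* allow the opposite end to resume */
    }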
After a data packet enters the buffer and is queued, the QoS control module updates the relevant internal resource registers. Based on these resource statistics registers, it decides whether a port enters the flow control state, head-of-line blocking handling, the weighted random early discard (WRED) state, or deterministic forwarding.
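The WRED decision itself follows the classic form sketched below in C; the linear drop ramp and the parameter names are assumptions, since the text does not give the exact WRED profile.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stdlib.h>

    typedef struct {
        uint32_t min_thd;    /* below this average depth: always enqueue */
        uint32_t max_thd;    /* at or above this average depth: always discard */
        uint32_t max_prob;   /* drop probability at max_thd, out of 1000 */
    } wred_cfg_t;

    /* Per-CoS parameters give lower priority traffic smaller thresholds. */
    bool wred_should_drop(const wred_cfg_t *c, uint32_t avg_depth) {
        if (avg_depth < c->min_thd)  return false;
        if (avg_depth >= c->max_thd) return true;
        /* linear ramp of drop probability between the two thresholds */
        uint32_t prob = c->max_prob * (avg_depth - c->min_thd)
                      / (c->max_thd - c->min_thd);
        return (uint32_t)(rand() % 1000) < prob;
    }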
The register module 6 is used for configuring the cache management unit. This mainly covers the cache space, the CoS queues, and the thresholds of the flow control and shaping functions in the ingress control, egress control and QoS modules.
In use, after power-on the user can configure the system according to different requirements so as to meet different application needs.
In the invention, the cache space is not treated entirely as a shared cache but is divided into three parts: a port guaranteed space, a shared space and a headroom space. The port guaranteed space reserves a minimum cache space for each port that other ports cannot occupy, and a port may use the shared cache only after its guaranteed space is used up, which ensures reasonable use of the cache space by every port. The headroom space stores the small amount of data that may still arrive after the switch sends a flow control signal when the shared cache is exhausted, ensuring as far as possible that no packets are lost. In addition, the cache management unit realizes functions such as flow classification, traffic policing, shaping and congestion management, achieving efficient non-blocking forwarding of data and guaranteeing quality of service.
The invention relates to an efficient cache management system for an Ethernet switch. By partitioning the cache space, it reserves a minimum guaranteed space for each switch port and realizes dynamic management of the cache unit. Linked list structures are used to manage the cache space and the queues, reducing resource occupation with high reliability. Incoming messages are counted independently at the ingress ports and the egress ports, and flow control and congestion handling are realized by policies such as thresholds, leaky buckets, WRED and queue classification. The egress scheduler adopts a two-level scheduling strategy and the SP/RR/WRR/WDRR queue scheduling algorithms to guarantee the service priority of port messages. Through techniques such as flow classification, traffic shaping and congestion management, efficient forwarding of messages under both normal and congested conditions is realized.
As shown in fig. 1, inside the dotted line is the hardware implementation structure of the efficient cache management unit provided by the invention, and outside the dotted line are the other modules of the gigabit Ethernet switch connected to it, including the receive arbitration module, the transmit arbitration and cache module, the MAC module and the PHY module.
The numbered blocks in fig. 1 are explained below.
The buffer management module 1, the number of which is 1, completes the allocation and release of the buffer addresses of the data packets received from and transmitted to the MAC, including recording the forwarding ports of multicast and broadcast data.
The queue management module 2, the number of which is 1, completes the enqueue and dequeue operations of the data packets.
The ingress control module 3, the number of which is 1, realizes the allocation and flow control of the ingress cache.
The egress control module 4, the number of which is 1, realizes egress flow control by counting the use of the egress cache.
The QoS control module 5, the number of which is 1, realizes flow classification, flow control and congestion management.
The register module 6, the number of which is 1, realizes the configuration of each functional module.
As shown in fig. 2, inside the dotted line is the architecture of the transmit queue module, and outside the dotted line are the other modules connected thereto.
The buffer request modules 7, the number of which is 30, record the CELL addresses allocated to the data packets by the buffer management module and the packet information related to each CELL, such as SOF, EOF and BE.
The descriptor write request control modules 8, the number of which is 30, generate the egress write queue information, including write requests, addresses and packet related information.
The descriptor read request control modules 9, the number of which is 30, implement the queue read request function of the egress.
The descriptor management modules 10, the number of which is 30, implement enqueue and dequeue management for each egress.
The descriptor buffer module 11, the number of which is 1, stores the descriptor queue linked list.
The CELL write management modules 12, the number of which is 30, record the write state of the CELL related information.
The packet write management modules 13, the number of which is 30, record the write state of the packet related information.
The invention can be used in Ethernet switches supporting the store-and-forward architecture, and is particularly suitable for layer-2 and layer-3 high-performance Ethernet switches.
With the cache management provided by the invention, efficient forwarding of messages under both normal and congested conditions can be realized.
In this scheme, the logic functions of each module of the invention are described in the VHDL language and integrated with the MAC and PHY modules of a gigabit Ethernet switch for system-level verification on an FPGA. The verification results show that the invention realizes the designed functions: for data exchange under both normal and congested transmission conditions, it achieves line-rate forwarding of messages with a low packet loss rate.