Background
As the performance of general-purpose processors has improved and cloud computing technology has developed, more and more network devices are being implemented as software running on general-purpose processors. This brings flexibility and low cost, but the kernel's interrupt-based software packet receiving mode gradually becomes a performance bottleneck. The reason is that when a network device receives a large number of network messages, interrupts occur frequently and the message processing logic is repeatedly interrupted, so a large amount of CPU time is consumed in context switching.
To address the low performance of interrupt-based packet reception, several high-performance network solutions based on a busy polling mode have emerged, such as PF_RING and DPDK. Busy polling avoids the CPU overhead caused by frequent interrupts, so the CPU achieves a higher network message processing capacity when the network load is high. However, polling has its own problem: when the network load is low, the busy polling program still occupies the CPU continuously and keeps accessing an empty packet receiving queue. The CPU therefore cannot process other computing tasks, the performance of the whole machine drops, and power consumption rises.
Some solutions attempt to combine the interrupt and polling modes to obtain the advantages of both. For example, the NAPI network driver of the Linux kernel allows packets to be collected by polling multiple times after a single interrupt, and both the number of polls and the polling time interval are configurable. This mode is referred to herein as the interrupt batch mode. By configuring the number of polls and the polling interval, it can accommodate a wider range of network load scenarios. However, the interrupt batch mode still lacks a mechanism for automatically adjusting the polling interval and the number of polls according to the load, so polling is not adjusted in time and the best balance between CPU and network performance cannot be obtained.
The packet receiving load of a network device changes over time, but existing software packet receiving schemes are not flexible enough: the packet receiving mode cannot be adjusted in time when the network load changes, so either network IO performance is insufficient or CPU utilization is poor. For example, when the network load is very low, the interrupt packet receiving mode avoids the CPU cycles wasted on polling an empty queue, whereas busy polling and similar modes reduce effective CPU utilization. When the network load is very high, busy polling and similar modes avoid the context switching overhead caused by frequent interrupts and achieve the highest network processing performance, whereas the interrupt packet receiving mode causes CPU time to be consumed in switching interrupt contexts, which reduces network performance.
Introduction of terms:
Interrupt: when an unexpected situation occurs that requires the host's intervention, the machine automatically stops the running program and transfers to a program that handles the new situation; after the handling is finished, it returns to the suspended program and continues running it.
Polling: the CPU issues queries at regular intervals, asking each object in turn whether it needs service; if so, the service is provided, and after the service is finished the next object is queried, and the operation is repeated.
Disclosure of Invention
To address the problem that the prior art cannot properly balance network IO processing performance against CPU utilization, the invention provides an automatic adjustment method for the network packet receiving mode. By defining four packet receiving modes (full interrupt, interrupt batch, interval polling, and busy polling) and setting up a packet receiving scheduler, the method achieves a better balance between network IO processing performance and CPU utilization.
The specific implementation of the invention is as follows:
the invention provides an automatic adjustment method for a network packet receiving mode, which sets four packet receiving processing modes according to the load of a packet receiving processing thread, that is, its packet receiving processing rate: a full interrupt mode, an interrupt batch mode, an interval polling mode, and a busy polling mode; and sets a packet receiving scheduler that switches the modes of a plurality of packet receiving processing threads at the same time.
To better implement the invention, further, a processing load range is set for each of the four modes of the packet receiving processing thread; the mode switching of the packet receiving processing thread operates as follows:
when the load of a packet receiving processing thread in the full interrupt mode exceeds the range set for the full interrupt mode, the packet receiving scheduler switches the thread from the full interrupt mode to the interrupt batch mode;
when the load of a packet receiving processing thread in the interrupt batch mode exceeds the range set for the interrupt batch mode, the packet receiving scheduler switches the thread from the interrupt batch mode to the interval polling mode; when the load falls below the range set for the interrupt batch mode, the packet receiving scheduler switches the thread from the interrupt batch mode to the full interrupt mode;
when the load of a packet receiving processing thread in the interval polling mode exceeds the range set for the interval polling mode, the packet receiving scheduler switches the thread from the interval polling mode to the busy polling mode; when the load falls below the range set for the interval polling mode, the packet receiving scheduler switches the thread from the interval polling mode to the interrupt batch mode;
when the load of a packet receiving processing thread in the busy polling mode falls below the range set for the busy polling mode, the packet receiving scheduler switches the thread from the busy polling mode to the interval polling mode;
in addition, a shortest period for mode switching is set for the packet receiving processing thread, and the thread can be switched only after it has run in one packet receiving mode for longer than the shortest period.
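The transition rules above move a thread by at most one mode per decision. A minimal Python sketch of that rule follows. The `Mode` enum, the `next_mode` function, and the threshold values in `RANGES` are illustrative assumptions rather than anything specified in the text, and the shortest-period check is left out for brevity.

```python
import math
from enum import IntEnum


class Mode(IntEnum):
    FULL_INTERRUPT = 0
    INTERRUPT_BATCH = 1
    INTERVAL_POLLING = 2
    BUSY_POLLING = 3


# Hypothetical load ranges in packets per second: (lower bound, upper bound).
RANGES = {
    Mode.FULL_INTERRUPT: (0, 1_000),
    Mode.INTERRUPT_BATCH: (1_000, 10_000),
    Mode.INTERVAL_POLLING: (10_000, 100_000),
    Mode.BUSY_POLLING: (100_000, math.inf),
}


def next_mode(mode, load, ranges=RANGES):
    """Return the mode after one decision: up one step, down one step, or unchanged."""
    low, high = ranges[mode]
    if load > high and mode != Mode.BUSY_POLLING:
        return Mode(mode + 1)   # load above the current range: next heavier mode
    if load < low and mode != Mode.FULL_INTERRUPT:
        return Mode(mode - 1)   # load below the current range: next lighter mode
    return mode
```

With these assumed ranges, a thread in the full interrupt mode that sees a very high load still moves only to the interrupt batch mode on the first decision, and reaches the busy polling mode after further decisions.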
To better implement the invention, further, an interrupt condition parameter N' is set for the interrupt batch mode. N' is the number of messages to be processed each time the packet receiving processing thread is interrupted in the interrupt batch mode, and is the product of the actual load and the system scheduling period. The packet receiving scheduler writes N' into memory, and N' is queried and used when the packet receiving processing thread performs packet receiving processing.
To better implement the invention, further, an interval polling condition parameter T' is set for the interval polling mode. T' is the time interval for which the packet receiving processing thread actively sleeps and waits each time it has processed all messages in the network card receive queue.
To better implement the invention, further, the specific flow by which the packet receiving scheduler switches the mode of a packet receiving processing thread is as follows:
(1) First, initialize the scheduling timer;
(2) Then initialize the statistics counters;
(3) Start each packet receiving processing thread in its configured initial mode;
(4) Then periodically execute packet receiving scheduling, that is, switch the packet receiving mode of the packet receiving processing threads, with the following specific steps:
S1, calculate the message processing rate of each packet receiving processing thread;
S2, determine whether the load of the packet receiving processing thread is outside the load range corresponding to its current mode;
S3, if the load is within the load range of the current mode, return to step S1; if it is outside that range, determine whether the time the packet receiving processing thread has run in the current mode has reached the shortest period for packet receiving mode switching;
S4, for a packet receiving processing thread whose running time in the current mode has not reached the shortest period, return to step S1; for a packet receiving processing thread whose running time in the current mode has reached the shortest period, switch the packet receiving mode according to the current load and the configured load range of each packet receiving mode.
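Steps S1 to S4 can be sketched as one periodic pass over all threads. This is a minimal Python sketch under stated assumptions: the `RxThread` record, the `schedule_pass` function, and the `measure_load` and `pick_mode` callbacks are hypothetical names introduced for illustration, and time is passed in as a plain number of seconds.

```python
from dataclasses import dataclass


@dataclass
class RxThread:
    name: str
    mode: str
    mode_since: float  # time (seconds) at which the current mode was entered


def schedule_pass(threads, ranges, min_period, now, measure_load, pick_mode):
    """Run steps S1-S4 once for every thread; return the names of switched threads."""
    switched = []
    for t in threads:
        load = measure_load(t)               # S1: message processing rate
        low, high = ranges[t.mode]
        if low <= load <= high:              # S2/S3: load still inside the range
            continue
        if now - t.mode_since < min_period:  # S3/S4: shortest period not reached
            continue
        new_mode = pick_mode(t.mode, load)   # S4: choose by load and configured ranges
        if new_mode != t.mode:
            t.mode, t.mode_since = new_mode, now
            switched.append(t.name)
    return switched
```

Keeping the load measurement and the mode choice as callbacks is a design choice of this sketch only; it lets the dwell-time check be read in isolation.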
To better implement the invention, further, each packet receiving processing thread has two statistics counters: an N counter, which records the number of data packets processed up to the current time, and a P counter, which records the number of data packets that had been processed at the previous statistics collection.
To better implement the invention, further, with the execution period of the packet receiving scheduler denoted T scheduling, the load value of a packet receiving processing thread is the value recorded by the N counter minus the value recorded by the P counter, divided by T scheduling.
To better implement the invention, further, the periodic execution of the packet receiving scheduler is triggered by the system clock, and initializing the scheduling timer specifically means setting the timer according to the configured execution period of the packet receiving scheduler.
To better implement the invention, further, each packet receiving processing thread processes the data of one packet receiving queue, and its workflow mainly includes:
first reading data packets from the packet receiving queue, then processing them, then updating the statistics counter belonging to that thread, and then continuing to read data packets from the packet receiving queue, repeating these operations in a loop.
To better implement the invention, further, in the full interrupt mode, every packet receiving interrupt of the network card is delivered to the packet receiving processing thread;
in the interrupt batch mode, the driver aggregates several packet receiving interrupts of the network card before delivering them to the packet receiving processing thread;
in the interrupt batch mode, the number of messages the packet receiving processing thread processes per interrupt is calculated and set;
in the interval polling mode, network card interrupts are completely disabled; after the packet receiving processing thread has processed the packets in the packet receiving queue, it actively yields the CPU, sleeps for a time interval, and then interrupts the execution of other threads and preempts the CPU to poll the packet receiving queue again;
in the busy polling mode, network card interrupts are completely disabled; the packet receiving processing thread does not yield the CPU after processing the packets in the packet receiving queue, and keeps occupying the CPU to poll the packet receiving queue;
the number of network card interrupts aggregated in a batch equals the number of messages the packet receiving processing thread processes in one read of the packet receiving queue.
Compared with the prior art, the invention has the following advantages:
a better balance between network IO processing performance and CPU utilization is achieved; the packet receiving load of a network device changes over time, and the method provides the corresponding optimal network packet receiving mode under each network load.
Detailed Description
To illustrate the technical solutions of the embodiments of the present invention more clearly, they are described clearly and completely below with reference to the accompanying drawings. It should be understood that the described embodiments are only some, not all, of the embodiments of the present invention, and therefore should not be regarded as limiting the scope of protection. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Example 1:
the invention provides an automatic adjustment method for a network packet receiving mode. As shown in Fig. 1 and Fig. 4, the method sets four packet receiving processing modes according to the load of a packet receiving processing thread, that is, its packet receiving processing rate: a full interrupt mode, an interrupt batch mode, an interval polling mode, and a busy polling mode; and sets a packet receiving scheduler that switches the modes of a plurality of packet receiving processing threads at the same time.
To better implement the invention, further, a processing load range is set for each of the four modes of the packet receiving processing thread; the mode switching of the packet receiving processing thread operates as follows:
when the load of a packet receiving processing thread in the full interrupt mode exceeds the range set for the full interrupt mode, the packet receiving scheduler switches the thread from the full interrupt mode to the interrupt batch mode;
when the load of a packet receiving processing thread in the interrupt batch mode exceeds the range set for the interrupt batch mode, the packet receiving scheduler switches the thread from the interrupt batch mode to the interval polling mode; when the load falls below the range set for the interrupt batch mode, the packet receiving scheduler switches the thread from the interrupt batch mode to the full interrupt mode;
when the load of a packet receiving processing thread in the interval polling mode exceeds the range set for the interval polling mode, the packet receiving scheduler switches the thread from the interval polling mode to the busy polling mode; when the load falls below the range set for the interval polling mode, the packet receiving scheduler switches the thread from the interval polling mode to the interrupt batch mode;
when the load of a packet receiving processing thread in the busy polling mode falls below the range set for the busy polling mode, the packet receiving scheduler switches the thread from the busy polling mode to the interval polling mode;
in addition, a shortest period for mode switching is set for the packet receiving processing thread, and the thread can be switched only after it has run in one packet receiving mode for longer than the shortest period;
in the full interrupt mode, every packet receiving interrupt of the network card is delivered to the packet receiving processing thread;
in the interrupt batch mode, the driver aggregates several packet receiving interrupts of the network card before delivering them to the packet receiving processing thread;
in the interrupt batch mode, the number of messages the packet receiving processing thread processes per interrupt is calculated and set;
in the interval polling mode, network card interrupts are completely disabled; after the packet receiving processing thread has processed the packets in the packet receiving queue, it actively yields the CPU, sleeps for a time interval, and then interrupts the execution of other threads and preempts the CPU to poll the packet receiving queue again;
in the busy polling mode, network card interrupts are completely disabled; the packet receiving processing thread does not yield the CPU after processing the packets in the packet receiving queue, and keeps occupying the CPU to poll the packet receiving queue;
the number of network card interrupts aggregated in a batch equals the number of messages the packet receiving processing thread processes in one read of the packet receiving queue.
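The difference between the interval polling mode and the busy polling mode described above is only whether the thread gives up the CPU after draining the queue. The following Python sketch models that difference in user space; `poll_once`, its parameters, and the use of a deque-like queue are assumptions for illustration, and disabling network card interrupts is outside what this sketch can show.

```python
import time


def poll_once(queue, handle, interval_s=None, sleep=time.sleep):
    """Drain the receive queue once and return the number of packets handled.

    interval_s=None models busy polling: return immediately so the caller polls again.
    A numeric interval_s models interval polling: sleep before the next poll.
    `queue` is any object with popleft() and truthiness, e.g. collections.deque.
    """
    handled = 0
    while queue:
        handle(queue.popleft())
        handled += 1
    if interval_s is not None:
        sleep(interval_s)  # voluntarily give up the CPU for the polling interval
    return handled
```

The `sleep` parameter is injectable only so the behaviour can be checked without actually waiting.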
Working principle: as shown in Fig. 1, the core of the scheme is a packet receiving mode state machine maintained jointly by the packet receiving scheduler and the packet receiving processing threads. When the packet receiving scheduler detects that the load of a packet receiving processing thread, that is, its packet receiving processing rate, has changed, it determines from the load which mode to switch to. Mode switching is performed by the packet receiving scheduler and mainly consists of enabling or disabling interrupts and setting the batch processing and polling parameters. Note that the packet receiving scheduler and the packet receiving processing threads must be bound to different CPU cores, so that the scheduler thread does not affect the packet receiving processing threads.
The interrupt batch mode requires one parameter to be set: the number of messages to be processed per interrupt. The value is calculated from the actual load and the interrupt batch mode parameter, written into memory by the packet receiving scheduler, and then queried and used during packet receiving processing. Denoting the number of batched network card interrupts as N, the formula for N is: $N = \text{load} \times \text{operating system scheduling period}$.
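As a worked instance of the formula for N, the following Python sketch computes the batch size from a load and a scheduling period. The function name `batch_size`, the rounding, and the lower bound of 1 are assumptions added for illustration; the text only gives the product.

```python
def batch_size(load_pps, os_sched_period_s):
    """N = load x operating system scheduling period.

    Rounding to an integer and never going below 1 are assumptions of this sketch.
    """
    return max(1, round(load_pps * os_sched_period_s))
```

For example, with an assumed load of 3,000,000 packets per second and an assumed scheduling period of 1 millisecond, `batch_size(3_000_000, 0.001)` gives 3000 messages per interrupt.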
Similarly, the interval polling mode requires one parameter to be set: the time interval for which the packet receiving processing thread actively sleeps and waits each time it has processed all messages in the network card receive queue. Denoting the polling interval of interval polling as T polling, the formula is: $T_{\text{polling}} = 2 \times \text{operating system scheduling period} \times \dfrac{\text{busy polling load} - \text{current load}}{\text{busy polling load} - \text{minimum load}}$.
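The polling interval formula can be checked numerically with a short Python sketch. Several points here are assumptions: the function name `poll_interval`, reading "minimum load" as the lower bound of the interval polling range, reading the formula as a product of the three terms, and clamping the load fraction to the range 0 to 1.

```python
def poll_interval(current_load, busy_load, min_load, os_sched_period_s):
    """T polling = 2 x scheduling period x (busy - current) / (busy - minimum)."""
    if busy_load <= min_load:
        raise ValueError("busy_load must be greater than min_load")
    fraction = (busy_load - current_load) / (busy_load - min_load)
    fraction = min(1.0, max(0.0, fraction))  # clamp: an assumption of this sketch
    return 2 * os_sched_period_s * fraction
```

Under these assumptions the interval shrinks linearly as the load approaches the busy polling threshold: halfway between the two bounds it equals one scheduling period, and at the busy polling threshold it reaches zero, which coincides with busy polling.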
Example 2:
to better implement the invention, further, as shown in Fig. 2, an interrupt condition parameter N is set for the interrupt batch mode. N is the number of messages to be processed each time the packet receiving processing thread is interrupted in the interrupt batch mode, and is the product of the actual load and the system scheduling period. The packet receiving scheduler writes N into memory, and N is queried and used when the packet receiving processing thread performs packet receiving processing.
An interval polling condition parameter T is set for the interval polling mode. T is the time interval for which the packet receiving processing thread actively sleeps and waits each time it has processed all messages in the network card receive queue.
The specific flow by which the packet receiving scheduler switches the mode of a packet receiving processing thread is as follows:
(1) First, initialize the scheduling timer;
(2) Then initialize the statistics counters;
(3) Start each packet receiving processing thread in its configured initial mode;
(4) Then periodically execute packet receiving scheduling, that is, switch the packet receiving mode of the packet receiving processing threads, with the following specific steps:
S1, calculate the message processing rate of each packet receiving processing thread;
S2, determine whether the load of the packet receiving processing thread is outside the load range corresponding to its current mode;
S3, if the load is within the load range of the current mode, return to step S1; if it is outside that range, determine whether the time the packet receiving processing thread has run in the current mode has reached the shortest period for packet receiving mode switching;
S4, for a packet receiving processing thread whose running time in the current mode has not reached the shortest period, return to step S1; for a packet receiving processing thread whose running time in the current mode has reached the shortest period, switch the packet receiving mode according to the current load and the configured load range of each packet receiving mode.
Each packet receiving processing thread has two statistics counters: an N counter, which records the number of data packets processed up to the current time, and a P counter, which records the number of data packets that had been processed at the previous statistics collection;
with the execution period of the packet receiving scheduler denoted T scheduling, the load value of a packet receiving processing thread is the value recorded by the N counter minus the value recorded by the P counter, divided by T scheduling.
The periodic execution of the packet receiving scheduler is triggered by the system clock, and initializing the scheduling timer specifically means setting the timer according to the configured execution period of the packet receiving scheduler.
Working principle: the configuration items of the packet receiving scheduler are shown in Table 1 below:
| Configuration item | Description |
| Number of packet receiving processing threads | Determined by the number of CPU cores and the CPU configuration of the current device |
| Execution period of the packet receiving scheduler | In milliseconds; recommended range 1-100 milliseconds |
| Shortest period for packet receiving mode switching | In seconds; recommended range 1-3600 seconds |
| Packet receiving rate range of each packet receiving mode | Upper and lower limits in packets per second, e.g. 1-2 MPPS |
| Initial mode of the packet receiving processing thread | One of the four packet receiving modes |
TABLE 1
The periodic execution of the packet receiving scheduler is triggered by the system clock; initializing the scheduling timer means setting the timer according to the configured execution period of the packet receiving scheduler. Each packet receiving processing thread processes the data of one packet receiving queue and has two statistics counters, so the total number of statistics counters is twice the number of packet receiving processing threads. The two counters are named the N counter and the P counter: the N counter records the number of packets processed up to the current time, and the P counter records the number of packets that had been processed at the previous statistics collection. Initializing the statistics counters means setting their values to zero. If the execution period of the scheduler is T scheduling, the packet receiving processing rate, that is, the load, is calculated as: load = (N counter - P counter) / T scheduling. The unit of load is packets per second. After calculating the load, the scheduler copies the value of the N counter into the P counter. After initialization, the N counter is updated only by the packet receiving processing thread, and the packet receiving scheduler only reads it.
When the calculated load falls outside the range of the current mode, the packet receiving scheduler does not switch the packet receiving mode immediately; it first checks whether the running time in the current mode has reached the shortest period for packet receiving mode switching. This shortest period is much longer than the execution period of the packet receiving scheduler, which avoids the switching overhead caused by frequent mode switching.
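The counter scheme can be sketched in a few lines of Python, following the formula load = (N counter - P counter) / T scheduling and the rule that the scheduler copies N into P after each calculation. The class name `RxStats` and its method names are hypothetical, T scheduling is assumed to be expressed in seconds so that the result is in packets per second, and no thread synchronization is shown.

```python
class RxStats:
    """Per-thread statistics: N is the running total, P the previous snapshot."""

    def __init__(self):
        # Initializing the statistics counters means zeroing both values.
        self.n = 0
        self.p = 0

    def on_packets(self, count=1):
        """Called by the packet receiving processing thread only."""
        self.n += count

    def sample_load(self, t_sched_s):
        """Called by the scheduler once per execution period."""
        n_now = self.n                       # the scheduler only reads N
        load = (n_now - self.p) / t_sched_s  # packets per second
        self.p = n_now                       # the scheduler is the only writer of P
        return load
```

Because each counter has exactly one writer, the thread and the scheduler never update the same value, which is what lets the text place them on different CPU cores without interfering with packet processing.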
Other portions of this embodiment are the same as those of embodiment 1 described above, and thus will not be described again.
Example 3:
on the basis of either of embodiments 1 and 2 above, to better implement the invention, as shown in Fig. 3, further, each packet receiving processing thread processes the data of one packet receiving queue, and its workflow mainly includes:
first reading data packets from the packet receiving queue, then processing them, then updating the statistics counter belonging to that thread, and then continuing to read data packets from the packet receiving queue, repeating these operations in a loop.
Working principle: the packet receiving processing thread is an independent execution unit; the packet receiving scheduler only controls its execution mode and does not interfere with the packet receiving process itself. In addition, the packet receiving processing thread updates only the N counter; the P counter is updated by the packet receiving scheduler.
Other portions of this embodiment are the same as any of embodiments 1-2 described above, and thus will not be described again.
Example 4:
on the basis of any one of embodiments 1 to 3 above, the load thresholds for mode switching are determined by the configuration parameter "packet receiving rate range of each packet receiving mode". For example, in one embodiment, the packet receiving rate ranges of the modes, determined according to the CPU and software load, are shown in Table 2 below:
| Packet receiving mode | Packet receiving rate range |
| Full interrupt mode | <2 MPPS |
| Interrupt batch mode | 2-5 MPPS |
| Interval polling mode | 5-10 MPPS |
| Busy polling mode | >10 MPPS |
TABLE 2
Suppose the current packet receiving rate is 1.5 MPPS, so the corresponding packet receiving mode is the full interrupt mode. When the network load becomes heavier and the packet receiving rate rises to 3 MPPS, the packet receiving scheduler switches the mode to the interrupt batch mode.
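The example can be reproduced with a direct lookup over the Table 2 boundaries. This Python sketch is illustrative only: `mode_for_rate` is a hypothetical helper, and assigning a rate that sits exactly on a boundary to the higher mode is an assumption, since the table leaves the boundary values ambiguous. The lookup also ignores the shortest switching period.

```python
from bisect import bisect_right

# Boundaries between the four modes of Table 2, in MPPS.
BOUNDS_MPPS = [2, 5, 10]
MODES = ["full interrupt", "interrupt batch", "interval polling", "busy polling"]


def mode_for_rate(rate_mpps):
    """Return the mode whose Table 2 range contains the given rate."""
    return MODES[bisect_right(BOUNDS_MPPS, rate_mpps)]


print(mode_for_rate(1.5))  # full interrupt
print(mode_for_rate(3))    # interrupt batch
```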
Other portions of this embodiment are the same as any of embodiments 1 to 3 described above, and thus will not be described again.
Example 5:
on the basis of any one of embodiments 1 to 4 above, the present invention is in particular also applicable to the following scenarios:
Scene 1: improving the NAPI packet receiving processing module of the Linux kernel: a kernel thread is added to perform IO scheduling and to configure the NAPI netdev_budget/netdev_budget_usecs values in real time, which solves the scenario limitations NAPI has in actual use.
Scene 2: improving the default busy polling mode of the DPDK packet receiving process: multiple packet receiving modes and a packet receiving mode scheduler are added to the packet receiving processing thread, which avoids the CPU cycles wasted by busy polling when the network load is low and improves energy efficiency.
Scene 3: in a dedicated network processor: in addition to the software implementation, the mode scheduling method of this scheme can be applied to a dedicated hardware network processor to obtain higher network performance and CPU utilization.
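For Scene 1, the NAPI parameters in question are assumed here to be the Linux sysctls net.core.netdev_budget and net.core.netdev_budget_usecs, which are exposed as files under /proc/sys/net/core. The Python sketch below updates them from user space; it is an approximation for illustration, not the kernel thread the scene describes, and the function name `write_napi_budget` and its `root` parameter are hypothetical.

```python
from pathlib import Path

# Location of the net.core sysctls on Linux.
SYSCTL_ROOT = Path("/proc/sys/net/core")


def write_napi_budget(budget, budget_usecs, root=SYSCTL_ROOT):
    """Write both NAPI budget values; needs root privileges on a real system.

    `root` can point at another directory so the function can be tried safely.
    """
    (root / "netdev_budget").write_text(f"{int(budget)}\n")
    (root / "netdev_budget_usecs").write_text(f"{int(budget_usecs)}\n")
```

A scheduler built on this would call the function whenever its measured load moves to a different range, choosing the two values from its own configuration.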
Other portions of this embodiment are the same as any of embodiments 1 to 4 described above, and thus will not be described again.
The foregoing description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and any simple modification, equivalent variation, etc. of the above embodiment according to the technical matter of the present invention fall within the scope of the present invention.