Disclosure of Invention
In view of the above problems, the present invention provides a data transmission control method based on erasure coding.
The invention relates to a data transmission control method based on erasure coding, which comprises the following steps:
step 1: chunk segmentation and encoding: inputting a file to be transmitted or streaming data, segmenting the file or the streaming data into a plurality of chunks, encoding each chunk into a plurality of data segments with equal size by using erasure correction codes, and putting the data segments into a sending cache;
step 2: data segment selection and transmission: selecting the untransmitted data segments from the sending buffer for transmission according to a given sending rate;
and step 3: and (3) receiving a data segment: placing the received data segment into a receiving cache;
and 4, step 4: chunk reorganization and decoding: attempting to decode the corresponding chunk after receiving the new data segment, and if the decoding is successful, performing streaming output on the chunk or reconstructing and restoring the chunk into a source data file;
and 5: confirmation and feedback: sending a confirmation message to each received data segment, and feeding back the number of data segments which need to be transmitted for completing decoding and corresponding to chunk;
step 6: bandwidth estimation and transmission control: estimating available bandwidth of the network according to the receiving condition of the feedback message of the receiving end so as to update the sending rate of the data segment; controlling the sending of the chunk residual data according to the number of the data segments required by the chunk fed back by the receiving end, so as to avoid redundant transmission of the data; and removing the data segment left by the transmitted chunk from the sending buffer.
Further, the sending rate of the data segment instep 2 is R, and the value of R is equal to the estimated available bandwidth Bw of the current path; the time interval for sending the data segments is v/R, wherein v is the length of the data segments; when a data segment is transmitted, selecting the chunk with the minimum number value which is not transmitted from the transmission buffer for checking to see whether the data segment needs to be selected for transmission.
The chunk check is as follows:
for chunk i, i represents a chunk number; setting the maximum number of the data segment sent by the current sending end as SiThe maximum data segment number that has received an acknowledgement is Ai(ii) a The total amount of data segments that have been sent by the chunk but have not yet been acknowledged is Si-Ai(ii) a The ratio of successful transmission of the transmission path is estimated as gammaiThen there is γ in these datai(Si-Ai) Can be received by the receiving end.
If chunk still requires OiThe decoding can be completed for each data segment, then at gammai(Si-Ai)<OiThen, selecting the data segment with the minimum number which is not transmitted from chunk i and sending the data segment to the receiving end, and updating Si(ii) a Otherwise, the same data segment selection and control is continued at the next chunk which has not been transmitted.
Further, the control method of the sending rate R is as follows:
(1) the sending rate R defaults to the maximum available bandwidth Bw supported by the network.
(2) The sending end continuously sends M data segments and then pauses; after the sending end receives N confirmation messages (the values of the data M and N are different according to different networks, and M is greater than N), the estimation of the available bandwidth Bw is immediately carried out: setting the duration between the first arrival and the last arrival acknowledgement message to T1; the minimum sending sequence number confirmed by the confirmation messages is A1, and the maximum sequence number is A2; then the available bandwidth of the current path is estimated to be N × v/T1, and the packet loss rate is estimated to be 1-N/(a2-a1+ 1).
(3) If the packet loss rate threshold caused by the non-congestion of the network path is set to be less, if 1-N/(A2-A1+1) > less, the sending end takes N x v/T1 as a new available bandwidth Bw value; otherwise, indicating that the bandwidth is not full, the sending rate may be increased, with N v/T1 θ as the new value of the available bandwidth Bw, where θ >1, representing the increased proportion of the sending rate, is a configurable parameter.
(4) If the N acknowledgement messages are not collected within the maximum acceptance time T, then (2) is returned.
(5) After the Bw finishes the first estimation, the sending end immediately adopts a new sending rate R to send the data segment; thereafter, each time an acknowledgment message is received, the sender performs the same calculations as described above based on the most recently received N acknowledgment messages.
Further, rateless coding or the like may be used as the erasure coding.
Compared with the prior art, the invention has the beneficial technical effects that:
1. the erasure correction coding is utilized to ensure that the system realizes reliable data transmission in an unreliable environment, and the transmission stability is ensured by automatically detecting and adjusting the transmission rate in a network environment with fluctuating bandwidth, so that the occurrence of congestion is avoided, and the transmission quality of the current network is ensured.
2. The available bandwidth in the network can be fully utilized while data transmission is avoided, and effective utilization of the network bandwidth is guaranteed.
3. The sending selection strategy of the data segments among different chunks ensures that the earlier chunk is transmitted with higher priority, reduces the time required for completing transmission of one chunk, and reduces the delay caused by introducing erasure coding technology.
Detailed Description
The invention is described in further detail below with reference to the figures and the detailed description.
The design idea of the invention is as follows:
1. for a large file to be transmitted or continuously generated data to be transmitted, a data transmitting end divides the large file to be transmitted into a plurality of chunks, and each chunk is encoded into a plurality of data segments with the same size by adopting an erasure correcting encoding technology; each data segment obtained by coding can be just put into a UDP message for transmission; (by splitting a large file into multiple chunks, on one hand, the time required for erasure coding and decoding can be reduced, and on the other hand, the transmission delay of each data segment can be reduced, which is beneficial to reliable transmission of streaming data, and by splitting one large transmission task into multiple chunks for processing, which can help the sending and receiving ends to utilize the multi-core processing capability of computer hardware, and perform concurrent processing to reduce the required time).
2. The transmitting end and the receiving end adopt UDP to complete the transmission of the coded data, and by utilizing the characteristics of erasure correction coding, the transmitting and receiving parties only need to ensure that a receiver receives enough decoded data quantity, and do not care whether the received data is continuous or not.
3. The receiving end sends a confirmation message to the received data segment and feeds back the number of data segments needed to finish the decoding of the chunk; the transmitting end estimates the available bandwidth in the network according to the arrival rule of the confirmation message, and estimates and adjusts the transmitting rate of the data segment; and the sending end controls the sending of the chunk data segment according to the data volume required by chunk decoding fed back by the receiving end, so that the transmission of excessive redundant data segments is avoided.
4. Before the data segment generated by coding in each chunk is sent, the sending end does not need to retransmit the unacknowledged data segment; after all the data segments are sent, the sending end sends the unacknowledged data segments in sequence, and the receiving end is ensured to receive enough data.
5. In order to fully utilize the available bandwidth in the network, the transmitting end may transmit multiple chunk data segments simultaneously.
To achieve the above design, the present invention provides a data transmission control method based on erasure coding, as shown in fig. 1, specifically:
a: chunk segmentation and encoding: inputting a file to be transmitted or streaming data, segmenting the file or the streaming data into a plurality of chunks, encoding each chunk into a plurality of data segments with equal size by using erasure correction codes, and putting the data segments into a sending cache;
b: data segment selection and transmission: selecting the untransmitted data segments from the sending buffer for transmission according to a given sending rate;
c: and (3) receiving a data segment: placing the received data segment into a receiving cache;
d: chunk reorganization and decoding: attempting to decode the corresponding chunk after receiving the new data segment, and if the decoding is successful, performing streaming output on the chunk or reconstructing and restoring the chunk into a source data file;
e: confirmation and feedback: sending a confirmation message to each received data segment, and feeding back the number of data segments which are not decoded and correspond to chunk and need to be transmitted;
f: bandwidth estimation and transmission control: estimating available bandwidth of the network according to the receiving condition of the feedback message of the receiving end so as to update the sending rate of the data segment; controlling the sending of the chunk residual data according to the number of the data segments required by the chunk fed back by the receiving end, so as to avoid redundant transmission of the data; and removing the data segment left by the transmitted chunk from the sending buffer.
The selection and sending method of the data segment in B comprises the following steps:
in the invention, the data of each chunk generates a plurality of data segments with the same size through erasure coding, and the data segments are assumed to be v; if the available bandwidth of the current path estimated by the F module is Bw, the sending rate is R, and the value of R is equal to the estimated available bandwidth of the current path Bw; and simultaneously, the module B continuously selects data segments from the sending buffer to send to the data receiving end according to the time interval of v/R. In order to fully utilize the remaining bandwidth and avoid excessive redundant transmission, the present invention adopts a novel data segment selection method, and the working principle thereof is as follows.
When a data segment can be sent, the module B selects the chunk with the smallest number value that has not been transmitted from the sending buffer to check, and it is necessary to select the data segment for sending.
The examination contents are as follows:
for chunk i, assume that the maximum number of the data segment that has been sent by the current sender is SiThe maximum data segment number that has received an acknowledgement is Ai(ii) a Then the total amount of data segments that have been sent by the chunk but have not yet been acknowledged is Si-Ai. It is contemplated that these data segments in the transmission may not all be received because of packet loss. If the estimated value of the proportion of successful transmission of the transmission path is gammaiThen there is about γ in these datai(Si-Ai) Can be received by the receiving end.
Further, if the latest feedback message at the receiving end indicates that the chunk still needs to be sent by OiDecoding can be completed by each data segment; then, if γi(Si-Ai)<OiThe B module selects the data segment with the minimum number which is not transmitted from chunk i to send to the receiving end, and updates Si(ii) a Otherwise, the module B continues to perform similar data segment selection and control on the next chunk that has not been transmitted.
The method for estimating the bandwidth (i.e., controlling the transmission rate) in F is designed as follows:
in a dynamic network, the available bandwidth between the sending and receiving ends may change over time. Therefore, the invention estimates the available bandwidth on the transmission path by using the sending and confirming conditions of the data segment, and dynamically adjusts the sending rate R used by the B module.
In the initial state, the transmission rate R defaults to the maximum transmission rate supported by the network.
In order to start the whole process, at the beginning, a sending end continuously sends M data segments, and then pauses;
after the sending end receives N confirmation messages, the estimation of R is immediately carried out: assume that the duration between the first arrival and the last arrival acknowledgement message is T1; the minimum sending sequence number confirmed by the confirmation messages is A1, and the maximum sequence number is A2; then, the available bandwidth of the current path is estimated to be N × v/T1, and the packet loss rate is estimated to be 1-N/(a2-a1+ 1).
Assuming that a packet loss rate threshold caused by non-congestion of a network path is less, if 1-N/(a2-a1+1) > less, a sending end takes N × v/T1 as a new R value; otherwise, indicating that the bandwidth is not full, the sending rate can be increased, taking N x v/T1 x θ as a new R value, where θ >1, representing the increasing proportion of the sending rate, is a configurable parameter;
if N confirmation messages are not collected within the time T2, repeating the process;
after R finishes the first estimation, the sending end immediately adopts the new sending rate to send the data segment; thereafter, each time an acknowledgment message is received, the sender performs a similar calculation as described above based on the most recently received N acknowledgment messages.
At this time, if no new acknowledgement message is received within the time T2, the sender returns to the start phase and starts over.