Disclosure of Invention
The invention provides a big-data-based method and system for processing network delay, which solve the prior-art problem of network delay arising during peak periods in massive-data scenarios, effectively reduce network delay, and improve the user experience.
In order to achieve the above object, the present invention provides a method for processing network delay based on big data, which includes:
the cloud server acquires a task execution instruction;
predicting the quality of service (QoS) of the task based on an artificial intelligence (AI) model, and judging whether a network delay abnormality will occur during task execution;
if a network delay abnormality will occur, expanding a sliding time window before the task is loaded, and acquiring task content data in advance based on the expanded sliding time window;
and executing the task based on the task content data.
Optionally, predicting the quality of service QoS of the task based on an artificial intelligence AI model includes:
splitting the AI model into a plurality of target workflows, wherein each target workflow corresponds to a prediction subtask of a performance index;
parallelizing each target workflow into a plurality of model prediction tasks, and performing gradient prediction on the prediction subtasks through the plurality of model prediction tasks;
and summarizing the gradient prediction results, performing weighting operation, and acquiring a final QoS prediction score.
Optionally, before splitting the AI model, the method further comprises:
and splitting the QoS into a plurality of performance indexes, and adding the performance indexes into a target prediction queue to mark the plurality of model prediction tasks as subtasks of the corresponding performance indexes in the target prediction queue.
Optionally, if the AI model is a distributed horizontal federated learning model, parallelizing the target workflow into a plurality of model prediction tasks and performing gradient prediction through the plurality of model prediction tasks includes:
the cloud server distributes the federated learning model to each distributed network node;
the distributed network nodes execute model prediction tasks based on the federated learning model, and upload the encrypted gradients of the prediction results to the cloud server;
and the cloud server aggregates the prediction results and distributes the updated and iterated federated learning model to the distributed network nodes so that the distributed network nodes update the locally stored federated learning model.
Optionally, if the task execution instruction is a video playing instruction, judging whether a network delay abnormality occurs during task execution includes:
the cloud server acquires streaming media data of the video, wherein the streaming media data comprises video data and audio data;
when the video data are encoded, the cloud server predicts the encoding time of the video data and writes the video encoding time into a timestamp of the video data;
when the audio data are encoded, the cloud server predicts the encoding time of the audio data and writes the audio encoding time into a timestamp of the audio data;
the cloud server sets the timestamp of the video data and the timestamp of the audio data as encoding double timestamps, and writes the encoding double timestamps into video frames;
when the user terminal decodes a video frame, it extracts the encoding double timestamps, predicts the video and audio decoding times for the frame, and sets the predicted decoding times as decoding double timestamps;
and the user terminal calculates the difference between the decoding double timestamps and the encoding double timestamps, and if the difference is greater than a preset threshold, determines that a network delay abnormality has occurred during task execution.
Optionally, after the user terminal calculates the difference between the decoding double timestamps and the encoding double timestamps, the method further includes:
the user terminal respectively calculates the instantaneous difference between the decoding double timestamps and the encoding double timestamps of each video frame, wherein the instantaneous difference includes the time difference between video decoding and video encoding for the frame and the time difference between audio decoding and audio encoding;
calculating an average difference over a plurality of video frames based on the instantaneous difference of each video frame;
subtracting the average difference from the instantaneous difference of each video frame to obtain a difference sequence;
and if a first difference in the difference sequence is greater than a preset difference, determining that the video frame corresponding to the first difference suffers an instantaneous delay.
Optionally, the video frame is a supplemental enhancement information (SEI) frame.
Optionally, the method further comprises:
and if, within the same video frame, the video decoding timestamp is inconsistent with the audio decoding timestamp, or the video encoding timestamp is inconsistent with the audio encoding timestamp, determining that the video frame suffers from audio/video playback desynchronization.
Optionally, pre-acquiring task content data based on the expanded sliding time window includes:
and the cloud server acquires, based on the expanded sliding time window, a group of pictures (GOP) to be preloaded within the time window, wherein the GOP consists of a video I frame, video B frames, video P frames, and an SEI frame.
The embodiment of the present invention further provides a system, which includes a memory and a processor, where the memory stores computer-executable instructions, and the processor implements the method when running the computer-executable instructions on the memory.
The method and the system of the embodiment of the invention have the following advantages:
in the embodiment of the invention, the AI model is split into a plurality of target workflows, so that the complex network QoS prediction and evaluation is decomposed. Each target workflow is further divided into a plurality of model prediction tasks, and a specific performance index is predicted and evaluated in parallel through those tasks. If a network delay is predicted, as much task content data as possible is loaded before the task starts and before the delay occurs. For example, when a streaming video is decoded and played, one segment is usually loaded at a time and the next segment is fetched from the network only after the current segment finishes playing; under severe network delay the next segment arrives too late and playback stalls badly. After the method of the embodiment is applied, more task content data can be preloaded before the next segment is executed and before the network delay arrives, so stalling is avoided even when a network delay abnormality occurs; that is, the task is not affected, or only slightly affected, by the network delay during execution, and the user experience is improved.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Fig. 1 is a network architecture diagram according to an embodiment of the present invention. As shown in Fig. 1, the embodiment includes a cloud server 100 and a user terminal 200, where the cloud server 100 includes a central cloud server 100a and an edge cloud server 100b. The central cloud server 100a is located at the network center and has strong data processing capability, while the edge cloud server 100b is located in the edge layer and is a lightweight, miniaturized cloud service cluster. The user terminal 200 may be an automobile 200a, a smartphone/computer 200b, a VR/AR device 200c, or another Internet of Things device 200d, and can receive various data sent by the cloud server, including but not limited to model training data, audio/video codec data, autonomous driving data, and VR/AR image rendering data. Since the data types are numerous and the data volume is huge, how to predict network delay and then handle or avoid it is an important problem.
To achieve the above object, as shown in fig. 2, the present invention provides a method for processing network delay based on big data, which includes the following steps:
S101, a cloud server acquires a task execution instruction;
the cloud server may be thecenter cloud server 100a or theedge cloud server 100b, where the cloud server may obtain a task execution instruction, and the task execution includes, but is not limited to, playing of audio and video, game rendering of VR/AR, starting of an automobile, navigation, driving assistance, and the like. The task execution instruction can be sent to the cloud server by the user terminal, or can be sent to a specific cloud server by another server.
S102, predicting the QoS of the task based on an artificial intelligence (AI) model, and judging whether a network delay abnormality will occur during task execution;
The QoS evaluation indexes differ across service scenarios. For network delay, the traffic condition of the current network is usually judged comprehensively from a plurality of network performance indexes, such as network throughput, signal-to-noise ratio (SNR), and reference signal received quality (RSRQ). Each performance index has its own function and meaning, so AI model prediction is performed for each performance index separately, and the indexes are then combined, based on the inference results, to obtain the QoS score.
In one embodiment of the invention, the step of predicting the QoS of the task comprises:
S1021, splitting the AI model into a plurality of target workflows, wherein each target workflow corresponds to a prediction subtask of one performance index;
specifically, the cloud server may split the AI model into a plurality of target workflows through the AI model layer, and divide each workflow into a plurality of model prediction tasks, as shown in fig. 3, the cloud server may split the AI model into x (x is a positive integer greater than 1) target workflows, each workflow corresponds to a prediction subtask of one performance index, for example, workflow1 (abbreviated as w1 in fig. 3) corresponds to prediction ofperformance index 1, and workflow2 (abbreviated as w2 in fig. 3, and the rest is the same as the above) corresponds to prediction of performance index 2. Each workflow can act as a plurality of model prediction tasks in parallel, namely, a single performance index subtask is subjected to model prediction through the plurality of parallel model prediction tasks, so that the prediction precision and speed can be effectively improved.
In addition, before splitting the AI model, the cloud server also splits the QoS into a plurality of performance indexes and adds them to a target prediction queue, so as to mark the plurality of model prediction tasks as subtasks of the corresponding performance indexes in the target prediction queue.
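The queue-building step above can be sketched as follows. This is a minimal illustration, not the patented implementation: the indicator names and the `PredictionSubtask` shape are assumptions chosen to match the four example indexes mentioned later in this description.

```python
from dataclasses import dataclass
from queue import Queue

# Hypothetical set of QoS performance indexes (the four named later in the text).
QOS_INDICATORS = ["throughput", "SNR", "RSRQ", "load_rate"]

@dataclass
class PredictionSubtask:
    indicator: str   # which performance index this subtask predicts
    workflow_id: int # target workflow the subtask belongs to

def build_target_prediction_queue(indicators):
    """Split QoS into indexes and enqueue one prediction subtask per index."""
    queue = Queue()
    for wf_id, name in enumerate(indicators, start=1):
        queue.put(PredictionSubtask(indicator=name, workflow_id=wf_id))
    return queue

q = build_target_prediction_queue(QOS_INDICATORS)
print(q.qsize())  # 4 subtasks, one per performance index
```

Each queued subtask then marks the model prediction tasks that belong to its performance index.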
S1022, parallelizing each target workflow into a plurality of model prediction tasks, and performing gradient prediction on the prediction subtasks through the plurality of model prediction tasks;
specifically, the AI model according to the embodiment of the present invention may adopt an AI model such as deep learning and machine learning, and may exemplarily adopt an AI model of federal learning. Federal learning, originally proposed by google, is essentially a distributed machine learning technique, or machine learning framework. In the embodiment of the invention, the gradient prediction of the prediction tasks is performed through the plurality of model prediction tasks, which can be understood as the gradient prediction of the prediction tasks through a plurality of distributed federal learning model prediction tasks, and specifically, the federal learning model is distributed to all distributed network nodes (nodes) by the cloud server; the distributed network node executes a model prediction task based on the federal learning model, and uploads an executed prediction result encryption gradient to a cloud server; and the cloud server aggregates the prediction results and distributes the updated and iterated federated learning model to the distributed network nodes so that the distributed network nodes update the locally stored federated learning model.
And S1023, summarizing the gradient prediction results, performing weighting operation, and obtaining a final QoS prediction score.
For the QoS index, since each performance parameter has a different importance and priority, different weight values need to be set for the prediction results of the different indexes, and the QoS prediction score is finally obtained through a weighting operation. For example, suppose QoS has four indexes: load rate, RSRP, SNR, and throughput. The weights of indexes 1 and 4 are the highest, 0.3 and 0.28 respectively, while the weights of indexes 2 and 3 are lower, 0.2 and 0.22 respectively. The QoS prediction score is then: score of index 1 × weight 1 + score of index 2 × weight 2 + score of index 3 × weight 3 + score of index 4 × weight 4. If the score is high, the probability of delay is low; if the score is low, the probability of delay is high.
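Using the four example indexes and weights above (which sum to 1.0), the weighting operation can be worked through directly. The per-index prediction scores here are hypothetical 0-100 values chosen only for illustration.

```python
# Example weights from the text: indexes 1 and 4 highest (0.3, 0.28),
# indexes 2 and 3 lower (0.2, 0.22).
weights = {"load_rate": 0.3, "RSRP": 0.2, "SNR": 0.22, "throughput": 0.28}

# Hypothetical per-index prediction scores (not from the patent).
predicted_scores = {"load_rate": 80, "RSRP": 70, "SNR": 60, "throughput": 90}

# QoS score = sum over indexes of (score_i * weight_i).
qos_score = sum(weights[k] * predicted_scores[k] for k in weights)
print(round(qos_score, 2))  # 76.4
```

A high score means delay is unlikely; a low score means delay is likely.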
S103, if a network delay abnormality will occur, expanding a sliding time window before the task is loaded, and acquiring task content data in advance based on the expanded sliding time window;
The sliding time window is the time range over which data are acquired, for example 1 s or 10 s; it is a time unit that defines the start and end points of task content data acquisition. If it is predicted that a network delay will occur, task execution should be affected as little as possible during the predicted delay period, so the occurrence of the network delay must be "avoided" in advance.
For example, if the task is video playing, the cloud server may acquire, based on the expanded sliding time window, the group of pictures (GOP) to be preloaded within the time window, where the GOP consists of a video I frame, video B frames, video P frames, and an SEI frame.
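The window expansion and GOP prefetch can be sketched as below. The segment model is an assumption for illustration: a fixed expansion factor, a fixed GOP duration, and GOPs indexed by start time are not specified by the patent.

```python
def window_to_prefetch(playhead_s, base_window_s, delay_predicted, expand_factor=3):
    """Return the (start, end) time range of content to acquire in advance.

    When delay is predicted, the sliding window is expanded so more content
    is preloaded before the delay arrives.
    """
    window = base_window_s * expand_factor if delay_predicted else base_window_s
    return playhead_s, playhead_s + window

def gops_in_range(gop_duration_s, start_s, end_s):
    """Indices of the GOPs whose start falls inside the prefetch range."""
    first = int(start_s // gop_duration_s)
    last = int(end_s // gop_duration_s)
    return list(range(first, last))

# Playhead at 10 s, normal 2 s window, delay predicted -> expanded 6 s window.
start, end = window_to_prefetch(playhead_s=10.0, base_window_s=2.0, delay_predicted=True)
print(gops_in_range(gop_duration_s=1.0, start_s=start, end_s=end))  # GOPs 10..15
```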
S104, executing the task based on the task content data.
In one embodiment, the present invention is applicable to scenarios such as audio/video playback, live streaming, and VR/AR video image rendering. As an example, let the task execution instruction be a video playing instruction; the cloud server then judges whether a network delay abnormality occurs during task execution through the following steps:
S201, the cloud server acquires streaming media data of a video, where the streaming media data include video data and audio data. As is familiar to those skilled in the art, streaming media data include both sound data and image data: the sound data are the audio data, the image data are the video data, and when a video is played, the video data and the audio data are decoded and played synchronously.
S202, when encoding the video data, the cloud server predicts the encoding time of the video data and writes the video encoding time into a timestamp of the video data. Specifically, the cloud server may predict the encoding/decoding times of the video and audio data through the methods of S1021 to S1023, which are not repeated here.
S203, when the audio data are encoded, the cloud server predicts the encoding time of the audio data and writes the audio encoding time into a timestamp of the audio data;
S204, the cloud server sets the timestamp of the video data and the timestamp of the audio data as encoding double timestamps and writes the encoding double timestamps into the video frame;
Specifically, in the encoding stage, the timestamps are added to the video and audio by hardware encoding, which is fast and low-cost.
In the embodiment of the invention, the volume of video is very large when massive video data are transmitted, so video encoding and decoding are required. As shown in Fig. 4, the encoder encodes a plurality of images to generate a group of pictures (GOP), and during playback the decoder reads the GOP, decodes it, and then reads the pictures for rendering and display. A GOP is a group of consecutive pictures consisting of one I frame and several B/P frames; it is the basic unit accessed by video encoders and decoders, and its sequence repeats until the end of the video. I frames are intra-coded frames (also called key frames), P frames are forward predicted frames (forward reference frames), and B frames are bidirectionally interpolated frames (bidirectional reference frames). In brief, an I frame is a complete picture, while P frames and B frames record changes relative to the I frame; without an I frame, P frames and B frames cannot be decoded.
In addition, since the cloud server cannot determine whether an acquired video frame is an I frame until the frame is encoded, a supplemental enhancement information (SEI) frame (an accompanying frame of the I frame) is needed.
In the encoding stage, the audio/video timestamps are put into the SEI frame, and each GOP is labeled. The encoded frame sequence reaches the user equipment (user terminal) after passing through the streaming media platform; when the user terminal plays it, the encoding timestamps are obtained by decoding and then compared with the timestamps acquired for the target video frame in real time, so that the delay can be evaluated in absolute time.
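One way to picture the "encoding double timestamp" carried in the SEI frame is a payload holding both the video and the audio encoding times. The field layout below is purely an assumption for illustration; it is not the H.264/H.265 SEI message syntax and not the patent's wire format.

```python
import struct

def pack_dual_timestamp(video_encode_ts_ms, audio_encode_ts_ms):
    """Pack both encoding timestamps as two big-endian uint64 millisecond values
    (hypothetical SEI-like payload layout)."""
    return struct.pack(">QQ", video_encode_ts_ms, audio_encode_ts_ms)

def unpack_dual_timestamp(payload):
    """Recover (video_encode_ts_ms, audio_encode_ts_ms) on the decoding side."""
    return struct.unpack(">QQ", payload)

payload = pack_dual_timestamp(1_700_000_000_123, 1_700_000_000_125)
print(unpack_dual_timestamp(payload))  # (1700000000123, 1700000000125)
```

On the user terminal, the unpacked pair is compared with the decoding-side timestamps to evaluate delay.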
S205, when decoding a video frame, the user terminal extracts the encoding double timestamps, predicts the video and audio decoding times for the frame, and sets the predicted decoding times as decoding double timestamps;
S206, the user terminal calculates the difference between the decoding double timestamps and the encoding double timestamps, and if the difference is greater than a preset threshold, determines that a network delay abnormality has occurred during task execution.
After the user terminal calculates the difference between the decoding double timestamps and the encoding double timestamps, the embodiment of the invention further includes the following steps:
the user terminal respectively calculates the instantaneous difference between the decoding double timestamps and the encoding double timestamps of each video frame (SEI frame), where the instantaneous difference includes the time difference between video decoding and video encoding for the frame and the time difference between audio decoding and audio encoding;
calculating an average difference over a plurality of video frames based on the instantaneous difference of each video frame;
subtracting the average difference from the instantaneous difference of each video frame to obtain a difference sequence;
and if a first difference in the difference sequence is greater than a preset difference, determining that the video frame corresponding to the first difference suffers an instantaneous delay, where the first difference is any difference in the difference sequence.
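The per-frame check above can be sketched directly: compute each frame's decode-minus-encode difference, subtract the average, and flag frames whose deviation exceeds the preset difference. The timestamp values are hypothetical milliseconds; only the video-side timestamps are shown for brevity.

```python
def detect_instantaneous_delay(encode_ts, decode_ts, threshold_ms):
    """Return indices of frames whose difference-sequence entry exceeds the threshold."""
    diffs = [d - e for e, d in zip(encode_ts, decode_ts)]  # instantaneous differences
    avg = sum(diffs) / len(diffs)                          # average over the frames
    deviations = [d - avg for d in diffs]                  # the difference sequence
    return [i for i, dev in enumerate(deviations) if dev > threshold_ms]

# Hypothetical encode/decode timestamps (ms); frame 3 arrives late.
encode = [0, 40, 80, 120, 160]
decode = [20, 60, 100, 260, 180]
print(detect_instantaneous_delay(encode, decode, threshold_ms=50))  # [3]
```

Subtracting the average makes the check robust to a constant end-to-end latency: only frames that deviate from the typical delay are flagged.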
Optionally, the embodiment of the present invention may further determine that audio and video playback is out of sync:
and if, within the same video frame, the video decoding timestamp is inconsistent with the audio decoding timestamp, or the video encoding timestamp is inconsistent with the audio encoding timestamp, determining that the video frame suffers from audio/video playback desynchronization.
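This sync check is a simple consistency test over the double timestamps. The sketch below adds a tolerance parameter as an assumption (the patent text compares for exact consistency); the frame representation is hypothetical.

```python
def av_out_of_sync(frame, tolerance_ms=0):
    """frame: dict with video/audio encode and decode timestamps in ms.

    A frame is out of sync when its video and audio timestamps disagree
    at either the encoding stage or the decoding stage.
    """
    enc_skew = abs(frame["video_enc"] - frame["audio_enc"])
    dec_skew = abs(frame["video_dec"] - frame["audio_dec"])
    return enc_skew > tolerance_ms or dec_skew > tolerance_ms

frame = {"video_enc": 100, "audio_enc": 100, "video_dec": 150, "audio_dec": 190}
print(av_out_of_sync(frame))  # True: decoding timestamps disagree
```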
The method and the system of the embodiment of the invention have the following advantages:
in the embodiment of the invention, the AI model is split into a plurality of target workflows, so that the complex network QoS prediction and evaluation is decomposed. Each target workflow is further divided into a plurality of model prediction tasks, and a specific performance index is predicted and evaluated in parallel through those tasks. If a network delay is predicted, as much task content data as possible is loaded before the task starts and before the delay occurs. For example, when a streaming video is decoded and played, one segment is usually loaded at a time and the next segment is fetched from the network only after the current segment finishes playing; under severe network delay the next segment arrives too late and playback stalls badly. After the method of the embodiment is applied, more task content data can be preloaded before the next segment is executed and before the network delay arrives, so stalling is avoided even when a network delay abnormality occurs; that is, the task is not affected, or only slightly affected, by the network delay during execution, and the user experience is improved.
The embodiment of the present invention further provides a system, which includes a memory and a processor, where the memory stores computer-executable instructions, and the processor implements the method when running the computer-executable instructions on the memory.
Embodiments of the present invention also provide a computer-readable storage medium having stored thereon computer-executable instructions for performing the method in the foregoing embodiments.
FIG. 5 is a diagram illustrating the hardware components of the system in one embodiment. It will be appreciated that fig. 5 only shows a simplified design of the system. In practical applications, the systems may also respectively include other necessary elements, including but not limited to any number of input/output systems, processors, controllers, memories, etc., and all systems that can implement the big data management method of the embodiments of the present application are within the protection scope of the present application.
The memory includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM), and is used for storing instructions and data.
The input system is for inputting data and/or signals and the output system is for outputting data and/or signals. The output system and the input system may be separate devices or may be an integral device.
The processor may include one or more processors, for example, one or more Central Processing Units (CPUs), and in the case of one CPU, the CPU may be a single-core CPU or a multi-core CPU. The processor may also include one or more special purpose processors, which may include GPUs, FPGAs, etc., for accelerated processing.
The memory is used to store program codes and data of the network device.
The processor is used for calling the program codes and data in the memory and executing the steps in the method embodiment. Specifically, reference may be made to the description of the method embodiment, which is not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed system and method may be implemented in other ways. For example, the division of the unit is only one logical function division, and other division may be implemented in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. The shown or discussed mutual coupling, direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, systems or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the present application are wholly or partially generated when the computer program instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable system. The computer instructions may be stored on or transmitted over a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)), or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that includes one or more of the available media. The usable medium may be a read-only memory (ROM), or a Random Access Memory (RAM), or a magnetic medium, such as a floppy disk, a hard disk, a magnetic tape, a magnetic disk, or an optical medium, such as a Digital Versatile Disk (DVD), or a semiconductor medium, such as a Solid State Disk (SSD).
The above is only a specific embodiment of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think of various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.