Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances, such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, system, article, or server.
In order to make the objects, technical solutions and advantages disclosed in the embodiments of the present invention more clearly apparent, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the embodiments of the invention and are not intended to limit the embodiments of the invention. In order to more clearly illustrate the technical solutions and the technical advances thereof in the embodiments of the present invention, the embodiments of the present invention first set forth the following technical terms and technical backgrounds used or implemented in the embodiments of the present invention:
Transport Stream (TS) file: one feature of the TS file format is that decoding can start independently from any segment of the video stream. A TS stream file is organized in units of packets; one packet usually contains 188 bytes, and the first byte of each packet is fixed to 0x47, which can be regarded as an index of the packet and is generally called the sync header. The whole TS file is formed by concatenating individual packets. Together, these packets carry audio data, video data, and other descriptive information that may be used. A TS file includes three frame types: I frames, P frames, and B frames. An I frame can be encoded and decoded independently; a P frame must reference a previously coded I-frame or P-frame image to complete encoding and decoding; and a B frame must reference two previously coded P-frame images, one forward and one backward, to complete encoding and decoding.
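As an illustration only (not part of the claimed method), the packet layout described above can be checked with a short routine that splits raw TS data into 188-byte packets and verifies the 0x47 sync byte. It assumes a well-formed stream whose length is an exact multiple of the packet size:

```python
TS_PACKET_SIZE = 188  # one TS packet usually contains 188 bytes
SYNC_BYTE = 0x47      # fixed first byte of every packet (the sync header)

def split_ts_packets(data: bytes) -> list[bytes]:
    """Split raw TS data into 188-byte packets, checking each sync byte."""
    packets = []
    for off in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = data[off:off + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            raise ValueError(f"lost sync at byte offset {off}")
        packets.append(pkt)
    return packets
```

A real demultiplexer would additionally resynchronize on the 0x47 byte after corruption rather than failing outright.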
Fixed bit rate: a fixed bit rate means that the output bit rate of the encoder (or the input bit rate of the decoder) stays at a substantially constant value throughout compression and playback.
Variable bit rate: a variable bit rate contrasts with a fixed bit rate in that the output bit rate of the encoder (or the input bit rate of the decoder) is not fixed but fluctuates. For example, the bit rate at one point in time may be 240 Kbps, and one second later it may be 1 Mbps. Most current audio and video files, such as TS, MP4, and RM files, are variable bit rate files. MP4 refers to a set of compression coding standards for audio and video information. The RM format is a streaming media video file format developed by RealNetworks; different compression ratios can be set according to different network data transmission rates, so that video files can be transmitted and played in real time over low-rate networks.
Average bit rate: the average bit rate of a target file over a period of time, equal to the total size of the file divided by its duration.
Presentation time stamp: the Presentation Time Stamp (PTS) indicates when the player should display the data of a frame. Its value reflects the order in which frame images are displayed.
Decoding time stamp: the Decoding Time Stamp (DTS) indicates when the player should decode the data of a frame. Its value reflects the order in which frame images are decoded.
Ordinary positioning playback: ordinary positioning playback is commonly used in audio and video playback scenarios. A positioning time point is generated by dragging the progress bar or clicking a specific position on the progress bar, the audio/video jumps to the nearest I-frame image preceding the positioning time point, and playback starts from the position of that I-frame image.
Precise positioning playback: precise positioning playback is also used in audio and video playback scenarios. A positioning time point is generated by dragging the progress bar or clicking a specific position on the progress bar, the audio/video jumps to the nearest I-frame, P-frame, or B-frame image preceding the positioning time point, and playback starts from the position of that image.
As shown in fig. 1, the difference between precise positioning playback and ordinary positioning playback is illustrated. In fig. 1, the user wants playback to start at a time point (the positioning time point) offset twenty seconds from the playback start time point, but there is no I frame near the image corresponding to the positioning time point; the nearest preceding I frame is at a position offset sixteen seconds from the playback start time point (point A), so ordinary positioning playback can only start from that position. A P frame exists two seconds before the positioning time point, i.e., at a position offset eighteen seconds from the playback start time point (point B), so precise positioning playback can start from that position. As a result, the actual playback time point B of precise positioning playback is closer to the positioning time point than the actual playback time point A of ordinary positioning playback, thereby achieving a precise positioning effect.
Fig. 2 is a schematic diagram of an implementation environment provided by an embodiment of the present invention, and referring to fig. 2, the implementation environment includes: the system comprises an audio and video playing server 01 and at least one audio and video playing terminal 02, wherein the audio and video playing server 01 is in communication connection with the audio and video playing terminal 02.
The audio/video playing terminal 02 may communicate with the audio/video playing server 01 based on a Browser/Server (B/S) mode or a Client/Server (C/S) mode. The audio/video playing terminal 02 may include physical devices, and may also include software running in the physical devices, such as applications. For example, the terminal may run audio/video player software, or social software with an audio/video playback function.
The audio/video playing server 01 can be used for decoding audio/video files and transmitting the audio/video files to each audio/video playing terminal 02. The audio/video playing server 01 may include a server operating independently, or a distributed server, or a server cluster composed of a plurality of servers.
Referring to fig. 3, a method for precise positioning playback provided by an embodiment of the present invention is shown. The method may be performed by the audio/video playing server, the audio/video playing terminal, or the two in combination in the above implementation environment, and the method includes:
S101, responding to a precise positioning playback instruction, and acquiring a positioning time point and the average bit rate of the target file pointed to by the precise positioning playback instruction.
Specifically, the positioning time point may be generated based on a trigger action such as the user dragging the progress bar or clicking a specific position on the progress bar. The positioning time point is the time point to which the user expects the target file to jump for playback. The target file is the file pointed to by the precise positioning playback instruction, and may be the file corresponding to the object on which the user drags the progress bar or the file corresponding to the object the user clicks. The target file may specifically be a TS transport stream file.
Specifically, as shown in fig. 4, acquiring the positioning time point and the average bit rate of the target file pointed to by the precise positioning playback instruction includes:
S1011, acquiring a start frame corresponding to the file head of the target file and an end frame corresponding to the file tail of the target file.
S1013, acquiring a starting display timestamp corresponding to the start frame and an ending display timestamp corresponding to the end frame.
S1015, calculating the display duration of the target file according to the starting display timestamp and the ending display timestamp.
Specifically, the display duration Duration = end frame PTS − start frame PTS.
S1017, calculating the average bit rate of the target file according to the data volume of the target file and the display duration.
Specifically, the average bit rate of the target file AverageBitrate = size_total / Duration, where size_total is the data amount of the target file.
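Steps S1011 through S1017 can be sketched as a single helper; the function name and the units (the file size in bytes divided by display duration in seconds) are illustrative assumptions, not taken from the source:

```python
def average_bitrate(size_total: float, start_pts: float, end_pts: float) -> float:
    """Average bit rate of the target file: size_total / Duration."""
    duration = end_pts - start_pts  # S1015: display duration from start/end PTS
    if duration <= 0:
        raise ValueError("end PTS must be later than start PTS")
    return size_total / duration    # S1017: AverageBitrate = size_total / Duration
```

For example, a file of 1000 bytes whose first and last frames display at 0 s and 10 s has an average bit rate of 100 bytes per second.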
S103, calculating a target offset position relative to the initial position in the target file according to the positioning time point and the average bit rate, wherein the distance between the display time stamp of the first frame image after the target offset position and the positioning time point is smaller than a preset threshold value.
In particular, the distance may be characterized by an absolute value of a difference between a display timestamp of the first frame image after the target offset position and the positioning time point.
The preset threshold may be set according to actual needs, for example, the preset threshold may be 2 seconds, 4 seconds, and the like.
To achieve precise positioning playback, in one possible embodiment, an estimated offset position may be obtained in the target file simply by multiplying the positioning time point by the average bit rate, and this estimated offset position is used as the target offset position. This has the advantage that the target offset position is simple to calculate, but it also has certain limitations.
The reason is that, after the target offset position is obtained, it is necessary to search from the target offset position for a first target image (an I frame), and then decode until a second target image (an I frame, P frame, or B frame) meeting the requirement of precise positioning playback is found, after which playback starts. However, the target offset position obtained according to this embodiment may be far from the position of the second target image in the target file, so the decoding time needed to find the second target image may increase and is strongly influenced by variable bit rate fluctuations.
Please refer to fig. 5, which shows a schematic diagram of variable bit rate fluctuations during actual audio/video playback. The line labeled 1 corresponds to the average bit rate of the TS file, and the line labeled 2 corresponds to its actual bit rate. It can be seen that the bit rate of a TS file varies widely across different file time points. In the scenario shown in fig. 5, the following situation may arise: the positioning time point is 20 seconds, but because the variable bit rate fluctuates strongly, the display timestamp of the I frame obtained from the target offset position calculated with the average bit rate is 4 seconds, while the display timestamp actually corresponding to the second target image closest to the positioning time point is 18 seconds. Then 14 seconds of data must be decoded to find the second target image, which obviously takes a long time and degrades the user experience.
In order to further improve the accuracy of the target offset position and shorten the decoding time, in a preferred embodiment, calculating the target offset position relative to the starting position in the target file according to the positioning time point and the average bit rate includes:
S1031, calculating an initial offset position according to the positioning time point and the average bit rate, and taking the initial offset position as the current offset position.
Specifically, the initial offset position Offset = SeekPoint × AverageBitrate, where SeekPoint is the positioning time point and AverageBitrate is the average bit rate.
S1033, a target display time stamp is obtained, and the target display time stamp is the display time stamp of the first frame image after the current offset position.
S1035, judging whether the distance between the target display timestamp and the positioning time point is smaller than a preset threshold.
Specifically, the distance between the target display timestamp and the positioning time point is represented by an absolute value of a difference between the target display timestamp and the positioning time point.
S1037, if yes, determining the current offset position as the target offset position.
S1039, if not, obtaining a second offset position according to the target display timestamp, the positioning time point, and the current offset position, updating the current offset position to the second offset position, and repeatedly executing step S1033.
Specifically, as shown in fig. 7, obtaining the second offset position according to the target display timestamp, the positioning time point, and the current offset position includes:
S10, obtaining a reference offset position according to the target display timestamp and the positioning time point.
Specifically, if the target display timestamp is smaller than the positioning time point, the reference offset position is set to the middle position between the current offset position and the end of the target file; if the target display timestamp is greater than the positioning time point, the reference offset position is set to the middle position between the current offset position and the start of the target file.
If SeekPoint > PTS_current, where PTS_current is the target display timestamp, the forward bit rate is larger than the average, and the halving position after the current offset position is taken as the reference offset position, that is:

Offset_ref = (Offset_current + FileSize) / 2

where Offset_ref and Offset_current are the reference offset position and the current offset position, respectively, and FileSize is the total data amount of the target file. Conversely, if SeekPoint < PTS_current, the forward bit rate is smaller than the average, and the halving position before the current offset position is taken as the reference offset position, that is:

Offset_ref = Offset_current / 2
S20, acquiring a reference offset display timestamp, and calculating a reference average bit rate according to the reference offset display timestamp, the target display timestamp, the current offset position, and the reference offset position, where the reference offset display timestamp is the display timestamp of the first frame image after the reference offset position.
Specifically, the reference average bit rate is:

AverageBitrate_ref = (Offset_ref − Offset_current) / (PTS_ref − PTS_current)

where AverageBitrate_ref and PTS_ref are the reference average bit rate and the reference offset display timestamp, respectively.
S30, calculating the second offset position according to the current offset position, the reference average bit rate, the positioning time point, and the target display timestamp.
Specifically, the second offset position Offset′ = Offset_current + (SeekPoint − PTS_current) × AverageBitrate_ref.
In this preferred embodiment, a target offset position close enough to the file position corresponding to the positioning time point is approached by loop iteration, so that, compared with the previous embodiment, the time spent searching for the second target image by decoding is greatly reduced and positioning is accelerated.
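The loop of steps S1031 through S1039, with the refinement of steps S10 through S30, can be sketched as follows. This is a simplified model under assumed units (bytes for offsets, seconds for timestamps); `pts_at` stands in for "the display timestamp of the first frame after a given offset," which a real player obtains by parsing the stream, and the iteration cap and clamping are added safeguards not described in the source:

```python
def find_target_offset(seek_point, avg_bitrate, file_size, pts_at,
                       threshold=0.5, max_iters=32):
    """Iteratively refine a byte offset whose first-frame PTS is near seek_point."""
    offset = min(seek_point * avg_bitrate, file_size)   # S1031: initial offset
    for _ in range(max_iters):
        pts = pts_at(offset)                            # S1033: target display PTS
        if abs(pts - seek_point) < threshold:           # S1035/S1037: close enough
            return offset
        # S10: halving position after/before the current offset, depending on
        # whether the PTS landed before or after the positioning time point
        ref = (offset + file_size) / 2 if seek_point > pts else offset / 2
        pts_ref = pts_at(ref)                           # S20: reference PTS
        if abs(pts_ref - pts) < 1e-9:
            return offset                               # cannot refine further
        avg_ref = (ref - offset) / (pts_ref - pts)      # S20: reference avg bit rate
        offset = offset + (seek_point - pts) * avg_ref  # S30: second offset
        offset = max(0.0, min(offset, file_size))       # keep offset inside the file
    return offset
```

With a synthetic variable-bit-rate mapping such as `pts_at(off) = (off / file_size)**2 * duration`, the loop converges in a couple of iterations even though the initial average-bit-rate estimate lands far from the desired PTS.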
The purpose of setting the preset threshold in step S103 is to reduce the time consumption of a subsequent decoding process based on the target offset position by finding a target offset position where the display timestamp of the first frame image after the target offset position is close enough to the positioning time point.
S105, reading the data of the target file from the target offset position until a first target image is found, where the first target image is the nearest I-frame image preceding the target offset position.
S107, starting decoding from the position of the first target image until a second target image is decoded, where the second target image is the I-frame, P-frame, or B-frame image whose display timestamp is closest to the positioning time point.
S109, starting playing from the second target image.
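Steps S105 through S109 can be modeled against a simplified frame index; this is not the source's implementation — the tuple layout and function name are illustrative assumptions, and a real player would find the I frame by scanning the stream rather than consulting a prebuilt list:

```python
def locate_play_frame(frames, target_offset, seek_point):
    """frames: list of (byte_offset, pts, frame_type), sorted by byte_offset.
    Assumes at least one I frame exists at or before target_offset."""
    # S105: the nearest I frame at or before the target offset position
    i_idx = max(i for i, (off, _, t) in enumerate(frames)
                if t == "I" and off <= target_offset)
    # S107: walk forward from the I frame to the frame whose PTS is
    # closest to the positioning time point (the second target image)
    best = i_idx
    for j in range(i_idx, len(frames)):
        if abs(frames[j][1] - seek_point) < abs(frames[best][1] - seek_point):
            best = j
    return frames[best]  # S109: playback starts from this frame
```

In practice the forward walk stops once the decoded PTS passes the positioning time point, rather than scanning the whole remainder of the file as this sketch does.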
The embodiment of the present invention provides a precise positioning playback method which, compared with ordinary positioning playback, achieves a more accurate playback position, and which shortens the positioning time by optimizing the search logic for the first target image near the positioning time point, thereby achieving fast playback and improving user stickiness.
An embodiment of the present invention further provides a precise positioning playback apparatus; as shown in fig. 8, the apparatus includes:
a positioning time point and average bit rate determining module 201, configured to respond to a precise positioning playing instruction, and obtain a positioning time point and an average bit rate of a target file pointed by the precise positioning playing instruction;
a target offset position obtaining module 203, configured to calculate a target offset position relative to the initial position in the target file according to the positioning time point and the average bit rate, where the distance between the display timestamp of the first frame image after the target offset position and the positioning time point is smaller than a preset threshold;
a first target image obtaining module 205, configured to read data of the target file from the target offset position until a first target image is found, where the first target image is the nearest I-frame image preceding the target offset position;
a second target image obtaining module 207, configured to start decoding from the position where the first target image is located until a second target image is decoded, where the second target image is an I frame, a P frame, or a B frame image whose display timestamp is closest to the positioning time point;
a playing module 209, configured to start playing from the second target image.
Further, as shown in fig. 9, the target offset position obtaining module 203 includes:
an initial offset position obtaining unit 2031 configured to calculate an initial offset position according to the positioning time point and the average bit rate, and use the initial offset position as a current offset position;
a target display timestamp obtaining unit 2033, configured to obtain a target display timestamp, where the target display timestamp is the display timestamp of the first frame image after the current offset position;
a determining unit 2035, configured to determine whether a distance between the target display timestamp and the positioning time point is smaller than a preset threshold;
an updating unit 2037, configured to obtain a second offset position according to the target display timestamp, the positioning time point, and the current offset position, and update the current offset position according to the second offset position;
a target position output unit 2039, configured to determine the current offset position as the target offset position.
Specifically, the precise positioning playing device and the method in the embodiment of the present invention are all based on the same inventive concept.
The embodiment of the present invention further provides a computer storage medium, where the computer storage medium may store a plurality of instructions, and the instructions are suitable for being loaded by a processor and executing various steps of the accurate positioning playing method according to the embodiment of the present invention, where the specific execution process includes:
a method of pinpoint playback, the method comprising:
responding to a precise positioning playing instruction, and acquiring a positioning time point and an average bit rate of a target file pointed by the precise positioning playing instruction;
calculating a target offset position relative to an initial position in a target file according to the positioning time point and the average bit rate, wherein the distance between a display timestamp of a first frame image after the target offset position and the positioning time point is smaller than a preset threshold value;
reading the data of the target file from the target offset position until a first target image is found, where the first target image is the nearest I-frame image preceding the target offset position;
starting decoding from the position of the first target image until a second target image is decoded, wherein the second target image is an I frame, a P frame or a B frame image of which the display timestamp is closest to the positioning time point;
and starting playing from the second target image.
Further, acquiring the positioning time point and the average bit rate of the target file pointed to by the precise positioning playback instruction includes:
acquiring a start frame corresponding to the file head of the target file and an end frame corresponding to the file tail of the target file;
acquiring a starting display time stamp corresponding to the starting frame and an ending display time stamp corresponding to the ending frame;
calculating the display duration of the target file according to the starting display timestamp and the ending display timestamp;
and calculating the average bit rate of the target file according to the data volume of the target file and the display duration.
Further, the calculating a target offset position in the target file relative to the starting position according to the positioning time point and the average bit rate includes:
calculating an initial offset position according to the positioning time point and the average bit rate, and taking the initial offset position as a current offset position;
acquiring a target display timestamp, where the target display timestamp is the display timestamp of the first frame image after the current offset position;
judging whether the distance between the target display timestamp and the positioning time point is smaller than a preset threshold value or not;
and if so, determining the current offset position as the target offset position.
Further, the method further includes:
if not, obtaining a second offset position according to the target display timestamp, the positioning time point, and the current offset position, updating the current offset position to the second offset position, and repeatedly executing the step of: acquiring a target display timestamp, where the target display timestamp is the display timestamp of the first frame image after the current offset position.
Further, the obtaining a second offset position according to the target display timestamp, the positioning time point, and the current offset position includes:
obtaining a reference offset position according to the target display timestamp and the positioning time point;
acquiring a reference offset display timestamp, and calculating a reference average bit rate according to the reference offset display timestamp, the target display timestamp, the current offset position, and the reference offset position, where the reference offset display timestamp is the display timestamp of the first frame image after the reference offset position;
calculating the second offset position according to the current offset position, the reference average bit rate, the positioning time point, and the target display timestamp.
Further, the obtaining a reference offset position according to the target display timestamp and the positioning time point includes:
if the target display timestamp is smaller than the positioning time point, setting the reference offset position as the middle position between the current offset position and the end of the target file; and if the target display timestamp is greater than the positioning time point, setting the reference offset position as the middle position between the current offset position and the start of the target file.
Further, fig. 10 shows a hardware structure diagram of a device for implementing the method provided by the embodiment of the present invention, and the device may participate in constituting or include the apparatus provided by the embodiment of the present invention. As shown in fig. 10, the device 10 may include one or more processors 102 (shown as 102a, 102b, ..., 102n; the processors 102 may include, but are not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions. In addition, the device may further include: a display, an input/output (I/O) interface, a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 10 is merely illustrative and is not intended to limit the structure of the electronic device. For example, the device 10 may also include more or fewer components than shown in fig. 10, or have a different configuration than shown in fig. 10.
It should be noted that the one or more processors 102 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuitry may be a single, stand-alone processing module, or incorporated in whole or in part into any of the other elements in the device 10 (or mobile device). As referred to in the embodiments of the application, the data processing circuit acts as a processor control (e.g. selection of a variable resistance termination path connected to the interface).
The memory 104 may be used to store software programs and modules of application software, such as the program instructions/data storage devices corresponding to the method described in the embodiment of the present invention. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, thereby implementing the above precise positioning playback method. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the device 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of such networks may include wireless networks provided by the communication provider of the device 10. In one example, the transmission device 106 includes a network interface controller (NIC) that can be connected to other network devices through a base station so as to communicate with the internet. In another example, the transmission device 106 may be a Radio Frequency (RF) module used for communicating with the internet wirelessly.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the device 10 (or mobile device).
It should be noted that: the precedence order of the above embodiments of the present invention is only for description, and does not represent the merits of the embodiments. And specific embodiments thereof have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the device and server embodiments, since they are substantially similar to the method embodiments, the description is simple, and the relevant points can be referred to the partial description of the method embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.