CN101453655A


Info

Publication number: CN101453655A
Application number: CNA2007100774888A
Authority: CN
Inventors: 郑元九; 秦小庆
Original assignee: Shenzhen Huawei Communication Technologies Co Ltd
Current assignee: Huawei Device Shenzhen Co Ltd
Priority date: 2007-11-30
Filing date: 2007-11-30
Publication date: 2009-06-10

Abstract

The invention provides a user-controllable audio-video synchronization adjustment method, system, and terminal device. The method comprises: inputting audio-video synchronization adjustment parameters; and adjusting the synchronization of the audio-video output according to the input parameters. The user judges the audio-video synchronization state and adjusts the synchronization parameters according to personal perception. Compared with the prior-art approach of inserting synchronization timestamps at the sending end and the receiving end, the method is not affected by factors such as differing audio-video receiving terminals, acquired audio-video data sources that are themselves out of sync, or audio and video data streams entering different receiving terminals. It lowers the cost of implementing audio-video synchronization adjustment, gives the user more control, and ensures the synchronization result.

Description

Method, system, and device for user-controllable audio-video synchronization adjustment

Technical field

The present invention relates to multimedia technology, and in particular to a method, system, and terminal device for user-controllable audio-video synchronization adjustment.

Background art

As science and technology develop, people pay increasing attention to audio-visual enjoyment, and the development of multimedia technology meets this demand.

The most troublesome problem in playing multimedia content is audio and video falling out of sync. Audio-video asynchrony (also called lip-sync error) refers to the following: when the time offset between sound and image exceeds 400 milliseconds, people perceive that the sound no longer matches the mouth movements. Audio-video asynchrony greatly degrades the user experience. Ensuring that the user ultimately experiences synchronized sound and image, which is what audio-video synchronization means here, is therefore very important. The key factors affecting the synchronization the end user experiences include the data synchronization at the sending end, the performance of the transmission channel, and the synchronization processing at the receiving end. Most existing techniques therefore require the devices in these three areas to cooperate; that is, they insert synchronization timestamps at the sending end and the receiving end to synchronize the audio and video.

This implementation can be described with reference to Fig. 1. After collecting the audio stream 110 and the video stream 120, the sending-end device 130 uses the synchronization identifier adding module 131 to add an identifier F carrying synchronization information, such as a synchronization timestamp, to the data streams, and uses the RTP (Real-time Transport Protocol) packet encapsulation module 132 to encapsulate the identified audio and video streams into RTP packets. The packets are then transmitted over the network transmission channel 140. After receiving the audio and video RTP packets, the receiving device 150 uses the RTP packet decapsulation module 151 to extract the synchronization identifier F and, under the same CPU clock, decodes the audio and video data according to this identifier, finally achieving audio-video synchronization and obtaining the synchronized audio stream 160 and video stream 170. The processing flow is shown in Fig. 2.

In the course of realizing the present invention, the inventors found that the prior art has at least the following shortcomings:

1. The sender and receiver devices must both understand the synchronization identifiers and follow the same protocol, and intermediate transmission equipment must not modify the synchronization identifiers in the stream. This places high compatibility requirements on each individual terminal device (such as the sending device or the receiving terminal), and the implementation cost of the overall solution is also high.

2. In application scenarios where the audio and video data are taken from different device terminals, where the audio-video data source collected by the sending device is itself out of sync, or where the audio and video data streams enter different receiving terminals, the prior-art solution cannot help.

Summary of the invention

To overcome the complexity and high cost of existing audio-video synchronization techniques, embodiments of the invention provide a method, system, and device for user-controllable audio-video synchronization adjustment, so that the adjustment is no longer restricted by factors such as differing audio-video receiving terminals, acquired audio-video data sources that are themselves out of sync, or audio and video data streams entering different receiving terminals. This lowers the implementation cost of audio-video synchronization adjustment, gives the user more control, and ensures the synchronization result.

One embodiment of the invention provides a method for user-controllable audio-video synchronization adjustment, comprising the following steps: inputting audio-video synchronization adjustment parameters; and adjusting the synchronization of the audio-video output according to the input parameters.

Another embodiment of the invention provides a terminal system for user-controllable audio-video synchronization adjustment, comprising an audio data output device, a video data output device, and an audio-video synchronization adjustment device. The audio data output by the audio data output device is compared for synchronization with the video data output by the video data output device; if the two are found to be out of sync, the audio-video synchronization adjustment device is used to adjust the asynchronous state of the audio and video data.

Yet another embodiment of the invention provides a user-controllable audio-video synchronization adjustment device, comprising an audio synchronization processing module and/or a video synchronization processing module, and a synchronization parameter input module. The audio synchronization processing module adjusts the synchronization of the playback of audio data; the video synchronization processing module adjusts the synchronization of the display of video data; and, when the audio and video data are output out of sync, the synchronization parameter input module converts the audio-video synchronization adjustment parameters input by the user into audio and/or video synchronization adjustment commands.

The technical solutions in the embodiments of the invention have at least the following advantages. The user can input audio-video synchronization adjustment parameters according to his or her own experience. Using these parameters, the audio-video synchronization adjustment device can directly apply variable-speed processing to the audio data stream or the video data stream according to a given decision strategy and the user's selection. Compared with the prior-art method, in which the sending-end device adds synchronization identifiers to the audio and video data streams and the receiving device decodes synchronously according to those identifiers, the adjustment process is more convenient, the implementation cost can be considerably lower, and the user gains more control.

The features and advantages of the invention are described in detail below through embodiments, with reference to the accompanying drawings.

Description of drawings

To illustrate the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a system block diagram for implementing audio-video synchronization in the prior art;

Fig. 2 is a flow chart of a prior-art method for implementing audio-video synchronization;

Fig. 3 is a flow chart of the user-controllable audio-video synchronization adjustment method provided by an embodiment of the invention;

Fig. 4 is a system block diagram of the user-controllable audio-video synchronization adjustment provided by an embodiment of the invention;

Fig. 5 is a module structure diagram of the user-controllable audio-video synchronization adjustment device provided by an embodiment of the invention;

Fig. 6 is a flow chart of the method, provided by an embodiment of the invention, for slowing down the audio when the audio is ahead in an out-of-sync state;

Fig. 7 is a flow chart of the method, provided by an embodiment of the invention, for speeding up the video when the audio is ahead in an out-of-sync state;

Fig. 8 is a flow chart of the method, provided by an embodiment of the invention, for speeding up the audio when the video is ahead in an out-of-sync state;

Fig. 9 is a flow chart of the method, provided by an embodiment of the invention, for slowing down the video when the video is ahead in an out-of-sync state.

Embodiment

To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described below with reference to the drawings and specific embodiments.

The method provided by the embodiments of the invention mainly comprises: inputting audio-video synchronization adjustment parameters; and adjusting the synchronization of the audio and video streams according to those parameters.

Before the audio-video synchronization adjustment parameters are input, the user judges the synchronization state from the audio and video streams output by the audio-video output devices. If the output streams are in sync, the user does not need to input any adjustment parameters. If they are out of sync, the user inputs the adjustment parameters, and the audio-video variable-speed apparatus then performs the synchronization adjustment.

For ease of understanding, the user-controllable audio-video synchronization adjustment method provided by the embodiments of the invention is described first. Referring to Fig. 3, the method mainly comprises the following steps:

Step 301: the audio-video output devices output the audio and video data;

Step 302: the user judges whether the audio and video data are in sync;

Step 303: when the audio and video data are out of sync, the user inputs audio-video adjustment parameters;

Step 304: variable-speed adjustment of the audio or video is performed according to the input parameters;

Step 305: the synchronized audio and video data are output.

Fig. 4 is a structural diagram of a terminal system for user-controllable audio-video synchronization adjustment according to another embodiment of the invention. Referring to Fig. 4, the system comprises an audio playback device 410, a video display device 420, and an audio-video synchronization adjustment device 430. The audio playback device 410 and the video display device 420 are each connected to the audio-video synchronization adjustment device 430 and are used to play audio data and display video data, respectively. When the user finds that the audio and video are out of sync, the user adjusts the synchronization of the audio and video data through the audio-video synchronization adjustment device 430.

Fig. 5 is a module structure diagram of a user-controllable audio-video synchronization adjustment device according to another embodiment of the invention. Referring to Fig. 5, the device mainly comprises: a video input interface 431, an audio input interface 432, a video data buffer module 433, an audio data buffer module 434, a video variable-speed processing module 435, an audio variable-speed processing module 436, a video output interface 437, an audio output interface 438, and a synchronization parameter input module 439. The video input interface 431 inputs the video data produced by the video data source into the audio-video synchronization adjustment device. The audio input interface 432 inputs the audio data produced by the audio data source into the audio-video synchronization adjustment device. The video data buffer module 433 buffers the video data stream delivered by the data source. The audio data buffer module 434 buffers the audio data stream delivered by the data source. The video variable-speed processing module 435 adjusts the display speed of the video data. The audio variable-speed processing module 436 adjusts the playback speed of the audio data. The video output interface 437 outputs the processed video data for display. The audio output interface 438 outputs the processed audio data for playback. The synchronization parameter input module 439 is used by the user to input audio-video synchronization adjustment parameters when the audio and video data are output out of sync.

Synchronization processing with this user-controllable audio-video synchronization adjustment device proceeds as follows. First, the video stream and the audio stream pass through the video input interface 431 and the audio input interface 432 and are presented, and the user observes whether the audio and video are in sync. When the user finds that the audio and video data streams are out of sync, the user inputs adjustment parameters through the synchronization parameter input module 439. The audio-video synchronization adjustment device determines, according to a given judgment criterion, whether the audio is ahead or the video is ahead, and selects the object to adjust according to the synchronization adjustment strategy; the object is either the audio stream or the video stream. When the selected object is the audio stream, the audio variable-speed processing module 436 and the audio data buffer module 434 apply variable-speed adjustment to the audio playback. When the selected object is the video stream, the video variable-speed processing module 435 and the video data buffer module 433 apply variable-speed adjustment to the video display.
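The selection logic described above can be sketched as a small dispatch function. This is only an illustration under stated assumptions: the patent does not define the judgment criterion or the parameter format, so the signed offset `audio_lead_ms`, the `Stream` enum, and the action names are hypothetical. The four outcomes correspond to the four procedures of Figs. 6 to 9.

```python
from enum import Enum


class Stream(Enum):
    """Hypothetical label for the stream the user chooses to adjust."""
    AUDIO = "audio"
    VIDEO = "video"


def choose_action(audio_lead_ms: int, target: Stream) -> str:
    """Pick a variable-speed action from the offset and the chosen stream.

    audio_lead_ms > 0 means the audio is ahead of the video;
    audio_lead_ms < 0 means the video is ahead of the audio.
    (This sign convention is an assumption of the sketch.)
    """
    if audio_lead_ms == 0:
        return "none"  # already in sync, nothing to adjust
    audio_first = audio_lead_ms > 0
    if target is Stream.AUDIO:
        # Fig. 6: audio ahead, slow the audio; Fig. 8: video ahead, speed it up
        return "slow_audio" if audio_first else "speed_up_audio"
    # Fig. 7: audio ahead, speed up the video; Fig. 9: video ahead, slow it
    return "speed_up_video" if audio_first else "slow_video"
```

For example, `choose_action(300, Stream.VIDEO)` selects the video speed-up path, which is the frame-dropping procedure of Fig. 7.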

To describe in more detail how variable-speed adjustment is applied to the audio stream and the video stream, the specific processing procedures of the invention are explained below:

Referring to Fig. 6, an embodiment of the invention provides a procedure for slowing down the audio when the audio and video are out of sync and the audio playback is ahead of the video display. The method may comprise the following steps:

Step 601: extract the audio data and divide it into segments;

In this embodiment the audio data is segmented according to a given criterion. Time-domain windowing may be used for the segmentation, and the window size may be chosen appropriately from the statistical information of the sound;

Step 602: fit lengthened segment audio data from the statistical sound features of each audio segment;

This embodiment statistically analyzes each audio segment, extracts characteristic information that represents the segment, and uses that information to fit the audio data, producing a lengthened segment that still reflects the features of the original segment;

Step 603: insert the fitted audio segments;

This embodiment inserts the lengthened, fitted audio segments in time order, so that the result reflects the original timing;

Step 604: play the audio data obtained after insertion and recombination;

This embodiment filters the audio data after insertion to remove the glitches that segment-wise fitting causes at the segment boundaries, so that the result reflects the features of the original audio data more faithfully, and then plays the slowed-down audio.
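Steps 601 to 604 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's own algorithm: the patent does not specify the fitting method, so plain linear interpolation stands in for fitting from statistical features (it also lowers the pitch, which a real time-scale modification method would avoid), and a 3-point average at each segment join stands in for the filtering step. The names `slow_down_audio`, `lengthen_segment`, `window`, and `factor` are hypothetical.

```python
def lengthen_segment(seg, factor):
    """Stretch one segment to about len(seg) * factor samples (step 602).

    Linear interpolation is an assumed stand-in for the patent's fitting.
    """
    n_out = max(1, round(len(seg) * factor))
    if len(seg) == 1 or n_out == 1:
        return [seg[0]] * n_out
    out = []
    for k in range(n_out):
        pos = k * (len(seg) - 1) / (n_out - 1)  # position in the input
        i = int(pos)
        frac = pos - i
        nxt = seg[min(i + 1, len(seg) - 1)]
        out.append(seg[i] * (1 - frac) + nxt * frac)
    return out


def slow_down_audio(samples, window, factor):
    """Segment, lengthen, reinsert in order, and smooth the joins."""
    if factor < 1:
        raise ValueError("factor must be >= 1 to slow the audio down")
    stretched, joins = [], []
    for start in range(0, len(samples), window):          # step 601
        seg = samples[start:start + window]
        stretched.extend(lengthen_segment(seg, factor))   # steps 602-603
        joins.append(len(stretched))
    out = list(stretched)
    for b in joins[:-1]:                                  # step 604 filtering
        if 0 < b < len(stretched) - 1:
            out[b] = (stretched[b - 1] + stretched[b] + stretched[b + 1]) / 3
    return out
```

With `window=4` and `factor=1.5`, every 4-sample segment becomes 6 samples, so playback at the original sample rate takes 1.5 times as long.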

Referring to Fig. 7, an embodiment of the invention provides a procedure for speeding up the video when the audio and video are out of sync and the audio playback is ahead of the video display. The method may comprise the following steps:

Step 701: determine the number of video frames to discard from the time by which the video data lags the audio data;

This embodiment calculates the number of video frames to discard from the number of video frames played per second and the time by which the video data lags the audio data;

Step 702: discard non-key video frames according to a given processing strategy;

This embodiment determines, according to the corresponding strategy, which video frames are important and which are not, and discards the unimportant frames;

Step 703: recombine the remaining frames and display them;

This embodiment recombines the video frames that remain after the unimportant frames are discarded, in their original time order, and displays them.
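Steps 701 to 703 can be sketched as below. This is an illustration under stated assumptions: the patent leaves the "processing strategy" open, so spreading the drops evenly over the non-key frames is an assumed choice, and the `Frame` type, `frames_to_drop`, and `drop_non_key_frames` are hypothetical names. The frame count follows the calculation described in step 701: frames per second multiplied by the lag time.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical video frame record for this sketch."""
    index: int
    is_key: bool


def frames_to_drop(fps: float, lag_ms: float) -> int:
    """Step 701: frames per second times the video lag in seconds."""
    return round(fps * lag_ms / 1000.0)


def drop_non_key_frames(frames, count):
    """Steps 702-703: drop up to `count` non-key frames, keep the order.

    Drops are spread evenly across the non-key frames (assumed strategy).
    Key frames are never dropped, so fewer than `count` frames may go.
    """
    candidates = [i for i, f in enumerate(frames) if not f.is_key]
    count = min(count, len(candidates))
    if count <= 0:
        return list(frames)
    step = len(candidates) / count
    drop = {candidates[int(k * step)] for k in range(count)}
    return [f for i, f in enumerate(frames) if i not in drop]
```

For instance, at 25 frames per second a 400 ms lag corresponds to 10 frames to discard.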

Referring to Fig. 8, an embodiment of the invention provides a procedure for speeding up the audio when the audio and video are out of sync and the video display is ahead of the audio playback. The method may comprise the following steps:

Step 801: extract the audio data and divide it into segments;

In this embodiment the audio data is segmented according to a given criterion. Time-domain windowing may be used for the segmentation, and the window size may be chosen appropriately from the statistical information of the sound;

Step 802: extract the characteristic audio data of each segment from the statistical sound features of the audio segment;

This embodiment statistically analyzes each audio segment, extracts characteristic information that represents the segment, and uses that information to extract audio data, producing a shortened segment that still reflects the features of the original segment;

Step 803: merge the extracted audio data;

This embodiment merges the shortened audio segments in time order, so that the result reflects the original timing;

Step 804: play the merged audio data.

This embodiment filters the audio data after extraction to remove the glitches that segment-wise extraction causes at the segment boundaries, so that the result reflects the features of the original audio data more faithfully, and then plays the sped-up audio.
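Steps 801 to 804 can be sketched in the same spirit. This is a simplified illustration, not the patent's method: the patent does not say how the characteristic data is extracted, so keeping evenly spaced samples from each segment is an assumed stand-in (it raises the pitch, which a real time-scale modification method would avoid), and a 3-point average at each join stands in for the filtering step. `speed_up_audio`, `shorten_segment`, `window`, and `factor` are hypothetical names.

```python
def shorten_segment(seg, factor):
    """Keep about len(seg) * factor evenly spaced samples (step 802).

    Even decimation is an assumed stand-in for feature-based extraction.
    """
    n_out = max(1, round(len(seg) * factor))
    step = len(seg) / n_out
    return [seg[int(k * step)] for k in range(n_out)]


def speed_up_audio(samples, window, factor):
    """Segment, shorten, merge in order, and smooth the joins."""
    if not 0 < factor <= 1:
        raise ValueError("factor must be in (0, 1] to speed the audio up")
    merged, joins = [], []
    for start in range(0, len(samples), window):          # step 801
        seg = samples[start:start + window]
        merged.extend(shorten_segment(seg, factor))       # steps 802-803
        joins.append(len(merged))
    out = list(merged)
    for b in joins[:-1]:                                  # boundary filtering
        if 0 < b < len(merged) - 1:
            out[b] = (merged[b - 1] + merged[b] + merged[b + 1]) / 3
    return out                                            # step 804: play this
```

With `window=4` and `factor=0.5`, each 4-sample segment is reduced to 2 samples, so the audio plays in half the time at the original sample rate.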

Referring to Fig. 9, an embodiment of the invention provides a procedure for slowing down the video when the audio and video are out of sync and the video display is ahead of the audio playback. The method may comprise the following steps:

Step 901: determine the number of video frames to delay from the time by which the video data leads the audio data;

This embodiment calculates the number of video frames to delay from the number of video frames played per second and the time by which the video data leads the audio data;

Step 902: delay the frame image data according to the determined number of frames;

This embodiment determines, according to the corresponding strategy, which video frames are important and which are not, and delays the important frames;

Step 903: recombine the remaining frames and display them;

This embodiment reinserts the previously delayed important frames into the original video timing and displays them.
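Steps 901 to 903 can be sketched as follows. This is an illustration under stated assumptions: the patent does not specify how a frame is delayed, so holding a selected frame on screen for extra frame periods (repeating it in the display sequence) is an assumed interpretation, and `frames_to_delay`, `hold_frames`, and the `is_important` predicate are hypothetical names.

```python
def frames_to_delay(fps: float, lead_ms: float) -> int:
    """Step 901: frames per second times the video lead in seconds."""
    return round(fps * lead_ms / 1000.0)


def hold_frames(frames, extra, is_important):
    """Steps 902-903: build a display sequence with `extra` added slots.

    Each added slot repeats one important frame, assigned round-robin,
    so the frames keep their original order (assumed delay strategy).
    """
    candidates = [i for i, f in enumerate(frames) if is_important(f)]
    if not candidates or extra <= 0:
        return list(frames)
    holds = [0] * len(frames)
    for k in range(extra):
        holds[candidates[k % len(candidates)]] += 1
    sequence = []
    for frame, extra_slots in zip(frames, holds):
        sequence.extend([frame] * (1 + extra_slots))
    return sequence
```

For instance, at 25 frames per second a 120 ms lead corresponds to 3 extra frame slots, which stretch the display time by the same 120 ms.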

It will be appreciated that the audio variable-speed processing module and the video variable-speed processing module of the audio-video synchronization adjustment device of the invention may both be present in the device, or only one of the two may be present. In the latter case, the user simply inputs an audio synchronization parameter or a video synchronization parameter alone, and the synchronization of the audio and video can still be adjusted.

The above are only preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. A user-controllable audio-video synchronization adjustment method, characterized in that the method comprises:

inputting audio-video synchronization adjustment parameters;

adjusting the synchronization of the audio-video output according to the input parameters.

2. The method according to claim 1, characterized in that inputting the audio-video synchronization adjustment parameters comprises: the user determining the object to be adjusted according to what the user perceives; and inputting the adjustment parameters.

3. The method according to claim 2, characterized in that the object to be adjusted comprises an audio stream and/or a video stream, wherein the video stream comprises an image stream and/or a text stream.

4. The method according to claim 2, characterized in that the manner of inputting the audio-video synchronization adjustment parameters comprises: click input, key input, stepped slider input, and touch input.

5. The method according to claim 3, characterized in that adjusting the synchronization of the output according to the input parameters comprises adjusting the synchronization of the audio and/or adjusting the synchronization of the video.

6. A user-controllable audio-video synchronization adjustment system, characterized in that the system comprises:

an audio data output device for outputting audio data;

a video data output device for outputting video data;

an audio-video synchronization adjustment device for adjusting the asynchronous state of the audio and video data;

wherein the user judges the synchronization state of the audio and video using the audio data output device and the video data output device, and, when the two are out of sync, the user uses the audio-video synchronization adjustment device to synchronize the playback states of the audio stream and the video stream.

7. A user-controllable audio-video synchronization adjustment device, characterized in that the device comprises:

an audio synchronization processing module for adjusting the synchronization of the playback of audio data, and/or

a video synchronization processing module for adjusting the synchronization of the display of video data;

a synchronization parameter input module for converting, when the audio and video data are output out of sync, the audio-video synchronization adjustment parameters input by the user into audio and/or video synchronization adjustment commands.

8. The device according to claim 7, characterized in that the device further comprises:

an audio input interface for inputting the audio data produced by an audio data source into the audio-video synchronization adjustment device;

a video input interface for inputting the video data produced by a video data source into the audio-video synchronization adjustment device.

9. The device according to claim 7, characterized in that the device further comprises:

an audio output interface for outputting the processed audio data for playback;

a video output interface for outputting the processed video data for display.

10. The device according to claim 7, characterized in that the device further comprises:

an audio data stream buffer module for buffering the audio data stream delivered by the data source;

a video data stream buffer module for buffering the video data stream delivered by the data source.