Video image synthesis system based on TV output and method of using it
Technical field
The present invention relates to the field of intelligent television technology, and more particularly to a video image synthesis system based on TV output.
Background technology
People never stop pursuing simpler services. Although current technology allows television screens to become ever larger and sharper, it often still fails to meet the simplest needs of ordinary people. For example, while watching TV, ideas or needs often come to mind suddenly: what today's weather is like, which products the nearby supermarket has on promotion, or a need for take-out, online shopping, travel services, and so on. Although the television screen is large, not one inch of it is set aside for these needs. Although the mobile phone is small, you have to operate a phone interface tens of times smaller than the television screen while watching TV in order to check the weather forecast, take-out, or travel services. Although the weather is hot, you still have no choice but to walk into a crowded supermarket just to find out about discount promotions.
Summary of the invention
In view of the shortcomings of the prior art, the present invention proposes a video image synthesis system based on TV output and a method of using it, so that service-generated images can be composited into the television picture and enjoyed while watching TV.
The above technical purpose of the present invention is achieved by the following technical solution:
A video image synthesis system based on TV output, characterized in that it comprises a television video image subsystem, a service-generated image subsystem, a video image synthesis subsystem, and a video output subsystem; the video image synthesis subsystem synthesizes the television video image and the service-generated image and outputs the result to the video output subsystem, and the video output subsystem sends the synthesized video image to the television set.
Preferably, the television video image subsystem comprises an HDMI audio/video signal unit for connecting to a TV set-top box and an HDMI audio/video separator; the HDMI audio/video separator receives the HDMI audio/video signal, separates out the HDMI video signal, and outputs the HDMI video signal to the video image synthesis subsystem.
Preferably, the service-generated image subsystem comprises an echo cancellation, noise reduction and enhancement unit, a speech recognition and semantic understanding unit, and a service image generation unit; the speech recognition and semantic understanding unit receives the voice signal after echo cancellation, noise reduction and enhancement, performs speech recognition and semantic understanding, and then sends the result of semantic understanding to the service image generation unit; the service image generation unit comprises a cloud service image generation unit and a local service image generation unit; the cloud service image generation unit comprises a cloud service unit and a cloud service image unit, and the local service image generation unit comprises a local service unit and a local service image unit; the service image generation unit sends the cloud service image or the local service image to the video image synthesis subsystem.
Preferably, the echo cancellation, noise reduction and enhancement unit comprises an HDMI audio/video separator for connecting to the TV set-top box, an audio receiver for connecting to the audio output of the television set, an audio selector, an echo cancellation unit, a first audio output unit, a speech enhancement and noise reduction unit, a playback module, an audio collection unit, and a second audio output unit; the audio selector receives the audio signal from the HDMI audio/video separator or from the audio receiver and sends it to both the echo cancellation unit and the first audio output unit; the playback module receives the audio signal from the first audio output unit and reproduces the television audio; the echo cancellation unit receives the television audio signal and the audio signal from the audio collection unit, filters out the television audio signal and retains the collected voice signal, and sends the retained signal, after speech enhancement and noise reduction, to the second audio output unit; the second audio output unit sends the collected audio signal, from which the television audio has been filtered out and which has undergone speech enhancement and noise reduction, to the speech recognition and semantic understanding unit.
Preferably, the position of the service-generated image on the television screen can be user-defined, for example the image can be output in the lower right corner or the upper left corner of the screen, and the size of its image area can also be user-defined.
Preferably, the cloud service image generation unit receives semantic instructions from the speech recognition and semantic understanding unit, and generates and outputs the corresponding cloud service image according to the semantic instructions; the cloud service content includes nearby supermarket promotion information, online shopping mall services, weather services, restaurant new-product services, medicine purchase services, consultation services with well-known doctors, travel services, take-out services, and other services.
Preferably, the cloud service image is generated either by the cloud system directly calling third-party open API data interfaces or by services independently developed by the cloud system.
Preferably, the services independently developed by the cloud system include integrating third-party data, integrating third-party resources, and generating demand data according to user preferences, from which the cloud service image is then generated.
A method of using the video image synthesis system based on TV output, characterized in that it comprises the following steps:
Step 1: turn on the television set and wait for it to reach a stable video picture;
Step 2: the user issues a voice instruction within the indoor far-field voice range;
Step 3: the voice collector sends the voice instruction to the speech recognition and semantic understanding unit;
Step 4: the speech recognition and semantic understanding unit performs semantic understanding and sends the resulting semantic data to the service image generation unit;
Step 5: the service image generation unit generates a cloud service image or a local service image according to the semantic content and sends the image to the video image synthesis subsystem;
Step 6: the video image synthesis subsystem synthesizes the television video image and the service-generated image and sends the synthesized image to the video output subsystem;
Step 7: the video output subsystem sends the synthesized video image to the television set.
Turning on the television set in Step 1 also includes powering on the television set by voice command;
The indoor far-field voice range in Step 2 is the area within about 5 meters of an indoor voice collector; the indoor voice collectors are installed in the lamp junction box of each room or in a room speaker.
Advantageous effects of the invention
1. The present invention uses voice as the link to integrate resources from many sources, putting into practice integrated and concurrent services based on TV output, a new form of service that is closer to the needs of ordinary people. By compositing the television video image and the service-generated image, people can enjoy another pleasant service while watching TV: they can freely issue voice demand instructions to the voice system according to personal preference, and the voice system immediately converts the instruction into a service image and displays it in a designated area of the television screen.
2. With the present invention, users who do not yet have a smartphone or a computer, or who cannot operate a computer, no longer need to regret being unable to enjoy these services; they can enjoy smartphone and PC functions on the television screen. The TV audience, the broadest user group with a television penetration rate of 99.3%, can thus be among the first to experience on the television screen the enjoyment that intelligent services bring, improving their quality of life and happiness index.
3. The present invention takes full advantage of the large television screen: the screen is no longer occupied by TV programs alone but also serves as the output terminal for all kinds of intelligent services; the large TV screen no longer serves only online TV programs but also serves households offline.
Brief description of the drawings
Fig. 1 is a block diagram of the system of the present invention;
Fig. 2 is a schematic diagram of the television video image subsystem of the present invention;
Fig. 3 is a schematic diagram of the service-generated image subsystem of the present invention;
Fig. 4 is a structural diagram of the service image generation unit of the present invention;
Fig. 5 is a structural schematic diagram of the echo cancellation, noise reduction and enhancement unit of the present invention;
Fig. 6 is a schematic diagram of the cloud service content of the present invention;
Fig. 7 is a schematic diagram of the composition of the cloud service image of the present invention;
Fig. 8 is a schematic diagram of the content of the services independently developed by the cloud system of the present invention.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings.
I. Design principle of the invention
The aim of the present invention is that, while watching TV, people can issue voice instructions and the system turns these instructions into service result images displayed on the television screen. Three problems must be solved here:
The first problem to be solved: where the user issues the voice instruction, whether at a designated place or at any place, and what receives the voice, that is, whether the instruction is spoken into the air of the room or into a microphone.
A user of this system can issue a voice instruction from any position within the indoor far-field voice range, without holding a microphone. The principle is that the system installs a voice collector in the lamp switch box of each room; the area within about 5 meters of a voice collector is called the far-field voice range, within which the speech recognition accuracy reaches 95% or above. The bedroom, study, dining room, toilet, and living room of a home generally do not exceed 5 meters square, and every room generally has an electric light installed, so a voice instruction issued by the user anywhere in the home at any time can be received.
The second problem to be solved: what to do when the television sound conflicts with the user's voice while a program is playing. The solution and its principle are shown in Fig. 5. The audio signal of the television set is first taken and split into two paths: one is used to reproduce the television audio, and the other serves as a reference signal for the echo cancellation unit. When the user's voice and the television audio are mixed, the reference signal is used to remove the television audio from the mixed signal while retaining the user's voice. In addition, the user's voice signal may suffer interference from the surroundings, such as an air conditioner, so it is strengthened by speech enhancement and noise reduction. Finally, the optimized and enhanced voice signal is sent to the speech recognition and semantic understanding unit.
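As a non-limiting illustration, the reference-signal principle described above can be sketched with an adaptive filter. The sketch assumes a normalized least-mean-squares (NLMS) filter, a technique this specification does not name; the function name `nlms_echo_cancel` and the parameters `taps`, `mu`, and `eps` are hypothetical and chosen only for illustration.

```python
def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract the television audio (ref) from the collected audio (mic).

    NLMS is an assumed technique: the specification only states that the
    reference signal is used to remove the television audio from the mix.
    Returns the residual, which serves as the estimate of the user's voice.
    """
    w = [0.0] * taps    # adaptive estimate of the speaker-to-microphone path
    buf = [0.0] * taps  # most recent reference samples, newest first
    out = []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # predicted television echo
        e = d - y                                   # residual after removal
        norm = sum(b * b for b in buf) + eps        # reference power (normalizer)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out
```

If the collected signal contains only television audio, the residual decays toward zero as the filter adapts; when the user speaks, the voice remains in the residual and can be passed on to speech enhancement and noise reduction.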
The third problem to be solved: how to turn a voice instruction into a video image. The video image here is a browser area opened within a set region of the television screen, and the composite video image is the television video plus the browser; the system displays the service-generated image, that is, the service result image, in the browser area on the television screen.
II. Based on the above principle, a video image synthesis system based on TV output, as shown in Fig. 1, comprises a television video image subsystem, a service-generated image subsystem, a video image synthesis subsystem, and a video output subsystem; the video image synthesis subsystem synthesizes the television video image and the service-generated image and outputs the result to the video output subsystem, and the video output subsystem sends the synthesized video image to the television set.
As shown in Fig. 2, the television video image subsystem comprises an HDMI audio/video signal unit for connecting to a TV set-top box and an HDMI audio/video separator; the HDMI audio/video separator receives the HDMI audio/video signal, separates out the HDMI video signal, and outputs the HDMI video signal to the video image synthesis subsystem.
As shown in Fig. 3 and Fig. 4, the service-generated image subsystem comprises an echo cancellation, noise reduction and enhancement unit, a speech recognition and semantic understanding unit, and a service image generation unit; the speech recognition and semantic understanding unit receives the voice signal after echo cancellation, noise reduction and enhancement, performs speech recognition and semantic understanding, and then sends the result of semantic understanding to the service image generation unit; the service image generation unit comprises a cloud service image generation unit and a local service image generation unit; the cloud service image generation unit comprises a cloud service unit and a cloud service image unit, and the local service image generation unit comprises a local service unit and a local service image unit; the service image generation unit sends the cloud service image or the local service image to the video image synthesis subsystem.
As shown in Fig. 5, the echo cancellation, noise reduction and enhancement unit comprises an HDMI audio/video separator for connecting to the TV set-top box, an audio receiver for connecting to the audio output of the television set, an audio selector, an echo cancellation unit, a first audio output unit, a speech enhancement and noise reduction unit, a playback module, an audio collection unit, and a second audio output unit; the audio selector receives the audio signal from the HDMI audio/video separator or from the audio receiver and sends it to both the echo cancellation unit and the first audio output unit; the playback module receives the audio signal from the first audio output unit and reproduces the television audio; the echo cancellation unit receives the television audio signal and the audio signal from the audio collection unit, filters out the television audio signal and retains the collected voice signal, and sends the retained signal, after speech enhancement and noise reduction, to the second audio output unit; the second audio output unit sends the collected audio signal, from which the television audio has been filtered out and which has undergone speech enhancement and noise reduction, to the speech recognition and semantic understanding unit.
The position of the service-generated image on the television screen can be user-defined, for example the image can be output in the lower right corner or the upper left corner of the screen, and the size of its image area can also be user-defined.
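As a non-limiting illustration, placing the service-generated image at a user-defined corner of the television frame can be sketched as follows. The sketch assumes frames are represented as lists of pixel rows; the function names `region_origin` and `composite`, the corner names, and the `margin` parameter are hypothetical and do not appear in this specification.

```python
def region_origin(frame_w, frame_h, img_w, img_h, corner="lower_right", margin=0):
    """Top-left pixel of the service image for a named corner (names are illustrative)."""
    if corner == "lower_right":
        return frame_w - img_w - margin, frame_h - img_h - margin
    if corner == "upper_left":
        return margin, margin
    raise ValueError(f"unsupported corner: {corner}")


def composite(tv_frame, service_img, corner="lower_right", margin=0):
    """Return a new frame with service_img pasted over tv_frame at the chosen corner."""
    frame_h, frame_w = len(tv_frame), len(tv_frame[0])
    img_h, img_w = len(service_img), len(service_img[0])
    if img_w + margin > frame_w or img_h + margin > frame_h:
        raise ValueError("service image does not fit in the frame")
    x0, y0 = region_origin(frame_w, frame_h, img_w, img_h, corner, margin)
    out = [row[:] for row in tv_frame]  # copy so the television frame is untouched
    for dy, src_row in enumerate(service_img):
        out[y0 + dy][x0:x0 + img_w] = src_row  # overwrite pixels in the chosen region
    return out
```

The size of the image area is determined here simply by the dimensions of `service_img`, so resizing the service image before compositing would change the occupied area.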
As shown in Fig. 6, the cloud service image generation unit receives semantic instructions from the speech recognition and semantic understanding unit, and generates and outputs the corresponding cloud service image according to the semantic instructions; the cloud service content includes nearby supermarket promotion information, online shopping mall services, weather services, restaurant new-product services, medicine purchase services, consultation services with well-known doctors, travel services, take-out services, and other services.
As shown in Fig. 7, the cloud service image is generated either by the cloud system directly calling third-party open API data interfaces or by services independently developed by the cloud system.
As shown in Fig. 8, the services independently developed by the cloud system include integrating third-party data, integrating third-party resources, and generating demand data according to user preferences, from which the cloud service image is then generated.
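As a non-limiting illustration, routing a semantic instruction to a cloud service or a local service can be sketched with a simple dispatch table. All names here are hypothetical: the intent keys, the handler functions, and the `ServiceImage` type are not defined in this specification, no real third-party API is called, and the local "clock" service is invented for the example because the specification does not list local services.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ServiceImage:
    source: str   # "cloud" or "local"
    content: str  # placeholder standing in for rendered image data


def weather_from_third_party(query: str) -> str:
    # Stand-in for a call to a third-party open API data interface.
    return f"[weather panel for: {query}]"


def promotions_in_house(query: str) -> str:
    # Stand-in for a service independently developed by the cloud system.
    return f"[supermarket promotions for: {query}]"


def local_clock(query: str) -> str:
    # Hypothetical local service, assumed only for this example.
    return "[local clock panel]"


CLOUD_SERVICES: Dict[str, Callable[[str], str]] = {
    "weather": weather_from_third_party,
    "promotion": promotions_in_house,
}
LOCAL_SERVICES: Dict[str, Callable[[str], str]] = {"clock": local_clock}


def generate_service_image(intent: str, query: str) -> ServiceImage:
    """Pick the cloud or local handler for a semantic instruction."""
    if intent in CLOUD_SERVICES:
        return ServiceImage("cloud", CLOUD_SERVICES[intent](query))
    if intent in LOCAL_SERVICES:
        return ServiceImage("local", LOCAL_SERVICES[intent](query))
    raise KeyError(f"no service registered for intent: {intent}")
```

A table of this kind lets further services, such as take-out or travel, be registered without changing the routing logic.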
A method of using the video image synthesis system based on TV output comprises the following steps:
Step 1: turn on the television set and wait for it to reach a stable video picture;
Step 2: the user issues a voice instruction within the indoor far-field voice range;
Step 3: the voice collector sends the voice instruction to the speech recognition and semantic understanding unit;
Step 4: the speech recognition and semantic understanding unit performs semantic understanding and sends the resulting semantic data to the service image generation unit;
Step 5: the service image generation unit generates a cloud service image or a local service image according to the semantic content and sends the image to the video image synthesis subsystem;
Step 6: the video image synthesis subsystem synthesizes the television video image and the service-generated image and sends the synthesized image to the video output subsystem;
Step 7: the video output subsystem sends the synthesized video image to the television set.
Turning on the television set in Step 1 also includes powering on the television set by voice command;
The indoor far-field voice range in Step 2 is the area within about 5 meters of an indoor voice collector; the indoor voice collectors are installed in the lamp junction box of each room or in a room speaker.
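As a non-limiting illustration, the data flow of Steps 3 to 7 above can be sketched as a chain of stages. The function `run_pipeline` and its stage parameters are hypothetical; each stage is passed in as a callable so that the sketch makes no assumption about how any individual unit is implemented.

```python
def run_pipeline(tv_frame, mic_audio, tv_audio, *,
                 cancel_echo, recognize, generate_image, compose, output):
    """Chain the units in the order given by the method steps."""
    voice = cancel_echo(mic_audio, tv_audio)   # Step 3: cleaned voice instruction
    semantics = recognize(voice)               # Step 4: semantic understanding
    service_image = generate_image(semantics)  # Step 5: cloud or local service image
    frame = compose(tv_frame, service_image)   # Step 6: synthesize with TV video
    return output(frame)                       # Step 7: send to the television set
```

Each stage can then be replaced independently, for example swapping the image generation stage between a cloud and a local implementation.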
This specific embodiment merely explains the present invention and does not limit it. After reading this specification, those skilled in the art may, as needed, make modifications to this embodiment that involve no inventive contribution, and all such modifications are protected by patent law as long as they fall within the scope of the claims of the present invention.