CN102236686A - Voice sectional song search method - Google Patents

Voice sectional song search method

Info

Publication number
CN102236686A
Authority
CN
China
Prior art keywords
client
song
voice
user
server end
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010101692232A
Other languages
Chinese (zh)
Inventor
李霄寒
黄伟
蔡洪滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shengle Information Technolpogy Shanghai Co Ltd
Original Assignee
Shengle Information Technolpogy Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shengle Information Technolpogy Shanghai Co Ltd
Priority to CN2010101692232A
Publication of CN102236686A
Current legal status: Pending


Abstract

The invention discloses a voice sectional song search method, which is realized through interaction between a client and a server side; a song database must be established at the server side. The method comprises the following steps: during a search, the client prompts the user to say the song information to be searched for in segments, and sends the speech to the server side once the user has finished speaking; after receiving the speech signal, the server side automatically recognizes the text corresponding to it, searches the song database hierarchically, and sends the search result to the client; finally, the client presents the song search result to the user. Because a reasonable voice interaction flow is designed at the client, the server side can decompose one very large song search space into a combination of several smaller song search spaces. This improves search efficiency and automatic speech recognition accuracy in a very large song search space, improves user experience, and saves hardware cost for service providers.

Description

Voice sectional song search method
Technical field
The present invention relates to a voice sectional song search method.
Background art
Song downloading is currently a very active class of service in the mobile and Internet fields. In this service, the client passes on the user's search request and the server looks in its song library for songs that meet the search conditions, for the user to download. The most common example is a user entering the name of a singer or a song on the terminal, and the server returning the song or series of songs corresponding to that singer or song title.
On a mobile terminal, the physical size of the device usually makes text input inefficient, so entering a complete song title by keyboard or touch screen can be very time-consuming. Moreover, because pronunciation-based input methods dominate on mobile terminals, users very commonly enter homophones that are the wrong characters: for example, Wang Fei's "Legend" may be entered as a same-sounding but incorrect string (rendered literally as "wearing strange"), which can directly prevent some server-side search algorithms from finding the corresponding song. A reasonable solution to these problems is speech recognition: the user simply says the name of the song, the user terminal or the server uses a speech recognition algorithm to find the target song or series of songs in the song-title database, and the actual song is then fetched from the song library and returned to the user. This improves the user's input efficiency and avoids the homophone problem, which improves user experience to some extent and makes users more willing to use the song download service.
However, speech recognition has an inherent weakness: like human hearing, it is not perfectly accurate, and the accuracy of machine speech recognition falls as the database grows. For example, as the song-title database grows, the number of similar-sounding song titles also grows, such as Pan Weibai's "Favourable Turn" and Wang Fei's "Legend", and this can seriously reduce recognition accuracy. To cover the needs of most users, even a medium-sized song database usually contains several hundred thousand distinct songs, and performing speech recognition over such a large range is a great challenge for recognition accuracy. In addition, speech recognition algorithms are usually computation-intensive, and their computational load rises rapidly with the size of the database, which is a great challenge for the server's computing capacity. To handle concurrent requests from many users in time, the service provider often has to provide more hardware resources, which increases cost.
Summary of the invention
The technical problem to be solved by the present invention is to provide a voice sectional song search method that improves search efficiency and the accuracy of automatic speech recognition while saving hardware cost.
To solve the above technical problem, the voice sectional song search method of the present invention is realized through interaction between a client and a server side, the server side having a song database. The method comprises the following steps:
(1) the client prompts the user to say, in segments, the information about the song to be searched for, and sends the speech spoken by the user to the server side;
(2) the server side receives the speech signal sent by the client;
(3) the server side performs automatic speech recognition on the received speech signal and identifies the text corresponding to it;
(4) the server side starts the search engine and searches the song database hierarchically for the song specified by the user;
(5) the server side sends the song search result to the client;
(6) the client receives the search result sent by the server side;
(7) the client presents the song search result to the user.
Step (1) comprises the following sub-steps:
(1) the client prompts the user to say the first speech segment;
(2) the client records and, at the same time, detects the speech endpoint;
(3) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (4); if not, it continues with sub-step (2);
(4) the client prompts the user to say the second speech segment;
(5) the client records and, at the same time, detects the speech endpoint;
(6) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (7); if not, it continues with sub-step (5);
(7) the client sends the two speech segments to the server side together.
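The sub-steps above can be sketched in Python as a small client loop. This is only an illustrative sketch: the names `record_segment`, `collect_then_send`, `user_finished`, `prompt`, and `send` are hypothetical and do not come from the patent, and audio frames are represented by plain strings so that the control flow stays visible.

```python
from typing import Callable, Iterable, Iterator, List

Frame = str  # stand-in for one chunk of recorded audio (assumption)


def record_segment(frames: Iterator[Frame],
                   user_finished: Callable[[List[Frame]], bool]) -> List[Frame]:
    """Record frames until the endpoint detector reports that the user is done."""
    recorded: List[Frame] = []
    for frame in frames:
        recorded.append(frame)
        if user_finished(recorded):  # endpoint judgment made on the client
            break
    return recorded


def collect_then_send(frame_source: Iterable[Frame],
                      user_finished: Callable[[List[Frame]], bool],
                      prompt: Callable[[str], None],
                      send: Callable[[List[List[Frame]]], None]) -> None:
    """Prompt for two segments, record each, then upload both in one message."""
    frames = iter(frame_source)  # one shared stream, consumed across both segments
    prompt("Please say the first segment")
    first = record_segment(frames, user_finished)
    prompt("Please say the second segment")
    second = record_segment(frames, user_finished)
    send([first, second])  # a single upload carrying both segments
```

In this variant nothing reaches the server side until both segments are complete, so the server cannot begin recognizing the first segment while the user is still speaking the second.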
Alternatively, step (1) may comprise the following sub-steps:
(1) the client prompts the user to say the first speech segment;
(2) the client records and, at the same time, detects the speech endpoint;
(3) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (4); if not, it continues with sub-step (2);
(4) the client prompts the user to say the second speech segment and, at the same time, sends the first speech segment to the server side;
(5) the client records and, at the same time, detects the speech endpoint;
(6) the client judges from the endpoint detection result whether the user has finished speaking; if so, it proceeds to sub-step (7); if not, it continues with sub-step (5);
(7) the client sends the second speech segment to the server side.
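A minimal Python sketch of this pipelined variant follows, in which the first segment is uploaded in the background while the second is being recorded. The use of a thread, and the names `collect_and_stream`, `record_segment`, `user_finished`, `prompt`, and `send`, are assumptions made for illustration; the patent does not say how the concurrent upload is implemented.

```python
import threading
from typing import Callable, Iterable, Iterator, List

Frame = str  # stand-in for one chunk of recorded audio (assumption)


def record_segment(frames: Iterator[Frame],
                   user_finished: Callable[[List[Frame]], bool]) -> List[Frame]:
    """Record frames until the endpoint detector reports that the user is done."""
    recorded: List[Frame] = []
    for frame in frames:
        recorded.append(frame)
        if user_finished(recorded):
            break
    return recorded


def collect_and_stream(frame_source: Iterable[Frame],
                       user_finished: Callable[[List[Frame]], bool],
                       prompt: Callable[[str], None],
                       send: Callable[[int, List[Frame]], None]) -> None:
    """Upload segment 1 in the background while segment 2 is recorded."""
    frames = iter(frame_source)
    prompt("Please say the first segment")
    first = record_segment(frames, user_finished)

    # Start uploading segment 1 without blocking the next prompt and recording.
    uploader = threading.Thread(target=send, args=(1, first))
    uploader.start()

    prompt("Please say the second segment")
    second = record_segment(frames, user_finished)

    uploader.join()   # make sure segment 1 has gone out before segment 2
    send(2, second)
```

The point of this design is that the server side can recognize the first segment during the time the user spends speaking the second one, which hides part of the recognition latency.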
By designing a reasonable voice interaction flow on the client and having the user enter speech in segments, the song search method of the present invention lets the server side search hierarchically: one very large song search space is decomposed into a combination of several smaller song search spaces, which narrows the search range. Compared with existing song search methods, the method of the present invention can therefore effectively shorten search time, improve the accuracy of automatic speech recognition, and improve user experience; it also reduces server load and saves hardware cost for the service provider.
Description of drawings
The present invention is described in further detail below with reference to the drawings and an embodiment:
Fig. 1 is a flow chart of the embodiment of the invention;
Fig. 2 is a schematic diagram of the embodiment of the invention.
Embodiment
For a more specific understanding of the technical content, features, and effects of the present invention, it is described in detail below with reference to the illustrated embodiment:
Fig. 1 and Fig. 2 are, respectively, the flow chart and the schematic diagram of the embodiment. In this embodiment the singer name and the song title are used to search for a song, and the server side has a singer-name database and a song database organized by singer name. As shown in Fig. 1 and Fig. 2, when the user wants to search for the song titled "Legend" sung by the singer Wang Fei, the search proceeds as follows:
First, the client prompts the user to say who sings the song to be searched for, that is, the singer name. The user says the singer name "Wang Fei". While recording, the client uses a voice activity detection algorithm to detect the end of the speech signal in real time, in order to judge whether the user has finished this speech segment. Most voice activity detection algorithms in common use today identify the end of the speech signal from time-domain features such as speech signal energy and short-time zero-crossing rate. This judgment does not depend on any information from the server side and is therefore very fast. If the client judges that the user has not yet finished, it continues recording and detecting the speech endpoint; if it judges that the user has finished saying the singer name, it prompts the user to go on to say the song title and, at the same time, sends the singer-name speech signal to the server side over the network.
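As a rough illustration of endpoint detection from time-domain features, the Python sketch below computes short-time energy and zero-crossing rate per frame and declares the utterance finished after a run of non-speech frames that follows some speech. The thresholds, the `hangover` count, and all function names are hypothetical values chosen for the example; the patent does not specify a particular algorithm or parameters.

```python
from typing import Optional, Sequence


def short_time_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_speech(frame: Sequence[float],
              energy_threshold: float = 0.01,   # assumed value
              zcr_threshold: float = 0.3) -> bool:  # assumed value
    energy = short_time_energy(frame)
    if energy >= energy_threshold:
        return True  # loud enough to count as speech on energy alone
    # Weak but rapidly oscillating frames may be unvoiced consonants.
    return (energy >= energy_threshold / 10
            and zero_crossing_rate(frame) >= zcr_threshold)


def find_endpoint(frames: Sequence[Sequence[float]],
                  hangover: int = 3) -> Optional[int]:
    """Index of the frame at which the utterance is judged finished, else None."""
    started = False
    silent_run = 0
    for index, frame in enumerate(frames):
        if is_speech(frame):
            started = True
            silent_run = 0
        elif started:
            silent_run += 1
            if silent_run >= hangover:  # enough trailing silence after speech
                return index
    return None
```

Everything here runs on frames the client already holds, which matches the text's observation that the judgment needs no information from the server side.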
After the server side receives the singer-name speech signal sent by the client, it first uses a speech recognition algorithm to identify the text corresponding to the speech signal. It then starts the singer-name search engine and looks in the singer-name database for singer names close to the name "Wang Fei" sent by the client. The singer database is obviously much smaller than the song database, so a higher recognition rate can be expected. In this round of search, the system selects the several results closest to the user's speech (three in this embodiment: "Wang Fei", "Dou Wei", and "Wang Feng") and puts them in a singer-name candidate list.
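Building the candidate list can be sketched as an n-best selection over the singer-name vocabulary. In the Python sketch below, text similarity between romanized names (computed with the standard library's `difflib.SequenceMatcher`) stands in for whatever acoustic or recognition score a real system would use; that substitution, the lowercase romanization, and the names `similarity` and `n_best` are all assumptions made for illustration.

```python
import heapq
from difflib import SequenceMatcher
from typing import Iterable, List


def similarity(a: str, b: str) -> float:
    """Text similarity in [0, 1]; a stand-in for a recognizer's match score."""
    return SequenceMatcher(None, a, b).ratio()


def n_best(recognized: str, vocabulary: Iterable[str], n: int = 3) -> List[str]:
    """Return the n vocabulary entries closest to the recognized text."""
    return heapq.nlargest(n, vocabulary,
                          key=lambda name: similarity(recognized, name))
```

Keeping several candidates rather than only the top one leaves room for a recognition error in the first segment to be absorbed by the second round of search.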
Meanwhile, the client prompts the user to go on to say the song title, and the user says the song title "Legend". While recording, the client uses the voice activity detection algorithm to judge whether the user has finished. If the client judges that the user has not yet finished, it continues recording and detecting the speech endpoint; if it judges that the user has finished saying the song title, it sends the song-title speech signal to the server side over the network.
After the server side receives the song-title speech signal sent by the client, it first uses a speech recognition algorithm to identify the text corresponding to the speech signal. It then starts the song-title search engine and, within the song database organized by singer name, searches only the song sets of the three singers in the singer-name candidate list ("Wang Fei", "Dou Wei", and "Wang Feng"). This yields the several songs whose names are closest to the song title "Legend", for example "Wang Fei/Legend", "Dou Wei/Outside Window", and "Wang Feng/Loyalty", and the server side sends this search result to the client over the network.
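The two-layer search described here can be sketched in Python as follows. The sketch assumes the song database is a dictionary keyed by singer name, uses text similarity as a stand-in for a recognition score, and returns the single best title per candidate singer; the patent only says that the several closest songs are returned, so that last choice, like the names `hierarchical_search` and `songs_by_singer`, is an assumption.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(a: str, b: str) -> float:
    """Text similarity in [0, 1]; a stand-in for a recognizer's match score."""
    return SequenceMatcher(None, a, b).ratio()


def hierarchical_search(singer_text: str,
                        title_text: str,
                        songs_by_singer: Dict[str, List[str]],
                        n_singers: int = 3) -> List[Tuple[str, str]]:
    """Search singers first, then search titles only within the candidates."""
    # Layer 1: shortlist singers from the (small) singer-name vocabulary.
    candidates = sorted(songs_by_singer,
                        key=lambda singer: similarity(singer_text, singer),
                        reverse=True)[:n_singers]

    # Layer 2: match the title only against the shortlisted singers' songs.
    results: List[Tuple[str, str]] = []
    for singer in candidates:
        titles = songs_by_singer[singer]
        if titles:
            best = max(titles, key=lambda title: similarity(title_text, title))
            results.append((singer, best))
    return results
```

With a toy catalogue such as `{"wang fei": ["legend"], "wang feng": ["loyalty"], "dou wei": ["outside window"]}`, the title query is compared only with the songs of the shortlisted singers, never with the whole library, which is where the reduction in search space comes from.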
The client receives the song search result sent by the server side and shows it to the user as a list of songs. The user can then select a song from the list to download as needed, for example "Wang Fei/Legend".
In summary, through the interaction between client and server side described above, the voice sectional song search method of the present invention solves the problem that the recognition rate of automatic speech recognition falls quickly as the search space grows, while also improving search efficiency, saving hardware cost, and ensuring a good user experience.

Claims (3)

CN2010101692232A | 2010-05-07 | 2010-05-07 | Voice sectional song search method | Pending | CN102236686A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN2010101692232A (CN102236686A (en)) | 2010-05-07 | 2010-05-07 | Voice sectional song search method

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN2010101692232A (CN102236686A (en)) | 2010-05-07 | 2010-05-07 | Voice sectional song search method

Publications (1)

Publication Number | Publication Date
CN102236686A | 2011-11-09

Family

ID=44887341

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN2010101692232A (Pending, CN102236686A (en)) | Voice sectional song search method | 2010-05-07 | 2010-05-07

Country Status (1)

Country | Link
CN (1) | CN102236686A (en)

Legal Events

Code | Title
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
C12 | Rejection of a patent application after its publication
RJ01 | Rejection of invention patent application after publication

Application publication date: 2011-11-09

