CN108009303B - Search method and device based on voice recognition, electronic equipment and storage medium - Google Patents

Search method and device based on voice recognition, electronic equipment and storage medium

Info

Publication number
CN108009303B
Authority
CN
China
Prior art keywords
search
result
user
voice
speech recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711485685.3A
Other languages
Chinese (zh)
Other versions
CN108009303A (en)
Inventor
谢波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201711485685.3A
Publication of CN108009303A
Application granted
Publication of CN108009303B
Legal status: Active
Anticipated expiration


Abstract

Translated from Chinese

The invention discloses a search method and device based on speech recognition, an electronic device, and a computer-readable storage medium. The method includes: when it is detected that the user starts to input speech, acquiring the current speech data input by the user in real time; performing speech recognition on the current speech data acquired in real time to obtain the corresponding current intermediate text information; performing result prediction according to the current intermediate text information to obtain a target text result; and performing a search according to the target text result, obtaining the corresponding search result, and providing it to the user. The method recognizes and responds to the voice data input by the user in real time, without waiting for the user to finish all voice input and for the microphone to be turned off. This shortens the device's response time for voice recognition processing, thereby improving voice search efficiency and the user experience.

Figure 201711485685

Description

Search method and device based on voice recognition, electronic equipment and storage medium
Technical Field
The present invention relates to the field of voice search technologies, and in particular, to a search method and apparatus based on voice recognition, an electronic device, and a computer-readable storage medium.
Background
In the related art, a smart device usually stores the complete voice data only after the user has finished speaking, and then performs voice recognition on the complete voice data. For example, the smart device processes the voice data only after the user has finished speaking and clicked the confirmation key that ends the input, and the microphone of the smart device has been closed. This slows the smart device's response to voice recognition and results in low voice search efficiency.
Disclosure of Invention
The object of the present invention is to solve at least to some extent one of the above mentioned technical problems.
To this end, a first object of the present invention is to propose a search method based on speech recognition. The method recognizes and responds to the voice data input by the user in real time, without waiting for the user to finish all voice input and for the microphone to be closed. This shortens the device's response time for voice recognition processing, improves voice search efficiency, and improves the user experience.
A second object of the invention is to propose a search device based on speech recognition.
A third object of the invention is to propose an electronic device.
A fourth object of the invention is to propose a computer-readable storage medium.
In order to achieve the above object, a search method based on speech recognition according to an embodiment of the first aspect of the present invention includes: when detecting that a user starts to input voice, acquiring current voice data input by the user in real time; performing voice recognition on the current voice data acquired in real time to obtain corresponding current intermediate text information; predicting a result according to the current intermediate text information to obtain a target text result; and searching according to the target text result, acquiring a corresponding search result, and providing the corresponding search result for the user.
According to the search method based on speech recognition of the embodiment of the present invention, when it is detected that a user starts to input voice, the current voice data input by the user is acquired in real time; voice recognition is performed on that data to obtain the corresponding current intermediate text information; result prediction is performed according to the current intermediate text information to obtain a target text result; and a search is then performed according to the target text result, with the corresponding search result provided to the user. Because the voice data is recognized and responded to in real time, there is no need to wait for the user to finish all voice input and for the microphone to be closed. This shortens the device's response time for voice recognition processing, improves voice search efficiency, and improves the user experience.
In order to achieve the above object, a search device based on speech recognition according to an embodiment of the second aspect of the present invention includes: an acquisition module, configured to acquire current voice data input by a user in real time when it is detected that the user starts to input voice; a voice recognition module, configured to perform voice recognition on the current voice data acquired in real time to obtain corresponding current intermediate text information; a text result prediction module, configured to perform result prediction according to the current intermediate text information to obtain a target text result; a search module, configured to perform a search according to the target text result to obtain a corresponding search result; and a providing module, configured to provide the corresponding search result to the user.
According to the search device based on speech recognition of the embodiment of the present invention, the acquisition module acquires the current voice data input by the user in real time when it is detected that the user starts to input voice; the voice recognition module performs voice recognition on that data to obtain the corresponding current intermediate text information; the text result prediction module performs result prediction according to the current intermediate text information to obtain a target text result; the search module performs a search according to the target text result to obtain a corresponding search result; and the providing module provides the corresponding search result to the user. Because the voice data is recognized and responded to in real time, there is no need to wait for the user to finish all voice input and for the microphone to be closed. This shortens the device's response time for voice recognition processing, improves voice search efficiency, and improves the user experience.
In order to achieve the above object, an electronic device according to an embodiment of the third aspect of the present invention includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the search method based on speech recognition according to the embodiment of the first aspect of the present invention.
To achieve the above object, a non-transitory computer-readable storage medium is provided in an embodiment of a fourth aspect of the present invention, on which a computer program is stored, and the computer program, when executed by a processor, implements the search method based on speech recognition according to the embodiment of the first aspect of the present invention.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow diagram of a method of search based on speech recognition according to one embodiment of the present invention;
FIG. 2 is an exemplary diagram of a search method based on speech recognition according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a search apparatus based on speech recognition according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a search apparatus based on speech recognition according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a search apparatus based on speech recognition according to another embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer throughout to the same or similar elements or to elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended to explain the invention, and are not to be construed as limiting the invention.
A search method, apparatus, electronic device, and computer-readable storage medium based on speech recognition according to embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a search method based on speech recognition according to an embodiment of the present invention. It should be noted that the search method based on speech recognition according to the embodiment of the present invention is applicable to the search apparatus based on speech recognition according to the embodiment of the present invention, and the search apparatus may be configured in an electronic device.
As shown in fig. 1, the search method based on speech recognition may include:
S110: when it is detected that the user starts to input voice, the current voice data input by the user is acquired in real time.
For example, assume that the search method based on speech recognition according to the embodiment of the present invention is applied to an electronic device that provides a voice input module for the user. The voice input module may be a microphone or another component with a voice acquisition function, such as a sound box, so that the user can input voice through it. When it is detected that the user has started to input voice through the voice input module, the current voice data input by the user can be acquired in real time. That is, because speech is produced sequentially over time, the current voice data can be acquired in real time while the user is still speaking.
S120: voice recognition is performed on the current voice data acquired in real time to obtain corresponding current intermediate text information.
Optionally, the current voice data obtained in real time may be subjected to voice recognition by a voice recognition technology to obtain a corresponding text, and the text is used as the current intermediate text information corresponding to the current voice data.
S130: result prediction is performed according to the current intermediate text information to obtain a target text result.
Optionally, the user's voice input intention, that is, which search the user wants to perform by voice, may be predicted according to the current intermediate text information, and a corresponding target text result may be predicted according to that intention, so that a search can subsequently be performed according to the target text result.
As an example implementation, result prediction may be performed on the current intermediate text information according to a pre-established prediction model to obtain the corresponding search keyword sample with the highest usage rate, and that search keyword sample is used as the target text result. In an embodiment of the present invention, the prediction model is obtained by training on a plurality of search keyword samples and the usage rates corresponding to those samples.
That is, the prediction model may be established in advance by training on a plurality of search keyword samples and their corresponding usage rates. In practical application, result prediction can then be performed on the current intermediate text information through the prediction model to obtain the corresponding search keyword sample with the highest usage rate, which can be understood as the search keyword sample most likely to be searched for. Finally, that search keyword sample is used as the target text result.
For example, suppose the current intermediate text information corresponding to the current speech data is "weather", and the prediction model includes search keyword samples such as "weather forecast", "weather forecast 15-day query", "beijing weather", and "shanghai weather", whose usage rates are 90%, 85%, 50%, and 40%, respectively. Result prediction on the current intermediate text information "weather" through the prediction model yields the search keyword sample with the highest usage rate, "weather forecast", which can then be used as the target text result.
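The weather example can be sketched in code. This is only a minimal illustration, not the patented prediction model: it assumes the trained model can be approximated by a lookup table of keyword samples and usage rates, and that a sample is a candidate when it contains the intermediate text. The names `USAGE_RATES` and `predict_target_text` are hypothetical.

```python
# Hypothetical usage-rate table; the values mirror the weather example above.
USAGE_RATES = {
    "weather forecast": 0.90,
    "weather forecast 15-day query": 0.85,
    "beijing weather": 0.50,
    "shanghai weather": 0.40,
}


def predict_target_text(intermediate_text, usage_rates=USAGE_RATES):
    """Return the matching keyword sample with the highest usage rate.

    Assumption: a sample matches when it contains the intermediate text.
    Returns None when the text is empty or nothing matches.
    """
    query = intermediate_text.strip().lower()
    if not query:
        return None
    candidates = [kw for kw in usage_rates if query in kw]
    if not candidates:
        return None
    # Pick the candidate whose recorded usage rate is largest.
    return max(candidates, key=usage_rates.get)


print(predict_target_text("weather"))  # weather forecast
```

A real system would presumably replace the substring test and the static table with the trained model described in the text; the selection of the highest-usage candidate is the only part taken directly from the description.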
In order to ensure the accuracy of speech recognition, optionally, in an embodiment of the present invention, in the process of performing result prediction according to the current intermediate text information to obtain the target text result, next speech data input by the user may be further acquired, speech recognition is performed on the next speech data to obtain corresponding intermediate text information, and the result prediction is calibrated according to the intermediate text information corresponding to the next speech data.
Optionally, in the process of performing result prediction according to the current intermediate text information, the next voice data input by the user may be obtained in real time, the next voice data is subjected to voice recognition through a voice recognition technology to obtain corresponding intermediate text information, and the prediction result when performing result prediction on the current intermediate text information is calibrated according to the intermediate text information.
For example, suppose the current intermediate text information is "weather" and the result predicted from it is "weather forecast". At this point the next voice data input by the user may also be acquired, and voice recognition may be performed on it to obtain the corresponding intermediate text information "early warning". The earlier prediction "weather forecast", made from the previous intermediate text information "weather", can then be calibrated according to the intermediate text information "early warning" to obtain the text result "weather early warning". In this way, while result prediction is being performed according to the current intermediate text information, the previous prediction can be calibrated using the intermediate text information corresponding to the next voice data, which improves voice recognition efficiency while preserving recognition accuracy.
S140: a search is performed according to the target text result, a corresponding search result is obtained, and the corresponding search result is provided to the user.
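The description does not specify how the calibration is computed, so the following is only one possible sketch under stated assumptions: the earlier prediction is kept if it is still consistent with the combined intermediate text, otherwise the best matching keyword sample is chosen, and if none matches the combined text itself is used. The function `calibrate` and the table `USAGE_RATES` are hypothetical names.

```python
# Hypothetical usage-rate table (values taken from the weather example).
USAGE_RATES = {
    "weather forecast": 0.90,
    "weather forecast 15-day query": 0.85,
}


def calibrate(previous_text, previous_prediction, next_text,
              usage_rates=USAGE_RATES):
    """Calibrate an earlier prediction using newly recognized text.

    Assumed policy (not stated in the source):
    1. keep the earlier prediction if it still contains the combined text;
    2. otherwise pick the highest-usage keyword containing the combined text;
    3. otherwise fall back to the combined text itself.
    """
    combined = f"{previous_text} {next_text}".strip().lower()
    if combined in previous_prediction.lower():
        return previous_prediction
    candidates = [kw for kw in usage_rates if combined in kw]
    if candidates:
        return max(candidates, key=usage_rates.get)
    return combined


print(calibrate("weather", "weather forecast", "early warning"))
# weather early warning
```

With the inputs from the example, the earlier prediction "weather forecast" no longer fits once "early warning" is heard, no stored sample matches, and the fallback yields "weather early warning".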
As an example implementation manner, when a target text result is obtained, a search may be performed according to the target text result to obtain a corresponding search result, and then a format type of the search result may be determined, a corresponding presentation manner may be determined according to the format type, and the search result may be presented to the user according to the corresponding presentation manner.
For example, when the format type is the MP3 format, the corresponding presentation mode is determined to be a playing mode, and the search result is played to the user through an audio playing module; when the format type is a TTS (text to speech) format (such as a weather forecast), the corresponding presentation mode is determined to be voice broadcast together with text presentation, and the search result is provided to the user in that way.
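The mapping from format type to presentation mode can be sketched as a simple lookup. This is an illustrative sketch only: the enum `Presentation`, the table `PRESENTATION_BY_FORMAT`, and the lowercase format keys are assumptions, and only the two format types named in the text are covered.

```python
from enum import Enum


class Presentation(Enum):
    """Presentation modes named in the description."""
    AUDIO_PLAYBACK = "audio playback"
    VOICE_AND_TEXT = "voice broadcast and text"


# Hypothetical lookup table; the keys are assumed format-type labels.
PRESENTATION_BY_FORMAT = {
    "mp3": Presentation.AUDIO_PLAYBACK,
    "tts": Presentation.VOICE_AND_TEXT,
}


def choose_presentation(format_type):
    """Return the presentation mode for a search result's format type."""
    key = format_type.strip().lower()
    if key not in PRESENTATION_BY_FORMAT:
        raise ValueError(f"no presentation mode defined for {format_type!r}")
    return PRESENTATION_BY_FORMAT[key]


print(choose_presentation("MP3").value)  # audio playback
```

Raising an error for unknown formats is a design choice of this sketch; the description does not say what happens for other format types.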
For example, as shown in fig. 2, it is assumed that the search method based on speech recognition according to the embodiment of the present invention is applied to an intelligent robot, and the intelligent robot has a sound box therein, and sound of a surrounding environment can be collected through the sound box. When the voice input of a user is detected, the current voice data input by the user can be obtained in real time through the sound box, the voice recognition system is used for carrying out voice recognition on the current voice data to obtain corresponding current intermediate text information, result prediction is carried out on the current intermediate text information to obtain a target text result, then searching can be carried out in a resource library according to the target text result to obtain a corresponding search result, the format type of the search result is determined, a corresponding display mode is determined according to the format type, and the search result is displayed to the user through the sound box according to the corresponding display mode.
In order to improve the usability and feasibility of the present invention, optionally, in an embodiment of the present invention, before the search is performed according to the target text result, it may be determined whether the user ends the voice input, and when the user ends the voice input, the search is performed according to the target text result.
In the embodiment of the present invention, whether the user has ended the voice input may be determined as follows: when it is detected that the user starts inputting voice, the voice feature of the user is extracted from the initial voice input. While the user's voice is being acquired, this voice feature is used to judge in real time whether the collected audio contains sound emitted by the user. If the currently collected audio does not contain sound emitted by the user, it can be determined that the user has ended the voice input.
To further improve the accuracy of this determination, optionally, in an embodiment of the present invention, the user's voice feature is extracted in the same way and used to judge in real time whether the collected audio contains sound emitted by the user. If the currently collected audio does not contain sound emitted by the user, and no audio containing sound emitted by the user has been collected for a certain period of time, it can be determined that the user has ended the voice input.
According to the search method based on speech recognition of the embodiment of the present invention, when it is detected that a user starts to input voice, the current voice data input by the user is acquired in real time; voice recognition is performed on that data to obtain the corresponding current intermediate text information; result prediction is performed according to the current intermediate text information to obtain a target text result; and a search is then performed according to the target text result, with the corresponding search result provided to the user. Because the voice data is recognized and responded to in real time, there is no need to wait for the user to finish all voice input and for the microphone to be closed. This shortens the device's response time for voice recognition processing, improves voice search efficiency, and improves the user experience.
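The time-based end-of-input check can be sketched as follows. The sketch assumes that an upstream step (not shown, and not specified in the description) has already compared each audio frame against the user's voice feature and produced a boolean per frame. The frame length `frame_ms` and the threshold `silence_timeout_ms` are hypothetical values chosen only for illustration.

```python
def user_finished_speaking(frames, frame_ms=20, silence_timeout_ms=800):
    """Decide whether the user has ended voice input.

    frames: iterable of booleans in time order, True when the frame was
    judged to contain the user's voice (assumed to come from a separate
    voice-feature matching step).

    Returns True when the most recent run of frames without the user's
    voice lasts at least silence_timeout_ms.
    """
    silent_ms = 0
    for has_user_voice in frames:
        # Any frame with the user's voice resets the silence counter.
        silent_ms = 0 if has_user_voice else silent_ms + frame_ms
    return silent_ms >= silence_timeout_ms


speech_then_pause = [True] * 10 + [False] * 40   # 800 ms without the user
print(user_finished_speaking(speech_then_pause))  # True
```

Setting `silence_timeout_ms` to the length of a single frame would reproduce the simpler variant described first, where one frame without the user's voice is enough to end the input.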
Corresponding to the search methods based on speech recognition provided in the above embodiments, an embodiment of the present invention further provides a search apparatus based on speech recognition. Because the apparatus corresponds to those methods, the implementations described for the method also apply to the apparatus and are not described again in detail here. Fig. 3 is a schematic structural diagram of a search apparatus based on speech recognition according to an embodiment of the present invention. As shown in fig. 3, the speech recognition-based search apparatus 300 may include: an acquisition module 310, a speech recognition module 320, a text result prediction module 330, a search module 340, and a providing module 350.
Specifically, the obtaining module 310 is configured to obtain current voice data input by the user in real time when it is detected that the user starts inputting voice.
The speech recognition module 320 is configured to perform speech recognition on the current speech data acquired in real time to obtain corresponding current intermediate text information.
The text result prediction module 330 is configured to perform result prediction according to the current intermediate text information to obtain a target text result. As an example implementation, the text result prediction module 330 may perform result prediction on the current intermediate text information according to a pre-established prediction model to obtain the corresponding search keyword sample with the highest usage rate, and take that search keyword sample as the target text result, where the prediction model is obtained by training on a plurality of search keyword samples and the usage rates corresponding to those samples.
The search module 340 is configured to perform a search according to the target text result to obtain a corresponding search result.
The providing module 350 is used for providing the corresponding search results to the user. As an example, as shown in fig. 4, the providing module 350 may include a determining unit 351 and a providing unit 352. The determining unit 351 is configured to determine a format type of the search result. The providing unit 352 is configured to determine a corresponding presentation manner according to the format type, and present the search result to the user according to the corresponding presentation manner.
For example, when the format type is MP3 format, the providing unit 352 may determine that the corresponding presentation mode is a playing mode, and play the search result to the user through an audio playing module; when the format type is a TTS format, the providing unit 352 may determine that the corresponding presentation manner is a voice broadcast and text presentation manner, and provide the search result to the user through the voice broadcast and text presentation manner.
In order to guarantee the accuracy of the speech recognition, optionally, in an embodiment of the present invention, as shown in fig. 5, the speech recognition-based search apparatus 300 may further include: prediction result calibration module 360. In an embodiment of the present invention, the obtaining module 310 is further configured to obtain next voice data input by the user; the voice recognition module 320 is further configured to perform voice recognition on the next voice data to obtain corresponding intermediate text information; the prediction result calibration module 360 is configured to calibrate the result prediction according to the intermediate text information corresponding to the next speech data in the process of performing the result prediction according to the current intermediate text information to obtain the target text result.
According to the search device based on speech recognition of the embodiment of the present invention, the acquisition module acquires the current voice data input by the user in real time when it is detected that the user starts to input voice; the voice recognition module performs voice recognition on that data to obtain the corresponding current intermediate text information; the text result prediction module performs result prediction according to the current intermediate text information to obtain a target text result; the search module performs a search according to the target text result to obtain a corresponding search result; and the providing module provides the corresponding search result to the user. Because the voice data is recognized and responded to in real time, there is no need to wait for the user to finish all voice input and for the microphone to be closed. This shortens the device's response time for voice recognition processing, improves voice search efficiency, and improves the user experience.
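How the modules cooperate can be sketched as a small pipeline. This is a hedged illustration, not the apparatus itself: the class `SearchDevice`, its field names, and the stub functions are hypothetical, real recognition and search back ends are replaced by placeholders, and the search is run once after the input stream ends, following the optional embodiment in which the search waits until the user has finished speaking.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class SearchDevice:
    """Wires together stand-ins for modules 320 to 350."""
    recognize: Callable[[bytes], str]         # speech recognition module 320
    predict: Callable[[str], Optional[str]]   # text result prediction module 330
    search: Callable[[str], List[str]]        # search module 340
    provide: Callable[[List[str]], None]      # providing module 350

    def handle_stream(self, chunks: Iterable[bytes]) -> Optional[str]:
        """Consume audio chunks as they arrive (role of acquisition module 310)."""
        text = ""
        target = None
        for chunk in chunks:
            # Recognize each chunk immediately instead of waiting for the end.
            text = f"{text} {self.recognize(chunk)}".strip()
            # Re-predict on the growing text; fall back to the raw text.
            target = self.predict(text) or text
        if target:
            self.provide(self.search(target))
        return target


# Stub back ends, for illustration only.
shown: List[str] = []
device = SearchDevice(
    recognize=lambda chunk: chunk.decode("utf-8"),
    predict=lambda text: {"weather": "weather forecast"}.get(text),
    search=lambda query: [f"result for {query}"],
    provide=shown.extend,
)
print(device.handle_stream([b"weather"]))  # weather forecast
```

Passing the modules in as callables keeps the sketch independent of any particular recognition engine or search service, which the description leaves open.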
In order to implement the above embodiments, the present invention further provides an electronic device.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the invention. It should be noted that, in the embodiment of the present invention, the electronic device may be a device having a speech recognition system and a search function, so as to implement a speech search function. For example, the electronic equipment can be an intelligent robot, and human-computer voice interaction with a user is realized; as another example, the electronic device can also be a search server with voice search.
As shown in fig. 6, the electronic device 600 may include: a memory 610, a processor 620, and a computer program 630 stored in the memory 610 and operable on the processor 620, wherein the processor 620 executes the program 630 to implement the search method based on speech recognition according to any of the above embodiments of the present invention.
In order to implement the above embodiments, the present invention also proposes a non-transitory computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the speech recognition based search method according to any of the above embodiments of the present invention.
In the description of the present invention, it is to be understood that the terms "first", "second" and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, such uses of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, various embodiments or examples, and features of different embodiments or examples, described in this specification can be combined by one skilled in the art provided they do not contradict each other.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
Those skilled in the art will understand that all or part of the steps of the method embodiments above may be carried out by a program that instructs the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.

Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and should not be construed as limiting the present invention. Those of ordinary skill in the art may make variations, modifications, substitutions and alterations to the above embodiments within the scope of the present invention.

Claims (10)

Translated from Chinese
1. A search method based on speech recognition, characterized by comprising the following steps:
upon detecting that a user starts to input voice, acquiring in real time the current voice data input by the user;
performing speech recognition on the current voice data acquired in real time to obtain corresponding current intermediate text information;
performing result prediction according to the current intermediate text information to obtain a target text result;
searching according to the target text result, obtaining a corresponding search result, and providing the corresponding search result to the user;
wherein performing result prediction according to the current intermediate text information to obtain the target text result comprises:
performing result prediction on the current intermediate text information according to a pre-established prediction model to obtain the corresponding search keyword sample with the highest usage rate, wherein the prediction model is obtained by training on a plurality of search keyword samples and the usage rates corresponding to the plurality of search keyword samples;
taking the corresponding search keyword sample with the highest usage rate as the target text result.

2. The search method based on speech recognition according to claim 1, characterized in that, in the process of performing result prediction according to the current intermediate text information to obtain the target text result, the method further comprises:
acquiring the next voice data input by the user;
performing speech recognition on the next voice data to obtain corresponding intermediate text information;
calibrating the result prediction according to the intermediate text information corresponding to the next voice data.

3. The search method based on speech recognition according to claim 1, characterized in that providing the corresponding search result to the user comprises:
determining the format type of the search result;
determining a corresponding presentation mode according to the format type, and presenting the search result to the user according to the corresponding presentation mode.

4. The search method based on speech recognition according to claim 3, characterized in that determining the corresponding presentation mode according to the format type, and presenting the search result to the user according to the corresponding presentation mode, comprises:
when the format type is the MP3 format, determining that the corresponding presentation mode is playback, and playing the search result to the user through an audio playback module;
when the format type is the TTS format, determining that the corresponding presentation mode is voice broadcast together with text presentation, and providing the search result to the user by voice broadcast and text presentation.

5. A search device based on speech recognition, characterized by comprising:
an acquisition module, configured to acquire in real time the current voice data input by a user upon detecting that the user starts to input voice;
a speech recognition module, configured to perform speech recognition on the current voice data acquired in real time to obtain corresponding current intermediate text information;
a text result prediction module, configured to perform result prediction according to the current intermediate text information to obtain a target text result;
a search module, configured to search according to the target text result and obtain a corresponding search result;
a providing module, configured to provide the corresponding search result to the user;
wherein the text result prediction module is specifically configured to:
perform result prediction on the current intermediate text information according to a pre-established prediction model to obtain the corresponding search keyword sample with the highest usage rate, wherein the prediction model is obtained by training on a plurality of search keyword samples and the usage rates corresponding to the plurality of search keyword samples;
take the corresponding search keyword sample with the highest usage rate as the target text result.

6. The search device based on speech recognition according to claim 5, characterized in that the device further comprises a prediction result calibration module;
wherein the acquisition module is further configured to acquire the next voice data input by the user;
the speech recognition module is further configured to perform speech recognition on the next voice data to obtain corresponding intermediate text information;
the prediction result calibration module is configured to calibrate the result prediction according to the intermediate text information corresponding to the next voice data, in the process of performing result prediction according to the current intermediate text information to obtain the target text result.

7. The search device based on speech recognition according to claim 5, characterized in that the providing module comprises:
a determining unit, configured to determine the format type of the search result;
a providing unit, configured to determine a corresponding presentation mode according to the format type, and present the search result to the user according to the corresponding presentation mode.

8. The search device based on speech recognition according to claim 7, characterized in that the providing unit is specifically configured to:
when the format type is the MP3 format, determine that the corresponding presentation mode is playback, and play the search result to the user through an audio playback module;
when the format type is the TTS format, determine that the corresponding presentation mode is voice broadcast together with text presentation, and provide the search result to the user by voice broadcast and text presentation.

9. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that, when the processor executes the program, the search method based on speech recognition according to any one of claims 1 to 4 is implemented.

10. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the search method based on speech recognition according to any one of claims 1 to 4 is implemented.
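
The prediction, calibration and presentation steps recited in claims 1 to 4 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claims call for a trained prediction model, whereas this sketch substitutes a plain prefix lookup over a hand-written usage-rate table. The names USAGE_RATES, predict_target, streaming_predictions and presentation_mode, the sample keywords and their rates, and the mode labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical keyword samples and usage rates. The claims describe a model
# trained on such samples; a fixed table stands in for that model here.
USAGE_RATES: Dict[str, float] = {
    "weather today": 0.42,
    "weather tomorrow": 0.31,
    "weather in beijing": 0.18,
    "web browser download": 0.09,
}


def predict_target(intermediate_text: str,
                   usage_rates: Dict[str, float]) -> Optional[str]:
    """Return the keyword sample with the highest usage rate among those
    that begin with the intermediate text, or None if nothing matches."""
    candidates = [k for k in usage_rates if k.startswith(intermediate_text)]
    if not candidates:
        return None
    return max(candidates, key=lambda k: usage_rates[k])


def streaming_predictions(partials: Iterable[str],
                          usage_rates: Dict[str, float]) -> List[Optional[str]]:
    """Re-run the prediction each time a new intermediate text arrives.
    This is an assumed, simple reading of the calibration in claim 2:
    later recognition output replaces the earlier prediction."""
    return [predict_target(p, usage_rates) for p in partials]


def presentation_mode(format_type: str) -> str:
    """Map a search result's format type to a presentation mode (claim 4).
    Only MP3 and TTS are named in the claims; rejecting other formats is
    a choice made for this sketch."""
    if format_type == "MP3":
        return "audio_playback"
    if format_type == "TTS":
        return "voice_broadcast_and_text"
    raise ValueError(f"unsupported format type: {format_type}")
```

In this sketch, each partial transcript produced while the user is still speaking is passed to predict_target, so a search can be started on the predicted keyword before the microphone closes; a later partial transcript that no longer matches the earlier prediction simply yields a different keyword.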
CN201711485685.3A | 2017-12-30 | 2017-12-30 | Search method and device based on voice recognition, electronic equipment and storage medium | Active | CN108009303B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201711485685.3A (CN108009303B (en)) | 2017-12-30 | 2017-12-30 | Search method and device based on voice recognition, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201711485685.3A (CN108009303B (en)) | 2017-12-30 | 2017-12-30 | Search method and device based on voice recognition, electronic equipment and storage medium

Publications (2)

Publication Number | Publication Date
CN108009303A (en) | 2018-05-08
CN108009303B (en) | 2021-09-14

Family

ID=62049692

Family Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201711485685.3A (Active; CN108009303B (en)) | 2017-12-30 | 2017-12-30 | Search method and device based on voice recognition, electronic equipment and storage medium

Country Status (1)

Country | Link
CN (1) | CN108009303B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111291168A (en) * | 2018-12-07 | 2020-06-16 | 北大方正集团有限公司 | Book retrieval method, device and readable storage medium
CN109979440B (en) * | 2019-03-13 | 2021-05-11 | 广州市网星信息技术有限公司 | Keyword sample determination method, voice recognition method, device, equipment and medium
CN110012166B (en) * | 2019-03-31 | 2021-02-19 | 联想(北京)有限公司 | Information processing method and device
CN110287303B (en) * | 2019-06-28 | 2021-08-20 | 北京猎户星空科技有限公司 | Man-machine conversation processing method, device, electronic equipment and storage medium
CN110517673B (en) * | 2019-07-18 | 2023-08-18 | 平安科技(深圳)有限公司 | Speech recognition method, device, computer equipment and storage medium
KR20210034276A (en) * | 2019-09-20 | 2021-03-30 | 현대자동차주식회사 | Dialogue system, dialogue processing method and electronic apparatus
CN112825248B (en) * | 2019-11-19 | 2024-08-02 | 阿里巴巴集团控股有限公司 | Voice processing method, model training method, interface display method and equipment
CN111045836B (en) * | 2019-11-25 | 2023-05-09 | 腾讯科技(深圳)有限公司 | Search method, search device, electronic equipment and computer readable storage medium
CN111916082B (en) * | 2020-08-14 | 2024-07-09 | 腾讯科技(深圳)有限公司 | Voice interaction method, device, computer equipment and storage medium
CN113704384A (en) * | 2021-08-27 | 2021-11-26 | 挂号网(杭州)科技有限公司 | Method and device for generating code through voice recognition, electronic equipment and storage medium
CN114360530B (en) * | 2021-11-30 | 2024-08-27 | 北京罗克维尔斯科技有限公司 | Voice test method, device, computer equipment and storage medium
CN114021579B (en) * | 2022-01-05 | 2022-04-19 | 浙江口碑网络技术有限公司 | Object prediction method, device, electronic equipment and computer readable storage medium
CN114579841A (en) * | 2022-01-27 | 2022-06-03 | 北京声智科技有限公司 | Audio processing method, device, equipment, storage medium and computer program product
CN114743540B (en) * | 2022-04-14 | 2025-07-29 | 携程旅游信息技术(上海)有限公司 | Speech recognition method, system, electronic device and storage medium
CN117271752B (en) * | 2023-11-17 | 2024-02-27 | 炫我信息技术(北京)有限公司 | Data processing method and device and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101404035A (en) * | 2008-11-21 | 2009-04-08 | 北京得意音通技术有限责任公司 | Information search method based on text or voice
CN103761261A (en) * | 2013-12-31 | 2014-04-30 | 北京紫冬锐意语音科技有限公司 | Voice recognition based media search method and device
CN104794218A (en) * | 2015-04-28 | 2015-07-22 | 百度在线网络技术(北京)有限公司 | Voice searching method and device
CN104813275A (en) * | 2012-09-27 | 2015-07-29 | 谷歌公司 | Methods and systems for predicting a text
CN105895090A (en) * | 2016-03-30 | 2016-08-24 | 乐视控股(北京)有限公司 | Voice signal processing method and device
CN107093425A (en) * | 2017-03-30 | 2017-08-25 | 安徽继远软件有限公司 | Speech guide system, audio recognition method and the voice interactive method of power system
CN107357875A (en) * | 2017-07-04 | 2017-11-17 | 北京奇艺世纪科技有限公司 | A kind of voice search method, device and electronic equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US7904298B2 (en) * | 2006-11-17 | 2011-03-08 | Rao Ashwin P | Predictive speech-to-text input
US9046917B2 (en) * | 2012-05-17 | 2015-06-02 | Sri International | Device, method and system for monitoring, predicting, and accelerating interactions with a computing device

Also Published As

Publication number | Publication date
CN108009303A (en) | 2018-05-08

Similar Documents

Publication | Title
CN108009303B (en) | Search method and device based on voice recognition, electronic equipment and storage medium
CN107919130B (en) | Cloud-based voice processing method and device
US12159622B2 (en) | Text independent speaker recognition
CN110544473B (en) | Voice interaction method and device
CN109669663B (en) | Method and device for acquiring range amplitude, electronic equipment and storage medium
US20140172429A1 (en) | Local recognition of content
JP2020079921A (en) | Method, apparatus, computer device and program for realizing voice interaction
CN108563655B (en) | Text-based event recognition method and device
CN105279227B (en) | Method and device for processing voice search of homophone
CN107203265B (en) | Information interaction method and device
US20220047954A1 (en) | Game playing method and system based on a multimedia file
CN109947993A (en) | Plot jump method, device and computer equipment based on speech recognition
CN110875059A (en) | Method and device for judging reception end and storage device
CN117253478A (en) | Voice interaction method and related device
JP7717822B2 (en) | Audio similarity determination method, device, and program product
CN110070891B (en) | Song identification method and device and storage medium
CN104239442A (en) | Method and device for representing search results
CN110019848A (en) | Conversation interaction method and device and robot
CN111724781A (en) | Audio data storage method, device, terminal and storage medium
CN110704592B (en) | Statement analysis processing method, apparatus, computer equipment and storage medium
CN110322587B (en) | Evaluation recording method, device and equipment in driving process and storage medium
US10693944B1 (en) | Media-player initialization optimization
CN113129902B (en) | Voice processing method and device, electronic equipment and storage medium
CN111063356B (en) | Electronic equipment response method and system, sound box and computer readable storage medium
US20250315200A1 (en) | Lyric-based information prompting method and apparatus, device, medium and product

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
