






Technical Field
The present invention belongs to the technical field of voice data processing, and in particular relates to a communication compensation method and device for voice dialogue.
Background
In the related art, real-time analysis of a voice dialogue requires the communication connection to be maintained throughout; if communication is interrupted, analysis of the current voice dialogue can continue only after the connection has been re-established.
Voice dialogue analysis currently available on the market falls broadly into offline analysis and real-time analysis.
The inventors found that existing solutions have at least the following problems: offline analysis can only analyze a complete audio recording, which is time-consuming and does not allow the correctness of the analysis to be confirmed in real time; real-time analysis depends heavily on communication stability, and when communication is interrupted the analysis stops immediately and cannot continue.
Summary of the Invention
Embodiments of the present invention provide a communication compensation method and device for voice dialogue, which solve at least one of the above technical problems.
In a first aspect, an embodiment of the present invention provides a communication compensation method for voice dialogue, comprising: in response to a user starting a voice dialogue analysis task during communication, connecting to an analysis server to analyze the user's voice dialogue in real time and obtaining a first real-time analysis result; obtaining the communication connection status during the current communication; if the connection is lost during the communication, switching from the real-time analysis task to a recording task to record the user's audio while the connection is down; if the connection is restored during recording, switching from the recording task back to the real-time analysis task and obtaining a second real-time analysis result; uploading and analyzing the recorded audio to generate a first recording analysis result; and merging the first real-time analysis result, the first recording analysis result, and the second real-time analysis result to obtain a complete analysis result.
In a second aspect, an embodiment of the present invention provides a communication compensation device for voice dialogue, comprising: a first analysis module, configured to, in response to a user starting a voice dialogue analysis task during communication, connect to an analysis server to analyze the user's voice dialogue in real time and obtain a first real-time analysis result; a connection status acquisition module, configured to obtain the communication connection status during the current communication; a recording module, configured to, if the connection is lost during the communication, switch from the real-time analysis task to a recording task to record the user's audio while the connection is down; a second analysis module, configured to, if the connection is restored during recording, switch from the recording task back to the real-time analysis task and obtain a second real-time analysis result; a recording analysis module, configured to upload and analyze the recorded audio to generate a first recording analysis result; and a merging module, configured to merge the first real-time analysis result, the first recording analysis result, and the second real-time analysis result to obtain a complete analysis result.
In a third aspect, an electronic device is provided, comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the steps of the communication compensation method for voice dialogue of any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the steps of the communication compensation method for voice dialogue of any embodiment of the present invention.
With the method and device of the present application, users need not worry about incomplete voice dialogue analysis results under any communication conditions; even when the connection is poor, the time required is greatly reduced compared with recording the whole dialogue and then uploading it for analysis.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a communication compensation method for voice dialogue provided by an embodiment of the present invention;
FIG. 2 is a flowchart of another communication compensation method for voice dialogue provided by an embodiment of the present invention;
FIG. 3 is a flowchart of yet another communication compensation method for voice dialogue provided by an embodiment of the present invention;
FIG. 4 is a flowchart of still another communication compensation method for voice dialogue provided by an embodiment of the present invention;
FIG. 5 is a flowchart of a specific example of the communication compensation method for voice dialogue provided by an embodiment of the present invention;
FIG. 6 is a block diagram of a communication compensation device for voice dialogue provided by an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
FIG. 1 shows a flowchart of an embodiment of the communication compensation method for voice dialogue of the present application. The method of this embodiment is applicable to terminals with communication or real-time voice dialogue functions, such as smartphones, tablets, and computers.
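As shown in FIG. 1, in step 101, in response to a user starting a voice dialogue analysis task during communication, an analysis server is connected to analyze the user's voice dialogue in real time, and a first real-time analysis result is obtained;
In step 102, the communication connection status during the current communication is obtained;
In step 103, if the connection is lost during the communication, the real-time analysis task is switched to a recording task to record the user's audio while the connection is down;
In step 104, if the connection is restored during recording, the recording task is switched back to the real-time analysis task, and a second real-time analysis result is obtained;
In step 105, the recorded audio is uploaded and analyzed to generate a first recording analysis result;
In step 106, the first real-time analysis result, the first recording analysis result, and the second real-time analysis result are merged to obtain a complete analysis result.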
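In this embodiment, for step 101, after the user starts the voice dialogue analysis task during communication, the communication compensation device connects to the analysis server to analyze the user's voice dialogue in real time and obtains the first real-time analysis result. The voice dialogue analysis task comprises a real-time analysis task and a recording task. For step 102, the device continuously obtains the communication connection status during the current communication and determines whether the connection is currently lost or normal. For step 103, if the connection is lost during communication, the device switches from the real-time analysis task to the recording task to record the user's audio while the connection is down, so that the portion that cannot be analyzed in real time over the network is recorded and uploaded for analysis later. For step 104, if the connection is restored during recording, the device switches from the recording task back to the real-time analysis task and obtains the second real-time analysis result; that is, once the connection is restored, real-time analysis resumes. For step 105, the device uploads the audio recorded while real-time analysis was interrupted and has it analyzed to generate the first recording analysis result. Finally, for step 106, the device merges the first real-time analysis result, the first recording analysis result, and the second real-time analysis result to obtain the complete analysis result. In practice, multiple interruptions may occur, so there may be more than one first recording analysis result and more than one second real-time analysis result; the present application places no limit on this. All the results are then concatenated in chronological order to form the complete analysis result.
The method of this embodiment records audio while communication is interrupted, then uploads and analyzes the recording, and concatenates the recording analysis result with the real-time analysis results to form a complete analysis result. Therefore, even if the user's connection is unstable, a complete voice analysis result can still be obtained, which improves the user experience.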
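The chronological merge of step 106 can be illustrated with a minimal Python sketch. It assumes that every analysis result carries a start time relative to the beginning of the task; the `Segment` type, its fields, and `merge_results` are hypothetical names introduced only for this illustration and do not appear in the embodiments.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # assumed: seconds since the analysis task started
    source: str   # "realtime" or "recording"
    text: str     # analysis result for this stretch of the dialogue


def merge_results(realtime, recorded):
    """Concatenate real-time and recording results in chronological order."""
    return sorted([*realtime, *recorded], key=lambda seg: seg.start)


# Two real-time stretches separated by one interruption that was recorded.
realtime = [
    Segment(0.0, "realtime", "hello"),
    Segment(12.5, "realtime", "see you"),
]
recorded = [Segment(6.0, "recording", "are you there")]

merged = merge_results(realtime, recorded)
transcript = " ".join(seg.text for seg in merged)
```

Sorting by start time handles any number of interruptions, since each additional recording result simply becomes one more segment in the list.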
With further reference to FIG. 2, a flowchart of another communication compensation method for voice dialogue provided by an embodiment of the present application is shown. This flowchart mainly shows steps that further define an additional flow for the flowchart of FIG. 1. In this embodiment, the first real-time analysis result and the second real-time analysis result are stored in an analysis result set.
As shown in FIG. 2, in step 201, each time the communication connection is restored, a corresponding communication recovery marker is added to the analysis result set;
In step 202, the corresponding communication recovery marker is replaced with the first recording analysis result to form a complete analysis result set.
In this embodiment, for step 201, each time the communication connection is restored, the communication compensation device adds a corresponding communication recovery marker to the analysis result set. Every recovery has its own marker, which may be named, for example, by combining the identifier of the overall voice analysis result with a timestamp. Then, for step 202, the recording analysis result for the audio recorded before a given recovery replaces the corresponding communication recovery marker, thereby forming the complete analysis result set.
The method of this embodiment uses a communication recovery marker to mark the point in time at which the connection was restored, and later replaces the marker with the analysis result of the recording made before it to form the complete analysis result set. This is simple to implement and does not take up much space.
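The marker mechanism of steps 201 and 202 can be sketched as follows. This is only an illustration under stated assumptions: the result set is modeled as a Python list, and `RecoveryMarker`, `make_marker`, and `replace_marker` are hypothetical names. The marker name combines a task identifier with a timestamp, as the embodiment suggests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryMarker:
    name: str  # assumed format: "<task id>-<recovery timestamp>"


def make_marker(task_id, recovered_at):
    """Build the placeholder added when the connection is restored."""
    return RecoveryMarker(f"{task_id}-{recovered_at}")


def replace_marker(result_set, marker, recording_results):
    """Swap the placeholder for the results of the recording made before it."""
    index = result_set.index(marker)  # raises ValueError if the marker is absent
    result_set[index:index + 1] = recording_results
    return result_set


results = ["first real-time text"]
marker = make_marker("task42", 1700000000)
results.append(marker)                   # step 201: connection restored
results.append("second real-time text")  # real-time analysis resumes
replace_marker(results, marker, ["text recovered from recording"])  # step 202
```

Because the placeholder holds the position, the recording can be uploaded and analyzed at any later time without disturbing the real-time results appended after it.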
With further reference to FIG. 3, a flowchart of yet another communication compensation method for voice dialogue provided by an embodiment of the present application is shown. This flowchart mainly shows steps that further define the case in which the connection is never restored after it has been lost.
As shown in FIG. 3, in step 301, if the connection is not restored during recording, the recorded audio is saved after the voice dialogue analysis task ends;
In step 302, a second recording analysis result, obtained for the recorded audio uploaded after the connection is restored, is acquired and inserted into the analysis result set.
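In this embodiment, for step 301, if the communication compensation device detects that the connection has not been restored since it was lost, it saves the recorded audio after the voice dialogue analysis task ends. Then, for step 302, because the connection was never restored, there is only one recording and no communication recovery marker, so the analysis result of that recording is simply appended after the earlier real-time analysis results. The method of this embodiment is therefore simple to operate and does not consume many resources.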
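The never-restored case can be sketched in Python as below. The file naming, the raw `.pcm` extension, and the `analyze` callback are assumptions made for illustration; the embodiment does not specify a storage format or an upload API, so `fake_analyze` merely stands in for the analysis server.

```python
import tempfile
from pathlib import Path


def save_pending_recording(audio, directory, task_id):
    """Step 301: keep the recording on disk when the task ends while offline."""
    path = directory / f"{task_id}.pcm"  # hypothetical naming scheme
    path.write_bytes(audio)
    return path


def upload_and_append(path, result_set, analyze):
    """Step 302: once online, analyze the saved audio and append the result."""
    result_set.append(analyze(path.read_bytes()))
    return result_set


def fake_analyze(audio):
    # Stand-in for the analysis server; a real client would upload the file.
    return f"<{len(audio)} bytes analysed>"


with tempfile.TemporaryDirectory() as tmp:
    pending = save_pending_recording(b"\x00\x01" * 4, Path(tmp), "task42")
    results = upload_and_append(
        pending, ["real-time text before the drop"], fake_analyze
    )
```

No marker is needed here: the recording covers everything after the last real-time result, so appending preserves chronological order.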
FIG. 4 shows a flowchart of still another communication compensation method for voice dialogue provided by an embodiment of the present application. This flowchart mainly shows steps that further define the case in which the communication connection is relatively stable.
As shown in FIG. 4, in step 401, if the current communication connection is available, a persistent connection is established with the analysis server;
In step 402, the real-time analysis result returned by the analysis server is obtained;
In step 403, the real-time analysis result is stored in the analysis result set.
In this embodiment, for step 401, if the current communication connection is determined to be available, a persistent connection is established with the analysis server, so that connections need not be set up repeatedly, which would waste network bandwidth and system resources. Then, in step 402, the real-time analysis result returned by the analysis server is obtained, and in step 403 it is stored in the corresponding analysis result set.
The method of this embodiment establishes a persistent connection with the analysis server when the connection is available. Reusing the TCP connection saves the time of repeated TCP three-way handshakes, and as long as the network connection is not interrupted, a relatively stable and fast network connection and data transmission environment can be maintained, which facilitates transmission of the real-time analysis results.
In some optional embodiments, the above method further includes: each time the communication connection is restored, re-establishing a persistent connection with the analysis server. In this way, after every recovery, the persistent connection avoids unnecessary handshake time and maintains a relatively fast transmission speed.
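The reuse-then-reconnect behavior can be illustrated with a small Python sketch. It is not tied to any particular network library: `PersistentLink` is a hypothetical class, and the `connect` factory it receives is a stand-in for whatever actually opens the connection to the analysis server.

```python
class PersistentLink:
    """Reuse one connection until it drops; reopen it only after a recovery."""

    def __init__(self, connect):
        self._connect = connect  # hypothetical factory that opens a connection
        self._conn = None
        self.handshakes = 0      # how many times a connection was opened

    def get(self):
        """Return the live connection, opening one only if none exists."""
        if self._conn is None:
            self._conn = self._connect()
            self.handshakes += 1
        return self._conn

    def on_disconnect(self):
        """Forget the dead connection so the next get() reconnects."""
        self._conn = None


link = PersistentLink(connect=lambda: object())
first = link.get()
second = link.get()   # reused, no new handshake
link.on_disconnect()  # communication interrupted
third = link.get()    # reopened after recovery
```

In this sketch three requests cost two connection set-ups: one at the start and one after the interruption.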
It should be noted that the numbering of the above method steps does not limit their order of execution. In practice, some steps may be executed simultaneously or in the reverse of the order given; the present application places no limit on this.
No technical solution has yet been seen on the market that keeps capturing the voice dialogue while communication is interrupted and, after communication is restored, continues analyzing the current voice dialogue while analyzing, in parallel, the voice dialogue captured during the interruption.
The following describes some problems the inventors encountered in implementing the present invention, together with a specific embodiment of the finally adopted solution, so that those skilled in the art can better understand the solution of the present application.
In implementing the present application, the inventors found that the defects in the prior art are mainly caused by the following: voice dialogue analysis relies too heavily on real-time analysis and provides no support for offline analysis when communication is interrupted.
The inventors also found that products currently on the market offer limited functionality and place high demands on the communication connection, for the following reasons:
First, products on the market target specific user groups, so their audience is narrow.
Second, the logic of real-time voice dialogue analysis that combines both functions is complex; there are many abnormal situations, and they are not easy to handle.
Finally, real-time voice dialogue analysis that combines both functions puts considerable pressure on the server: it must not only maintain persistent connections for real-time analysis but also support uploading and analyzing large recording files, which is costly in both software and hardware.
The solution of the present application is designed and optimized mainly in the following respects:
1. During real-time analysis of a voice dialogue, if the communication connection is interrupted, the client automatically switches to the recording state and writes a recording file.
2. When the communication connection is restored, the client switches back to the real-time analysis state. At the same time, the recording file is uploaded to the server for voice dialogue analysis, and the analysis result is inserted among the results obtained before the connection was restored.
3. If the communication connection has still not been restored when the voice dialogue ends, the recording file is saved; the user can manually choose to upload it once a connection is available, and the analysis result is stored in the designated corresponding voice dialogue analysis file.
FIG. 5 shows a specific flowchart of one solution of the present application.
As shown in FIG. 5, the solution of the present application mainly includes the following steps:
The user first creates a voice dialogue analysis task, which triggers the processing flow.
The processing flow is shown in FIG. 5:
Step 1: The client starts the voice dialogue analysis task.
Step 2: Determine the current communication connection status.
Step 3: The communication connection is available.
  a. The client connects to the voice dialogue analysis service and obtains results in real time.
  b. If a communication error occurs during real-time voice dialogue analysis, return to step 1.
Step 4: The communication connection is lost.
  a. Start recording and save the recording file.
  b. The communication connection is restored.
    a) Add a marker to the analysis result set.
    b) Upload the recording for voice dialogue analysis, and replace the marker in the result set with the analysis result.
    c) Establish a connection with the real-time analysis service, and append the analysis results obtained to the result set.
  c. The communication connection is not restored before the voice dialogue analysis task ends.
    a) Save the recording file made after the connection was interrupted.
    b) Once the communication connection is confirmed to be available, the recording file can be selected on the client and uploaded for voice dialogue analysis.
    c) The analysis result is appended to the selected voice dialogue analysis result set.
Step 5: Obtain the complete voice dialogue analysis result.
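The client-side flow of steps 1 to 5 can be simulated with a short Python sketch. Several simplifications are assumed: connection changes arrive as explicit "down" and "up" events, the recording is analyzed synchronously on recovery (so no marker is needed in this toy version), and `run_task`, `analyze_stream`, and `analyze_recording` are hypothetical names standing in for the client and the analysis service.

```python
def run_task(events, analyze_stream, analyze_recording):
    """Process audio and connection events; return (results, pending audio)."""
    results = []
    recording = []  # audio captured while the connection is down
    online = True   # assumed: the task starts with a usable connection
    for kind, payload in events:
        if kind == "audio":
            if online:
                results.append(analyze_stream(payload))  # step 3: real time
            else:
                recording.append(payload)                # step 4a: record
        elif kind == "down":
            online = False
        elif kind == "up":                               # step 4b: recovery
            online = True
            if recording:
                results.append(analyze_recording(recording))
                recording = []
    # Step 4c: anything still recorded is kept for a later manual upload.
    return results, recording


events = [
    ("audio", "a"), ("down", None), ("audio", "b"), ("audio", "c"),
    ("up", None), ("audio", "d"), ("down", None), ("audio", "e"),
]
results, pending = run_task(
    events,
    analyze_stream=lambda chunk: f"rt:{chunk}",
    analyze_recording=lambda chunks: "rec:" + "".join(chunks),
)
```

Here the second interruption lasts until the task ends, so chunk "e" remains in `pending` rather than in `results`, mirroring branch c of step 4.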
In implementing the present application, the inventors also tried the following alternative: when communication conditions are poor, switch to a mobile phone to capture the voice dialogue. The mobile network is relatively stable, which can reduce the resources the server spends on analyzing recording files and effectively reduce the amount of complex logic to be coded. However, the drawback is also obvious: a phone's microphone is sensitive to distance, and when the speakers are far from the phone, the accuracy of the analysis result is low and the outcome is poor.
After weighing usage in real application scenarios, the solution described in the present application was adopted.
With this solution, users need not worry about incomplete voice dialogue analysis results under any communication conditions; even when the connection is poor, the time required is greatly reduced compared with recording the whole dialogue and then uploading it for analysis.
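FIG. 6 shows a block diagram of a communication compensation device for voice dialogue provided by an embodiment of the present invention.
As shown in FIG. 6, the communication compensation device 600 for voice dialogue includes a first analysis module 610, a connection status acquisition module 620, a recording module 630, a second analysis module 640, a recording analysis module 650, and a merging module 660.
The first analysis module 610 is configured to, in response to a user starting a voice dialogue analysis task during communication, connect to an analysis server to analyze the user's voice dialogue in real time and obtain a first real-time analysis result; the connection status acquisition module 620 is configured to obtain the communication connection status during the current communication; the recording module 630 is configured to, if the connection is lost during the communication, switch from the real-time analysis task to a recording task to record the user's audio while the connection is down; the second analysis module 640 is configured to, if the connection is restored during recording, switch from the recording task back to the real-time analysis task and obtain a second real-time analysis result; the recording analysis module 650 is configured to upload and analyze the recorded audio to generate a first recording analysis result; and the merging module 660 is configured to merge the first real-time analysis result, the first recording analysis result, and the second real-time analysis result to obtain a complete analysis result.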
In some optional embodiments, the first real-time analysis result and the second real-time analysis result are stored in an analysis result set, and the communication compensation device 600 for voice dialogue further includes: a recovery marker adding module (not shown in the figure), configured to add a corresponding communication recovery marker to the analysis result set each time the communication connection is restored; and a replacement module (not shown in the figure), configured to replace the corresponding communication recovery marker with the first recording analysis result to form a complete analysis result set.
In other optional embodiments, the communication compensation device 600 for voice dialogue further includes: a saving module (not shown in the figure), configured to save the recorded audio after the voice dialogue analysis task ends if the connection is not restored during recording; and a recording insertion module (not shown in the figure), configured to acquire a second recording analysis result for the recorded audio uploaded after the connection is restored, and to insert the second recording analysis result into the analysis result set.
It should be understood that the modules shown in FIG. 6 correspond to the steps of the methods described with reference to FIG. 1, FIG. 2, FIG. 3, and FIG. 4. The operations, features, and corresponding technical effects described above for the methods therefore apply equally to the modules in FIG. 6 and are not repeated here.
It is worth noting that the modules in the embodiments of the present disclosure are not intended to limit the solution of the present disclosure; for example, a judgment module could be described as a module that, when the device is in an interactive state, judges whether the interactive state is a playback scenario. In addition, the relevant functional modules may also be implemented by a hardware processor; for example, the judgment module may also be implemented by a processor, which is not described further here.
In other embodiments, an embodiment of the present invention further provides a non-volatile computer storage medium storing computer-executable instructions that can execute the communication compensation method for voice dialogue of any of the above method embodiments.
As one implementation, the non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
in response to a user starting a voice dialogue analysis task during communication, connect to an analysis server to analyze the user's voice dialogue in real time and obtain a first real-time analysis result;
obtain the communication connection status during the current communication;
if the connection is lost during the communication, switch from the real-time analysis task to a recording task to record the user's audio while the connection is down;
if the connection is restored during recording, switch from the recording task back to the real-time analysis task and obtain a second real-time analysis result;
upload and analyze the recorded audio to generate a first recording analysis result;
merge the first real-time analysis result, the first recording analysis result, and the second real-time analysis result to obtain a complete analysis result.
The non-volatile computer-readable storage medium may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created through use of the communication compensation device for voice dialogue. In addition, the non-volatile computer-readable storage medium may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the non-volatile computer-readable storage medium optionally includes memory located remotely from the processor, and such remote memory may be connected to the communication compensation device for voice dialogue via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
An embodiment of the present invention further provides a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to execute any of the above communication compensation methods for voice dialogue.
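FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention. As shown in FIG. 7, the device includes one or more processors 710 and a memory 720; one processor 710 is taken as an example in FIG. 7. The device for the communication compensation method for voice dialogue may further include an input device 730 and an output device 740. The processor 710, the memory 720, the input device 730, and the output device 740 may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 7. The memory 720 is the non-volatile computer-readable storage medium described above. By running the non-volatile software programs, instructions, and modules stored in the memory 720, the processor 710 executes the various functional applications and data processing of the server, that is, implements the communication compensation method for voice dialogue of the above method embodiments. The input device 730 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the communication compensation device. The output device 740 may include a display device such as a display screen.
The above product can execute the method provided by the embodiments of the present invention and has the functional modules and beneficial effects corresponding to executing the method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present invention.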
As one implementation, the above electronic device is applied in a communication compensation device for voice dialogue, is used on a client, and includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can:
in response to a user starting a voice dialogue analysis task during communication, connect to an analysis server to analyze the user's voice dialogue in real time and obtain a first real-time analysis result;
obtain the communication connection status during the current communication;
if the connection is lost during the communication, switch from the real-time analysis task to a recording task to record the user's audio while the connection is down;
if the connection is restored during recording, switch from the recording task back to the real-time analysis task and obtain a second real-time analysis result;
upload and analyze the recorded audio to generate a first recording analysis result;
merge the first real-time analysis result, the first recording analysis result, and the second real-time analysis result to obtain a complete analysis result.
The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
(1) Mobile communication devices: these devices are characterized by mobile communication capability, and their main purpose is to provide voice and data communication. Such terminals include smartphones (for example, iPhone), multimedia phones, feature phones, and low-end phones.
(2) Ultra-mobile personal computer devices: these devices belong to the category of personal computers, have computing and processing capability, and generally also support mobile Internet access. Such terminals include PDA, MID, and UMPC devices, for example iPad.
(3) Portable entertainment devices: these devices can display and play multimedia content. They include audio and video players (for example, iPod), handheld game consoles, e-book readers, smart toys, and portable in-vehicle navigation devices.
(4) Servers: devices that provide computing services. A server comprises a processor, hard disk, memory, system bus, and so on. Its architecture is similar to that of a general-purpose computer, but because it must provide highly reliable services, it faces higher requirements for processing capability, stability, reliability, security, scalability, and manageability.
(5) Other electronic devices with data interaction functions.
The apparatus embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
From the description of the above embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware. Based on this understanding, the above technical solutions, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
| Application Number | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN201811637588.6A | CN109743436B (en) | 2018-12-29 | 2018-12-29 | Communication compensation method, apparatus, device and storage medium for voice dialogue |
| Publication Number | Publication Date |
|---|---|
| CN109743436A (en) | 2019-05-10 |
| CN109743436B (en) | 2020-08-28 |
| Application Number | Status | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| CN201811637588.6A | Active | CN109743436B (en) | 2018-12-29 | 2018-12-29 | Communication compensation method, apparatus, device and storage medium for voice dialogue |
| Country | Link |
|---|---|
| CN (1) | CN109743436B (en) |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2002215584A (en)* | 2001-01-22 | 2002-08-02 | Omron Corp | Device, method, and program for voice response, and computer-readable recording medium where the same is recorded |
| CN1677997A (en)* | 2004-03-31 | 2005-10-05 | 日本电气株式会社 | Call interruption compensation system |
| US7881234B2 (en)* | 2006-10-19 | 2011-02-01 | International Business Machines Corporation | Detecting interruptions in audio conversations and conferences, and using a conversation marker indicative of the interrupted conversation |
| CN106469558A (en)* | 2015-08-21 | 2017-03-01 | 中兴通讯股份有限公司 | Audio recognition method and equipment |
| US9659564B2 (en)* | 2014-10-24 | 2017-05-23 | Sestek Ses Ve Iletisim Bilgisayar Teknolojileri Sanayi Ticaret Anonim Sirketi | Speaker verification based on acoustic behavioral characteristics of the speaker |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8396192B2 (en)* | 2010-03-03 | 2013-03-12 | Calabrio, Inc. | Desktop recording architecture for recording call sessions over a telephony network |
| CN103096186B (en)* | 2011-10-28 | 2016-06-29 | 上海博泰悦臻网络技术服务有限公司 | The continuous even method of mobile unit, the talkback unit of vehicle and off-line thereof |
| CN103369094B (en)* | 2013-07-15 | 2015-09-02 | 北京邮电大学 | The successional mobile terminal of communication process is not affected when conversing and being interrupted |
| US9911415B2 (en)* | 2014-12-19 | 2018-03-06 | Lenovo (Singapore) Pte. Ltd. | Executing a voice command during voice input |
| CN104702791A (en)* | 2015-03-13 | 2015-06-10 | 安徽声讯信息技术有限公司 | Smart phone recording sound for a long time and synchronously transliterating text, information processing method thereof |
| CN205943456U (en)* | 2016-08-24 | 2017-02-08 | 安徽咪鼠科技有限公司 | Pronunciation are gathered and preprocessing device based on intelligence pronunciation mouse |
| US20180166073A1 (en)* | 2016-12-13 | 2018-06-14 | Ford Global Technologies, Llc | Speech Recognition Without Interrupting The Playback Audio |
| Publication number | Publication date |
|---|---|
| CN109743436A (en) | 2019-05-10 |
| Publication | Title |
|---|---|
| CN110597774B (en) | File sharing method, system, device, computing equipment and terminal equipment |
| CN110765744B (en) | Multi-user collaborative document editing method and system |
| CN106791958B (en) | Position mark information generation method and device |
| CN113542888B (en) | Video processing method and device, electronic equipment and storage medium |
| CA2951525A1 (en) | Communication apparatus, communication system, communication management system, and communication control method |
| CN112562688A (en) | Voice transcription method, device, recording pen and storage medium |
| CN105550934A (en) | System and method for pushing WeChat soft advertisement in virtual reality |
| CN113672748A (en) | Multimedia information playing method and device |
| CN104023176A (en) | Method and device of processing audio frequency and image information as well as terminal equipment |
| CN111767558A (en) | Data access monitoring method, device and system |
| CN111541905A (en) | Live broadcast method and device, computer equipment and storage medium |
| CN113144620B (en) | Method, device, platform, readable medium and equipment for detecting frame synchronous game |
| CN114845136A (en) | Video synthesis method, device, equipment and storage medium |
| CN109743436B (en) | Communication compensation method, apparatus, device and storage medium for voice dialogue |
| CN110912948A (en) | Method and device for reporting problems |
| CN113132808B (en) | Video generation method and device and computer readable storage medium |
| CN112423098A (en) | Video processing method, electronic device and storage medium |
| CN116567289A (en) | Data processing method, device, head-mounted display device and medium |
| WO2015131700A1 (en) | File storage method and device |
| CN103929544B (en) | A kind of method and system realizing pc end and mobile terminal automatic recording |
| CN111107296B (en) | Audio data acquisition method and device, electronic equipment and readable storage medium |
| CN114153542A (en) | Screen projection method and device, electronic equipment and computer readable storage medium |
| CN115225917A (en) | Recording plug-flow method, device, equipment and medium |
| US12164833B2 (en) | Audio injection in remote device infrastructure |
| CN115037978B (en) | Screen projection method and related equipment |
| Code | Title | Description |
|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CP01 | Change in the name or title of a patent holder | Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu. Patentee after: Sipic Technology Co.,Ltd. Address before: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu. Patentee before: AI SPEECH Ltd. |