CN114390133A

Info

Publication number: CN114390133A
Application number: CN202210082304.1A
Authority: CN
Inventors: 高志稳
Original assignee: Vivo Mobile Communication Co Ltd
Current assignee: Vivo Mobile Communication Co Ltd
Priority date: 2022-01-24
Filing date: 2022-01-24
Publication date: 2022-04-22
Also published as: WO2023138632A1

Abstract

This application discloses a recording method, a recording apparatus, and an electronic device, and belongs to the field of electronic devices. A terminal determines the position of a target sound source through a microphone array, where the target sound source is a sound source that generates a voice signal whose sound intensity satisfies a first condition, and the microphone array includes at least three microphones that are not arranged in a straight line. According to the position of the target sound source, the terminal collects voice signals in a directional collection mode to obtain a voice file.

Description

Recording method, apparatus, and electronic device

Technical field

The present application belongs to the field of electronic devices, and specifically relates to a recording method, a recording apparatus, and an electronic device.

Background

With the continuous progress of science and technology, smartphone functions are becoming increasingly diverse, and people's expectations for recording technology and quality keep rising. Good recording technology gives users a good experience, and rich, vivid sound likewise gives a more authentic listening experience.

Smartphone recording is weak at suppressing environmental noise. Especially in scenarios such as multi-person meetings, any slightly louder surrounding sound is recorded, so the recorded speech is hard to make out and the noise level is high.

Summary of the invention

The purpose of the embodiments of the present application is to provide a recording method, a recording apparatus, and an electronic device that can solve the problem of weak environmental-noise suppression, in which, especially in scenarios such as multi-person meetings, any slightly louder surrounding sound is recorded, resulting in low speech intelligibility and high noise.

In a first aspect, an embodiment of the present application provides a recording method, the method including:

determining, by a terminal through a microphone array, the position of a target sound source, where the target sound source is a sound source that generates a voice signal whose sound intensity satisfies a first condition, and the microphone array includes at least three microphones that are not arranged in a straight line; and

collecting voice signals in a directional collection mode according to the position of the target sound source, to obtain a voice file.

In a second aspect, an embodiment of the present application provides a recording apparatus, the apparatus including:

a sound source localization module, configured to determine the position of a target sound source through a microphone array, where the target sound source is a sound source that generates a voice signal whose sound intensity satisfies a first condition, and the microphone array includes at least three microphones that are not arranged in a straight line; and

a voice collection module, configured to collect voice signals in a directional collection mode according to the position of the target sound source, to obtain a voice file.

In a third aspect, an embodiment of the present application provides an electronic device. The electronic device includes a processor and a memory, the memory stores a program or instructions executable on the processor, and the program or instructions, when executed by the processor, implement the steps of the method according to the first aspect.

In a fourth aspect, an embodiment of the present application provides a readable storage medium. The readable storage medium stores a program or instructions that, when executed by a processor, implement the steps of the method according to the first aspect.

In a fifth aspect, an embodiment of the present application provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the method according to the first aspect.

In a sixth aspect, an embodiment of the present application provides a computer program product. The program product is stored in a storage medium and is executed by at least one processor to implement the method according to the first aspect.

In the embodiments of the present application, the terminal determines the position of a target sound source through a microphone array, where the target sound source is a sound source that generates a voice signal whose sound intensity satisfies a first condition, and the microphone array includes at least three microphones that are not arranged in a straight line; according to the position of the target sound source, voice signals are collected in a directional collection mode. In these embodiments, the determined target sound source is first located and then collected directionally, so that the voice signal of the target sound source can be effectively separated from a noisy environment and collected, yielding clearer sound clips.

Description of the drawings

FIG. 1 is a schematic flowchart of a recording method provided by an embodiment of the present application;

FIG. 2 is a schematic structural diagram of a microphone array provided by an embodiment of the present application;

FIG. 3 is a schematic flowchart of another recording method provided by an embodiment of the present application;

FIG. 4 is a schematic flowchart of another recording method provided by an embodiment of the present application;

FIG. 5 is a schematic flowchart of another recording method provided by an embodiment of the present application;

FIG. 6 is a schematic flowchart of a method for identifying a sound clip provided by an embodiment of the present application;

FIG. 7 is a schematic diagram of a display interface provided by an embodiment of the present application;

FIG. 8 is a schematic structural diagram of a terminal group provided by an embodiment of the present application;

FIG. 9 is a schematic diagram of another display interface provided by an embodiment of the present application;

FIG. 10 is a schematic diagram of another display interface provided by an embodiment of the present application;

FIG. 11 is a schematic structural diagram of a recording apparatus provided by an embodiment of the present application;

FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;

FIG. 13 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application are described clearly below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application fall within the protection scope of this application.

The terms "first", "second", and the like in the description and claims of the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate, so that the embodiments of the present application can be implemented in orders other than those illustrated or described herein. Objects distinguished by "first", "second", and the like are usually of one type, and the number of objects is not limited; for example, there may be one first object or several. In addition, "and/or" in the description and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the objects before and after it.

The recording method provided by the embodiments of the present application is described in detail below through specific embodiments and their application scenarios, with reference to the accompanying drawings.

As shown in FIG. 1, an embodiment of the present application provides a recording method. The method is executed by a terminal on which a microphone array is provided. The microphone array includes at least three microphones, and the at least three microphones must not lie on the same straight line. As shown in FIG. 2, three microphones 101 that are not on the same straight line are installed at the top, middle, and bottom of the terminal 100, forming a microphone array. The recording method may include the following steps.

Step 110: The terminal determines the position of a target sound source through the microphone array, where the target sound source is a sound source that generates a voice signal whose sound intensity satisfies a first condition, and the microphone array includes at least three microphones that are not arranged in a straight line.

The target sound source is the sound source corresponding to the voice signal that satisfies the first condition among the voice signals collected by the microphone array.

The first condition may be set according to actual needs. It may be based on the sound intensity of the voice signal, or on the duration of the voice signal.

In one implementation, the first condition is that the sound intensity of the voice signal exceeds a first threshold. The microphone array monitors the sound intensity of the voice signals collected from the environment, and when the sound intensity of a voice signal exceeds the first threshold, the sound source that generates that voice signal is determined to be the target sound source.

The first condition may also be that the sound intensity stays above the first threshold for longer than a first duration, or that the voice signal has the highest sound intensity among the voice signals in the current environment whose sound intensity exceeds the first threshold. For simplicity, the following embodiments all take as an example the case where the first condition is that the sound intensity of the voice signal exceeds a first threshold A.
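As a minimal, non-limiting sketch, the three variants of the first condition described above could be evaluated as follows. The function names, the use of per-frame intensity readings, and the expression of the first duration as a frame count are assumptions made for illustration only.

```python
from typing import Dict, Optional, Sequence


def exceeds_threshold(intensity: float, threshold_a: float) -> bool:
    """Variant 1: the sound intensity exceeds the first threshold A."""
    return intensity > threshold_a


def sustained_above(frames: Sequence[float], threshold_a: float,
                    min_frames: int) -> bool:
    """Variant 2: intensity stays above A for longer than a first duration,
    expressed here (as an assumption) as more than `min_frames` consecutive frames."""
    run = 0
    for value in frames:
        run = run + 1 if value > threshold_a else 0
        if run > min_frames:
            return True
    return False


def loudest_above(intensities: Dict[str, float],
                  threshold_a: float) -> Optional[str]:
    """Variant 3: among the sources above A, pick the one with the highest intensity."""
    candidates = {k: v for k, v in intensities.items() if v > threshold_a}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)
```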

The terminal can locate the target sound source through the microphone array to obtain its position. Multiple microphones can be used to locate a sound source in various ways. For example, differences in factors such as sound intensity and phase between the voice signals collected by different microphones in the array can be used to calculate the position, in three-dimensional space, of the target sound source that generates the voice signal.
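The application does not specify a particular localization algorithm. As one possible illustration, the sketch below estimates only a two-dimensional bearing (not a full three-dimensional position) from arrival-time differences under a far-field, plane-wave assumption. The function name `estimate_bearing`, the speed-of-sound constant, and the planar geometry are all assumptions. The sketch also shows why the microphones must not be collinear: with collinear microphones the linear system below has no unique solution.

```python
import math
from typing import Sequence, Tuple

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

Point = Tuple[float, float]


def estimate_bearing(mics: Sequence[Point], arrival_times: Sequence[float]) -> float:
    """Return the far-field bearing (radians) of a source from three microphones.

    For a plane wave arriving from unit direction u, microphone k hears the
    sound earlier than microphone 0 by ((p_k - p_0) . u) / c, so
    (p_k - p_0) . u = -c * (t_k - t_0). Two baselines give a 2x2 linear system.
    """
    (x0, y0), (x1, y1), (x2, y2) = mics
    t0, t1, t2 = arrival_times
    a11, a12 = x1 - x0, y1 - y0
    a21, a22 = x2 - x0, y2 - y0
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("microphones are collinear; direction is ambiguous")
    r1 = -SPEED_OF_SOUND * (t1 - t0)
    r2 = -SPEED_OF_SOUND * (t2 - t0)
    # Cramer's rule for the direction vector (ux, uy).
    ux = (r1 * a22 - a12 * r2) / det
    uy = (a11 * r2 - a21 * r1) / det
    return math.atan2(uy, ux)
```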

Step 120: According to the position of the target sound source, collect voice signals in a directional collection mode to obtain a voice file.

After the position of the target sound source is determined, the terminal may control the microphone array to use the directional collection mode and collect voice signals in the direction of the target sound source. When the sound intensity of the collected voice signal falls below a second threshold, or stays below the second threshold for longer than a second duration, the terminal exits the directional collection mode, and the collected voice information is saved as a voice file. The terminal then returns to the normal collection mode and monitors whether a voice signal satisfying the first condition appears again in the current environment. In this mode, any collected voice signal that does not satisfy the first condition is treated as background noise and saved into the voice file. The second threshold may be equal to or smaller than the first threshold.
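The switching between the normal and directional collection modes described above can be sketched as a small state machine with hysteresis. This is an illustrative sketch only: the name `track_modes`, the frame-based representation, and the expression of the second duration as `hold_frames` are assumptions.

```python
from typing import Iterable, List


def track_modes(intensities: Iterable[float], threshold_a: float,
                threshold_b: float, hold_frames: int) -> List[str]:
    """Label each frame 'directional' or 'normal'.

    Enter directional mode when intensity exceeds threshold_a (first condition).
    Leave it once intensity has stayed below threshold_b (second threshold,
    threshold_b <= threshold_a) for more than hold_frames consecutive frames.
    """
    if threshold_b > threshold_a:
        raise ValueError("second threshold must not exceed the first")
    mode = "normal"
    quiet = 0
    labels = []
    for value in intensities:
        if mode == "normal":
            if value > threshold_a:
                mode, quiet = "directional", 0
        else:
            quiet = quiet + 1 if value < threshold_b else 0
            if quiet > hold_frames:
                mode = "normal"
        labels.append(mode)
    return labels
```

Setting `hold_frames` to 0 corresponds to exiting as soon as the intensity drops below the second threshold; a larger value corresponds to the variant with a second duration.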

Further, the directional collection mode in step 120 may take various forms, and may include:

setting a collection area centered on the target sound source; and

collecting voice signals from within the collection area to obtain a sound clip.

Further, the collection area may be determined in various ways. It may be the area within a given radius centered on the target sound source, or a sector area whose vertex is the microphone array and whose center line is the line from the microphone array to the target sound source; the opening angle of the sector area may be set to a first angle X. As shown in FIG. 3, the sector area centered on the target sound source 200 serves as the collection area for collecting voice information. For simplicity, the following embodiments all take the sector area as an example.
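As a non-limiting illustration of the sector-shaped collection area, the sketch below tests whether a point lies inside a sector whose vertex is the microphone array and whose center line points at the target sound source. The function name `in_sector`, the planar coordinates, and the treatment of X as the total opening angle in degrees are assumptions.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def in_sector(array_pos: Point, target_pos: Point, point: Point,
              opening_angle_deg: float) -> bool:
    """True if `point` lies in the sector with vertex at the array, center line
    toward the target, and total opening angle `opening_angle_deg` (angle X)."""
    center = math.atan2(target_pos[1] - array_pos[1], target_pos[0] - array_pos[0])
    bearing = math.atan2(point[1] - array_pos[1], point[0] - array_pos[0])
    # Wrap the difference into [-pi, pi) so angles near +/-180 degrees compare correctly.
    diff = (bearing - center + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) <= math.radians(opening_angle_deg) / 2
```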

In one implementation, to ensure the clarity of the voice signals collected within the collection area, the method further includes:

shielding or suppressing voice signals from outside the collection area. As shown in FIG. 3, the voice signals generated by other sound sources 201 outside the sector area are shielded or suppressed.

The way voice signals are shielded or suppressed may be set according to actual needs. For example, the parameters of each microphone in the microphone array may be adjusted to form different gains in different directions, raising the gain in the direction of the collection area and lowering the gain outside it. Alternatively, a software algorithm may filter out or suppress voice signals from outside the collection area.
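The direction-dependent gain can be pictured with an idealized on/off gain mask, sketched below. A real microphone array shapes its gain smoothly rather than switching abruptly; the function name `directional_gain` and the default decibel values are assumptions chosen only for illustration.

```python
def directional_gain(offset_deg: float, opening_angle_deg: float,
                     inside_db: float = 6.0, outside_db: float = -30.0) -> float:
    """Linear amplitude gain for sound arriving `offset_deg` away from the
    center line of a sector with total opening angle `opening_angle_deg`."""
    inside = abs(offset_deg) <= opening_angle_deg / 2
    gain_db = inside_db if inside else outside_db
    # Convert decibels to a linear amplitude factor.
    return 10 ** (gain_db / 20)
```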

FIG. 4 shows an example of the recording method of an embodiment of the present application.

Recording starts;

The terminal monitors whether the sound intensity of the voice signals generated by the different sound sources in the current environment exceeds the first threshold A, checking in turn the voice signal of each sound source, including sound source 1, sound source 2, and sound source 3;

If the sound intensity of the voice signal of any sound source exceeds the first threshold A, that sound source is determined to be the target sound source and is located; its voice signal is then collected directionally within a sector of angle X, while the voice signals generated by the other sound sources are shielded or suppressed;

If the first threshold A is not exceeded, the terminal goes on to monitor the voice signal of the next sound source, and so on, until the recording ends.

After the recording ends, the terminal obtains the complete voice file of the recording.

As can be seen from the technical solutions provided by the above embodiments of the present invention, the position of the target sound source is determined through the microphone array, where the target sound source is a sound source that generates a voice signal whose sound intensity satisfies the first condition, and the microphone array includes at least three microphones that are not arranged in a straight line; according to the position of the target sound source, voice signals are collected in a directional collection mode. In these embodiments, the determined target sound source is first located and then collected directionally, so that the voice signal of the target sound source can be effectively separated from a noisy environment and collected, yielding clearer sound clips.

Based on the above embodiment, further, as shown in FIG. 5, during or after the collection of voice signals in step 120, the method further includes:

Step 130: Determine the sound source corresponding to the voice signal according to the voiceprint feature of the voice signal.

While voice signals are being collected, identification of the sound source that generates the voice signal may also be enabled. Identification may be performed in various ways; the embodiments of the present application take matching based on voiceprint features only as an example.

As shown in FIG. 6, the voiceprint feature of the voice signal to be identified is obtained through feature extraction and matched against the voiceprint features of the sound sources already stored on the terminal. If the match succeeds, the sound source that generates the voice signal is determined to be an existing sound source, and the sound clip containing the voice signal is extracted and recorded in the corresponding directory. If the match fails, the sound source is determined to be a new sound source: its identity is named according to a naming rule, the new sound source and its voiceprint feature are stored and thereby become an existing sound source and an existing voiceprint feature, and the sound clip containing the voice signal is then extracted and recorded in the corresponding directory.
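The match-or-enroll flow of FIG. 6 can be sketched as follows. The application does not specify how voiceprint features are represented or compared; the sketch assumes, purely for illustration, that a feature is a numeric vector, that matching uses cosine similarity against a hypothetical threshold `min_similarity`, and that new sources are named `source-N`.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_source(feature: List[float], known: Dict[str, List[float]],
                    min_similarity: float = 0.8) -> Tuple[str, bool]:
    """Return (source_id, is_new). An unmatched feature is enrolled in `known`."""
    best_id: Optional[str] = None
    best_score = -1.0
    for source_id, reference in known.items():
        score = cosine_similarity(feature, reference)
        if score > best_score:
            best_id, best_score = source_id, score
    if best_id is not None and best_score >= min_similarity:
        return best_id, False          # matched an existing sound source
    new_id = f"source-{len(known) + 1}"  # hypothetical naming rule
    known[new_id] = list(feature)       # store the new source and its voiceprint
    return new_id, True
```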

Further, after step 130, the method further includes:

extracting, according to the sound intensity, a sound clip that includes the voice signal;

The sound clip may be extracted in various ways, and the embodiments of the present application give only one example. The sound clip may be extracted based on the first condition. Following the changes in the sound intensity of the voice signal, the start marker of each sound clip may be the moment the sound intensity satisfies the first condition, or the moment it has satisfied the first condition for a first duration; the end marker may be the moment the sound intensity no longer satisfies the first condition, or the moment it has failed to satisfy the first condition for a second duration.
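One way to picture the start and end markers is the sketch below, which splits a sequence of per-frame intensities into clips. It assumes the first condition is "intensity exceeds threshold A" and expresses the second duration as `min_quiet_frames`; the function name and the frame-index output format are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def extract_segments(intensities: Sequence[float], threshold_a: float,
                     min_quiet_frames: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, of each sound clip.

    A clip starts at the first frame whose intensity exceeds threshold_a and
    ends once the intensity has failed the condition for `min_quiet_frames`
    consecutive frames; the trailing quiet frames are not part of the clip.
    """
    segments = []
    start = None
    quiet = 0
    for i, value in enumerate(intensities):
        if value > threshold_a:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_quiet_frames:
                segments.append((start, i - quiet + 1))
                start, quiet = None, 0
    if start is not None:
        # Recording ended while a clip was still open.
        segments.append((start, len(intensities) - quiet))
    return segments
```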

recording, on the display interface, the sound clip in a first directory corresponding to the sound source, where the first directory is located under a second directory corresponding to the voice file.

The arrangement and display rules for directories in the display interface may be set according to actual needs; the embodiments of the present application give only one implementation as an example. As shown in FIG. 7, a two-level directory structure is created on the display interface. The second directory, corresponding to the voice file, serves as the first-level directory, and its directory name may be the identifier of the voice file. Under the second directory, a first directory is created for each sound source as a second-level directory, and its directory name may be the identity of the corresponding sound source: sound source A, sound source B, and sound source C. The sound clips corresponding to each sound source, such as A-1, A-2, and A-3, are recorded under the respective first directory, and the name of each sound clip may include the identity of the corresponding sound source and a serial number.

While a sound clip is being identified, if an existing sound source corresponding to the sound clip is matched, the sound clip is recorded in the first directory corresponding to that existing sound source; if no existing sound source is matched, a first directory corresponding to the new sound source is created under the second directory, and the sound clip is recorded in the newly created first directory.
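The two-level directory structure and the clip naming described above can be modeled with a nested mapping, as in the sketch below. The in-memory representation and the name `record_clip` are assumptions for illustration; the application does not prescribe a storage format.

```python
from typing import Dict, List

Catalog = Dict[str, Dict[str, List[str]]]  # voice file -> sound source -> clip names


def record_clip(catalog: Catalog, voice_file: str, source_id: str) -> str:
    """File a new clip under voice_file/source_id and return its generated name."""
    sources = catalog.setdefault(voice_file, {})  # second directory (first level)
    clips = sources.setdefault(source_id, [])     # first directory, created if new
    name = f"{source_id}-{len(clips) + 1}"        # source identity plus serial number
    clips.append(name)
    return name
```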

Further, the method further includes:

receiving a first input on the display interface, where the first input may be generated by a touch operation, such as a tap, a long press, or a swipe, or by a voice operation or a gesture operation, which is not specifically limited here; and

playing the sound clip or the voice file in response to the first input.

Through the first input on the display interface, the user chooses to play a voice file or sound clip shown on the display interface. For example, as shown in FIG. 7, the user may perform a first operation on the second directory corresponding to the voice file to play or pause the voice file containing all the sound clips; perform a second operation on the second directory to expand or collapse the list of first directories under it; perform a third operation on an expanded first directory to expand or collapse the list of sound clips under it; and perform a fourth operation on a sound clip to play or pause that clip.

As can be seen from the technical solutions provided by the above embodiments of the present invention, the sound source that generates the voice signal is determined according to the voiceprint feature of the voice signal. In these embodiments, the sound source of the voice signal is identified, the corresponding sound clip is extracted and then recorded in the corresponding directory, so that sound clips can be managed efficiently and displayed and played in an organized way.

Based on the above embodiment, further, if the terminal is a main terminal of a terminal group, where the terminal group includes at least one main terminal and N auxiliary terminals and N is greater than or equal to 1, the method further includes:

receiving and aggregating the voice signals collected by the auxiliary terminals to obtain a voice file.

As shown in FIG. 8, a terminal group made up of multiple terminals may be formed according to actual needs. For example, when a multi-terminal meeting is to be held, the terminals joining the meeting are formed into a terminal group. In one implementation, at least one of the terminals may be set as the main terminal and the others as auxiliary terminals, and the main terminal records the entire multi-terminal meeting.

Each terminal may, in the manner of the above embodiments, determine and locate a target sound source in its surroundings, then collect voice signals in the directional collection mode according to the position of the target sound source, and send the voice signals to the main terminal. The main terminal aggregates them in chronological order to form an overall voice file.
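Chronological aggregation at the main terminal can be sketched as a merge of per-terminal streams that are each already ordered by time. The `Chunk` type, its fields, and the assumption that all terminals share a common clock are illustrative and are not stated in the application.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Chunk(NamedTuple):
    start_time: float  # seconds on a clock assumed to be shared by all terminals
    terminal_id: str
    samples: bytes


def merge_streams(streams: Iterable[List[Chunk]]) -> List[Chunk]:
    """Merge per-terminal chunk lists, each already sorted by start_time."""
    return list(heapq.merge(*streams, key=lambda chunk: chunk.start_time))
```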

Further, in the process of receiving the voice signals collected by the auxiliary terminals, the method further includes:

determining the sound source corresponding to the voice signal.

It should be understood that the identification of the sound source corresponding to each voice signal may be completed by each terminal itself, with the identification result then sent to the main terminal together with the voice signal; alternatively, the identification may be performed by the main terminal.

extracting, from the voice file, a sound clip that includes the voice signal.

The sound clip may be extracted based on the first condition. Following the changes in the sound intensity of the voice signal, the start marker of each sound clip may be the moment the sound intensity satisfies the first condition, or the moment it has satisfied the first condition for a first duration; the end marker may be the moment the sound intensity no longer satisfies the first condition, or the moment it has failed to satisfy the first condition for a second duration.

recording, on the display interface, the sound clip in a first directory corresponding to the sound source, where the first directory is located under a third directory corresponding to the terminal, and the third directory is located under the second directory corresponding to the voice file.

Based on the sound source corresponding to the sound clip in the identification result, and on the terminal that collected the sound clip, the main terminal records the sound clip in the corresponding directory on the display interface.

The arrangement and display rules for directories in the display interface may be set according to actual needs; the embodiments of the present application give only one example. As shown in FIG. 9, a two-level directory structure is created on the display interface. The second directory, corresponding to the voice file, serves as the first-level directory, and its directory name may be the identifier of the voice file. Under the second directory, a third directory is created for each terminal as a second-level directory, and its directory name may be the identifier of the terminal: terminal 1, terminal 2, and terminal 3. The corresponding sound clips are recorded in each third directory, and the name of each sound clip may include the identifier of the terminal, the identity of the corresponding sound source, and a serial number: 1-A-1, 1-A-2, 1-B-1, and so on.

In another implementation, a three-level directory structure may be created on the display interface. The second directory, corresponding to the voice file, serves as the first-level directory, and its directory name may be the identifier of the voice file. Under the second directory, a third directory is created for each terminal as a second-level directory, and its directory name may be the identifier of the terminal: terminal 1, terminal 2, and terminal 3. Under each third directory, a first directory is created for each sound source as a third-level directory, and its directory name may be the identity of the corresponding sound source: sound source A, sound source B, and sound source C. The sound clips corresponding to each sound source are recorded under the respective first directory: 1-A-1, 1-A-2, 1-B-1, and so on.
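The three-level layout (voice file, then terminal, then sound source) and clip names such as 1-A-1 can be modeled as below. The nested-mapping representation and the name `record_terminal_clip` are assumptions made for illustration; serial numbers are counted per terminal and per sound source, which is one reading of the naming examples.

```python
from typing import Dict, List

# voice file -> terminal -> sound source -> clip names
Tree = Dict[str, Dict[str, Dict[str, List[str]]]]


def record_terminal_clip(tree: Tree, voice_file: str, terminal_id: str,
                         source_id: str) -> str:
    """File a clip under voice_file/terminal_id/source_id and return its name."""
    sources = tree.setdefault(voice_file, {}).setdefault(terminal_id, {})
    clips = sources.setdefault(source_id, [])
    name = f"{terminal_id}-{source_id}-{len(clips) + 1}"
    clips.append(name)
    return name
```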

In another implementation, the hierarchical relationship between the first directory and the second directory in the three-level directory described above may also be reversed: the first directory is used as the second-level directory, the third directory is established under the second directory as the third-level directory, and the corresponding sound source is recorded in the third directory.

For simplicity, the following embodiments all take the two-level directory shown in FIG. 9 as an example.
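The two-level layout and the clip naming pattern can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function names `clip_name` and `build_two_level_directory`, the label "Terminal N", and the assumption that the serial number counts clips per terminal and per sound source are all hypothetical, inferred only from the example names 1-A-1, 1-A-2 and 1-B-1.

```python
from collections import defaultdict


def clip_name(terminal_id: str, source_id: str, seq: int) -> str:
    """Build a clip name such as '1-A-1': terminal, sound source, serial number."""
    return f"{terminal_id}-{source_id}-{seq}"


def build_two_level_directory(voice_file_id, clips):
    """Group clips as voice file (first level) -> terminal (second level) -> clip names.

    `clips` is an iterable of (terminal_id, source_id) pairs in recording order.
    The serial number is assumed to count clips per (terminal, source) pair.
    """
    counters = defaultdict(int)    # next serial number per (terminal, source)
    terminals = defaultdict(list)  # one "third directory" per terminal
    for terminal_id, source_id in clips:
        counters[(terminal_id, source_id)] += 1
        seq = counters[(terminal_id, source_id)]
        terminals[f"Terminal {terminal_id}"].append(
            clip_name(terminal_id, source_id, seq)
        )
    return {voice_file_id: dict(terminals)}


tree = build_two_level_directory(
    "meeting", [("1", "A"), ("1", "A"), ("1", "B"), ("2", "A")]
)
```

Here `tree` maps the hypothetical voice file identifier "meeting" to one entry per terminal, each holding the clip names recorded under it.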

In one implementation, the main terminal extracts each sound clip from the overall voice file and performs identity recognition on it. During identity recognition, the identity identifier of the sound source corresponding to the sound clip is determined by matching voiceprint features, and the clip is recorded in the corresponding third directory and named.

Further, the method also includes:

The user selects, through a first input on the display interface, a voice file or sound clip displayed on the display interface to play. For example, as shown in FIG. 9, the user may perform a first operation on the second directory corresponding to the voice file to play or pause the voice file containing the sound clips; perform a second operation on the second directory to expand or collapse the list of third directories under it; perform a third operation on an expanded third directory to expand or collapse the list of sound clips under it; and perform a fourth operation on a sound clip to play or pause that clip.
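The four operations amount to toggling two pieces of state on each directory entry. The sketch below models that state only; the class `DirectoryNode` and its method names are hypothetical, and how each operation maps to a concrete gesture is left open, as it is in the description.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryNode:
    """One entry in the display interface: a directory or a sound clip."""
    name: str
    children: list = field(default_factory=list)
    expanded: bool = False
    playing: bool = False

    def toggle_expand(self) -> bool:
        """Expand or collapse the list under this entry; return the new state."""
        self.expanded = not self.expanded
        return self.expanded

    def toggle_play(self) -> bool:
        """Play or pause this entry; return True when it is now playing."""
        self.playing = not self.playing
        return self.playing
```

In this model the first and fourth operations would call `toggle_play` on a second directory or a clip, and the second and third operations would call `toggle_expand` on a second or third directory.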

As can be seen from the technical solutions provided by the above embodiments of the present invention, when the terminal is the main terminal of a terminal group that includes at least one main terminal and N attached terminals, where N is greater than or equal to 1, the main terminal receives the sound clips collected by the attached terminals and aggregates and saves them as a voice file. With the embodiments of the present invention, the sound source of each sound clip is identified and the clip is recorded in the corresponding directory, so that sound clips can be managed efficiently and displayed and played in an orderly way.

The recording method provided by the embodiments of the present application may be executed by a recording device. In the embodiments of the present application, the recording device is described by taking the case in which the recording device executes the recording method as an example.

As shown in FIG. 11, the recording device includes a sound source localization module 111 and a voice acquisition module 112. The sound source localization module 111 is configured to determine the position of a target sound source through a microphone array, where the target sound source is a sound source generating a voice signal whose sound intensity satisfies a first condition, and the microphone array includes at least three microphones that are not arranged in a straight line. The voice acquisition module 112 is configured to collect voice signals in a directional acquisition mode according to the position of the target sound source to obtain a voice file.

Further, the first condition includes:

the sound intensity of the voice signal exceeds a first threshold.
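A minimal sketch of such a check is given below. The description does not say how sound intensity is measured or what the first threshold is, so the RMS level in decibels relative to full scale and the default of -30 dB are assumptions chosen for illustration, as are the function names.

```python
import math


def sound_level_db(samples):
    """RMS level in dB relative to full scale, for samples in [-1.0, 1.0]."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def satisfies_first_condition(samples, first_threshold_db=-30.0):
    """True when the measured level exceeds the (assumed) first threshold."""
    return sound_level_db(samples) > first_threshold_db
```

A frame of samples at half of full scale measures about -6 dB and passes the assumed threshold, while a frame at one thousandth of full scale measures -60 dB and does not.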

Further, the voice acquisition module is configured to:

collect voice signals from within the collection area.

Further, the collection area is a fan-shaped area whose vertex is the microphone array and whose center line is the line connecting the microphone array to the target sound source.

Further, the voice acquisition module is also configured to:

shield or suppress voice signals from outside the collection area.
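The geometry of the fan-shaped collection area can be sketched in two dimensions as follows. This only tests whether a point lies inside the sector and assigns a gain accordingly; it is not a beamforming implementation. The half-angle of 15 degrees, the gain values and the function names are assumptions, since the description does not specify the opening angle of the sector or how suppression is performed.

```python
import math


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def in_collection_area(array_pos, target_pos, point, half_angle_deg=15.0):
    """True if `point` lies in the sector with its vertex at the array and its
    center line running from the array to the target sound source."""
    center = math.atan2(target_pos[1] - array_pos[1], target_pos[0] - array_pos[0])
    bearing = math.atan2(point[1] - array_pos[1], point[0] - array_pos[0])
    return angle_diff(bearing, center) <= math.radians(half_angle_deg)


def gain_for(array_pos, target_pos, point, outside_gain=0.0):
    """Unit gain inside the collection area; `outside_gain` elsewhere
    (0.0 models shielding, a value between 0 and 1 models suppression)."""
    inside = in_collection_area(array_pos, target_pos, point)
    return 1.0 if inside else outside_gain
```

With the array at the origin and the target at (1, 0), a source at (2, 0.1) falls inside the assumed sector, while sources at (0, 1) or (-1, 0) fall outside and receive `outside_gain`.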

As can be seen from the technical solutions provided by the above embodiments of the present invention, the position of the target sound source is determined through the microphone array, where the target sound source is a sound source generating a voice signal whose sound intensity satisfies the first condition, and the microphone array includes at least three microphones that are not arranged in a straight line; voice signals are then collected in a directional acquisition mode according to the position of the target sound source. With the embodiments of the present invention, the determined target sound source is first located and then collected directionally, so that its voice signal can be effectively separated from a noisy environment and collected, yielding clearer sound clips.
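One way to see why the three microphones must not lie on a straight line is a far-field direction estimate from arrival-time differences. The sketch below is not taken from the application; it assumes a plane wave, a two-dimensional layout, known arrival times and a sound speed of 343 m/s, and it yields only a direction, not a full position. With microphone positions p0, p1, p2 and a unit vector u pointing toward the source, (p_i - p0) · u = -c (t_i - t0) for i = 1, 2, which is a 2x2 linear system that is solvable only when the microphones are not collinear.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def simulate_arrival_times(mics, angle, c=SPEED_OF_SOUND):
    """Relative arrival times of a plane wave coming from direction `angle`.
    Microphones closer to the source hear the wavefront earlier."""
    ux, uy = math.cos(angle), math.sin(angle)
    return [-(x * ux + y * uy) / c for x, y in mics]


def estimate_direction(mics, arrival_times, c=SPEED_OF_SOUND):
    """Estimate the bearing (radians) of a far-field source from three mics."""
    (x0, y0), (x1, y1), (x2, y2) = mics
    t0, t1, t2 = arrival_times
    a11, a12 = x1 - x0, y1 - y0
    a21, a22 = x2 - x0, y2 - y0
    det = a11 * a22 - a12 * a21  # zero exactly when the mics are collinear
    if abs(det) < 1e-12:
        raise ValueError("microphones are collinear; direction is ambiguous")
    b1 = -c * (t1 - t0)
    b2 = -c * (t2 - t0)
    ux = (b1 * a22 - a12 * b2) / det  # Cramer's rule
    uy = (a11 * b2 - a21 * b1) / det
    return math.atan2(uy, ux)
```

For microphones at (0, 0), (0.1, 0) and (0, 0.1) and simulated arrival times for a source at 60 degrees, `estimate_direction` recovers that angle; for three microphones on one line the determinant is zero and the function raises `ValueError`.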

Based on the above embodiment, further, in the process of collecting voice signals in the directional acquisition mode, the voice acquisition module is also configured to:

determine the sound source corresponding to the voice signal according to the voiceprint features of the voice signal.

Further, after the sound source corresponding to the voice signal is determined, the voice acquisition module is also configured to:

Further, after the sound clip is recorded in the first directory corresponding to the respective identity identification information, the voice acquisition module is also configured to:

receive a first input on the display interface;

As can be seen from the technical solutions provided by the above embodiments of the present invention, the sound source generating the voice signal is determined according to the voiceprint features of the voice signal. With the embodiments of the present invention, the sound source corresponding to the voice signal is identified, the corresponding sound clip is extracted and recorded in the corresponding directory, so that sound clips can be managed efficiently and displayed and played in an orderly way.
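Voiceprint matching of this kind is often sketched as a nearest-neighbor search over feature vectors. The example below assumes that a voiceprint feature is already available as a numeric vector and compares it with enrolled reference vectors by cosine similarity; the feature extraction itself, the similarity measure, the threshold of 0.8 and the function names are assumptions, none of which the description specifies.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def identify_source(feature, enrolled, min_similarity=0.8):
    """Return the identity of the best-matching enrolled voiceprint,
    or None when no reference reaches `min_similarity`."""
    best_id, best_score = None, min_similarity
    for source_id, reference in enrolled.items():
        score = cosine_similarity(feature, reference)
        if score >= best_score:
            best_id, best_score = source_id, score
    return best_id
```

With two enrolled sources whose references are [1, 0, 0] and [0, 1, 0], a feature of [0.9, 0.1, 0] is attributed to the first source, and a feature that resembles neither returns `None`, which a caller could treat as a new, unidentified sound source.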

Based on the above embodiment, further, when the recording device is the main terminal of a terminal group that includes at least one main terminal and N attached terminals, where N is greater than or equal to 1, the voice acquisition module is also configured to:

Further, in the process of receiving the voice signals collected by the attached terminals, the voice acquisition module is also configured to:

determine the sound source corresponding to the voice signal;

extract, from the voice file, a sound clip that includes the voice signal;

record, in the display interface, the sound clip under the third directory corresponding to the terminal that collected it, where the third directory is located under the second directory corresponding to the voice file.

As can be seen from the technical solutions provided by the above embodiments of the present invention, when the device is the main terminal of a terminal group that includes at least one main terminal and N attached terminals, where N is greater than or equal to 1, the main terminal receives the sound clips collected by the attached terminals and aggregates and saves them as a voice file. With the embodiments of the present invention, the sound source of each sound clip is identified and the clip is recorded in the corresponding directory, so that sound clips can be managed efficiently and displayed and played in an orderly way.
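The aggregation step on the main terminal can be pictured as merging the clips received from each terminal into one ordered collection. The sketch below is one possible reading: the `Clip` record, its fields and the ordering by start time are assumptions, since the description states only that the collected signals are aggregated into a voice file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """A sound clip as received by the main terminal (hypothetical fields)."""
    terminal_id: str
    source_id: str
    start_s: float  # start time in seconds on an assumed shared clock


def aggregate(clips_by_terminal):
    """Merge the clips from all terminals into one list ordered by start time."""
    merged = [c for clips in clips_by_terminal.values() for c in clips]
    return sorted(merged, key=lambda c: (c.start_s, c.terminal_id))


def group_by_terminal(voice_file):
    """Regroup aggregated clips per terminal, as in the third directories."""
    groups = {}
    for clip in voice_file:
        groups.setdefault(clip.terminal_id, []).append(clip)
    return groups
```

The ordered list stands in for the aggregated voice file, and `group_by_terminal` gives the per-terminal view used when clips are recorded under the third directory of the terminal that collected them.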

The recording device in the embodiments of the present application may be an electronic device, or a component of an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal or a device other than a terminal. For example, the electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a mobile Internet device (MID), an augmented reality (AR) or virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA); it may also be a server, a network attached storage (NAS) device, a personal computer (PC), a television (TV), a teller machine, or a self-service machine. The embodiments of the present application impose no specific limitation.

The recording device in the embodiments of the present application may be a device with an operating system. The operating system may be Android, iOS, or another possible operating system; the embodiments of the present application impose no specific limitation.

The recording device provided by the embodiments of the present application can implement each process implemented by the method embodiments of FIG. 1 to FIG. 10; to avoid repetition, details are not repeated here.

Optionally, as shown in FIG. 12, an embodiment of the present application further provides an electronic device 1200 including a processor 1201 and a memory 1202. The memory 1202 stores a program or instructions executable on the processor 1201; when executed by the processor 1201, the program or instructions implement each step of the above recording method embodiments and achieve the same technical effect. To avoid repetition, details are not repeated here.

It should be noted that the electronic devices in the embodiments of the present application include the mobile electronic devices and non-mobile electronic devices described above.

FIG. 13 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present application.

The electronic device 1300 includes, but is not limited to, a radio frequency unit 1301, a network module 1302, an audio output unit 1303, an input unit 1304, a sensor 1305, a display unit 1306, a user input unit 1307, an interface unit 1308, a memory 1309, a processor 1310, and other components.

Those skilled in the art will understand that the electronic device 1300 may also include a power supply (such as a battery) that powers the components. The power supply may be logically connected to the processor 1310 through a power management system, so that charging, discharging, and power consumption are managed through the power management system. The structure shown in FIG. 13 does not limit the electronic device; the electronic device may include more or fewer components than shown, combine certain components, or arrange the components differently, which is not described further here.

The processor 1310 is configured to determine the position of the target sound source through the input unit 1304, where the target sound source is a sound source generating a voice signal whose sound intensity satisfies the first condition, and the input unit 1304 includes at least three microphones that are not arranged in a straight line;

the input unit 1304 is configured to collect voice signals in the directional acquisition mode according to the position of the target sound source to obtain a voice file.

Further, the input unit 1304 is configured to:

Further, the input unit 1304 is also configured to:

With the embodiments of the present invention, the determined target sound source is first located and then collected directionally, so that its voice signal can be effectively separated from a noisy environment and collected, yielding clearer sound clips.

Based on the above embodiment, further, in the process of collecting voice signals in the directional acquisition mode, the processor 1310 is also configured to determine the sound source corresponding to the voice signal according to the voiceprint features of the voice signal.

Further, after the sound source corresponding to the voice signal is determined, the processor 1310 is also configured to:

Further, the user input unit 1307 is configured to receive a first input on the display interface;

the audio output unit 1303 is configured to play the sound clip or voice file in response to the first input.

With the embodiments of the present invention, the sound source corresponding to the voice signal is identified, the corresponding sound clip is extracted and recorded in the corresponding directory, so that sound clips can be managed efficiently and displayed and played in an orderly way.

Further, when the recording device is the main terminal of a terminal group that includes at least one main terminal and N attached terminals, where N is greater than or equal to 1, the radio frequency unit 1301 is also configured to receive and aggregate the voice signals collected by the attached terminals to obtain a voice file.

Further, in the process of receiving the voice signals collected by the attached terminals, the processor 1310 is also configured to:

With the embodiments of the present invention, the sound source of each sound clip is identified and the clip is recorded in the corresponding directory, so that sound clips can be managed efficiently and displayed and played in an orderly way.

It should be understood that, in the embodiments of the present application, the input unit 1304 may include a graphics processing unit (GPU) 13041 and a microphone 13042. The graphics processing unit 13041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode. The display unit 1306 may include a display panel 13061, which may be configured as a liquid crystal display, an organic light-emitting diode display, or the like. The user input unit 1307 includes at least one of a touch panel 13071 and other input devices 13072. The touch panel 13071, also called a touch screen, may include two parts: a touch detection device and a touch controller. The other input devices 13072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not described further here.

The memory 1309 may be used to store software programs and various data. The memory 1309 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, where the first storage area may store an operating system and the application programs or instructions required for at least one function (such as a sound playback function or an image playback function). In addition, the memory 1309 may include volatile memory or non-volatile memory, or both. The non-volatile memory may be read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), or direct Rambus RAM (DRRAM). The memory 1309 in the embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.

The processor 1310 may include one or more processing units. Optionally, the processor 1310 integrates an application processor and a modem processor, where the application processor mainly handles operations involving the operating system, the user interface, and application programs, and the modem processor, such as a baseband processor, mainly handles wireless communication signals. It can be understood that the modem processor may also not be integrated into the processor 1310.

An embodiment of the present application further provides a readable storage medium storing a program or instructions. When executed by a processor, the program or instructions implement each process of the above recording method embodiments and achieve the same technical effect; to avoid repetition, details are not repeated here.

The processor is the processor of the electronic device described in the above embodiments. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

An embodiment of the present application further provides a chip including a processor and a communication interface coupled to the processor. The processor is configured to run a program or instructions to implement each process of the above recording method embodiments and achieve the same technical effect; to avoid repetition, details are not repeated here.

It should be understood that the chip mentioned in the embodiments of the present application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.

An embodiment of the present application provides a computer program product stored in a storage medium. The program product is executed by at least one processor to implement each process of the above recording method embodiments and achieve the same technical effect; to avoid repetition, details are not repeated here.

It should be noted that, in this document, the terms "comprise", "include", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element. In addition, the scope of the methods and devices in the embodiments of the present application is not limited to performing functions in the order shown or discussed; functions may also be performed in a substantially simultaneous manner or in the reverse order, depending on the functions involved. For example, the described methods may be performed in an order different from that described, and steps may be added, omitted, or combined. Features described with reference to certain examples may also be combined in other examples.

From the description of the above implementations, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a computer software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.

The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the specific implementations described, which are illustrative rather than restrictive. Guided by the present application, those of ordinary skill in the art may devise many other forms without departing from the purpose of the present application and the scope protected by the claims, all of which fall within the protection of the present application.