CN116436715A

Info

Publication number: CN116436715A
Application number: CN202310431579.6A
Authority: CN
Inventors: 方斌; 段克; 马起礼; 黄伟
Original assignee: Beijing Shitong Science And Technology Co ltd
Current assignee: Beijing Shitong Science And Technology Co ltd
Priority date: 2023-04-20
Filing date: 2023-04-20
Publication date: 2023-07-14

Abstract

Translated from Chinese

The application relates to a video conference control method, apparatus, device and computer-readable storage medium, applied in the technical field of conference systems. The method includes: obtaining personnel information of the conference participants and the current conference content, and determining the current speaker based on the personnel information and the current conference content; in response to a speech interruption request, obtaining speech information and facial information of the current speaker; determining the speaking state of the current speaker based on the speech information and the facial information; judging, based on the speaking state, whether the current speaker can be interrupted immediately; if the current speaker can be interrupted immediately, generating a speech interruption notification based on requester information and sending the speech interruption notification to the conference interface of the current speaker; if the current speaker cannot be interrupted immediately, collecting the interrupting speech information of the requester of the speech interruption request; and generating a speaking strategy based on the interrupting speech information and the speaking state. The application has the effect of improving the efficiency of the conference.

Description

Video conference control method, apparatus, device and computer-readable storage medium

Technical field

The present application relates to the technical field of conference systems, and in particular to a video conference control method, apparatus, device and computer-readable storage medium.

Background

With the continuous development of Internet technology, people place ever higher demands on the timeliness and urgency of solving problems, and video conferencing is widely used in daily life.

Current video conferencing offers only two modes: turning the microphone on to speak and turning it off to stop speaking. When many people take part in a conference, several of them can easily turn on their microphones and speak at the same time, which degrades the quality of the conference. When other interference factors are also present, the conference falls into disorder, which disrupts its normal progress and reduces its efficiency.

Summary of the invention

In order to improve the efficiency of a conference, the present application provides a video conference control method, apparatus, device and computer-readable storage medium.

In a first aspect, the present application provides a video conference control method, which adopts the following technical solution:

A video conference control method, comprising:

obtaining personnel information of the conference participants and the current conference content, and determining the current speaker based on the personnel information and the current conference content;

in response to a speech interruption request, obtaining speech information and facial information of the current speaker;

determining the speaking state of the current speaker based on the speech information and the facial information;

judging, based on the speaking state, whether the current speaker can be interrupted immediately;

if the current speaker can be interrupted immediately, obtaining requester information of the speech interruption request;

generating a speech interruption notification based on the requester information, and sending the speech interruption notification to the conference interface of the current speaker;

if the current speaker cannot be interrupted immediately, collecting the interrupting speech information of the requester of the speech interruption request;

generating a speaking strategy based on the interrupting speech information and the speaking state.

By adopting the above technical solution, the current speaker is determined according to the current conference content. Once the current speaker is determined, the speech information and facial information of the current speaker are collected with priority. When a speech interruption request is received, the speaking state of the current speaker is determined from the speech information and the facial information. If the speaking state permits an interruption, the current speaker is interrupted and notified; if it does not, a speaking strategy is generated so that the requester can speak at an appropriate time. This reduces the possibility that several people speaking at the same time disturb the normal progress of the conference, thereby improving the efficiency of the conference.

Optionally, determining the current speaker based on the personnel information and the current conference content includes:

obtaining first voice information of the conference participants and second voice information of the sender of the current conference content;

searching the first voice information for voice information identical to the second voice information;

determining the current speaker based on the found voice information and the personnel information.

Optionally, determining the speaking state of the current speaker based on the speech information and the facial information includes:

determining the speech rate and speech volume of the current speaker based on the speech information;

determining the speaking urgency level of the current speaker based on the speech rate, the speech volume and preset level rules;

determining the emotional state of the current speaker based on the facial information and preset emotion rules;

determining the speaking state of the current speaker based on the speaking urgency level and the emotional state.

Optionally, generating the speech interruption notification based on the requester information includes:

obtaining the speech content of the current speaker, analyzing the speech content, and generating a preset interruption reason;

obtaining the request time and the number of requests of the speech interruption request;

calculating a request frequency based on the request time and the number of requests;

obtaining a preset request level rule, and determining a request level based on the preset request level rule and the request frequency;

generating the speech interruption notification based on the interruption reason and the request level.

Optionally, the interrupting speech information includes interrupting speech content, and generating the speaking strategy based on the interrupting speech information and the speaking state includes:

calculating the relevance between the interrupting speech content and the current conference content;

judging whether the relevance is not less than a preset relevance threshold;

if the relevance is not less than the preset relevance threshold, playing the interrupting speech content directly when the current speaker pauses;

if the relevance is less than the preset relevance threshold, generating a speaking prompt based on the interrupting speech content and sending the speaking prompt to the conference interface of the current speaker.

Optionally, generating the speaking prompt based on the interrupting speech content includes:

extracting keywords from the interrupting speech content, and generating a speech summary based on the extracted keywords;

obtaining the name of the requester, and generating the speaking prompt based on the speech summary and the name.

Optionally, the method further includes:

obtaining the conference speech content and the speech time of all the conference participants;

generating a conference report based on the conference speech content and the speech time.

In a second aspect, the present application provides a video conference control apparatus, which adopts the following technical solution:

A video conference control apparatus, comprising:

a current speaker determination module, configured to obtain personnel information of the conference participants and the current conference content, and to determine the current speaker based on the personnel information and the current conference content;

a speech information acquisition module, configured to obtain speech information and facial information of the current speaker in response to a speech interruption request;

a speaking state confirmation module, configured to determine the speaking state of the current speaker based on the speech information and the facial information;

a speech interruption judging module, configured to judge, based on the speaking state, whether the current speaker can be interrupted immediately;

an interruption request acquisition module, configured to obtain requester information of the speech interruption request;

an interruption notification generation module, configured to generate a speech interruption notification based on the requester information, and to send the speech interruption notification to the conference interface of the current speaker;

an interruption information acquisition module, configured to collect the interrupting speech information of the requester of the speech interruption request;

a speaking strategy generation module, configured to generate a speaking strategy based on the interrupting speech information and the speaking state.

In a third aspect, the present application provides an electronic device, which adopts the following technical solution:

An electronic device, comprising a processor coupled to a memory;

the processor is configured to execute a computer program stored in the memory, so that the electronic device performs the video conference control method according to any one of the implementations of the first aspect.

In a fourth aspect, the present application provides a computer-readable storage medium, which adopts the following technical solution:

A computer-readable storage medium storing a computer program that can be loaded by a processor to perform the video conference control method according to any one of the implementations of the first aspect.

Brief description of the drawings

Fig. 1 is a schematic flowchart of a video conference control method provided by an embodiment of the present application.

Fig. 2 is a structural block diagram of a video conference control apparatus provided by an embodiment of the present application.

Fig. 3 is a structural block diagram of an electronic device provided by an embodiment of the present application.

Detailed description

The present application is described in further detail below with reference to the accompanying drawings.

An embodiment of the present application provides a video conference control method that can be executed by an electronic device. The electronic device may be a server or a terminal device. The server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides cloud computing services. The terminal device may be a smartphone, a tablet computer, a desktop computer or the like, but is not limited thereto.

Fig. 1 is a schematic flowchart of a video conference control method provided by an embodiment of the present application.

As shown in Fig. 1, the main flow of the method is as follows (steps S101 to S108):

Step S101: obtain personnel information of the conference participants and the current conference content, and determine the current speaker based on the personnel information and the current conference content.

For step S101: obtain first voice information of the conference participants and second voice information of the sender of the current conference content; search the first voice information for voice information identical to the second voice information; determine the current speaker based on the found voice information and the personnel information.

In this embodiment, the personnel information of a conference participant includes the person's name, position, voice information and facial information. The person's voice information is the first voice information and includes the person's timbre, pronunciation characteristics, pronunciation frequency and the like. The person's facial information includes facial features, habitual expressions and the like.

During a conference, the microphones of several participants may be turned on at the same time while only one participant is actually speaking. In that case the current speaker cannot be determined from the microphone state alone, so the real current speaker has to be determined from the voice information of the sender of the current conference content, that is, the person who is speaking in the conference. The information sent by the sender of the current conference content is collected to obtain the second voice information, and the second voice information is compared with the first voice information in the personnel information. The person whose first voice information is identical to the second voice information is the current speaker. Once the current speaker is determined, only the current speaker's microphone is kept on, and the microphones of the other participants who have turned on their microphones but are not speaking are turned off.
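The matching in step S101 can be sketched as follows. This is a minimal illustration, not the method of the application: it assumes that each piece of voice information has already been reduced to a fixed-length feature vector, and it approximates "identical" voice information by a cosine similarity above a hypothetical threshold. The function names, the vector representation and the threshold value are all assumptions.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_current_speaker(
    first_voice_info: Dict[str, Sequence[float]],  # participant name -> enrolled voice vector
    second_voice_info: Sequence[float],            # voice vector of the live speech
    threshold: float = 0.9,                        # hypothetical "same voice" threshold
) -> Optional[str]:
    """Return the participant whose enrolled voice best matches the live voice."""
    best_name, best_score = None, threshold
    for name, voice in first_voice_info.items():
        score = cosine_similarity(voice, second_voice_info)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def microphones_to_close(open_microphones: Sequence[str], current_speaker: str) -> List[str]:
    """Participants with an open microphone who are not the current speaker."""
    return [name for name in open_microphones if name != current_speaker]
```

When no enrolled voice reaches the threshold, the sketch returns `None` and leaves the decision to the caller; the source does not say what happens in that case.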

Step S102: in response to a speech interruption request, obtain speech information and facial information of the current speaker.

In this embodiment, the microphone state of every conference participant is monitored in real time. When a participant is detected turning on the microphone, a speaking inquiry is sent to that participant's conference interface. If the answer to the inquiry is yes, the microphone operation is treated as a speech interruption request. When a speech interruption request is generated, the speech information and facial information of the current speaker are collected with priority.

Step S103: determine the speaking state of the current speaker based on the speech information and the facial information.

For step S103: determine the speech rate and speech volume of the current speaker based on the speech information; determine the speaking urgency level of the current speaker based on the speech rate, the speech volume and preset level rules; determine the emotional state of the current speaker based on the facial information and preset emotion rules; determine the speaking state of the current speaker based on the speaking urgency level and the emotional state.

In this embodiment, the speech information includes the speech rate and speech volume of the current speaker, and the facial information includes expression information of the current speaker. The preset level rules include a speech rate level rule and a volume level rule. The speech rate level is determined from the speech rate level rule and the speech rate, the volume level is determined from the volume level rule and the speech volume, and the two levels are added to obtain the speaking urgency level. The speech rate level rule defines word-count intervals, each of which corresponds to a speech rate level; the larger the values of an interval, the higher the corresponding level. The number of words spoken per unit time is collected and matched against the word-count intervals, and the level of the matching interval is the speech rate level of the current speaker. The widths of the word-count intervals may be equal or may decrease from one interval to the next, and they must be set according to the normal and limit values of human speech rate so that they do not depart from the realistic range. The unit time is set, according to actual needs, to a whole number of seconds within one minute, such as 10, 15 or 20 seconds.

The volume level rule determines the volume level from the average decibel value per unit time. Each decibel value corresponds to a volume level. The average decibel value is compared with the decibel values set in the volume level rule, and the volume level corresponding to the decibel value equal to the average is taken as the volume level of the current speaker.
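The two level rules and their sum can be sketched as below. The interval boundaries are hypothetical placeholders, since the source gives no concrete values. The volume rule is also written as intervals, which is an assumption of this sketch: the source describes a lookup in which each decibel value has its own level.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical lower bounds of levels 2..5; values below the first bound are level 1.
# Word counts are per unit time (here assumed to be a 10-second window).
RATE_BOUNDS = (20, 40, 60, 80)
# Hypothetical decibel lower bounds of levels 2..5.
VOLUME_BOUNDS = (40.0, 55.0, 70.0, 85.0)


def level_from_bounds(value: float, bounds: Sequence[float]) -> int:
    """Map a value to a 1-based level given sorted lower bounds of the higher levels."""
    return bisect_right(bounds, value) + 1


def urgency_level(word_count: int, decibel_samples: Sequence[float]) -> int:
    """Speaking urgency level = speech rate level + volume level."""
    if not decibel_samples:
        raise ValueError("at least one decibel sample is required")
    mean_db = sum(decibel_samples) / len(decibel_samples)
    rate_level = level_from_bounds(word_count, RATE_BOUNDS)
    volume_level = level_from_bounds(mean_db, VOLUME_BOUNDS)
    return rate_level + volume_level
```

For example, 45 words in the window with an average of 55 dB falls in level 3 for both rules under these placeholder bounds, giving an urgency level of 6.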

After the speaking urgency level is determined, the emotional state of the current speaker is determined from the current speaker's facial expression. Different facial expressions correspond to different emotional states. The collected facial expression is matched against the preset emotion rules, which specify the emotional state and emotional state level for each expression, so the matching result reflects the emotional state and emotional state level of the current speaker. For example, a facial expression with a frown and tense, gathered facial muscles corresponds to the emotion "angry" with an anger level of 5. The speaking state is the combination of the speaking urgency level, the emotional state and the emotional state level. It should be noted that the specific emotional states and emotion levels need to be set according to real human reactions and are not specifically limited here.
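A minimal sketch of the emotion rule matching and of the combined speaking state follows. Only the "frown plus tense facial muscles gives angry, level 5" rule comes from the text; the other rule, the cue names, the fallback value and the `SpeakingState` type are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple

# Preset emotion rules: a set of expression cues -> (emotional state, level).
# The first rule mirrors the example in the text; the second is hypothetical.
EMOTION_RULES: Dict[FrozenSet[str], Tuple[str, int]] = {
    frozenset({"frown", "tense_facial_muscles"}): ("angry", 5),
    frozenset({"smile"}): ("calm", 1),
}


@dataclass(frozen=True)
class SpeakingState:
    urgency_level: int
    emotion: str
    emotion_level: int


def match_emotion(expression_cues: Iterable[str]) -> Tuple[str, int]:
    """Return the rule whose cues are all present, preferring the most specific rule."""
    cues = frozenset(expression_cues)
    best, best_size = ("neutral", 0), 0  # hypothetical fallback when nothing matches
    for rule_cues, result in EMOTION_RULES.items():
        if rule_cues <= cues and len(rule_cues) > best_size:
            best, best_size = result, len(rule_cues)
    return best


def build_speaking_state(urgency_level: int, expression_cues: Iterable[str]) -> SpeakingState:
    """Combine the urgency level with the matched emotional state and its level."""
    emotion, emotion_level = match_emotion(expression_cues)
    return SpeakingState(urgency_level, emotion, emotion_level)
```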

Step S104: judge, based on the speaking state, whether the current speaker can be interrupted immediately.

The judgment takes into account the network environment as well as the speaking state. If either of the two does not satisfy the preset conditions, it is determined that the current speaker cannot be interrupted immediately; only when both satisfy the preset conditions is it determined that the current speaker can be interrupted immediately. The preset conditions are conference conditions set according to the current network environment and the ability to converse normally, and need to be adjusted according to the actual situation.
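The two-condition check of step S104 can be sketched as a simple conjunction. The source leaves the preset conditions open, so the thresholds below, and the use of round-trip latency as the measure of the network environment, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterruptConditions:
    """Hypothetical preset conditions; the source says they are tuned per deployment."""
    max_urgency_level: int = 6
    max_emotion_level: int = 3
    max_latency_ms: float = 300.0


def can_interrupt_now(
    urgency_level: int,
    emotion_level: int,
    latency_ms: float,
    conditions: InterruptConditions = InterruptConditions(),
) -> bool:
    """True only if both the speaking state and the network environment qualify."""
    state_ok = (
        urgency_level <= conditions.max_urgency_level
        and emotion_level <= conditions.max_emotion_level
    )
    network_ok = latency_ms <= conditions.max_latency_ms
    return state_ok and network_ok
```

A `True` result leads to steps S105 and S106; a `False` result leads to steps S107 and S108.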

Step S105: if the current speaker can be interrupted immediately, obtain requester information of the speech interruption request.

In this embodiment, the requester information includes the requester's name, gender, voice information, facial information and the like.

Step S106: generate a speech interruption notification based on the requester information, and send the speech interruption notification to the conference interface of the current speaker.

For step S106: obtain the speech content of the current speaker, analyze the speech content, and generate a preset interruption reason; obtain the request time and the number of requests of the speech interruption request; calculate a request frequency based on the request time and the number of requests; obtain a preset request level rule, and determine a request level based on the preset request level rule and the request frequency; generate the speech interruption notification based on the interruption reason and the request level.

In this embodiment, after it is determined that the current speaker can be interrupted immediately, the requester is prompted to speak normally so that the mood of the current speaker is not affected. This speech is not the complete speech but a brief statement of the reason for speaking. The requester's speech content is recorded and stored, semantic analysis is performed on it to determine its key content, and the speaking urgency level is determined in the manner described above. The preset interruption reason is generated from the key content and the speaking urgency level. In addition, the request time and the number of requests of the speech interruption request are collected and recorded, the ratio of the request time and the number of requests is calculated, and the ratio is taken as the request frequency. Under the preset request level rule, each request frequency corresponds to one request level, and several different request frequencies may correspond to the same request level. The preset interruption reason and the request level are combined to generate the speech interruption notification, so that the current speaker learns the reason for the interruption promptly when interrupted.
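The request frequency, request level and notification of step S106 can be sketched as follows. The text only says that the frequency is the ratio of the request time and the number of requests; this sketch assumes requests per minute. The level boundaries and the wording of the notification are likewise hypothetical.

```python
from bisect import bisect_right

# Hypothetical lower bounds (requests per minute) of request levels 2..4.
FREQUENCY_BOUNDS = (0.5, 1.0, 2.0)


def request_frequency(request_count: int, elapsed_seconds: float) -> float:
    """Requests per minute over the observed period (assumed reading of the ratio)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return request_count / (elapsed_seconds / 60.0)


def request_level(frequency: float) -> int:
    """Several frequencies map to the same level, as the request level rule allows."""
    return bisect_right(FREQUENCY_BOUNDS, frequency) + 1


def build_interrupt_notification(requester_name: str, reason: str, level: int) -> str:
    """Combine the interruption reason and the request level into one notification."""
    return f"{requester_name} requests to speak (request level {level}): {reason}"
```

For instance, three requests within two minutes give 1.5 requests per minute, which falls in level 3 under these placeholder bounds.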

Step S107: if the current speaker cannot be interrupted immediately, collect the interrupting speech information of the requester of the speech interruption request.

Step S108: generate a speaking strategy based on the interrupting speech information and the speaking state.

For step S108: calculate the relevance between the interrupting speech content and the current conference content; judge whether the relevance is not less than a preset relevance threshold; if the relevance is not less than the preset relevance threshold, play the interrupting speech content directly when the current speaker pauses; if the relevance is less than the preset relevance threshold, generate a speaking prompt based on the interrupting speech content and send the speaking prompt to the conference interface of the current speaker.

Further, keywords are extracted from the interrupting speech content, and a speech summary is generated based on the extracted keywords; the name of the requester is obtained, and the speaking prompt is generated based on the speech summary and the name.

In this embodiment, after it is determined that the current speaker cannot be interrupted immediately, the requester is prompted to state the interrupting speech information right away, so that the requester's speech is not affected by the delay. All of the interrupting speech information is stored, keywords are extracted from it, and the extracted keywords are compared with the keywords of the current conference content to determine the relevance. When the keywords are identical, the relevance is set directly to the highest value. When the keywords differ but belong to the same type or the same context, the relevance is determined according to a preset difference relevance value. When the relevance is greater than or equal to the preset relevance threshold, the interrupting speech content is played as soon as the current speaker pauses. A speech pause means that the current speaker does not speak for a certain period of time; for example, if no speech is detected within 5 seconds after the current speaker last spoke, a speech pause is determined. When the relevance is less than the preset relevance threshold, a speech summary is generated from the keywords extracted from the interrupting speech information, a speaking prompt is generated from the speech summary and the requester's name, and the speaking prompt is sent to the conference interface of the current speaker. The interrupting speech content is played when the current speaker allows it.
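The relevance rule and the resulting speaking strategy can be sketched as below. The keyword-to-category table, the relevance value for keywords of the same category, the threshold and the prompt wording are hypothetical; only the 5-second pause comes from the example in the text. Keyword extraction itself is assumed to have been done already.

```python
from typing import Dict, Iterable

# Hypothetical table assigning keywords to a type; real rules would be configured.
KEYWORD_CATEGORY: Dict[str, str] = {
    "budget": "finance", "cost": "finance",
    "schedule": "planning", "deadline": "planning",
}
MAX_RELEVANCE = 1.0            # identical keyword
SAME_CATEGORY_RELEVANCE = 0.6  # hypothetical preset difference relevance value


def relevance(interjection_keywords: Iterable[str], meeting_keywords: Iterable[str]) -> float:
    """Highest relevance found between any interjection keyword and the meeting keywords."""
    meeting = set(meeting_keywords)
    meeting_categories = {KEYWORD_CATEGORY[k] for k in meeting if k in KEYWORD_CATEGORY}
    best = 0.0
    for keyword in interjection_keywords:
        if keyword in meeting:
            return MAX_RELEVANCE
        if KEYWORD_CATEGORY.get(keyword) in meeting_categories:
            best = max(best, SAME_CATEGORY_RELEVANCE)
    return best


def speaking_strategy(interjection_keywords, meeting_keywords, requester_name, threshold=0.5):
    """Play on the next pause if relevant enough, otherwise send a prompt to the speaker."""
    keywords = list(interjection_keywords)
    if relevance(keywords, meeting_keywords) >= threshold:
        return {"action": "play_on_pause"}
    summary = ", ".join(sorted(set(keywords)))
    return {"action": "send_prompt",
            "prompt": f"{requester_name} wishes to speak about: {summary}"}


def is_speech_pause(seconds_since_last_speech: float, pause_seconds: float = 5.0) -> bool:
    """A pause is declared once no speech has been detected for pause_seconds."""
    return seconds_since_last_speech >= pause_seconds
```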

In this embodiment, the conference speech content and the speech time of all the conference participants are obtained, and a conference report is generated based on the conference speech content and the speech time.

Video conferences are usually recorded. However, because interrupting speech content may be recorded, stored and played back at a later moment, the information obtained when the recording is watched again may be incoherent. Therefore, the conference speech content of every participant is arranged in the order of the speech time to generate the conference report, so that a reader of the report can quickly see who said what and when.
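Ordering the speech content by speech time can be sketched as a sort on timestamps. The `Utterance` type and the report line format are assumptions; the text only requires that each participant's speech be arranged chronologically.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class Utterance:
    speaker: str
    spoken_at: datetime  # time the speech was expressed, not the time it was played back
    text: str


def build_meeting_report(utterances: Iterable[Utterance]) -> List[str]:
    """Return one report line per utterance, ordered by the time it was spoken."""
    ordered = sorted(utterances, key=lambda u: u.spoken_at)
    return [f"{u.spoken_at:%H:%M:%S} {u.speaker}: {u.text}" for u in ordered]
```

Sorting on the time of expression rather than on playback time is what keeps a deferred interjection next to the remarks it responded to.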

图2为申请实施例提供的一种视频会议控制装置200的结构框图。Fig. 2 is a structural block diagram of a videoconference control device 200 provided in the embodiment of the application.

如图2所示，视频会议控制装置200主要包括：As shown in Figure 2, the videoconference control device 200 mainly includes:

当前发言确定模块201，用于获取会议参与人员的人员信息和当前会议内容，基于人员信息和当前会议内容确定当前发言人；The currentspeech determination module 201 is used to obtain the personnel information and the current meeting content of the meeting participants, and determine the current speaker based on the personnel information and the current meeting content;

发言信息获取模块202，用于响应于发言打断请求，获取当前发言人的发言信息和面部信息；Speechinformation acquisition module 202, used to respond to the speech interruption request, and obtain the speech information and facial information of the current speaker;

发言状态确认模块203，用于基于发言信息和面部信息确定当前发言人的发言状态；Speakingstatus confirmation module 203, for determining the speaking status of the current speaker based on speaking information and facial information;

发言打断判断模块204，用于基于发言状态判断是否能够立即打断当前发言人；Speechinterruption judging module 204, for judging whether the current speaker can be interrupted immediately based on the speech state;

打断请求获取模块205，用于获取发言打断请求的请求人信息；Interruptionrequest acquisition module 205, configured to acquire the requester information of the speaking interruption request;

打断通知生成模块206，用于基于请求人信息生成发言打断通知，将发言打断通知发送至当前发言人的会议界面；Interruptionnotification generating module 206, configured to generate a speech interruption notification based on the requester information, and send the speech interruption notification to the conference interface of the current speaker;

打断信息获取模块207，用于打断采集发言打断请求的请求人的打断发言信息；The interruptioninformation acquisition module 207 is used to interrupt and collect the interruption speech information of the requester of the speech interruption request;

a speech strategy generation module 208, configured to generate a speech strategy based on the interrupting speech information and the speech status.

As an optional implementation of this embodiment, the current speech determination module 201 is specifically configured to: obtain first voice information of the conference participants and second voice information of the sender of the current conference content; search the first voice information for voice information identical to the second voice information; and determine the current speaker based on that voice information and the personnel information.
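A minimal sketch of this matching step follows. The text only requires finding voice information identical to the second voice information; representing voice information as numeric feature vectors, comparing them by cosine similarity, and the `min_similarity` threshold are all assumptions made here for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_current_speaker(participant_voices, sender_voice, personnel,
                         min_similarity=0.95):
    """Return the name of the participant whose voice matches the sender's.

    participant_voices: participant id -> voice feature vector (first voice info)
    sender_voice: feature vector of the content sender (second voice info)
    personnel: participant id -> name (personnel information)
    """
    best_id, best_score = None, min_similarity
    for pid, voice in participant_voices.items():
        score = cosine(voice, sender_voice)
        if score >= best_score:
            best_id, best_score = pid, score
    return personnel.get(best_id) if best_id is not None else None


voices = {"p1": [1.0, 0.0, 0.2], "p2": [0.1, 0.9, 0.3]}
personnel = {"p1": "Zhang", "p2": "Li"}
speaker = find_current_speaker(voices, [0.1, 0.9, 0.3], personnel)
```

A similarity threshold is used instead of strict equality because two captures of the same voice rarely produce bit-identical features; a stricter threshold approximates the "identical" wording more closely.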

As an optional implementation of this embodiment, the speech status confirmation module 203 is specifically configured to: determine the speech rate and speech volume of the current speaker based on the speech information; determine the speech urgency level of the current speaker based on the speech rate, the speech volume, and a preset level rule; determine the emotional state of the current speaker based on the facial information and a preset emotion rule; and determine the speech status of the current speaker based on the speech urgency level and the emotional state.
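The two-stage determination above might look like the sketch below. The numeric thresholds, the three-level urgency scale, the emotion labels, and the status names are hypothetical; the emotion label is assumed to come from an upstream facial-analysis step that is not shown.

```python
def urgency_level(words_per_minute, volume_db,
                  rate_limits=(120, 180), volume_limits=(55, 70)):
    """Map speech rate and volume to 0 (low), 1 (medium), or 2 (high).

    The limits stand in for the preset level rule; the higher of the two
    individual grades is taken as the urgency level.
    """
    def grade(value, limits):
        low, high = limits
        if value < low:
            return 0
        return 1 if value < high else 2

    return max(grade(words_per_minute, rate_limits),
               grade(volume_db, volume_limits))


def speech_status(urgency, emotion):
    """Combine urgency level and emotional state into a speech status."""
    if urgency == 2 or emotion == "agitated":
        return "do_not_interrupt"
    if urgency == 0 and emotion == "calm":
        return "interruptible"
    return "wait_for_pause"


status = speech_status(urgency_level(100, 50), "calm")
```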

As an optional implementation of this embodiment, the interruption notification generation module 206 is specifically configured to: obtain the speech content of the current speaker, analyze the speech content, and generate a preset interruption reason; obtain the request time and the number of requests of the speech interruption request; calculate a request frequency based on the request time and the number of requests; obtain a preset request level rule and determine a request level based on the preset request level rule and the request frequency; and generate the speech interruption notification based on the interruption reason and the request level.
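As a rough illustration of the frequency and level calculation, the sketch below treats the request frequency as requests per minute since the first request. That definition, the threshold table, the level names, and the notification wording are assumptions; the text does not specify them.

```python
def request_frequency(first_request_s, now_s, request_count):
    """Requests per minute since the first request (timestamps in seconds)."""
    # Clamp the elapsed time to at least one second to avoid division by zero.
    elapsed_min = max(now_s - first_request_s, 1.0) / 60.0
    return request_count / elapsed_min


def request_level(frequency, level_rules=((3.0, "urgent"), (1.0, "normal"))):
    """Return the first level whose threshold the frequency reaches.

    level_rules stands in for the preset request level rule and must be
    ordered from the highest threshold to the lowest.
    """
    for threshold, level in level_rules:
        if frequency >= threshold:
            return level
    return "low"


def build_notification(requester, reason, level):
    """Compose the text shown on the current speaker's conference interface."""
    return f"[{level}] {requester} requests to interrupt: {reason}"


freq = request_frequency(first_request_s=0, now_s=60, request_count=3)
notice = build_notification("Li", "clarification needed", request_level(freq))
```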

As an optional implementation of this embodiment, the speech strategy generation module 208 includes:

a correlation calculation module, configured to calculate the correlation between the interrupting speech content and the current conference content;

a correlation judging module, configured to judge whether the correlation is not less than a preset correlation threshold;

a speech playback module, configured to directly play the interrupting speech content when the current speaker pauses;

a prompt generation module, configured to generate a speech prompt based on the interrupting speech content and to send the speech prompt to the conference interface of the current speaker.
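The sub-modules above can be read as a threshold decision, sketched below. Two assumptions are made: correlation is approximated by word-set (Jaccard) overlap, which the text does not prescribe, and a correlation at or above the threshold is taken to select playback at the next pause while a lower one selects the prompt. The threshold value and strategy names are hypothetical.

```python
def correlation(interrupt_text, meeting_text):
    """Jaccard overlap of the two texts' word sets, in the range 0.0 to 1.0."""
    a = set(interrupt_text.lower().split())
    b = set(meeting_text.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def choose_strategy(interrupt_text, meeting_text, threshold=0.3):
    """Pick a speech strategy by comparing correlation with the threshold."""
    if correlation(interrupt_text, meeting_text) >= threshold:
        # Closely related content: play it when the current speaker pauses.
        return "play_at_next_pause"
    # Loosely related content: only show a prompt to the current speaker.
    return "send_prompt_to_speaker"


strategy = choose_strategy("budget plan review", "budget plan")
```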

In this optional embodiment, the prompt generation module is specifically configured to: extract keywords from the speech content and generate a speech summary based on the extracted keywords; and obtain the name of the requester and generate the speech prompt based on the speech summary and the name.
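One simple way to realize this is frequency-based keyword extraction, sketched below. The stop-word list, the choice of word frequency as the ranking, and the prompt wording are assumptions for illustration; the text does not name an extraction method.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "for", "of", "to", "is", "in", "on"}


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stop-words in the text."""
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(top_n)]


def build_speech_prompt(requester_name, speech_text, top_n=3):
    """Build the prompt from the requester's name and a keyword summary."""
    summary = ", ".join(extract_keywords(speech_text, top_n))
    return f"{requester_name} wants to speak about: {summary}"


prompt = build_speech_prompt(
    "Li", "The budget for the budget review needs a budget cut and a review",
    top_n=2)
```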

As an optional implementation of this embodiment, the video conference control device 200 further includes:

a conference speech acquisition module, configured to obtain the conference speech content and the speech expression time of all conference participants;

a conference report generation module, configured to generate a conference report based on the conference speech content and the speech expression time.

In one example, a module in any of the above devices may be one or more integrated circuits configured to implement the above method, for example: one or more application-specific integrated circuits (ASICs), one or more digital signal processors (DSPs), one or more field-programmable gate arrays (FPGAs), or a combination of at least two of these forms of integrated circuit.

As another example, when a module in the device is implemented in the form of a processing element invoking a program, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor capable of invoking programs. As yet another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SoC).

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the devices and modules described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.

FIG. 3 is a structural block diagram of an electronic device 300 provided by an embodiment of the present application.

As shown in FIG. 3, the electronic device 300 includes a processor 301 and a memory 302, and may further include one or more of an information input/output (I/O) interface 303 and a communication component 304, as well as a communication bus 305.

The processor 301 is configured to control the overall operation of the electronic device 300 so as to complete all or some of the steps of the video conference control method described above. The memory 302 is configured to store various types of data to support operation on the electronic device 300; such data may include, for example, instructions for any application or method operating on the electronic device 300, as well as application-related data. The memory 302 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, for example one or more of static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.

The I/O interface 303 provides an interface between the processor 301 and other interface modules, which may be a keyboard, a mouse, buttons, and the like. The buttons may be virtual or physical. The communication component 304 is used for wired or wireless communication between the electronic device 300 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (NFC), 2G, 3G, or 4G, or a combination of one or more of them; accordingly, the communication component 304 may include a Wi-Fi component, a Bluetooth component, and an NFC component.

The electronic device 300 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for executing the video conference control method given in the above embodiments.

The communication bus 305 may include a path for transferring information between the above components. The communication bus 305 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on.

The electronic device 300 may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (for example, vehicle navigation terminals), and fixed terminals such as digital TVs and desktop computers; it may also be a server or the like.

The present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the video conference control method described above are implemented.

The computer-readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

The terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device.

The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the application is not limited to technical solutions formed by the specific combination of the above technical features; it also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the foregoing inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions, including but not limited to those disclosed in this application.