Disclosure of Invention
In order to distinguish different sounds in a class, the present application provides a method, a device, a computer device and a storage medium for distinguishing different sounds in a class.
In a first aspect, the present application provides a training method for a sound classification model, which adopts the following technical scheme:
a method of training a sound classification model, comprising:
Converting each training sample in a training sample set into a first spectrogram, wherein the first spectrogram is a two-dimensional Mel spectrogram;
and inputting the first spectrogram into the sound classification model, and training the sound classification model by utilizing a VGG11 network structure.
By adopting the above technical scheme: two-dimensional Mel spectrograms are at present used mainly for speech recognition and voiceprint recognition, and the present application applies them to sound classification. Speech recognition refers to recognizing the content of speech, voiceprint recognition refers to recognizing who is speaking, and sound classification refers to distinguishing the specific category of a sound regardless of its content and of who produces it. At present there are few, if any, network structures dedicated to training a sound classification model, so the VGG11 network structure from image recognition is used to train the sound classification model. A large number of experiments show that the two-dimensional Mel spectrogram is very suitable for being combined with the VGG11 network structure as the training sample of the sound classification model. By combining sound processing with image recognition on the basis of the Mel spectrogram and the VGG11 network structure, the sound classification model is trained well, so that an accurate sound classification model can be obtained.
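For illustration only, the following is a minimal sketch of converting one training sample into a two-dimensional Mel spectrogram, assuming the open-source librosa library is used; the sample rate, FFT size, hop length and number of Mel bands are illustrative assumptions and are not values fixed by the present application.

```python
# Hypothetical sketch: convert a sound clip (a WAV file) into a 2-D Mel spectrogram.
import librosa
import numpy as np

def to_mel_spectrogram(wav_path, sr=16000, n_fft=1024, hop_length=256, n_mels=64):
    y, _ = librosa.load(wav_path, sr=sr)                      # load and resample the clip
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=n_fft,
                                         hop_length=hop_length, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)             # log scale for usable dynamic range
    return mel_db                                             # 2-D array of shape (n_mels, time frames)
```

The resulting two-dimensional array can then be fed to an image-recognition network such as VGG11, as sketched later in the detailed description.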
In a second aspect, the present application provides a method for distinguishing different sounds in a class, which adopts the following technical scheme:
a method of distinguishing between different sounds in a class, comprising:
Collecting classroom sound, and inputting the classroom sound into a trained voiceprint model to obtain voiceprint vectors of a plurality of sections of sound fragments;
Judging whether a sound fragment corresponding to the voiceprint vector is non-teacher sound or not according to the voiceprint vector;
If yes, extracting features of the sound fragments corresponding to the voiceprint vectors to obtain a second spectrogram, inputting the second spectrogram into a trained sound classification model, and classifying the sound fragments corresponding to the voiceprint vectors according to the voiceprint vectors, wherein the second spectrogram is a two-dimensional Mel spectrogram;
The training method of the sound classification model comprises the following steps:
Converting each training sample in a training sample set into a first spectrogram, wherein the first spectrogram is a two-dimensional Mel spectrogram;
And inputting the first spectrogram into the sound classification model, and training the sound classification model by utilizing a VGG11 network structure.
By adopting the above technical scheme: in a classroom scene there is generally only one teacher, so the teacher sound is simple, whereas there are many students and their sounds are complex; the teacher usually moves within the range of the lecture platform, while the students are concentrated in the middle and rear of the classroom; the sound of the teacher lecturing generally runs through the class, and the occasions on which students and the teacher speak at the same time are few, so the collected classroom sound can be well separated into teacher sound and non-teacher sound. Therefore, it is first identified whether a sound clip is a non-teacher sound; teacher sound is not classified further, and only non-teacher sound is further classified, so as to distinguish the sound of a single student, the sound of several students reading aloud together, discussion sound, noise and the like.
Preferably, the training method of the voiceprint model comprises the following steps:
Acquiring an open-source sound data set, preparing pre-acquired classroom sound into a classroom sound data set, and taking the open-source sound data set and the classroom sound data set together as a sample set;
And inputting the samples in the sample set into the voiceprint model, and training the voiceprint model by using a deep learning algorithm.
According to the above technical scheme: a conventional voiceprint model is trained only on an open-source sound data set, and most open-source sound data are collected from near-field recordings and from the sound of videos on video websites, whereas most sound in a classroom environment is collected by microphones hanging from the ceiling and is far-field sound, so the collection environment and the use environment belong to different domains and a conventional voiceprint model performs slightly worse when applied to the classroom environment. In addition, the cost of collecting sound data sets is high and the collection standards are not uniform. Therefore, a classroom sound data set made from a large amount of classroom sound is added to the sample set, and the voiceprint model used here is trained on this sample set, so that the voiceprint model is suitable for use in the classroom environment and the accuracy of the voiceprint vectors it outputs is improved.
Preferably, the collecting classroom sound, inputting the classroom sound into a trained voiceprint model to obtain voiceprint vectors of a plurality of segments of sound clips, includes:
Dividing the classroom sound into a plurality of sound clips;
and respectively carrying out voiceprint extraction on the plurality of sections of sound fragments to obtain the voiceprint vector.
Preferably, the dividing the classroom sound into a plurality of sound clips includes:
Dividing the classroom sound into a plurality of segments, wherein a shared part and a non-shared part are arranged between adjacent segments;
respectively calculating the matching degree of voiceprint characteristics of the shared part and the non-shared part of the adjacent segment;
acquiring a switching point based on the voiceprint feature matching degree;
And dividing the classroom sound into a plurality of sound clips according to the switching point.
By adopting the above technical scheme, the switching points are detected based on the voiceprint feature matching degree and the classroom sound is divided into a plurality of sound clips, each of which contains a single type of sound (for example, one clip is teacher sound and another is noise), which facilitates the later classification of each sound clip.
Preferably, the determining, according to the voiceprint vector, whether the sound clip corresponding to the voiceprint vector is a non-teacher sound includes:
And adopting a method combining the BIRCH clustering algorithm and the Calinski-Harabasz index, performing voiceprint clustering based on the voiceprint vector, and judging whether the sound clip corresponding to the voiceprint vector is a non-teacher sound.
By adopting the above technical scheme, the voiceprint vectors are clustered with the BIRCH clustering algorithm, and the Calinski-Harabasz index is used to evaluate the quality of the clustering effect and improve the clustering accuracy, so that the clustering result is more accurate.
Preferably, the adopting a method combining the BIRCH clustering algorithm and the Calinski-Harabasz index, performing voiceprint clustering based on the voiceprint vector, and judging whether the sound clip corresponding to the voiceprint vector is a non-teacher sound includes:
clustering all voiceprint vectors by adopting the BIRCH clustering algorithm, and dividing all voiceprint vectors into a first class and a second class;
performing secondary clustering on all voiceprint vectors in the first class and on all voiceprint vectors in the second class respectively by adopting the BIRCH clustering algorithm;
respectively obtaining a first index and a second index, wherein the first index is the Calinski-Harabasz index after the secondary clustering of all voiceprint vectors in the first class, and the second index is the Calinski-Harabasz index after the secondary clustering of all voiceprint vectors in the second class;
judging whether the first index is larger than the second index;
if yes, judging the sound segment corresponding to the voiceprint vector in the first class as non-teacher sound;
if not, judging the sound segment corresponding to the voiceprint vector in the second class as non-teacher sound.
By adopting the above technical scheme: in a classroom environment there is a certain difference between teacher sound and non-teacher sound, so the BIRCH clustering algorithm is first used to cluster all the voiceprint vectors, and this difference allows them to be clustered into two classes. After the first clustering, however, it is not yet clear which class of voiceprint vectors corresponds to non-teacher sound and which to teacher sound. Therefore a second clustering is performed on all the voiceprint vectors in each of the two classes, and the first index and the second index are obtained. The Calinski-Harabasz index evaluates the clustering effect: the teacher is essentially a single speaker, so the voiceprint vectors of teacher sound differ little from one another, the second clustering of them produces a poor effect and the corresponding Calinski-Harabasz index is small, whereas non-teacher sound is diverse, so the second clustering of its voiceprint vectors produces a better effect and a larger index. Accordingly, the sound clips corresponding to the class whose Calinski-Harabasz index is smaller are judged to be teacher sound, and the sound clips corresponding to the class whose Calinski-Harabasz index is larger are judged to be non-teacher sound.
In a third aspect, the present application provides a device for distinguishing different sounds in a class, which adopts the following technical scheme:
A device for distinguishing different sounds in a class, comprising:
the collection module, which is used for collecting classroom sound and inputting the classroom sound into the trained voiceprint model to obtain voiceprint vectors of a plurality of sound clips;
The judging module is used for judging whether the sound fragment corresponding to the voiceprint vector is non-teacher sound according to the voiceprint vector, if so, transferring to the classifying module, and
The classification module is used for extracting characteristics of sound fragments corresponding to the voiceprint vectors to obtain a second spectrogram, inputting the second spectrogram into a trained sound classification model, and classifying the sound fragments corresponding to the voiceprint vectors according to the voiceprint vectors, wherein the second spectrogram is a two-dimensional Mel spectrogram;
The training method of the sound classification model comprises the following steps:
Converting each training sample in a training sample set into a first spectrogram, wherein the first spectrogram is a two-dimensional Mel spectrogram;
And inputting the first spectrogram into the sound classification model, and training the sound classification model by utilizing a VGG11 network structure.
In a fourth aspect, the present application provides a computer device, which adopts the following technical scheme:
A computer device comprising a memory and a processor, the memory having stored thereon a computer program capable of being loaded by the processor to perform the method of distinguishing different sounds in a class according to the second aspect.
In a fifth aspect, the present application provides a computer readable storage medium, which adopts the following technical scheme:
A computer-readable storage medium storing a computer program capable of being loaded by a processor to execute the method of distinguishing different sounds in a class according to the second aspect.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings.
The embodiment provides a method for distinguishing different sounds in a class, as shown in fig. 1, the main flow of the method is described as follows (steps S101 to S103):
Step S101, collecting classroom sound, and inputting the classroom sound into a trained voiceprint model to obtain voiceprint vectors of a plurality of sound clips.
The training method of the voiceprint model specifically comprises the following steps:
Open-source sound data are acquired from the network; such data are generally collected from near-field recordings and from the sound of videos on video websites, and the acquired data are consolidated into an open-source sound data set. Sound data in the classroom are acquired in real time through audio acquisition equipment installed in the classroom, and the collected classroom sound data are sorted and merged into a classroom sound data set. The open-source sound data set and the classroom sound data set together serve as the sample set.
And inputting the samples in the sample set into a voiceprint model, and training the voiceprint model by using a deep learning algorithm. The audio acquisition device can be a hanging microphone in a classroom.
In this embodiment, when a class starts, the audio collection device collects the classroom sound, inputs the classroom sound into the trained voiceprint model, and divides the classroom sound into a plurality of segments of sound clips by using the voiceprint model, and extracts voiceprints of the segments of sound clips to obtain voiceprint vectors of each segment of sound clip.
Specifically, the classroom sound is divided into a plurality of segments, with a shared part and non-shared parts between adjacent segments; the voiceprint feature matching degrees between the shared part and each non-shared part of adjacent segments are calculated respectively; a switching point is acquired based on the voiceprint feature matching degrees; and the classroom sound is divided into a plurality of sound clips according to the switching points.
The specific method for acquiring a switching point is as follows: the voiceprint feature matching degrees between the shared part of two adjacent segments and each of their non-shared parts are compared; the non-shared part with the higher matching degree to the shared part is judged to belong to the same speaker as the shared part, so the switching point is the boundary between the shared part and the other non-shared part.
The above is illustrated with reference to fig. 2, in which the axis is the time axis. Collection of the classroom sound starts at the beginning of the class, which is marked as 0 seconds on the axis. Every 2 seconds of classroom sound is taken as one segment, 1 second of sound is repeated between adjacent segments, and the repeated sound serves as the shared part between the adjacent segments.
In fig. 2, 4 segments are shown: segment A [0s, 2s], segment B [1s, 3s], segment C [2s, 4s] and segment D [3s, 5s]. Taking segments B and C, which together cover 1s to 4s, the shared part of segments B and C is [2s, 3s], the non-shared part of segment B is [1s, 2s], and the non-shared part of segment C is [3s, 4s].
Voiceprint extraction is performed on the [1s, 2s], [2s, 3s] and [3s, 4s] parts respectively to obtain 3 voiceprint feature vectors. The voiceprint feature matching degree between the voiceprint feature vector of the [2s, 3s] part and that of the [1s, 2s] part is calculated and taken as the first matching degree, and the voiceprint feature matching degree between the voiceprint feature vector of the [2s, 3s] part and that of the [3s, 4s] part is calculated and taken as the second matching degree.
It is then judged whether the first matching degree is larger than the second matching degree; if so, the 3rd second is taken as the switching point, and if not, the 2nd second is taken as the switching point.
It is noted that if the first matching degree is equal to the second matching degree, segment B and segment C are judged to be segments of the same person speaking and there is no switching point between segment B and segment C. In practice, however, the first matching degree is rarely exactly equal to the second matching degree, so with a high probability the classroom sound is considered to contain switching points and is cut into a plurality of sound clips.
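For illustration only, a minimal sketch of the switching-point decision described above, using segments B and C as the example. The matching degree is taken here to be cosine similarity and `extract_voiceprint` stands in for the trained voiceprint model; both are assumptions, since the present application does not fix the matching measure or the model interface.

```python
import numpy as np

def cosine(a, b):
    # cosine similarity used as the voiceprint feature matching degree (assumption)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def find_switch_point(non_shared_left, shared, non_shared_right,
                      shared_start, shared_end, extract_voiceprint):
    """E.g. non_shared_left = [1s,2s] of B, shared = [2s,3s], non_shared_right = [3s,4s] of C."""
    v_shared = extract_voiceprint(shared)
    first_degree = cosine(v_shared, extract_voiceprint(non_shared_left))
    second_degree = cosine(v_shared, extract_voiceprint(non_shared_right))
    if first_degree == second_degree:
        return None                       # same speaker in both segments: no switching point
    # Shared part matches the left side better: the switch happens at the 3rd second;
    # otherwise it happens at the 2nd second.
    return shared_end if first_degree > second_degree else shared_start
```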
It is judged whether the class is over; if so, the method proceeds to the following step S102.
The specific method for judging that the class has ended is as follows: judge whether classroom sound is present; if not, start timing, and judge whether the timed duration exceeds a preset duration; if so, the class is judged to have ended.
The application also provides another possible embodiment for dividing classroom sound into multiple segments of sound, specifically as follows:
The method comprises the steps of dividing the classroom sound into a plurality of segments, determining the adjacent segments in which a sound change occurs, calculating the voiceprint feature matching degrees between the shared part and the non-shared parts of those adjacent segments, acquiring switching points based on the voiceprint feature matching degrees, and dividing the classroom sound into a plurality of sound clips according to the switching points.
The specific method for determining the adjacent segments in which a sound change occurs is as follows. For 4 consecutive segments there are, in order, 3 groups of adjacent segments; the voiceprint feature matching degree between each group of adjacent segments is calculated, with the matching degree of the first group defined as the first voiceprint feature matching degree, that of the second group as the second voiceprint feature matching degree, and that of the third group as the third voiceprint feature matching degree. It is then judged whether the second voiceprint feature matching degree minus the first voiceprint feature matching degree is smaller than a first preset value and the third voiceprint feature matching degree minus the second voiceprint feature matching degree is larger than a second preset value; if so, the second group of adjacent segments is judged to be the adjacent segments in which a sound change occurs. The switching point within these adjacent segments is then obtained in the same way as in the first embodiment for dividing the classroom sound into sound clips, and is not repeated here.
For example, referring to fig. 2, segment A and segment B are the first group of adjacent segments, segment B and segment C are the second group of adjacent segments, and segment C and segment D are the third group of adjacent segments. The first voiceprint feature matching degree is calculated based on the voiceprint feature vectors of segment A and segment B, the second based on those of segment B and segment C, and the third based on those of segment C and segment D; by comparing the first, second and third voiceprint feature matching degrees, segment B and segment C are judged to be the adjacent segments in which a sound change occurs.
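For illustration only, a small sketch of the decision rule in this alternative embodiment, flagging the second group of adjacent segments when its matching degree drops and the following group's matching degree rises again. The two preset values are illustrative placeholders, not values fixed by the present application.

```python
def sound_change_in_second_pair(m1, m2, m3, first_preset=-0.1, second_preset=0.1):
    """m1, m2, m3: matching degrees of the 1st, 2nd and 3rd groups of adjacent segments."""
    # A sound change is judged to occur in the second group (e.g. segments B and C)
    # when the matching degree falls and then recovers.
    return (m2 - m1) < first_preset and (m3 - m2) > second_preset
```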
It is noted that, the classroom sound may be obtained in real time during the course of a classroom, and the obtained classroom sound may be input into the voiceprint model for processing, or the classroom sound of a whole class may be input into the voiceprint model for processing after the end of the classroom.
Step S102, judging whether the sound clip corresponding to the voiceprint vector is a non-teacher sound according to the voiceprint vector; if so, proceeding to step S103.
In this embodiment, a method combining the BIRCH clustering algorithm and the Calinski-Harabasz index is adopted to perform voiceprint clustering on all the voiceprint vectors obtained in step S101 and to judge whether the sound clip corresponding to each voiceprint vector is a non-teacher sound.
The role of the Calinski-Harabasz index is to measure the quality of the voiceprint clustering effect: the larger the Calinski-Harabasz index, the more compact each class is and the more separated the classes are from one another, and the better the clustering result.
Specifically, the BIRCH clustering algorithm is adopted to cluster all voiceprint vectors and divide them into a first class and a second class; the BIRCH clustering algorithm is then adopted to perform secondary clustering on all the voiceprint vectors in the first class and on all the voiceprint vectors in the second class respectively, and a first index and a second index are obtained, wherein the first index is the Calinski-Harabasz index after the secondary clustering of all voiceprint vectors in the first class, and the second index is the Calinski-Harabasz index after the secondary clustering of all voiceprint vectors in the second class.
Judging whether the first index is larger than the second index, if so, judging that the sound segment corresponding to the voiceprint vector in the first class is non-teacher sound, and if not, judging that the sound segment corresponding to the voiceprint vector in the first class is teacher sound, and the sound segment corresponding to the voiceprint vector in the second class is non-teacher sound.
It is noted that since there is a relatively significant difference between teacher sound and non-teacher sound, there is substantially no case where the first index is equal to the second index.
The principle of the BIRCH clustering algorithm is to build a tree structure (a clustering feature tree) based on its parameters; when judging whether two voiceprint vectors belong to the same class, the sample distance threshold parameter used here is the voiceprint feature similarity.
The voiceprint feature similarity is preset, and all voiceprint vectors are clustered into two classes, the first class and the second class, according to this voiceprint feature similarity. At this point it is not yet clear which class of voiceprint vectors corresponds to sound clips that are non-teacher sound. Therefore, secondary clustering is performed on all the voiceprint vectors in the first class and on all the voiceprint vectors in the second class respectively, and the first index and the second index are obtained; the sound clips corresponding to the voiceprint vectors in the class with the larger index are non-teacher sound, and the sound clips corresponding to the voiceprint vectors in the class with the smaller index are teacher sound.
The first index and the second index are computed with the same general formula, specifically:

S = \frac{SSB/(K-1)}{SSW/(N-K)}

where S denotes the Calinski-Harabasz index, K denotes the number of clustering classes (the present application clusters all voiceprint vectors into 2 classes, so K = 2), N denotes the total number of samples (a sample is a voiceprint vector, so the total number of samples is the number of voiceprint vectors), SSB denotes the between-class separation, and SSW denotes the within-class compactness.

The expression of the within-class compactness is:

SSW = \sum_{k=1}^{K} \sum_{c=1}^{|C_k|} \left\| X_{kc} - \bar{X}_k \right\|^2

The expression of the between-class separation is:

SSB = \sum_{k=1}^{K} |C_k| \left\| \bar{X}_k - \bar{X} \right\|^2

where |C_k| denotes the number of samples in the k-th class, X_{kc} denotes the feature vector of the c-th sample of the k-th class, \bar{X}_k denotes the mean feature vector of the k-th class, and \bar{X} denotes the mean feature vector of all samples.
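For illustration only, the index defined above can be computed directly from these expressions; a small NumPy sketch follows, and sklearn.metrics.calinski_harabasz_score computes the same quantity.

```python
import numpy as np

def calinski_harabasz(X, labels):
    """X: (N, dim) voiceprint vectors; labels: class label of each sample."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    K, N = len(classes), len(X)
    overall_mean = X.mean(axis=0)                      # mean feature vector of all samples
    ssb = ssw = 0.0
    for k in classes:
        Xk = X[labels == k]
        class_mean = Xk.mean(axis=0)                   # mean feature vector of class k
        ssb += len(Xk) * np.sum((class_mean - overall_mean) ** 2)   # between-class separation
        ssw += np.sum((Xk - class_mean) ** 2)                       # within-class compactness
    return (ssb / (K - 1)) / (ssw / (N - K))
```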
The application also provides another possible embodiment, in particular as follows:
A plurality of voiceprint feature similarities are preset, and all voiceprint vectors are clustered several times with these similarities to obtain several clustering results. The Calinski-Harabasz index of each clustering result is calculated. The clustering result corresponding to the largest Calinski-Harabasz index is selected as the optimal clustering result, and the two classes in the optimal clustering result are the first class and the second class respectively.
For example, referring to fig. 3, four voiceprint feature similarities are preset based on manual experience, and all the voiceprint vectors are clustered four times with these four similarities. With a voiceprint feature similarity of 0.35, all voiceprint vectors are clustered into 2 classes, class a and class b, and the Calinski-Harabasz index of this clustering is the first clustering index; with a similarity of 0.5, all voiceprint vectors are clustered into class c and class d, and the Calinski-Harabasz index of this clustering is the second clustering index; with a similarity of 0.65, all voiceprint vectors are clustered into class e and class f, and the Calinski-Harabasz index of this clustering is the third clustering index; with a similarity of 0.8, all voiceprint vectors are clustered into class g and class h, and the Calinski-Harabasz index of this clustering is the fourth clustering index. By comparison, the second clustering index is the largest, so a voiceprint feature similarity of 0.5 is the best of the four similarities 0.35, 0.5, 0.65 and 0.8; class c is taken as the first class and class d as the second class.
It should be noted that fig. 3 illustrates only one case of a method for determining non-teacher sound and teacher sound, and the numerical values and the calculation results are merely used for illustration, and do not limit the protection scope of the present application.
Through the above content, a first round of selection is performed on the voiceprint feature similarity. Further, a second round of selection, or even more rounds, may be performed so as to select a better voiceprint feature similarity. For example, if a voiceprint feature similarity of 0.5 is selected in the first round, voiceprint feature similarities of 0.44, 0.47, 0.5, 0.53 and 0.56 are set for the second round; the principle of the second round is the same as that of the first round and is not repeated here. Assuming the result of the second round is a voiceprint feature similarity of 0.53, the clustering result obtained by voiceprint clustering with a similarity of 0.53 is accordingly taken as the optimal clustering result, and the two classes in the optimal clustering result are the first class and the second class respectively.
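For illustration only, a minimal sketch of this multi-round selection, again assuming scikit-learn's Birch (whose threshold is a distance radius standing in for the voiceprint feature similarity) and using the candidate values from the example above; a second, finer round simply calls the same function with a narrower candidate list.

```python
from sklearn.cluster import Birch
from sklearn.metrics import calinski_harabasz_score

def best_clustering(voiceprints, candidates=(0.35, 0.5, 0.65, 0.8)):
    """Cluster with each candidate threshold and keep the result with the largest index."""
    scored = []
    for t in candidates:
        labels = Birch(n_clusters=2, threshold=t).fit_predict(voiceprints)
        scored.append((calinski_harabasz_score(voiceprints, labels), t, labels))
    return max(scored)    # (largest clustering index, best threshold, its labels)

# Second round around the first-round winner, e.g. 0.5:
# best_clustering(voiceprints, candidates=(0.44, 0.47, 0.5, 0.53, 0.56))
```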
Secondary clustering is then performed on all the voiceprint vectors in the first class and on all the voiceprint vectors in the second class respectively, and it is judged which class is non-teacher sound and which is teacher sound; the judging method follows the same principle as judging non-teacher sound and teacher sound by the first index and the second index described above, and is not repeated here.
Step S103, extracting features of the sound fragments corresponding to the voiceprint vectors to obtain a second spectrogram, inputting the second spectrogram into a trained sound classification model, and classifying the sound fragments corresponding to the voiceprint vectors according to the voiceprint vectors, wherein the second spectrogram is a two-dimensional Mel spectrogram.
The training method of the sound classification model specifically comprises the following steps:
Each training sample in the training sample set is converted into a first spectrogram, wherein the first spectrogram is a two-dimensional Mel spectrogram, the first spectrogram is input into a sound classification model, and the sound classification model is trained by utilizing a VGG11 network structure.
The method for acquiring the training sample set is as follows: sound in the classroom is collected by the audio acquisition equipment installed in the classroom, and the classroom sound is extracted into a plurality of sound segments, each of which serves as a training sample. Each training sample is labeled manually, and the label of each training sample is one of the sound of a single student, the sound of several students reading aloud together, discussion sound, noise, and the like. The final number of labeled samples in this embodiment is 69000, of which 35000 are the sound of a single student, 9000 are the sound of several students reading aloud together, 12000 are discussion sound, and 13000 are noise.
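For illustration only, a hedged sketch of the training step: a torchvision VGG11 whose final classifier layer is replaced so that it outputs the four labels above, with each Mel spectrogram fed as a single-channel image repeated to three channels. A recent torchvision version, the optimizer, the learning rate and the number of epochs are all assumptions, not requirements of the present application.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg11

def build_sound_classifier(num_classes=4):
    model = vgg11(weights=None)                         # VGG11 trained from scratch
    model.classifier[6] = nn.Linear(4096, num_classes)  # replace the 1000-class head
    return model

def train(model, loader, epochs=10, lr=1e-4, device="cpu"):
    """loader yields (mel, label); mel is a (batch, 1, n_mels, frames) float tensor."""
    model.to(device).train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for mel, label in loader:
            x = mel.repeat(1, 3, 1, 1).to(device)       # 1 channel -> 3 channels for VGG
            loss = loss_fn(model(x), label.to(device))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```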
In this embodiment, feature extraction is performed on each sound clip judged to be non-teacher sound to obtain a second spectrogram; the second spectrogram is input into the trained sound classification model, which classifies the sound according to the second spectrogram and outputs the classification result, the classification result being one of the sound of a single student, the sound of several students reading aloud together, discussion sound and noise.
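For illustration only, a minimal inference sketch tying the steps together for one sound clip already judged to be non-teacher sound; `to_mel_spectrogram` and the model are the sketches above, and the label order is an assumption that must match the labels used during training.

```python
import torch

LABELS = ["single student", "students reading together", "discussion", "noise"]

def classify_clip(wav_path, model, to_mel_spectrogram, device="cpu"):
    mel = torch.tensor(to_mel_spectrogram(wav_path)).float()        # second spectrogram
    x = mel.unsqueeze(0).unsqueeze(0).repeat(1, 3, 1, 1).to(device)  # (1, 3, n_mels, frames)
    model.eval()
    with torch.no_grad():
        pred = model(x).argmax(dim=1).item()
    return LABELS[pred]
```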
In summary, the application utilizes the voice data of the whole class and combines the voiceprint model and the voice classification model to automatically distinguish different voices in the class in a zero-interaction manner, thereby more efficiently and accurately carrying out teaching analysis of the class.
In order to better implement the method, the embodiment of the application also provides a device for distinguishing different sounds in a class, which can be particularly integrated in computer equipment, such as a terminal or a server, and the terminal can include, but is not limited to, mobile phones, tablet computers or desktop computers.
Fig. 4 is a block diagram of a device for distinguishing different sounds in a class according to an embodiment of the present application, and as shown in fig. 4, the device mainly includes:
the collection module 201, configured to collect classroom sound and input the classroom sound into the trained voiceprint model to obtain voiceprint vectors of a plurality of sound clips;
a judging module 202, configured to judge, according to the voiceprint vector, whether the sound clip corresponding to the voiceprint vector is a non-teacher sound, and if so, transfer to the classification module; and
The classification module 203 is configured to perform feature extraction on a sound segment corresponding to the voiceprint vector to obtain a second spectrogram, input the second spectrogram into a trained sound classification model, and classify the sound segment corresponding to the voiceprint vector according to the voiceprint vector;
the training method of the sound classification model comprises the following steps:
Converting each training sample in the training sample set into a first spectrogram, wherein the first spectrogram is a two-dimensional Mel spectrogram;
and inputting the first spectrogram into the sound classification model, and training the sound classification model by utilizing the VGG11 network structure.
The various modifications and specific examples of the method provided in the foregoing embodiments are also applicable to the device for distinguishing different sounds in a class of this embodiment. From the foregoing detailed description of the method for distinguishing different sounds in a class, those skilled in the art can clearly know the implementation of the device of this embodiment, so for brevity it is not described in detail here.
To better execute the program of the above method, the embodiment of the present application further provides a computer device, as shown in fig. 5, where the computer device 300 includes a memory 301 and a processor 302.
The computer device 300 may be implemented in a variety of forms including a cell phone, tablet computer, palmtop computer, notebook computer, desktop computer, and the like.
Wherein the memory 301 may be used to store instructions, programs, code sets, or instruction sets. The memory 301 may include a stored program area that may store instructions for implementing an operating system, instructions for at least one function (such as determining whether a sound clip corresponding to a voiceprint vector is a non-teacher sound, classifying a sound clip corresponding to a voiceprint vector, and the like), instructions for implementing a method of distinguishing different sounds in a class provided in the above-described embodiment, and a stored data area that may store data involved in the method of distinguishing different sounds in a class provided in the above-described embodiment, and the like.
Processor 302 may include one or more processing cores. The processor 302 performs the various functions of the present application and processes data by running or executing the instructions, programs, code sets or instruction sets stored in the memory 301 and invoking the data stored in the memory 301. The processor 302 may be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller and a microprocessor. It will be appreciated that, for different devices, the electronics implementing the functions of the processor 302 described above may be different, and the embodiments of the present application are not particularly limited in this respect.
The embodiment of the present application provides a computer-readable storage medium, which includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk. The computer-readable storage medium stores a computer program that can be loaded by a processor to execute the method of distinguishing different sounds in a class of the above-described embodiments.
The present application is not limited to the specific embodiments described above. Having read the present specification, those skilled in the art may make modifications to the embodiments as necessary without creative contribution, and such modifications are protected by patent law as long as they fall within the scope of the claims of the present application.