



Technical Field
The present invention belongs to the technical field of speech noise reduction, and in particular relates to a noise reduction method based on an RCED network.
Background Art
In real life, noise is everywhere and completely clean speech is rare, so speech enhancement is used to improve the intelligibility and quality of noisy speech; it is now widely applied in speech recognition, hearing aids and other fields. Common speech enhancement techniques fall into two categories: traditional methods and deep learning-based methods. Traditional methods mainly include spectral subtraction, Wiener filtering, statistical model-based methods and subspace-based methods. Because these methods rest on the assumption that the noise is stationary, they cannot handle non-stationary noise.
Summary of the Invention
The present invention provides a noise reduction method based on an RCED network, aiming to solve the above-mentioned problems.
The present invention is implemented as a noise reduction method based on an RCED network, comprising the following steps:
S1: construct the RCED network;
S2: splice the target enhancement frame with several frames on each side of it, then perform convolution operations through the RCED;
S3: introduce a shortcut mechanism, splicing each encoder output in the RCED with the corresponding decoder output and feeding the result into the next convolutional layer for subsequent operations;
S4: introduce a shortcut mechanism, combining all encoders and all decoders into respective dense blocks and adding shortcut paths between layers.
Further, the RCED comprises a plurality of identical modules A, each module A containing a convolutional layer, a batch normalization layer and a ReLU activation layer.
Further, the RCED also comprises a module B at its end, which contains only a convolutional layer and outputs the enhanced frame.
Further, the RCED splices 7 frames on each side of the target frame.
Further, in step S3, the applied formula is:
x_{decoder+1} = f(x_{decoder}, x_{encoder})
where joining arguments with a comma denotes concatenation along the depth dimension;
f(·) is the combination of convolution, batch normalization and ReLU;
x_{encoder} and x_{decoder} are the symmetric layers of the encoder and the decoder, respectively; after these two layers are spliced, the result is fed into layer (decoder+1).
Further, in step S4, the applied formula is:
x_l = f(x_0, x_1, ..., x_{l-1})
where joining arguments with a comma denotes concatenation along the depth dimension;
f(·) is the combination of convolution, batch normalization and ReLU;
x_t (t = 0, 1, ..., l-1) is the t-th layer in the dense block.
Compared with the prior art, the beneficial effects of the present invention are as follows: the present invention uses an RCED that contains only convolutional layers, discarding the pooling layers and their corresponding upsampling layers, and introduces different shortcut mechanisms on top of it. The resulting models perform well and generalize; they reuse information, so that more useful features can be extracted from less data; they are easy to train, mitigate vanishing gradients and reduce the number of parameters, while overcoming overfitting on small datasets.
Description of the Drawings
Fig. 1 is the first drawing of an embodiment of the present invention;
Fig. 2 is the second drawing of an embodiment of the present invention;
Fig. 3 is the third drawing of an embodiment of the present invention;
Fig. 4 is the fourth drawing of an embodiment of the present invention.
Detailed Description of the Embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
Referring to Figs. 1-2, the present invention provides a technical solution, a noise reduction method based on an RCED network, comprising the following steps:
S1: construct the RCED network;
S2: splice the target enhancement frame with several frames on each side of it, then perform convolution operations through the RCED;
S3: splice each encoder output in the RCED with the corresponding decoder output, and feed the result into the next convolutional layer for subsequent operations;
S4: introduce a shortcut mechanism, combining all encoders and all decoders into respective dense blocks and adding shortcut paths between layers.
In this embodiment, for the SC-FCN, the present invention uses the RCED as the base model and improves upon it.
The RCED is shown in Fig. 1: it consists of multiple identical modules.
Except for the last module, each module contains a convolutional layer, a batch normalization layer and a ReLU activation layer.
The last module contains only a convolutional layer and outputs the enhanced frame. Since the input to a convolutional layer is a two-dimensional map, the RCED splices the target enhancement frame with several frames on each side of it before performing convolution, which also makes effective use of context information.
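The following is a minimal sketch of this context-frame splicing, assuming magnitude spectra with 129 frequency bins; the function name splice_frames and the edge-padding behavior are illustrative assumptions, not the patented implementation.

```python
import numpy as np

def splice_frames(spectrogram: np.ndarray, context: int = 7) -> np.ndarray:
    """Stack each target frame with `context` frames on each side.

    Returns an array of shape (num_frames, 2 * context + 1, num_bins),
    padding at the edges by repeating the boundary frames.
    """
    padded = np.pad(spectrogram, ((context, context), (0, 0)), mode="edge")
    windows = [padded[i : i + 2 * context + 1] for i in range(len(spectrogram))]
    return np.stack(windows)

# Example: a 100-frame noisy magnitude spectrogram with 129 frequency bins.
noisy = np.abs(np.random.randn(100, 129)).astype(np.float32)
inputs = splice_frames(noisy, context=7)  # shape (100, 15, 129)
```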
However, experiments found that splicing a small number of context frames yields a good enhancement effect, but as the number of spliced frames increases, the noise reduction effect degrades noticeably.
This indicates that adding context frames also introduces some redundant information, which interferes with the training of the neural network.
Since PESQ ranges over [-0.5, 4.5] and STOI over [0, 1], models spliced with different numbers of frames are scored by equation (1):
score = PESQ / 5 + STOI    (1)
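As a tiny worked example of equation (1), with made-up metric values:

```python
# PESQ in [-0.5, 4.5] is scaled by 1/5 so both terms are on comparable scales.
pesq, stoi = 2.5, 0.85   # illustrative values, not measured results
score = pesq / 5 + stoi  # 0.5 + 0.85 = 1.35
```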
Referring to Fig. 3, where the vertical axis represents SNR and the horizontal axis represents the score, it can be seen that the performance of the RCED peaks when the number of spliced frames is around 7. Considering both model performance and computational efficiency, the present invention splices 7 frames on each side of the target frame as the base model.
As shown in Fig. 1, each encoder output in the RCED is spliced with the corresponding decoder output and then fed into the next convolutional layer for subsequent operations. Since the RCED contains no pooling layers, no cropping is needed; the feature maps can be spliced directly. This mechanism reuses information, so that more useful features can be extracted from less data. In addition, providing shortcut paths also makes training easier.
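A minimal PyTorch sketch of this encoder-decoder splicing, assuming 1D feature maps of shape (batch, channels, width); the class name SkipConcat and the layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SkipConcat(nn.Module):
    """Concatenate an encoder output with the matching decoder output along
    the channel (depth) axis, then apply conv + batch norm + ReLU."""

    def __init__(self, enc_ch: int, dec_ch: int, out_ch: int, width: int = 9):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv1d(enc_ch + dec_ch, out_ch, kernel_size=width, padding=width // 2),
            nn.BatchNorm1d(out_ch),
            nn.ReLU(),
        )

    def forward(self, x_decoder: torch.Tensor, x_encoder: torch.Tensor) -> torch.Tensor:
        # No cropping is needed: with no pooling layers, the encoder and
        # decoder feature maps already share the same width.
        return self.block(torch.cat([x_decoder, x_encoder], dim=1))
```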
The dense block from DenseNet is introduced into the DC-FCN; the dense block structure is shown in the solid-line box of Fig. 2. In general, a network containing L convolutional layers has L connections, whereas a dense block containing L convolutional layers has L*(L+1)/2 connections. That is, the input of each convolutional layer is the concatenation of the outputs of all preceding layers, and its own output is likewise concatenated into the inputs of all subsequent layers. In the proposed DC-FCN, all encoders and all decoders are each combined into a dense block.
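A minimal sketch of such a dense block, assuming every layer produces `growth` channels and all feature maps share the same width (no pooling, as in the RCED); the class name and defaults are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DenseBlock1d(nn.Module):
    """Each layer receives the concatenation of all previous outputs,
    giving L*(L+1)/2 connections for L layers."""

    def __init__(self, in_ch: int, growth: int, num_layers: int, width: int = 7):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(in_ch + t * growth, growth,
                          kernel_size=width, padding=width // 2),
                nn.BatchNorm1d(growth),
                nn.ReLU(),
            )
            for t in range(num_layers)
        ])

    def forward(self, x0: torch.Tensor) -> torch.Tensor:
        features = [x0]
        for layer in self.layers:
            # Splice all previous outputs along the depth (channel) axis.
            features.append(layer(torch.cat(features, dim=1)))
        return features[-1]  # x_l = f(x_0, x_1, ..., x_{l-1})
```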
Specifically, the SC-FCN formula is:
x_{decoder+1} = f(x_{decoder}, x_{encoder}), where joining arguments with a comma denotes concatenation along the depth dimension;
f(·) is the combination of convolution, batch normalization and ReLU;
x_{encoder} and x_{decoder} are the symmetric layers of the encoder and the decoder, respectively; after these two layers are spliced, the result is fed into layer (decoder+1).
Here, the encoder is the left half of the model and the decoder is the right half.
The DC-FCN formula is:
x_l = f(x_0, x_1, ..., x_t, ..., x_{l-1}), where joining arguments with a comma denotes concatenation along the depth dimension;
f(·) is the combination of convolution, batch normalization and ReLU;
x_t (t = 0, 1, ..., l-1) is the t-th layer in the dense block.
Test Example
Referring to Fig. 4, the experiments use TIMIT as the clean speech dataset, and Nonspeech and NOISEX-92 as the noise datasets.
In each training epoch, every utterance in the clean speech training set is taken in turn and mixed at 0 dB with a noise segment randomly selected from the Nonspeech training set. The test set is constructed with the same mixing method.
To evaluate the performance and generalization ability of the model, the present invention tests with known noise and unknown noise at -10, -5, 0, 5 and 10 dB, respectively.
The noise used during training serves as the known noise, and babble, f16 and factory2 from NOISEX-92 serve as the unknown noise.
The clean speech in the test set is mixed with the known noise (seen) and the unknown noise from NOISEX-92 (unseen) at -10, -5, 0, 5 and 10 dB, and the resulting noisy speech is tested.
The speech signals used by the model are downsampled to 8 kHz in advance.
The present invention computes magnitude vectors with an STFT using a 256-sample Hamming window and a frame shift of 128 samples. Since the 256-point STFT magnitude vector is symmetric, only half of it is used, 129 points in total.
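A sketch of this feature extraction, assuming librosa as the STFT tool (the patent does not name a library) and a hypothetical file name:

```python
import librosa
import numpy as np

# 8 kHz audio, 256-point Hamming window, 128-sample frame shift.
audio, sr = librosa.load("noisy_utterance.wav", sr=8000)
stft = librosa.stft(audio, n_fft=256, hop_length=128, window="hamming")
magnitude = np.abs(stft)  # shape (129, num_frames): the symmetric half only
```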
The RCED network used contains 10 convolutional layers. Each of the first 9 layers performs convolution, ReLU activation and batch normalization; the last layer performs only convolution and produces the enhanced frame.
The numbers of convolution kernels are 12-16-20-24-32-24-20-16-12-1. Each layer uses 1D convolution; the length of each filter equals the number of input frames, and the filter widths are 13-11-9-7-7-7-9-11-13-1, respectively.
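A minimal PyTorch sketch of this 10-layer RCED with the stated kernel counts and widths; the skip and dense connections described earlier are omitted here, and the 'same' padding is an assumption.

```python
import torch
import torch.nn as nn

channels = [12, 16, 20, 24, 32, 24, 20, 16, 12, 1]
widths   = [13, 11,  9,  7,  7,  7,  9, 11, 13, 1]

layers = []
in_ch = 15  # the target frame plus 7 spliced frames on each side
for i, (out_ch, w) in enumerate(zip(channels, widths)):
    layers.append(nn.Conv1d(in_ch, out_ch, kernel_size=w, padding=w // 2))
    if i < len(channels) - 1:  # first 9 layers add normalization and activation
        layers += [nn.BatchNorm1d(out_ch), nn.ReLU()]
    in_ch = out_ch
rced = nn.Sequential(*layers)

frames = torch.randn(8, 15, 129)  # (batch, spliced frames, frequency bins)
enhanced = rced(frames)           # (8, 1, 129): one enhanced frame per input
```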
The model is optimized with the Adam optimizer, with the initial learning rate set to 0.00005. If system performance does not improve for 5 consecutive epochs, the learning rate is reset to 0.00001. STOI, PESQ and SSNR are used as evaluation metrics, as shown in the table below:
TABLE I. The PESQ, STOI and SSNR comparison of different models at -5, 0, 5 and 10 dB.
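A sketch of this training configuration in PyTorch, assuming the `rced` model sketched above; a ReduceLROnPlateau scheduler with factor 0.2 reproduces the 5e-5 to 1e-5 drop, though the patent does not specify which scheduler is used.

```python
import torch

optimizer = torch.optim.Adam(rced.parameters(), lr=5e-5)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.2, patience=5, min_lr=1e-5
)
# After each epoch, step on a validation metric such as the score of eq. (1):
# scheduler.step(validation_score)
```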
The present invention uses an RCED that contains only convolutional layers, discarding the pooling layers and their corresponding upsampling layers, and introduces different shortcut mechanisms on top of it. The resulting models perform well and generalize; they reuse information, so that more useful features can be extracted from less data; they are easy to train, mitigate vanishing gradients and reduce the number of parameters, while overcoming overfitting on small datasets.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.