CN110751958A - A Noise Reduction Method Based on RCED Network - Google Patents

A Noise Reduction Method Based on RCED Network

Info

Publication number
CN110751958A
CN110751958A (application CN201910913616.0A)
Authority
CN
China
Prior art keywords
rced
layer
decoder
noise reduction
splicing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910913616.0A
Other languages
Chinese (zh)
Inventor
蓝天
吕忆蓝
李森
刘峤
惠国强
钱宇欣
叶文政
彭川
李萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201910913616.0A (patent CN110751958A/en)
Publication of CN110751958A (patent CN110751958A/en)
Legal status: Pending

Abstract

Translated from Chinese

The invention discloses a noise reduction method based on an RCED network, comprising the following steps. S1: construct the RCED. S2: splice the target enhancement frame with several frames on each side of it, then perform convolution through the RCED. S3: concatenate each encoder output in the RCED with the corresponding decoder output, then feed the result into the next convolutional layer for subsequent operations. S4: introduce a shortcut mechanism that combines all the encoders and all the decoders into Dense Blocks, adding short-circuit paths between layers. The invention uses an RCED that contains only convolutional layers, discarding the pooling layers and their corresponding upsampling layers, and introduces different shortcut mechanisms on top of it. The resulting model performs well and generalizes; it reuses information, so more useful features can be extracted from less data; and it is easy to train, alleviates vanishing gradients, reduces parameters, and overcomes overfitting on small datasets.

Figure 201910913616

Description

Translated from Chinese
A Noise Reduction Method Based on RCED Network

Technical Field

The invention belongs to the technical field of speech noise reduction, and in particular relates to a noise reduction method based on an RCED network.

Background

In real life, noise is everywhere and completely clean speech essentially does not exist, so speech enhancement is used to improve the intelligibility and quality of noisy speech; it is now widely applied in speech recognition, hearing aids, and other fields. Common speech enhancement techniques fall into two categories: traditional methods and deep-learning-based methods. Traditional methods mainly include spectral subtraction, Wiener filtering, statistical-model-based methods, and subspace-based methods. They rest on the assumption that the noise is stationary and therefore cannot handle non-stationary noise.

Summary of the Invention

The invention provides a noise reduction method based on an RCED network, aiming to solve the problems described above.

The invention is realized as follows: a noise reduction method based on an RCED network, comprising the following steps:

S1: construct the RCED;

S2: splice the target enhancement frame with several frames on each side of it, then perform convolution through the RCED;

S3: introduce a shortcut mechanism, concatenating each encoder output in the RCED with the corresponding decoder output and feeding the result into the next convolutional layer for subsequent operations;

S4: introduce a shortcut mechanism that combines all the encoders and all the decoders into a Dense Block respectively, adding short-circuit paths between layers.

Further, the RCED comprises a plurality of identical modules A, each containing a convolutional layer, a batch normalization layer, and a ReLU activation layer.

Further, the RCED also comprises a module B at its end, which contains only a convolutional layer and outputs the enhanced frame.

Further, the number of frames spliced on each side of the target frame is 7.

Further, in step S3, the applied formula is:

x_{decoder+1} = f(x_decoder, x_encoder)

where the comma denotes concatenation along the depth dimension;

f(·) is the composition of convolution, Batch Normalization, and ReLU;

x_encoder and x_decoder are the symmetric layers in the encoder and decoder, respectively; after these two layers are concatenated, the result is input into layer (decoder+1).

Further, in step S4, the applied formula is:

x_l = f(x_0, x_1, ..., x_t, ..., x_{l-1})

where the comma denotes concatenation along the depth dimension;

f(·) is the composition of convolution, Batch Normalization, and ReLU;

x_t (t = 0, 1, ..., l-1) is the t-th layer in the Dense Block.

Compared with the prior art, the beneficial effects of the invention are: the invention uses an RCED that contains only convolutional layers, discarding the pooling layers and their corresponding upsampling layers, and introduces different shortcut mechanisms on top of it. The resulting model performs well and generalizes; it reuses information, so more useful features can be extracted from less data; and it is easy to train, alleviates vanishing gradients, reduces parameters, and overcomes overfitting on small datasets.

Brief Description of the Drawings

Fig. 1 is the first drawing of an embodiment of the present invention;

Fig. 2 is the second drawing of an embodiment of the present invention;

Fig. 3 is the third drawing of an embodiment of the present invention;

Fig. 4 is the fourth drawing of an embodiment of the present invention.

Detailed Description

To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention, not to limit it.

Referring to Figs. 1-2, the invention provides a technical solution: a noise reduction method based on an RCED network, comprising the following steps:

S1: construct the RCED;

S2: splice the target enhancement frame with several frames on each side of it, then perform convolution through the RCED;

S3: concatenate each encoder output in the RCED with the corresponding decoder output, then feed the result into the next convolutional layer for subsequent operations;

S4: introduce a shortcut mechanism that combines all the encoders and all the decoders into a Dense Block respectively, adding short-circuit paths between layers.

In this embodiment, for the SC-FCN, the invention uses the RCED as the base model and improves upon it.

The RCED is shown in Fig. 1: it consists of multiple identical modules.

Except for the last module, each module contains a convolutional layer, a batch normalization layer, and a ReLU activation layer.

The last module contains only a convolutional layer and outputs the enhanced frame. Since the input of a convolutional layer is a feature map, the RCED splices the target enhancement frame with several frames on each side of it before convolution, which also makes effective use of context information.

Experiments found, however, that while splicing a small number of context frames yields a good enhancement effect, the noise reduction effect drops noticeably as the number of spliced frames grows.

This indicates that adding context frames also introduces some redundant information, which interferes with the training of the neural network.

Since PESQ takes values in [-0.5, 4.5] and STOI takes values in [0, 1], models extended with different numbers of frames are scored by formula (1):

score = pesq/5 + stoi    (1)
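As a quick illustration, formula (1) can be written directly in code. This is only a sketch of the scoring rule itself; computing actual PESQ and STOI values requires external tools that are not shown here.

```python
# Sketch of the scoring rule in formula (1): score = pesq/5 + stoi.
# PESQ lies in [-0.5, 4.5] and STOI in [0, 1], so dividing PESQ by 5
# brings the two metrics onto comparable scales before summing.
def score(pesq: float, stoi: float) -> float:
    assert -0.5 <= pesq <= 4.5, "PESQ out of range"
    assert 0.0 <= stoi <= 1.0, "STOI out of range"
    return pesq / 5 + stoi

print(score(4.5, 1.0))  # best possible score: 1.9
```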

Referring to Fig. 3, where the vertical axis represents SNR and the horizontal axis represents the score: the performance of the RCED peaks when the number of spliced frames is around 7. Considering both the performance and the computational efficiency of the model, the invention splices 7 frames on each side of the target frame as the base model.
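A minimal sketch of this frame-splicing step, assuming the features form a (n_frames, n_bins) magnitude matrix; the function name and its edge-padding behavior at the boundaries are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def splice(spectrogram: np.ndarray, context: int = 7) -> np.ndarray:
    """Splice each target frame with `context` frames on each side."""
    # Edge-pad in time so the first/last frames also get full context
    # (an assumption; the patent does not say how boundaries are handled).
    padded = np.pad(spectrogram, ((context, context), (0, 0)), mode="edge")
    windows = [padded[t:t + 2 * context + 1] for t in range(len(spectrogram))]
    return np.stack(windows)  # (n_frames, 2*context + 1, n_bins)

x = np.random.rand(100, 129)          # 100 frames of 129 frequency bins
print(splice(x).shape)                # (100, 15, 129): 7 + 1 + 7 frames
```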

As shown in Fig. 1, each encoder output in the RCED is concatenated with the corresponding decoder output and then fed into the next convolutional layer for subsequent operations. Since the RCED contains no pooling layers, no cropping is needed; the feature maps can be concatenated directly. This mechanism reuses information, so more useful features can be extracted from less data. In addition, providing short-circuit paths also makes training easier.
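The direct concatenation described above can be sketched with plain arrays; the shapes are hypothetical, chosen only to show that no cropping is required when encoder and decoder feature maps share the same spatial size.

```python
import numpy as np

# Symmetric encoder/decoder feature maps: (batch, channels, freq_bins).
# These shapes are illustrative, not the patent's actual layer sizes.
encoder_out = np.random.rand(1, 16, 129)
decoder_out = np.random.rand(1, 16, 129)

# Concatenate along the depth (channel) axis; no cropping is needed
# because the RCED has no pooling, so spatial sizes already match.
skip_input = np.concatenate([decoder_out, encoder_out], axis=1)
print(skip_input.shape)  # (1, 32, 129), fed to the next convolutional layer
```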

The Dense Block from DenseNet is introduced into the DC-FCN; the Dense Block structure is shown in the solid-line box of Fig. 2. In general, a network with L convolutional layers contains L connections, but a Dense Block with L convolutional layers contains L*(L+1)/2 connections. That is, the input of each convolutional layer is the concatenation of the outputs of all preceding layers, and its own output is likewise concatenated into the inputs of all subsequent layers. In the proposed DC-FCN, all the encoders and all the decoders are each combined into a Dense Block.
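The connection counts quoted above follow from the fact that the l-th layer of a Dense Block receives all l earlier outputs; a two-line check:

```python
# A plain stack of L conv layers has L connections; a Dense Block over
# the same L layers has 1 + 2 + ... + L = L*(L+1)/2, because each layer
# receives the concatenated outputs of everything before it.
def dense_connections(L: int) -> int:
    return L * (L + 1) // 2

for L in (1, 4, 10):
    print(L, dense_connections(L))  # 1 -> 1, 4 -> 10, 10 -> 55
```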

Specifically, the SC-FCN formula is:

x_{decoder+1} = f(x_decoder, x_encoder), where the comma denotes concatenation along the depth dimension;

f(·) is the composition of convolution, Batch Normalization, and ReLU;

x_encoder and x_decoder are the symmetric layers in the encoder and decoder, respectively. After these two layers are concatenated, the result is input into layer (decoder+1).

Here, the encoder is the left half of the model and the decoder is the right half.

The DC-FCN formula is:

x_l = f(x_0, x_1, ..., x_t, ..., x_{l-1}), where the comma denotes concatenation along the depth dimension;

f(·) is the composition of convolution, Batch Normalization, and ReLU;

x_t (t = 0, 1, ..., l-1) is the t-th layer in the Dense Block.

Test Example

Referring to Fig. 4, the experiments use TIMIT as the clean speech dataset and Nonspeech and NOISEX-92 as the noise datasets.

In each training round, every utterance in the clean speech training set is taken in turn and mixed at 0 dB with a noise randomly selected from the Nonspeech training set. The test set is produced with the same mixing method.

To evaluate the performance and the generalization ability of the model, the invention runs tests with known noise and unknown noise at -10, -5, 0, 5, and 10 dB.

The noise used during training is taken as the known noise, and babble, f16, and factory2 from NOISEX-92 as the unknown noise.

The clean speech in the test set is mixed with the known noise (seen) and the unknown noise from NOISEX-92 (unseen) at -10, -5, 0, 5, and 10 dB, and the resulting noisy speech is tested.

The speech signals used by the model are downsampled to 8 kHz in advance.

The invention computes the magnitude vectors with an STFT using a Hamming window of 256 samples and a frame shift of 128 samples. Since the 256-point STFT magnitude vector is symmetric, only half of it is used, 129 points in total.
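A sketch of this feature extraction under the stated settings (8 kHz input, 256-sample Hamming window, 128-sample frame shift); `np.fft.rfft` keeps exactly the non-redundant half of the symmetric 256-point spectrum, i.e. 129 bins. The helper function itself is an illustrative assumption.

```python
import numpy as np

def stft_magnitude(signal: np.ndarray, win: int = 256, hop: int = 128) -> np.ndarray:
    """Magnitude STFT with a Hamming window; returns (n_frames, win//2 + 1)."""
    window = np.hamming(win)
    n_frames = 1 + (len(signal) - win) // hop
    frames = np.stack([signal[i * hop:i * hop + win] * window
                       for i in range(n_frames)])
    # rfft of a 256-point real frame yields 129 non-redundant bins.
    return np.abs(np.fft.rfft(frames, axis=-1))

x = np.random.randn(8000)            # one second of speech at 8 kHz
print(stft_magnitude(x).shape)       # (61, 129)
```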

The RCED network used contains 10 convolutional layers. Each of the first 9 layers performs convolution, ReLU activation, and batch normalization; the last layer performs only convolution and produces the enhanced frame.

The numbers of convolution kernels are 12-16-20-24-32-24-20-16-12-1. Each layer uses 1D convolution; the filter length equals the number of input frames, and the filter widths are 13-11-9-7-7-7-9-11-13-1, respectively.
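The kernel counts and filter widths above can be lined up to show the symmetric encoder-decoder layout; the loop below only restates the configuration quoted in the text, and the layer labels are illustrative.

```python
# RCED configuration from the text: 10 conv layers, symmetric around the
# 32-kernel bottleneck; the first 9 add ReLU and batch normalization.
kernels = [12, 16, 20, 24, 32, 24, 20, 16, 12, 1]
widths = [13, 11, 9, 7, 7, 7, 9, 11, 13, 1]

for i, (k, w) in enumerate(zip(kernels, widths), start=1):
    tail = "conv only (enhanced frame)" if i == 10 else "conv + ReLU + BN"
    print(f"layer {i:2d}: {k:2d} kernels, filter width {w:2d}, {tail}")
```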

The model is optimized with the Adam optimizer, with the initial learning rate set to 0.00005. If the system's performance does not improve for 5 consecutive epochs, the learning rate is reset to 0.00001. STOI, PESQ, and SSNR are used as evaluation metrics, as shown in the table below:

TABLE I: THE PESQ, STOI AND SSNR COMPARISON OF DIFFERENT MODELS AT -5, 0, 5 AND 10 DB

Figure RE-GDA0002312906010000061

The invention uses an RCED that contains only convolutional layers, discarding the pooling layers and their corresponding upsampling layers, and introduces different shortcut mechanisms on top of it. The resulting model performs well and generalizes; it reuses information, so more useful features can be extracted from less data; and it is easy to train, alleviates vanishing gradients, reduces parameters, and overcomes overfitting on small datasets.

The above are only preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall be included in the scope of protection of the invention.

Claims (6)

1. A noise reduction method based on an RCED network is characterized by comprising the following steps:
s1: constructing an RCED;
s2: splicing the target enhancement frame and partial frames at two sides of the target enhancement frame, and then performing convolution operation through RCED;
s3: introducing a shortcut mechanism, splicing the output of an encoder in the RCED with the output of a corresponding decoder, and then inputting the spliced output into the next convolutional layer to execute subsequent operation;
s4: all encoders and all decoders are combined into one Dense Block, respectively, and short-circuit paths are added between layers.
2. The noise reduction method according to claim 1, characterized in that: the RCED includes a plurality of identical modules A that contain a convolutional layer, a block normalization layer, and a ReLU activation layer.
3. The noise reduction method according to claim 2, characterized in that: the RCED also includes a module B at the end, which contains the convolutional layer and outputs an enhancement frame.
4. The noise reduction method according to claim 1, characterized in that: the number of the splicing frames before and after the RCED is 7.
5. The noise reduction method according to claim 1, wherein in step S3, the formula is applied as:
x_{decoder+1} = f(x_decoder, x_encoder)
concatenation along the depth dimension is denoted by the comma;
where f(·) is the set of convolution, Batch Normalization, and ReLU;
x_encoder and x_decoder are the symmetric layers in the encoder and the decoder, respectively, and are input into layer (decoder+1) after the two layers are concatenated.
6. The method according to claim 1, wherein in step S4, the formula is applied as follows:
x_l = f(x_0, x_1, ..., x_t, ..., x_{l-1})
concatenation along the depth dimension is denoted by the comma;
f(·) is the set of convolution, Batch Normalization, and ReLU;
x_t (t = 0, 1, ..., l-1) is the t-th layer in a dense block.
CN201910913616.0A | 2019-09-25 (priority) | 2019-09-25 (filed) | A Noise Reduction Method Based on RCED Network | Pending | CN110751958A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910913616.0A | 2019-09-25 | 2019-09-25 | CN110751958A (en): A Noise Reduction Method Based on RCED Network

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201910913616.0A | 2019-09-25 | 2019-09-25 | CN110751958A (en): A Noise Reduction Method Based on RCED Network

Publications (1)

Publication Number | Publication Date
CN110751958A (en) | 2020-02-04

Family

ID=69277105

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status
CN201910913616.0A | CN110751958A (en): A Noise Reduction Method Based on RCED Network | 2019-09-25 | 2019-09-25 | Pending

Country Status (1)

Country | Link
CN (1) | CN110751958A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112017986A (en) * | 2020-10-21 | 2020-12-01 | 季华实验室 | Semiconductor product defect detection method and device, electronic equipment and storage medium

Citations (15)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20170092265A1 (en) * | 2015-09-24 | 2017-03-30 | Google Inc. | Multichannel raw-waveform neural networks
CN106940998A (en) * | 2015-12-31 | 2017-07-11 | 阿里巴巴集团控股有限公司 | A kind of execution method and device of setting operation
CN107452389A (en) * | 2017-07-20 | 2017-12-08 | 大象声科(深圳)科技有限公司 | A kind of general monophonic real-time noise-reducing method
CN107633842A (en) * | 2017-06-12 | 2018-01-26 | 平安科技(深圳)有限公司 | Audio recognition method, device, computer equipment and storage medium
CN108460764A (en) * | 2018-03-31 | 2018-08-28 | 华南理工大学 | The ultrasonoscopy intelligent scissor method enhanced based on automatic context and data
CN108806708A (en) * | 2018-06-13 | 2018-11-13 | 中国电子科技集团公司第三研究所 | Voice de-noising method based on Computational auditory scene analysis and generation confrontation network model
CN109086656A (en) * | 2018-06-06 | 2018-12-25 | 平安科技(深圳)有限公司 | Airport foreign matter detecting method, device, computer equipment and storage medium
CN109166126A (en) * | 2018-08-13 | 2019-01-08 | 苏州比格威医疗科技有限公司 | A method of paint crackle is divided on ICGA image based on condition production confrontation network
US20190066713A1 (en) * | 2016-06-14 | 2019-02-28 | The Trustees Of Columbia University In The City Of New York | Systems and methods for speech separation and neural decoding of attentional selection in multi-speaker environments
CN109637520A (en) * | 2018-10-16 | 2019-04-16 | 平安科技(深圳)有限公司 | Sensitive content recognition methods, device, terminal and medium based on speech analysis
CN109686426A (en) * | 2018-12-29 | 2019-04-26 | 上海商汤智能科技有限公司 | Medical imaging processing method and processing device, electronic equipment and storage medium
CN109753866A (en) * | 2017-11-03 | 2019-05-14 | 西门子保健有限责任公司 | Object Detection in Medical Images with Dense Feature Pyramid Network Architecture in Machine Learning
CN109841226A (en) * | 2018-08-31 | 2019-06-04 | 大象声科(深圳)科技有限公司 | A kind of single channel real-time noise-reducing method based on convolution recurrent neural network
CN109886971A (en) * | 2019-01-24 | 2019-06-14 | 西安交通大学 | A method and system for image segmentation based on convolutional neural network
CN110246510A (en) * | 2019-06-24 | 2019-09-17 | 电子科技大学 | A kind of end-to-end speech Enhancement Method based on RefineNet

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
C. Y.: "Densely Connected Convolutional Networks for Speech Recognition", Speech Communication; 13th ITG-Symposium, Oldenburg, Germany, 2018. *
DU, XINGJIAN: "End-to-End Model for Speech Enhancement by Consistent Spectrogram Masking", arXiv preprint arXiv:1901.00295 (2019). *
PARK, SE RIM: "A fully convolutional neural network for speech enhancement", arXiv (2016). *
T. GRZYWALSKI: "Application of recurrent U-net architecture to speech enhancement", 2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA). *
SUN, QUAN (孙全): "Image inpainting based on generative adversarial networks", Computer Science (《计算机科学》). *
SHI, WENHUA (时文华): "Single-channel speech enhancement using a deep fully convolutional encoder-decoder network", Journal of Signal Processing (《信号处理》). *

Similar Documents

Publication | Title
Schröter et al. | DeepFilterNet2: Towards real-time speech enhancement on embedded devices for full-band audio
CN109841226B (en) | Single-channel real-time noise reduction method based on convolution recurrent neural network
Li et al. | Speech enhancement using progressive learning-based convolutional recurrent neural network
EP3123466B1 (en) | Mixed speech signals recognition
CN111081268A | A Phase-Correlated Shared Deep Convolutional Neural Network Speech Enhancement Method
Liu et al. | Speech enhancement method based on LSTM neural network for speech recognition
Cord-Landwehr et al. | Monaural source separation: From anechoic to reverberant environments
Zezario et al. | Self-supervised denoising autoencoder with linear regression decoder for speech enhancement
Mun et al. | The sound of my voice: Speaker representation loss for target voice separation
CN111081266A | Training generation countermeasure network, and voice enhancement method and system
CN117219109A | Double-branch voice enhancement algorithm based on structured state space sequence model
Tu et al. | DNN training based on classic gain function for single-channel speech enhancement and recognition
Chao et al. | Cross-domain single-channel speech enhancement model with bi-projection fusion module for noise-robust ASR
Chai et al. | Gaussian density guided deep neural network for single-channel speech enhancement
CN116343807 | An improved speech enhancement method, system, medium, device and terminal
Shetu et al. | GAN-based speech enhancement for low SNR using latent feature conditioning
CN114155868B | Speech enhancement method, device, equipment and storage medium
CN110751958A (en) | A Noise Reduction Method Based on RCED Network
Hwang et al. | Monaural Speech Enhancement Using a Nested U-Net with Two-Level Skip Connections
Li et al. | A Convolutional Neural Network with Non-Local Module for Speech Enhancement
Li et al. | Convolutional recurrent neural network based progressive learning for monaural speech enhancement
Wang et al. | Robust speech recognition from ratio masks
Parisae et al. | Progressive learning framework for speech enhancement using multi-scale convolution and s-tcn
CN111383652B | A single-channel speech enhancement method based on double-layer dictionary learning
Rana et al. | A study on speech enhancement using deep temporal convolutional neural network

Legal Events

Code | Title | Description
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
RJ01 | Rejection of invention patent application after publication | Application publication date: 2020-02-04
