CN116129555A - Intelligent door lock recognition system and method based on voice recognition - Google Patents

Intelligent door lock recognition system and method based on voice recognition

Info

Publication number
CN116129555A
Authority
CN
China
Prior art keywords
voice
door lock
signal
recognition
noise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211382687.0A
Other languages
Chinese (zh)
Inventor
张文平
白维朝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Isurpass Technology Co ltd
Original Assignee
Shenzhen Isurpass Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Isurpass Technology Co., Ltd.
Priority to CN202211382687.0A
Publication of CN116129555A
Legal status: Pending (current)

Abstract

The invention discloses an intelligent door lock recognition system and method based on voice recognition. Voice information sent by the intelligent door lock is preprocessed to obtain a voice signal; physiological features and personalized information characterizing the speaker are extracted from the voice signal; the voice signal is enhanced according to the physiological features and personalized information, and speech enhancement is performed to obtain test data; the test data are input into a deep neural network, which is trained to perform voiceprint recognition on the voice signal and produce a recognition result; the recognition result is matched against a pre-constructed corpus to obtain a control instruction, and the door lock is intelligently controlled according to the control instruction. The intelligent door lock can thus acquire voice information rapidly, suppress noise, authenticate the voice information accurately, and be controlled flexibly, improving the user experience.

Description

Intelligent door lock recognition system and method based on voice recognition
Technical Field
The invention belongs to the technical field of voice recognition, and particularly relates to an intelligent door lock recognition system and method based on voice recognition.
Background
Currently, in the field of biometric recognition, voiceprint recognition technologies such as speaker recognition are favored for their convenience, economy, and accuracy, and are becoming an important security verification mode in people's daily life and work. With the rapid development of computer technology, sensor technology, and biometric recognition technology, biometrics have been widely applied in important settings such as financial payment and identity authentication; biometric verification for access control systems has gained increasing user trust and support, and demand for intelligent access control is growing in high-end residential communities, office buildings, university dormitories, laboratories, and other important settings. There is therefore a need for an intelligent door lock with rapid and efficient speech recognition and verification to solve the above technical problems.
Disclosure of Invention
In view of the above, the present invention provides an intelligent door lock recognition system and method based on voice recognition that offer accurate voice recognition and flexible intelligent control, thereby solving the above technical problems; the invention is specifically implemented by adopting the following technical schemes.
In a first aspect, the present invention provides an intelligent door lock recognition system based on speech recognition, including:
the preprocessing module is used for preprocessing the voice information sent by the intelligent door lock to obtain a voice signal;
the feature extraction module is used for extracting physiological features and personalized information representing a speaker from the voice signal, enhancing the voice signal according to the physiological features and the personalized information, and performing speech enhancement on the voice signal to obtain test data, wherein the physiological features comprise gender and age, and the personalized information comprises pitch and timbre;
the model training module is used for inputting the test data into a deep neural network, which is trained to perform voiceprint recognition on the voice signal and obtain a recognition result, wherein the recognition result comprises user identity and message information;
and the identification judging module is used for matching the recognition result against a pre-constructed corpus to obtain a control instruction, and intelligently controlling the door lock according to the control instruction.
As a further improvement of the above technical solution, the identification judgment module includes an intelligent door lock unit, a middleware unit and an application unit;
the intelligent door lock unit is used for recording the door lock state, receiving control instructions issued by the middleware unit, and uploading door lock information; the middleware unit is used for transmitting the door lock state information uploaded by the door lock, receiving the operations and various index data of the intelligent door lock unit, and transmitting them to the corpus, which analyzes user behavior and monitors the real-time condition of the middleware unit according to the index data; the application unit is used for displaying door lock information, operating the door lock state, and upgrading the switch firmware to realize man-machine interaction.
As a further improvement of the above technical solution, the execution process of the model training module includes:
comparing the test template corresponding to the test data with the training template, identifying by comparing similarity measures between the test template and the training template, and calculating the pronunciation speed of the voice data in the test data by combining a dynamic time warping algorithm;
preset function f [ (m)i ,ni )]Corresponds to a grid (mi ,ni ) Then there is a path cost function d [ (m)i ,ni )]And f [ (m)i ,ni )]=(mi-1 ,ni-1 ) Initializing to ni =i(i=1,2...,N),n1 =m1 =1,f[(m,n)]=(0,0),
Figure BDA0003929153210000021
Wherein R is a parallelogram constraint;
calculating f [ (m) by adopting a recursion methodi ,ni )]And d [ (m)i ,ni )]The expression is obtained as
Figure BDA0003929153210000031
Wherein m=mi ,mi -1,mi -2, the always true version of the speech signal d [ (M, N)]When i=n, the point (M, N) is traced back forward, thereby obtaining the optimum path (Mi-1 ,ni-1 )=f[(mi ,ni )](i=n, N-1,..3, 2), when (mi-1 ,ni-1 ) Ending at = (0, 0).
As a further improvement of the above technical solution, vector-quantized feature data are used to represent the overall feature vectors: the sampled signals are partitioned into classes, each class is represented by one vector, and that vector is quantized. The input feature vectors X = {x_i | i = 1, 2, ..., T} are preset; n denotes the iteration number, L_i(n) the i-th cell of the n-th iteration, and Y_i(n) the codeword of that cell, with J codewords in total, a maximum iteration count M', and an iteration threshold ε. The specific process comprises the following steps:

selecting the centroid of all input feature vectors as the initialization codebook {Y_i(0), 1 ≤ i ≤ J}; splitting each cell in two with the small threshold ε, Y_i^(1) = Y_i(0) − ε and Y_i^(2) = Y_i(0) + ε, wherein Y_i^(1) and Y_i^(2) denote the two codewords obtained from the codeword before splitting, the number of cells being doubled and n = 0;

setting n = n + 1 and assigning each frame's feature vector x to the nearest current codeword, x ∈ C_i(n) if d(x, Y_i(n−1)) ≤ d(x, Y_j(n−1)) for all j ≠ i, 1 ≤ i, j ≤ M', wherein d denotes the Euclidean distance between vectors; each codeword is then updated as the centroid of its cell,

Y_i(n) = (1/|C_i(n)|)·Σ_{x ∈ C_i(n)} x;

when n = M' or the relative change in total distortion satisfies |D(n−1) − D(n)|/D(n) < ε, the iteration for the current codebook size ends: if more codewords are needed, the process returns to the splitting step; otherwise the best codebook {Y_i | i = 1, 2, ..., M'} is saved and output; if the distortion change rate is greater than the threshold, the clustering step at n = n + 1 is repeated.
As a further improvement of the above technical solution, the executing process of the feature extraction module includes:
performing speech enhancement on the voice signal by spectral subtraction, subtracting the noise power spectrum from the power spectrum of the noisy voice signal to obtain a clean speech spectrum;
letting s(t) denote the clean speech signal, n'(t) the additive noise signal, and y(t) the noisy speech signal, so that y(t) = s(t) + n'(t), and letting Y(ω), S(ω) and N'(ω) denote the Fourier transforms of y(t), s(t) and n'(t), respectively, with Y(ω) = S(ω) + N'(ω);
if the signal and the additive noise are independent, |Y(ω)|² = |S(ω)|² + |N'(ω)|²; letting P_y(ω), P_s(ω) and P_n'(ω) denote the power spectra of y(t), s(t) and n'(t), respectively, then P_y(ω) = P_s(ω) + P_n'(ω); the noise power spectrum P_n'(ω) is estimated from the speech-free but noisy segment before the utterance, giving P_s(ω) = P_y(ω) − P_n'(ω), and the subtracted power spectrum P_s(ω) determines the clean speech power spectrum, from which the denoised speech time-domain signal is recovered to obtain test data.
As a further improvement of the above technical solution, recovering the noise-reduced speech time domain signal from the power spectrum to obtain test data, including:
estimating the power spectrum of clean speech, |Ŝ(e^{jω})|², from the power spectrum |X(e^{jω})|² of the noisy speech x₀[n₀], using the expression

|Ŝ(e^{jω})|² = |X(e^{jω})|² − |N̂₀(e^{jω})|²,

wherein x₀[n₀] denotes the discrete sequence of the input signal; if additive noise n₀[n₀] is added to the speech signal s[n₀], then, since the noise is uncorrelated with the signal and non-stationary, with a rate of change smaller than that of the signal, x₀[n₀] = s[n₀] + n₀[n₀], and the Fourier transform yields

|X(e^{jω})|² = |S(e^{jω})|² + |N₀(e^{jω})|²,

wherein |N̂₀(e^{jω})|² is a statistical estimate of |N₀(e^{jω})|² during the speech-free segments, |X(e^{jω})|² denotes the power spectrum of the noisy speech, and |S(e^{jω})|² denotes the power spectrum of the speech signal; in the non-speech segments, the estimate of the noise power spectrum |N̂₀(e^{jω})|² is updated.
As a further improvement of the above technical solution, estimating according to input noise obtained from noisy speech, removing noise from noisy speech by using a spectral subtraction algorithm to obtain an estimated value of a speech signal, re-estimating a transfer function of a wiener filter by using an output signal, and updating background noise in a speech segment and a non-speech segment, including:
calculating an initial smoothed estimate of the background noise amplitude spectrum: presuming that the first N_no frames before speech are pure noise, the noise amplitude spectrum |D̂(e^{jω})| is estimated by statistical averaging of the amplitudes, with the recursion

|D̂_n(e^{jω})| = ((n−1)/n)·|D̂_{n−1}(e^{jω})| + (1/n)·|X_n(e^{jω})|, n = 1, ..., N_no,

wherein |D̂_n(e^{jω})| denotes the n-th statistical estimate of the background noise, with initial value |D̂_0(e^{jω})| = 0, and |X_n(e^{jω})|² denotes the power spectrum of the n-th frame of the noisy speech signal; the initial smoothed estimate of the background noise amplitude spectrum is

|D̂(e^{jω})| = |D̂_{N_no}(e^{jω})|,

wherein |X_{N_no}(e^{jω})| denotes the amplitude spectrum of the N_no-th frame of the noisy speech signal;

letting the frame variable n = N_no + 1, the Wiener filter transfer function is formed as

H_n(e^{jω}) = P̂_s(e^{jω}) / (P̂_s(e^{jω}) + P̂_n(e^{jω})),

wherein P̂_n(e^{jω}) denotes the smoothed estimate of the noise power spectrum and P̂_s(e^{jω}) denotes the estimate of the signal power spectrum;

filtering the amplitude spectrum of the noisy voice signal to obtain the estimate |N̂_n(e^{jω})| of the background noise amplitude spectrum of the current frame, and calculating the amplitude spectrum estimate |Ŝ_n(e^{jω})| of the signal obtained by spectral subtraction; using the current frame's noise amplitude spectrum estimate |N̂_n(e^{jω})|, the smoothed estimate of the background noise is updated by

|D̂_n(e^{jω})| = p·|D̂_{n−1}(e^{jω})| + (1−p)·|N̂_n(e^{jω})|,

wherein p denotes the scale factor; if the rate of change of the speech signal and the rate of change of the noise signal can be separated, p is set reasonably so that |D̂_n(e^{jω})| and |N̂_n(e^{jω})| change slowly while |Ŝ_n(e^{jω})| changes quickly;

computing the smoothed estimate of the signal amplitude spectrum; setting the frame variable n = n + 1: if n > the total frame number N, the procedure ends and the estimate |Ŝ(e^{jω})| is output; otherwise the above steps are repeated.
As a further improvement of the above technical solution, the execution process of the preprocessing module includes:
the sampling data of the preset voice signal is { Q ]K -k=1, 2..n), n representing the total number of samples, let Δt=1, using the expression
Figure BDA00039291532100000519
Figure BDA00039291532100000520
Wherein K is E [1, n]Calculating each undetermined coefficient alphaj (j=0, 1,2 m), let function->
Figure BDA00039291532100000521
And discrete sampled data QK Error quadratic sum E is minimum, then +.>
Figure BDA0003929153210000061
Satisfying the E extremum condition is expressed as
Figure BDA0003929153210000062
Sequentially selecting E to alphai Solving and compiling to generate m+1 element linear equation system
Figure BDA0003929153210000063
Figure BDA0003929153210000064
Calculating m+1 undetermined coefficients alphaj (j=0, 1. M., m represents the order of the set polynomial, the value ranges of i and j represent 0.ltoreq.i, j.ltoreq.m, and when m=0, the trend term represents a constant,
Figure BDA0003929153210000065
when m=0, m represents a constant trend term, namely an arithmetic average value of signal sampling data, and when m=1, a linear trend term is represented to be +.>
Figure BDA0003929153210000066
When m is equal to or greater than 2, the trend term represents a curve trend term.
As a further improvement of the technical scheme, the method for successfully matching the recognition result with the pre-constructed corpus to obtain the control instruction comprises the following steps:
the R-dimensional European space R is obtained by adopting a vectorization algorithmr Medium gain vector
Figure BDA0003929153210000067
Obtaining a limited number of vectors in the r dimension according to a preset criterion
Figure BDA0003929153210000068
To indicate +.>
Figure BDA0003929153210000069
Representing the input vector +.>
Figure BDA00039291532100000610
Representing the quantization vector or codeword,
Figure BDA00039291532100000611
representing a codebook, the number of codewords M being referred to as codebook capacity, the criterion for vector quantization in training data being to minimize the quantized distortion at a given codebook capacity k;
when the speaker voice to be recognized is recognized, a group of characteristic vectors are extracted according to the training time,
Figure BDA00039291532100000612
n codebook pairs using N speakers, respectively +.>
Figure BDA00039291532100000613
Is quantized to find a codebook in the feature space that is close to the set of feature vectors>
Figure BDA00039291532100000614
The corresponding speaker i serves as a recognition result to obtain a recognition result. />
In a second aspect, the present invention also provides an intelligent door lock recognition method based on voice recognition, which includes the following steps:
the voice information sent by the intelligent door lock is obtained, and the voice information is preprocessed to obtain a voice signal;
extracting physiological features and personalized information representing a speaker from the voice signal, enhancing the voice signal according to the physiological features and the personalized information, and performing speech enhancement on the voice signal to obtain test data, wherein the physiological features comprise gender and age, and the personalized information comprises pitch and timbre;
inputting the test data into a deep neural network, which is trained to perform voiceprint recognition on the voice signal and obtain a recognition result, wherein the recognition result comprises user identity and message information;
and matching the recognition result against a pre-constructed corpus to obtain a control instruction, and intelligently controlling the door lock according to the control instruction.
The invention provides an intelligent door lock recognition system and method based on voice recognition. Voice information sent by the intelligent door lock is preprocessed to obtain the voice signal; physiological features and personalized information representing the speaker are extracted from the voice signal; the voice signal is enhanced according to the physiological features and personalized information, and speech enhancement is performed to obtain test data; the test data are input into a deep neural network, which is trained to perform voiceprint recognition on the voice signal and obtain a recognition result; the recognition result is matched against a pre-constructed corpus to obtain a control instruction, and the door lock is intelligently controlled according to the control instruction. The intelligent door lock can thus rapidly acquire voice information and accurately authenticate it while excluding noise, thereby realizing flexible control of the intelligent door lock and improving the user experience.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a block diagram of a voice recognition based intelligent door lock recognition system of the present invention;
fig. 2 is a flowchart of an intelligent door lock recognition method based on voice recognition according to the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
Referring to fig. 1, the invention provides an intelligent door lock recognition system based on voice recognition, which comprises:
the preprocessing module is used for preprocessing the voice information sent by the intelligent door lock to obtain a voice signal;
the feature extraction module is used for extracting physiological features and personalized information representing a speaker from the voice signal, enhancing the voice signal according to the physiological features and the personalized information, and performing speech enhancement on the voice signal to obtain test data, wherein the physiological features comprise gender and age, and the personalized information comprises pitch and timbre;
the model training module is used for inputting the test data into a deep neural network, which is trained to perform voiceprint recognition on the voice signal and obtain a recognition result, wherein the recognition result comprises user identity and message information;
and the identification judging module is used for matching the recognition result against a pre-constructed corpus to obtain a control instruction, and intelligently controlling the door lock according to the control instruction.
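By way of non-limiting illustration, the cooperation of the four modules may be sketched as follows (Python with NumPy); all function names here are hypothetical and not taken from the patent, and the `model` and `corpus` objects are stand-ins:

```python
# Illustrative sketch of the four claimed modules as a processing pipeline.
# All names are hypothetical; `model` and `corpus` are stand-in objects.
from typing import Optional

import numpy as np

def preprocess(raw_audio: np.ndarray) -> np.ndarray:
    """Preprocessing module: remove the constant trend term and normalize."""
    detrended = raw_audio - np.mean(raw_audio)          # m = 0 trend removal
    return detrended / (np.max(np.abs(detrended)) + 1e-12)

def extract_features(signal: np.ndarray, frame: int = 256) -> np.ndarray:
    """Feature extraction module: framed log power spectra as test data."""
    n_frames = len(signal) // frame
    frames = signal[: n_frames * frame].reshape(n_frames, frame)
    spectra = np.abs(np.fft.rfft(frames * np.hanning(frame), axis=1)) ** 2
    return np.log(spectra + 1e-12)

def recognize(features: np.ndarray, model) -> str:
    """Model training module (inference side): DNN voiceprint recognition."""
    return model.predict(features)                      # returns a user identity

def match_corpus(user_id: str, corpus: dict) -> Optional[str]:
    """Identification judging module: map identity to a control instruction."""
    return corpus.get(user_id)                          # e.g. "unlock", or None
```

A caller would chain these as `match_corpus(recognize(extract_features(preprocess(audio)), model), corpus)`; each stage mirrors one module of the claimed system.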
In this embodiment, the identification and judgment module includes an intelligent door lock unit, a middleware unit and an application unit; the intelligent door lock unit is used for recording the door lock state, receiving control instructions issued by the middleware unit, and uploading door lock information; the middleware unit is used for transmitting the door lock state information uploaded by the door lock, receiving the operations and various index data of the intelligent door lock unit, and transmitting them to the corpus, which analyzes user behavior and monitors the real-time condition of the middleware unit according to the index data; the application unit is used for displaying door lock information, operating the door lock state, and upgrading the switch firmware to realize man-machine interaction. Voiceprint recognition identifies a person according to the frequency spectrum of their voice; even for the same person, the voiceprint spectrum changes with emotion or illness, a noisy environment exerts a non-negligible influence, and the voice can be imitated maliciously. Safety is the primary standard for measuring the performance of an intelligent door lock, and a necessary condition for its popularization. The voice signal is an important carrier of human information and emotional communication; it is the auditory organs' perception of the mechanical vibration of the acoustic medium, and the most important, effective, common and convenient mode of human communication. During voice communication, people inevitably experience interference from the surrounding environment, noise introduced by transmission errors, electrical noise within communication equipment, and other speakers, so that the received voice signal is not a clean original voice signal but a noisy voice signal contaminated with noise. Preprocessing the collected voice information can preliminarily eliminate noise interference and provide favorable conditions for subsequent accurate judgment.
It should be noted that, in order to obtain the purest possible speech signal from the noisy speech signal and reduce noise interference, speech enhancement is required; its purpose is to extract the purest possible original speech from the noisy speech. Speech enhancement is performed by spectral subtraction, which has a small computational load and is easy to implement in real time: if the noise is stationary or slowly varying additive noise, and the speech signal and the noise are mutually independent, the noise power spectrum is subtracted from the power spectrum of the noisy speech to obtain a purer speech spectrum. Since the human ear is insensitive to the phase of speech, what matters for intelligibility and quality is the short-time spectral amplitude of the speech signal rather than the phase; spectral subtraction exploits this perceptual characteristic of the human ear by replacing the phase of the estimated speech with the phase of the original noisy speech.
It should be appreciated that speech information is often interspersed with a wide variety of background noise, plus silences introduced between the words of an utterance; effectively detecting the start and stop points of speech greatly reduces the amount of data to be processed subsequently and improves recognition speed and recognition rate. In clean speech, the start and stop points can be detected using the short-time average energy. The energy of the voice signal changes continuously with time, and the energies of initial consonants and final vowels differ; if the voice signal is clean, the starting point of speech can be detected with short-time energy alone, distinguishing the boundary between initial and final sounds and between non-speech and speech segments, thereby effectively improving the accuracy of voice recognition.
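A minimal sketch of the short-time-energy endpoint detection described above follows (Python with NumPy); the framing parameters and the threshold rule, a fixed fraction of the peak frame energy, are illustrative assumptions rather than values from the patent:

```python
import numpy as np

def detect_endpoints(signal: np.ndarray, frame: int = 256, ratio: float = 0.1):
    """Locate the start and stop frames of speech via short-time average energy.

    The threshold (a fixed `ratio` of the peak frame energy) is an assumed
    rule for illustration; the patent only states that short-time energy
    suffices for clean speech.
    """
    n = len(signal) // frame
    frames = signal[: n * frame].astype(float).reshape(n, frame)
    energy = np.sum(frames ** 2, axis=1) / frame        # short-time average energy
    active = np.where(energy > ratio * energy.max())[0]
    if active.size == 0:
        return None                                     # no speech detected
    return int(active[0]), int(active[-1])              # start and stop frame indices
```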
Optionally, the executing process of the model training module includes:
comparing the test template corresponding to the test data with the training template, identifying by comparing similarity measures between the test template and the training template, and calculating the pronunciation speed of the voice data in the test data by combining a dynamic time warping algorithm;
preset function f [ (m)i ,ni )]Corresponds to a grid (mi ,ni ) Then there is a path cost function d [ (m)i ,ni )]And f [ (m)i ,ni )]=(mi-1 ,ni-1 ) Initializing to ni =i(i=1,2...,N),n1 =m1 =1,f[(m,n)]=(0,0),
Figure BDA0003929153210000101
Wherein R is a parallelogram constraint;
calculating f [ (m) by adopting a recursion methodi ,ni )]And d [ (m)i ,ni )]The expression is obtained as
Figure BDA0003929153210000102
Wherein m=mi ,mi -1,mi -2, the always true version of the speech signal d [ (M, N)]When i=n, the point (M, N) is traced back forward, thereby obtaining the optimum path (Mi-1 ,ni-1 )=f[(mi ,ni )](i=n, N-1,..3, 2), when (mi-1 ,ni-1 ) Ending at = (0, 0).
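Assuming Euclidean local distances and omitting the parallelogram constraint R for brevity, the recursion above can be sketched as follows (Python with NumPy); the function name is illustrative:

```python
import numpy as np

def dtw_cost(test: np.ndarray, ref: np.ndarray) -> float:
    """Total DTW path cost d[(M, N)] between two (frames, dims) templates.

    Implements d[(m_i, n_i)] = d(m_i, n_i) + min over m in
    {m_i, m_i - 1, m_i - 2} of d[(m, n_i - 1)], with Euclidean local
    distance; the parallelogram constraint R is omitted for brevity.
    """
    N, M = len(test), len(ref)
    local = np.linalg.norm(test[:, None, :] - ref[None, :, :], axis=2)  # N x M
    D = np.full((N, M), np.inf)
    D[0, 0] = local[0, 0]                     # path must start at (1, 1)
    for i in range(1, N):                     # test-frame index n_i
        for m in range(M):                    # reference-frame index m_i
            prev = [D[i - 1, p] for p in (m, m - 1, m - 2) if p >= 0]
            D[i, m] = local[i, m] + min(prev)
    return float(D[N - 1, M - 1])             # total cost of the optimal path
```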
In this embodiment, vector-quantized feature data are used to represent the overall feature vectors: the sampled signals are partitioned into classes, each class is represented by one vector, and that vector is quantized. The input feature vectors X = {x_i | i = 1, 2, ..., T} are preset; n denotes the iteration number, L_i(n) the i-th cell of the n-th iteration, and Y_i(n) the codeword of that cell, with J codewords in total, a maximum iteration count M', and an iteration threshold ε. The specific process comprises the following steps: selecting the centroid of all input feature vectors as the initialization codebook {Y_i(0), 1 ≤ i ≤ J}; splitting each cell in two with the small threshold ε, Y_i^(1) = Y_i(0) − ε and Y_i^(2) = Y_i(0) + ε, wherein Y_i^(1) and Y_i^(2) denote the two codewords obtained from the codeword before splitting, the number of cells being doubled and n = 0; setting n = n + 1 and assigning each frame's feature vector x to the nearest current codeword, x ∈ C_i(n) if d(x, Y_i(n−1)) ≤ d(x, Y_j(n−1)) for all j ≠ i, 1 ≤ i, j ≤ M', wherein d denotes the Euclidean distance between vectors, and updating each codeword as the centroid of its cell,

Y_i(n) = (1/|C_i(n)|)·Σ_{x ∈ C_i(n)} x;

when n = M' or the relative change in total distortion satisfies |D(n−1) − D(n)|/D(n) < ε, the iteration for the current codebook size ends: if more codewords are needed, the process returns to the splitting step; otherwise the best codebook {Y_i | i = 1, 2, ..., M'} is saved and output; if the distortion change rate is greater than the threshold, the clustering step at n = n + 1 is repeated. Vector quantization uses a small number of representative vectors to represent the overall feature vectors: signals are classified, each class is represented by one vector, and that vector is quantized, which greatly compresses the data volume and improves speech recognition efficiency.
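A sketch of the splitting-and-clustering procedure above follows (Python with NumPy); the stopping rule uses the relative distortion change against ε as described, while the guard against zero distortion and the default parameter values are assumptions:

```python
import numpy as np

def lbg_codebook(X: np.ndarray, target_size: int, eps: float = 1e-3,
                 max_iter: int = 50) -> np.ndarray:
    """Grow a codebook by splitting (Y - eps, Y + eps) and k-means refinement.

    `X` is the (T, dims) set of input feature vectors; `target_size` should
    be a power of two since each pass doubles the number of cells.
    """
    codebook = X.mean(axis=0, keepdims=True)            # centroid of all vectors
    while len(codebook) < target_size:
        codebook = np.vstack([codebook - eps, codebook + eps])  # split cells in two
        prev_distortion = np.inf
        for _ in range(max_iter):
            d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
            nearest = d.argmin(axis=1)                  # assign x to cell C_i(n)
            distortion = d[np.arange(len(X)), nearest].sum()
            if abs(prev_distortion - distortion) / max(distortion, 1e-12) < eps:
                break                                   # relative change below eps
            prev_distortion = distortion
            for i in range(len(codebook)):              # centroid update Y_i(n)
                members = X[nearest == i]
                if len(members):
                    codebook[i] = members.mean(axis=0)
    return codebook
```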
Optionally, the executing process of the feature extraction module includes:
performing speech enhancement on the voice signal by spectral subtraction, subtracting the noise power spectrum from the power spectrum of the noisy voice signal to obtain a clean speech spectrum;
letting s(t) denote the clean speech signal, n'(t) the additive noise signal, and y(t) the noisy speech signal, so that y(t) = s(t) + n'(t), and letting Y(ω), S(ω) and N'(ω) denote the Fourier transforms of y(t), s(t) and n'(t), respectively, with Y(ω) = S(ω) + N'(ω);
if the signal and the additive noise are independent, |Y(ω)|² = |S(ω)|² + |N'(ω)|²; letting P_y(ω), P_s(ω) and P_n'(ω) denote the power spectra of y(t), s(t) and n'(t), respectively, then P_y(ω) = P_s(ω) + P_n'(ω); the noise power spectrum P_n'(ω) is estimated from the speech-free but noisy segment before the utterance, giving P_s(ω) = P_y(ω) − P_n'(ω), and the subtracted power spectrum P_s(ω) determines the clean speech power spectrum, from which the denoised speech time-domain signal is recovered to obtain test data.
In this embodiment, recovering the denoised speech time-domain signal from the power spectrum to obtain test data includes: estimating the power spectrum of clean speech, |Ŝ(e^{jω})|², from the power spectrum |X(e^{jω})|² of the noisy speech x₀[n₀], using the expression

|Ŝ(e^{jω})|² = |X(e^{jω})|² − |N̂₀(e^{jω})|²,

wherein x₀[n₀] denotes the discrete sequence of the input signal; if additive noise n₀[n₀] is added to the speech signal s[n₀], then, since the noise is uncorrelated with the signal and non-stationary, with a rate of change smaller than that of the signal, x₀[n₀] = s[n₀] + n₀[n₀], and the Fourier transform yields

|X(e^{jω})|² = |S(e^{jω})|² + |N₀(e^{jω})|²,

wherein |N̂₀(e^{jω})|² is a statistical estimate of |N₀(e^{jω})|² during the speech-free segments, |X(e^{jω})|² denotes the power spectrum of the noisy speech, and |S(e^{jω})|² denotes the power spectrum of the speech signal; in the non-speech segments, the estimate of the noise power spectrum |N̂₀(e^{jω})|² is updated. Feature extraction obtains parameters that represent the physiological and psychological characteristics of the person in the voice signal, so that the voice information can be analyzed rapidly and effectively and noise interference eliminated.
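The power spectral subtraction described above may be sketched as follows (Python with NumPy); the use of non-overlapping rectangular frames and of the first few frames as the speech-free noise estimate are simplifying assumptions:

```python
import numpy as np

def spectral_subtraction(noisy: np.ndarray, noise_frames: int = 5,
                         frame: int = 256) -> np.ndarray:
    """Basic power spectral subtraction.

    Assumes the first `noise_frames` frames are speech-free, estimates
    P_n'(w) from them, subtracts it from P_y(w), and rebuilds the time
    signal with the noisy phase (the human ear is insensitive to phase).
    """
    n = len(noisy) // frame
    frames = noisy[: n * frame].astype(float).reshape(n, frame)
    spec = np.fft.rfft(frames, axis=1)
    power = np.abs(spec) ** 2
    noise_power = power[:noise_frames].mean(axis=0)       # P_n'(w) estimate
    clean_power = np.maximum(power - noise_power, 0.0)    # P_s = P_y - P_n'
    clean_mag = np.sqrt(clean_power)
    phase = np.angle(spec)                                # reuse noisy phase
    out = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame, axis=1)
    return out.reshape(-1)
```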
Optionally, estimating according to the input noise obtained from the noisy speech, removing the noise from the noisy speech by using a spectral subtraction algorithm to obtain an estimated value of the speech signal, re-estimating the transfer function of the wiener filter by using the output signal, and updating the background noise in the speech segment and the non-speech segment, including:
calculating an initial smoothed estimate of the background noise amplitude spectrum: presuming that the first N_no frames before speech are pure noise, the noise amplitude spectrum |D̂(e^{jω})| is estimated by statistical averaging of the amplitudes, with the recursion

|D̂_n(e^{jω})| = ((n−1)/n)·|D̂_{n−1}(e^{jω})| + (1/n)·|X_n(e^{jω})|, n = 1, ..., N_no,

wherein |D̂_n(e^{jω})| denotes the n-th statistical estimate of the background noise, with initial value |D̂_0(e^{jω})| = 0, and |X_n(e^{jω})|² denotes the power spectrum of the n-th frame of the noisy speech signal; the initial smoothed estimate of the background noise amplitude spectrum is

|D̂(e^{jω})| = |D̂_{N_no}(e^{jω})|,

wherein |X_{N_no}(e^{jω})| denotes the amplitude spectrum of the N_no-th frame of the noisy speech signal;

letting the frame variable n = N_no + 1, the Wiener filter transfer function is formed as

H_n(e^{jω}) = P̂_s(e^{jω}) / (P̂_s(e^{jω}) + P̂_n(e^{jω})),

wherein P̂_n(e^{jω}) denotes the smoothed estimate of the noise power spectrum and P̂_s(e^{jω}) denotes the estimate of the signal power spectrum;

filtering the amplitude spectrum of the noisy voice signal to obtain the estimate |N̂_n(e^{jω})| of the background noise amplitude spectrum of the current frame, and calculating the amplitude spectrum estimate |Ŝ_n(e^{jω})| of the signal obtained by spectral subtraction; using the current frame's noise amplitude spectrum estimate |N̂_n(e^{jω})|, the smoothed estimate of the background noise is updated by

|D̂_n(e^{jω})| = p·|D̂_{n−1}(e^{jω})| + (1−p)·|N̂_n(e^{jω})|,

wherein p denotes the scale factor; if the rate of change of the speech signal and the rate of change of the noise signal can be separated, p is set reasonably so that |D̂_n(e^{jω})| and |N̂_n(e^{jω})| change slowly while |Ŝ_n(e^{jω})| changes quickly;

computing the smoothed estimate of the signal amplitude spectrum; setting the frame variable n = n + 1: if n > the total frame number N, the procedure ends and the estimate |Ŝ(e^{jω})| is output; otherwise the above steps are repeated.
In this embodiment, spectral subtraction directly subtracts the noise spectrum from the noisy speech signal and reconstructs the enhanced speech using the phase of the noisy speech. Under the assumption that the additive noise and the short-time stationary speech signal are mutually independent, the noise power spectrum is subtracted from the power spectrum of the noisy speech to obtain a purer speech spectrum. Voice activity detection is combined to identify speech-free segments, from which the characteristics of the background noise are estimated; if the noise cannot be updated during speech segments, the accuracy of the background noise estimate suffers, and the accuracy of voice activity detection also affects the estimate of the background noise. Therefore the input noise is estimated from the noisy speech, the noise is removed by spectral subtraction to obtain an estimate of the speech signal, and the transfer function of the Wiener filter is re-estimated from the output signal, forming a feedback structure. The background noise can then be updated in both speech and non-speech segments, making the background noise estimate more accurate.
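A minimal sketch of one update step of this closed-loop noise tracking follows (Python with NumPy); taking (1 − H_n)|X_n| as the current-frame noise portion and the default value of the scale factor p are assumptions for illustration:

```python
import numpy as np

def update_noise_estimate(noise_smooth: np.ndarray, frame_mag: np.ndarray,
                          wiener_gain: np.ndarray, p: float = 0.9) -> np.ndarray:
    """One update step of the recursive background-noise tracker.

    `noise_smooth` is the running smoothed noise amplitude spectrum,
    `frame_mag` the current frame's noisy amplitude spectrum, and
    `wiener_gain` the transfer function H_n = P_s / (P_s + P_n); the noise
    portion of the frame is taken as (1 - H_n)|X_n|, an assumed choice.
    """
    frame_noise = (1.0 - wiener_gain) * frame_mag      # current-frame noise estimate
    return p * noise_smooth + (1.0 - p) * frame_noise  # smoothed update with factor p
```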
Optionally, the execution of the preprocessing module includes:
the sampling data of the preset voice signal is { Q ]K -k=1, 2..n), n representing the total number of samples, let Δt=1, using the expression
Figure BDA0003929153210000136
Figure BDA0003929153210000137
Wherein K is E [1, n]Calculating each undetermined coefficient alphaj (j=0, 1,2 m), let function->
Figure BDA0003929153210000138
And discrete sampled data QK Error quadratic sum E is minimum, then +.>
Figure BDA0003929153210000139
Satisfying the E extremum condition is expressed as
Figure BDA00039291532100001310
Sequentially selecting E to alphai Solving and compiling to generate m+1 element linear equation system
Figure BDA00039291532100001311
Figure BDA00039291532100001312
Calculating m+1 undetermined coefficients alphaj (j=0, 1. M., m represents the order of the set polynomial, the value ranges of i and j represent 0.ltoreq.i, j.ltoreq.m, and when m=0, the trend term represents a constant,
Figure BDA0003929153210000141
when m=0, m represents a constant trend term, namely an arithmetic average value of signal sampling data, and when m=1, a linear trend term is represented to be +.>
Figure BDA0003929153210000142
When m is equal to or greater than 2, the trend term represents a curve trend term.
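A sketch of the least-squares trend removal above follows (Python with NumPy); solving the normal system via `numpy.linalg.lstsq` is an implementation choice, not prescribed by the patent:

```python
import numpy as np

def remove_trend(samples: np.ndarray, m: int = 1) -> np.ndarray:
    """Least-squares trend removal of order m.

    Solves the (m+1)-equation normal system for the coefficients alpha_j of
    Q_hat(K) = sum_j alpha_j * K**j and subtracts the fitted trend; m = 0
    removes the arithmetic mean, m = 1 a linear trend, m >= 2 a curve.
    """
    K = np.arange(1, len(samples) + 1, dtype=float)   # sample index, delta_t = 1
    A = np.vander(K, m + 1, increasing=True)          # columns K**0 ... K**m
    alpha, *_ = np.linalg.lstsq(A, samples, rcond=None)
    return samples - A @ alpha                        # detrended signal
```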
In this embodiment, matching the recognition result with the pre-constructed corpus to obtain the control instruction includes: using a vectorization algorithm, each feature vector x in the r-dimensional Euclidean space R^r is mapped to one of a finite number of vectors obtained according to a preset criterion, wherein x denotes the input vector, y_i denotes the quantization vector or codeword, and Y = {y_1, y_2, ..., y_M} denotes the codebook; the number of codewords M is referred to as the codebook capacity, and the criterion for vector quantization on the training data is to minimize the quantization distortion at a given codebook capacity; when the voice of the speaker to be recognized is processed, a group of feature vectors X = {x_1, x_2, ..., x_T} is extracted in the same way as during training and is quantized with each of the N codebooks of the N speakers; the codebook Y_i closest to the set of feature vectors in the feature space is found, and the corresponding speaker i serves as the recognition result.
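A sketch of this codebook-matching step follows (Python with NumPy); averaging the per-vector quantization distortion is the matching criterion, and the function name is illustrative:

```python
import numpy as np

def identify_speaker(features: np.ndarray, codebooks: list) -> int:
    """Pick the speaker whose codebook quantizes the features with least distortion.

    `features` is the (T, r) set of vectors extracted as in training;
    `codebooks` holds one LBG codebook (array of codewords) per enrolled
    speaker. Mean Euclidean quantization distortion is the criterion.
    """
    distortions = []
    for Y in codebooks:
        d = np.linalg.norm(features[:, None, :] - Y[None, :, :], axis=2)
        distortions.append(d.min(axis=1).mean())  # mean distance to nearest codeword
    return int(np.argmin(distortions))            # index i of the matched speaker
```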
It should be noted that analysis of the voice signal is rarely performed on the raw signal directly: by frequency analysis, the voice signal can be decomposed into a plurality of signals of separate frequencies, the voice features in the voice signal are analyzed, and operations of feature extraction, removal, substitution and the like are performed, which are the key steps in processing the voice signal. Automatic voiceprint recognition comprises determining the identity of a speaker through probability matching in a subspace and extracting deep neural network features for identity matching; each speaker can be described by modeling with a hidden Markov model and a Gaussian mixture model, or adapted from a prior model by means of a universal background model, so that voiceprint differences and channel differences can be modeled separately, channel differences can be better compensated, and system performance is enhanced.
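As a non-limiting illustration of deep-neural-network identity matching, a cosine-similarity check between a voiceprint embedding and an enrolled template might look as follows (Python with NumPy); the threshold value is an assumption:

```python
import numpy as np

def verify_voiceprint(embedding: np.ndarray, enrolled: np.ndarray,
                      threshold: float = 0.7) -> bool:
    """Accept the speaker if the cosine similarity between the DNN voiceprint
    embedding and the enrolled template exceeds an assumed threshold."""
    sim = embedding @ enrolled / (np.linalg.norm(embedding) * np.linalg.norm(enrolled))
    return bool(sim >= threshold)
```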
Referring to fig. 2, the invention also provides an intelligent door lock recognition method based on voice recognition, which comprises the following steps:
s1: the voice information sent by the intelligent door lock is obtained, and the voice information is preprocessed to obtain a voice signal;
s2: extracting physiological features and personalized information representing a speaker from the voice signal, enhancing the voice signal according to the physiological features and the personalized information, and performing speech enhancement on the voice signal to obtain test data, wherein the physiological features comprise gender and age, and the personalized information comprises pitch and timbre;
s3: inputting the test data into a deep neural network, which is trained to perform voiceprint recognition on the voice signal and obtain a recognition result, wherein the recognition result comprises user identity and message information;
s4: matching the recognition result against a pre-constructed corpus to obtain a control instruction, and intelligently controlling the door lock according to the control instruction.
In this embodiment, the voice information sent by the intelligent door lock is preprocessed to obtain the voice signal; physiological features and personalized information representing the speaker are extracted from the voice signal; the voice signal is enhanced according to the physiological features and the personalized information, and speech enhancement is performed to obtain test data; the test data are input into a deep neural network, which is trained to perform voiceprint recognition and obtain a recognition result; the recognition result is matched against the pre-constructed corpus to obtain a control instruction, and the door lock is intelligently controlled accordingly, so that the intelligent door lock can rapidly acquire voice information, accurately authenticate it while excluding noise, realize flexible control of the intelligent door lock, and improve the user experience.
Any particular values in all examples shown and described herein are to be construed as merely illustrative and not a limitation, and thus other examples of exemplary embodiments may have different values.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The above examples merely represent a few embodiments of the present invention, which are described in relative detail, but they are not to be construed as limiting the scope of the invention. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the spirit of the invention, all of which fall within the scope of the invention.

Claims (10)

1. An intelligent door lock recognition system based on voice recognition, comprising:
the preprocessing module is used for preprocessing the voice information sent by the intelligent door lock to obtain a voice signal;
the feature extraction module is used for extracting physiological features and personalized information representing a speaker from the voice signal, enhancing the voice signal according to the physiological features and the personalized information, and performing speech enhancement on the voice signal to obtain test data, wherein the physiological features comprise gender and age, and the personalized information comprises pitch and timbre;
the model training module is used for inputting the test data into a deep neural network, which is trained to perform voiceprint recognition on the voice signal and obtain a recognition result, wherein the recognition result comprises user identity and message information;
and the identification judging module is used for matching the recognition result against a pre-constructed corpus to obtain a control instruction, and intelligently controlling the door lock according to the control instruction.
2. The intelligent door lock recognition system based on voice recognition according to claim 1, wherein the recognition judging module comprises an intelligent door lock unit, a middleware unit and an application unit;
the intelligent door lock unit is used for recording the door lock state, receiving control instructions issued by the middleware unit, and uploading door lock information; the middleware unit is used for transmitting the door lock state information uploaded by the door lock, receiving the operations and various index data of the intelligent door lock unit, and transmitting them to the corpus, which analyzes user behavior and monitors the real-time condition of the middleware unit according to the index data; the application unit is used for displaying door lock information, operating the door lock state, and upgrading the switch firmware to realize man-machine interaction.
3. The intelligent door lock recognition system based on voice recognition according to claim 1, wherein the executing of the model training module comprises:
comparing the test template corresponding to the test data with the training template, identifying by comparing similarity measures between the test template and the training template, and calculating the pronunciation speed of the voice data in the test data by combining a dynamic time warping algorithm;
preset function f [ (m)i ,ni )]Corresponds to a grid (mi ,ni ) Then there is a path cost function d [ (m)i ,ni )]And f [ (m)i ,ni )]=(mi-1 ,ni-1 ) Initializing to ni =i(i=1,2...,N),n1 =m1 =1,f[(m,n)]=(0,0),
Figure FDA0003929153200000021
Wherein R is a parallelogram constraint;
calculating f [ (m) by adopting a recursion methodi ,ni )]And d [ (m)i ,ni )]The expression is obtained as
Figure FDA0003929153200000022
Wherein m=mi ,mi -1,mi -2, the always true version of the speech signal d [ (M, N)]When i=n, the point (M, N) is traced back forward, thereby obtaining the optimum path (Mi-1 ,ni-1 )=f[(mi ,ni )](i=n, N-1,..3, 2), when (mi-1 ,ni-1 ) Ending at = (0, 0).
4. The intelligent door lock recognition system based on voice recognition according to claim 3, wherein vector-quantized feature data are used to represent the overall feature vectors: the sampled signals are partitioned into classes, each class is represented by one vector, and that vector is quantized; the input feature vectors X = {x_i | i = 1, 2, ..., T} are preset, n denotes the iteration number, L_i(n) denotes the i-th cell of the n-th iteration, Y_i(n) denotes the codeword of that cell, there are J codewords in total, the maximum iteration count is M', and the iteration threshold is set as ε; the specific process comprises the following steps:

selecting the centroid of all input feature vectors as the initialization codebook {Y_i(0), 1 ≤ i ≤ J}; splitting each cell in two with the small threshold ε, Y_i^(1) = Y_i(0) − ε and Y_i^(2) = Y_i(0) + ε, wherein Y_i^(1) and Y_i^(2) denote the two codewords obtained from the codeword before splitting, the number of cells being doubled and n = 0;

setting n = n + 1 and assigning each frame's feature vector x to the nearest current codeword, x ∈ C_i(n) if d(x, Y_i(n−1)) ≤ d(x, Y_j(n−1)) for all j ≠ i, 1 ≤ i, j ≤ M', wherein d denotes the Euclidean distance between vectors; each codeword is then updated as the centroid of its cell,

Y_i(n) = (1/|C_i(n)|)·Σ_{x ∈ C_i(n)} x;

when n = M' or the relative change in total distortion satisfies |D(n−1) − D(n)|/D(n) < ε, the iteration for the current codebook size ends: if more codewords are needed, the process returns to the splitting step; otherwise the best codebook {Y_i | i = 1, 2, ..., M'} is saved and output; if the distortion change rate is greater than the threshold, the clustering step at n = n + 1 is repeated.
5. The intelligent door lock recognition system based on voice recognition according to claim 1, wherein the feature extraction module is executed by:
performing speech enhancement on the voice signal by spectral subtraction, subtracting the noise power spectrum from the power spectrum of the noisy voice signal to obtain a clean speech spectrum;
letting s(t) denote the clean speech signal, n'(t) the additive noise signal, and y(t) the noisy speech signal, so that y(t) = s(t) + n'(t), and letting Y(ω), S(ω) and N'(ω) denote the Fourier transforms of y(t), s(t) and n'(t), respectively, with Y(ω) = S(ω) + N'(ω);
if the signal and the additive noise are independent, |Y(ω)|² = |S(ω)|² + |N'(ω)|²; letting P_y(ω), P_s(ω) and P_n'(ω) denote the power spectra of y(t), s(t) and n'(t), respectively, then P_y(ω) = P_s(ω) + P_n'(ω); the noise power spectrum P_n'(ω) is estimated from the speech-free but noisy segment before the utterance, giving P_s(ω) = P_y(ω) − P_n'(ω), and the subtracted power spectrum P_s(ω) determines the clean speech power spectrum, from which the denoised speech time-domain signal is recovered to obtain test data.
6. The intelligent door lock recognition system based on voice recognition of claim 5, wherein recovering the denoised voice time domain signal from the power spectrum to obtain the test data comprises:
estimating the power spectrum of clean speech, |Ŝ(e^{jω})|², from the power spectrum |X(e^{jω})|² of the noisy speech x₀[n₀], using the expression

|Ŝ(e^{jω})|² = |X(e^{jω})|² − |N̂₀(e^{jω})|²,

wherein x₀[n₀] denotes the discrete sequence of the input signal; if additive noise n₀[n₀] is added to the speech signal s[n₀], then, since the noise is uncorrelated with the signal and non-stationary, with a rate of change smaller than that of the signal, x₀[n₀] = s[n₀] + n₀[n₀], and the Fourier transform yields

|X(e^{jω})|² = |S(e^{jω})|² + |N₀(e^{jω})|²,

wherein |N̂₀(e^{jω})|² is a statistical estimate of |N₀(e^{jω})|² during the speech-free segments, |X(e^{jω})|² denotes the power spectrum of the noisy speech, and |S(e^{jω})|² denotes the power spectrum of the speech signal; in the non-speech segments, the estimate of the noise power spectrum |N̂₀(e^{jω})|² is updated.
7. The intelligent door lock recognition system based on voice recognition according to claim 6, wherein the estimating according to the input noise in the noisy voice, removing the noise in the noisy voice by using a spectral subtraction algorithm to obtain an estimated value of the voice signal, re-estimating the transfer function of the wiener filter by using the output signal, and updating the background noise in the voice segment and the non-voice segment, comprising:
calculating an initial smoothed estimate of the background noise amplitude spectrum: presuming that the first N_no frames before speech are pure noise, the noise amplitude spectrum |D̂(e^{jω})| is estimated by statistical averaging of the amplitudes, with the recursion

|D̂_n(e^{jω})| = ((n−1)/n)·|D̂_{n−1}(e^{jω})| + (1/n)·|X_n(e^{jω})|, n = 1, ..., N_no,

wherein |D̂_n(e^{jω})| denotes the n-th statistical estimate of the background noise, with initial value |D̂_0(e^{jω})| = 0, and |X_n(e^{jω})|² denotes the power spectrum of the n-th frame of the noisy speech signal; the initial smoothed estimate of the background noise amplitude spectrum is

|D̂(e^{jω})| = |D̂_{N_no}(e^{jω})|,

wherein |X_{N_no}(e^{jω})| denotes the amplitude spectrum of the N_no-th frame of the noisy speech signal;

letting the frame variable n = N_no + 1, the Wiener filter transfer function is formed as

H_n(e^{jω}) = P̂_s(e^{jω}) / (P̂_s(e^{jω}) + P̂_n(e^{jω})),

wherein P̂_n(e^{jω}) denotes the smoothed estimate of the noise power spectrum and P̂_s(e^{jω}) denotes the estimate of the signal power spectrum;

filtering the amplitude spectrum of the noisy voice signal to obtain the estimate |N̂_n(e^{jω})| of the background noise amplitude spectrum of the current frame, and calculating the amplitude spectrum estimate |Ŝ_n(e^{jω})| of the signal obtained by spectral subtraction; using the current frame's noise amplitude spectrum estimate |N̂_n(e^{jω})|, the smoothed estimate of the background noise is updated by

|D̂_n(e^{jω})| = p·|D̂_{n−1}(e^{jω})| + (1−p)·|N̂_n(e^{jω})|,

wherein p denotes the scale factor; if the rate of change of the speech signal and the rate of change of the noise signal can be separated, p is set reasonably so that |D̂_n(e^{jω})| and |N̂_n(e^{jω})| change slowly while |Ŝ_n(e^{jω})| changes quickly;

computing the smoothed estimate of the signal amplitude spectrum; setting the frame variable n = n + 1: if n > the total frame number N, the procedure ends and the estimate |Ŝ(e^{jω})| is output; otherwise the above steps are repeated.
8. The intelligent door lock recognition system based on voice recognition according to claim 1, wherein the preprocessing module is executed by:
the sampling data of the preset voice signal is { Q ]K -k=1, 2..n), n representing the total number of samples, let Δt=1, using the expression
Figure FDA0003929153200000052
Figure FDA0003929153200000053
Wherein K is E [1, n]Calculating each undetermined coefficient alphaj (j=0, 1,2 m), let function->
Figure FDA0003929153200000054
And discrete sampled data QK Error quadratic sum E is minimum, then +.>
Figure FDA0003929153200000055
Satisfying the E extremum condition is expressed as
Figure FDA0003929153200000056
Sequentially selecting E to alphai Solving and compiling to generate m+1 element linear equation system
Figure FDA0003929153200000057
Figure FDA0003929153200000058
Calculating m+1 undetermined coefficients alphaj (j=0, 1. M., m represents the order of the set polynomial, the value ranges of i and j represent 0.ltoreq.i, j.ltoreq.m, and when m=0, the trend term represents a constant,
Figure FDA0003929153200000059
when m=0, m represents a constant trend term, namely an arithmetic average value of signal sampling data, and when m=1, a linear trend term is represented to be +.>
Figure FDA00039291532000000510
When m is equal to or greater than 2, the trend term represents a curve trend term.
9. The intelligent door lock recognition system based on voice recognition according to claim 1, wherein the matching of the recognition result with the pre-built corpus successfully results in a control instruction, comprising:
using a vectorization algorithm, each feature vector x in the r-dimensional Euclidean space R^r is mapped to one of a finite number of vectors obtained according to a preset criterion, wherein x denotes the input vector, y_i denotes the quantization vector or codeword, and Y = {y_1, y_2, ..., y_M} denotes the codebook; the number of codewords M is referred to as the codebook capacity, and the criterion for vector quantization on the training data is to minimize the quantization distortion at a given codebook capacity;

when the voice of the speaker to be recognized is processed, a group of feature vectors X = {x_1, x_2, ..., x_T} is extracted in the same way as during training and is quantized with each of the N codebooks of the N speakers; the codebook Y_i closest to the set of feature vectors in the feature space is found, and the corresponding speaker i serves as the recognition result.
10. A voice recognition-based intelligent door lock recognition method of the voice recognition-based intelligent door lock recognition system according to any one of claims 1 to 9, comprising the steps of:
the voice information sent by the intelligent door lock is obtained, and the voice information is preprocessed to obtain a voice signal;
extracting physiological features and personalized information representing a speaker from the voice signal, enhancing the voice signal according to the physiological features and the personalized information, and performing speech enhancement on the voice signal to obtain test data, wherein the physiological features comprise gender and age, and the personalized information comprises pitch and timbre;
inputting the test data into a deep neural network, training and carrying out voiceprint recognition on the voice signal to obtain a recognition result, wherein the recognition result comprises user identity and message information;
and successfully matching the identification result with a pre-constructed corpus to obtain a control instruction, and intelligently controlling the door lock according to the control instruction.
CN202211382687.0A (filed 2022-11-07, priority 2022-11-07): Intelligent door lock recognition system and method based on voice recognition. Status: Pending. Publication: CN116129555A (en).

Priority Applications (1)

Application Number: CN202211382687.0A; Priority Date: 2022-11-07; Filing Date: 2022-11-07; Title: Intelligent door lock recognition system and method based on voice recognition

Applications Claiming Priority (1)

Application Number: CN202211382687.0A; Priority Date: 2022-11-07; Filing Date: 2022-11-07; Title: Intelligent door lock recognition system and method based on voice recognition

Publications (1)

Publication Number: CN116129555A (en); Publication Date: 2023-05-16

Family

ID=86301550

Family Applications (1)

Application Number: CN202211382687.0A; Title: Intelligent door lock recognition system and method based on voice recognition; Priority Date: 2022-11-07; Filing Date: 2022-11-07; Status: Pending; Publication: CN116129555A (en)

Country Status (1)

Country: CN (1); Link: CN116129555A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication Number: CN119169996A (en)*; Priority Date: 2024-09-18; Publication Date: 2024-12-20; Assignee: 广州市远知初电子科技有限公司; Title: Audio device mode language control method and system

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
周颖: "Android 声纹密码锁设计" (Design of an Android voiceprint password lock), China Master's Theses Full-text Database, no. 4 *
张营: "基音特征提取算法的研究及其在语音门锁中的应用" (Research on pitch feature extraction algorithms and their application in voice door locks), China Master's Theses Full-text Database, no. 4 *
林遂芳: "噪声环境下语音识别方法的研究" (Research on speech recognition methods in noisy environments), China Master's Theses Full-text Database, no. 3 *
郝晓雪: "Android 平台声纹解锁系统的研究" (Research on a voiceprint unlocking system for the Android platform), China Master's Theses Full-text Database, no. 3 *
陈超: "基于说话人识别的门禁系统及实现" (An access control system based on speaker recognition and its implementation), China Master's Theses Full-text Database, no. 1 *


Similar Documents

CN108597496B (en): Voice generation method and device based on generative adversarial network
CN103236260B (en): Speech recognition system
US5812973A (en): Method and system for recognizing a boundary between contiguous sounds for use with a speech recognition system
US5596679A (en): Method and system for identifying spoken sounds in continuous speech by comparing classifier outputs
CN114550703B (en): Training method and device of speech recognition system, speech recognition method and device
CN108564956B (en): Voiceprint recognition method and device, server and storage medium
US8447614B2 (en): Method and system to authenticate a user and/or generate cryptographic data
US5734793A (en): System for recognizing spoken sounds from continuous speech and method of using same
JPS62231996A (en): Allowance evaluation of word corresponding to voice input
CN101853661B (en): Noise spectrum estimation and voice activity detection method based on unsupervised learning
Todkar et al.: Speaker recognition techniques: A review
CN113744715A (en): Vocoder speech synthesis method, device, computer equipment and storage medium
KR100571574B1 (en): Similar speaker recognition method using nonlinear analysis and its system
CN110767238B (en): Blacklist identification method, device, equipment and storage medium based on address information
CN116129555A (en): Intelligent door lock recognition system and method based on voice recognition
JP4666129B2 (en): Speech recognition system using speech normalization analysis
Wu et al.: Speaker identification based on the frame linear predictive coding spectrum technique
CN113241059B (en): Voice wake-up method, device, equipment and storage medium
Slívová et al.: Isolated word automatic speech recognition system
Kumar et al.: Text dependent voice recognition system using MFCC and VQ for security applications
Jadhav et al.: Speech recognition to distinguish gender and a review and related terms
CN117037840A (en): Abnormal sound source identification method, device, equipment and readable storage medium
Stouten et al.: Joint removal of additive and convolutional noise with model-based feature enhancement
Nainan et al.: A comparison of performance evaluation of ASR for noisy and enhanced signal using GMM
Ishac et al.: Speaker identification based on vocal cords' vibrations' signal: effect of the window

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication (application publication date: 2023-05-16)
