Disclosure of Invention
The invention aims to provide a coded lock system and control method based on voice and text recognition, and a safe, so as to overcome the defects of the prior art.
In order to solve the technical problems, the technical scheme of the invention is as follows:
a coded lock system based on voice and text recognition, comprising:
the voice acquisition module is used for acquiring voice information of a user and comprises a microphone;
the text input module is used for inputting text information by a user and comprises a keyboard;
the storage module is used for storing voice password information and text password information of a user;
the processing module is used for receiving and matching the output signals of the voice acquisition module and the text input module with the voice password information and the text password information in the storage module and generating an instruction according to a matching result, and comprises a single chip microcomputer;
the voice output module is used for playing specific voice prompt information according to the instruction of the processing module and comprises a loudspeaker;
the coded lock module is used for being opened or closed according to the instruction of the processing module and comprises an electromagnetic lock;
the voice acquisition module, the text input module, the storage module, the voice output module and the coded lock module are all connected with the processing module.
In a preferred embodiment of the present invention, the coded lock module further includes a dc voltage booster, and the dc voltage output by the processing module is boosted by the dc voltage booster and then transmitted to the electromagnetic lock to control the bolt of the electromagnetic lock to retract.
In a preferred embodiment of the present invention, the voice password information includes speech feature parameters and speech instructions, and the speech feature parameters include pitch period and linear predictive cepstrum coding.
The invention also provides a coded lock control method based on voice and text recognition, which uses the above system and comprises the following steps:
collecting voice information of a user through a voice collecting module;
the processing module receives the voice information acquired by the voice acquisition module, and matches the voice information with the voice password information in the storage module after preprocessing;
if the matching is successful, the processing module sends an instruction to the coded lock module, and the coded lock module unlocks;
and if the matching is unsuccessful, the processing module sends an instruction to the voice output module and/or the text input module.
In a preferred embodiment of the present invention, if the matching is unsuccessful, the sending of the instruction to the voice output module and/or the text input module by the processing module is specifically:
if the number of consecutive unsuccessful matches between the voice information and the voice password information in the storage module, as counted by the processing module, has not reached a preset threshold value, the processing module sends an instruction to the voice output module, and the voice output module plays a voice prompt of failed recognition to prompt the user to input the voice information again;
if the number of consecutive unsuccessful matches between the voice information and the voice password information in the storage module reaches the preset threshold value, the processing module sends an instruction to the voice output module and the text input module, and the text input module is started to prompt the user to input text information;
the processing module receives the text information input by the text input module and matches the text information with the text password information in the storage module;
if the matching is successful, the processing module sends an instruction to the coded lock module, and the coded lock module unlocks;
if the matching is unsuccessful, the processing module sends an instruction to the coded lock module, and the coded lock module is locked.
In a preferred embodiment of the present invention, the step of receiving the voice information collected by the voice collecting module and matching the voice information with the voice password information in the storage module by the processing module specifically comprises:
the processing module receives the voice information collected by the voice collecting module, matches the voice characteristic parameters in the voice information with the voice characteristic parameters in the storage module by using a DTW algorithm, and simultaneously matches the voice command in the user voice information with the voice command in the storage module.
In a preferred embodiment of the present invention, the method further comprises the steps of inputting the voice password information and the text password information:
before the voice recognition device is used for the first time, a user inputs voice information at least twice through the voice acquisition module, the processing module processes the voice information acquired by the voice acquisition module to obtain voice characteristic parameters and voice commands and stores the voice characteristic parameters and the voice commands into the storage module, the user inputs a text password through the text input module, and the processing module stores the text password into the storage module after recognition.
In a preferred embodiment of the present invention, the preprocessing of the voice information collected by the voice collecting module by the processing module specifically comprises:
the processing module performs A/D conversion, noise removal and endpoint detection on the voice information acquired by the voice acquisition module.
In a preferred embodiment of the present invention, the specific method for obtaining the speech feature parameters comprises:
the processing module carries out linear prediction on voice information, obtains a prediction residual error, obtains an autocorrelation function on a residual error signal, and finds a first peak point position except a zero point so as to obtain a pitch period; when the voice information is an autoregressive signal, the processing module obtains linear prediction cepstrum coding by utilizing linear prediction analysis.
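The pitch-period step above can be illustrated with a minimal Python sketch. It is not the implementation of the invention: the Levinson-Durbin recursion, the function names, the lag search range (`min_lag`, `max_lag`) and the synthetic test signal are all assumptions introduced for illustration, and the "first peak point except the zero point" is approximated by the highest autocorrelation value inside the assumed lag range.

```python
def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n + k], for k = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def lpc_coefficients(x, order=12):
    """Prediction coefficients alpha_1..alpha_P via the Levinson-Durbin recursion."""
    r = autocorrelation(x, order)
    if r[0] == 0:
        raise ValueError("signal has zero energy")
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def prediction_residual(x, alpha):
    """e(n) = x(n) - sum_l alpha_l * x(n - l), taking x(n) = 0 for n < 0."""
    return [
        x[n] - sum(al * x[n - l] for l, al in enumerate(alpha, start=1) if n - l >= 0)
        for n in range(len(x))
    ]


def pitch_period(x, order=12, min_lag=20, max_lag=160):
    """Lag (in samples) of the strongest residual autocorrelation peak in the range."""
    residual = prediction_residual(x, lpc_coefficients(x, order))
    r = autocorrelation(residual, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda k: r[k])


def synthetic_voiced(period=80, length=800):
    """Hypothetical test signal: an impulse train passed through a 2-pole resonator."""
    y = []
    for n in range(length):
        e = 1.0 if n % period == 0 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(e + 1.3 * y1 - 0.8 * y2)
    return y


print(pitch_period(synthetic_voiced(period=80)))
```

For the synthetic signal the reported lag should equal the excitation period of 80 samples. On real speech the lag range would have to be chosen from the sampling rate and the expected pitch range, neither of which is specified in this text.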
The invention also provides a safe based on speech and text recognition, characterized in that the safe comprises the system according to any one of claims 1-3 or the control method according to any one of claims 4-8.
Compared with the prior art, the invention has the beneficial effects that:
(1) the invention overcomes the problem of insufficient security of a single password by combining the voice recognition and the text password, and improves the security of the coded lock; furthermore, the voice recognition adopts double recognition of the voice and the voice instruction of a specific person, so that the safety of the coded lock is further improved; and furthermore, the pitch period and Linear Predictive Cepstrum Code (LPCC) are jointly used as characteristic parameters of the voice recognition of the specific person, so that the accuracy of the voice recognition of the specific person is improved, and the safety of the coded lock is further improved.
(2) By adopting an unlocking mode in which the voice password is tried first and the text password is used only afterwards, the invention reduces the frequency of text input through convenient voice input, thereby improving the operation convenience of the coded lock.
Detailed Description
So that the manner in which the above recited objects, features and advantages of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to the embodiments thereof which are illustrated in the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
As shown in fig. 1, a coded lock system based on speech and text recognition includes a speech acquisition module 100, a text input module 200, a processing module 300, a storage module 400, a speech output module 500, and a coded lock module 600, wherein the speech acquisition module 100, the text input module 200, the storage module 400, the speech output module 500, and the coded lock module 600 are all connected to the processing module 300. The system overcomes the problem of insufficient safety of a single password by combining voice recognition and a text password, and improves the safety of the coded lock. It is understood that the text password includes one or any combination of characters, letters, numbers and special symbols, and the user inputs the text password through the text input module 200.
Preferably, the system further comprises an online debugging module, which comprises an online debugger and is used by a system developer to maintain the system and fix bugs occurring in the system.
Specifically, the voice collecting module 100 is used for collecting voice information of a user and includes a microphone. As shown in fig. 2, the voice collecting module 100 can collect voice signals by using a front-end circuit including the microphone. In this front-end circuit, R21, R24 and C25 form a filter circuit; R22 and R23 provide bias for the microphone and form a differential circuit, so that the electrical signal can be input differentially to the MICP and MICN pins of the SPCE061A single chip microcomputer; and C30 and C32 serve to suppress the direct current component.
The text input module 200 is used for a user to input text information. Text input module 200 includes a keyboard.
The storage module 400 is used for storing the voice password information and the text password information of the user, both of which are entered in advance, before the user uses the system for the first time. The storage module 400 includes an external (expansion) Flash memory.
The processing module 300 is configured to receive and match the output signals of the voice collecting module 100 and the text input module 200 with the voice password information and the text password information in the storage module 400, and generate an instruction according to a matching result. The processing module 300 includes a single chip microcomputer, and may specifically adopt an SPCE061A single chip microcomputer.
The voice output module 500 is used for playing the specific voice prompt information according to the instruction of the processing module 300. The voice output module 500 includes a speaker.
The coded lock module 600 is used for opening or closing according to the instruction of the processing module 300. The coded lock module 600 includes an electromagnetic lock.
In a preferred embodiment of the present invention, the coded lock module 600 further includes a dc booster, and the dc voltage output by the processing module 300 is boosted by the dc booster and then transmitted to the electromagnetic lock to control the bolt of the electromagnetic lock to retract. Specifically, the processing module 300 outputs a direct current voltage of about 3V through the I/O port, the voltage rises to 12V or 24V through the direct current booster, and when the electromagnetic lock is connected to a direct current power supply of 12V or 24V, the lock bolt retracts into the lock body to realize unlocking.
In a preferred embodiment of the present invention, the voice password information includes a voice characteristic parameter and a voice command. The voice recognition of the system adopts dual recognition of voice and voice instruction of a specific person, so that the safety of the coded lock is further improved. The speech characteristic parameters include pitch period and linear predictive cepstral coding. The system adopts the pitch period and the Linear Prediction Cepstrum Code (LPCC) as the characteristic parameters of the voice recognition of the specific person together, improves the accuracy of the voice recognition of the specific person, and further improves the safety of the coded lock.
As shown in fig. 3, the present invention further provides a coded lock control method based on voice and text recognition, comprising the following steps:
the voice information of the user is collected by the voice collecting module 100.
The processing module 300 receives the voice information collected by the voice collecting module 100, and matches the voice information with the voice password information in the storage module 400 after preprocessing.
If the matching is successful, the processing module 300 sends an instruction to the coded lock module 600, and the coded lock module 600 unlocks.
If the matching is not successful, the processing module 300 sends an instruction to the voice output module 500 and/or the text input module 200.
By adopting an unlocking mode in which the voice password is tried first and the text password is used only afterwards, the control method reduces the frequency of text input through convenient voice input, thereby improving the operation convenience of the coded lock.
In a preferred embodiment of the present invention, the method further comprises the steps of inputting the voice password information and the text password information:
before the initial use, a user inputs voice information at least twice through the voice acquisition module 100, the processing module 300 processes the voice information acquired by the voice acquisition module 100 to obtain voice characteristic parameters and voice commands and stores the voice characteristic parameters and the voice commands into the storage module 400, the user inputs a text password through the text input module 200, and the processing module 300 stores the text password into the storage module 400 after recognition.
Generally speaking, the user performs voice training input twice. The system gives a voice signal input prompt, and the voice of the user enters the voice signal acquisition front-end circuit through the microphone. When the voice command is spoken for the first time, the SPCE061A performs A/D conversion and preprocessing on the acquired voice signal, and then extracts the pitch period and LPCC of the user as voice characteristic parameters. When the voice command is spoken for the second time, the process is the same as the first time, and the characteristic parameters obtained from the two training passes are stored in the external Flash as characteristic parameter templates. A plurality of different voice commands can be trained in this way and stored in the Flash voice library, and a long text password is set to handle the case in which voice recognition fails three times.
If the matching is unsuccessful, the processing module 300 sends an instruction to the voice output module 500 and/or the text input module 200, specifically:
if the number of consecutive unsuccessful matches between the voice information and the voice password information in the storage module 400, as counted by the processing module 300, has not reached the preset threshold value, the processing module 300 sends an instruction to the voice output module 500, and the voice output module 500 plays the voice prompt of the recognition failure to prompt the user to input the voice information again.
If the number of consecutive unsuccessful matches between the voice information and the voice password information in the storage module 400 reaches the preset threshold, the processing module 300 sends an instruction to the voice output module 500 and the text input module 200, and the text input module 200 is started to prompt the user to input the text information.
The preset threshold is generally set to 3 times, but is not limited thereto, and may be 1 time, 2 times, or 4 times, 5 times, or more. The preset threshold may be set by the manufacturer or may be set by the user.
The voice information input by the user is generally limited to 3 s. After unlocking, the count of unsuccessful matches is reset to zero.
The processing module 300 receives the text information input by the text input module 200 and matches it with the text password information in the storage module 400. If the matching is successful, the processing module 300 sends an instruction to the coded lock module 600, and the coded lock module 600 unlocks. If the matching is unsuccessful, the processing module 300 sends an instruction to the coded lock module 600, and the coded lock module 600 is locked. Generally speaking, the single chip microcomputer switches on the power supply circuit of the keyboard; once the circuit is switched on, the preset text password can be input through the keyboard for authentication. If this authentication also fails, the coded lock is locked, and it can only be restored to factory settings by the manufacturer. The text password may generally be a long numeric password.
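The voice-first, text-fallback flow described above can be summarised as a small state machine. The following Python sketch is only illustrative: the class name, the state names, the prompt strings and the `match_voice` callback are hypothetical, and re-locking after a successful unlock is not modelled.

```python
import hmac


class CodedLockController:
    """Illustrative controller: voice attempts first, text password as fallback."""

    def __init__(self, match_voice, text_password, threshold=3):
        self.match_voice = match_voice      # callable: voice info -> bool (assumed)
        self._text_password = text_password
        self.threshold = threshold          # preset threshold, 3 by default
        self.failures = 0                   # consecutive unsuccessful voice matches
        self.state = "AWAIT_VOICE"
        self.prompts = []                   # stands in for the voice output module

    def on_voice(self, voice_info):
        if self.state != "AWAIT_VOICE":
            return self.state
        if self.match_voice(voice_info):
            self._unlock()
        else:
            self.failures += 1
            if self.failures < self.threshold:
                self.prompts.append("recognition failed, please speak again")
            else:
                # Threshold reached: start the text input module.
                self.prompts.append("please enter the text password")
                self.state = "AWAIT_TEXT"
        return self.state

    def on_text(self, text):
        if self.state != "AWAIT_TEXT":
            return self.state
        # Constant-time comparison of the entered and stored passwords.
        if hmac.compare_digest(text.encode(), self._text_password.encode()):
            self._unlock()
        else:
            self.state = "LOCKED_OUT"       # only a factory reset restores the lock
        return self.state

    def _unlock(self):
        self.failures = 0                   # reset the count after unlocking
        self.state = "UNLOCKED"


lock = CodedLockController(lambda v: v == "open sesame", "2468")
for attempt in ("hello", "hello", "hello"):
    lock.on_voice(attempt)
print(lock.state, lock.on_text("2468"))
```

Keeping the failure counter inside the controller makes the reset-on-unlock behaviour explicit; the constant-time comparison is a design choice of this sketch and is not stated in the description.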
The processing module 300 receives the voice information collected by the voice collecting module 100, and matches the voice information with the voice password information in the storage module 400 specifically as follows:
the processing module 300 receives the voice information collected by the voice collecting module 100, matches the voice feature parameters in the voice information with the voice feature parameters in the storage module 400 by using the DTW algorithm, and matches the voice command in the user voice information with the voice command in the storage module 400. That is, a DTW (dynamic time warping) algorithm with relaxed start and end points is used to match the voice feature parameters in the voice information with the voice feature parameters in the storage module 400, and the command word recognition function of the DTW algorithm is used to match the voice command in the user voice information with the voice command in the storage module 400. DTW is a method for measuring the similarity between two time sequences of different lengths. Here it compares the voice command obtained from the voice acquisition module 100 with the voice command in the storage module 400, and relaxes the start and end points of the two sequences: the minimum is taken over the starting points (1,1), (1,2), (2,1), (1,3), (3,1) and the ending points (N, M), (N-1, M), (N, M-1), (N-2, M), (N, M-2), so that the shortest distance between the two voice samples after relaxation of the corresponding points is selected. For a voice command to be recognized, the DTW algorithm is used to calculate its distance to each voice command in the storage module 400. The stored command with the shortest distance, i.e. the most similar one, is taken as the recognized voice command.
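The endpoint-relaxed DTW matching described above can be sketched as follows. This is a minimal illustration under stated assumptions: the Euclidean local distance, the three-neighbour step pattern, the function names and the toy templates are not taken from the description, which only specifies the relaxed start and end points and the shortest-distance decision.

```python
import math


def dtw_relaxed(a, b, slack=2):
    """DTW distance between feature sequences a and b with relaxed endpoints.

    The path may start at (1,1), (1,2), (2,1), (1,3) or (3,1) and end at
    (N,M), (N-1,M), (N,M-1), (N-2,M) or (N,M-2) (1-based, for slack=2).
    """
    n, m = len(a), len(b)

    def dist(u, v):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))

    starts = {(0, 0)}
    ends = {(n - 1, m - 1)}
    for k in range(1, slack + 1):
        if k < m:
            starts.add((0, k))
            ends.add((n - 1, m - 1 - k))
        if k < n:
            starts.add((k, 0))
            ends.add((n - 1 - k, m - 1))

    inf = float("inf")
    d = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            best = 0.0 if (i, j) in starts else inf
            if i > 0:
                best = min(best, d[i - 1][j])
            if j > 0:
                best = min(best, d[i][j - 1])
            if i > 0 and j > 0:
                best = min(best, d[i - 1][j - 1])
            d[i][j] = dist(a[i], b[j]) + best
    return min(d[i][j] for i, j in ends)


def recognize(utterance, templates):
    """Return the name of the stored template with the shortest DTW distance."""
    return min(templates, key=lambda name: dtw_relaxed(utterance, templates[name]))


# Toy one-dimensional feature sequences (hypothetical data).
templates = {
    "open": [[0.0], [1.0], [2.0], [3.0]],
    "close": [[3.0], [2.0], [1.0], [0.0]],
}
print(recognize([[0.0], [0.0], [1.0], [2.0], [3.0]], templates))
```

In a real system a rejection threshold on the shortest distance would also be needed so that an unknown speaker or command is not simply mapped to the nearest template; the description does not give such a threshold, so none is shown.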
The preprocessing of the voice information acquired by the voice acquisition module 100 by the processing module 300 specifically includes: the processing module 300 performs a/D conversion, noise removal, and endpoint detection on the voice information collected by the voice collecting module 100.
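The description does not state how noise removal and endpoint detection are carried out. As one possible illustration, the Python sketch below assumes that A/D conversion has already produced integer samples, uses DC-offset removal as a stand-in for noise removal, and detects endpoints by thresholding short-time frame energy. The frame length, the threshold ratio and the function name are hypothetical.

```python
def detect_endpoints(samples, frame_len=80, threshold_ratio=0.1):
    """Return (start, end) sample indices of the active speech region, or None.

    Assumed technique: remove the DC offset, compute the energy of each
    non-overlapping frame, and keep frames above a fraction of the peak energy.
    """
    if len(samples) < frame_len:
        return None
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    energies = [
        sum(v * v for v in centered[i:i + frame_len])
        for i in range(0, len(centered) - frame_len + 1, frame_len)
    ]
    threshold = threshold_ratio * max(energies)
    active = [k for k, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


# Hypothetical recording: silence, a burst of alternating samples, silence.
recording = [0] * 160 + [100 if n % 2 == 0 else -100 for n in range(240)] + [0] * 160
print(detect_endpoints(recording))
```

Only the samples between the detected endpoints would then be passed on to feature extraction, which keeps leading and trailing silence out of the templates.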
The specific method for obtaining the voice characteristic parameters comprises the following steps: the processing module 300 performs linear prediction on the voice information to obtain a prediction residual, then computes the autocorrelation function of the residual signal and finds the position of the first peak point other than the zero point, so as to obtain the pitch period; when the voice information is an autoregressive signal, the processing module 300 obtains the linear prediction cepstrum coding by using linear prediction analysis. That is, the processing module 300 approximates each voice sample by a linear combination of several past voice sample values, takes the difference between the actual sample and this prediction as the prediction residual, and then computes the autocorrelation function of the residual signal. Since the period of the autocorrelation function is the same as the period of the voice signal, the pitch period is obtained by finding the position of the first peak of the autocorrelation function other than the zero point. The linear prediction cepstrum coding is calculated by a recursion formula based on linear prediction coding (LPC): the prediction coefficients α_l are obtained from the LPC model of formula (1), and the cepstrum coefficients C_n are then calculated from α_l through formula (2), so that the linear prediction cepstrum coding is obtained.
Wherein P is the order of the LPC model (P = 12), x(n) is the sample value, x̂(n) is the predicted value, and α_l is the prediction coefficient.
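Formulas (1) and (2) themselves are not reproduced in this text. Assuming the conventional LPC predictor and the conventional LPCC recursion, which agree with the symbol definitions given above but are not confirmed by the original, they would take the following form:

\hat{x}(n) = \sum_{l=1}^{P} \alpha_l \, x(n-l)    (1)

C_1 = \alpha_1, \quad C_n = \alpha_n + \sum_{k=1}^{n-1} \frac{k}{n} \, C_k \, \alpha_{n-k} \ (1 < n \le P), \quad C_n = \sum_{k=n-P}^{n-1} \frac{k}{n} \, C_k \, \alpha_{n-k} \ (n > P)    (2)

Under that assumption, the recursion can be sketched in Python as follows; the function name and the example coefficients are hypothetical.

```python
def lpcc(alpha, n_ceps):
    """Cepstrum coefficients C_1..C_n_ceps from prediction coefficients alpha_1..alpha_P.

    Implements the assumed recursion (2); alpha[l - 1] holds alpha_l.
    """
    p = len(alpha)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused so that c[n] holds C_n
    for n in range(1, n_ceps + 1):
        acc = alpha[n - 1] if n <= p else 0.0
        # Only terms with 1 <= n - k <= P contribute.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * alpha[n - k - 1]
        c[n] = acc
    return c[1:]


# First-order example: for a single coefficient a, C_n equals a**n / n.
print(lpcc([0.5], 3))
```

The first-order case is a convenient sanity check because the cepstrum of a single-pole model has the closed form C_n = a^n / n.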
The invention also provides a safe based on voice and text recognition, which comprises the above coded lock system based on voice and text recognition or the above coded lock control method based on voice and text recognition.
In conclusion, the invention overcomes the problem of insufficient security of a single password by combining the voice recognition and the text password, and improves the security of the coded lock; furthermore, the voice recognition adopts double recognition of the voice and the voice instruction of a specific person, so that the safety of the coded lock is further improved; and furthermore, the pitch period and Linear Predictive Cepstrum Code (LPCC) are jointly used as characteristic parameters of the voice recognition of the specific person, so that the accuracy of the voice recognition of the specific person is improved, and the safety of the coded lock is further improved.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are only illustrative, for example, the division of the unit is only a logical functional division, and in actual implementation, there may be other divisions, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The above description is only an embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present invention, and all such changes or substitutions are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.