Background
With the continuing development of information technology, networks have become an important part of study, work and daily life, and personal identification is now widely used in many fields. As awareness of network security has grown, however, traditional identification methods such as certificates and static passwords no longer meet users' needs because they are easily stolen, copied and cracked. Advances in artificial intelligence have brought biometric identification to the fore, and methods such as face recognition, fingerprint recognition, iris recognition and electrocardiogram (ECG) signal recognition have been proposed. Compared with other biological signals, the ECG signal (electrocardiosignal) has clear advantages. On the one hand, its generation mechanism is complex, so an ECG signal used as a biometric feature is difficult to forge. On the other hand, the ECG signal is one-dimensional; compared with two-dimensional signals such as fingerprint and iris images, it requires less computation and is easier to store and process.
ECG-based identification rests on the assumption that the waveform of the ECG signal remains relatively stable over a certain period. This assumption holds for healthy individuals, so the ECG signal satisfies the stability requirement of identification. At the same time, because the ECG signal is influenced by factors such as age, sex, body type and physical condition, the ECG signals of different individuals differ considerably, so the ECG signal also satisfies the distinctiveness requirement of identification. In summary, the ECG signal meets the basic requirements of biometric identification.
Researchers at home and abroad have carried out extensive research on identification based on ECG signals. The main topics are denoising of the ECG signal, extraction of ECG feature points, dimensionality reduction of heartbeat features, and design of the identity recognition classifier.
In 2001, Lena Biel et al. proposed identity recognition based on ECG signals for the first time. They acquired ECG signals with commercial ECG recording equipment, extracted 30 features such as time and amplitude from the signals, and classified them with the soft independent modeling of class analogy (SIMCA) method. Twenty subjects of different age groups took part in the experiments, and the identity recognition rate reached 100%. However, this method cannot realize automatic identity recognition and has high algorithmic complexity. Since then, researchers have made targeted improvements to ECG identification algorithms.
Israel et al. located the waveforms in time by finding the local extrema around the P wave, R wave and T wave in each heartbeat, extracted 15 time-point and time-interval features, and classified them using linear discriminant analysis. The individual identification rate of this method reached 100%, and the heartbeat identification rate reached 81%.
In subsequent work, Shen et al. proposed an identity recognition method based on single-lead ECG signals, applying template matching and a decision-based neural network separately, and obtained recognition rates of 95% and 80%, respectively.
At present, ECG signal identification still faces problems such as heavy computation and insufficient identification accuracy. With the advent of the big data era, data has changed from a simple object of processing into a basic resource, which greatly increases the complexity of feature extraction and classification of heartbeat data, so that efficiency and accuracy have become unavoidable bottlenecks for both heartbeat classification and identity recognition.
Disclosure of Invention
The invention aims to provide an efficient, low-computation method capable of automatic electrocardiosignal identity recognition, so as to cope with the huge data volumes of the big data era. Complete waveforms are extracted on the basis of locating only the R wave peak points to form morphological features; singular value decomposition is then combined with linear discriminant analysis to reduce the feature dimensionality as far as possible while maintaining a given recognition rate, thereby reducing the computational cost. The classifier is a generalized regression neural network (GRNN), an improvement on the radial basis function (RBF) neural network that is markedly faster than conventional neural networks. Feature extraction in the method is simple and does not depend heavily on positioning, so resource utilization can be maximized. Compared with the traditional BP neural network and RBF neural network, the method effectively improves the training speed and accuracy of identity recognition.
The technical scheme of the invention comprises a GRNN-based identity recognition method, which is characterized by comprising the following steps: A. acquiring an electrocardiosignal sample data set of heart beat data, and removing noise in the electrocardiosignal by wavelet transform; B. locating the R wave peak points of the denoised electrocardiosignals, intercepting a fixed number of points before and after each R wave peak point to segment the heartbeats so as to construct morphological features of the electrocardiosignals, and constructing a training set heartbeat feature database and a test set heartbeat feature database respectively; C. removing redundant features in the electrocardiosignals by singular value decomposition, so as to reduce the correlation among heart beat data of different individuals and increase the correlation among heart beat data of the same individual; D. performing dimensionality reduction on the electrocardiosignals obtained in step C by linear discriminant analysis to obtain feature vectors for training and testing; E. training a generalized regression neural network classifier, identifying the input electrocardiosignals, and outputting the identity information of the individual according to the principle of multi-heartbeat voting.
According to the GRNN-based identity recognition method, the step A further comprises the steps of obtaining an electrocardiosignal sample data set comprising heart beat data of a plurality of users and a plurality of periods, and removing noise in the electrocardiosignal by adopting wavelet transformation, wherein the noise comprises baseline drift, electromyographic interference and power frequency interference.
According to the GRNN-based identity recognition method, the step A of removing the noise in the electrocardiosignals by wavelet transform further comprises the following steps: A101, performing nine-layer decomposition on the electrocardiosignal using the db4 wavelet of the Daubechies wavelet family to obtain a decomposed electrocardiosignal data set; A102, setting the wavelet coefficients of the high-frequency component of the first-layer decomposition to zero to remove high-frequency interference; A103, setting the wavelet coefficients of the low-frequency component of the ninth-layer decomposition to zero to remove low-frequency interference, thereby obtaining a denoised ECG signal data set; and A104, performing threshold quantization processing on the wavelet coefficients in the wavelet domain through a threshold function, and performing inverse discrete wavelet transform on the estimated wavelet coefficients obtained after quantization to obtain a reconstructed electrocardiosignal.
According to the GRNN-based identity recognition method, the threshold quantization processing of the wavelet coefficients by the threshold function in step A104 includes: given a threshold value λ, performing quantization processing using the threshold function, wherein
$$\lambda = \sigma \sqrt{2 \ln N}$$
wherein N is the number of sampling points of the electrocardiosignal, and σ is obtained by wavelet coefficient estimation.
According to the GRNN-based identity recognition method, the step B further comprises the following steps: B101, performing R wave peak point positioning on the denoised electrocardiosignals to obtain the R wave peak point sets of all heartbeats; B102, intercepting a fixed number of points before and after each R wave peak point of the denoised electrocardiosignal, these points and the R wave peak point together forming an independent heart beat vector, so that each segment of electrocardiosignal is divided into a plurality of heart beat data and the morphological features of each heart beat are obtained; and B103, dividing the heart beat data of the same individual into two parts, which are used to construct a training set heart beat feature database and a test set heart beat feature database respectively.
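According to the GRNN-based identity recognition method, the step C further comprises the following steps: C101, constructing the segmented independent single-period heart beat data into an m × n dimensional heart beat feature matrix, wherein the heart beat feature matrix is as follows:
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix}$$
and carrying out singular value decomposition on the feature matrix, wherein the decomposition is as follows:
$$X = U \Sigma V^{T}$$
wherein the main-diagonal elements $\sigma_1, \sigma_2, \dots, \sigma_r$ of Σ are the singular values of the feature matrix; C102, the singular values obtained in step C101 are arranged by size to obtain $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r$; C103, taking the first L larger singular values to reconstruct the feature matrix, namely
$$X_L = \sum_{i=1}^{L} \sigma_i u_i v_i^{T}$$
wherein $u_i$ and $v_i$ are the i-th columns of U and V respectively.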
According to the GRNN-based identity recognition method, step D further comprises the following steps: D101, calculating the mean vector μ of the heart beat data set after singular value decomposition; D102, calculating an inter-class divergence matrix $S_b$ and an intra-class divergence matrix $S_w$ through the mean vector μ; D103, solving the eigenvalue problem of the matrix $S_w^{-1} S_b$ to obtain its eigenvalues and eigenvectors; D104, arranging the eigenvectors in descending order of eigenvalue, and selecting the first K eigenvectors to form a projection matrix W; D105, projecting the data set D into a new subspace, the calculation being Y = X × W.
According to the GRNN-based identity recognition method, step E further comprises the following steps: E101, training the generalized regression neural network by taking the dimensionality-reduced training set heartbeat features as the input of the GRNN and the test set heartbeat features as the output of the GRNN; and E102, performing identity recognition on each heartbeat, and determining the final output by heartbeat voting.
Detailed Description
The conception, the specific structure and the technical effects of the present invention will be clearly and completely described in conjunction with the embodiments and the accompanying drawings to fully understand the objects, the schemes and the effects of the present invention.
It should be noted that, unless otherwise specified, when a feature is referred to as being "fixed" or "connected" to another feature, it may be directly fixed or connected to the other feature or indirectly fixed or connected to the other feature. Furthermore, the descriptions of upper, lower, left, right, etc. used in the present disclosure are only relative to the mutual positional relationship of the constituent parts of the present disclosure in the drawings. As used in this disclosure, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. The terminology used in the description herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any combination of one or more of the associated listed items.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element of the same type from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the present disclosure. The use of any and all examples, or exemplary language (e.g., "such as" or "the like") provided herein, is intended merely to better illuminate embodiments of the invention and does not pose a limitation on the scope of the invention unless otherwise claimed.
Fig. 1 is a schematic flow chart of a GRNN-based identification method using electrocardiosignals according to an embodiment of the present invention. With reference to Fig. 1, the steps of the electrocardiosignal-based identification method in the embodiment are as follows:
S100, acquiring an electrocardiosignal sample data set comprising multiple periods of heart beat data of multiple users, wherein the data for each user comprise two segments of electrocardiosignal acquired at different times, and removing noise such as baseline drift, electromyographic interference and power frequency interference from the electrocardiosignals by discrete wavelet transform (DWT).
S200, locating the R wave peak points of the denoised signal, intercepting a fixed number of points before and after each R wave peak point to segment the heartbeats so as to construct morphological features of the electrocardiosignals, and taking the two different segments of electrocardiosignal as the training set heartbeat feature database and the test set heartbeat feature database respectively.
S300, removing redundant features in the electrocardiosignals by singular value decomposition, to reduce the correlation among heart beat data of different individuals and increase the correlation among different heart beat data of the same individual.
S400, reducing the dimension of the electrocardiosignals obtained in the previous step by linear discriminant analysis to obtain feature vectors for training and testing.
S500, training a generalized regression neural network classifier, identifying the input electrocardiosignals, and outputting the identity information of the individual according to the multi-heartbeat voting principle.
Fig. 2 is a flowchart illustrating a method for removing noise interference according to an embodiment of the present invention.
The following describes the process of the method for removing noise interference in the embodiment of the present invention in detail with reference to the drawings. The specific substeps are as follows:
S101, selecting the 'db4' wavelet of the Daubechies wavelet family, and performing nine-layer decomposition on the ECG signal according to the frequency distribution of the different noises and of the different wave bands of the ECG signal, to obtain wavelet coefficients at different scales.
S102, setting the wavelet coefficient of the high-frequency component of the first-layer decomposition to zero to remove high-frequency interference.
S103, setting the wavelet coefficient of the low-frequency component of the ninth layer decomposition to zero to remove low-frequency interference, thereby obtaining a denoised ECG signal data set.
S104, performing threshold quantization processing on the wavelet coefficients in the wavelet domain through a threshold function, namely, first giving a threshold value λ,
$$\lambda = \sigma \sqrt{2 \ln N}$$
wherein N is the number of sampling points of the electrocardiosignal, and σ can be obtained by wavelet coefficient estimation.
S105, Inverse Discrete Wavelet Transform (IDWT) is carried out according to the estimated wavelet coefficient obtained after processing, and the reconstructed electrocardiosignal, namely the denoised signal can be obtained.
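The thresholding in S104 can be illustrated with a short sketch. It assumes the universal threshold λ = σ√(2 ln N), which fits the stated roles of N and σ, a soft threshold function, and the median-based noise estimate σ = median(|d|) / 0.6745; none of these three choices is confirmed by the text above, and the function names are hypothetical. The wavelet decomposition and reconstruction themselves (S101 and S105) are left out.

```python
import math
import statistics


def universal_threshold(detail_coeffs, n_samples):
    """Return lambda = sigma * sqrt(2 ln N) for an N-sample signal."""
    # Assumption: sigma is the robust noise estimate median(|d|) / 0.6745,
    # computed from a set of detail coefficients.
    sigma = statistics.median(abs(c) for c in detail_coeffs) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(n_samples))


def soft_threshold(coeffs, lam):
    """Zero coefficients with |c| <= lam and shrink the rest toward zero by lam."""
    return [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs]


# Illustrative coefficients only; a 20 s record at 500 Hz has N = 10000 samples.
details = [0.02, -0.01, 0.9, -0.03, 0.015, -1.2, 0.01]
lam = universal_threshold(details, n_samples=10000)
shrunk = soft_threshold(details, lam)
```

Small coefficients, which are mostly noise, are set to zero, while the two large coefficients are kept and shrunk by λ.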
Fig. 3 shows a method for locating an R-wave peak point, segmenting a heartbeat, and partitioning a training set heartbeat feature database and a testing set heartbeat feature database according to an embodiment of the present invention, which includes the following specific steps:
S201, performing R wave peak point positioning on the denoised electrocardiosignals to obtain the R wave peak point sets of all heartbeats;
S202, intercepting a fixed number of points before and after each R wave peak point of the denoised ECG signal, these points and the R wave peak point together forming an independent heart beat vector, so that each segment of electrocardiosignal is divided into a plurality of heart beat data;
S203, dividing the heart beat data of the same individual into two parts, which are used to construct a training set heart beat feature database and a test set heart beat feature database respectively.
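Steps S202 and S203 can be sketched as follows. The R wave peak indices are taken as given, since the text does not name a detection algorithm. The defaults of 150 points before and 300 points after the peak are the values quoted in the specific embodiment; the function names and the half-and-half split are hypothetical.

```python
def segment_heartbeats(signal, r_peaks, n_before=150, n_after=300):
    """Cut one fixed-length beat vector around every R wave peak index."""
    beats = []
    for r in r_peaks:
        start, stop = r - n_before, r + n_after + 1
        # Skip beats that would run past either end of the record.
        if start < 0 or stop > len(signal):
            continue
        beats.append(list(signal[start:stop]))
    return beats


def split_train_test(beats, train_fraction=0.5):
    """Split one individual's beats into a training part and a test part."""
    cut = int(len(beats) * train_fraction)
    return beats[:cut], beats[cut:]


# Toy record: the sample value equals its index, so slices are easy to check.
record = list(range(2000))
beats = segment_heartbeats(record, r_peaks=[100, 500, 1000, 1900])
train_beats, test_beats = split_train_test(beats)
```

Each retained beat has n_before + 1 + n_after = 451 points with the R wave peak at index 150; the first and last peaks in the toy record are dropped because their windows are incomplete.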
Fig. 4 is a schematic flow chart illustrating a process of removing redundant features of cardiac beat signals based on SVD according to an embodiment of the present invention, which is described in detail with respect to the step of removing redundant cardiac electrical signals by singular value decomposition method in step S300 in fig. 1. The method comprises the following substeps:
S301, constructing the segmented heart beat feature database into an m × n dimensional matrix X, wherein X is a non-singular matrix of full row rank:
$$X = [x_{ij}]_{m \times n}$$
S302, performing singular value decomposition on the matrix X to obtain $X = U \Sigma V^{T}$, where U is an m × m matrix, V is an n × n matrix, and Σ is an m × n matrix whose main-diagonal elements are the singular values, arranged from large to small;
S303, the matrix X may be expressed as the sum of useful information and redundant information:
$$X = \bar{X} + N$$
where $\bar{X}$ is the useful signal subspace and N is the redundant signal subspace. The original problem is thus converted into finding a matrix $\bar{X}$ that approximates X; the better the approximation, the more pronounced the redundancy removal. The first k larger singular values of the diagonal matrix are retained, the remaining singular values are set to zero, and the heart beat feature matrix is reconstructed by the inverse of the decomposition, giving a heart beat feature database that contains more of the useful signal.
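The truncation in S303 can be sketched with NumPy as below. The function name `svd_denoise` and the synthetic low-rank test matrix are assumptions made for illustration; the text does not state how k is chosen, so it is left as a parameter.

```python
import numpy as np


def svd_denoise(X, k):
    """Keep the k largest singular values of X and rebuild the matrix."""
    # numpy returns the singular values s sorted from large to small.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_kept = s.copy()
    s_kept[k:] = 0.0                      # zero the smaller singular values
    X_bar = (U * s_kept) @ Vt             # U @ diag(s_kept) @ Vt
    return X_bar, s


# Synthetic example: a rank-2 "useful" matrix plus small random redundancy.
rng = np.random.default_rng(0)
useful = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 30))
X = useful + 0.01 * rng.standard_normal((20, 30))
X_bar, singular_values = svd_denoise(X, k=2)
```

Here `X_bar` has the same shape as `X` but rank 2, and it lies closer to the useful part than the noisy input does.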
Fig. 5 is a schematic flow chart illustrating a process of reducing the dimension of the heartbeat feature matrix based on LDA according to an embodiment of the present invention, which is described in detail with respect to the step of reducing the dimension of the electrocardiosignal by using the linear discriminant analysis method in step S400 in fig. 1. The method comprises the following substeps:
S401, converting the heart beat feature database into a heart beat data set with sample classes, $D = \{(x_1, y_1), (x_2, y_2), \dots, (x_k, y_k)\}$, in which any heart beat sample $x_i$ is an n-dimensional vector and $y_i \in \{C_1, C_2, \dots, C_q\}$ is its class. Define $N_j$ (j = 1, 2, …, q) as the number of samples of the j-th class, $X_j$ as the set of samples of the j-th class, $\mu_j$ as the mean vector of the j-th class samples, and $\Sigma_j$ as the covariance matrix of the j-th class samples.
S402, a common optimization objective function of LDA is:
$$J(W) = \frac{\prod_{\mathrm{diag}} W^{T} S_b W}{\prod_{\mathrm{diag}} W^{T} S_w W}$$
wherein $S_b$ is the inter-class divergence matrix, $S_w$ is the intra-class divergence matrix, and $\prod_{\mathrm{diag}}$ denotes the product of the main-diagonal elements. The optimization of J(W) can be converted into
$$J(W) = \prod_{i=1}^{d} \frac{w_i^{T} S_b w_i}{w_i^{T} S_w w_i}$$
S403, each factor on the rightmost side of the above formula is a generalized Rayleigh quotient whose maximum value is the maximum eigenvalue of the matrix $S_w^{-1} S_b$; the corresponding matrix W is the matrix spanned by the eigenvectors corresponding to the d largest eigenvalues.
S404, projecting the heart beat feature data set into the new subspace, where Y = X × W.
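Steps S401 to S404 can be sketched with NumPy as follows. The sketch assumes the usual definitions of the two divergence matrices, $S_w = \sum_j \sum_{x \in X_j} (x - \mu_j)(x - \mu_j)^T$ and $S_b = \sum_j N_j (\mu_j - \mu)(\mu_j - \mu)^T$, and uses a pseudo-inverse in case $S_w$ is singular; these details and the function name are assumptions, not taken from the text.

```python
import numpy as np


def lda_projection(X, y, d):
    """Return the projection matrix W (n x d) and the projected data Y = X W."""
    mu = X.mean(axis=0)
    n_features = X.shape[1]
    S_w = np.zeros((n_features, n_features))
    S_b = np.zeros((n_features, n_features))
    for c in np.unique(y):
        X_c = X[y == c]
        mu_c = X_c.mean(axis=0)
        centered = X_c - mu_c
        S_w += centered.T @ centered                 # intra-class divergence
        diff = (mu_c - mu).reshape(-1, 1)
        S_b += X_c.shape[0] * (diff @ diff.T)        # inter-class divergence
    # Assumption: pinv guards against a singular S_w.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1][:d]       # d largest eigenvalues
    W = eigvecs[:, order].real
    return W, X @ W


# Two synthetic "individuals" separated along the first feature.
rng = np.random.default_rng(1)
X0 = rng.standard_normal((50, 3)) + np.array([5.0, 0.0, 0.0])
X1 = rng.standard_normal((50, 3)) - np.array([5.0, 0.0, 0.0])
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)
W, Y = lda_projection(X, y, d=1)
```

With q classes, $S_b$ has rank at most q - 1, so d is normally chosen no larger than q - 1.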
Fig. 6 is a schematic flowchart illustrating a process of performing heartbeat identification based on a generalized regression neural network according to an embodiment of the present invention;
Fig. 7 is a structural diagram of a generalized regression neural network classifier according to an embodiment of the present invention.
The design of the classifier included in step S500 in Fig. 1, and the step of identifying the electrocardiosignal using the generalized regression neural network structure shown in Fig. 7, are described in detail below. The method includes the following sub-steps:
S501, constructing a generalized regression neural network, taking the training set heartbeat feature database as the input of the neural network and the test set heartbeat feature database as the output of the neural network. The specific training process of the network is as follows:
The number of input neurons is equal to the dimension of the heart beat feature; each neuron is a simple distribution unit that passes the input heart beat feature directly to the pattern layer. The number of neurons in the pattern layer is equal to the number n of heart beat samples, each neuron corresponds to a different sample, and the transfer function of the i-th pattern layer neuron is:
$$p_i = \exp\left[-\frac{(X - X_i)^{T}(X - X_i)}{2\sigma^{2}}\right], \quad i = 1, 2, \dots, n$$
where X is the network input, $X_i$ is the learning sample corresponding to the i-th neuron, and σ is the smoothing factor.
The summation layer performs summation in two different ways.
One performs arithmetic summation of the outputs of all pattern layer neurons; the connection weight between each pattern layer neuron and this summation neuron is 1, and the transfer function is:
$$S_D = \sum_{i=1}^{n} p_i$$
The other performs weighted summation of the outputs of all pattern layer neurons; the connection weight between the i-th neuron in the pattern layer and the j-th numerator summation neuron in the summation layer is the j-th element $y_{ij}$ of the i-th output sample $Y_i$, and the transfer function is:
$$S_{Nj} = \sum_{i=1}^{n} y_{ij}\, p_i, \quad j = 1, 2, \dots, k$$
The number of neurons in the output layer is equal to the dimension k of the output vector in the heart beat feature samples; each neuron divides the outputs of the summation layer, and the output of neuron j corresponds to the j-th element of the estimation result $\hat{Y}(X)$, i.e.
$$y_j = \frac{S_{Nj}}{S_D}$$
wherein j = 1, 2, …, k;
S502, obtaining the result by heart beat voting: each heart beat is classified, and if the majority of heart beats in a segment of electrocardiosignal select the j-th individual, the segment is considered to be the electrocardiosignal of the j-th individual and the information of the j-th individual is output.
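The four GRNN layers and the voting rule of S502 can be sketched with NumPy as below. The sketch assumes the standard Gaussian pattern-layer kernel with smoothing factor σ and one-hot identity vectors as the training targets, a common way of using a GRNN as a classifier; the text does not spell out the target encoding, and the function names and toy data are hypothetical.

```python
from collections import Counter

import numpy as np


def grnn_predict(X_train, Y_train, X_query, sigma=1.0):
    """Evaluate a GRNN: pattern layer, two summation types, output division."""
    # Squared Euclidean distance between every query beat and training beat.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    P = np.exp(-d2 / (2.0 * sigma ** 2))        # pattern layer outputs p_i
    S_D = P.sum(axis=1, keepdims=True)          # arithmetic summation
    S_N = P @ Y_train                           # weighted summation, weights y_ij
    return S_N / S_D                            # output layer: y_j = S_Nj / S_D


def vote_identity(beat_outputs):
    """Pick the identity chosen by the most heartbeats in one ECG segment."""
    labels = np.argmax(beat_outputs, axis=1)
    return Counter(labels.tolist()).most_common(1)[0][0]


# Toy data: two individuals, two training beats each (assumed one-hot targets).
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
Y_train = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
segment = np.array([[0.1, 0.2], [4.9, 5.5], [5.2, 5.1]])   # three query beats
outputs = grnn_predict(X_train, Y_train, segment, sigma=0.5)
identity = vote_identity(outputs)
```

A GRNN has no iterative weight training: the training samples are stored as the pattern layer, and σ is the only parameter to tune, which is consistent with the training-speed advantage claimed over BP networks.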
The above steps are further explained by the following specific embodiment:
In a specific embodiment, ECG signals from the MIT-BIH database are used; the sampling frequency of the ECG signals is 500 Hz, each record is 20 seconds long, and the resolution is 12 bits. The embodiment of the invention is simulated in MATLAB. First, the ECG signals of 88 individuals are selected and denoised by wavelet transform, and the R wave peak points are located; then, taking each R wave peak point as the reference, 150 points are intercepted before it and 300 points after it, giving a morphological feature of dimension 451. Redundant information in the heart beat features is removed using SVD, and LDA dimensionality reduction is performed. Finally, a GRNN is constructed in MATLAB, and the neural network is trained and tested, completing the whole identity recognition process.
It should be recognized that embodiments of the present invention can be realized and implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The methods may be implemented in a computer program using standard programming techniques, including a non-transitory computer-readable storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner, according to the methods and figures described in the detailed description. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
Further, the operations of processes described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described herein (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) collectively executed on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the methods may be implemented in any type of computing platform operatively connected to a suitable connection, including but not limited to a personal computer, mini computer, mainframe, workstation, networked or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and the like. Aspects of the invention may be embodied in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optically read and/or write storage medium, RAM, ROM, or the like, such that it may be read by a programmable computer, which when read by the storage medium or device, is operative to configure and operate the computer to perform the procedures described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention herein includes these and other different types of non-transitory computer-readable storage media when such media include instructions or programs that implement the above steps in conjunction with a microprocessor or other data processor. The invention also includes the computer itself when programmed according to the methods and techniques of the present invention.
A computer program can be applied to input data to perform the functions herein to transform the input data to generate output data that is stored to non-volatile memory. The output information may also be applied to one or more output devices, such as a display. In a preferred embodiment of the invention, the transformed data represents physical and tangible objects, including particular visual depictions of physical and tangible objects produced on a display.
The present invention is not limited to the above embodiments, and any modifications, equivalent substitutions, improvements, etc. within the spirit and principle of the present invention should be included in the scope of the present invention as long as the technical effects of the present invention are achieved by the same means. The invention is capable of other modifications and variations in its technical solution and/or its implementation, within the scope of protection of the invention.