Disclosure of Invention
The present disclosure provides a video image fusion method based on wind turbine generator fault identification, which aims to solve at least one of the technical problems existing in the prior art.
In one aspect of the disclosure, a video image fusion method based on wind turbine fault identification is provided, the method comprising:
collecting video image information of a wind turbine generator;
carrying out image framing processing on the video image information to obtain multiple multi-source or single-source images;
performing gray-level conversion on the images subjected to the framing processing, and performing pixel-level image fusion under wavelet decomposition on the converted gray-level images to obtain a first fused image;
carrying out pixel-level image fusion under the Laplacian pyramid on the images subjected to the framing processing to obtain a second fused image;
applying a SIFT algorithm to the images subjected to the framing processing to obtain a third fused image at the feature level;
calculating the entropy, joint entropy and root mean square error of the first fused image, the second fused image and the third fused image; and
designing a GUI human-machine interaction interface, embodying the three fusion processing algorithms and the calculation of the index quantities in the GUI human-machine interaction interface, obtaining a final fused image through the obtained index values, and carrying out fault identification on the wind turbine generator by using the final fused image.
In some optional embodiments, the performing pixel-level image fusion under wavelet decomposition on the converted gray-scale image to obtain a first fused image includes:
projecting the converted gray-level image onto a group of wavelet functions, and decomposing the gray-level image into a superposition of the group of wavelet functions to obtain the first fused image.
In some optional embodiments, the performing pixel-level image fusion under the Laplacian pyramid on the image after the framing process to obtain a second fused image includes:
taking the image after framing as an original image G0, which serves as the 0th layer of a Gaussian pyramid, and convolving the original image with a Gaussian kernel w, where the Gaussian kernel may take the standard 5×5 generating kernel of the following formula (1):

$$w = \frac{1}{256}\begin{bmatrix}1 & 4 & 6 & 4 & 1\\ 4 & 16 & 24 & 16 & 4\\ 6 & 24 & 36 & 24 & 6\\ 4 & 16 & 24 & 16 & 4\\ 1 & 4 & 6 & 4 & 1\end{bmatrix} \tag{1}$$
the image obtained by downsampling after convolution is taken as layer 1, G1, of the image pyramid, so that after this processing each layer is one quarter the size of the layer below it; taking the obtained image as the input image, the convolution and downsampling are repeated to obtain the images of layers 1 through N, forming a pyramid-shaped Gaussian image tower;
convolving and upsampling the upper-layer image of the Gaussian pyramid to obtain a predicted image, namely the following formula (2):

$$G_l^{*} = \mathrm{Expand}(G_l) \tag{2}$$

where the magnification operator Expand can be expressed as the following formulas (3) to (5):

$$G_l^{*}(i,j) = 4\sum_{m=-2}^{2}\sum_{n=-2}^{2} w(m,n)\,G_l\!\left(\frac{i+m}{2},\frac{j+n}{2}\right) \tag{3}$$

$$0 < l \le N,\quad 0 \le i < R_l,\quad 0 \le j < C_l \tag{4}$$

$$\frac{i+m}{2}\ \text{and}\ \frac{j+n}{2}\ \text{are integers} \tag{5}$$

where $G_l$ represents the $l$-th level of the original image pyramid, $G_l^{*}$ represents the $l$-th level of the predicted image pyramid, $i$ and $j$ represent the row and column indices of the image, $m$ and $n$ represent the row and column indices of the Gaussian kernel, $N$ represents the highest level of the pyramid, and $R_l$ and $C_l$ represent the maximum number of rows and columns, respectively, of the $l$-th level image;

formulas (3) to (5) insert zeros at the even rows and columns deleted during the construction of the Gaussian pyramid and then convolve (i.e., filter) the result with the Gaussian kernel w, yielding an image of the same size as the image before downsampling, which is taken as the predicted image;
subtracting the predicted image from the Gaussian image of the next lower layer to obtain a difference image, and iterating this process to obtain a series of decomposed images that are arranged into a pyramid, namely the Laplacian pyramid;
fusing all layers under a unified rule to obtain the corresponding image tower and performing image reconstruction to obtain the second fused image, wherein the unified rule is as follows: for every layer except the top layer, the coefficient with the largest absolute value is taken; the coefficients of the top layer are averaged.
In some optional embodiments, the performing a SIFT algorithm on the image after framing to obtain a third fused image at a feature level includes:
scale-space extremum detection: searching image positions over all scales, and identifying potential interest points that are invariant to scale and rotation through a difference-of-Gaussian function;
keypoint localization: determining the position and scale at each candidate location by fitting a fine model, and selecting keypoints according to their degree of stability;
orientation determination: assigning one or more orientations to each keypoint location based on the local image gradient directions; all subsequent operations on the image data are transformed relative to the orientation, scale and position of the keypoints, thereby providing invariance to these transformations;
keypoint description: measuring the local image gradients at the selected scale within a neighborhood around each keypoint, the gradients being transformed into a representation that allows for relatively large local shape deformation and illumination change;
performing image fusion based on the four-step decomposition of the SIFT algorithm to obtain the third fused image at the feature level.
In some optional embodiments, the calculating entropy, joint entropy, and root mean square error of the first fused image, the second fused image, and the third fused image includes:
entropy, a parameter that thermodynamically characterizes the state of matter, describes the degree of disorder of a system; when entropy is used for image evaluation, it refers to the richness of the information contained in an image, and an increase in entropy means that the image has gained a larger amount of information through processing; the entropy of image A is defined as the following formula (6):

$$E(A) = -\sum_{j=0}^{n-1} p_A(j)\log_2 p_A(j) \tag{6}$$

where n is the number of gray levels of the image, generally 256, and $p_A(j)$ is the proportion of pixels in the image with gray level j;
the joint entropy generally characterizes the amount of information transferred from the source images to the fused image, i.e., the joint information between two images; the joint entropy of images M and N is defined as the following formula (7):

$$E(M,N) = -\sum_{m,n} p_{MN}(m,n)\log_2 p_{MN}(m,n) \tag{7}$$

where $p_{MN}(m,n)$ is the joint gray-level distribution of the two images;
from the definition of the joint entropy between two images, another way of calculating the mutual information between them can be obtained; if the entropy of image M is E(M), the entropy of image N is E(N), and the joint entropy between the two images is E(M,N), then the mutual information between image M and image N can be calculated by the following formula (8):

$$\mathrm{MI}(M,N) = E(M) + E(N) - E(M,N) \tag{8}$$
the peak signal-to-noise ratio is a reference-image-based quality evaluation index, mainly used to judge the fidelity of an image, and is also often used to measure signal reconstruction quality in fields such as image compression; it is closely related to the mean square error and is often defined simply through it; a good fusion effect and quality are reflected in a large peak signal-to-noise ratio and a small mean square error; the mean square error between image M and image N can be defined as the following formula (9):

$$\mathrm{MSE}(M,N) = \frac{1}{R\times C}\sum_{i=0}^{R-1}\sum_{j=0}^{C-1}\bigl(M(i,j)-N(i,j)\bigr)^2 \tag{9}$$

where R and C are the numbers of rows and columns of the images; the peak signal-to-noise ratio can then be derived from the mean square error as the following formula (10):

$$\mathrm{PSNR}(M,N) = 10\log_{10}\frac{255^2}{\mathrm{MSE}(M,N)} \tag{10}$$
in another aspect of the present disclosure, there is provided an electronic device including:
one or more processors;
a storage unit for storing one or more programs which, when executed by the one or more processors, enable the one or more processors to implement the method according to the preceding description.
In another aspect of the disclosure, a computer-readable storage medium is provided, on which a computer program is stored, the computer program, when executed by a processor, being capable of implementing the method according to the preceding description.
According to the video image fusion method based on wind turbine generator fault identification, video sensors can be installed externally to target faults of the wind turbine blades, impeller and tower. State images of the external equipment from multiple angles are fused through video image framing to synthesize a panoramic image of the external equipment, so that the fault state of the external equipment of the wind turbine can be better identified and the influence of occluding objects at a single angle is avoided; even when strong external wind causes the sensors to shake, a usable final fault identification image can still be obtained. In addition, scientific research and engineering applications in the field of automatic control involve a great deal of complicated calculation and simulation-curve plotting; adopting GUI interface design and programming can greatly save the time and workload of calculating and plotting simulation curves and can reduce errors in manual fault identification.
Detailed Description
In order that those skilled in the art will better understand the technical solutions of the present disclosure, the present disclosure will be described in further detail with reference to the accompanying drawings and detailed description.
First, an example electronic device for implementing a video image fusion method based on wind turbine fault recognition according to an embodiment of the present disclosure is described with reference to fig. 1.
As shown in fig. 1, electronic device 200 includes one or more processors 210, one or more storage devices 220, one or more input devices 230, one or more output devices 240, etc., interconnected by a bus system 250 and/or other forms of connection mechanisms. It should be noted that the components and structures of the electronic device shown in fig. 1 are exemplary only and not limiting, as the electronic device may have other components and structures as desired.
Processor 210 may be a Central Processing Unit (CPU) or another form of processing unit composed of one or more processing cores and having data processing capabilities and/or instruction execution capabilities, and may control other components in electronic device 200 to perform desired functions.
The storage 220 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory. The non-volatile memory may include, for example, Read-Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor may execute the program instructions to implement the client functions and/or other desired functions of the embodiments of the disclosure described below. Various applications and various data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
The input device 230 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 240 may output various information (e.g., images or sounds) to the outside (e.g., a user), and may include one or more of a display, a speaker, and the like.
Next, a video image fusion method based on wind turbine generator fault recognition according to another embodiment of the present disclosure will be described with reference to fig. 2.
As shown in fig. 2, a video image fusion method S100 based on wind turbine generator fault recognition includes:
s110, collecting video image information of the wind turbine generator.
Specifically, in this step, an image acquisition device such as a camera may be used to collect video image information of the wind turbine generator; for example, miniature cameras may be installed around the external blades, impeller and tower of the wind turbine, and the videos of the external equipment captured by the multiple cameras may be used for image fusion.
S120, carrying out image framing processing on the video image information to obtain multiple multi-source or single-source images.
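As an illustration only, a minimal framing sketch assuming OpenCV (`opencv-python`) is available; the file name `turbine_cam1.mp4` and the sampling interval are hypothetical:

```python
import cv2

def extract_frames(video_path, step=30):
    """Read a video file and keep every `step`-th frame as a still image."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames

# e.g. roughly one frame per second from a 30 fps camera
frames = extract_frames("turbine_cam1.mp4", step=30)
```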
S130, carrying out gray-level conversion on the images subjected to the framing processing, and carrying out pixel-level image fusion under wavelet decomposition on the converted gray-level images to obtain a first fused image.
Specifically, wavelet transformation projects a signal onto a set of wavelet functions and decomposes the signal into a superposition of that series of wavelet functions, which are obtained by scaling and translating a basic wavelet function. In this step, the converted gray-level image is projected onto a set of wavelet functions and decomposed into a superposition of the set of wavelet functions, resulting in the first fused image, as shown in fig. 3.
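A minimal sketch of this step, assuming PyWavelets (`pywt`) and two registered, same-sized gray-level frames; averaging the approximation band and taking the max-absolute detail coefficients is one common fusion rule, not the only possible one:

```python
import numpy as np
import pywt

def wavelet_fuse(gray_a, gray_b, wavelet="db2", level=2):
    """Pixel-level fusion under wavelet decomposition: decompose both
    gray images, fuse the coefficients, and reconstruct."""
    ca = pywt.wavedec2(gray_a.astype(np.float64), wavelet, level=level)
    cb = pywt.wavedec2(gray_b.astype(np.float64), wavelet, level=level)
    fused = [(ca[0] + cb[0]) / 2.0]  # approximation band: average
    for bands_a, bands_b in zip(ca[1:], cb[1:]):
        # detail bands (horizontal, vertical, diagonal): keep the
        # coefficient with the larger absolute value
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(bands_a, bands_b)))
    out = pywt.waverec2(fused, wavelet)
    return np.clip(out, 0, 255).astype(np.uint8)
```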
S140, performing pixel-level image fusion under the Laplacian pyramid on the images subjected to the framing processing to obtain a second fused image.
Specifically, in this step, a Gaussian pyramid decomposition is first performed on the source images. The fusion principle of image pyramid algorithms is that each image participating in fusion is decomposed into a pyramid of the same form, arranged from top to bottom in order of low to high resolution; the pyramids of all the images are then fused layer by layer according to a certain rule to obtain a corresponding image tower, and image reconstruction is performed to obtain the final fused image. Embodiments of the present disclosure employ a Laplacian pyramid algorithm, the Laplacian pyramid being an optimization of the Gaussian pyramid.
The Gaussian pyramid is the most basic image tower. The image after framing is taken as an original image G0, which serves as the 0th layer of the Gaussian pyramid, and the original image is convolved with a Gaussian kernel w, where the Gaussian kernel may take the standard 5×5 generating kernel of the following formula (1):

$$w = \frac{1}{256}\begin{bmatrix}1 & 4 & 6 & 4 & 1\\ 4 & 16 & 24 & 16 & 4\\ 6 & 24 & 36 & 24 & 6\\ 4 & 16 & 24 & 16 & 4\\ 1 & 4 & 6 & 4 & 1\end{bmatrix} \tag{1}$$
the image obtained by downsampling (removing the even rows and columns) after convolution is taken as layer 1, G1, of the image pyramid, so that after this processing each layer is one quarter the size of the layer below it; taking the obtained image as the input image, the convolution and downsampling are repeated to obtain the images of layers 1 through N, forming a pyramid-shaped Gaussian image tower;
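A minimal sketch of this construction, assuming OpenCV; `cv2.pyrDown` performs the convolution with the 5×5 Gaussian kernel and the removal of the even rows and columns in one call:

```python
import cv2

def gaussian_pyramid(img, levels):
    """Build layers G0..G_levels: each layer is the previous one
    convolved with the Gaussian kernel w and downsampled by 2."""
    gp = [img]
    for _ in range(levels):
        gp.append(cv2.pyrDown(gp[-1]))
    return gp
```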
However, because the Gaussian pyramid is repeatedly convolved and downsampled during its construction, some high-frequency information is lost; therefore, the embodiments of the present disclosure adopt a further pyramid algorithm, namely the Laplacian pyramid. The construction of the Laplacian pyramid is based on the Gaussian pyramid. To construct the Laplacian pyramid, the upper-layer image of the Gaussian pyramid needs to be convolved and upsampled to obtain a predicted image, namely the following formula (2):

$$G_l^{*} = \mathrm{Expand}(G_l) \tag{2}$$

where the magnification operator Expand can be expressed as the following formulas (3) to (5):

$$G_l^{*}(i,j) = 4\sum_{m=-2}^{2}\sum_{n=-2}^{2} w(m,n)\,G_l\!\left(\frac{i+m}{2},\frac{j+n}{2}\right) \tag{3}$$

$$0 < l \le N,\quad 0 \le i < R_l,\quad 0 \le j < C_l \tag{4}$$

$$\frac{i+m}{2}\ \text{and}\ \frac{j+n}{2}\ \text{are integers} \tag{5}$$

where $G_l$ represents the $l$-th level of the original image pyramid, $G_l^{*}$ represents the $l$-th level of the predicted image pyramid, $i$ and $j$ represent the row and column indices of the image, $m$ and $n$ represent the row and column indices of the Gaussian kernel, $N$ represents the highest level of the pyramid, and $R_l$ and $C_l$ represent the maximum number of rows and columns, respectively, of the $l$-th level image;

formulas (3) to (5) insert zeros at the even rows and columns deleted during the construction of the Gaussian pyramid and then convolve (i.e., filter) the result with the Gaussian kernel w, yielding an image of the same size as the image before downsampling, which is taken as the predicted image;
subtracting the predicted image from the Gaussian image of the next lower layer to obtain a difference image, and iterating this process to obtain a series of decomposed images that are arranged into a pyramid, namely the Laplacian pyramid;
fusing all layers under a unified rule to obtain the corresponding image tower and performing image reconstruction to obtain the second fused image, wherein the unified rule is as follows: for every layer except the top layer, the coefficient with the largest absolute value is taken; the coefficients of the top layer are averaged.
A specific flow of image fusion using the Laplacian pyramid is shown in fig. 4.
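A minimal sketch of the whole step, assuming OpenCV and two registered, same-sized source images; `cv2.pyrUp` plays the role of the Expand operator in formulas (2) to (5):

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels):
    """Decompose an image into `levels` difference images plus the
    top (coarsest) Gaussian layer."""
    gp = [img.astype(np.float32)]
    for _ in range(levels):
        gp.append(cv2.pyrDown(gp[-1]))
    lp = []
    for l in range(levels):
        size = (gp[l].shape[1], gp[l].shape[0])
        predicted = cv2.pyrUp(gp[l + 1], dstsize=size)  # Expand = upsample + convolve
        lp.append(gp[l] - predicted)                    # difference image
    lp.append(gp[-1])                                   # top layer
    return lp

def laplacian_fuse(img_a, img_b, levels=4):
    """Fusion rule from the text: max absolute value for every layer
    except the top, averaging for the top layer; then reconstruct."""
    lpa = laplacian_pyramid(img_a, levels)
    lpb = laplacian_pyramid(img_b, levels)
    fused = [np.where(np.abs(a) >= np.abs(b), a, b)
             for a, b in zip(lpa[:-1], lpb[:-1])]
    fused.append((lpa[-1] + lpb[-1]) / 2.0)
    out = fused[-1]
    for diff in reversed(fused[:-1]):
        out = cv2.pyrUp(out, dstsize=(diff.shape[1], diff.shape[0])) + diff
    return np.clip(out, 0, 255).astype(np.uint8)
```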
S150, applying a SIFT algorithm to the images subjected to the framing processing to obtain a third fused image at the feature level.
Specifically, in this step, when image fusion is performed, image registration and target tracking and recognition performance are affected by factors such as the state of the target itself, the environment of the scene, and the imaging characteristics of the imaging equipment. The SIFT algorithm can, to a certain extent, solve the problem of image registration failure caused by object occlusion, illumination effects, cluttered scenes, noise, rotation, scaling and translation of the object, and affine or projective transformation of the image.
The SIFT algorithm searches for feature points in different scale spaces, and different scale spaces can only be obtained through Gaussian blurring. Gaussian blurring is an image filter that is also present in pyramid decomposition; the Gaussian convolution kernel is the only linear kernel that can realize the scale transformation. The scale space of an image is defined by convolving the source image with a variable-scale two-dimensional Gaussian function:
$$L(i,j,\sigma) = G(i,j,\sigma) * I(i,j)$$

where $G(i,j,\sigma)$ is the variable-scale Gaussian function, $I(i,j)$ is the source image, and $k$ below is the factor separating adjacent scales. A difference-of-Gaussian scale space is generated by convolving the image with difference-of-Gaussian kernels at different scales, which ensures the stability of the detected feature points:

$$D(i,j,\sigma) = \bigl(G(i,j,k\sigma) - G(i,j,\sigma)\bigr) * I(i,j) = L(i,j,k\sigma) - L(i,j,\sigma)$$
A Gaussian pyramid is constructed by the scale-space approach plus downsampling. To ensure the accuracy of feature point detection, each candidate point is compared with its 8 neighbors in the same layer of the difference-of-Gaussian pyramid and the 9 neighbors in each of the layers above and below (26 points in total). Feature point detection is achieved through the SIFT algorithm, and panoramic fusion of the images can be achieved by fusing the obtained feature points. Fig. 5 shows the process of achieving feature-level fusion of images using the SIFT algorithm.
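For illustration, a minimal sketch of one difference-of-Gaussian layer with OpenCV (the value k = √2 is a typical choice, not mandated by the text):

```python
import cv2
import numpy as np

def dog_layer(gray, sigma, k=2 ** 0.5):
    """One difference-of-Gaussian layer: D = L(k*sigma) - L(sigma),
    where L is the image blurred by a Gaussian of the given scale."""
    img = gray.astype(np.float32)
    l_small = cv2.GaussianBlur(img, (0, 0), sigma)      # L(i, j, sigma)
    l_large = cv2.GaussianBlur(img, (0, 0), k * sigma)  # L(i, j, k*sigma)
    return l_large - l_small
```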
More specifically, in the present embodiment, the SIFT algorithm is decomposed into four steps:
(1) Scale-space extremum detection: searching image positions over all scales, and identifying potential interest points that are invariant to scale and rotation through a difference-of-Gaussian function;
(2) Keypoint localization: determining the position and scale at each candidate location by fitting a fine model, and selecting keypoints according to their degree of stability;
(3) Orientation determination: assigning one or more orientations to each keypoint location based on the local image gradient directions; all subsequent operations on the image data are transformed relative to the orientation, scale and position of the keypoints, thereby providing invariance to these transformations;
(4) Keypoint description: measuring the local image gradients at the selected scale within a neighborhood around each keypoint, the gradients being transformed into a representation that allows for relatively large local shape deformation and illumination change.
Image fusion is then performed based on the four-step decomposition of the SIFT algorithm to obtain the third fused image at the feature level.
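A minimal sketch of the four steps applied to fusing two overlapping views, assuming opencv-python ≥ 4.4 (where `cv2.SIFT_create` is in the main package); the detector internally performs the extremum detection, keypoint localization, orientation assignment and description listed above, and the matched keypoints are used to estimate a homography and warp one view onto the other. The doubled canvas width is an arbitrary choice:

```python
import cv2
import numpy as np

def sift_panorama(img_a, img_b, ratio=0.75):
    """Feature-level fusion: detect SIFT keypoints, match descriptors
    with a ratio test, estimate a homography with RANSAC, and warp
    img_b into img_a's frame."""
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    matches = cv2.BFMatcher().knnMatch(des_b, des_a, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    src = np.float32([kp_b[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_a[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = img_a.shape[:2]
    pano = cv2.warpPerspective(img_b, H, (w * 2, h))
    pano[0:h, 0:w] = img_a  # overlay the reference view
    return pano
```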
S160, calculating the entropy, joint entropy and root mean square error of the first fused image, the second fused image and the third fused image.
Specifically, entropy, a parameter that thermodynamically characterizes the state of matter, describes the degree of disorder of a system; when entropy is used for image evaluation, it refers to the richness of the information contained in an image, and an increase in entropy means that the image has gained a larger amount of information through processing; the entropy of image A is defined as the following formula (6):

$$E(A) = -\sum_{j=0}^{n-1} p_A(j)\log_2 p_A(j) \tag{6}$$

where n is the number of gray levels of the image, generally 256, and $p_A(j)$ is the proportion of pixels in the image with gray level j;
the joint entropy generally characterizes the amount of information transferred from the source images to the fused image, i.e., the joint information between two images; the joint entropy of images M and N is defined as the following formula (7):

$$E(M,N) = -\sum_{m,n} p_{MN}(m,n)\log_2 p_{MN}(m,n) \tag{7}$$

where $p_{MN}(m,n)$ is the joint gray-level distribution of the two images;
from the definition of the joint entropy between two images, another way of calculating the mutual information between them can be obtained; if the entropy of image M is E(M), the entropy of image N is E(N), and the joint entropy between the two images is E(M,N), then the mutual information between image M and image N can be calculated by the following formula (8):

$$\mathrm{MI}(M,N) = E(M) + E(N) - E(M,N) \tag{8}$$
the peak signal-to-noise ratio is a reference-image-based quality evaluation index, mainly used to judge the fidelity of an image, and is also often used to measure signal reconstruction quality in fields such as image compression; it is closely related to the mean square error and is often defined simply through it; a good fusion effect and quality are reflected in a large peak signal-to-noise ratio and a small mean square error; the mean square error between image M and image N can be defined as the following formula (9):

$$\mathrm{MSE}(M,N) = \frac{1}{R\times C}\sum_{i=0}^{R-1}\sum_{j=0}^{C-1}\bigl(M(i,j)-N(i,j)\bigr)^2 \tag{9}$$

where R and C are the numbers of rows and columns of the images; the peak signal-to-noise ratio can then be derived from the mean square error as the following formula (10):

$$\mathrm{PSNR}(M,N) = 10\log_{10}\frac{255^2}{\mathrm{MSE}(M,N)} \tag{10}$$
The three indexes are calculated based on a single image, the source images and a reference image, respectively; considering the three indexes together gives a more comprehensive judgment of the performance of the fused image.
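A minimal sketch of these index calculations with NumPy, assuming 8-bit gray-level images of equal size; the functions follow formulas (6) to (10) directly:

```python
import numpy as np

def entropy(img, n=256):
    """Formula (6): E(A) = -sum_j p_A(j) * log2 p_A(j)."""
    p = np.histogram(img, bins=n, range=(0, n))[0] / img.size
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def joint_entropy(m, n_img, bins=256):
    """Formula (7) over the joint gray-level distribution of two images."""
    p = np.histogram2d(m.ravel(), n_img.ravel(), bins=bins,
                       range=[[0, bins], [0, bins]])[0] / m.size
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(m, n_img):
    """Formula (8): MI(M, N) = E(M) + E(N) - E(M, N)."""
    return entropy(m) + entropy(n_img) - joint_entropy(m, n_img)

def mse_psnr(m, n_img):
    """Formulas (9) and (10): mean square error and the peak
    signal-to-noise ratio derived from it (255 = peak of 8-bit images)."""
    mse = np.mean((m.astype(np.float64) - n_img.astype(np.float64)) ** 2)
    psnr = 10 * np.log10(255.0 ** 2 / mse) if mse > 0 else float("inf")
    return mse, psnr
```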
S170, designing a GUI human-machine interaction interface, embodying the three fusion processing algorithms and the calculation of the index quantities in the GUI human-machine interaction interface, obtaining a final fused image through the obtained index values, and carrying out fault identification on the wind turbine generator by using the final fused image.
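A hedged sketch of the selection that the GUI would perform, building on the metric helpers sketched above; the scoring rule (favoring high entropy and mutual information, penalizing root mean square error) is one plausible choice and is not prescribed by the text:

```python
import numpy as np
# relies on entropy, mutual_information and mse_psnr from the sketch above

def select_final_image(candidates, source):
    """Score each candidate fused image against a source/reference
    image and return the best one."""
    best_name, best_score = None, -float("inf")
    for name, fused in candidates.items():
        mse, _ = mse_psnr(fused, source)
        score = (entropy(fused)
                 + mutual_information(fused, source)
                 - np.sqrt(mse) / 255.0)  # RMSE, normalized to [0, 1]
        if score > best_score:
            best_name, best_score = name, score
    return best_name, candidates[best_name]

# `first_fused`, `second_fused`, `third_fused` and `source_gray` are
# hypothetical arrays produced by steps S130-S150
name, final_image = select_final_image(
    {"wavelet": first_fused, "laplacian": second_fused, "sift": third_fused},
    source_gray)
```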
According to the video image fusion method based on wind turbine generator fault recognition, feature-level image fusion of multi-source video, specifically with the SIFT algorithm, is adopted to handle occlusion of the target object during video shooting; pixel-level image fusion, specifically the wavelet transform and Laplacian pyramid fusion algorithms, is adopted to handle sensor shake during video shooting. In addition, the fused image with the maximum amount of information is obtained through the calculation of three index quantities: entropy, joint entropy and root mean square error. Finally, the two classes of video image fusion algorithms and the calculation of the three index quantities are built into a GUI interface, which is convenient to use in the actual fault identification process.
According to the video image fusion method based on wind turbine generator fault identification, video sensors can be installed externally to target faults of the wind turbine blades, impeller and tower. State images of the external equipment from multiple angles are fused through video image framing to synthesize a panoramic image of the external equipment, so that the fault state of the external equipment of the wind turbine can be better identified and the influence of occluding objects at a single angle is avoided; even when strong external wind causes the sensors to shake, a usable final fault identification image can still be obtained. In addition, scientific research and engineering applications in the field of automatic control involve a great deal of complicated calculation and simulation-curve plotting; adopting GUI interface design and programming can greatly save the time and workload of calculating and plotting simulation curves and can reduce errors in manual fault identification.
In another aspect of the present disclosure, there is provided an electronic device including:
one or more processors;
and a storage unit for storing one or more programs, which when executed by the one or more processors, enable the one or more processors to implement the method according to the preceding description.
In another aspect of the disclosure, a computer-readable storage medium is provided, on which a computer program is stored, the computer program, when executed by a processor, being capable of implementing the method according to the preceding description.
The computer-readable medium may be included in the apparatus, device or system of the present disclosure, or may exist separately.
The computer-readable storage medium may be any tangible medium that contains or stores a program, which program may be used by an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device. More specific examples include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, an optical fiber, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination thereof.
The computer-readable medium may also include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is embodied; specific examples of such a signal include, but are not limited to, electromagnetic signals, optical signals, or any suitable combination thereof.
It should be understood that the above embodiments are merely exemplary embodiments adopted to illustrate the principles of the present disclosure; the present disclosure, however, is not limited thereto. Those skilled in the art may make various modifications and improvements without departing from the spirit and substance of the disclosure, and such modifications and improvements are also considered to be within the scope of the disclosure.