Disclosure of Invention
Accordingly, the present application provides a medical image denoising model training method and device, which address the indistinct tissue-boundary layering and poor denoising performance of existing OCT image denoising methods.
In order to achieve the above object, the present application provides the following technical solutions:
In a first aspect, a medical image denoising model training method includes:
Step 1, acquiring medical images of the same part, wherein the medical images comprise noise images and clean images;
Step 2, resizing the noise image and the clean image and normalizing them to obtain an input image and a label image;
Step 3, making a mask image from the clean image;
Step 4, inputting the input image into an encoder of a U-Net network for feature extraction to obtain an original feature map;
Step 5, calculating a Hessian matrix of the original feature map, and calculating the eigenvalues of the Hessian matrix;
Step 6, calculating a Hessian response from the eigenvalues, and extracting an edge feature map to obtain a depth Hessian attention feature map;
Step 7, splicing (concatenating) the original feature map and the depth Hessian attention feature map, and inputting the spliced result into a decoder of the U-Net network for feature fusion to obtain an output image;
Step 8, calculating the loss between the output image and the label image according to the mask image, and back-propagating to update the weights, thereby obtaining a trained medical image denoising model.
Preferably, step 3 specifically includes creating an all-zero image of the same size as the clean image, generating a polygonal region in the target tissue region of the all-zero image, and filling the region with the value 1 to obtain the mask image.
Preferably, in step 4, the U-Net network is a ResUNet network, an Attention U-Net network, or a Mamba-UNet network.
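Preferably, in step 5, the Hessian matrix is calculated as:
$$H_i = \begin{bmatrix} \dfrac{\partial^2 E_i}{\partial x^2} & \dfrac{\partial^2 E_i}{\partial x \partial y} \\ \dfrac{\partial^2 E_i}{\partial x \partial y} & \dfrac{\partial^2 E_i}{\partial y^2} \end{bmatrix}$$
wherein $H_i$ is the Hessian matrix, $\partial$ denotes partial differentiation, x denotes the transverse coordinate, y denotes the longitudinal coordinate, $E_i$ denotes the original feature map, $\partial^2 E_i / \partial x^2$ denotes the second partial derivative of $E_i$ in the x direction, $\partial^2 E_i / \partial y^2$ denotes the second partial derivative of $E_i$ in the y direction, and $\partial^2 E_i / \partial x \partial y$ denotes the mixed partial derivative of $E_i$ in the x and y directions.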
Preferably, in step 6, the Hessian response is calculated using the Jerman method, the Frangi method, or the Erdt method.
Preferably, in the step 7, when the original feature map and the depth Hessian attention feature map are spliced, a ratio of the original feature map to the depth Hessian attention feature map is 1:2.
Preferably, in step 8, when the loss between the output image and the label image is calculated according to the mask image, the loss function is any combination of a mean square error loss, an L1 loss, a PSNR loss, and an SSIM loss.
Preferably, when the loss function is a combination of the L1 loss and the SSIM loss, the loss function is a weighted sum of the SSIM loss and the L1 loss, evaluated between the output image and the label image b under the mask image Imask, wherein each loss term has its own weight coefficient and the label image is sharpened using a 3×3 convolution kernel.
Preferably, in the step 8, the label image is a sharpened and enhanced label image.
In a second aspect, a medical image denoising model training apparatus includes:
a medical image acquisition module, used for acquiring medical images of the same part, wherein the medical images comprise noise images and clean images;
a data preprocessing module, used for resizing the noise image and the clean image and normalizing them to obtain an input image and a label image;
a mask image making module, used for making a mask image from the clean image;
an original feature map extraction module, used for inputting the input image into an encoder of the U-Net network for feature extraction to obtain an original feature map;
a calculation module, used for calculating a Hessian matrix of the original feature map and calculating the eigenvalues of the Hessian matrix;
a depth Hessian attention feature map extraction module, used for calculating a Hessian response from the eigenvalues and extracting an edge feature map to obtain a depth Hessian attention feature map;
a feature fusion module, used for splicing the original feature map and the depth Hessian attention feature map and inputting the spliced result into a decoder of the U-Net network for feature fusion to obtain an output image;
and a training module, used for calculating the loss between the output image and the label image according to the mask image and back-propagating to update the weights, thereby obtaining a trained medical image denoising model.
Compared with the prior art, the application has at least the following beneficial effects:
The application provides a medical image denoising model training method and device. A noise image and a clean image of the same part are acquired and preprocessed to obtain an input image, a label image and a mask image. The input image is input into an encoder of a U-Net network for feature extraction to obtain an original feature map; a Hessian matrix of the original feature map and its eigenvalues are calculated; a Hessian response is calculated from the eigenvalues, and an edge feature map is extracted to obtain a depth Hessian attention feature map. The original feature map and the depth Hessian attention feature map are spliced and input into a decoder of the U-Net network for feature fusion to obtain an output image. The loss between the output image and the label image is calculated according to the mask image, and the weights are updated by back-propagation to obtain a trained medical image denoising model. By incorporating depth Hessian attention features, the application strengthens the attention of the U-Net network to structural details, so that more attention is paid to tissue boundary information and texture details when the inherent speckle of the medical image is removed and its original tissue structure is reconstructed. This facilitates tissue layering and significantly improves the quality of the denoised medical image, so that the diagnosis result is more accurate.
Detailed Description
The application will be further described in detail by means of specific embodiments with reference to the accompanying drawings.
In the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more. The terms "first," "second," "third," and the like in this disclosure are intended to distinguish between the referenced objects without a special meaning in terms of technical connotation (e.g., should not be construed as emphasis on the degree of importance or order, etc.). The expressions "comprising", "including", "having", etc. also mean "not limited to" (certain units, components, materials, steps, etc.).
The terms such as "upper", "lower", "left", "right", "middle", and the like, as used herein, are generally used for the purpose of facilitating an intuitive understanding with reference to the drawings and are not intended to be an absolute limitation of the positional relationship in actual products.
The application provides a novel medical image denoising method, namely a medical image denoising method based on a neural network improved with depth Hessian attention features. The main idea is to strengthen the attention of the U-Net network to structural details by incorporating depth Hessian attention features, which emphasize boundary information in OCT images, thereby injecting a form of visual attention into the network. In addition, to help restore more structural detail, the present application introduces a mask loss to improve OCT image quality, especially in clinically significant regions. The mask is a binary image that marks a region of interest, such as the retinal region in an ophthalmic image. By incorporating the mask into the loss function, the denoising process is focused on these critical regions, which enhances overall image quality and diagnostic accuracy. The method thus addresses the challenges of preserving tissue boundary integrity and enhancing texture features in OCT images, thereby significantly improving denoising performance and diagnostic accuracy.
The method also has good application prospects for other imaging modalities such as computed tomography (CT). Details of the vascular structure in CT images are important information for diagnosing lung diseases, and the clarity of the imaged blood vessels directly affects diagnostic accuracy. However, CT images often contain noise and artifacts that degrade image quality. By introducing depth Hessian attention features, the method can strengthen the focus on fine structures in CT images, thereby improving the visibility and resolution of these structures, particularly the edges and bifurcation points of blood vessels. Combined with the mask loss, the method can keep the denoising process focused on key regions such as blood vessels in lung images and improve the image quality of those regions, thereby improving diagnostic precision and reliability. This multi-modality image processing method can therefore improve the quality of OCT images and also support clear display of pulmonary vascular structures in CT images, so it can be applied across different fields of medical imaging.
Example 1
Referring to FIG. 1, this embodiment provides a medical image denoising model training method, which includes:
S1, acquiring medical images of the same part, wherein the medical images comprise a noise image Inoisy and a clean image Iclean;
S2, adjusting the sizes of the noise image and the clean image, and carrying out normalization processing to obtain an input image and a label image;
Referring to FIG. 2, this step preprocesses the acquired noise image Inoisy and clean image Iclean. The preprocessing includes normalizing the noise image Inoisy and the clean image Iclean so that the data range lies between 0 and 1. The preprocessed noise image Inoisy is the input image Iinput, and the preprocessed clean image Iclean is the label image Ilabel.
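By way of illustration only, the preprocessing of S2 might be sketched as follows. The source states only that the images are resized and that the data range is adjusted to between 0 and 1; the nearest-neighbour resizing, the min-max normalization, the 256×256 target size and all function names below are assumptions made for this sketch.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Resize a 2-D image by nearest-neighbour sampling (assumed interpolation)."""
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows[:, None], cols[None, :]]

def min_max_normalize(img, eps=1e-8):
    """Scale pixel values into [0, 1]; also return (lo, hi) for inverse normalization."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + eps), (lo, hi)

def preprocess_pair(noisy, clean, size=(256, 256)):
    """Return (Iinput, Ilabel, stats) for one noisy/clean pair of the same part."""
    i_input, stats = min_max_normalize(resize_nearest(noisy, *size))
    i_label, _ = min_max_normalize(resize_nearest(clean, *size))
    return i_input, i_label, stats
```

The returned statistics are kept so that a denoised result can later be mapped back to the original intensity range.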
S3, making a mask image from the clean image;
Specifically, for each clean image Iclean, this step creates an all-zero image of the same size as the clean image Iclean, generates a polygonal region in the target tissue region of the all-zero image, and fills the region with the value 1 to obtain the mask image Imask.
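A minimal sketch of the mask construction in S3 is given below, assuming the polygon vertices are already available (for example, from manual annotation of the target tissue region). The even-odd ray-casting fill and the function name `polygon_mask` are illustrative choices, not taken from the source.

```python
import numpy as np

def polygon_mask(shape, vertices):
    """Binary mask: 1 inside the polygon, 0 elsewhere.

    shape    -- (height, width) of the clean image
    vertices -- list of (row, col) polygon corners, in order
    """
    h, w = shape
    rr, cc = np.mgrid[0:h, 0:w]
    inside = np.zeros(shape, dtype=bool)
    n = len(vertices)
    for k in range(n):
        r0, c0 = vertices[k]
        r1, c1 = vertices[(k + 1) % n]
        if r0 == r1:
            continue  # a horizontal edge never crosses a horizontal ray
        # Does this edge straddle the pixel's row?
        straddles = (r0 > rr) != (r1 > rr)
        # Column at which the edge crosses the pixel's row
        cross_c = c0 + (rr - r0) * (c1 - c0) / (r1 - r0)
        # Even-odd rule: toggle for each edge crossed to the right of the pixel
        inside ^= straddles & (cc < cross_c)
    mask = np.zeros(shape, dtype=np.uint8)  # the all-zero image
    mask[inside] = 1                        # fill the polygon with 1
    return mask
```

For example, `polygon_mask((10, 10), [(2, 2), (2, 7), (7, 7), (7, 2)])` marks a square block of pixels with 1 and leaves the rest at 0.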
S4, inputting the input image into an encoder of a U-Net network for feature extraction to obtain an original feature map;
Specifically, the U-Net network may be a ResUNet network, an Attention U-Net network or a Mamba-UNet network, and is preferably a ResUNet network. This embodiment is based on the ResUNet network and replaces its original skip connections with improved depth Hessian attention feature complementary connections. The ResUNet network therefore comprises three parts: an encoder, a decoder and the depth Hessian attention feature complementary connections, wherein both the encoder and the decoder have a residual structure, as shown in FIG. 3.
Referring to FIG. 4, assuming that the encoder performs n downsampling operations, the feature maps of the input image Iinput extracted by the encoder through downsampling are denoted Ei, i.e., Ei is the original feature map. In this step, the number of downsampling operations of the encoder may be increased or decreased; for example, four downsampling operations may be used.
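S5, calculating a Hessian matrix of the original feature map, and calculating the eigenvalues of the Hessian matrix;
Specifically, the Hessian matrix is calculated as:
$$H_i = \begin{bmatrix} \dfrac{\partial^2 E_i}{\partial x^2} & \dfrac{\partial^2 E_i}{\partial x \partial y} \\ \dfrac{\partial^2 E_i}{\partial x \partial y} & \dfrac{\partial^2 E_i}{\partial y^2} \end{bmatrix}$$
wherein $H_i$ is the Hessian matrix, $\partial$ denotes partial differentiation, x denotes the transverse coordinate, y denotes the longitudinal coordinate, $E_i$ denotes the original feature map, $\partial^2 E_i / \partial x^2$ denotes the second partial derivative of $E_i$ in the x direction, $\partial^2 E_i / \partial y^2$ denotes the second partial derivative of $E_i$ in the y direction, and $\partial^2 E_i / \partial x \partial y$ denotes the mixed partial derivative of $E_i$ in the x and y directions.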
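The Hessian computation of S5 can be sketched numerically as follows for a single-channel feature map. The source does not say how the derivatives are discretized; central finite differences via `numpy.gradient`, the closed-form eigenvalues of a symmetric 2×2 matrix, and the ordering of the eigenvalues by magnitude are assumptions of this sketch. A multi-channel feature map would be processed channel by channel.

```python
import numpy as np

def hessian_eigenvalues(feature_map):
    """Per-pixel eigenvalues (lam1, lam2) of the 2x2 Hessian, with |lam1| <= |lam2|."""
    e = np.asarray(feature_map, dtype=np.float64)
    d_y, d_x = np.gradient(e)        # axis 0 is y (rows), axis 1 is x (columns)
    d_yy, d_yx = np.gradient(d_y)    # second derivatives built from d_y
    d_xy, d_xx = np.gradient(d_x)    # second derivatives built from d_x
    d_xy = 0.5 * (d_xy + d_yx)       # symmetrize the mixed derivative

    # Eigenvalues of [[d_xx, d_xy], [d_xy, d_yy]] in closed form
    half_trace = 0.5 * (d_xx + d_yy)
    root = np.sqrt((0.5 * (d_xx - d_yy)) ** 2 + d_xy ** 2)
    lam_a, lam_b = half_trace - root, half_trace + root

    # Order by absolute value so that |lam1| <= |lam2|
    swap = np.abs(lam_a) > np.abs(lam_b)
    lam1 = np.where(swap, lam_b, lam_a)
    lam2 = np.where(swap, lam_a, lam_b)
    return lam1, lam2
```

Away from the image border, the finite differences are exact for quadratic surfaces, which makes a function such as x² + 3y² a convenient sanity check (eigenvalues 2 and 6).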
S6, calculating a Hessian response from the eigenvalues, and extracting an edge feature map to obtain a depth Hessian attention feature map;
Specifically, the step may calculate the Hessian response using the Jerman method, the Frangi method, or the Erdt method.
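As one possible illustration of S6, the sketch below computes a single-scale, two-dimensional Frangi-style response from eigenvalues ordered so that |lam1| ≤ |lam2|. It is a simplified form of the Frangi filter (no multi-scale Gaussian smoothing), not the embodiment's exact procedure; the parameter values `beta` and `c` and the choice to keep bright structures are assumptions.

```python
import numpy as np

def frangi_response_2d(lam1, lam2, beta=0.5, c=None, bright_structures=True):
    """Single-scale 2-D Frangi-style response; expects |lam1| <= |lam2|."""
    lam1 = np.asarray(lam1, dtype=np.float64)
    lam2 = np.asarray(lam2, dtype=np.float64)
    eps = 1e-10
    safe_lam2 = np.where(np.abs(lam2) < eps, eps, lam2)  # avoid division by zero

    r_b = lam1 / safe_lam2                  # blob-versus-line ratio
    s = np.sqrt(lam1 ** 2 + lam2 ** 2)      # second-order "structureness"
    if c is None:
        c = 0.5 * s.max() + eps             # heuristic: half the largest Hessian norm

    response = np.exp(-r_b ** 2 / (2 * beta ** 2)) * (1.0 - np.exp(-s ** 2 / (2 * c ** 2)))

    # Bright line-like structures have a strongly negative lam2; dark ones positive
    keep = lam2 < 0 if bright_structures else lam2 > 0
    return np.where(keep, response, 0.0)
```

The response is close to 1 for elongated, high-contrast structures and close to 0 for flat or blob-like regions, which is why it can serve as an edge-emphasizing attention map.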
S7, splicing the original feature map and the depth Hessian attention feature map, and inputting the spliced result into a decoder of the U-Net network for feature fusion to obtain an output image;
Specifically, in this step the original feature map Ei and the depth Hessian attention feature map are spliced to form the skip-connection input Fi.
The feature map Dn-i, obtained in the decoder after the (n-i)-th upsampling, is spliced with Fi and passed to the (n-i+1)-th upsampling stage.
In this step, the number of upsampling operations of the decoder may be increased or decreased;
When the original feature map and the extracted depth Hessian attention feature map are spliced to form the skip-connection input, the ratio between the original feature map and the depth Hessian attention feature map may be changed; for example, the ratio may be 1:2.
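The splicing in S7 can be illustrated with plain array concatenation. In the sketch below the arrays are laid out as (channels, height, width), the attention feature map is called `attn_i` (a hypothetical name, since its symbol is not legible in the source), and the 1:2 ratio is read as a ratio of channel counts. That reading is an assumption; the source does not state how the ratio is realized.

```python
import numpy as np

def skip_connection_input(e_i, attn_i):
    """F_i: original features and attention features spliced along the channel axis."""
    assert e_i.shape[1:] == attn_i.shape[1:], "spatial sizes must match"
    return np.concatenate([e_i, attn_i], axis=0)

def next_stage_input(d_prev, f_i):
    """Splice decoder features D_{n-i} with F_i before the next upsampling stage."""
    assert d_prev.shape[1:] == f_i.shape[1:], "spatial sizes must match"
    return np.concatenate([d_prev, f_i], axis=0)

rng = np.random.default_rng(0)
e_i = rng.random((4, 16, 16))       # 4 channels of original features
attn_i = rng.random((8, 16, 16))    # 8 channels of attention features (1:2, assumed reading)
d_prev = rng.random((12, 16, 16))   # decoder features at the same resolution

f_i = skip_connection_input(e_i, attn_i)   # shape (12, 16, 16)
x = next_stage_input(d_prev, f_i)          # shape (24, 16, 16)
```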
S8, calculating the loss between the output image and the label image according to the mask image, and back-propagating to update the weights, thereby obtaining a trained medical image denoising model.
Specifically, referring to FIG. 5, during training, in order to increase the sharpness of the output image, the loss is calculated between the output image and the sharpened label image, and the weights are updated by back-propagation. When the loss between the output image and the label image is calculated according to the mask image, the loss function may be any combination of mean square error loss, L1 loss, PSNR loss and SSIM loss; other regularization or loss terms, such as a contrastive loss or a perceptual loss, may also be introduced to further improve the denoising effect and the image quality.
When the loss function is a combination of the L1 loss and the SSIM loss, the loss function f is a weighted sum of the SSIM loss and the L1 loss, evaluated between the output image (i.e., Ioutput) and the sharpened label image b (i.e., Ilabel) under the mask image Imask. The two weight coefficients are non-negative and sum to 1.
The SSIM loss is computed from the means and standard deviations of the output image and of b, together with two constants c1 and c2.
The L1 loss is computed from the absolute differences between the i-th pixel values of the output image and of b, accumulated over the n pixels, where n is the number of pixels in the output image and in b.
The sharpening enhancement uses the 3×3 convolution kernel [[0, -0.5, 0], [-0.5, 3, -0.5], [0, -0.5, 0]].
The sharpening convolution kernel [[0, -0.5, 0], [-0.5, 3, -0.5], [0, -0.5, 0]] may be replaced with other sharpening convolution kernels, such as [[0, -1, 0], [-1, 4, -1], [0, -1, 0]].
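A minimal NumPy sketch of a masked L1 + SSIM loss is given below to make the description concrete. Several details are assumptions rather than the embodiment's exact formula: the mask is applied by element-wise multiplication to both images, the SSIM loss is taken as 1 − SSIM computed from global image statistics (including the covariance term of the standard SSIM definition), the constants are c1 = 0.01² and c2 = 0.03², and the weights default to 0.5 each. A real training loop would use a differentiable framework instead of NumPy.

```python
import numpy as np

# Sharpening kernel given in the description
SHARPEN_KERNEL = np.array([[0, -0.5, 0], [-0.5, 3, -0.5], [0, -0.5, 0]])

def conv3x3(img, kernel):
    """3x3 filtering with edge replication (the kernel is symmetric, so no flip is needed)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dr in range(3):
        for dc in range(3):
            out += kernel[dr, dc] * padded[dr:dr + h, dc:dc + w]
    return out

def l1_loss(a, b):
    """Mean absolute difference over all pixels."""
    return np.mean(np.abs(a - b))

def ssim_loss(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - SSIM using global statistics (no sliding window; an assumption)."""
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov_ab = ((a - mu_a) * (b - mu_b)).mean()
    ssim = ((2 * mu_a * mu_b + c1) * (2 * cov_ab + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))
    return 1.0 - ssim

def masked_loss(output, label, mask, w_ssim=0.5, w_l1=0.5):
    """Weighted SSIM + L1 loss between the output and the sharpened label, inside the mask."""
    target = conv3x3(label, SHARPEN_KERNEL)   # sharpened label image
    a, b = output * mask, target * mask       # restrict both images to the mask
    return w_ssim * ssim_loss(a, b) + w_l1 * l1_loss(a, b)
```

Because both images are multiplied by the mask, pixels outside the region of interest contribute nothing to the loss, which is the behaviour the mask loss is intended to provide.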
When the medical image denoising model trained in this embodiment is used to denoise an OCT image, Iinput is input into the trained ResUNet network improved with depth Hessian attention features (i.e., the medical image denoising model) to obtain Ioutput. The sharpening convolution kernel is then applied to Ioutput to obtain an enhanced image, the value range of the enhanced image is adjusted to between 0 and 1, and the final result is obtained by inverse normalization.
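The post-processing at inference time might look like the sketch below. It assumes that the range adjustment is done by clipping to [0, 1] and that inverse normalization undoes a min-max scaling with stored bounds `lo` and `hi`; both are assumptions, as are the function names. The trained network itself is not shown.

```python
import numpy as np

def sharpen(img, kernel):
    """Apply a 3x3 kernel with edge replication."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return sum(kernel[r, c] * padded[r:r + h, c:c + w]
               for r in range(3) for c in range(3))

def postprocess(i_output, lo, hi):
    """Sharpen the network output, clip to [0, 1], and map back to [lo, hi]."""
    kernel = np.array([[0, -0.5, 0], [-0.5, 3, -0.5], [0, -0.5, 0]])
    enhanced = sharpen(i_output, kernel)
    enhanced = np.clip(enhanced, 0.0, 1.0)   # assumed range adjustment
    return enhanced * (hi - lo) + lo         # assumed inverse min-max normalization
```

Because the kernel coefficients sum to 1, flat regions keep their intensity and only edges are amplified.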
By incorporating depth Hessian attention features, the medical image denoising model training method of this embodiment strengthens the attention of the U-Net network to structural details. When the inherent speckle of an OCT image is removed and the original tissue structure is reconstructed, more attention is paid to tissue boundary information and texture details in the OCT image, which facilitates tissue layering, significantly improves the quality of the denoised OCT image, raises the signal-to-noise ratio, and makes the diagnosis result more accurate. In addition, the method is also suitable for other imaging modalities, such as CT images, where it can enhance the visibility of pulmonary vascular structures so that the edges and details of blood vessels are more distinct, thereby improving image quality and diagnostic accuracy.
Traditional denoising methods (such as filtering methods and non-local means methods) are time-consuming, whereas this embodiment uses a deep learning model to compute more efficiently. Even for large-scale data processing, the method can still generate high-quality denoised images quickly. In addition, by introducing the mask loss, this embodiment can concentrate on regions of clinical significance, such as the retina in OCT images and pulmonary vascular structures in CT images, which further improves the image quality and diagnostic accuracy for these regions and is of value for accurately locating lesions and making diagnoses in clinical practice.
Example 2
This embodiment provides a medical image denoising model training apparatus, which comprises:
a medical image acquisition module, used for acquiring medical images of the same part, wherein the medical images comprise noise images and clean images;
a data preprocessing module, used for resizing the noise image and the clean image and normalizing them to obtain an input image and a label image;
a mask image making module, used for making a mask image from the clean image;
an original feature map extraction module, used for inputting the input image into an encoder of the U-Net network for feature extraction to obtain an original feature map;
a calculation module, used for calculating a Hessian matrix of the original feature map and calculating the eigenvalues of the Hessian matrix;
a depth Hessian attention feature map extraction module, used for calculating a Hessian response from the eigenvalues and extracting an edge feature map to obtain a depth Hessian attention feature map;
a feature fusion module, used for splicing the original feature map and the depth Hessian attention feature map and inputting the spliced result into a decoder of the U-Net network for feature fusion to obtain an output image;
and a training module, used for calculating the loss between the output image and the label image according to the mask image and back-propagating to update the weights, thereby obtaining a trained medical image denoising model.
For implementation details of each module of the medical image denoising model training apparatus, reference may be made to the above description of the medical image denoising model training method, which is not repeated here.
Any combination of the features of the above embodiments may be used (as long as there is no contradiction between the combinations of the features), and for brevity of description, all of the possible combinations of the features of the above embodiments are not described, and all of the embodiments not explicitly described are also to be considered as being within the scope of the description.