A three-dimensional image segmentation method and system based on a fully convolutional neural network
Technical field
The invention belongs to the technical field of three-dimensional sequence image segmentation, and in particular relates to a three-dimensional image segmentation method and system based on a fully convolutional neural network.
Background art
Image processing generally comprises image segmentation, registration, fusion and three-dimensional reconstruction. In image segmentation, existing techniques mainly analyse each slice of a sequence, segment it on the two-dimensional plane by means of traditional graphics methods, machine learning methods, deep learning methods and the like, and then fuse the results into a three-dimensional image to realize three-dimensional image segmentation. Sequence image segmentation carried out in this way generally considers only the relationships between pixels within a two-dimensional plane and ignores the continuity between slices, so much contextual information is lost.
For example, the traditional threshold-based segmentation algorithm is simple in principle: an optimal threshold is chosen by manual traversal to segment the image. However, its calculation process is complicated, it is easily disturbed by noise, and its robustness is poor. Algorithms based on edge detection first detect the edge points in the image and then connect them into contours according to some strategy to form the segmented regions. Their drawback is the conflict between noise immunity and detection accuracy, so the resulting segmentation often contains discontinuous, incomplete structural information.
In summary, a new three-dimensional image segmentation method is needed.
Summary of the invention
The purpose of the present invention is to provide a three-dimensional image segmentation method and system based on a fully convolutional neural network, so as to solve the above technical problem. The invention can make full use of the continuity information of the sequence and can obtain a relatively good result in three-dimensional image segmentation.
In order to achieve the above objectives, the invention adopts the following technical scheme:
A three-dimensional image segmentation method based on a fully convolutional neural network, comprising the following steps:
Step 1, acquiring sequence images and annotating them to obtain training sample data;
Step 2, performing normalization pre-processing on the training sample data obtained in step 1;
Step 3, using the sample data processed in step 2 to perform supervised training on a pre-built 3-D fully convolutional residual U-net network model until a preset convergence condition is met, so as to obtain a trained three-dimensional image segmentation model;
Step 4, normalizing the sequence image data to be segmented and inputting it into the three-dimensional image segmentation model trained in step 3 to obtain the sequence image segmentation result.
Further, in step 1, an image drawing tool is used to outline the region of interest in the whole sequence image as the label for machine learning; the sequence images are annotated on the two-dimensional plane, and after all annotations are completed they are merged into a three-dimensional image.
Further, the normalization pre-processing of step 2 specifically includes:
(1) obtaining the three-dimensional image grey-level matrix from the original sequence images;
(2) normalizing the pixel grey values of the three-dimensional image grey-level matrix to the range 0-1;
(3) normalizing the voxel spacing of the image obtained in (2) to a preset value.
Further, the 3-D fully convolutional residual U-net network model constructed in step 3 is divided into three modules: an encoder, a decoder and attention connections; its operations include convolution, pooling, ReLU, batch normalization and deconvolution.
Further, the 3-D fully convolutional residual U-net network model constructed in step 3 has 8 residual blocks in total. The encoder stage includes 4 down-sampling residual blocks; the size of the feature map is unchanged after each convolution operation and becomes 1/2 of the original after each pooling operation; ReLU is used as the nonlinear activation function to increase the nonlinear expressive power of the network; batch normalization is added to accelerate network convergence and to prevent gradient explosion or vanishing when the network is too deep. The decoder stage includes 4 up-sampling residual blocks, and the size of the feature map becomes 2 times the original after each deconvolution operation. The attention connection mechanism splices the low-dimensional information in the encoder with the high-dimensional information in the decoder, so as to provide finer features for the segmentation.
Further, during the supervised model training of step 3, one or more of the following data augmentation operations are added: random cropping, flipping, random translation, contrast enhancement and elastic deformation;
during training, the parameters are updated with the Adam method, the cross-entropy function is selected as the loss function, and hard example mining is added.
Further, the method also includes: step 5, optimizing the obtained segmentation result using a conditional random field method.
Further, step 5 specifically includes: using the conditional random field method to smooth and denoise the segmentation result, finally extracting the segmentation result from the sequence image, and determining the centre point coordinates and the length of the segmentation result.
A three-dimensional image segmentation system based on a fully convolutional neural network, comprising:
a sample acquisition module, for acquiring sequence images and annotating them to obtain training sample data;
a normalization pre-processing module, for performing normalization pre-processing on the training sample data obtained by the sample acquisition module;
a model segmentation module, for using the sample data processed by the normalization pre-processing module to perform supervised training on the pre-built 3-D fully convolutional residual U-net network model until the preset convergence condition is met, so as to obtain a trained three-dimensional image segmentation model;
an input/output module, for normalizing the sequence image data to be segmented, inputting it into the trained three-dimensional image segmentation model of the model segmentation module, and outputting the sequence image segmentation result.
Compared with the prior art, the invention has the following advantages:
The segmentation method of the invention obtains training data, analyses the basic characteristics of the sequence images, performs normalization pre-processing on the images, and trains a fully convolutional semantic segmentation neural network. An attention mechanism is added to efficiently splice the low-dimensional information in the encoder with the high-dimensional information in the decoder, so as to provide finer features for the segmentation, segment the region of interest from the image and mark its specific position. The method can make full use of the continuity information of the sequence and obtain a better result in three-dimensional image segmentation. Specifically, the invention uses a 3D fully convolutional network with an added residual block structure and fuses the three-dimensional information of the sequence images, which can make the segmentation result more accurate; in the post-processing optimization, a conditional random field algorithm is added to obtain a smoother segmentation result.
Further, optimizing the segmentation result with post-processing makes the output smoother.
Brief description of the drawings
Fig. 1 is a schematic flow diagram of the three-dimensional image segmentation method based on a fully convolutional neural network of the invention;
Fig. 2 is a schematic flow diagram of the construction of the fully convolutional segmentation network in the three-dimensional image segmentation method based on a fully convolutional neural network of the invention.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and specific embodiments.
The three-dimensional image segmentation method based on a fully convolutional neural network of the invention aims to apply a three-dimensional convolutional neural network to sequence image segmentation and to further post-process the output of the neural network with a conditional random field algorithm, so as to obtain a complete, continuous, efficient segmentation result with a high recall rate. It specifically includes the following steps:
Step 1, obtaining sequence images and annotating the training data.
A sequence image generally refers to a group of images acquired continuously from a certain scene and can be regarded as one three-dimensional image. Target segmentation of a sequence image is similar to two-dimensional image segmentation: the region of interest is extracted from the image, and its contour, size and centre point coordinates are obtained.
For supervised deep learning methods, the neural network must learn during training the correct output corresponding to the input data, so the sequence data used must be annotated. The invention uses different image drawing tools to outline the region of interest in the whole image as the label for machine learning. Owing to the limitation of the viewing angle, sequence images are usually annotated on the two-dimensional plane, and after all annotations are completed the results are merged into three dimensions.
Step 2, data pre-processing and analysis.
During data acquisition, objective operating conditions differ (taking medical data as an example, the data parameters obtained by different doctors and different machines are not the same), so the acquired data lack consistency, and basic pre-processing operations are therefore needed.
The specific pre-processing steps include:
(1) obtaining the three-dimensional image grey-level matrix from the original file;
(2) normalizing the pixel grey values of the image to the range 0-1;
(3) normalizing the voxel spacing of the 3D image to a preset value, for example 1 mm.
The size and position distribution of the regions of interest in the data are then analysed to provide prior information for the prediction in the next step.
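Pre-processing steps (2) and (3) can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the function names, the min-max scaling and the nearest-neighbour resampling are choices made here, since the text fixes only the target range 0-1 and the example spacing of 1 mm.

```python
import numpy as np

def normalize_intensity(volume):
    """Min-max scale voxel intensities to the range [0, 1] (assumed scheme)."""
    volume = volume.astype(np.float32)
    lo, hi = float(volume.min()), float(volume.max())
    if hi == lo:                      # constant volume: avoid division by zero
        return np.zeros_like(volume)
    return (volume - lo) / (hi - lo)

def resample_to_spacing(volume, spacing, target=1.0):
    """Nearest-neighbour resampling so that every axis has `target` mm spacing."""
    new_shape = [max(1, int(round(n * s / target)))
                 for n, s in zip(volume.shape, spacing)]
    # For each output index, pick the closest source index along that axis.
    index = [np.minimum((np.arange(m) * target / s).round().astype(int), n - 1)
             for m, s, n in zip(new_shape, spacing, volume.shape)]
    return volume[np.ix_(*index)]

# Toy grey-level matrix with a hypothetical spacing of 2.0 x 0.5 x 0.5 mm.
raw = np.arange(4 * 6 * 6, dtype=np.int16).reshape(4, 6, 6)
vol = resample_to_spacing(normalize_intensity(raw), spacing=(2.0, 0.5, 0.5))
print(vol.shape)
```

In practice an interpolating resampler would usually be preferred for the image, with nearest-neighbour reserved for the label volume so that label values stay discrete.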
Step 3, constructing the fully convolutional model.
A 3-D fully convolutional residual U-net network with 8 residual blocks in total and a depth of 28 layers is constructed. It is mainly divided into three modules (encoder, decoder and attention connection mechanism), and its main operations include convolution, deconvolution, pooling, ReLU and batch normalization.
The input training images have a size of 32*224*224. The encoder includes 4 down-sampling residual blocks; the size of the feature map is unchanged after each convolution and becomes 1/2 of the original after each pooling; ReLU is used as the nonlinear activation function, and batch normalization is added to accelerate network convergence and to prevent gradient explosion or vanishing when the network is too deep. The decoder includes four up-sampling residual blocks, and the corresponding down-sampling feature maps are connected to the corresponding outputs to obtain more global information.
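The feature-map sizes implied by this description can be checked with a few lines of Python. The sketch assumes that each pooling step halves all three axes, which the text suggests but does not state per axis; the function name is illustrative only.

```python
def encoder_shapes(input_shape=(32, 224, 224), num_blocks=4):
    """Feature-map size after each down-sampling residual block."""
    shapes = [tuple(input_shape)]
    for _ in range(num_blocks):
        # Convolutions keep the size; each pooling step halves every axis.
        shapes.append(tuple(n // 2 for n in shapes[-1]))
    return shapes

shapes = encoder_shapes()
for level, shape in enumerate(shapes):
    print(level, shape)
```

Under this assumption the deepest feature map is 2*14*14, and the four deconvolutions of the decoder restore the 32*224*224 input size by doubling each axis four times.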
The constructed fully convolutional model structure includes an encoder, a decoder and attention connections:
The encoder takes the pre-processed three-dimensional image as input, down-samples the image through a series of operations such as convolution, pooling and ReLU, and extracts high-dimensional feature maps as output;
The decoder takes the high-dimensional feature maps output by the encoder as input, up-samples the high-dimensional features through a series of operations such as convolution, deconvolution and ReLU, and outputs a segmentation result map of the same size as the original image;
The attention connection takes a shallow feature map of the encoder as input, outputs a feature map of the same size after a convolution transform, and then splices it with the corresponding feature map of the decoder to obtain more fine-grained segmentation information.
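The data flow of such a connection (transform the shallow encoder map, then concatenate it with the decoder map) can be sketched as below. The text does not specify the convolution kernel or how attention weights are computed, so the 1x1x1 convolution used here is an assumption and no attention gating is shown; all names are hypothetical.

```python
import numpy as np

def pointwise_conv(features, weight):
    """1x1x1 convolution: mixes channels, leaves the D x H x W size unchanged."""
    # features: (C_in, D, H, W); weight: (C_out, C_in)
    return np.einsum("oc,cdhw->odhw", weight, features)

def skip_connect(encoder_map, decoder_map, weight):
    """Transform the shallow encoder map, then concatenate along the channel axis."""
    transformed = pointwise_conv(encoder_map, weight)
    assert transformed.shape[1:] == decoder_map.shape[1:], "spatial sizes must match"
    return np.concatenate([transformed, decoder_map], axis=0)

rng = np.random.default_rng(0)
enc = rng.standard_normal((8, 4, 16, 16))    # shallow encoder feature map
dec = rng.standard_normal((8, 4, 16, 16))    # decoder feature map of the same size
merged = skip_connect(enc, dec, rng.standard_normal((8, 8)))
print(merged.shape)
```

The concatenated map has the channels of both inputs, so the convolution that follows it in the decoder sees fine-grained encoder detail next to the up-sampled semantic features.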
Step 4, training the convolutional neural network.
During training, because the number of sequence image samples is relatively small, data augmentation operations such as random cropping, flipping, translation, contrast enhancement, noise and random deformation need to be added. The parameters are updated with the Adam method; the initial learning rate is 0.001, the learning rate drops to 1/2 of its previous value after every 10 iterations, and the cross-entropy function is selected as the loss function.
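The learning-rate rule stated above is a step decay, which can be written in one line. The function and parameter names below are illustrative; `step` counts whatever unit the halving is applied to (iterations in this passage).

```python
def learning_rate(step, initial=0.001, decay_every=10, factor=0.5):
    """Step decay: multiply the rate by `factor` once every `decay_every` steps."""
    return initial * factor ** (step // decay_every)

schedule = [learning_rate(s) for s in (0, 9, 10, 25, 30)]
print(schedule)
```

The rate stays at 0.001 for steps 0 to 9, is 0.0005 for steps 10 to 19, and so on.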
Step 5, optimizing the segmentation result.
A preliminary segmentation result is first obtained from the neural network; the conditional random field method is then used to smooth and denoise the segmented region; finally, the 3D segmented mass is extracted from the ultrasound image, and its physical features such as centre point coordinates, length, width and height are determined.
The three-dimensional image segmentation method of the invention mainly solves the problem that, in sequence image segmentation, the traditional approach of two-dimensional segmentation followed by three-dimensional fusion loses the information between slices of the sequence. The invention uses a 3D fully convolutional network with an added residual block structure and attention connection mechanism, which effectively fuses the three-dimensional information of the sequence images and makes the segmentation result more accurate.
Compared with the prior art, the invention avoids the vanishing gradient problem that arises as network depth increases by adding the residual block structure; by adding the attention connection mechanism (rather than the common crop-and-copy operation), it can efficiently extract the fine-grained low-dimensional information in the encoder, so as to provide finer features for the segmentation.
Embodiment
Referring to Fig. 1, a three-dimensional image segmentation method based on a fully convolutional neural network according to an embodiment of the invention is applied to medical image segmentation and comprises the following steps:
S101, obtaining Abus ultrasound images and annotating the training data.
Abus, a dedicated ultrasound system for breast screening, incorporates new technologies such as whole-volume breast ultrasound imaging, contrast-enhanced ultrasound, elastography and three-/four-dimensional imaging, and uses high-definition 3D volume images to provide highly uniform, high-resolution ultrasound images. The image format is DICOM (Digital Imaging and Communications in Medicine), the international standard for medical images and related information; the image file contains a great deal of metadata such as pixel size and image size.
In keeping with the natural world, the scanned ultrasound image is a three-dimensional image. During annotation, the doctor compares different sections, accurately locates the mass, outlines its contour in two-dimensional form, and then saves the result as a .nii file.
S102, data pre-processing.
During data acquisition, objective operating conditions differ (for example, different doctors or different instruments), so the acquired data lack consistency, and basic pre-processing operations are therefore needed.
The specific pre-processing steps include:
Step 1. obtaining the numerically represented image grey-level matrix from the DICOM file;
Step 2. normalizing the pixel grey values of the image to the range 0-1;
Step 3. normalizing the voxel spacing of the 3D image to 1 mm.
Data normalization is an important issue when expressing feature vectors in data mining. When different features are listed together, the way the features are expressed can cause data that are small in absolute value to be "swallowed" by large data. In that case the extracted feature vectors need to be normalized to ensure that every feature is treated fairly by the classifier.
The size and position distribution of the masses in the data are then analysed to provide prior information for the prediction in the next step.
S103, constructing the 3D fully convolutional neural network.
A 3-D fully convolutional residual U-net network with 8 residual blocks in total and a depth of 28 layers is constructed. It is mainly divided into three modules (encoder, decoder and attention connection), and its main operations include convolution, pooling, ReLU and batch normalization.
(1) Residual block
In a deep neural network, problems such as vanishing and exploding gradients arise more easily as the network depth increases and hinder the convergence of the model. In a traditional neural network, if the input of a section of the network is x and the desired output is H(x), learning such a mapping directly is relatively difficult. In a residual structure the input is x and the desired output is rewritten as H(x) = F(x) + x, so the network only needs to learn the difference F(x) between the target value and x, which greatly reduces the difficulty of learning.
In actual operation, layers with the same output feature-map size keep the same number of channels, and when the feature-map size is halved the number of channels is doubled.
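The residual formulation H(x) = F(x) + x can be illustrated with a minimal NumPy sketch. Dense matrices stand in for the 3-D convolutions here, and the placement of the final ReLU is an assumption; neither detail is specified in the text.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), with F a two-layer transform (stand-in for convolutions)."""
    f = relu(x @ w1) @ w2        # residual branch F(x)
    return relu(f + x)           # identity shortcut adds the input back

x = np.array([[1.0, -2.0, 3.0]])
zeros = np.zeros((3, 3))
y = residual_block(x, zeros, zeros)   # F(x) = 0, so the block reduces to relu(x)
print(y)
```

With all weights at zero the block passes its input through the shortcut unchanged (apart from the activation), which is why adding further residual blocks does not make the identity mapping harder to represent.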
(2) 3-D Unet structure
The traditional Unet is essentially an encoder-decoder structure: high-dimensional information is obtained through several layers of down-sampling, producing a low-resolution feature map, which is then up-sampled into a full-resolution segmentation map. In the encoder stage, all convolution kernels have a size of 3*3*3, one max pooling operation is carried out after each residual block, and the intermediate feature map is obtained after four down-samplings. In the decoder stage, the intermediate feature map passes through deconvolution operations, and the attention connection mechanism is added at the same time: the shallow feature maps in the encoder are skip-connected to the feature maps of the same size in the decoder to obtain more fine-grained semantic information. Finally, two segmentation maps of the same size as the original input are output: the first channel gives, for each pixel of the original image, the probability that its label is 0, and the second channel gives the probability that its label is 1.
The convolution operations in the network use dilated convolutions with different dilation rates. This is because image segmentation is a pixel-level prediction, and every time the feature map passes through a pooling operation the receptive field increases but some key location information is lost. The advantage of dilated convolution is that it enlarges the receptive field of the convolution kernel without losing information, so that each convolution output contains information from a larger range.
In the invention, because ultrasound data are naturally three-dimensional continuous data, the 3-D residual U-net structure makes full use of the contextual information of different slices and can obtain a better segmentation effect than conventional two-dimensional methods.
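How dilation enlarges the receptive field without pooling can be shown with simple arithmetic. The dilation rates 1, 2 and 4 below are hypothetical examples, since the text does not list the rates actually used; the calculation is per axis and assumes stride-1 convolutions.

```python
def effective_kernel(kernel=3, dilation=1):
    """Span covered along one axis by a dilated kernel."""
    return kernel + (kernel - 1) * (dilation - 1)

def receptive_field(dilations, kernel=3):
    """Receptive field along one axis of stacked stride-1 dilated convolutions."""
    field = 1
    for d in dilations:
        field += effective_kernel(kernel, d) - 1
    return field

spans = [effective_kernel(3, d) for d in (1, 2, 4)]
print(spans, receptive_field((1, 2, 4)), receptive_field((1, 1, 1)))
```

Three stacked 3-wide convolutions with dilations 1, 2 and 4 cover 15 voxels along an axis, against 7 for three ordinary convolutions, with the same number of weights and no loss of resolution.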
S104, training the convolutional neural network.
During training, because the number of medical image samples is relatively small, data augmentation operations such as random cropping, flipping, translation, contrast enhancement and elastic deformation need to be added.
Random cropping: centred on the annotated nodule, the nodule and part of the surrounding normal tissue are taken at random and resampled to a fixed size.
Random translation: the centre of the crop is translated at random, taking care not to cut off the edge of the mass, because the edge of the mass retains some relative position information.
Flipping: left-right and up-down flipping operations are added, since nodules in ultrasound images are not sensitive to absolute position information.
Contrast enhancement: brightness, contrast and sharpness differ between ultrasound images, and artefact interference is usually present, so the contrast of the pictures needs to be adjusted at random.
Elastic deformation: for each dimension of each pixel a random offset in the interval (-1, 1) is generated, the offset matrix of each dimension is filtered with a Gaussian filter (0, sigma), and finally the amplification coefficient alpha is used to control the offset range; this can expand the amount of data to a great extent.
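The elastic deformation described above (random offsets in (-1, 1), Gaussian smoothing with sigma, scaling by alpha) can be sketched with NumPy alone. The default values of `alpha` and `sigma`, the edge padding and the nearest-neighbour lookup are assumptions made for this illustration, and the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    t = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(t ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def smooth(field, sigma):
    """Separable Gaussian filtering along every axis, with edge padding."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    def blur(v):
        return np.convolve(np.pad(v, r, mode="edge"), k, mode="valid")
    for axis in range(field.ndim):
        field = np.apply_along_axis(blur, axis, field)
    return field

def deformation_coords(shape, alpha=8.0, sigma=2.0, rng=None):
    """Integer sampling coordinates for one random elastic deformation."""
    rng = np.random.default_rng() if rng is None else rng
    grid = np.meshgrid(*[np.arange(n) for n in shape], indexing="ij")
    coords = []
    for axis, g in enumerate(grid):
        # Random offsets in (-1, 1), Gaussian-smoothed, then scaled by alpha.
        offset = alpha * smooth(rng.uniform(-1.0, 1.0, size=shape), sigma)
        idx = np.rint(g + offset).astype(int)
        coords.append(np.clip(idx, 0, shape[axis] - 1))
    return tuple(coords)

rng = np.random.default_rng(0)
image = rng.random((16, 16, 16))
label = (image > 0.5).astype(np.uint8)
coords = deformation_coords(image.shape, rng=rng)
warped_image, warped_label = image[coords], label[coords]   # same warp for both
```

Computing the sampling coordinates once and applying them to both the image and its label keeps the annotation aligned with the deformed image.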
The final loss function of the model is obtained by passing the output prediction image through a pixel-level soft_max (at which point the value at each point represents the probability that it is predicted as 0 or 1) and then computing the cross entropy with the label values.
The specific formula is as follows:
$$E = \sum_{x \in \Omega} \omega(x)\,\log\left(p_{l(x)}(x)\right) \tag{1}$$
In the formula, x is the pixel position, p(x) is the probability value at each point after soft_max, l is the pixel class (here K = 2), and ω(x) is the true class of each pixel. The closer p(x) is to ω(x), the smaller the value of the loss function; by minimizing the loss function, the optimal solution of the model can be obtained.
During training, because the areas of the tumour region and the non-tumour region differ greatly, a weighted cross-entropy loss is used: the weight of the loss for the tumour region is increased so that the network can segment the tumour region better.
The parameter settings in the experiments are as follows:
Optimizer: Adam is used, with the momentum set to 0.9.
Learning rate: the initial learning rate is set to 0.001, and every 10 epochs the learning rate decays to half of its previous value.
Segmentation weights: the weights are set reasonably according to the proportions of positive and negative samples in the data, for example 0.7 for positive samples and 0.3 for negative samples.
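A weighted pixel-level cross-entropy with the example weights above (0.7 for the positive class, 0.3 for the negative class) might look as follows. This sketch uses the usual negative log-likelihood sign convention and averages over voxels; both choices, and all function names, are assumptions rather than details taken from the text.

```python
import numpy as np

def softmax(logits, axis=0):
    shifted = logits - logits.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def weighted_cross_entropy(logits, labels, class_weights=(0.3, 0.7)):
    """Mean of -w[label] * log p[label] over all voxels.

    logits: (2, D, H, W) network output; labels: (D, H, W) integers in {0, 1}.
    class_weights[0] is the negative-class weight, class_weights[1] the positive one.
    """
    probs = softmax(logits, axis=0)
    p_true = np.take_along_axis(probs, labels[None], axis=0)[0]
    weights = np.asarray(class_weights)[labels]
    return float(np.mean(-weights * np.log(p_true + 1e-12)))

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=(2, 4, 4))
logits = rng.standard_normal((2, 2, 4, 4))
loss = weighted_cross_entropy(logits, labels)
print(loss)
```

Because positive voxels carry the larger weight, errors inside the small tumour region contribute more to the gradient than the same errors in the much larger background.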
S105, optimization based on the conditional random field.
Conditional random fields are commonly used for pixel-level image labelling. For the output of the neural network, a graph model is built in combination with the pixel grey-value distribution of the original image; through successive iterations, pixels that are closer in value and nearer in distance have a higher probability of receiving the same label. In this way the output of the neural network can be made smoother, noise can be reduced, and the false positive rate can be lowered. The mass is then extracted from the ultrasound image, its physical features such as centre point coordinates, length, width and height are determined, and these are converted back into coordinates and actual mass size in the real world.
The three-dimensional sequence image segmentation method based on a deep fully convolutional neural network of the invention mainly solves the problem that information continuity is easily lost when three-dimensional sequence images are segmented from a two-dimensional perspective. Compared with a simple neural network, deep learning has more layers and a more complex network structure. Convolutional neural networks were inspired by the visual system and by traditional image processing methods; image transformation is based on local connections and hierarchical organization between neurons, and the network is built and trained through parameter sharing and gradient descent, so better results can be obtained in the image field.
In summary, the invention provides a three-dimensional image segmentation method based on a fully convolutional network. First, sequence training data are obtained and professionally annotated. Second, the data are pre-processed: the grey values are normalized to the range 0-1 and the pixel spacing is normalized. Third, a 3D residual fully convolutional neural network is constructed and trained on the basis of the essential characteristics of the data such as size, contrast and pixel values. Finally, the trained convolutional neural network is used to predict on new data, and the conditional random field method is used to smooth and denoise the prediction result so that the region of interest is found accurately. By constructing a 3-D fully convolutional neural network, the invention makes up for the defect that the traditional approach of two-dimensional segmentation followed by three-dimensional fusion loses the information between slices of the sequence.
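The last part of this step, determining the centre point and the length, width and height of the segmented mass and converting them to real-world units, can be sketched as below. The conditional random field itself is not implemented here; the function name, the use of a bounding box over all foreground voxels and the spacing values are assumptions for illustration.

```python
import numpy as np

def describe_region(mask, spacing=(1.0, 1.0, 1.0)):
    """Centre and extent of the foreground of a binary mask, in voxels and millimetres."""
    idx = np.argwhere(mask > 0)
    if idx.size == 0:
        return None                          # nothing was segmented
    lo, hi = idx.min(axis=0), idx.max(axis=0)
    size_vox = hi - lo + 1                   # bounding-box extent per axis
    center_vox = (lo + hi) / 2.0
    spacing = np.asarray(spacing, dtype=float)
    return {
        "center_voxel": center_vox,
        "size_voxel": size_vox,
        "center_mm": center_vox * spacing,   # physical coordinates
        "size_mm": size_vox * spacing,       # physical length, width, height
    }

mask = np.zeros((8, 8, 8), dtype=np.uint8)
mask[2:5, 1:7, 3:4] = 1                      # toy segmented mass
info = describe_region(mask, spacing=(2.0, 0.5, 0.5))
print(info["center_mm"], info["size_mm"])
```

If the pre-processing resampled the volume to 1 mm spacing, the voxel and millimetre values coincide; otherwise the original spacing recorded in the image metadata is what maps the result back to real-world size.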
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The above embodiments are only intended to illustrate the technical scheme of the invention and not to limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art may still modify the specific embodiments of the invention or replace them with equivalents, and any modification or equivalent replacement that does not depart from the spirit and scope of the invention falls within the scope of the pending claims of the invention.