Disclosure of Invention
The invention provides a cell classification method and device based on signal point content constraint, which are used for solving the defect of poor cell classification accuracy in the prior art.
The invention provides a cell classification method based on signal point content constraint, which comprises the following steps:
carrying out image segmentation on the cell image to obtain a cell contour of each cell, and acquiring a signal fluorescence image of each cell under a plurality of channels based on the cell contour;
performing signal point detection on signal fluorescence images of any cell under a plurality of channels based on each channel fluorescence signal detection submodel of the cell classification model to obtain signal point detection results of any cell under a plurality of channels;
and based on a classification branch of a cell classification model, performing feature extraction on a fusion image of the signal fluorescence image of any cell under a plurality of channels and a fusion result of the signal point detection result under the plurality of channels to obtain an image fusion feature and a signal point fusion feature of any cell, and performing cell classification based on the image fusion feature and the signal point fusion feature to obtain a classification result of any cell.
According to the cell classification method based on signal point content constraint provided by the invention, the signal point detection is carried out on the signal fluorescence image of any cell under a plurality of channels by each channel fluorescence signal detection submodel based on the cell classification model to obtain the signal point detection result of any cell under a plurality of channels, and the method specifically comprises the following steps:
performing signal point prediction on a signal fluorescence image of any cell under any channel based on a heat map sub-branch of a fluorescence signal detection sub-model corresponding to any channel to obtain a fluorescence signal heat map of any cell under any channel;
performing signal point regression classification on the signal fluorescence image of any cell under any channel based on the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any channel to obtain the number of signal points of any cell under any channel; the signal point detection result comprises the fluorescence signal heat map and the number of signal points.
According to the cell classification method based on signal point content constraint provided by the invention, the feature extraction is performed on the fusion image of the signal fluorescence image of any cell under a plurality of channels and the fusion result of the signal point detection result under a plurality of channels to obtain the image fusion feature and the signal point fusion feature of any cell, and the method specifically comprises the following steps:
superposing the signal fluorescence images of any cell under a plurality of channels to obtain a fusion image, and then extracting the characteristics of the fusion image to obtain the image fusion characteristics of any cell;
superposing fluorescence signal heat maps of any cell under a plurality of channels to obtain a fusion heat map, and then carrying out feature extraction on the fusion heat map to obtain heat map fusion features of any cell;
fusing the number of signal points of any cell under a plurality of channels to obtain the statistical fusion characteristic of any cell; the signal point fusion features include the heat map fusion features and the statistical fusion features.
According to the cell classification method based on signal point content constraint provided by the invention, performing signal point detection on the signal fluorescence images of any cell under a plurality of channels based on each channel fluorescence signal detection submodel of the cell classification model, to obtain the signal point detection results of any cell under the plurality of channels, specifically comprises the following steps:
carrying out image coding on a signal fluorescence image of any cell under any channel based on an encoder of a fluorescence signal detection submodel corresponding to any channel to obtain an image feature vector of the signal fluorescence image under any channel;
performing signal point prediction on each pixel in the signal fluorescence image under any channel based on the heat map sub-branch of the fluorescence signal detection sub-model corresponding to any channel and combining the image feature vector to obtain the fluorescence signal heat map of any cell under any channel;
and performing regression classification on signal points in the signal fluorescence image under any channel based on the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any channel by combining the image feature vector and the fluorescence signal heat map to obtain the number of the signal points of any cell under any channel.
According to the cell classification method based on signal point content constraint provided by the invention, the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any channel is combined with the image feature vector and the fluorescence signal heat map to perform regression classification on the signal points in the signal fluorescence image of any channel to obtain the number of the signal points of any cell in any channel, and the method specifically comprises the following steps:
performing signal point classification on signal points in the signal fluorescence image under any channel based on a signal point classification branch of the regression classification subbranch and by combining the image feature vector and the fluorescence signal heat map to obtain a first signal point number of any cell under any channel;
and/or performing signal point regression on signal points in the signal fluorescence image under any channel based on a signal point regression branch of the regression classification sub-branch in combination with the image feature vector and the fluorescence signal heat map to obtain a second signal point number of any cell under any channel;
determining the number of signal points of any cell under any channel based on the first number of signal points and/or the second number of signal points of any cell under any channel.
According to the cell classification method based on signal point content constraint provided by the invention, the cell classification model is obtained by training based on the following steps:
respectively training each channel fluorescence signal detection submodel of the cell classification model based on a sample signal fluorescence image of a sample cell under each channel and signal point labels and the number of sample signal points in the sample signal fluorescence image until the training loss of the fluorescence signal detection submodel corresponding to each channel is less than or equal to a first loss threshold;
and training the classification branches by combining the trained fluorescence signal detection submodels corresponding to the channels based on the sample cell types of the sample cells until the training loss of the classification branches is less than or equal to a second loss threshold value.
According to the cell classification method based on signal point content constraint provided by the invention, respectively training each channel fluorescence signal detection submodel of the cell classification model, based on the sample signal fluorescence image of the sample cell under each channel and the signal point labels and the number of sample signal points in the sample signal fluorescence image, until the training loss of the fluorescence signal detection submodel corresponding to each channel is less than or equal to a first loss threshold, specifically comprises the following steps:
training the heat map sub-branch of the fluorescence signal detection sub-model corresponding to any channel based on the sample signal fluorescence image of the sample cell in any channel and the signal point label in the sample signal fluorescence image until the training loss of the heat map sub-branch of the fluorescence signal detection sub-model corresponding to any channel is less than or equal to a third loss threshold value;
and performing combined training on the heat map sub-branch and the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any channel based on the sample signal fluorescence image of the sample cells in any channel and the signal point labels and the number of sample signal points in the sample signal fluorescence image until the training loss of the fluorescence signal detection sub-model corresponding to any channel is less than or equal to a first loss threshold value.
The invention also provides a cell classification device based on signal point content constraint, which comprises:
the image segmentation unit is used for carrying out image segmentation on the cell image to obtain the cell contour of each cell and acquiring a signal fluorescence image of each cell under a plurality of channels based on the cell contour;
the signal point detection unit is used for carrying out signal point detection on signal fluorescence images of any cell under a plurality of channels based on the fluorescence signal detection submodels of each channel of the cell classification model to obtain signal point detection results of any cell under the plurality of channels;
and the cell classification unit is used for performing feature extraction on the fusion image of the signal fluorescence image of any cell under the multiple channels and the fusion result of the signal point detection result under the multiple channels based on the classification branch of the cell classification model to obtain the image fusion feature and the signal point fusion feature of any cell, and performing cell classification based on the image fusion feature and the signal point fusion feature to obtain the classification result of any cell.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the cell classification method based on signal point content constraint as described in any one of the above.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for cell classification based on signal point content constraints as described in any of the above.
The present invention also provides a computer program product comprising a computer program which, when executed by a processor, implements a method for cell classification based on signal point content constraints as described in any of the above.
The cell classification method and device based on signal point content constraint provided by the invention perform signal point detection on the signal fluorescence images of any cell under a plurality of channels to obtain the signal point detection results of the cell under the plurality of channels. Feature extraction is then performed on the fusion image of the signal fluorescence images of the cell under the plurality of channels and on the fusion result of the signal point detection results under the plurality of channels, which enriches the image semantic information in the signal fluorescence images and the signal point information under each channel, to obtain the image fusion feature and the signal point fusion feature of the cell. Cell classification is performed based on the image fusion feature and the signal point fusion feature, so that the feature information of each dimension can be mutually supplemented and verified, and the trained classification branch can select the feature information with higher reliability from the feature information of each dimension for subsequent cell classification, which effectively improves the accuracy of cell classification.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of the cell classification method based on signal point content constraint provided by the present invention. As shown in Fig. 1, the method includes:
Step 110, performing image segmentation on the cell image to obtain a cell contour of each cell, and acquiring a signal fluorescence image of each cell under a plurality of channels based on the cell contour.
Here, the target chromosomes in the cell sample may be hybridized and stained by a four-color fluorescence in situ hybridization (FISH) technique, and the cell images corresponding to the four probe channels in a field of view may be collected by a microscope; alternatively, the cell images corresponding to the respective probe channels may be read directly from a database. The cell image under any probe channel includes the individual cells in that field of view. For the cell image under any probe channel, a cell segmentation algorithm can be used to perform image segmentation on the cell image, obtain a mask of each cell, and extract the cell contour of each cell. The cell image under the probe channel is then cropped according to the cell contour of each cell to obtain a sub-image of each cell under the probe channel, namely the signal fluorescence image of each cell under the probe channel. The signal fluorescence image is a fluorescence image of the interior of a single cell under the probe channel, and contains the fluorescence signal points produced by staining the interior of the cell.
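The cropping described above can be illustrated with a minimal sketch. It assumes that the segmentation algorithm has already produced a label mask in which 0 marks the background and each positive integer marks the pixels of one cell; the function name crop_cells, the zeroing of pixels outside the cell, and the toy data are illustrative assumptions and are not specified by the embodiment.

```python
def crop_cells(channel_image, label_mask):
    """Crop one sub-image per labelled cell from a single-channel image.

    channel_image: 2D list of pixel intensities for one probe channel.
    label_mask:    2D list of ints; 0 = background, k > 0 = pixels of cell k
                   (assumed output of some cell segmentation algorithm).
    Returns {cell_id: 2D list}. Pixels inside the bounding box but outside
    the cell are set to 0 (an assumption made for this sketch).
    """
    # First pass: find the bounding box (r0, c0, r1, c1) of every cell.
    boxes = {}
    for r, row in enumerate(label_mask):
        for c, cell_id in enumerate(row):
            if cell_id == 0:
                continue
            r0, c0, r1, c1 = boxes.get(cell_id, (r, c, r, c))
            boxes[cell_id] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))

    # Second pass: copy the pixels of each cell into its own sub-image.
    crops = {}
    for cell_id, (r0, c0, r1, c1) in boxes.items():
        crops[cell_id] = [
            [channel_image[r][c] if label_mask[r][c] == cell_id else 0
             for c in range(c0, c1 + 1)]
            for r in range(r0, r1 + 1)
        ]
    return crops


# Toy example: one 3 x 4 channel image containing two cells.
image = [
    [5, 6, 0, 0],
    [7, 8, 0, 9],
    [0, 0, 0, 4],
]
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 0, 2],
]
crops = crop_cells(image, mask)
```

Applying the same mask to the cell image of each probe channel would yield, for every cell, one signal fluorescence image per channel.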
Step 120, performing signal point detection on the signal fluorescence images of any cell under a plurality of channels based on each channel fluorescence signal detection submodel of the cell classification model, to obtain the signal point detection results of any cell under the plurality of channels.
Here, signal point detection may be performed on the signal fluorescence image under a single channel by using the fluorescence signal detection submodel corresponding to that channel in the trained cell classification model. The fluorescence signal detection submodels corresponding to different channels may have the same structure, but each is trained independently. For any channel, an object detection algorithm can be used: the fluorescence signal detection submodel corresponding to the channel extracts features from the signal fluorescence image under the channel and performs signal point detection based on the extracted features, thereby obtaining the signal point information in the signal fluorescence image, that is, the signal point detection result of the cell under the channel. The signal point detection result includes the position information of each signal point in the corresponding signal fluorescence image, and may further include morphological information, statistical information and the like of the signal points, which is not particularly limited in the embodiment of the present invention.
Step 130, based on the classification branch of the cell classification model, performing feature extraction on the fusion image of the signal fluorescence images of any cell under the plurality of channels and on the fusion result of the signal point detection results under the plurality of channels to obtain the image fusion feature and the signal point fusion feature of any cell, and performing cell classification based on the image fusion feature and the signal point fusion feature to obtain the classification result of any cell.
Here, in order to improve the accuracy of cell classification, the signal fluorescence images of the cell under multiple channels may be fused based on the classification branches of the cell classification model to fuse image features included in different signal fluorescence images under multiple channels, and the signal point detection results under multiple channels may be fused to fuse related information of signal points in different signal fluorescence images under multiple channels, so as to enrich image semantic information in the signal fluorescence images and signal point information under each channel. And then, respectively carrying out feature extraction on the fusion images of the signal fluorescence images under the multiple channels and the fusion results of the signal point detection results under the multiple channels by using a feature encoder to obtain the image fusion features and the signal point fusion features of the cells, and then integrating the image semantic information and the signal point information provided in the image fusion features and the signal point fusion features to carry out cell classification to obtain the classification results of the cells.
The fusion image of the signal fluorescence images under the multiple channels contains the semantic information of the different signal fluorescence images under the multiple channels, including the signal points and the semantic information around them, and the trained classification branch can extract from the fusion image the key features that are beneficial to cell classification, thereby obtaining the image fusion feature. The signal point detection results under the multiple channels contain the signal point information under each channel, such as the position information and statistical information of each signal point, and the trained classification branch can likewise extract from this signal point information the key information that is beneficial to cell classification, thereby obtaining the signal point fusion feature. The image fusion feature and the signal point fusion feature therefore provide richer feature information in two different dimensions, namely the image dimension and the signal point dimension.
When cell classification is performed according to the feature information of different dimensions, the feature information of each dimension can be mutually supplemented and verified, and the trained classification branch can select the feature information with higher reliability from the feature information of each dimension for subsequent cell classification. Even if the signal fluorescence image contains noise caused by imaging problems, credible features can still be screened out for classification through this complementary use of multi-dimensional feature information, so that the accuracy of cell classification is effectively improved.
In addition, in the training process of the cell classification model, the fluorescence signal detection submodels corresponding to the channels are trained first, and the classification branch of the cell classification model is trained only after fluorescence signal detection submodels whose performance meets a preset condition (namely, whose signal point detection precision reaches the preset condition) have been obtained. Because the training of the classification branch is constrained by fluorescence signal detection submodels that have reached a certain accuracy, the classification branch can be guided to learn to extract more accurate image fusion features and signal point fusion features from signal point detection results of a certain accuracy. At the same time, the classification branch can learn to select the feature information with higher reliability from the image fusion features and signal point fusion features of different dimensions for subsequent cell classification, which improves the accuracy of cell classification. Therefore, the cell classification method provided by the embodiment of the invention is a cell classification method based on signal point content constraint.
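The two-stage training order described above can be sketched as follows. This is only an illustration of the control flow: the toy step function stands in for a real optimizer step, and the threshold values, channel names and helper names (train_until, make_toy_step) are hypothetical and not taken from the embodiment.

```python
def train_until(step_fn, loss_threshold, max_steps=1000):
    """Run training steps until the loss is <= loss_threshold.

    step_fn: callable performing one training step and returning the loss.
    Returns (number_of_steps, last_loss). Stops at max_steps as a safeguard.
    """
    loss = float("inf")
    for step in range(1, max_steps + 1):
        loss = step_fn()
        if loss <= loss_threshold:
            return step, loss
    return max_steps, loss


def make_toy_step(initial_loss, decay):
    """Hypothetical stand-in for a real training step: the loss simply
    decays geometrically each time the step is called."""
    state = {"loss": initial_loss}

    def step():
        state["loss"] *= decay
        return state["loss"]

    return step


FIRST_LOSS_THRESHOLD = 0.1   # assumed value, for illustration only
SECOND_LOSS_THRESHOLD = 0.2  # assumed value, for illustration only

# Stage 1: train the detection submodel of each channel independently.
channels = ["channel_1", "channel_2", "channel_3", "channel_4"]
stage1 = {
    ch: train_until(make_toy_step(1.0, 0.5), FIRST_LOSS_THRESHOLD)
    for ch in channels
}

# Stage 2: train the classification branch only after every submodel
# has reached the first loss threshold.
stage2 = train_until(make_toy_step(1.0, 0.5), SECOND_LOSS_THRESHOLD)
```

The point of the ordering is that the classification branch only ever sees signal point detection results produced by submodels that already satisfy the first threshold.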
The method provided by the embodiment of the invention performs signal point detection on the signal fluorescence images of any cell under a plurality of channels to obtain the signal point detection results of the cell under the plurality of channels. Feature extraction is performed on the fusion image of the signal fluorescence images of the cell under the plurality of channels and on the fusion result of the signal point detection results under the plurality of channels, which enriches the image semantic information in the signal fluorescence images and the signal point information under each channel, to obtain the image fusion feature and the signal point fusion feature of the cell. Cell classification is performed based on the image fusion feature and the signal point fusion feature, so that the feature information of each dimension can be mutually supplemented and verified, and the trained classification branch can select the feature information with higher reliability from the feature information of each dimension, thereby effectively improving the accuracy of cell classification.
Based on the above embodiment, performing signal point detection on the signal fluorescence images of any cell under a plurality of channels based on each channel fluorescence signal detection submodel of the cell classification model, to obtain the signal point detection results of any cell under the plurality of channels, specifically includes:
performing signal point prediction on a signal fluorescence image of any cell under any channel based on a heat map sub-branch of a fluorescence signal detection sub-model corresponding to any channel to obtain a fluorescence signal heat map of any cell under any channel;
performing signal point regression classification on the signal fluorescence image of any cell under any channel based on the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any channel to obtain the number of signal points of any cell under any channel; the signal point detection result comprises the fluorescence signal heat map and the number of signal points.
Specifically, the fluorescence signal detection submodel corresponding to any channel comprises two sub-branches: a heat map sub-branch and a regression classification sub-branch. The heat map sub-branch is used for making a prediction for each pixel in the signal fluorescence image under the channel, determining the probability that each pixel belongs to a signal point, and thereby obtaining the fluorescence signal heat map of the cell under the channel, where the fluorescence signal heat map contains the probability that each pixel is a signal point pixel. The regression classification sub-branch is used for performing regression analysis and/or classification on the number of signal points in the signal fluorescence image of the cell under the channel to obtain the number of signal points of the cell under the channel. Whether the regression classification sub-branch adopts regression analysis, classification, or a combination of the two can be decided according to which performs better in the current application scenario, which is not specifically limited in the embodiment of the present invention. Here, the number of signal points is the number of fluorescence signal points in the signal fluorescence image under the corresponding channel, and the signal point detection result under any channel includes the fluorescence signal heat map and the number of signal points under that channel.
By providing the regression classification sub-branch, the embodiment of the invention can guide the learning of the heat map sub-branch during training and improve the precision with which the heat map sub-branch detects signal points. Because the fluorescence signal heat map is constrained by the number of signal points predicted by the regression classification sub-branch, the detection precision for adhered signal points, stray points and the like can be effectively improved. In addition, the heat map sub-branch can in turn assist the learning of the regression classification sub-branch and improve the accuracy of the number of signal points it outputs, thereby improving the overall accuracy of the signal point detection result.
Based on any one of the above embodiments, the performing feature extraction on the fusion image of the signal fluorescence image of any one cell under multiple channels and the fusion result of the signal point detection result under multiple channels to obtain the image fusion feature and the signal point fusion feature of any one cell specifically includes:
superposing the signal fluorescence images of any cell under a plurality of channels to obtain a fusion image, and then extracting the characteristics of the fusion image to obtain the image fusion characteristics of any cell;
superposing fluorescence signal heat maps of any cell under a plurality of channels to obtain a fusion heat map, and then carrying out feature extraction on the fusion heat map to obtain heat map fusion features of any cell;
fusing the number of signal points of any cell under a plurality of channels to obtain the statistical fusion characteristic of any cell; the signal point fusion features include the heat map fusion features and the statistical fusion features.
Specifically, the signal fluorescence images of the cell under the respective channels are superposed to form a new matrix, which gives the fusion image. The fusion image is then feature-encoded by a feature encoder to obtain the image fusion feature of the cell. Likewise, the fluorescence signal heat maps under the respective channels are superposed to form a new matrix, which gives the fusion heat map. The fusion heat map is then feature-encoded by another feature encoder to obtain the heat map fusion feature of the cell. The feature encoder that encodes the fusion heat map may have the same network structure as the feature encoder that encodes the fusion image.
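A minimal sketch of the superposition step is given below. It interprets "superposing" as stacking the per-channel images along a new channel axis, which is an assumption; the function name stack_channels and the toy pixel values are also illustrative only. The same helper would apply to the per-channel fluorescence signal heat maps.

```python
def stack_channels(channel_images):
    """Stack C single-channel H x W images into one C x H x W nested list.

    Raises ValueError if the images do not all share the same size.
    """
    if not channel_images:
        raise ValueError("at least one channel image is required")
    height = len(channel_images[0])
    width = len(channel_images[0][0])
    for img in channel_images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all channel images must share the same size")
    # Copy the rows so that the fused result does not alias the inputs.
    return [[list(row) for row in img] for img in channel_images]


# Toy 2 x 2 signal fluorescence images of one cell under four channels.
channel_a = [[1, 0], [0, 0]]
channel_b = [[0, 2], [0, 0]]
channel_c = [[0, 0], [3, 0]]
channel_d = [[0, 0], [0, 4]]

fused_image = stack_channels([channel_a, channel_b, channel_c, channel_d])
shape = (len(fused_image), len(fused_image[0]), len(fused_image[0][0]))
```

Stacking keeps the channels separate in the fused input, so a downstream feature encoder can still tell which probe produced which signal point.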
The numbers of signal points output by the regression classification sub-branches of the fluorescence signal detection submodels corresponding to the respective channels are combined in channel order, thereby fusing them into the statistical fusion feature of the cell. The signal point fusion feature comprises the heat map fusion feature and the statistical fusion feature.
Fusion feature extraction is then performed based on the image fusion feature, the heat map fusion feature and the statistical fusion feature, and cell classification is performed based on the extracted result to obtain the classification result of the cell. For example, the image fusion feature, the heat map fusion feature and the statistical fusion feature can be concatenated, a fully connected layer can be used to extract classification features from the concatenated result, and the classification features can be converted into probabilities through an activation function such as sigmoid to obtain the probability that the cell belongs to each cell type, from which the classification result of the cell is determined.
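The concatenation, fully connected layer and sigmoid activation described above can be sketched with plain Python lists. The feature values, weights, biases and the function name classify_cell are hypothetical placeholders; in practice the weights would be learned during training of the classification branch.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real value to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def classify_cell(image_feat, heatmap_feat, count_feat, weights, biases):
    """Concatenate the three features, apply one fully connected layer and a
    sigmoid, and return one probability per cell type.

    weights: list of rows, one row per cell type, each row as long as the
             concatenated feature vector.
    biases:  list with one bias per cell type.
    """
    fused = list(image_feat) + list(heatmap_feat) + list(count_feat)
    probs = []
    for row, bias in zip(weights, biases):
        if len(row) != len(fused):
            raise ValueError("weight row length must match fused feature length")
        logit = sum(w * x for w, x in zip(row, fused)) + bias
        probs.append(sigmoid(logit))
    return probs


# Hypothetical features of one cell.
image_feat = [1.0, 0.0]               # image fusion feature
heatmap_feat = [0.5]                  # heat map fusion feature
count_feat = [2.0, 2.0, 1.0, 2.0]     # signal point counts in channel order

# Hypothetical weights for two cell types (7 inputs each).
weights = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
]
biases = [0.0, 0.0]

probs = classify_cell(image_feat, heatmap_feat, count_feat, weights, biases)
predicted_type = max(range(len(probs)), key=lambda i: probs[i])
```

Here the statistical fusion feature is simply the list of per-channel counts in channel order, so that each position in the vector always refers to the same probe channel.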
Based on any of the above embodiments, performing signal point detection on the signal fluorescence images of any cell under a plurality of channels based on each channel fluorescence signal detection submodel of the cell classification model, to obtain the signal point detection results of any cell under the plurality of channels, specifically includes:
carrying out image coding on a signal fluorescence image of any cell under any channel based on an encoder of a fluorescence signal detection sub-model corresponding to any channel to obtain an image feature vector of the signal fluorescence image under any channel;
performing signal point prediction on each pixel in the signal fluorescence image under any channel based on the heat map sub-branch of the fluorescence signal detection sub-model corresponding to any channel and combining the image feature vector to obtain the fluorescence signal heat map of any cell under any channel;
and performing regression classification on signal points in the signal fluorescence image under any channel based on the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any channel by combining the image feature vector and the fluorescence signal heat map to obtain the number of the signal points of any cell under any channel.
Specifically, the fluorescence signal detection submodel corresponding to any channel comprises an encoder, a heat map sub-branch, a regression classification sub-branch and other structures. For signal point detection, the signal fluorescence image under the channel is first image-encoded by an encoder to obtain an image feature vector of the signal fluorescence image.
The heat map sub-branch is used to obtain a fluorescence signal heat map of the same size as the input image. The sub-branch comprises a decoder used for performing signal point prediction on each pixel in the signal fluorescence image based on the image feature vector. Here, the decoder may employ an upsampling structure corresponding to the encoder.
The image feature vector and the fluorescence signal heat map are then input to the regression classification sub-branch for regression classification. In the regression classification sub-branch, the image feature vector and the fluorescence signal heat map are each reduced in dimension through fully connected layers and then concatenated into a new feature vector; this feature vector is further reduced to a one-dimensional feature through several fully connected layers, and regression analysis and/or classification is then performed on the resulting feature to obtain the number of signal points.
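The data flow through the regression classification sub-branch can be sketched as follows. The layer sizes, parameter values and names (linear, count_feature, the keys of params) are hypothetical, and activation functions between layers are omitted for brevity because the embodiment does not specify them.

```python
def linear(x, weights, biases):
    """Fully connected layer y = W x + b, with W given as a list of rows."""
    for row in weights:
        if len(row) != len(x):
            raise ValueError("weight row length must match input length")
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def count_feature(image_vec, heatmap, params):
    """Reduce the image feature vector and the flattened heat map separately,
    concatenate the two reduced vectors, and reduce the result to one value.
    """
    flat_map = [p for row in heatmap for p in row]
    reduced_img = linear(image_vec, params["w_img"], params["b_img"])
    reduced_map = linear(flat_map, params["w_map"], params["b_map"])
    joined = reduced_img + reduced_map          # concatenation
    return linear(joined, params["w_out"], params["b_out"])[0]


# Hypothetical inputs: a 2-element image feature vector and a 2 x 2 heat map.
image_vec = [1.0, 2.0]
heatmap = [[0.0, 1.0],
           [1.0, 0.0]]

# Hypothetical parameters; real values would be learned during training.
params = {
    "w_img": [[1.0, 0.0]], "b_img": [0.0],            # 2 -> 1
    "w_map": [[1.0, 1.0, 1.0, 1.0]], "b_map": [0.0],  # 4 -> 1
    "w_out": [[0.0, 1.0]], "b_out": [0.0],            # 2 -> 1
}

feature = count_feature(image_vec, heatmap, params)
```

With these toy parameters the output equals the sum of the heat map probabilities, which shows how the heat map can inform the count estimate; a trained sub-branch would learn its own weighting of the two inputs.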
Based on any one of the embodiments, the performing regression classification on the signal points in the signal fluorescence image of any one channel based on the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any one channel in combination with the image feature vector and the fluorescence signal heat map to obtain the number of the signal points of any one cell in any one channel specifically includes:
performing signal point classification on signal points in the signal fluorescence image under any channel based on a signal point classification branch of the regression classification subbranch and by combining the image feature vector and the fluorescence signal heat map to obtain a first signal point number of any cell under any channel;
and/or performing signal point regression on signal points in the signal fluorescence image under any channel based on a signal point regression branch of the regression classification sub-branch in combination with the image feature vector and the fluorescence signal heat map to obtain a second signal point number of any cell under any channel;
determining the number of signal points of any cell under any channel based on the first number of signal points and/or the second number of signal points of any cell under any channel.
Specifically, the regression classification sub-branch includes a signal point classification branch and a signal point regression branch. The signal point classification branch performs signal point classification on the signal points in the signal fluorescence image under the channel, based on the image feature vector and the fluorescence signal heat map, to obtain the first signal point number of the cell under the channel. Here, each possible number of signal points may be set in advance as a corresponding type, and the signal point classification branch then treats counting as a classification problem over the image feature vector and the fluorescence signal heat map. For example, in the current application scenario, the number of signal points of a cell in any channel may be 0, 1, 2, 3, 4, or more than 4, and cases with more than 4 signal points are relatively rare, so the number of signal points can be preset to the following six types: 0, 1, 2, 3, 4, and more than 4. When the signal point classification branch makes its classification decision, it classifies based on the image feature vector and the fluorescence signal heat map, thereby determining the number of signal points in the current signal fluorescence image.
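The six preset types can be illustrated as a mapping between signal point counts and class indices. The following is a minimal sketch, assuming that the last type covers every count above 4; the names `MAX_EXACT_COUNT`, `count_to_class` and `class_to_label` are hypothetical and do not appear in the described method.

```python
# Hypothetical helpers; the cap of 4 follows the six-type example in the text.
MAX_EXACT_COUNT = 4  # counts 0..4 each get their own class; larger counts share one


def count_to_class(num_points: int) -> int:
    """Map a signal point count to a class index in 0..5."""
    if num_points < 0:
        raise ValueError("signal point count cannot be negative")
    return min(num_points, MAX_EXACT_COUNT + 1)


def class_to_label(class_index: int) -> str:
    """Return a human-readable label for a class index."""
    if not 0 <= class_index <= MAX_EXACT_COUNT + 1:
        raise ValueError("class index out of range")
    if class_index <= MAX_EXACT_COUNT:
        return str(class_index)
    return f">{MAX_EXACT_COUNT}"
```

Under this assumed mapping, a cell with 7 signal points and a cell with 5 signal points fall into the same class, which is acceptable only when such cells are rare.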
And the signal point regression branch is used for performing signal point regression on signal points in the signal fluorescence image under the channel based on the image feature vector and the fluorescence signal heat map to obtain the second signal point number of the cell under the channel. Here, a regression idea may be adopted to predict the number of signal points in the current signal fluorescence image by performing regression analysis based on the above-mentioned image feature vector and fluorescence signal heat map using a signal point regression branch.
When the number of signal points of any cell in any channel is obtained, only one of the signal point classification branch and the signal point regression branch may be adopted, and the first signal point number or the second signal point number output by the corresponding branch may be used as the final number of signal points. Alternatively, the signal point classification branch and the signal point regression branch can be used simultaneously to obtain the first signal point number and the second signal point number respectively, and the final number of signal points is determined by combining the two.
Which manner to select (the signal point classification branch alone, the signal point regression branch alone, or both combined) may be decided based on the numerical range of the number of signal points that a sample cell may contain in the channel in the actual application scenario, and on the number of sample cells corresponding to each number of signal points, so as to improve the accuracy of the finally obtained number of signal points. For example, if the number of signal points that a sample cell may contain in the channel falls in a narrow range, for example mostly 0, 1, 2, 3 and 4 with very few cells having more than 4, the signal point classification branch performs better than the signal point regression branch and can be selected to obtain the number of signal points; if the numerical range of the number of signal points is wide and the numbers of sample cells corresponding to each number of signal points are balanced, the signal point regression branch performs better than the signal point classification branch and can be selected to obtain the number of signal points.
Furthermore, the method of obtaining the number of signal points jointly by combining the signal point classification branch and the signal point regression branch can be directly selected, and the output of the two branches can be adjusted by using the weight coefficient. If the effect of the signal point classification branch is better than that of the signal point regression branch in the current application scene, the weight of the signal point classification branch can be set to be higher; if the effect of the signal point regression branch is better than that of the signal point classification branch, the weight of the signal point regression branch can be set to be higher, and therefore the accuracy of the number of finally obtained signal points is improved.
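The text states that weight coefficients adjust the outputs of the two branches but does not give the combining rule. The following is a minimal sketch under the assumption that the final count is a normalized weighted average of the two outputs, rounded to the nearest integer; the function name `combine_counts` and the default weights are hypothetical.

```python
def combine_counts(first_count: int, second_count: float,
                   w_cls: float = 0.5, w_reg: float = 0.5) -> int:
    """Blend the classification-branch count with the regression-branch count.

    Assumption: a normalized weighted average followed by rounding.
    """
    if w_cls < 0 or w_reg < 0 or w_cls + w_reg == 0:
        raise ValueError("weights must be non-negative and not both zero")
    blended = (w_cls * first_count + w_reg * second_count) / (w_cls + w_reg)
    # Round half up and never report a negative number of signal points.
    return max(0, int(blended + 0.5))
```

For instance, giving the regression branch a weight of 0.8 lets a regression output of 3.4 override a classification output of 2, whereas equal weights with outputs 2 and 2.8 keep the count at 2.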
Based on any of the above embodiments, the cell classification model is obtained by training based on the following steps:
respectively training each channel fluorescence signal detection submodel of the cell classification model based on a sample signal fluorescence image of a sample cell under each channel and signal point labels and the number of sample signal points in the sample signal fluorescence image until the training loss of the fluorescence signal detection submodel corresponding to each channel is less than or equal to a first loss threshold;
and training the classification branches by combining the trained fluorescence signal detection submodels corresponding to the channels based on the sample cell types of the sample cells until the training loss of the classification branches is less than or equal to a second loss threshold value.
Specifically, when training the cell classification model, the fluorescence signal detection submodels corresponding to the respective channels are first trained, respectively. Here, the fluorescent signal detection submodels of the channels may be trained independently based on the sample signal fluorescent image of the sample cells in each channel and the signal point labels and the number of sample signal points in the sample signal fluorescent image, until the training loss of the fluorescent signal detection submodels corresponding to each channel is less than or equal to the first loss threshold.
After the signal point detection accuracy of the fluorescence signal detection submodels corresponding to the channels reaches the preset condition, the classification branch of the cell classification model is trained, in combination with the fluorescence signal detection submodels corresponding to the channels and based on the sample cell types of the sample cells, until the training loss of the classification branch is less than or equal to the second loss threshold. When training the classification branch, Binary Cross Entropy (BCE) loss can be adopted as the binary classification constraint. Constraining the training of the classification branch with fluorescence signal detection submodels that have reached a certain accuracy guides the classification branch to learn to extract more accurate image fusion features and signal point fusion features from signal point detection results of a certain accuracy, and at the same time to learn to select the more reliable feature information from the image fusion features and signal point fusion features of different dimensions for subsequent cell classification, which helps improve the accuracy of cell classification.
In addition, during training, dice and AUC can be selected as the evaluation indexes of the fluorescence signal detection submodel in the first stage, and the best-performing model according to these indexes is saved; AUC can be selected as the evaluation index when training the classification branch in the second stage.
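The two evaluation indexes and the keep-the-best rule can be sketched as follows. This is an illustrative pure-Python sketch: `dice_score`, `auc_score` and `BestCheckpoint` are hypothetical names, masks and labels are assumed to be flat lists of 0/1 values, and AUC is computed by direct pairwise comparison, which is only practical for small validation sets.

```python
def dice_score(pred_mask, true_mask, eps=1e-7):
    """Dice coefficient between two flat binary masks (lists of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    return (2.0 * intersection + eps) / (sum(pred_mask) + sum(true_mask) + eps)


def auc_score(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((1.0 if p > n else (0.5 if p == n else 0.0))
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


class BestCheckpoint:
    """Track the epoch with the highest evaluation index seen so far."""

    def __init__(self):
        self.best_metric = float("-inf")
        self.best_epoch = None

    def update(self, epoch, metric):
        """Return True when this epoch should replace the saved model."""
        if metric > self.best_metric:
            self.best_metric, self.best_epoch = metric, epoch
            return True
        return False
```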
Based on any of the above embodiments, the training of the fluorescence signal detection submodels corresponding to the channels of the cell classification model based on the sample signal fluorescence image of the sample cells in each channel and the signal point labels and the number of the sample signal points in the sample signal fluorescence image is respectively performed until the training loss of the fluorescence signal detection submodels corresponding to each channel is less than or equal to the first loss threshold, specifically including:
training the heat map sub-branch of the fluorescence signal detection sub-model corresponding to any channel based on the sample signal fluorescence image of the sample cell under any channel and signal point marks in the sample signal fluorescence image until the training loss of the heat map sub-branch of the fluorescence signal detection sub-model corresponding to any channel is less than or equal to a third loss threshold value;
and performing combined training on the heat map sub-branch and the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any channel based on the sample signal fluorescence image of the sample cells in any channel and the signal point labels and the number of sample signal points in the sample signal fluorescence image until the training loss of the fluorescence signal detection sub-model corresponding to any channel is less than or equal to a first loss threshold value.
Specifically, the fluorescence signal detection sub-model corresponding to any channel comprises two sub-branches, a heat map sub-branch and a regression classification sub-branch, so that during training each sub-branch can guide the learning of the other and the performance of both is improved. The number of signal points provided by the regression classification sub-branch can guide the learning of the heat map sub-branch, improving its signal point segmentation in scenes with adhered signal points and stray points, and thus its signal point prediction performance; conversely, the information contained in the fluorescence signal heat map provided by the heat map sub-branch, such as the position and distribution of signal point pixels, can guide the learning of the regression classification sub-branch, improving its ability to distinguish signal points in scenes with adhered signal points and stray points, and thus its performance in obtaining the number of signal points.
Specifically, when the fluorescence signal detection submodel corresponding to any one channel is trained independently, the heatmap subbranch and the regression classification subbranch may be trained jointly based on the sample signal fluorescence image in the single channel, the signal point labels of the signal points in the sample signal fluorescence image, and the number of the sample signal points.
In the training process, the heat map sub-branch may first be constrained based on the signal point labels of the respective signal points until the segmentation loss (e.g., the difference between the signal point region predicted by the heat map sub-branch and the ground truth corresponding to the signal point labels) is less than or equal to the third loss threshold. At that point the heat map sub-branch can substantially separate the signal points in the sample signal fluorescence image, but its segmentation of partially adhered signal points may still be insufficient. Therefore, the heat map sub-branch and the regression classification sub-branch are then trained jointly until the overall training loss of the fluorescence signal detection submodel corresponding to the channel is less than or equal to the first loss threshold. Here, the regression classification sub-branch performs regression analysis and/or classification based on the output of the heat map sub-branch, so that when the two sub-branches are constrained by the signal point labels and the number of sample signal points, their performances affect each other and improve together. The performance of the heat map sub-branch influences the performance of the regression classification sub-branch, and during training the regression classification sub-branch, in order to improve its own performance, in turn guides the heat map sub-branch to improve as far as possible its ability to separate adhered signal points and to recognize non-signal points such as stray points. The two sub-branches thus learn jointly and complement each other, improving their respective performance.
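The two-step schedule (heat map sub-branch alone until the third loss threshold, then joint training until the first loss threshold) can be sketched as a threshold-driven loop. This is a minimal sketch: `train_until`, `train_detection_submodel` and `make_decaying_step` are hypothetical names, each step function is assumed to run one training iteration and return the current loss, and the `max_steps` cap is an added safeguard that the text does not mention.

```python
def train_until(step_fn, loss_threshold, max_steps=1000):
    """Call step_fn() until the loss it returns is <= loss_threshold."""
    loss = float("inf")
    for step in range(1, max_steps + 1):
        loss = step_fn()
        if loss <= loss_threshold:
            return step, loss
    return max_steps, loss  # cap reached without meeting the threshold


def train_detection_submodel(heatmap_step, joint_step,
                             third_loss_threshold, first_loss_threshold,
                             max_steps=1000):
    """Train the heat map sub-branch alone, then both sub-branches jointly."""
    heatmap_result = train_until(heatmap_step, third_loss_threshold, max_steps)
    joint_result = train_until(joint_step, first_loss_threshold, max_steps)
    return heatmap_result, joint_result


def make_decaying_step(start_loss, decay):
    """Stand-in for a real training step: the loss shrinks by `decay` per call."""
    state = {"loss": start_loss}

    def step():
        state["loss"] *= decay
        return state["loss"]

    return step
```

In a real implementation the two step functions would wrap optimizer updates on the segmentation loss and on the overall submodel loss respectively; the decaying stand-in only makes the control flow observable.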
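In training the heat map sub-branch, the focal loss ($L_{focal}$) and the dice loss ($L_{dice}$) may be adopted as the segmentation loss ($L_{seg}$) of the heat map sub-branch, constraining the fluorescence signal heat map ($p$) predicted by the heat map sub-branch against the ground truth ($Y$) corresponding to the signal point labels. In their standard forms, these losses may be written as:

$$L_{seg} = L_{focal} + L_{dice}$$

$$L_{focal} = -\alpha\, Y\, (1-p)^{\gamma} \log(p) - (1-\alpha)\,(1-Y)\, p^{\gamma} \log(1-p)$$

$$L_{dice} = 1 - \frac{2\sum pY}{\sum p + \sum Y}$$

wherein $\alpha$ and $\gamma$ are hyper-parameters, and $p$ is the probability, output by the heat map sub-branch, that a pixel is predicted to be a positive sample.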
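As an illustration of a segmentation loss built from a focal term and a dice term, the following sketch uses the standard binary focal loss and soft dice loss on flat lists of per-pixel probabilities and 0/1 ground-truth values. The unweighted sum of the two terms, the default values `alpha=0.25` and `gamma=2.0`, and all function names are assumptions made for the example rather than values taken from the text.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Mean binary focal loss over pixels; p are probabilities, y are 0/1."""
    total = 0.0
    for pi, yi in zip(p, y):
        pi = min(max(pi, eps), 1.0 - eps)  # keep log() finite
        total -= alpha * yi * (1.0 - pi) ** gamma * math.log(pi)
        total -= (1.0 - alpha) * (1 - yi) * pi ** gamma * math.log(1.0 - pi)
    return total / len(p)


def dice_loss(p, y, eps=1e-7):
    """Soft dice loss: 1 minus the dice overlap of prediction and ground truth."""
    intersection = sum(pi * yi for pi, yi in zip(p, y))
    return 1.0 - (2.0 * intersection + eps) / (sum(p) + sum(y) + eps)


def segmentation_loss(p, y, alpha=0.25, gamma=2.0):
    """Assumed combination: focal term plus dice term, equally weighted."""
    return focal_loss(p, y, alpha, gamma) + dice_loss(p, y)
```

The focal term down-weights pixels that are already classified confidently, which suits heat maps in which signal point pixels are a small minority, while the dice term directly rewards overlap with the labeled regions.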
Subsequently, when the heat map sub-branch and the regression classification sub-branch are trained jointly, the training loss of the regression classification sub-branch ($L_{rc}$) consists of the classification loss of the signal point classification branch ($L_{cls}$) and the regression loss of the signal point regression branch ($L_{reg}$). The training loss of the regression classification sub-branch may be determined using the following formula:

$$L_{rc} = \lambda_1 L_{cls}(p_1, G) + \lambda_2 L_{reg}(p_2, G)$$

wherein $p_1$ and $p_2$ are the output results of the signal point classification branch and the signal point regression branch, respectively, $G$ is the number of sample signal points, $\lambda_1$ is the weight coefficient of the signal point classification branch, and $\lambda_2$ is the weight coefficient of the signal point regression branch. These weight coefficients can also be applied in the actual inference process after model training is finished, to adjust the proportions of the first signal point number and the second signal point number output by the signal point classification branch and the signal point regression branch when the number of signal points is determined. Here, the classification loss $L_{cls}$ of the signal point classification branch may be calculated using the cross-entropy loss, and the regression loss $L_{reg}$ of the signal point regression branch may be calculated using the L1 loss.
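The weighted combination of a cross-entropy classification loss and an L1 regression loss can be sketched for a single cell as follows. The sketch assumes that the classification branch outputs a probability for each of six count classes, that counts above 4 share the last class, and that the regression target is the uncapped count; the function and parameter names are hypothetical.

```python
import math


def cross_entropy(class_probs, target_class, eps=1e-12):
    """Cross-entropy loss for one sample, given per-class probabilities."""
    return -math.log(max(class_probs[target_class], eps))


def l1_loss(predicted_count, target_count):
    """Absolute error between the regressed count and the labeled count."""
    return abs(predicted_count - target_count)


def regression_classification_loss(p1, p2, g,
                                   lambda_cls=1.0, lambda_reg=1.0):
    """Weighted sum of the classification loss and the regression loss.

    p1: class probabilities from the signal point classification branch
    p2: count predicted by the signal point regression branch
    g:  labeled number of sample signal points
    """
    target_class = min(g, len(p1) - 1)  # assumption: last class is "more than 4"
    return (lambda_cls * cross_entropy(p1, target_class)
            + lambda_reg * l1_loss(p2, g))
```

Raising `lambda_cls` relative to `lambda_reg` makes training favor the classification branch, which matches the case where counts are concentrated in a narrow range.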
Based on any one of the above embodiments, there is further provided a method for constructing a cell classification model, the method including:
Constructing a data set: the data set comprises a sample signal fluorescence image of the sample cell under each channel, signal point labels and the number of sample signal points in the sample signal fluorescence image, and a sample cell type of the sample cell. The signal point labels of each channel can be annotated in a pixel-level, fully supervised manner; alternatively, point annotation can be used for weak supervision, or partial annotation for semi-supervision, to improve overall labeling efficiency.
Data preprocessing: labeling mask information of the signal points and the number of signal points of the cells under each channel are extracted from the signal point labels, and identification information related to the different cell types is extracted. If point labeling was adopted in the previous step, a "pseudo mask" label of the signal points is constructed from the point labels through Gaussian kernel convolution, and the number of signal points in each cell is then counted. In addition, boundary information of each cell is obtained from the pixel-level mask labels of the cells, the cells are separated, and the number of signal points of each channel is organized per cell.
Constructing a model: first, an instance segmentation method, such as Mask R-CNN, can be used to perform cell segmentation on the sample cell image to obtain contour information of each cell, and the sample signal fluorescence image of each channel of a single sample cell can be cropped according to the contour information.
As shown in fig. 2, the cell classification model is a two-stage network: the first stage completes the detection of signal points and the regression and/or classification of the number of signal points in each channel, and the second stage performs the cell classification decision based on the various information output by the first stage in combination with the original image.
For the first stage, the fluorescence signal detection submodel for each channel has two main branches: a heat map sub-branch and a regression classification sub-branch. From the ground truth obtained by signal point labeling, the position of each signal point can be extracted, and the number of signal points under each channel can be counted.
The operation flow of the fluorescence signal detection submodel mainly comprises the following steps: the sample signal fluorescence image under each channel is encoded by an encoder (Encoder) to obtain an image feature vector (latent feature). Because the size of each individual cell image after segmentation is 128 × 128, a 4-layer U-Net structure can be adopted to construct the encoder, and the length of the finally obtained image feature vector is 128. The heat map sub-branch then obtains a fluorescence signal heat map (heatmap) of the same size as the input image through a decoder (Decoder) formed by skip connections and upsampling. Here, the decoder can also use a 4-layer upsampling structure corresponding to the encoder, and the final fluorescence signal heat map has a size of 128 × 128.
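Constructing a pseudo mask from point labels can be sketched as follows. Convolving an image of unit impulses with a Gaussian kernel is equivalent to adding one Gaussian blob per labeled point, which is what the sketch does directly. The kernel width `sigma`, the truncation at three standard deviations, the peak value of 1 and the clipping of overlapping blobs to 1 are all assumptions, and `gaussian_pseudo_mask` is a hypothetical name.

```python
import math


def gaussian_pseudo_mask(points, height, width, sigma=2.0):
    """Build a soft mask with one Gaussian blob per (row, col) point label."""
    mask = [[0.0] * width for _ in range(height)]
    radius = int(3 * sigma)  # truncate the kernel at three standard deviations
    for r0, c0 in points:
        for r in range(max(0, r0 - radius), min(height, r0 + radius + 1)):
            for c in range(max(0, c0 - radius), min(width, c0 + radius + 1)):
                dist_sq = (r - r0) ** 2 + (c - c0) ** 2
                value = math.exp(-dist_sq / (2.0 * sigma ** 2))
                # Sum overlapping blobs (as convolution would), clipped to 1.
                mask[r][c] = min(1.0, mask[r][c] + value)
    return mask


def count_signal_points(points):
    """With point labels, the signal point count is the number of points."""
    return len(points)
```

With point labels the per-cell count is available without any connected-component analysis, which is one practical advantage of point annotation over pixel-level masks.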
The signal point classification branch in the regression classification sub-branch (R/C) reduces the dimensions of the image feature vector and the fluorescence signal heat map separately through fully connected layers, concatenates them into a new feature vector, further reduces this vector to a one-dimensional feature through several fully connected layers for the classification decision, and is constrained by the CE loss and the number of signal points. Specifically, the image feature vector and the fluorescence signal heat map are each reduced to 64 dimensions through a fully connected layer and, after concatenation, form a 128-dimensional feature vector, which is further reduced to 32 dimensions and then to 1 dimension through fully connected layers. To prevent overfitting, dropout layers with a ratio of 0.5 are added after the 128-dimensional and 32-dimensional features, respectively. The signal point regression branch has a structure similar to that of the signal point classification branch and is not described again here.
For the second stage, the classification branch is used to fuse the outputs and information of each channel and finally obtain the classification of the cells. Here, the classification branch combines early fusion and late fusion, and the operation flow includes: the sample signal fluorescence images of the sample cell under each channel are superposed to form a new matrix, and image fusion features are obtained through an independent encoder (such as Resnet 16). Since there are 4 channels of signal features, combined with the mask of the sample cell, the resulting matrix is 128 × 128 × 5, and the final feature vector has a length of 64. In addition, the fluorescence signal heat maps of all channels are superposed to form a new matrix, and the heat map fusion features are obtained through another independent encoder. The network structure of this independent encoder is the same as that of the encoder described above, and the final feature vector has a length of 64. Then, the numbers of signal points obtained by the regression classification sub-branches corresponding to the channels are combined in channel order to obtain the statistical fusion features. The final image fusion features, heat map fusion features and statistical fusion features are concatenated, reduced to 32 dimensions through a fully connected layer and then to 1 dimension through another fully connected layer, after which the corresponding classification result is obtained through a sigmoid activation function, with a binary classification constraint applied through the BCE loss.
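The layer sizes described above can be traced with a small bookkeeping sketch. It assumes that the 128 × 128 heat map is flattened before its fully connected layer (the text only states that it is reduced to 64 dimensions) and that every fully connected layer has a bias term; the layer names and helper functions are hypothetical.

```python
def fc_params(in_dim, out_dim):
    """Parameter count of one fully connected layer with bias."""
    return in_dim * out_dim + out_dim


def head_layers(feature_len=128, heatmap_hw=(128, 128)):
    """List (name, input size, output size) for each fully connected layer."""
    heatmap_len = heatmap_hw[0] * heatmap_hw[1]  # assumption: flattened heat map
    return [
        ("fc_feature", feature_len, 64),   # image feature vector -> 64
        ("fc_heatmap", heatmap_len, 64),   # fluorescence signal heat map -> 64
        ("fc_mid", 64 + 64, 32),           # concatenated 128 -> 32 (dropout 0.5 before and after)
        ("fc_out", 32, 1),                 # 32 -> 1
    ]


def total_params(layers):
    """Sum the parameter counts over all listed layers."""
    return sum(fc_params(in_dim, out_dim) for _, in_dim, out_dim in layers)
```

Under these assumptions almost all parameters of the head sit in the layer that reduces the flattened heat map, so a pooling step before that layer would be a natural way to shrink the head if memory were a concern.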
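The final fusion step can be sketched for one cell as follows. The sketch assumes that the three feature groups are joined by plain concatenation, giving 64 + 64 + 4 = 132 values for four channels, and it shows only the sigmoid output and BCE constraint, not the encoders or fully connected layers; all function names are hypothetical.

```python
import math


def fuse_features(image_feature, heatmap_feature, signal_point_counts):
    """Concatenate image, heat map and statistical fusion features."""
    statistical_feature = [float(c) for c in signal_point_counts]
    return list(image_feature) + list(heatmap_feature) + statistical_feature


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_loss(prob, label, eps=1e-7):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # keep log() finite
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))
```

Because the signal point counts enter the fused vector as raw numbers next to learned features, scaling or normalizing them before concatenation may be worth considering in practice.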
The cell classification apparatus based on signal point content constraint provided by the present invention is described below; the cell classification apparatus described below and the cell classification method based on signal point content constraint described above may be referred to in correspondence with each other.
Based on any of the above embodiments, fig. 3 is a schematic structural diagram of the cell classification apparatus based on signal point content constraint according to the present invention. As shown in fig. 3, the apparatus includes: an image segmentation unit 310, a signal point detection unit 320, and a cell classification unit 330.
The image segmentation unit 310 is configured to perform image segmentation on the cell image to obtain a cell contour of each cell, and obtain a signal fluorescence image of each cell under multiple channels based on the cell contour;
the signal point detection unit 320 is configured to perform signal point detection on a signal fluorescence image of any cell in multiple channels based on a fluorescence signal detection submodel of each channel of the cell classification model, so as to obtain a signal point detection result of any cell in multiple channels;
the cell classification unit 330 is configured to perform feature extraction on a fusion image of a signal fluorescence image of any cell in multiple channels and a fusion result of a signal point detection result in multiple channels based on a classification branch of a cell classification model to obtain an image fusion feature and a signal point fusion feature of any cell, and perform cell classification based on the image fusion feature and the signal point fusion feature to obtain a classification result of any cell.
The device provided by the embodiment of the invention performs signal point detection on the signal fluorescence images of any cell under a plurality of channels to obtain the signal point detection results of the cell under the plurality of channels, and extracts features from the fusion image of the signal fluorescence images of the cell under the plurality of channels and from the fusion result of the signal point detection results under the plurality of channels. This enriches the image semantic information in the signal fluorescence images and the signal point information under each channel, yielding the image fusion feature and the signal point fusion feature of the cell. Cell classification is then performed based on the image fusion feature and the signal point fusion feature, so that feature information of all dimensions can supplement and verify each other, and the trained classification branch can distinguish primary from secondary information and select the more reliable feature information from the feature information of all dimensions, thereby effectively improving the accuracy of cell classification.
Based on any embodiment, the sub-model for detecting fluorescence signals of each channel based on the cell classification model performs signal point detection on the fluorescence signal image of any cell under multiple channels to obtain the signal point detection result of any cell under multiple channels, and specifically includes:
performing signal point prediction on a signal fluorescence image of any cell under any channel based on a heat map sub-branch of a fluorescence signal detection sub-model corresponding to any channel to obtain a fluorescence signal heat map of any cell under any channel;
performing signal point regression classification on the signal fluorescence image of any cell under any channel based on the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any channel to obtain the number of signal points of any cell under any channel; the signal point detection result comprises the fluorescence signal heat map and the number of signal points.
Based on any one of the above embodiments, the performing feature extraction on the fusion image of the signal fluorescence image of any one cell under the multiple channels and the fusion result of the signal point detection result under the multiple channels to obtain the image fusion feature and the signal point fusion feature of any one cell specifically includes:
superposing the signal fluorescence images of any cell under a plurality of channels to obtain a fusion image, and then extracting the characteristics of the fusion image to obtain the image fusion characteristics of any cell;
superposing fluorescence signal heat maps of any cell under a plurality of channels to obtain a fusion heat map, and then carrying out feature extraction on the fusion heat map to obtain heat map fusion features of any cell;
fusing the number of signal points of any cell under a plurality of channels to obtain the statistical fusion characteristic of any cell; the signal point fusion features include the heat map fusion features and the statistical fusion features.
Based on any embodiment, the sub-model for detecting fluorescence signals of each channel based on the cell classification model performs signal point detection on the fluorescence signal image of any cell under multiple channels to obtain the signal point detection result of any cell under multiple channels, and specifically includes:
carrying out image coding on a signal fluorescence image of any cell under any channel based on an encoder of a fluorescence signal detection submodel corresponding to any channel to obtain an image feature vector of the signal fluorescence image under any channel;
performing signal point prediction on each pixel in the signal fluorescence image under any channel based on the heat map sub-branch of the fluorescence signal detection sub-model corresponding to any channel and combining the image feature vector to obtain the fluorescence signal heat map of any cell under any channel;
and performing regression classification on signal points in the signal fluorescence image under any channel based on the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any channel by combining the image feature vector and the fluorescence signal heat map to obtain the number of the signal points of any cell under any channel.
Based on any one of the embodiments, the performing regression classification on the signal points in the signal fluorescence image of any one channel based on the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any one channel in combination with the image feature vector and the fluorescence signal heat map to obtain the number of the signal points of any one cell in any one channel specifically includes:
performing signal point classification on signal points in the signal fluorescence image under any channel based on a signal point classification branch of the regression classification subbranch and by combining the image feature vector and the fluorescence signal heat map to obtain a first signal point number of any cell under any channel;
and/or performing signal point regression on signal points in the signal fluorescence image under any channel based on a signal point regression branch of the regression classification sub-branch in combination with the image feature vector and the fluorescence signal heat map to obtain a second signal point number of any cell under any channel;
determining the number of signal points of any cell under any channel based on the first number of signal points and/or the second number of signal points of any cell under any channel.
Based on any of the above embodiments, the cell classification model is obtained by training based on the following steps:
respectively training each channel fluorescence signal detection submodel of the cell classification model based on a sample signal fluorescence image of a sample cell under each channel and signal point labels and the number of sample signal points in the sample signal fluorescence image until the training loss of the fluorescence signal detection submodel corresponding to each channel is less than or equal to a first loss threshold;
and training the classification branches by combining the trained fluorescence signal detection submodels corresponding to the channels based on the sample cell types of the sample cells until the training loss of the classification branches is less than or equal to a second loss threshold value.
Based on any of the above embodiments, the training of the fluorescence signal detection submodels corresponding to the channels of the cell classification model based on the sample signal fluorescence image of the sample cells in each channel and the signal point labels and the number of the sample signal points in the sample signal fluorescence image is respectively performed until the training loss of the fluorescence signal detection submodels corresponding to each channel is less than or equal to the first loss threshold, specifically including:
training the heat map sub-branch of the fluorescence signal detection sub-model corresponding to any channel based on the sample signal fluorescence image of the sample cell in any channel and the signal point label in the sample signal fluorescence image until the training loss of the heat map sub-branch of the fluorescence signal detection sub-model corresponding to any channel is less than or equal to a third loss threshold value;
and performing combined training on the heat map sub-branch and the regression classification sub-branch of the fluorescence signal detection sub-model corresponding to any channel based on the sample signal fluorescence image of the sample cells in any channel and the signal point labels and the number of sample signal points in the sample signal fluorescence image until the training loss of the fluorescence signal detection sub-model corresponding to any channel is less than or equal to a first loss threshold value.
Fig. 4 is a schematic structural diagram of an electronic device provided in the present invention, and as shown in fig. 4, the electronic device may include: a processor (processor)410, a memory (memory)420, a communication Interface (Communications Interface)430 and acommunication bus 440, wherein theprocessor 410, thememory 420 and thecommunication Interface 430 are in communication with each other via thecommunication bus 440.Processor 410 may invoke logic instructions inmemory 420 to perform a cell classification method based on signal point content constraints, the method comprising: performing image segmentation on the cell image to obtain a cell contour of each cell, and acquiring a signal fluorescence image of each cell under a plurality of channels based on the cell contour; performing signal point detection on a signal fluorescence image of any cell under a plurality of channels based on a fluorescence signal detection submodel of each channel of a cell classification model to obtain signal point detection results of any cell under a plurality of channels; and based on a classification branch of a cell classification model, performing feature extraction on a fusion image of the signal fluorescence image of any cell under a plurality of channels and a fusion result of the signal point detection result under the plurality of channels to obtain an image fusion feature and a signal point fusion feature of any cell, and performing cell classification based on the image fusion feature and the signal point fusion feature to obtain a classification result of any cell.
Furthermore, the logic instructions in the memory 420 may be implemented in the form of software functional units and, when sold or used as independent products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the cell classification method based on signal point content constraint provided above, the method comprising: carrying out image segmentation on the cell image to obtain a cell contour of each cell, and acquiring a signal fluorescence image of each cell under a plurality of channels based on the cell contour; performing signal point detection on signal fluorescence images of any cell under a plurality of channels based on each channel fluorescence signal detection submodel of the cell classification model to obtain signal point detection results of any cell under a plurality of channels; and based on a classification branch of a cell classification model, performing feature extraction on a fusion image of the signal fluorescence image of any cell under a plurality of channels and a fusion result of the signal point detection result under the plurality of channels to obtain an image fusion feature and a signal point fusion feature of any cell, and performing cell classification based on the image fusion feature and the signal point fusion feature to obtain a classification result of any cell.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the cell classification method based on signal point content constraint provided above, the method comprising: carrying out image segmentation on the cell image to obtain a cell contour of each cell, and acquiring a signal fluorescence image of each cell under a plurality of channels based on the cell contour; performing signal point detection on signal fluorescence images of any cell under a plurality of channels based on each channel fluorescence signal detection submodel of the cell classification model to obtain signal point detection results of any cell under a plurality of channels; and based on a classification branch of a cell classification model, performing feature extraction on a fusion image of the signal fluorescence image of any cell under a plurality of channels and a fusion result of the signal point detection result under the plurality of channels to obtain an image fusion feature and a signal point fusion feature of any cell, and performing cell classification based on the image fusion feature and the signal point fusion feature to obtain a classification result of any cell.
The above-described embodiments of the apparatus are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement the embodiments without inventive effort.