Disclosure of Invention
In view of the above problems, the present invention provides a method for learning a neuron skeleton-aware morphological characterization based on self-supervised learning, an electronic device, and a storage medium, so as to solve at least one of the above problems.
According to a first aspect of the present invention, there is provided a method for learning a neuron skeleton-aware morphological characterization based on self-supervised learning, comprising:
sampling and extracting features of the neuron point cloud to be processed by using a morphological characterization encoder of the trained skeleton grid generation module, to obtain sampling points of the neuron point cloud to be processed and skeleton-aware morphological characterizations in one-to-one correspondence with the sampling points of the neuron point cloud to be processed;
processing the skeleton-aware morphological characterization by using an offset multi-layer perceptron of the trained skeleton grid generation module to obtain an offset predicted value of the sampling point, and processing the skeleton-aware morphological characterization by using a radius multi-layer perceptron of the trained skeleton grid generation module to obtain a radius predicted value of the sampling point;
generating skeleton balls which represent the three-dimensional geometric features of the neuron point cloud to be processed and are in one-to-one correspondence with the neuron point cloud to be processed, according to the three-dimensional coordinate information of the sampling points, the offset predicted values of the sampling points and the radius predicted values of the sampling points;
classifying the skeleton-aware morphological characterization by using the trained neural network classification model to obtain class information of the neuron point cloud to be processed;
and performing connection prediction on the skeleton balls by using a skeleton graph connection prediction network of the trained skeleton grid generation module to obtain a three-dimensional skeleton grid of the neuron to be processed, and performing three-dimensional rendering on the three-dimensional skeleton grid of the neuron to be processed.
According to an embodiment of the present invention, performing connection prediction on the skeleton balls by using the skeleton graph connection prediction network of the trained skeleton grid generation module to obtain a three-dimensional skeleton grid of the neuron to be processed includes:
performing three-dimensional graph initialization preprocessing on the skeleton balls, and performing a self-supervised encoding and decoding operation on the preprocessed skeleton balls by using a graph autoencoder of the skeleton graph connection prediction network to obtain embedded features of nodes in the preprocessed skeleton balls;
and performing connection prediction on the preprocessed skeleton balls by using the skeleton graph connection prediction network based on the embedded features of the nodes in the preprocessed skeleton balls, to obtain the three-dimensional skeleton grid of the neuron to be processed.
According to a second aspect of the present invention, there is provided a training method of a skeleton grid generation module and a contrast learning module, applied to the self-supervised learning-based neuron skeleton-aware morphological characterization learning method, comprising:
training the skeleton grid generation module by using neuron point cloud samples, and calculating, by weighting with a predefined point-cloud-to-skeleton loss function, a skeleton grid generation loss value in the training process of the skeleton grid generation module;
training the contrast learning module by using the neuron point cloud samples after data enhancement, and calculating a contrast learning loss value in the training process of the contrast learning module by using a predefined contrast learning loss function;
obtaining a joint loss value based on the skeleton grid generation loss value and the contrast learning loss value, and performing joint parameter optimization on the skeleton grid generation module and the contrast learning module by using the joint loss value, based on a weight-sharing mechanism between the morphological characterization encoder of the contrast learning module and the morphological characterization encoder of the skeleton grid generation module;
and iterating the training of the skeleton grid generation module, the calculation of the skeleton grid generation loss value, the training of the contrast learning module, the calculation of the contrast learning loss value, the acquisition of the joint loss value and the joint parameter optimization until a preset training condition is met, so as to obtain the trained skeleton grid generation module and the trained contrast learning module.
According to an embodiment of the present invention, training the skeleton mesh generation module using the neuron point cloud samples, and calculating, by weighting with the predefined point-cloud-to-skeleton loss function, the skeleton mesh generation loss value in the training process of the skeleton mesh generation module includes:
acquiring a neuron point cloud sample, and sampling and extracting features of the neuron point cloud sample by using the morphological characterization encoder of the skeleton grid generation module to obtain sampling points of the neuron point cloud sample and contextual characterizations in one-to-one correspondence with the sampling points of the neuron point cloud sample;
processing the contextual characterization by using the offset multi-layer perceptron of the skeleton grid generation module to obtain an offset predicted value of the sampling point, and processing the contextual characterization by using the radius multi-layer perceptron of the skeleton grid generation module to obtain a radius predicted value of the sampling point;
generating skeleton balls which represent three-dimensional geometric features of the neuron point cloud samples and are in one-to-one correspondence with the neuron point cloud samples according to the three-dimensional coordinate information of the sampling points, the offset predicted value of the sampling points and the radius predicted value of the sampling points;
performing connection prediction on the skeleton ball by using a skeleton graph connection prediction network of the skeleton grid generation module so as to perform three-dimensional reconstruction on the neuron point cloud sample, thereby generating a three-dimensional skeleton grid of the neuron point cloud sample;
and calculating, by weighting with the predefined point-cloud-to-skeleton loss function, the loss values in the sampling and feature extraction operation, the offset and radius prediction operation, the skeleton sphere generation operation and the three-dimensional skeleton grid generation operation, to obtain the skeleton grid generation loss value.
According to an embodiment of the present invention, the acquiring a neuron point cloud sample includes:
extracting boundary points of the segmentation fragments from each frame of image in the neuron electron microscope image data set along a predefined axial direction, and forming the obtained boundary points into neuron point clouds;
and according to the predefined sampling quantity, carrying out average sampling on the neuron point cloud, and normalizing the average sampling result into a predefined unit cube to obtain a neuron point cloud sample.
According to an embodiment of the present invention, the predefined point cloud-to-skeleton loss functions include a sampling loss function, a point-to-sphere loss function, a radius regular penalty loss function, a graph connectivity loss function, an offset regular loss function, and an offset direction loss function;
the skeleton grid generation loss value is obtained by carrying out weighted calculation on a sampling loss value, a point-to-sphere loss value, a radius regular penalty loss value, a graph connectivity loss value, an offset regular loss value and an offset direction loss value.
According to an embodiment of the present invention, training the contrast learning module using the data-enhanced neuron point cloud sample, and calculating the contrast learning loss value in the training process of the contrast learning module using the predefined contrast learning loss function includes:
carrying out data enhancement on the neuron point cloud samples by adopting different data enhancement methods to obtain a first neuron point cloud enhancement sample and a second neuron point cloud enhancement sample;
inputting the first neuron point cloud enhancement sample into an online network of a contrast learning module for processing to obtain a first embedded feature;
inputting the second neuron point cloud enhancement sample into a target network of a contrast learning module for processing to obtain a second embedded feature;
and calculating a loss value in the embedded feature acquisition operation by utilizing a predefined contrast learning loss function to obtain a contrast learning loss value.
According to an embodiment of the present invention, the online network includes a morphological characterization encoder, a mapper, and a predictor;
wherein the morphological characterization encoder of the target network and the morphological characterization encoder of the skeleton grid generation module share parameters and weights;
wherein the target network comprises an encoder and a mapper;
wherein the encoder of the target network is obtained by performing an exponential moving average operation of parameters on the morphological characterization encoder of the online network;
wherein the mapper of the target network is obtained by performing an exponential moving average operation of parameters on the mapper of the online network.
According to a third aspect of the present invention, there is provided an electronic device comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform a neuron skeleton-aware morphology characterization learning method and a training method of the skeleton mesh generation module and the contrast learning module based on self-supervised learning.
According to a fourth aspect of the present invention, there is provided a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the method for learning a neuron skeleton-aware morphological characterization based on self-supervised learning and the training method of the skeleton grid generation module and the contrast learning module.
According to the self-supervised learning-based neuron skeleton-aware morphological characterization learning method provided by the invention, the trained skeleton grid generation module is used to extract morphological characterizations of unlabeled neurons and thereby obtain their key geometric structures, which enables efficient rendering of the three-dimensional structures of the unlabeled neurons and facilitates morphological analysis and research by those skilled in the art. Classification of the unlabeled neurons is realized by using a neural network classification model, which addresses the problem that large-scale neuron electron microscope images in this field lack annotation information.
Detailed Description
The present invention will be further described in detail below with reference to specific embodiments and with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent.
Neuronal morphology analysis is of broad and important interest in neuroscience research. In recent years, the development of high-resolution microscopic imaging techniques such as optical microscopy (OM) and electron microscopy (EM) has produced massive data, for example the optical microscopy dataset BigNeuron containing about 20,000 3D optical microscopic image blocks, the 12TB Drosophila whole-brain electron microscope image dataset FAFB, the mouse cortex (P87) electron microscope dataset MICrONS covering 1.4mm x 0.87mm x 0.84mm, and the 1.4PB human cerebral cortex electron microscope dataset H01 with a volume of about 1mm³. These data enable detailed observations of neuronal morphology, and researchers have proposed segmentation methods for neuronal reconstruction from optical (OM) and electron microscope (EM) images. However, due to high-resolution imaging, the segmentation result usually contains tens of thousands or even millions of voxels, and the number of over-segmented fragments is very large; for example, FAFB-FFN1 contains about 10^10 individual segments, which presents significant challenges for 3D rendering, morphology analysis, and morphology-based neuron classification.
In recent years, a number of neuronal morphology characterization methods have been proposed. Some methods describe the dendritic structure of neurons by quantitative measurements, often employing manually designed metrics such as the total length and width of the neuron, the surface area of the neuron cell body, the number of branches attached to the cell body, and the branching angles of the neuron. Subsequently, other methods extract the morphological characteristics of neurons from the neuron tree structure using various conventional algorithms. For example, the TMD method introduces a topological morphology descriptor to encode the neuronal morphology as barcode features. The NeuroPath2Path method uses a path-based model to extract the morphological features of neurons. However, the manual features employed by these methods often have limited characterization capabilities, making it difficult to distinguish fine-grained differences in large-scale neuronal morphology data.
Recently, some learning-based methods have begun to improve the performance of neuronal fragment morphology characterization through deep learning techniques. For example, morpeonet uses a self-supervised method to learn three-dimensional morphological representations from point cloud data extracted from large-scale electron microscope cell segmentation fragments, exploiting contrast learning to model at the instance and cluster levels without any manual labeling. Other researchers extract the dendrite skeleton from the neuron reconstruction result and extract morphological feature representations from the skeleton graph using self-supervised learning.
The neuron morphological characterization finally obtained by the method can be used for classifying and identifying neurons, and has important application value in the fields of automatic analysis, mining, real-time tracking and the like of neuron microscope images. At the same time, three-dimensional visualization and real-time display of neurons are also important, and researchers can intuitively observe and understand morphological characteristics and geometric structures of neurons through real-time rendering of three-dimensional surfaces of the neurons, so that the role of the neurons in a brain model is deeply explored. While the skeleton, as an abstract representation of a three-dimensional shape, can be very naturally used for three-dimensional visualization of neurons.
In the fields of computer vision and computer graphics, skeleton extraction has been widely studied. Related researchers introduced the L1-medial skeleton as a curve skeleton representation for three-dimensional point cloud data, which can be applied directly to unoriented raw point cloud scan data with significant noise, outliers and large missing regions. In FAFB-FFN1, the TEASAR algorithm is used to extract a curve skeleton representation from the EM neuron segmentation fragments. However, the curve skeletons produced by the above algorithms capture only tubular geometric information and tend to produce significant errors for non-tubular structures. Other researchers have proposed a novel point set representation that can generate skeletons resembling curved surfaces and curves. However, these skeletal representations are unstructured sets of points, lack topological constraints, and perform poorly on certain tubular structures.
Another classical skeleton representation for encoding three-dimensional shapes is the medial axis transform (Medial Axis Transform, MAT). For a three-dimensional shape, the MAT is defined as the set of interior points that have more than one closest point on the boundary surface, together with an associated radius function, so that the MAT encodes the shape into a low-dimensional representation. There are many methods for extracting the MAT of a three-dimensional shape. Related researchers proposed an algorithm for extracting the medial axis of a planar polygon using a Voronoi diagram and bisectors: the idea is to first construct the Voronoi diagram of the polygon, sequentially compute the bisectors of the boundary elements, then delete the edges related to the reflex vertices one by one, and finally obtain the medial axis of the model. In addition, related researchers proposed a method that first performs Delaunay triangulation on the interior of the polygon, then computes the circumcenters of all triangles, and finally connects the circumcenters in sequence to obtain the medial axis of the polygon. Other researchers have proposed a method for obtaining the medial axis of a model based on a discrete bisector function.
Although the definition of the MAT is simple, there are two difficulties in using it in practice: (1) the MAT is computationally expensive, because it requires the input three-dimensional shape to be defined by a closed boundary surface and needs a significant amount of geometric processing time; (2) the MAT is very sensitive to surface noise, i.e., small perturbations of the shape surface can cause many unimportant branches, and such noise prevents the MAT from clearly reflecting the structure of a given shape. Related researchers have proposed Q-MAT, which uses a quadratic error minimization method to compute a simplified MAT free of unstable branches. However, all of these methods require hand-crafted rules and expensive geometric processing time.
Recently, deep learning-based methods have been proposed for predicting a skeleton from a three-dimensional mesh or point cloud. Some researchers first convert the three-dimensional mesh into a volumetric grid representation, then predict the probabilities of skeleton points and skeleton lines with a stacked hourglass network model, and finally generate a skeleton via a minimum spanning tree algorithm. Point2Skeleton proposes a joint representation of skeleton and mesh, called the skeletal mesh, which is a more general way of simplifying the representation of three-dimensional shapes. However, because Point2Skeleton is designed primarily for man-made three-dimensional objects, the skeleton points are predicted as convex combinations of the input point cloud; if the input shape is concave, the predicted skeleton points are likely to lie outside the shape, and such points are referred to as outliers. Therefore, when extracting the skeletons of neurons with significant structural differences and rich shape details, Point2Skeleton may generate outlier skeleton points that deviate from the original data, so that the final connected skeleton cannot keep a topological structure similar to the input shape.
In order to solve various technical problems in the prior art and to learn the morphological characterization of neuron fragments more efficiently, the invention provides a neuron skeleton-aware morphological characterization learning method based on self-supervised learning.
Morphological analysis becomes very challenging due to the extremely large data volume of high-resolution microscopy imaging. Some existing segmentation methods can segment high-resolution microscopic images to obtain 3D segmentation fragments, but because of the high resolution of the microscopic images, these fragments typically contain tens of thousands or even millions of voxels, making 3D rendering, morphology analysis, and morphology-based neuron classification very challenging for manual analysis. In view of these problems, the present invention provides a self-supervised method for learning skeleton-aware morphology characterizations from super-large-scale 3D neuron segmentation fragments, while simultaneously producing a simplified three-dimensional model of the neuron fragments that preserves key geometric structures, so as to support efficient rendering and morphological analysis.
FIG. 1 is a flow chart of a method for learning a neuron skeleton-aware morphology characterization based on self-supervised learning according to an embodiment of the present invention.
As shown in fig. 1, the method for learning a neuron skeleton-aware morphology characterization based on self-supervised learning includes operations S110 to S150.
In operation S110, sampling and feature extraction are performed on the neuron point cloud to be processed by using the morphological characterization encoder of the trained skeleton grid generation module, so as to obtain sampling points of the neuron point cloud to be processed and skeleton-aware morphological characterizations in one-to-one correspondence with the sampling points of the neuron point cloud to be processed.
In operation S120, the skeleton-aware morphology characterization is processed by using the offset multi-layer perceptron of the skeleton-mesh generating module after training to obtain an offset predicted value of the sampling point, and the skeleton-aware morphology characterization is processed by using the radius multi-layer perceptron of the skeleton-mesh generating module after training to obtain a radius predicted value of the sampling point.
In operation S130, skeleton balls representing three-dimensional geometric features of the neuron point cloud to be processed and corresponding to the neuron point cloud to be processed one by one are generated according to the three-dimensional coordinate information of the sampling points, the offset predicted values of the sampling points and the radius predicted values of the sampling points.
In operation S140, the trained neural network classification model is used to classify the skeleton-aware morphology characterization, so as to obtain class information of the neuron point cloud to be processed.
Fig. 2 is a flowchart of acquiring class information of a neuron point cloud to be processed according to an embodiment of the present invention.
As shown in fig. 2, after the N×3 neuron point cloud is processed by the skeleton grid generation module, a 1×d skeleton-aware morphological characterization is obtained, and this 1×d characterization is input into the classification model to obtain class probabilities and thus the class information of the neuron point cloud.
In the skeleton extraction task on the surface point clouds of electron microscope segmentation fragments, the skeleton-aware shape characterization of the neuron fragment is obtained through the joint training of the skeleton grid generation module and the contrast learning module, and the specific category of the neuron fragment can be obtained by feeding this characterization to a neural network classification model; the specific scheme is shown in fig. 2.
The skeleton-aware morphological characterization learned by the method provided by the invention can support the selection of specific subcellular substructures to assist subsequent neuron reconstruction and analysis work. In order to test the performance of this self-supervised representation method on the subcellular structure classification task, the invention adopts the evaluation strategies commonly used in contrast learning, namely linear evaluation and semi-supervised evaluation.
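For illustration only (not forming part of the claimed method), the linear evaluation strategy mentioned above can be sketched as follows: the trained morphological characterization encoder is frozen and only a single linear layer is trained on the 1×d skeleton-aware characterization. The feature dimension, the number of classes, and all names in this sketch are assumptions made for the example.

```python
import torch
import torch.nn as nn

class LinearEvaluationHead(nn.Module):
    """Linear probe over a frozen 1 x d skeleton-aware characterization.

    feat_dim and num_classes are illustrative assumptions (e.g. d = 256 and
    three classes such as cell body / nerve fiber / glial cell).
    """
    def __init__(self, feat_dim: int = 256, num_classes: int = 3):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes)

    def forward(self, embedding: torch.Tensor) -> torch.Tensor:
        # embedding: (batch, feat_dim) characterization from the frozen encoder
        return self.fc(embedding)

# Usage sketch corresponding to fig. 2: characterization -> class probabilities.
head = LinearEvaluationHead()
embedding = torch.randn(4, 256)          # stands in for the encoder output
class_probs = head(embedding).softmax(dim=-1)
```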
In operation S150, connection prediction is performed on the skeleton ball by using the skeleton graph connection prediction network of the skeleton mesh generation module after training to obtain a three-dimensional skeleton mesh of the neuron to be processed, and three-dimensional rendering is performed on the three-dimensional skeleton mesh of the neuron to be processed.
According to the self-supervised learning-based neuron skeleton-aware morphological characterization learning method provided by the invention, the trained skeleton grid generation module is used to extract morphological characterizations of unlabeled neurons and thereby obtain their key geometric structures, which enables efficient rendering of the three-dimensional structures of the unlabeled neurons and facilitates morphological analysis and research by those skilled in the art. Classification of the unlabeled neurons is realized by using a neural network classification model, which addresses the problem that large-scale neuron electron microscope images in this field lack annotation information.
According to an embodiment of the present invention, performing connection prediction on the skeleton balls by using the skeleton graph connection prediction network of the trained skeleton grid generation module to obtain the three-dimensional skeleton grid of the neuron to be processed includes: performing three-dimensional graph initialization preprocessing on the skeleton balls, and performing a self-supervised encoding and decoding operation on the preprocessed skeleton balls by using a graph autoencoder of the skeleton graph connection prediction network to obtain embedded features of nodes in the preprocessed skeleton balls; and performing connection prediction on the preprocessed skeleton balls by using the skeleton graph connection prediction network based on the embedded features of the nodes, to obtain the three-dimensional skeleton grid of the neuron to be processed.
The above method provided by the present invention is described in further detail below in conjunction with the specific embodiments and fig. 3.
Fig. 3 is a schematic diagram of obtaining neuron classification information and a three-dimensional skeleton grid from an ultra-high resolution electron microscope image according to an embodiment of the present invention.
As shown in fig. 3, the ultra-high-resolution electron microscope image is first segmented and the surface point cloud of each segmentation fragment is extracted to obtain the neuron to be processed; at this point the neuron to be processed carries no annotation information, and those skilled in the art do not know its class information or its three-dimensional skeleton grid. Then, the skeleton-aware morphological characterization and the skeleton balls of the neuron to be processed are obtained by using the trained skeleton grid generation module; the skeleton-aware morphological characterization is processed by the neural network classification model to obtain the classification information of the neuron, and connection prediction and three-dimensional rendering are performed on the skeleton balls to obtain a rendered skeleton grid for relevant research by those skilled in the art.
The methods shown in figs. 1 to 3 all make use of a trained skeleton mesh generation module. This module is constructed around a skeleton-aware morphological characterization network for learning expressive and discriminative morphological representations of the original 3D point cloud; by introducing a self-supervised learning framework, it connects skeleton-aware shape simplification with morphology-based classification of neuron fragments, and multi-task joint training enhances the discriminability of the learned skeleton-aware morphological representations, yielding the trained skeleton mesh generation module. The joint training process of the skeleton mesh generation module is described in further detail below with reference to fig. 3 and specific embodiments.
Fig. 4 is a flowchart of a training method of the skeletal mesh generation module and the contrast learning module in accordance with an embodiment of the present invention.
As shown in fig. 4, the training method of the skeleton grid generation module and the contrast learning module applied to the neuron skeleton perception morphology characterization learning method based on self-supervision learning includes operations S410 to S440.
In operation S410, the skeleton mesh generation module is trained using the neuron point cloud samples, and the skeleton mesh generation loss value in the training process of the skeleton mesh generation module is calculated by weighting with the predefined point-cloud-to-skeleton loss function.
In operation S420, the contrast learning module is trained using the data-enhanced neuron point cloud samples, and a contrast learning loss value during the contrast learning module training is calculated using a predefined contrast learning loss function.
In operation S430, a joint loss value is obtained based on the skeleton grid generation loss value and the contrast learning loss value, and joint parameter optimization is performed on the skeleton grid generation module and the contrast learning module using the joint loss value, based on the weight-sharing mechanism between the morphological characterization encoder of the contrast learning module and the morphological characterization encoder of the skeleton grid generation module.
In operation S440, the training of the skeleton grid generation module, the calculation of the skeleton grid generation loss value, the training of the contrast learning module, the calculation of the contrast learning loss value, the acquisition of the joint loss value and the joint parameter optimization are iterated until a preset training condition is satisfied, yielding the trained skeleton grid generation module and the trained contrast learning module.
Through the self-supervised joint training of the skeleton grid generation module and the contrast learning module, the perception capability and the discriminability of the morphological characterization produced by the trained skeleton grid generation module can be enhanced.
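A compact sketch of how operations S410 to S440 can fit together in one training iteration, using PyTorch-style stand-ins, is given below. Everything here is illustrative: the tiny linear layers stand in for the PointNet++ encoder, the MLP heads and the graph autoencoder; the two loss terms are placeholders for the predefined point-cloud-to-skeleton loss and contrast learning loss; the joint-loss weighting and the EMA decay of 0.99 are assumptions; and, for simplicity, the gradient-trained encoder is shared between the two branches, which is one possible reading of the weight-sharing mechanism described above.

```python
import torch
import torch.nn as nn

# Minimal stand-ins so the sketch runs; the real encoder, MLP heads, GAE and
# loss terms are those described in the embodiments below.
shared_encoder = nn.Linear(3, 64)        # morphological characterization encoder (shared)
offset_radius_head = nn.Linear(64, 4)    # offset (3 dims) + radius (1 dim)
projector, predictor = nn.Linear(64, 32), nn.Linear(32, 32)
target_encoder, target_projector = nn.Linear(3, 64), nn.Linear(64, 32)
target_encoder.load_state_dict(shared_encoder.state_dict())
target_projector.load_state_dict(projector.state_dict())

optimizer = torch.optim.Adam(
    [*shared_encoder.parameters(), *offset_radius_head.parameters(),
     *projector.parameters(), *predictor.parameters()], lr=1e-3)

def augment(pc):                         # placeholder data enhancement
    return pc + 0.01 * torch.randn_like(pc)

for step in range(10):                   # S410-S440, iterated until convergence
    pc = torch.rand(2048, 3)             # one neuron point cloud sample (N x 3)

    # S410: skeleton mesh branch (the loss term here is only a placeholder).
    feat = shared_encoder(pc)
    offsets = offset_radius_head(feat)[:, :3]
    loss_skel = offsets.norm(dim=1).mean()

    # S420: contrast branch, BYOL-style (see the loss sketch later in this text).
    online = predictor(projector(shared_encoder(augment(pc)).mean(0, keepdim=True)))
    with torch.no_grad():
        target = target_projector(target_encoder(augment(pc)).mean(0, keepdim=True))
    loss_cl = 2 - 2 * nn.functional.cosine_similarity(online, target).mean()

    # S430: joint loss; the relative weighting of the two terms is assumed.
    loss = loss_skel + loss_cl
    optimizer.zero_grad(); loss.backward(); optimizer.step()

    # S440 side effect: exponential moving average update of the target network.
    with torch.no_grad():
        for t, o in zip(target_encoder.parameters(), shared_encoder.parameters()):
            t.mul_(0.99).add_(o, alpha=0.01)
        for t, o in zip(target_projector.parameters(), projector.parameters()):
            t.mul_(0.99).add_(o, alpha=0.01)
```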
According to an embodiment of the present invention, training the skeleton mesh generation module using the neuron point cloud samples, and calculating, by weighting with the predefined point-cloud-to-skeleton loss function, the skeleton mesh generation loss value in the training process of the skeleton mesh generation module includes: acquiring a neuron point cloud sample, and sampling and extracting features of the neuron point cloud sample by using the morphological characterization encoder of the skeleton grid generation module to obtain sampling points of the neuron point cloud sample and contextual characterizations in one-to-one correspondence with the sampling points; processing the contextual characterization by using the offset multi-layer perceptron of the skeleton grid generation module to obtain an offset predicted value of the sampling point, and processing the contextual characterization by using the radius multi-layer perceptron of the skeleton grid generation module to obtain a radius predicted value of the sampling point; generating skeleton balls which represent the three-dimensional geometric features of the neuron point cloud sample and are in one-to-one correspondence with the neuron point cloud sample, according to the three-dimensional coordinate information of the sampling points, the offset predicted values of the sampling points and the radius predicted values of the sampling points; performing connection prediction on the skeleton balls by using the skeleton graph connection prediction network of the skeleton grid generation module so as to three-dimensionally reconstruct the neuron point cloud sample, thereby generating a three-dimensional skeleton grid of the neuron point cloud sample; and calculating, by weighting with the predefined point-cloud-to-skeleton loss function, the loss values in the sampling and feature extraction operation, the offset and radius prediction operation, the skeleton sphere generation operation and the three-dimensional skeleton grid generation operation, to obtain the skeleton grid generation loss value.
According to an embodiment of the present invention, the acquiring a neuron point cloud sample includes: extracting boundary points of the segmentation fragments from each frame of image in the neuron electron microscope image data set along a predefined axial direction, and forming the obtained boundary points into neuron point clouds; and according to the predefined sampling quantity, carrying out average sampling on the neuron point cloud, and normalizing the average sampling result into a predefined unit cube to obtain a neuron point cloud sample.
Fig. 5 is a schematic diagram of neuronal fragment morphology data according to an embodiment of the invention.
For high-resolution electron microscope images, the segmentation result usually comprises thousands or even millions of voxels. In order to analyze their morphology more efficiently, the surface point clouds of the electron microscope segmentation fragments are extracted first, giving a point cloud dataset of electron microscope segmentation fragments (shown in fig. 5). The point cloud data used by the invention are collected from each fragment segmented by FAFB-FFN1. First, the boundary of the segmentation fragment is extracted for each frame along the z-axis direction, and then all boundary points are assembled into a point cloud. However, the number of points varies greatly between fragments, so a fixed number of points needs to be sampled on average from the dense surface point cloud and normalized into a unit cube. This makes the network focus on morphological and structural features rather than the absolute size of the point cloud.
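A small illustrative sketch of the sampling and normalization step described above: a fixed number of points is drawn from the dense surface point cloud and the result is rescaled into a unit cube. The sample count of 2048 matches the number mentioned for the H01 experiments; the use of NumPy and the function names are assumptions of this example.

```python
import numpy as np

def sample_and_normalize(boundary_points: np.ndarray, num_samples: int = 2048) -> np.ndarray:
    """Uniformly sample a fixed number of surface points and normalize to a unit cube.

    boundary_points: (N, 3) array of segment boundary points stacked over frames.
    """
    idx = np.random.choice(len(boundary_points), size=num_samples,
                           replace=len(boundary_points) < num_samples)
    pts = boundary_points[idx].astype(np.float64)

    # Shift to the origin and rescale so the longest axis fits in [0, 1]^3,
    # keeping morphology and structure while discarding absolute size.
    pts -= pts.min(axis=0)
    extent = pts.max(axis=0).max()
    return pts / extent if extent > 0 else pts

# Usage sketch: dense boundary points of one segmentation fragment.
dense_cloud = np.random.rand(50000, 3) * 1000.0
sample = sample_and_normalize(dense_cloud)   # (2048, 3) inside the unit cube
```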
According to an embodiment of the present invention, the predefined point cloud-to-skeleton loss functions include a sampling loss function, a point-to-sphere loss function, a radius regular penalty loss function, a graph connectivity loss function, an offset regular loss function, and an offset direction loss function; the skeleton grid generation loss value is obtained by carrying out weighted calculation on a sampling loss value, a point-to-sphere loss value, a radius regular penalty loss value, a graph connectivity loss value, an offset regular loss value and an offset direction loss value.
According to an embodiment of the present invention, training the contrast learning module using the data-enhanced neuron point cloud sample, and calculating the contrast learning loss value in the training process of the contrast learning module using the predefined contrast learning loss function includes: carrying out data enhancement on the neuron point cloud samples by adopting different data enhancement methods to obtain a first neuron point cloud enhancement sample and a second neuron point cloud enhancement sample; inputting the first neuron point cloud enhancement sample into an online network of a contrast learning module for processing to obtain a first embedded feature; inputting the second neuron point cloud enhancement sample into a target network of a contrast learning module for processing to obtain a second embedded feature; and calculating a loss value in the embedded feature acquisition operation by utilizing a predefined contrast learning loss function to obtain a contrast learning loss value.
According to an embodiment of the present invention, the online network includes a morphological characterization encoder, a mapper, and a predictor; the morphological characterization encoder of the target network and the morphological characterization encoder of the skeleton grid generation module share parameters and weights; the target network comprises an encoder and a mapper; the encoder of the target network is obtained by performing an exponential moving average operation of parameters on the morphological characterization encoder of the online network; and the mapper of the target network is obtained by performing an exponential moving average operation of parameters on the mapper of the online network.
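A minimal sketch of the exponential moving average (EMA) update described above, in which the target network's encoder and mapper track the corresponding online network components. The decay coefficient of 0.99 and the module shapes are assumptions of the example.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def ema_update(target: nn.Module, online: nn.Module, decay: float = 0.99) -> None:
    """target <- decay * target + (1 - decay) * online, parameter by parameter."""
    for t_param, o_param in zip(target.parameters(), online.parameters()):
        t_param.mul_(decay).add_(o_param, alpha=1.0 - decay)

# Usage sketch: structurally identical encoder/mapper pairs with separate weights.
online_encoder, target_encoder = nn.Linear(3, 64), nn.Linear(3, 64)
online_mapper, target_mapper = nn.Linear(64, 32), nn.Linear(64, 32)
ema_update(target_encoder, online_encoder)
ema_update(target_mapper, online_mapper)
```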
The skeleton-aware morphology characterization network includes a Skeleton Mesh Generation Module (SMGM), which extracts a simplified skeleton mesh (skeletal mesh) of the input point cloud, and a Contrast Learning Module (CLM).
In the skeleton grid generation module, the invention proposes a learning scheme based on sampling points and offsets, which can generate a simplified three-dimensional model of a neuron fragment that preserves the key geometric structure, improves the robustness of shape abstraction for neuron fragments, and supports detailed observation of the neuron fragment's shape. The contrast learning module learns a more discriminative morphological characterization using the classical contrast learning framework BYOL. Through the joint training of the two modules, a skeleton-aware morphological characterization of the neuron can be obtained, and based on this characterization a neural network classification model can be attached to obtain the specific category of the neuron fragment, so as to assist subsequent neuron reconstruction and analysis work.
Through the above method, the invention can efficiently learn skeleton-aware neuron morphological characterizations to support efficient rendering and morphological analysis, and can simultaneously obtain a simplified shape of the neuron fragment, thereby improving the three-dimensional rendering efficiency of the segmentation fragments and reducing their data storage requirements. Experiments are conducted on the adult Drosophila FAFB-FFN1 dataset and the H01 human cerebral cortex electron microscope image segmentation dataset, and the method provided by the invention achieves the current state-of-the-art performance in skeleton extraction of neuron fragments and in morphology-based neuron classification.
The self-supervised learning-based neuron skeleton-aware morphological characterization learning method provided by the invention, and the training method of the skeleton grid generation module and the contrast learning module, are further described in detail below through a specific embodiment and with reference to fig. 6.
Fig. 6 is a schematic diagram of an architecture for learning and training a neuron skeleton-aware morphological characterization based on self-supervised learning according to an embodiment of the present invention.
As shown in fig. 6, the objective of the present invention is to learn a skeleton-aware morphological representation e from a 3D point cloud P sampled from the surface of a neuron fragment. The learned morphology vector e can be used to reconstruct a simplified three-dimensional morphological model and to classify cell types. Specifically, the method provided by the invention comprises two modules, a Skeleton Mesh Generation Module (SMGM) and a Contrast Learning Module (CLM): the skeleton mesh generation module extracts a simplified skeleton mesh (skeletal mesh) of the input point cloud, and the contrast learning module learns a more discriminative morphological characterization. The two modules share a morphological characterization encoder and follow a multi-task learning scheme to learn a skeleton-aware morphological characterization.
For the skeleton mesh generation module, the goal is to predict a simplified surface model while preserving the dominant morphological features of the input shape. The present invention uses a skeletal mesh representation that can simultaneously express skeletons of both tubular structures and mesh-like structures. The skeletal mesh of a 3D shape is a discrete 2D non-manifold defined by a set of skeleton points, edges and faces, and can be regarded as an abstract representation of the shape. The skeleton mesh generation comprises two steps: skeleton point prediction and skeleton graph connection prediction. However, because existing approaches obtain skeleton points as linear combinations of the input point cloud, they tend to generate outlier skeleton points; the invention therefore proposes a brand-new skeleton point generation scheme to generate more reasonable skeleton points.
For skeleton point prediction, following the Point2Skeleton setting, given an input point cloud, a set of skeleton spheres {s_i = (c_i, r_i), i = 1, ..., M} is first predicted, where M is smaller than the number N of input points. Each skeleton sphere is represented by its center c = (x, y, z) and radius r. To extract the contextual features of the input shape, the encoder f uses PointNet++, producing the sampled points and their context features F_i. In contrast to the Point2Skeleton method, which uses a multi-layer perceptron (MLP) to predict a convex combination of the input sample points to generate the skeleton points, the present invention takes the M sampled points as reference skeleton points and further predicts an offset v for each reference point. Because skeleton points can be considered local centers of a set of surface points, the present invention predicts, from the encoded shape context feature F_i, the offset that moves the reference point to the local surface center, denoted v_i = f_offset(F_i). The present invention also uses an MLP to predict the radius of each skeleton sphere, i.e., r_i = f_radius(F_i).
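The offset and radius prediction heads described above can be sketched as follows. The context feature dimension, the hidden sizes of the MLPs, and the tanh/softplus activations are assumptions of this example; only the overall structure (per-point offset v_i = f_offset(F_i), radius r_i = f_radius(F_i), and sphere center obtained by adding the offset to the reference point) follows the description.

```python
import torch
import torch.nn as nn

class SkeletonSphereHead(nn.Module):
    """Predict per-point offsets and radii from encoded context features."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.f_offset = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                      nn.Linear(64, 3), nn.Tanh())      # bounded offset (assumed)
        self.f_radius = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                      nn.Linear(64, 1), nn.Softplus())  # positive radius (assumed)

    def forward(self, sample_points: torch.Tensor, context_feats: torch.Tensor):
        # sample_points: (M, 3) reference points; context_feats: (M, feat_dim) F_i
        offsets = self.f_offset(context_feats)          # v_i = f_offset(F_i)
        radii = self.f_radius(context_feats)            # r_i = f_radius(F_i)
        centers = sample_points + offsets               # sphere center from reference + offset
        return centers, radii, offsets

# Usage sketch with M = 512 reference points.
head = SkeletonSphereHead()
reference_points = torch.rand(512, 3)
context_features = torch.randn(512, 128)
centers, radii, offsets = head(reference_points, context_features)
```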
In the skeleton graph connection prediction step, based on the skeleton spheres, the invention then predicts the connectivity between the predicted skeleton points to generate the skeleton mesh. Three steps are adopted: graph initialization, graph connection prediction via a graph autoencoder (GAE), and skeleton mesh generation, finally producing a compact and structurally meaningful skeleton representation.
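A minimal sketch of the graph connection prediction step: node embeddings produced by a graph autoencoder are decoded with an inner product to score candidate links between skeleton spheres, and high-scoring links become edges of the skeleton mesh. The one-layer message-passing encoder, the 0.5 threshold, and the initialization of the adjacency from the spheres themselves are all assumptions of this example; the actual graph initialization and GAE design follow the embodiments above.

```python
import torch
import torch.nn as nn

class TinyGraphAutoencoder(nn.Module):
    """One-layer GAE-style encoder with an inner-product link decoder."""
    def __init__(self, in_dim: int = 4, emb_dim: int = 16):
        super().__init__()
        self.lin = nn.Linear(in_dim, emb_dim)

    def encode(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Simple normalized neighborhood aggregation followed by a linear map.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.lin((adj @ node_feats) / deg))

    def decode(self, emb: torch.Tensor) -> torch.Tensor:
        # Inner-product decoder: probability of a link between every node pair.
        return torch.sigmoid(emb @ emb.t())

# Usage sketch: node features are sphere centers plus radii; the initial graph
# here is only self-loops (real graph initialization is described separately).
centers, radii = torch.rand(32, 3), torch.rand(32, 1)
node_feats = torch.cat([centers, radii], dim=1)            # (32, 4)
adj_init = torch.eye(32)
gae = TinyGraphAutoencoder()
link_probs = gae.decode(gae.encode(node_feats, adj_init))  # (32, 32) link scores
predicted_edges = (link_probs > 0.5).nonzero()             # candidate skeleton mesh edges
```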
Regarding the loss functions in the training of the skeleton mesh generation module, the present invention uses the four loss functions proposed in Point2Skeleton to train the skeleton mesh generation branch, namely the sampling loss L_s, the point-to-sphere loss L_p, the radius regularization penalty L_r and the graph connectivity loss L_link, where L_s and L_p measure reconstruction errors, L_r encourages larger skeleton sphere radii, and L_link measures the reconstruction error between the input and output skeleton graphs. Compared with Point2Skeleton, the method of the invention not only designs a new way of generating reference skeleton points through offset prediction, but also introduces a new offset regularization loss L_offset and an offset direction loss L_dir, so as to encourage the generated skeleton points to stay close to the shape surface while remaining inside the shape.
For the offset regularization loss, the offsets of the sampling points should not be too large, so as to maintain geometric consistency between the sampled reference points and the final skeleton points; the present invention therefore defines an offset regularization loss L_offset, shown in equation (1), that encourages smaller offsets. For the offset direction loss, in order for the generated skeleton points to lie inside the local shape, the present invention defines an offset direction loss L_dir, shown in equation (2), that encourages each sampled point to move in the direction opposite to the surface normal, where n_i is the estimated surface normal vector at the i-th sampled reference point.
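The original equations (1) and (2) are not reproduced in this text. A plausible reconstruction consistent with the surrounding description (an L2 penalty on the offsets and a cosine-based term that pushes each offset opposite to the estimated surface normal) could read as follows; the exact normalization and form may differ from the authors' definition.

```latex
L_{offset} = \frac{1}{M}\sum_{i=1}^{M}\left\lVert v_i \right\rVert_2^2
\qquad\text{(1, assumed form)}

L_{dir} = \frac{1}{M}\sum_{i=1}^{M}
\left(1 + \frac{v_i \cdot n_i}{\lVert v_i\rVert_2\,\lVert n_i\rVert_2}\right)
\qquad\text{(2, assumed form)}
```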
Thus, the total loss function for skeletal mesh generation is a weighted combination of the following losses, as shown in equation (3):
L_skel = L_link + L_s + λ_p·L_p + λ_offset·L_offset + λ_r·L_r + λ_dir·L_dir    (3)
For the contrast learning module, the present invention follows a classical contrast learning framework consisting of an online network and a target network, as shown in fig. 6. The online network consists of three parts: an encoder, a mapper (projector), and a predictor. The target network does not contain a predictor; its encoder and projector have the same structure as those of the online network but use different weights. The target network provides a regression target for training the online network, and its parameters are exponential moving averages of the online network parameters. During training, the contrast learning module applies two different data enhancements to the same segment and uses the online network and the target network to enhance the similarity between the two resulting embeddings. The instance contrast loss function used in the present invention is the one introduced in BYOL [34].
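A sketch of the BYOL-style instance contrast loss mentioned above: the online predictor output is regressed onto the stop-gradient target projection with a normalized mean squared error, which is equivalent to 2 − 2·cosine similarity; the symmetrization over the two augmented views follows the usual BYOL recipe, and the tensor shapes are assumptions of this example.

```python
import torch
import torch.nn.functional as F

def byol_loss(online_pred: torch.Tensor, target_proj: torch.Tensor) -> torch.Tensor:
    """Normalized MSE between online prediction and target projection (per view)."""
    p = F.normalize(online_pred, dim=-1)
    z = F.normalize(target_proj.detach(), dim=-1)   # stop-gradient on the target branch
    return (2 - 2 * (p * z).sum(dim=-1)).mean()

# Usage sketch: symmetrized over the two data-enhanced views of one segment.
p1, p2 = torch.randn(8, 128), torch.randn(8, 128)   # online predictions, view 1 / view 2
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)   # target projections, view 1 / view 2
loss_cl = byol_loss(p1, z2) + byol_loss(p2, z1)
```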
Therefore, the joint loss function of the contrast learning module and the skeleton extraction module combines the contrast learning loss with the skeleton mesh generation loss L_skel, as given in formula (4).
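Formula (4) is likewise not reproduced in this text. Assuming a simple weighted sum of the two branch losses, one plausible form is:

```latex
L = L_{skel} + \lambda_{cl}\, L_{cl}
\qquad\text{(4, assumed form; } \lambda_{cl} \text{ is a weighting coefficient)}
```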
To further illustrate the advantages of the above method provided by the present invention, the method is verified through specific experiments.
The construction of the dataset is first performed, comprising the FAFB dataset and the H01 dataset.
For the FAFB dataset, in order to evaluate the quality of skeleton generation and the effectiveness of subcellular classification, FAFB-CellSeg is used, which initially included 591 cells, 464 nerve fibers, and 110 glial cells. Compared with the cell bodies and nerve fibers, the number of glial cells is relatively small, so experts were asked to annotate an additional 221 glial cells, giving 331 glial cells in total. This expanded dataset is named FAFB-CellSeg++. The complete morphological representation learning model of the present invention is trained using all 84,560 EM segmentation fragments from FAFB-FFN1. For the subcellular classification task, 50% of the labeled fragments in FAFB-CellSeg++ are used for training and the other half for testing.
The H01 dataset, a publicly available dataset, contains 1.4PB of electron microscope images covering about 1mm³ of human cortex. The present invention collects 15,000 shapes from three cell categories and randomly samples 2048 points for each cell shape. 90% of the fragments are used to train the morphology learning network of the present invention, while the other 10% are used for testing on the cell classification task.
FIG. 5 shows a set of samples for each type of FAFB-CellSeg++ and H01 dataset.
Fig. 7 is a simplified resulting schematic of a three-dimensional structure of a neuronal segment according to an embodiment of the invention.
The present invention achieves optimal performance on the neuron fragment skeleton extraction task, as shown in tables 1 and 2, and on the neuron fragment subcellular classification task, as shown in tables 3 and 4. As shown in fig. 7, compared with Point2Skeleton, the method of the invention performs better in preserving structural integrity, particularly for axon structures (solid black boxes). The method of the invention also better recovers the shape details of the neuron fragment surface (dashed black boxes), extracts more accurate and concise skeleton meshes, and reduces loops and spikes (dashed black boxes).
Table 1: neuron fragment skeleton extraction index effect on FAFBCellSeg++ data set
Table 2: h01 data set neuron fragment skeleton extraction index effect
Table 3: FAFB-CellSeg++ data set neuron fragment subcellular classification index effect
Table 4: h01 data set neuron fragment subcellular classification index effect
The invention discloses a neuron skeleton-aware morphological characterization learning method based on self-supervised learning. To address the large size of electron microscope image data and the lack of manually annotated data, the invention provides a self-supervised method for learning morphological characterizations from super-large-scale 3D neuron segmentation fragments. To enable efficient morphological analysis of large-scale neuron segmentation fragments, the invention provides a skeleton-aware morphological characterization extraction network that can produce a simplified three-dimensional model of neuron fragments preserving key geometric structures, so as to support efficient rendering and morphological analysis. Specifically, the invention introduces a self-supervised learning framework that connects skeleton-aware shape simplification with morphology-based classification of neuron fragments, and enhances the discriminability of the learned skeleton-aware morphological characterization through multi-task joint training. With this method, the neural network model can learn skeleton-aware neuron morphological characterizations without manually annotated data, and a simplified three-dimensional model of the neuron fragments that preserves the key geometric structure is obtained.
Fig. 8 schematically illustrates a block diagram of an electronic device adapted to implement a method of learning a neural skeleton-aware morphological feature based on self-supervised learning and a training method of a skeleton mesh generation module and a contrast learning module according to an embodiment of the invention.
As shown in fig. 8, an electronic device 800 according to an embodiment of the present invention includes a processor 801 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. The processor 801 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 801 may also include on-board memory for caching purposes. The processor 801 may comprise a single processing unit or multiple processing units for performing the different actions of the method flows according to embodiments of the invention.
In the RAM 803, various programs and data required for the operation of the electronic device 800 are stored. The processor 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. The processor 801 performs various operations of the method flow according to the embodiment of the present invention by executing programs in the ROM 802 and/or the RAM 803. Note that the program may be stored in one or more memories other than the ROM 802 and the RAM 803. The processor 801 may also perform various operations of the method flow according to embodiments of the present invention by executing programs stored in one or more memories.
According to an embodiment of the invention, the electronic device 800 may further comprise an input/output (I/O) interface 805, the input/output (I/O) interface 805 also being connected to the bus 804. The electronic device 800 may also include one or more of the following components connected to the I/O interface 805: an input portion 806 including a keyboard, mouse, etc.; an output portion 807 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk or the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the internet. The drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 810 as needed, so that a computer program read out therefrom is installed into the storage section 808 as needed.
The present invention also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present invention.
According to embodiments of the present invention, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the invention, the computer-readable storage medium may include ROM 802 and/or RAM 803 and/or one or more memories other than ROM 802 and RAM 803 described above.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing embodiments have been provided for the purpose of illustrating the general principles of the present invention and are not intended to limit the scope of the invention thereto.