CN120107254B

Movatterモバイル変換

Info

Publication number: CN120107254B
Application number: CN202510585504.2A
Authority: CN
Inventors: 徐志民; 张赢心; 戴桓琰; 魏欣; 李锡; 武志艳; 刘哲昊; 陈海鹏; 于征磊
Original assignee: Jilin University
Current assignee: Jilin University
Priority date: 2025-05-08
Filing date: 2025-05-08
Publication date: 2025-08-08
Anticipated expiration: 2045-05-08
Also published as: CN120107254A

Abstract

The application relates to the technical field of dental image analysis, and discloses a dental image intelligent analysis system and a dental image intelligent analysis method for oral surgery. Subsequently, spatial characteristics of the middle and deep feature maps are enhanced using a spatial adaptation module, while the shallow feature map is refined using edge detection branches to capture tooth edge details. And then fusing the feature images to generate a multi-scale obvious fused feature image, and processing the multi-scale obvious fused feature image by a classification layer to obtain a tooth voxel probability image. Finally, the discrete segmentation label graph is converted into a discrete segmentation label graph as a semantic segmentation result. The method aims at improving the accuracy and the robustness of the tooth image analysis by integrating the multi-scale characteristics and the spatial context information, and provides more reliable support for oral surgery.

Description

Dental image intelligent analysis system and method for oral surgery

Technical Field

The application relates to the technical field of dental image analysis, in particular to an intelligent dental image analysis system and method for oral surgery.

Background

In the field of oral surgery, accurate dental image analysis is critical to diagnosis and treatment planning. Traditional dental image analysis methods often rely on the experience and expertise of doctors, which are time-consuming and subject to subjective factors, resulting in limitations in consistency and accuracy of the diagnostic results. With the development of medical imaging technology, digital image processing is one of the important means for improving diagnosis efficiency and accuracy. However, existing digital image processing techniques still face challenges in the face of complex dental structures, blurred boundaries, and insufficient feature extraction.

First, the anatomy of the teeth and their surrounding tissue is very complex, with tremendous differences between individuals. The traditional image analysis method is difficult to comprehensively and accurately capture the detailed information, and especially the method is worry about detecting the fine lesions. Secondly, due to the influence of factors such as shooting angles, equipment resolution and the like, boundaries in the dental images are often not clear enough, and difficulties are brought to subsequent segmentation and recognition work. Furthermore, how to effectively extract representative feature information from the dental image is a key point for realizing automated analysis. However, most of the current methods focus on the extraction of shallow features, and the deep features are not focused enough, so that the positioning of the lesion part is not accurate enough.

Therefore, a dental image intelligent analysis scheme for oral surgery is desired.

Disclosure of Invention

In view of the problems that the prior art is sensitive to noise, is difficult to accurately divide complex structures, ignores spatial context information and the like when processing dental images, the invention provides an intelligent dental image analysis system and method for oral surgery.

According to one aspect of the application, an intelligent analysis method for dental images for oral surgery is provided, which comprises the steps of obtaining dental image data for oral surgery, inputting the dental image data into a backbone network to obtain a dental image shallow feature map, a dental image middle layer feature map and a dental image deep feature map, inputting the dental image middle layer feature map and the dental image deep feature map into a space self-adaption module to obtain a dental image middle layer space strengthening feature map and a dental image deep space strengthening feature map, inputting the dental image shallow feature map into an edge detection branch to obtain a dental edge information feature map, fusing the dental image middle layer space strengthening feature map, the dental image deep space strengthening feature map and the dental edge information feature map to obtain a dental image multi-scale significant fusion feature map, inputting the dental image multi-scale significant fusion feature map into a classification layer to obtain a dental voxel probability map, and converting the dental voxel probability map into a dental voxel discrete label map as a dental image semantic segmentation result.

In the intelligent analysis method of the dental image facing the oral surgery, the dental image shallow feature map is input into an edge detection branch to obtain a dental edge information feature map, and the intelligent analysis method comprises the steps of inputting the dental image shallow feature map into a convolution layer of the edge detection branch to obtain a dental image edge feature focusing feature map, inputting the dental image edge feature focusing feature map into an activation layer of the edge detection branch to obtain the dental edge feature mask feature map, using a Sigmoid activation function by the activation layer, and calculating the position-based point multiplication between the dental edge feature mask feature map and the dental image shallow feature map to obtain the dental edge information feature map.

In the dental image intelligent analysis method facing the oral surgery, the backbone network is a Swin transducer network.

The dental image intelligent analysis method facing the oral surgery comprises the steps of inputting the dental image middle layer space strengthening characteristic image, the dental image deep space strengthening characteristic image and the dental edge information characteristic image into a jump connection layer to obtain an initial dental image multi-scale salient fusion downsampling characteristic image, carrying out complementation fusion optimization of visual semantic characteristics on the initial dental image multi-scale salient fusion downsampling characteristic image to obtain a dental image multi-scale salient fusion downsampling characteristic image, and carrying out upsampling on the dental image multi-scale salient fusion downsampling characteristic image to obtain the dental image multi-scale salient fusion characteristic image.

The dental image intelligent analysis method for the oral surgery comprises the steps of carrying out complementation fusion optimization of visual semantic features on an initial dental image multi-scale salient fusion downsampling feature image to obtain a dental image multi-scale salient fusion downsampling feature image, carrying out cavity convolution operations with different cavity rates on a dental image middle layer space enhancement feature image, a dental image deep space enhancement feature image and a dental edge information feature image to obtain a dental image middle layer space enhancement cavity convolution feature image, a dental image deep space enhancement cavity convolution feature image and a dental edge information cavity convolution feature image, taking the dental image middle layer space enhancement cavity convolution feature image as a reference, calculating cross-layer geometrical feature response of the dental image deep space enhancement cavity convolution feature image and the dental edge information cavity convolution feature image to obtain a first dental image cross-layer geometrical feature response feature image and a second dental image cross-layer geometrical feature response feature image, carrying out fusion correction on the first dental image cross-layer geometrical feature image and the second dental image cross-layer geometrical feature response feature image based on a shift window mechanism and a relative position coding mechanism to obtain a multi-scale image multi-scale salient fusion feature image, and carrying out multi-scale sampling of the dental image sampled multi-scale salient fusion feature image.

In the dental image intelligent analysis method facing the oral surgery, the cavity rate of the cavity convolution operation is 3,5 and 1 respectively.

In the dental image intelligent analysis method facing the oral surgery, the classification layer comprises a point convolution layer and a Softmax classification unit.

According to another aspect of the application, an intelligent dental image analysis system for oral surgery is provided, which comprises a dental image data acquisition module, a dental image multi-scale feature extraction module, a dental image space enhancement module, a dental image segmentation and processing module and a dental image edge detection module, wherein the dental image data acquisition module is used for acquiring dental image data for oral surgery, the dental image multi-scale feature extraction module is used for inputting the dental image data into a backbone network to obtain a dental image shallow feature map, a dental image middle feature map and a dental image deep feature map, the dental image space enhancement module is used for inputting the dental image middle feature map and the dental image deep feature map into a space adaptation module to obtain a dental image middle-layer space enhancement feature map and a dental image deep space enhancement feature map, the dental image edge detection module is used for inputting the dental image shallow feature map into an edge detection branch to obtain a dental edge information feature map, the dental image multi-scale feature map is used for fusing the dental image deep space enhancement feature map, the dental image deep space enhancement feature map and the dental image edge information feature map to obtain a dental image multi-scale significant fusion feature map, the dental image classification processing module is used for inputting the dental image multi-scale significant fusion feature map into a classification layer, and the dental image segmentation voxel probability map is used for voxel label segmentation voxel label as a dental image result.

Compared with the prior art, the intelligent analysis system and the intelligent analysis method for the dental image facing the oral surgery, provided by the application, are characterized in that firstly, the dental image data are obtained, and a multi-scale characteristic diagram comprising a shallow layer, a middle layer and a deep layer is extracted through a backbone network. Subsequently, spatial characteristics of the middle and deep feature maps are enhanced using a spatial adaptation module, while the shallow feature map is refined using edge detection branches to capture tooth edge details. And then fusing the feature images to generate a multi-scale obvious fused feature image, and processing the multi-scale obvious fused feature image by a classification layer to obtain a tooth voxel probability image. Finally, the discrete segmentation label graph is converted into a discrete segmentation label graph as a semantic segmentation result. The method aims at improving the accuracy and the robustness of the tooth image analysis by integrating the multi-scale characteristics and the spatial context information, and provides more reliable support for oral surgery.

Drawings

The above and other objects, features and advantages of the present application will become more apparent by describing embodiments of the present application in more detail with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of embodiments of the application and are incorporated in and constitute a part of this specification, illustrate the application and together with the embodiments of the application, and not constitute a limitation to the application. In the drawings, like reference numerals generally refer to like parts or steps.

Fig. 1 illustrates a schematic flow chart of a dental image intelligent analysis method for oral surgery according to an embodiment of the present application.

Fig. 2 illustrates a schematic flow chart of S3 in a dental image intelligent analysis method for oral surgery according to an embodiment of the present application.

Fig. 3 illustrates a schematic flowchart of S33 in a dental image intelligent analysis method for oral surgery according to an embodiment of the present application.

Fig. 4 illustrates a schematic flow chart of S4 in a dental image intelligent analysis method for oral surgery according to an embodiment of the present application.

Fig. 5 illustrates a schematic flow chart of S5 in a dental image intelligent analysis method for oral surgery according to an embodiment of the present application.

Fig. 6 illustrates a schematic block diagram of an intelligent dental image analysis system for oral surgery in accordance with an embodiment of the present application.

Detailed Description

Hereinafter, exemplary embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.

Fig. 1 illustrates a schematic flow chart of a dental image intelligent analysis method for oral surgery according to an embodiment of the present application. The application provides an intelligent analysis method of dental images for oral surgery, which is shown in fig. 1, and comprises the steps of S1, obtaining dental image data for oral surgery, S2, inputting the dental image data into a backbone network to obtain a dental image shallow layer feature map, a dental image middle layer feature map and a dental image deep layer feature map, S3, inputting the dental image middle layer feature map and the dental image deep layer feature map into a space self-adaption module to obtain a dental image middle layer space strengthening feature map and a dental image deep layer space strengthening feature map, S4, inputting the dental image shallow layer feature map into an edge detection branch to obtain a dental edge information feature map, S5, fusing the dental image middle layer space strengthening feature map, the dental image deep layer space strengthening feature map and the dental edge information feature map to obtain a dental image multi-scale significant fusion feature map, S6, inputting the dental image multi-scale significant fusion feature map into a classification layer to obtain a dental voxel probability map, and S7, and converting the dental voxel probability map into a dental voxel discrete semantic label map as a dental image segmentation result.

Specifically, in step S1, dental image data for oral surgery is acquired. It should be appreciated that CBCT is widely used in the field of oral surgery, given its advantages of being able to provide three-dimensional views, low radiation dose, and high image resolution. Thus, by scanning a patient using a CBCT apparatus, a three-dimensional image dataset containing detailed information of teeth and their surrounding structures can be acquired. The data not only comprises morphological characteristics of the teeth, but also comprises adjacent bone structures and soft tissue conditions, and provides a rich information basis for subsequent analysis.

In one particular embodiment, the doctor or technician will select appropriate scan parameters, such as field of view (FOV), scan time, and resolution, to ensure that a high quality image is obtained that both meets diagnostic requirements and minimizes the amount of radiation received by the patient, depending on clinical requirements. Once the setup is complete, the patient is guided to the correct position and remains stationary as indicated to complete the scanning procedure successfully. The raw image data generated is then transferred to a specialized image processing workstation where the technician pre-processes the data using specific software tools, including steps of noise reduction, correction of artifacts, etc., to improve image quality and prepare for further analysis.

Specifically, in step S2, the dental image data is input into a backbone network to obtain a dental image shallow feature map, a dental image middle layer feature map, and a dental image deep feature map. It should be appreciated that conventional convolutional neural networks, while excellent in many visual tasks, have limitations in handling long-range dependencies and global information. In contrast, backbone networks utilize self-attention mechanisms within local windows, in conjunction with moving window policies, to capture information traffic across windows, which can both maintain computational efficiency and effectively capture global dependencies. This is important for resolving fine structure differences in dental images. In addition, by hierarchically constructing the feature map, it is possible to ensure that the model can recognize not only an obvious anatomical structure but also deep semantic information hidden behind it. The method is beneficial to improving the working effects of subsequent space self-adaptive reinforcement treatment and edge detection branches, and lays a solid foundation for finally realizing accurate tooth segmentation and classification.

Specifically, in one embodiment, the backbone network is a Swin Transformer network. Swin transducer is used as an improved transducer architecture, and a hierarchical feature representation method is introduced, so that feature information under different scales can be effectively captured. In practical applications, the Swin transducer is used to extract a shallow dental image feature map, a middle dental image feature map, and a deep dental image feature map from the input dental image data step by step through a series of stages, each of which includes multiple levels of self-attention mechanisms.

In a specific embodiment, the Swin transducer performs a preliminary feature extraction on the input dental image data to generate a more basic dental image shallow feature map, which contains basic information such as edge, color contrast, etc. As the depth of the network increases, to an intermediate stage, the model begins to focus on more abstract structural features, such as the relative positional relationships between teeth or morphological characteristics of certain specific areas, to produce a layer profile in the dental image. In the last stage or stages, the model further extracts more complex and representative deep feature maps of the dental image, which often correspond to higher-level concepts such as three-dimensional shapes of the entire tooth or relationships with other tissues.

Specifically, in step S3, the middle-layer dental image feature map and the deep-layer dental image feature map are input into a spatial adaptive module to obtain a middle-layer dental image spatial enhancement feature map and a deep-layer dental image spatial enhancement feature map. It should be appreciated that conventional convolutional neural networks, while excellent in many visual tasks, have limitations in processing medical images with significant dimensional changes and non-uniform distribution characteristics. In particular in the field of oral surgery, the morphology of the teeth and their surrounding tissues is different and staggered with respect to each other, which puts higher demands on feature extraction. By introducing the space self-adaptive module, the sensitivity and the robustness of the model to detail characteristics can be greatly improved on the premise of not remarkably increasing the calculation cost. In addition, the method also allows the model to automatically adjust the internal parameters according to the characteristics of the input data, thereby improving the adaptability to various different types of image data.

Specifically, the spatial adaptation module consists of two major parts, namely a convolutional layer-based offset learning module and a convolutional layer-based modulation scalar learning module. The offset learning module is typically formed of a series of convolution layers and functions to learn an offset vector for each pixel location to indicate the direction of adjustment of the center position of the local area of interest at that location. The core of this step is that it allows the model to dynamically adjust the receptive field to better accommodate irregular shapes or boundaries in the image. For example, in processing dental images, this mechanism helps to more accurately locate the tooth-gum interface or other fine structure. At the same time, the modulation scalar learning module also employs a convolutional layer structure, but its goal is to generate a scalar value for each pixel location that is used to adjust the characteristic response intensity at the corresponding location. This means that in addition to changing the position of the receptive field, the information weights of the parts in the feature map can be dynamically adjusted according to the importance of the image content. For example, when some key anatomical landmarks are emphasized, the response intensity of the relevant region can be increased, while the response intensity is properly weakened in the background region, so that the effectiveness and pertinence of the overall feature expression are improved.

In a specific embodiment, for example, there is an input feature map F, size, for the convolutional layer-based offset learning module(Height, width, number of channels). In order to generate the layer feature offset map in the dental image, a network of several convolution layers is designed in the present application. First, a standard convolutional layer (containing a ReLU activation function) is used, whose output channel number is set to 2C (since two values are required for each position to represent the offsets in the x-axis and y-axis). Next, the number of channels is reduced to 2 by another 1x1 convolution layer, thus obtaining the final offset map O of the size. Each element here represents an offset that should be applied at the corresponding location.

For the modulation scalar learning module based on the convolution layer, taking the input feature map F as an example, in order to generate a layer feature modulation map in the dental image, a similar network structure is constructed. First, the input signature is processed with one convolutional layer (again possibly including ReLU activation), but this time the number of output channels is set to C. Then, through a 1x1 convolution layer of the Sigmoid activation function, the output value is ensured to be between 0 and 1, and a modulation scalar diagram M is obtained, wherein the size is. Each value in this scalar map M represents an enlargement or reduction of the feature map at the corresponding position.

Specifically, in step S4, the dental image shallow feature map is input into an edge detection branch to obtain a dental edge information feature map. The edge detection branch is a network for edge detection, and it should be understood that the edge detection branch can obviously improve the accuracy of edge detection without sacrificing details, and provide powerful support for subsequent multi-scale feature fusion and tooth voxel segmentation tasks.

In one embodiment, as shown in fig. 4, in step S4, the step of inputting the shallow dental image feature map into an edge detection branch to obtain a dental edge information feature map includes S41, inputting the shallow dental image feature map into a convolution layer of the edge detection branch to obtain a dental image edge feature focusing feature map, S42, inputting the dental image edge feature focusing feature map into an activation layer of the edge detection branch to obtain the dental edge feature mask feature map, the activation layer using a Sigmoid activation function, and S43, calculating a position-wise point multiplication between the dental edge feature mask feature map and the shallow dental image feature map to obtain the dental edge information feature map.

Specifically, in step S5, the middle layer spatial enhancement feature map, the deep layer spatial enhancement feature map, and the tooth edge information feature map are fused to obtain a multi-scale salient fusion feature map of the tooth image. Depending on a certain level of feature map alone, it is often difficult to comprehensively capture all important information in dental images. For example, shallow feature maps may contain rich detailed information but have inadequate understanding of the overall structure, whereas deep feature maps, while being adept at identifying a wide range of structural modes, may have drawbacks in terms of detailed description. Therefore, the problems can be effectively overcome by fusing the feature graphs of different layers and by means of jump connection, so that the generated multi-scale obvious fusion feature graph not only has enough detail definition, but also can accurately reflect the global structural features.

In one embodiment, as shown in fig. 5, in step S5, the tooth image middle layer spatial enhancement feature map, the tooth image deep layer spatial enhancement feature map and the tooth edge information feature map are fused to obtain a tooth image multi-scale salient fusion feature map, which comprises step S51, inputting the tooth image middle layer spatial enhancement feature map, the tooth image deep layer spatial enhancement feature map and the tooth edge information feature map into a jump connection layer to obtain an initial tooth image multi-scale salient fusion downsampling feature map, step S52, performing complementary fusion optimization of visual semantic features on the initial tooth image multi-scale salient fusion downsampling feature map to obtain a tooth image multi-scale salient fusion downsampling feature map, and step S53, upsampling the tooth image multi-scale salient fusion downsampling feature map to obtain the tooth image multi-scale salient fusion feature map.

In particular, feature maps from the encoder portion (corresponding to the middle and deep feature maps of the backbone network output) are passed directly to the decoder portion (i.e., feature fusion phase) via a jump connection. The connection mode is not only helpful for alleviating the problem of gradient disappearance, but also can ensure that important details of the original image are kept in the process of restoring the resolution of the image. Inside the jump connection layer, in order to fuse the feature graphs of different sources, an addition or splicing mode may be adopted. For example, the middle-layer spatial enhancement feature map and the deep-layer spatial enhancement feature map may be first spliced to form a new feature map, and then added to the tooth edge information feature map to obtain the initial tooth image multi-scale significantly fused downsampled feature map.

Preferably, in view of the fact that when the mid-level spatial enhancement feature map, the tooth image deep-level spatial enhancement feature map, and the tooth edge information feature map are fused, the mid-level spatial enhancement feature map and the tooth image deep-level spatial enhancement feature map are offset and scalar-modulated enhanced on the basis of the mid-level visual feature space representation and deep-level visual feature space representation of the tooth image data, respectively, and the tooth edge information feature map is edge feature activation enhanced on the basis of the shallow-level visual feature space representation of the tooth image data, it is desirable to enable complementary fusion of visual semantic features while maintaining spatial detail integrity at the time of fusion.

Based on the above, in the application, the complementary fusion optimization of visual semantic features is carried out on the initial tooth image multi-scale significant fusion downsampling feature map to obtain the tooth image multi-scale significant fusion downsampling feature map, which comprises the steps of firstly, strengthening the feature map aiming at the middle layer space of the tooth imageThe tooth image deep space strengthening characteristic diagramAnd the tooth edge information feature mapRespectively performing hole convolution operation with the hole rate of {3,5,1},AndTo obtain a middle layer space enhanced cavity convolution feature map and a deep layer space enhanced cavity convolution feature map of the tooth image and a tooth edge information cavity convolution feature map, which are expressed as follows:

;

Wherein, theA cavity convolution characteristic diagram representing tooth edge information,A middle layer space enhanced cavity convolution characteristic diagram of the tooth image is shown,And (5) representing a tooth image deep space reinforced cavity convolution characteristic diagram. Here, the multi-scale cross-layer structure decomposition association with spatial invariance is achieved through the above-described hole convolution operation.

Then, a cross-layer geometrical characteristic response is established by taking the middle layer characteristic as a reference, namely, a cross-layer geometrical characteristic response of the tooth image deep layer space enhanced cavity convolution characteristic diagram and the tooth edge information cavity convolution characteristic diagram is calculated by taking the middle layer space enhanced cavity convolution characteristic diagram of the tooth image as a reference to obtain a first tooth image cross-layer geometrical characteristic response characteristic diagram and a second tooth image cross-layer geometrical characteristic response characteristic diagram, wherein the first tooth image cross-layer geometrical characteristic response characteristic diagram and the second tooth image cross-layer geometrical characteristic response characteristic diagram are expressed as follows:

;

Wherein, theIndicating that the subtraction is performed by position,Representing the multiplication by the position point,Representing the reciprocal of each position feature value of the computed tooth edge information cavity convolution feature map,Representing the reciprocal of each position characteristic value of the reinforced cavity convolution characteristic diagram of the deep space of the tooth image,Representing a first dental image cross-layer geometric feature response profile,And representing a cross-layer geometrical characteristic response characteristic diagram of the second tooth image. That is, the compensation of the depth semantic confidence changes by the depth reference geometric response is equivalent to the depth confidence semantic space alignment for the mid-level features.

In this way, based on a shift window mechanism and a relative position coding mechanism of the Swin transform network, the first dental image cross-layer geometric feature response feature map and the second dental image cross-layer geometric feature response feature map can be subjected to fusion correction to obtain a dental image multi-scale correction feature map, which is expressed as:

;

Here the number of the elements is the number,Representing the characteristic diagramThe feature matrix is shifted in width and height directions,Representing the characteristic diagramThe feature matrix is shifted in width and height directionsAnd in particular, a cyclic shift, such as a first row up-shift by one position to fill the last row,Indicating that the addition is by location,A multi-scale correction feature map of the dental image is presented.

And finally, carrying out dot-multiplication on the tooth image multi-scale correction feature map and the initial tooth image multi-scale significant fusion downsampling feature map to carry out fusion optimization so as to obtain the tooth image multi-scale significant fusion downsampling feature map. In this way, complementary fusion is performed on visual feature space representations of different depth features through multi-scale decomposition response collaboration and depth semantic space alignment on the basis of more intrinsic cross-layer interactive representations of depth semantic confidence to feature set space.

Then, after fusion optimization is completed, up-sampling operation is performed on the tooth image multi-scale significant fusion down-sampling feature map so as to restore the spatial resolution of the original image. In the present application, this can be achieved by deconvolution or bilinear interpolation, etc. Deconvolution is a common up-sampling technique that increases the spatial size of the feature map by learning the inverse convolution kernel, while bilinear interpolation performs interpolation calculations based on neighboring pixel values, which, although simple, provides better smoothing. Which method is selected depends on the specific task requirements and the level of accuracy desired.

Specifically, in step S6, the multi-scale salient fusion feature map of the dental image is input into a classification layer to obtain a dental voxel probability map, where the classification layer includes a point convolution layer and a Softmax classification unit. It will be appreciated that by using a point convolution layer, fine anatomical features can be effectively captured while maintaining a high resolution, while by means of a Softmax classification unit, these features can be mapped onto specific class labels, providing an intuitive and easily interpretable result for the physician.

In a specific embodiment, a multi-scale saliency map of dental images is first input into a point convolution layer. The point convolution layer performs a convolution operation on a small region around each voxel by applying a set of learnable filters, thereby extracting a more abstract feature representation. Since these filters can reduce the number of parameters without losing spatial information, they help to increase computational efficiency and reduce the risk of overfitting. The feature map after the processing of the point convolution layer is then sent to the Softmax classification unit. The Softmax function can translate the raw output into a probability distribution such that each voxel has a probability value corresponding to a particular class. Specifically, it is determined by the Softmax function which voxels belong to the tooth structure and which do not.

Specifically, in step S7, the tooth voxel probability map is converted into a tooth voxel discrete division label map as a tooth image semantic division result. It will be appreciated that it is often difficult to directly obtain the ideal segmentation effect by means of the original tooth voxel probability map alone, since the probability values themselves do not always perfectly reflect the actual anatomical boundaries. Therefore, the tooth voxel probability map is further converted into a tooth voxel discrete segmentation label map to serve as a tooth image semantic segmentation result, so that the quality of the segmentation result can be remarkably improved, and the adaptability of the model to various complex conditions can be enhanced.

Specifically, after obtaining the tooth voxel probability map, first, a threshold is applied to each voxel to decide which class it belongs to. Specifically, a probability threshold (e.g., 0.5, which may be adjusted according to the actual situation, and not specifically defined in this embodiment) may be set, and a voxel may be labeled as a certain class if its probability value exceeds this threshold, or may be considered as part of a background or other class otherwise. This approach is straightforward, but may sometimes not be sufficient to cope with complex boundary conditions or unevenly distributed probability values. Thus, in practical applications, finer strategies may be necessary, such as employing dynamic threshold adjustment or making decisions in combination with multiple conditions, to improve segmentation accuracy. Further, in order to ensure consistency and accuracy of the segmentation result, a connected component analysis may also be performed. The connected component analysis may help identify and separate different objects or regions in the image, which is particularly important for distinguishing between different teeth or teeth in close proximity and surrounding tissue. Through the analysis of connected components, isolated small noise points can be effectively removed, voxels with similar attributes are classified into one type, and therefore a clearer and accurate segmentation result is generated. Finally, morphological operations may also be performed. Morphological operations include dilation (dilation), erosion (erosion), open (open) and close (closing), etc., which can help correct holes, breaks or other irregularities that may occur during segmentation. For example, segmentation gaps due to noise or low probability values may be filled in by appropriate dilation operations, while erosion may be used to remove unnecessary edge extensions, ensuring that the final segmentation contour is as close as possible to the real anatomy.

In summary, the dental image intelligent analysis method for oral surgery provided by the application firstly acquires dental image data, and extracts a multi-scale feature map comprising shallow layer, middle layer and deep layer feature maps through a backbone network. Subsequently, spatial characteristics of the middle and deep feature maps are enhanced using a spatial adaptation module, while the shallow feature map is refined using edge detection branches to capture tooth edge details. And then fusing the feature images to generate a multi-scale obvious fused feature image, and processing the multi-scale obvious fused feature image by a classification layer to obtain a tooth voxel probability image. Finally, the discrete segmentation label graph is converted into a discrete segmentation label graph as a semantic segmentation result. The method aims at improving the accuracy and the robustness of the tooth image analysis by integrating the multi-scale characteristics and the spatial context information, and provides more reliable support for oral surgery.

The application also provides a dental image intelligent analysis system for oral surgery, as shown in fig. 6, the dental image intelligent analysis system 100 for oral surgery comprises a dental image data acquisition module 11 for acquiring dental image data for oral surgery, a dental image multi-scale feature extraction module 12 for inputting the dental image data into a backbone network to obtain a dental image shallow feature map, a dental image middle feature map and a dental image deep feature map, a dental image space enhancement module 13 for inputting the dental image middle feature map and the dental image deep feature map into a space adaptation module to obtain a dental image middle feature map and a dental image deep space enhancement feature map, a dental image edge detection module 14 for inputting the dental image shallow feature map into an edge detection branch to obtain a dental edge information feature map, a dental image multi-scale feature fusion module 15 for fusing the dental image middle feature map, the dental image deep space enhancement feature map and the dental edge information feature map to obtain a dental image multi-scale fusion feature map, a dental image classification processing module 16 for converting the dental image classification probability map into a voxel map, and a voxel map is segmented by using the dental image classification probability classification and the dental image classification probability map as a voxel map.

The basic principles of the present application have been described above in connection with specific embodiments, but it should be noted that the advantages, benefits, effects, etc. mentioned in the present application are merely examples and not intended to be limiting, and these advantages, benefits, effects, etc. are not to be construed as necessarily possessed by the various embodiments of the application. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the application is not necessarily limited to practice with the above described specific details.

The block diagrams of the devices, apparatuses, devices, systems referred to in the present application are only illustrative examples and are not intended to require or imply that the connections, arrangements, configurations must be made in the manner shown in the block diagrams. As will be appreciated by one of skill in the art, the devices, apparatuses, devices, systems may be connected, arranged, configured in any manner. Words such as "including," "comprising," "having," and the like are words of openness and mean "including but not limited to," and are used interchangeably therewith. The terms "or" and "as used herein refer to and are used interchangeably with the term" and/or "unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to.

It is also noted that in the apparatus, devices and methods of the present application, the components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered as equivalent aspects of the present application.

The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the application to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims

1. An intelligent analysis method for dental images facing to oral surgery is characterized by comprising the following steps:

Acquiring dental image data for oral surgery;

inputting the dental image data into a backbone network to obtain a dental image shallow feature map, a dental image middle layer feature map and a dental image deep feature map;

inputting the middle-layer dental image feature map and the deep dental image feature map into a space self-adaptive module to obtain a middle-layer dental image space enhancement feature map and a deep dental image space enhancement feature map;

inputting the tooth image shallow feature map into an edge detection branch to obtain a tooth edge information feature map;

Fusing the middle layer space strengthening feature map, the tooth image deep space strengthening feature map and the tooth edge information feature map to obtain a tooth image multi-scale obvious fusion feature map;

inputting the tooth image multi-scale significant fusion feature map into a classification layer to obtain a tooth voxel probability map;

Converting the tooth voxel probability map into a tooth voxel discrete segmentation label map as a tooth image semantic segmentation result;

Fusing the middle layer space enhancement feature map, the deep layer space enhancement feature map and the tooth edge information feature map to obtain a tooth image multi-scale significant fusion feature map, comprising:

Inputting the middle layer space strengthening characteristic diagram, the tooth image deep space strengthening characteristic diagram and the tooth edge information characteristic diagram into a jump connection layer to obtain an initial tooth image multi-scale obvious fusion downsampling characteristic diagram;

And upsampling the tooth image multi-scale significant fusion downsampling feature map to obtain the tooth image multi-scale significant fusion feature map.

2. The method of claim 1, wherein inputting the dental image shallow feature map into an edge detection branch to obtain a dental edge information feature map comprises:

inputting the tooth image shallow feature map into a convolution layer of the edge detection branch to obtain a tooth image edge feature focusing feature map;

Inputting the tooth image edge feature focusing feature map into an activation layer of the edge detection branch to obtain a tooth edge feature mask feature map, wherein the activation layer uses a Sigmoid activation function;

and calculating the point-by-point multiplication between the tooth edge feature mask feature map and the tooth image shallow feature map to obtain the tooth edge information feature map.

3. The dental image intelligent analysis method for oral surgery according to claim 1, wherein the backbone network is a Swin transducer network.

4. The method of claim 1, wherein inputting the dental image middle layer feature map and the dental image deep layer feature map into a spatial adaptation module to obtain a dental image middle layer spatial enhancement feature map and a dental image deep layer spatial enhancement feature map, comprises:

inputting the middle layer characteristic image of the tooth image into a shift learning module based on a convolution layer to obtain a middle layer characteristic shift image of the tooth image;

inputting the middle layer feature map of the tooth image into a modulation scalar learning module based on a convolution layer to obtain a middle layer feature modulation map of the tooth image;

And inputting the middle layer characteristic image of the tooth image into a space self-adaptive strengthening component based on a deformable convolution layer by combining the middle layer characteristic offset image of the tooth image and the middle layer characteristic modulation image of the tooth image so as to obtain a middle layer space strengthening characteristic image of the tooth image.

5. The method of claim 4, wherein inputting the dental image middle layer feature map into a deformable convolution layer based spatially adaptive enhancement component to obtain the dental image middle layer spatially enhanced feature map in combination with the dental image middle layer feature offset map and the dental image middle layer feature modulation map, comprises:

Performing bilinear interpolation sampling on the dental image middle layer feature map based on the dental image middle layer feature offset map to obtain a dental image middle layer sampling feature map;

and calculating the point-by-point multiplication between the tooth image middle layer sampling feature map and the tooth image middle layer feature modulation map to obtain the tooth image middle layer space strengthening feature map.

6. The intelligent analysis method for dental images for oral surgery according to claim 1, wherein the cavitation rate of the cavitation convolution operation is 3, 5, 1, respectively.

7. The dental image intelligent analysis method for oral surgery according to claim 1, wherein the classification layer comprises a point convolution layer and a Softmax classification unit.

8. An intelligent dental image analysis system for oral surgery, for performing the intelligent dental image analysis method for oral surgery according to any one of claims 1 to 7, comprising:

A dental image data acquisition module for acquiring dental image data for oral surgery;

The tooth image multi-scale feature extraction module is used for inputting the tooth image data into a backbone network to obtain a tooth image shallow feature map, a tooth image middle layer feature map and a tooth image deep feature map;

The tooth image space strengthening module is used for inputting the tooth image middle layer characteristic diagram and the tooth image deep layer characteristic diagram into the space self-adapting module to obtain a tooth image middle layer space strengthening characteristic diagram and a tooth image deep layer space strengthening characteristic diagram;

the tooth image edge detection module is used for inputting the tooth image shallow feature image into an edge detection branch to obtain a tooth edge information feature image;

The tooth image multi-scale feature fusion module is used for fusing the tooth image middle-layer space strengthening feature image, the tooth image deep-layer space strengthening feature image and the tooth edge information feature image to obtain a tooth image multi-scale obvious fusion feature image;

The tooth image classification processing module is used for inputting the tooth image multi-scale significant fusion feature image into a classification layer to obtain a tooth voxel probability image;

And the tooth image semantic segmentation module is used for converting the tooth voxel probability map into a tooth voxel discrete segmentation label map serving as a tooth image semantic segmentation result.