Technical Field
The present application relates to the field of image processing technology, and in particular to a plant image processing method, apparatus, and device based on three-dimensional phenotypic modeling.
Background Art
Methods for analyzing plant images for 3D modeling currently fall into two categories: traditional modeling and deep-learning-based modeling. Traditional modeling methods mainly include COLMAP and OpenMVS; in fine-grained 3D phenotype modeling, these methods suffer from long processing times and poor reconstruction quality. Deep-learning-based methods are mainly divided into convolutional neural network models (such as DeepVO, BA-Net, and CNN-SLAM) and methods based on neural radiance fields. These techniques reconstruct morphologically simple fruits and vegetables well, but model plants with more complex morphology, such as wheat, corn, and tobacco, poorly.
Existing techniques analyze specific objects or morphologically simple plants well. However, since the analysis of most plant images requires precise scale, these existing models cannot accurately analyze and process plant images of various species to obtain refined three-dimensional phenotypic models.
Summary of the Invention
In view of this, the purpose of the present application is to provide a plant image processing method, apparatus, and device based on three-dimensional phenotypic modeling, so as to solve the problem that current image processing models cannot accurately analyze and process plant images.
Based on the above objective, the present application provides a plant image processing method based on three-dimensional phenotypic modeling, comprising:
performing feature extraction on an initial image sequence of a plant captured from multiple viewing angles to obtain extracted feature points;
determining the camera position corresponding to the feature points, and processing the corresponding feature points by triangulation according to the camera position to obtain sparse point cloud data of the plant;
inputting the initial image sequences from the multiple viewing angles into a SAM model in a tracking network, and performing foreground segmentation on the initial image sequence of each viewing angle with the SAM model to obtain initial segmented plant foreground data;
performing edge refinement on the initial segmented plant foreground data of each viewing angle using the corresponding initial images to obtain an intermediate segmented plant foreground for each viewing angle;
spatially aligning the intermediate segmented plant foregrounds of the viewing angles to obtain aligned feature data;
inputting the aligned feature data into the tracking network, and analyzing and removing incomplete features with a similarity algorithm in the tracking network to obtain segmented plant foreground results for the viewing angles;
determining initial Gaussian points according to the sparse point cloud data, and performing Gaussian reconstruction and visual rendering on the segmented plant foreground results of the viewing angles based on the initial Gaussian points to obtain three-dimensional plant rendering results for the viewing angles;
performing meshed three-dimensional facet processing on the three-dimensional plant rendering result of each viewing angle to generate a plant three-dimensional mesh facet image corresponding to each viewing angle, for plant morphology analysis based on the plant three-dimensional mesh facet image.
A plant image processing apparatus based on three-dimensional phenotypic modeling comprises:
a feature extraction module configured to perform feature extraction on an initial image sequence of a plant captured from multiple viewing angles to obtain extracted feature points;
a sparse point cloud module configured to determine the camera position corresponding to the feature points, and to process the corresponding feature points by triangulation according to the camera position to obtain sparse point cloud data of the plant;
an initial segmentation module configured to input the initial image sequences from the multiple viewing angles into a SAM model in a tracking network, and to perform foreground segmentation on the initial image sequence of each viewing angle with the SAM model to obtain initial segmented plant foreground data;
an intermediate segmentation module configured to perform edge refinement on the initial segmented plant foreground data of each viewing angle using the corresponding initial images to obtain an intermediate segmented plant foreground for each viewing angle;
an alignment processing module configured to spatially align the intermediate segmented plant foregrounds of the viewing angles to obtain aligned feature data;
an omnidirectional segmentation module configured to input the aligned feature data into the tracking network, and to analyze and remove incomplete features with a similarity algorithm in the tracking network to obtain segmented plant foreground results for the viewing angles;
a Gaussian rendering module configured to determine initial Gaussian points according to the sparse point cloud data, and to perform Gaussian reconstruction and visual rendering on the segmented plant foreground results of the viewing angles based on the initial Gaussian points to obtain three-dimensional plant rendering results for the viewing angles;
a facet processing module configured to perform meshed three-dimensional facet processing on the three-dimensional plant rendering result of each viewing angle to generate a plant three-dimensional mesh facet image corresponding to each viewing angle, for plant morphology analysis based on the plant three-dimensional mesh facet image.
Based on the same inventive concept, the present disclosure also provides an electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable by the processor, wherein the processor implements the method described above when executing the computer program.
As can be seen from the above, in the plant image processing method, apparatus, and device based on three-dimensional phenotypic modeling provided by the present application, plant images captured by an ordinary camera are typically multi-view. Features are therefore extracted from the initial image sequences of the plant at multiple viewing angles, and the camera position of the feature points is determined, so that the point cloud can be computed by triangulation from the camera position to obtain sparse point cloud data of the plant. The scheme introduces a tracking network improved with the SAM model; this tracking network performs foreground segmentation on the initial image sequence of each viewing angle to obtain initial segmented plant foreground data containing only the plant, avoiding interference from background content. To improve the accuracy of the initial segmented plant foreground data, its edges are refined and the viewing angles are spatially aligned; the aligned feature data are then passed through the tracking network again to remove incomplete features, yielding accurate segmented plant foreground results for each viewing angle. The segmented plant foreground results of the viewing angles are processed by Gaussian reconstruction and visual rendering to obtain three-dimensional plant rendering results. However, these rendering results may not be directly usable for plant morphology analysis; they are therefore converted into meshed three-dimensional facets, producing a plant three-dimensional mesh facet image for each viewing angle on which plant morphology analysis can be better performed.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to illustrate the technical solutions of the present application or the related art more clearly, the drawings required in the description of the embodiments or the related art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present application, and other drawings can be obtained from them by those of ordinary skill in the art without creative effort.
FIG. 1 is a flowchart of a plant image processing method based on three-dimensional phenotypic modeling according to an embodiment of the present application;
FIG. 2 is a schematic flow diagram of three-dimensional digital visualization of a plant according to an embodiment of the present application;
FIG. 3 is a schematic diagram of plant image acquisition according to an embodiment of the present application;
FIG. 4 is a schematic flow diagram of foreground segmentation according to an embodiment of the present application;
FIG. 5 is a schematic flow diagram of foreground segmentation for each viewing angle according to an embodiment of the present application;
FIG. 6 is a schematic flow diagram of Gaussian rendering and facet extraction according to an embodiment of the present application;
FIG. 7 is a schematic diagram of plant foreground segmentation results according to an embodiment of the present application;
FIG. 8 is a schematic diagram of single-view and multi-view foreground segmentation according to an embodiment of the present application;
FIG. 9 is a schematic diagram of Gaussian rendering results according to an embodiment of the present application;
FIG. 10 is a structural block diagram of a plant image processing apparatus based on three-dimensional phenotypic modeling according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
DETAILED DESCRIPTION
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to specific embodiments and the accompanying drawings.
It should be noted that, unless otherwise defined, the technical or scientific terms used in the embodiments of the present application shall have the ordinary meanings understood by persons of ordinary skill in the art to which the present application belongs. The terms "first", "second", and the like used in the embodiments of the present application do not denote any order, quantity, or importance, but are only used to distinguish different components. Words such as "include" or "comprise" mean that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. Words such as "connect" or "connected" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Up", "down", "left", "right", and the like are only used to indicate relative positional relationships; when the absolute position of the described object changes, the relative positional relationship may change accordingly.
Glossary:
PnP: Perspective-n-Point, the PnP algorithm.
SAM: Segment Anything Model, a large image segmentation model.
SFM: Structure from Motion, a technique that reconstructs camera poses and sparse 3D structure from overlapping image sequences.
TAM: Track Anything Model, a video object tracking model.
XMem: a long-term video object segmentation architecture based on the Atkinson-Shiffrin memory model.
ICP: Iterative Closest Point, a matching algorithm used for point cloud registration.
MLP: Multilayer Perceptron.
This application is dedicated to developing a plant image processing method based on three-dimensional phenotypic modeling. By applying 3D Gaussian Splatting for the first time, it achieves a breakthrough in omnidirectional, refined three-dimensional visual modeling of plants. To address the limitations of conventional Gaussian reconstruction when dealing with complex backgrounds, SAM (Segment Anything Model) combined with the long-video segmentation algorithm XMem is introduced, so that the target plant contour can be accurately segmented from image sequences. In addition, to address the fact that Gaussian rendering results are not directly measurable, a new facetization approach converts the image processing result of a single plant into directly measurable facet data, providing a new technical route for accurate three-dimensional reconstruction of plant phenotypes.
Before the solution of the present application is described in detail, its overall approach is explained:
The exploration of three-dimensional digital visualization of plants at any time and any place is divided into four main steps: plant image acquisition and background segmentation, camera pose estimation and sparse point cloud reconstruction, digital plant visual rendering, and facet result extraction, as shown in FIG. 2. To ensure that the experiment can be carried out on any plant at any time and in any place, a multi-scale feature fusion method is introduced based on an improved Track Anything Model (TAM) approach (SAM + XMem): the plant image streams from different viewing angles are fed into the model as different prompts so as to finely segment the omnidirectional plant foreground. On the results before segmentation, sparse point cloud reconstruction and camera pose estimation are performed with an SFM-based method to ensure the accuracy of the initial point cloud and camera poses. Then, the sparse point cloud, the camera poses, and the plant foreground images are input into the Gaussian model for Gaussian splatting, and a three-dimensional plant rendering result extremely close to reality is obtained through a tile-based rasterizer. Finally, to make the result measurable in the manner of a traditional plant 3D point cloud, the three-dimensional plant rendering result is retrained and assigned facets, yielding a measurable plant three-dimensional mesh facet image.
In addition, the solution of the present application was run under the Ubuntu 20.04 operating system, with an Intel(R) Core(TM) i5-13500HX CPU @ 2.50 GHz and an Nvidia GeForce RTX 3090 graphics card. The deep learning framework was PyTorch 2.0, the programming platform was PyCharm, and the programming language was Python 3.10; all comparison algorithms were run in the same environment. Data acquisition and image processing were programmed in C++, with the interface designed in Qt 6.5.1, images processed with OpenCV 4.7.0, and Visual Studio 2022 as the debugging platform.
Before the plant images are processed by the solution of this embodiment, plant image acquisition specifically includes:
Image data of various plants were collected in three environments (indoor potted plants, outdoor potted plants, and outdoor fields) and in three time periods (morning, noon, and night), as shown in FIG. 3. The collection process includes two stages: a preliminary rough acquisition of the plant surface, and a detailed morphological acquisition of different plant parts (such as the canopy, sides, and bottom). Mobile phones of different brands were used for image acquisition to ensure the diversity of the plant images.
The collected plants include wheat at different growth stages, tobacco at different growth stages, corn, and various indoor potted plants. Tobacco images were collected from June to July 2023: three plants at each of three growth stages were selected, nine groups in total, captured between 10:00 and 15:00 in Yangling, Shaanxi, in an outdoor field scene. Corn images were collected from January to February 2024, N groups in total, in Sanya, Hainan, in both outdoor and indoor scenes. Images of various indoor potted plants were collected from August 2023 to May 2024, N groups in total, in Sanya (Hainan), Taizhou (Jiangsu), and Yangling (Shaanxi), in indoor scenes. Wheat was the key research object; its images cover the three-leaf, heading, flowering, and maturity stages, N groups in total, collected in Yangling, Shaanxi, in indoor potted, outdoor potted, and outdoor field scenes.
To ensure the universality of the experiment, the acquisition device may be a mobile phone, camera, tablet, or other electronic device with a camera. The video resolution is any resolution up to 2K, and the frame rate is 30 or 60 fps. After the plant videos are collected, they are preprocessed, including video stitching, frame extraction, and removal of blurred images.
The embodiments of the present application are described in detail below with reference to the accompanying drawings.
As shown in FIG. 1, the plant image processing method based on three-dimensional phenotypic modeling of an embodiment of the present application includes:
Step 101: performing feature extraction on an initial image sequence of a plant captured from multiple viewing angles to obtain extracted feature points.
Step 102: determining the camera position corresponding to the feature points, and processing the corresponding feature points by triangulation according to the camera position to obtain sparse point cloud data of the plant.
Step 103: inputting the initial image sequences from the multiple viewing angles into the SAM model in the tracking network, and performing foreground segmentation on the initial image sequence of each viewing angle with the SAM model to obtain initial segmented plant foreground data.
Step 104: performing edge refinement on the initial segmented plant foreground data of each viewing angle using the corresponding initial images to obtain an intermediate segmented plant foreground for each viewing angle.
Step 105: spatially aligning the intermediate segmented plant foregrounds of the viewing angles to obtain aligned feature data.
Step 106: inputting the aligned feature data into the tracking network, and analyzing and removing incomplete features with the similarity algorithm in the tracking network to obtain segmented plant foreground results for the viewing angles.
Step 107: determining initial Gaussian points according to the sparse point cloud data, and performing Gaussian reconstruction and visual rendering on the segmented plant foreground results of the viewing angles based on the initial Gaussian points to obtain three-dimensional plant rendering results for the viewing angles.
Step 108: performing meshed three-dimensional facet processing on the three-dimensional plant rendering result of each viewing angle to generate a plant three-dimensional mesh facet image corresponding to each viewing angle, for plant morphology analysis based on the plant three-dimensional mesh facet image.
Through the above solution: plant images captured by an ordinary camera are typically multi-view, so features are extracted from the initial image sequences of the plant at multiple viewing angles and the camera position of the feature points is determined, which makes it convenient to determine the point cloud by triangulation from the camera position and thus obtain the sparse point cloud data of the plant. The solution introduces a tracking network improved with the SAM model; this tracking network performs foreground segmentation on the initial image sequence of each viewing angle to obtain initial segmented plant foreground data containing only the plant, avoiding interference from background content. To improve the accuracy of the initial segmented plant foreground data, its edges are refined and the viewing angles are spatially aligned; the aligned feature data are then passed through the tracking network again to remove incomplete features, yielding accurate segmented plant foreground results for each viewing angle. The segmented plant foreground results of the viewing angles are processed by Gaussian reconstruction and visual rendering to obtain three-dimensional plant rendering results; however, these rendering results may not be directly usable for plant morphology analysis, so they are converted into meshed three-dimensional facets to obtain a plant three-dimensional mesh facet image for each viewing angle, on which plant morphology analysis can be better performed.
In some embodiments, step 101 includes:
Step 1011: determining the grayscale values of the initial image sequence corresponding to each viewing angle.
In specific implementation, the initial image sequences of the plant at the various viewing angles in a specific scene are first collected, where the specific scene may be formed by different locations and/or different times and/or different plants.
Step 1012: using the scale-invariant feature transform for each viewing angle, applying a Gaussian function to the grayscale values of the initial image sequence of the corresponding viewing angle to extract features and obtain the extracted feature points.
The global scale-invariant feature transform (SIFT) is applied to all initial image sequences for feature extraction. The SIFT algorithm can find keypoints across different scale spaces and compute their orientations and descriptors, providing a reliable basis for subsequent feature point matching. The SIFT feature extraction is expressed as shown in Formula 1.
$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$ (Formula 1)
where $I(x, y)$ is the grayscale value at pixel position $(x, y)$ in the initial image sequence at scale $\sigma$, $G(x, y, \sigma)$ is a Gaussian function used to simulate the smoothing of the image at different scales, and $L(x, y, \sigma)$ denotes the feature point intensity in the scale space, i.e., the extracted feature points.
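As an illustration of this step, a minimal sketch of SIFT keypoint extraction with OpenCV follows; the file name is an assumption of the example, not taken from the application.

```python
import cv2

# One frame of the initial image sequence, read as the grayscale image I(x, y)
img = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
# Each keypoint carries position (x, y), scale, and orientation;
# descriptors is an N x 128 array used in the later matching step.
keypoints, descriptors = sift.detectAndCompute(img, None)
```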
In some embodiments, step 102 includes:
Step 1021: matching the extracted feature points by minimizing the Euclidean distance, using a fast nearest-neighbor search library, to obtain feature point matching results.
In specific implementation, the extracted feature points are matched based on the FLANN (Fast Library for Approximate Nearest Neighbors) fast nearest-neighbor search library in order to estimate the camera pose. In particular, feature points are matched pairwise between the images within each of the three groups of initial image sequences, but not across groups, to ensure the accuracy and efficiency of matching. Feature point matching is achieved by minimizing the Euclidean distance between feature points, as shown in Formula 2.
$d(p_1, p_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$ (Formula 2)
where $p_1$ and $p_2$ denote the positions of two feature points in their respective initial image sequences, and $d(p_1, p_2)$ denotes the distance between the two points. By computing the distances for all feature point pairs and selecting the minimum, the best matching pair can be determined, giving the feature point matching result.
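A hedged sketch of this matching step using OpenCV's FLANN matcher is shown below; the frame file names and the ratio-test threshold are assumptions of the example.

```python
import cv2

sift = cv2.SIFT_create()
img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)  # two frames of the
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)  # same viewing-angle group
_, desc1 = sift.detectAndCompute(img1, None)
_, desc2 = sift.detectAndCompute(img2, None)

FLANN_INDEX_KDTREE = 1  # KD-trees suit SIFT's floating-point descriptors
flann = cv2.FlannBasedMatcher(dict(algorithm=FLANN_INDEX_KDTREE, trees=5),
                              dict(checks=50))
matches = flann.knnMatch(desc1, desc2, k=2)

# Keep a pair only when its nearest neighbour is clearly the closest (Lowe's
# ratio test), i.e. the minimized distance d(p1, p2) of Formula 2 stands out.
good = [m for m, n in matches if m.distance < 0.7 * n.distance]
```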
Step 1022: performing pose matching with the PnP algorithm based on the feature point matching results to determine the camera position of the feature points.
In specific implementation, the camera position is computed from the feature point matching results using the PnP (Perspective-n-Point) algorithm, as shown in Formula 3. With the positions of the matched points known in a given world coordinate system, the camera position corresponding to each feature point can be computed fairly accurately.
$s\,[u, v, 1]^T = K\,(R\,[X, Y, Z]^T + t)$ (Formula 3)
where $(u, v)$ is a point in the image coordinate system, $K$ is the camera intrinsic matrix, $R$ and $t$ denote the rotation and translation of the camera relative to the world coordinate system, $(X, Y, Z)$ is a point in the world coordinate system, and $s$ is a scaling factor.
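A minimal sketch of solving Formula 3 with OpenCV's PnP solver follows; the point arrays and intrinsics are synthetic example inputs, not values from the application.

```python
import cv2
import numpy as np

# Assumed example inputs: N x 3 world points, N x 2 pixel points, 3 x 3 intrinsics K
object_pts = np.random.rand(6, 3).astype(np.float64)
image_pts = np.random.rand(6, 2).astype(np.float64)
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, distCoeffs=None)
R, _ = cv2.Rodrigues(rvec)       # rotation vector -> rotation matrix R
camera_position = -R.T @ tvec    # camera position in world coordinates
```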
Step 1023: computing the coordinates of the feature points in three-dimensional space by triangulation according to the camera position and the positions of the feature points, and determining the sparse point cloud data of the plant from the coordinates.
In specific implementation, given the known camera positions and the positions of the feature points in the initial image sequences, the coordinates of the feature points in three-dimensional space can be computed by triangulation, yielding the sparse point cloud data (a point cloud is a set of three-dimensional points in space).
Suppose that in two images of the initial image sequences there are feature points $p_1$ and $p_2$, with corresponding camera poses $[R_1 | t_1]$ and $[R_2 | t_2]$. The coordinates $X$ in three-dimensional space are obtained by solving the following system of equations:
$s_1 p_1 = K (R_1 X + t_1)$, $s_2 p_2 = K (R_2 X + t_2)$, where $s_1$ and $s_2$ are scaling factors.
In some embodiments, step 103 includes:
As shown in the foreground segmentation flow diagram of FIG. 4, the initial image sequences are determined from the input video, and for each of the multiple viewing angles:
Step 1031: determining target points selected by the user for the initial image sequence of the viewing angle, inputting the selected target points and the initial image sequence of the viewing angle into the SAM model in the tracking network, and performing image normalization to obtain normalized data.
In specific implementation, the tracking network (Track Anything Model, TAM) is a video segmentation model that incorporates the latest and most effective large image segmentation model, the Segment Anything Model (SAM).
The plant image sequence obtained by extracting frames from the initial image sequence of the viewing angle is input into the tracking network. The user selects a plant image at a specific viewing angle from the plant image sequence, and then manually clicks points in the foreground region of the plant image to select the target points. These selected target points are input into the SAM model together with the extracted plant image sequence for processing.
In the Vision Transformer processing stage, the input plant image sequence is normalized.
Step 1032: performing a residual connection and a second normalization on the normalized data using the multi-head attention mechanism in the SAM model to obtain twice-normalized data.
In specific implementation, the multi-head attention mechanism enables the SAM model to process plant images in parallel in different representation subspaces, as shown in Formula 4.
$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^T}{\sqrt{d_k}}\right)V$ (Formula 4)
where $Q$, $K$, and $V$ are the query, key, and value matrices respectively, and $d_k$ is the dimension of the key vectors.
After the multi-head attention mechanism is applied, a residual connection and a second normalization are performed to obtain the twice-normalized data.
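A minimal PyTorch sketch of this sub-step (Formula 4 followed by the residual connection and second normalization); the embedding size, head count, and token count are assumed values.

```python
import torch
import torch.nn as nn

embed_dim, num_heads = 256, 8                      # assumed sizes
attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
norm = nn.LayerNorm(embed_dim)

x = torch.randn(1, 196, embed_dim)                 # normalized patch embeddings
attn_out, _ = attn(x, x, x)                        # Q = K = V = x (Formula 4)
x = norm(x + attn_out)                             # residual + second normalization
```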
Step 1033: processing the twice-normalized data with the multilayer perceptron (MLP) in the SAM model, and adding the output of the MLP to the twice-normalized data to form a residual connection, obtaining the image embedding result.
In specific implementation, the multi-head attention mechanism in the SAM model is followed by an MLP, which consists of two fully connected layers and a ReLU activation function, as shown in Formula 5:
$\mathrm{MLP}(x) = W_2\,\mathrm{ReLU}(W_1 x + b_1) + b_2$ (Formula 5)
where $x$ is the input vector of twice-normalized data, $W_1$ is the weight matrix and $b_1$ the bias vector of the first layer, and $W_2$ is the weight matrix and $b_2$ the bias vector of the second layer.
The whole MLP thus consists of two fully connected layers and a ReLU activation function, each fully connected layer comprising a weight matrix and a bias vector.
In the first layer, the input $x$ is weighted and summed by the weight matrix $W_1$, and the bias $b_1$ is added; the result is nonlinearly transformed by the ReLU activation function.
In the second layer, the ReLU output is weighted and summed by the weight matrix $W_2$, and the bias $b_2$ is added.
The MLP output is added to the data input to the MLP to form a residual connection, enhancing the model's ability to learn plants at different scales.
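A sketch of Formula 5 with the residual connection described above, written as a small PyTorch module; the dimensions are assumptions of the example.

```python
import torch
import torch.nn as nn

class MLPBlock(nn.Module):
    """Two fully connected layers with ReLU (Formula 5), plus a residual connection."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)   # W1, b1
        self.fc2 = nn.Linear(hidden, dim)   # W2, b2

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # MLP(x) = W2 ReLU(W1 x + b1) + b2, added back to the input
        return x + self.fc2(torch.relu(self.fc1(x)))

y = MLPBlock(dim=256, hidden=1024)(torch.randn(1, 196, 256))
```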
After deep feature extraction in the Vision Transformer stage, the image embeddings are generated.
Step 1034: performing positional encoding on the selected target points to obtain positional encoding results.
The user-selected target points are fed into the positional encoding part to help the SAM model understand the locations of the foreground features. The positional encoding uses alternating sine and cosine functions, as shown in Formulas 6 and 7.
$PE(pos, 2i) = \sin\!\left(\dfrac{pos}{10000^{2i/d_{model}}}\right)$ (Formula 6)
$PE(pos, 2i+1) = \cos\!\left(\dfrac{pos}{10000^{2i/d_{model}}}\right)$ (Formula 7)
where $pos$ is the position index, $i$ is the dimension index, and $d_{model}$ is the encoding dimension.
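Formulas 6 and 7 can be implemented directly; a NumPy sketch:

```python
import numpy as np

def positional_encoding(num_pos: int, d_model: int) -> np.ndarray:
    """Sinusoidal encoding: sin on even dimensions (Formula 6), cos on odd (Formula 7)."""
    pos = np.arange(num_pos)[:, None]           # position index pos
    i = np.arange(d_model // 2)[None, :]        # dimension index i
    angle = pos / np.power(10000.0, 2.0 * i / d_model)
    pe = np.zeros((num_pos, d_model))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe
```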
Step 1035: using the self-attention mechanism in the SAM model to analyze the relationships among the parts of the image based on the image embedding results and the positional encoding results, and using the cross-attention mechanism in the SAM model to strengthen the correlation between specific parts of the image and the selected target points based on the image embedding results and the positional encoding results.
In the mask decoder stage, self-attention and cross-attention mechanisms are used to process the image embedding results and the positional encoding results. The self-attention mechanism helps the SAM model better understand the relationships among the parts of the image, while the cross-attention mechanism strengthens the correlation between specific parts of the image and the input points, thereby improving the accuracy of plant foreground segmentation.
Step 1036: performing foreground segmentation based on the relationships among the parts of the image and the correlation between specific parts of the image and the selected target points, to obtain the initial segmented plant foreground data.
Based on the improved tracking network, which combines the pretrained Segment Anything Model (SAM) with the XMem video tracking and segmentation algorithm, the plant foreground can be accurately segmented from continuous initial image sequences in any scene, significantly improving the accuracy and efficiency of foreground segmentation.
FIG. 5 is a schematic flow diagram of foreground segmentation for each viewing angle.
In some embodiments, step 104 includes:
for each of the multiple viewing angles:
Step 1041: inputting the initial image sequence and the initial segmented plant foreground data of the viewing angle into the tracking network.
Step 1042: performing semi-supervised video object segmentation (VOS) based on the initial segmented plant foreground data using the XMem model in the tracking network, to obtain optimized initial segmented plant foreground data.
Step 1043: performing edge refinement on the optimized initial segmented plant foreground data with the SAM model in the tracking network, based on the initial image sequence of the viewing angle, to obtain the intermediate segmented plant foreground corresponding to the viewing angle.
Through the above scheme, target tracking of the plant in the images is achieved, further improving the segmentation of the plant foreground.
In some embodiments, step 105 includes:
Step 1051: extracting multi-scale features from the intermediate segmented plant foreground corresponding to each viewing angle.
Step 1052: spatially aligning the intermediate segmented plant foregrounds of the viewing angles according to the multi-scale features using the ICP registration algorithm, and performing weighted feature averaging on the spatially aligned intermediate segmented plant foregrounds to obtain the aligned feature data.
The ICP matching algorithm is a method of point cloud registration: given two point clouds as input, it yields an R&T (rotation and translation) matrix such that, after the R&T transformation, one point cloud overlaps the other as closely as possible.
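A hedged sketch of such a registration step using Open3D's ICP; the file names and the correspondence-distance threshold are assumptions of the example (the application itself aligns segmented foreground features rather than raw scans).

```python
import open3d as o3d

source = o3d.io.read_point_cloud("view_a.ply")   # foreground of one viewing angle
target = o3d.io.read_point_cloud("view_b.ply")   # foreground of another angle

result = o3d.pipelines.registration.registration_icp(
    source, target, max_correspondence_distance=0.02,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())

source.transform(result.transformation)          # apply the estimated R&T matrix
```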
Then, step 106 ("inputting the aligned feature data into the tracking network, and analyzing and removing incomplete features with the similarity algorithm in the tracking network to obtain segmented plant foreground results for the viewing angles") is executed to complete the foreground segmentation process.
FIG. 6 shows a schematic flow diagram of Gaussian rendering and facet extraction.
In some embodiments, step 107 includes:
Step 1071: determining initial Gaussian points according to the sparse point cloud data, and performing cluster-mean processing on the initial Gaussian points based on the K-Means clustering algorithm to obtain the mean results of the Gaussian points.
In specific implementation, the initial Gaussian points can be obtained directly from the sparse point cloud. The mathematical representation of each of the initial Gaussian points is shown in Formula 8.
$G(x) = e^{-\frac{1}{2}(x - \mu)^T \Sigma^{-1}(x - \mu)}$ (Formula 8)
where $x$ is an arbitrary point in the space of the sparse point cloud data, $\mu$ is the mean of the Gaussian distribution, i.e., the center position of each of the initial Gaussian points, and $\Sigma$ is the three-dimensional covariance matrix, which controls the shape and orientation of the Gaussian distribution.
The input sparse point cloud data contain both background and plant foreground. To improve the efficiency of subsequent plant foreground rendering, the means $\mu$ of the Gaussian points are initialized based on K-Means clustering. Suppose the input sparse point cloud data are $P = \{P_1, P_2, \ldots, P_n\}$, where $P_i$ is a point in the point cloud and $\mu_j$ is a cluster center; each cluster center is then updated as shown in Formula 9.
$\mu_j = \dfrac{1}{|C_j|}\sum_{P_i \in C_j} P_i$ (Formula 9)
where $C_j$ is the set of points in the $j$-th cluster, $|C_j|$ is the number of points in $C_j$, and $\sum_{P_i \in C_j} P_i$ is the sum of the position vectors of all points in $C_j$.
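A NumPy sketch of one K-Means iteration implementing the centroid update of Formula 9; the point cloud and initial centers are synthetic stand-ins.

```python
import numpy as np

def kmeans_step(points: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """One iteration: assign points to nearest centroids, then apply Formula 9
    (empty clusters are not handled in this sketch)."""
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=-1)
    labels = d.argmin(axis=1)
    return np.stack([points[labels == j].mean(axis=0)
                     for j in range(len(centroids))])

pts = np.random.rand(500, 3)            # stand-in for the sparse point cloud
mu = kmeans_step(pts, pts[:4].copy())   # 4 assumed initial cluster centers
```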
Step 1072: determining the initialized three-dimensional covariance matrix using scaling and rotation transformation matrices, and performing two-dimensional projection on the three-dimensional covariance matrix to obtain the two-dimensional covariance matrix.
Scaling and rotation transformation matrices are defined to initialize the covariance parameter $\Sigma$, as shown in Formula 10.
$\Sigma = R(q)\,S(s)\,S(s)^T R(q)^T$ (Formula 10)
where $R(q)$ is the rotation transformation expressed by a quaternion $q$, and $S(s)$ is the scaling transformation, represented by a 3D vector $s$.
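Formula 10 can be sketched directly; SciPy's quaternion convention (x, y, z, w) and the demo values are assumptions of the example.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def covariance_3d(q: np.ndarray, s: np.ndarray) -> np.ndarray:
    """Formula 10: Sigma = R(q) S(s) S(s)^T R(q)^T."""
    R = Rotation.from_quat(q).as_matrix()  # rotation from quaternion q (x, y, z, w)
    S = np.diag(s)                         # scaling from the 3D vector s
    return R @ S @ S.T @ R.T

sigma = covariance_3d(np.array([0.0, 0.0, 0.0, 1.0]),  # identity rotation
                      np.array([0.05, 0.02, 0.01]))    # assumed axis scales
```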
The 3D Gaussians are splatted onto the plane, and the three-dimensional digital plant rendering result is then obtained through the tile-based rasterizer.
Three-dimensional rendering in fact means presenting to the user, in real time while the view is dragged, plant images from all angles. To achieve this, the two-dimensional projection of the 3D points must be computed for each viewing angle. The 3D Gaussian points were initialized and optimized in the above embodiment; in this step, they are projected onto the two-dimensional plane for rendering.
The purpose of computing the projected two-dimensional covariance matrix is to project the Gaussian points in three-dimensional space onto the two-dimensional image plane and render them there. The projected two-dimensional covariance matrix accurately represents the shape and distribution of each Gaussian point on the image plane, enabling high-quality rendering.
Rendering (splatting) follows the method of Gaussian splatting. The projected two-dimensional covariance matrix is computed according to Formula 11.
$\Sigma' = J W \Sigma W^T J^T$ (Formula 11)
where $J$ is the Jacobian matrix, describing the partial derivatives of each variable in the projection transformation with respect to the different coordinates, and $W$ is the rotation-translation matrix (corresponding to the camera pose $[R|t]$ above).
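A one-function sketch of Formula 11; the Jacobian `J` and pose rotation `W` below are assumed demo values that would come from the camera model and pose of the current view.

```python
import numpy as np

def project_covariance(cov3d: np.ndarray, J: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Formula 11: Sigma' = J W Sigma W^T J^T, the Gaussian's 2D image-plane footprint."""
    return J @ W @ cov3d @ W.T @ J.T

W = np.eye(3)                      # rotation part of the camera pose (assumed)
J = np.array([[1.0, 0.0, 0.0],     # assumed 2 x 3 Jacobian of the projection
              [0.0, 1.0, 0.0]])
cov2d = project_covariance(np.diag([0.05, 0.02, 0.01]) ** 2, J, W)
```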
Step 1073: multiplying the mean results of the Gaussian points by the two-dimensional covariance matrix, and performing Gaussian processing to obtain the image intensity.
Step 1074: determining the opacity of each Gaussian point, and determining the image color according to the opacity.
Rendering is performed with the tile-based rasterizer method. After the projection transformation (here, multiplying the 3D Gaussian points by the above covariance matrix), the opacity of each Gaussian point is computed with Formula 12, and the final image is composited.
$\alpha' = \alpha \cdot e^{-\frac{1}{2}(x' - \mu')^T \Sigma'^{-1}(x' - \mu')}$ (Formula 12)
where $\alpha$ is the opacity of the three-dimensional Gaussian point and $\alpha'$ is the opacity of the projected two-dimensional Gaussian point.
$C$ is the image color; the final color of the overall image is computed by alpha compositing, as shown in the formula:
$C = \sum_{i} c_i \alpha_i' \prod_{j=1}^{i-1} (1 - \alpha_j')$
where $c_i$ is the color learned by the model and $\alpha_i'$ is the computed opacity.
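A sketch of this front-to-back alpha compositing for one pixel; `colors` and `alphas` stand for the learned colors $c_i$ and projected opacities $\alpha_i'$ of the Gaussians covering the pixel, sorted near to far (assumed inputs).

```python
import numpy as np

def composite_pixel(colors: np.ndarray, alphas: np.ndarray) -> np.ndarray:
    """C = sum_i c_i * alpha'_i * prod_{j<i} (1 - alpha'_j), front to back."""
    transmittance, pixel = 1.0, np.zeros(3)
    for c, a in zip(colors, alphas):     # Gaussians sorted near -> far
        pixel += transmittance * a * c
        transmittance *= 1.0 - a
    return pixel

C = composite_pixel(np.random.rand(5, 3), np.full(5, 0.3))  # demo values
```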
Step 1075: processing with the Gaussian rendering model based on the image intensity obtained by the Gaussian processing and the image color, determining the loss function of the Gaussian rendering model, optimizing the parameters of the loss function, and back-propagating through the Gaussian rendering model, so that the Gaussian rendering model can color-render the segmented plant foreground results based on the image intensity and image color obtained by the Gaussian processing, obtaining the three-dimensional plant rendering results of the viewing angles.
In specific implementation, after the projection processing, the loss function of the Gaussian rendering model is defined in order to update and optimize the parameters and back-propagate. Since only the plant foreground is of interest during visual rendering, higher weights are assigned to plant foreground pixels; the loss function is shown in Formula 13.
$L = \dfrac{1}{N}\sum_{i=1}^{N} w_i \left( \hat{I}(x_i) - I(x_i) \right)^2$ (Formula 13)
where $w_i$ is the weight of each pixel, $N$ is the number of pixels in the target data, $\hat{I}(x_i)$ is the image intensity rendered with the current Gaussian parameters, and $I(x_i)$ is the intensity of the actual image at $x_i$.
The parameters are updated by gradient descent to reduce the loss value, as shown in Formula 14.
$\theta \leftarrow \theta - \beta \nabla_{\theta} L$ (Formula 14)
where $\beta$ is the learning rate, which controls the step size of the parameter updates.
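A hedged PyTorch sketch of one optimization step combining Formulas 13 and 14; `render`, the stand-in parameter tensor, and the learning rate are assumptions of the example, with `render` standing for a differentiable rasterization of the current Gaussian parameters.

```python
import torch

def training_step(render, target, weights, optimizer):
    """One update: weighted foreground loss (Formula 13), then Formula 14."""
    rendered = render()                                 # image from current Gaussians
    loss = (weights * (rendered - target) ** 2).mean()  # higher weights on foreground
    optimizer.zero_grad()
    loss.backward()      # back-propagation through the rendering
    optimizer.step()     # theta <- theta - beta * grad(L)
    return loss.item()

params = torch.randn(64, 64, requires_grad=True)   # stand-in Gaussian parameters
opt = torch.optim.SGD([params], lr=0.01)           # beta = 0.01
loss = training_step(lambda: params, torch.zeros(64, 64), torch.ones(64, 64), opt)
```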
From the computed colors and opacities, each Gaussian point can be rendered onto the image planes of the various angles to composite the images. For any observation viewpoint (camera viewpoint) chosen by the user, the corresponding projected image can be computed, giving omnidirectional three-dimensional plant rendering results for all viewing angles.
In some embodiments, step 108 includes:
Although 3D Gaussian rendering is a new type of three-dimensional representation, traditional three-dimensional measurement methods still mainly rely on meshed three-dimensional facets for computation.
For each of the multiple viewing angles:
Step 1081: optimizing the loss function of the Gaussian rendering model with a regularization-term algorithm, and performing secondary training of the Gaussian model with the three-dimensional plant rendering result of the viewing angle to obtain a Gaussian distribution image fitted to the plant surface.
In specific implementation, the loss function is optimized by introducing regularization terms, and the Gaussian model is retrained based on the Gaussian rendering results to obtain a Gaussian distribution image (a 3D Gaussian distribution) that closely fits the plant surface.
This consists of three main steps: (1) computing SDF values so that the Gaussians lie closer to the plant surface; (2) optimizing the density function in combination with depth map projection to increase the overlap among the 3D Gaussians; and (3) reducing the scaling factors of the 3D Gaussians to make them smoother and fit the plant surface.
With the new regularization-term optimization, 3D Gaussian distributions that largely overlap and completely fit the plant surface can be obtained by training.
Step 1082: performing upsampling based on the density of the Gaussian distribution image fitted to the plant surface to obtain dense plant feature points.
Step 1083: determining a 3D mesh model by Poisson reconstruction, and binding the dense plant feature points onto the 3D mesh model to obtain the plant three-dimensional mesh facet image corresponding to the viewing angle.
In specific implementation, the SDF method is further combined during mesh reconstruction to optimize mesh quality. Finally, 3D Gaussians are re-bound onto the mesh surface to obtain a more refined plant three-dimensional mesh facet image.
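A minimal Open3D sketch of the Poisson reconstruction step; the file names and octree depth are assumptions of the example.

```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("dense_plant_points.ply")  # upsampled plant points
pcd.estimate_normals()   # Poisson reconstruction requires oriented normals

mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)        # larger depth -> finer facets
o3d.io.write_triangle_mesh("plant_mesh.ply", mesh)
```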
The processing results of the solution of the present application are analyzed below.
FIG. 7 shows the plant foreground segmentation results; the plant foreground can be segmented accurately.
FIG. 8 is a schematic diagram of single-view and multi-view foreground segmentation; the multi-view foreground segmentation results can be accurately obtained from FIG. 8.
FIG. 9 shows the three-dimensional Gaussian visual rendering results; accurate visual rendering results are obtained.
Pure three-dimensional rendering offers limited possibilities in actual agricultural production. A method is therefore provided to convert the plant three-dimensional visualization results into real, measurable 3D points. After 3D Gaussian splatting optimization, the Gaussian distribution is not an ordered structure and does not correspond well to the actual scene surface; a method of extracting accurate facets from the 3D Gaussian results was therefore developed. With the method of the present application, accurate plant facets are successfully extracted, on which traits such as height, volume, leaf area, and stem-leaf angle can be measured directly, or new models can be built for detailed point cloud processing such as leaf segmentation and plant trait analysis.
It is worth noting that, although measurable facets are extracted from the plant rendering results to facilitate subsequent computation with traditional point cloud processing, the 3D Gaussian representation, as an efficient means of combining a new scene representation with traditional rendering, and above all with its explicit functional form (in contrast to NeRF's implicit neural network), is sufficient to drive a transformation in the field of three-dimensional phenotype measurement. With the corresponding plug-ins, the 3D Gaussian representation of the present application can conveniently be imported into common software such as Unity for secondary editing.
It should be noted that the method of the embodiments of the present application may be executed by a single device, such as a computer or server. The method of this embodiment may also be applied in distributed scenarios, completed by multiple devices cooperating with one another. In such a distributed scenario, one of the multiple devices may perform only one or more of the steps of the method of the embodiments of the present application, and the devices interact with one another to complete the method.
It should be noted that some embodiments of the present application have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the above embodiments and still achieve the desired results. In addition, the processes depicted in the accompanying drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
基于同一发明构思,与上述任意实施例方法相对应的,本申请还提供了一种基于三维表型建模的植物图像处理装置。Based on the same inventive concept, corresponding to any of the above-mentioned embodiment methods, the present application also provides a plant image processing device based on three-dimensional phenotypic modeling.
参考图10,基于三维表型建模的植物图像处理装置,包括:Referring to FIG10 , a plant image processing device based on three-dimensional phenotypic modeling includes:
特征提取模块201,被配置为对植物在多个视角下的初始图像序列进行特征提取处理,得到提取后的特征点;The feature extraction module 201 is configured to perform feature extraction processing on the initial image sequence of the plant at multiple viewing angles to obtain extracted feature points;
稀疏点云模块202,被配置为确定所述特征点对应的相机位置,并利用三角测量法根据所述相机位置对相应的特征点进行处理,得到植物的稀疏点云数据;The sparse point cloud module 202 is configured to determine the camera position corresponding to the feature point, and process the corresponding feature point according to the camera position using triangulation to obtain sparse point cloud data of the plant;
初始分割模块203,被配置为将多个视角下的初始图像序列输入至追踪网络中的SAM模型中,利用SAM模型对每个视角下的初始图像序列进行前景分割处理,得到初始分割植物前景数据;The initial segmentation module 203 is configured to input the initial image sequences under multiple viewing angles into the SAM model in the tracking network, and perform foreground segmentation processing on the initial image sequence under each viewing angle using the SAM model to obtain initial segmented plant foreground data;
中间分割模块204,被配置为对每个视角下的所述初始分割植物前景数据,利用对应的初始图像进行边缘细化处理,得到与每个视角对应的中间分割植物前景;The intermediate segmentation module 204 is configured to perform edge refinement processing on the initial segmented plant foreground data at each viewing angle using the corresponding initial image, to obtain the intermediate segmented plant foreground corresponding to each viewing angle;
对齐处理模块205,被配置为将各个视角对应的中间分割植物前景进行空间对齐处理,得到对齐后的特征数据;The alignment processing module 205 is configured to perform spatial alignment processing on the intermediate segmented plant foregrounds corresponding to each viewing angle to obtain aligned feature data;
全方位分割模块206,被配置为将所述对齐后的特征数据输入至所述追踪网络中,利用所述追踪网络中的相似性算法进行分析去除不完整的特征,得到各个视角的分割植物前景结果;The omnidirectional segmentation module 206 is configured to input the aligned feature data into the tracking network, and use the similarity algorithm in the tracking network to analyze and remove incomplete features, so as to obtain the segmentation results of the plant foreground at each viewing angle;
高斯渲染模块207,被配置为根据所述稀疏点云数据确定初始高斯点,基于所述初始高斯点对各个视角的分割植物前景结果进行高斯重建和可视化渲染处理,得到各个视角的三维植物渲染结果;The Gaussian rendering module 207 is configured to determine initial Gaussian points according to the sparse point cloud data, and perform Gaussian reconstruction and visualization rendering processing on the segmented plant foreground results of each viewing angle based on the initial Gaussian points to obtain three-dimensional plant rendering results of each viewing angle;
面块处理模块208,被配置为对每个视角的三维植物渲染结果进行网格化三维面片处理,生成与每个视角对应的植物三维网格面片图像;以供基于所述植物三维网格面片图像进行植物形态分析。The surface block processing module 208 is configured to perform gridded 3D surface processing on the 3D plant rendering results of each viewing angle to generate a plant 3D grid surface image corresponding to each viewing angle; so as to perform plant morphological analysis based on the plant 3D grid surface image.
在一些实施例中,特征提取模块201,具体被配置为:In some embodiments, the feature extraction module 201 is specifically configured to:
确定各个视角对应的初始图像序列的灰度值;Determine the grayscale value of the initial image sequence corresponding to each viewing angle;
利用各个视角的尺度不变特征变换方式,使用高斯函数分别对相应视角的初始图像序列的灰度值进行特征提取,得到提取后的特征点。Using the scale-invariant feature transform (SIFT) for each viewing angle, a Gaussian function is applied to extract features from the grayscale values of the initial image sequence of the corresponding viewing angle, to obtain the extracted feature points.
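For illustration, a minimal Python sketch of this step is given below, assuming OpenCV's SIFT implementation (which internally builds a Gaussian scale space) as the scale-invariant feature transform; the image paths and parameters are illustrative assumptions, not values prescribed by the present application.

```python
# A minimal sketch of the feature extraction step, assuming OpenCV's SIFT
# as a stand-in for the scale-invariant feature transform described above.
import cv2

def extract_sift_features(image_paths):
    """Convert each view's image to grayscale and extract SIFT feature points."""
    sift = cv2.SIFT_create()
    features = []
    for path in image_paths:
        image = cv2.imread(path)
        # Determine the grayscale values of the initial image
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # SIFT builds a Gaussian scale space internally and detects
        # extrema as candidate feature points
        keypoints, descriptors = sift.detectAndCompute(gray, None)
        features.append((keypoints, descriptors))
    return features
```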
在一些实施例中,稀疏点云模块202,具体被配置为:In some embodiments, the sparse point cloud module 202 is specifically configured to:
依据快速近邻搜索库对提取到的所述特征点,利用最小化欧氏距离的方式进行特征点匹配,得到特征点匹配结果;Using a fast nearest neighbor search library, the extracted feature points are matched by minimizing the Euclidean distance, to obtain a feature point matching result;
基于所述特征点匹配结果利用PnP算法进行位姿匹配,确定所述特征点的相机位置;Based on the feature point matching result, a PnP algorithm is used to perform posture matching to determine the camera position of the feature point;
根据所述相机位置和所述特征点的位置,利用三角测量法计算所述特征点在三维空间中的坐标,根据坐标确定植物的稀疏点云数据。According to the camera position and the position of the feature point, the coordinates of the feature point in the three-dimensional space are calculated using triangulation, and the sparse point cloud data of the plant is determined according to the coordinates.
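The chain described above can be sketched as follows, assuming OpenCV as the implementation of FLANN-based matching, PnP pose estimation, and triangulation; the camera intrinsic matrix K and the projection matrices P1, P2 are assumed to come from calibration and are not specified by the present application.

```python
# A hedged sketch of the sparse reconstruction chain: FLANN matching by
# minimized Euclidean distance, PnP pose estimation, and triangulation.
import cv2
import numpy as np

def match_features(desc1, desc2, ratio=0.75):
    """Match SIFT descriptors with FLANN, keeping distance-minimizing pairs."""
    flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
    matches = flann.knnMatch(desc1, desc2, k=2)
    # Lowe's ratio test discards ambiguous matches
    return [m for m, n in matches if m.distance < ratio * n.distance]

def estimate_pose(object_points, image_points, K):
    """Recover the camera pose from 2D-3D correspondences via PnP."""
    ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
    return rvec, tvec

def triangulate(P1, P2, pts1, pts2):
    """Triangulate matched points from two views into 3D coordinates."""
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    return (pts4d[:3] / pts4d[3]).T  # homogeneous -> Euclidean
```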
在一些实施例中,初始分割模块203,具体被配置为:In some embodiments, the initial segmentation module 203 is specifically configured to:
针对多个视角中的每个视角:For each of the multiple perspectives:
确定用户针对该视角下的初始图像序列的选定目标点,将所述选定目标点和该视角下的初始图像序列输入至追踪网络中的SAM模型中,进行图像标准化处理,得到标准化处理后的数据;Determine the target point selected by the user for the initial image sequence under the viewing angle, input the selected target point and the initial image sequence under the viewing angle into the SAM model in the tracking network, perform image standardization processing, and obtain standardized data;
利用SAM模型中的多头注意力机制对所述标准化处理后的数据进行残差连接和第二次标准化处理,得到二次标准化处理后的数据;Using the multi-head attention mechanism in the SAM model, the standardized data is processed with a residual connection and a second normalization, to obtain the secondarily normalized data;
利用SAM模型中的多层感知机MLP对所述二次标准化处理后的数据进行处理,将多层感知机MLP的输出结果与所述二次标准化处理后的数据相加形成残差连接,得到图像的嵌入结果;The secondarily normalized data is processed by the multi-layer perceptron (MLP) in the SAM model, and the output of the MLP is added to the secondarily normalized data to form a residual connection, obtaining the embedding result of the image;
将所述选定目标点进行位置编码处理,得到位置编码结果;Performing position coding processing on the selected target point to obtain a position coding result;
使用SAM模型中的自注意力机制根据所述图像的嵌入结果和所述位置编码结果,分析图像的各部分的关系,以及使用SAM模型中的交叉注意力机制根据所述图像的嵌入结果和所述位置编码结果,加强图像的特定部分与选定目标点之间的相关性;Using the self-attention mechanism in the SAM model to analyze the relationship between the parts of the image according to the embedding result of the image and the position encoding result, and using the cross-attention mechanism in the SAM model to strengthen the correlation between the specific part of the image and the selected target point according to the embedding result of the image and the position encoding result;
基于所述图像的各部分的关系和图像的特定部分与选定目标点之间的相关性,进行前景分割处理,得到初始分割植物前景数据。Based on the relationship between the various parts of the image and the correlation between the specific part of the image and the selected target point, foreground segmentation processing is performed to obtain initial segmented plant foreground data.
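A hedged PyTorch sketch of the encoder block described above (normalization, multi-head attention with a residual connection, a second normalization, and an MLP closed by a second residual) follows; the dimensions are placeholders rather than the actual SAM configuration.

```python
# An illustrative encoder block: normalization, multi-head self-attention
# with a residual connection, a second normalization, and an MLP whose
# output is added back as a second residual connection.
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    def __init__(self, dim=256, heads=8, mlp_dim=1024):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, mlp_dim), nn.GELU(), nn.Linear(mlp_dim, dim))

    def forward(self, x):
        # Multi-head attention on the normalized data, then a residual
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)
        x = x + attn_out
        # Second normalization, MLP, and the second residual connection
        x = x + self.mlp(self.norm2(x))
        return x  # the image embedding after this block
```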
在一些实施例中,中间分割模块204,具体被配置为:In some embodiments, the middle segmentation module 204 is specifically configured to:
针对多个视角中的每个视角:For each of the multiple perspectives:
将该视角下的初始图像序列和初始分割植物前景数据输入至追踪网络中;Input the initial image sequence and initial segmented plant foreground data under the viewing angle into the tracking network;
利用所述追踪网络中的XMem模型,根据初始分割植物前景数据进行半监督处理,得到优化后的初始分割植物前景数据;Using the XMem model in the tracking network, semi-supervised processing is performed based on the initial segmented plant foreground data to obtain optimized initial segmented plant foreground data;
利用所述追踪网络中SAM模型,基于该视角下的初始图像序列对优化后的初始分割植物前景数据进行边缘细化处理,得到该视角对应的中间分割植物前景。The SAM model in the tracking network is used to perform edge refinement processing on the optimized initial segmented plant foreground data based on the initial image sequence at the viewing angle to obtain the intermediate segmented plant foreground corresponding to the viewing angle.
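A high-level sketch of this per-view loop is given below; xmem_propagate and sam_refine_edges are hypothetical stand-ins for the XMem mask propagation and the SAM edge refinement, not calls from any real library API.

```python
# A high-level sketch of the per-view refinement loop described above.
# The two callables are hypothetical placeholders supplied by the caller.
def refine_view(frames, initial_mask, xmem_propagate, sam_refine_edges):
    """Propagate a first-frame mask through the sequence, then refine edges."""
    masks = [initial_mask]
    for frame in frames[1:]:
        # XMem treats the task as semi-supervised video object segmentation:
        # the given mask supervises propagation to each subsequent frame
        masks.append(xmem_propagate(frame, masks[-1]))
    # SAM refines each propagated mask against its original image
    return [sam_refine_edges(f, m) for f, m in zip(frames, masks)]
```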
在一些实施例中,对齐处理模块205,具体被配置为:In some embodiments, the alignment processing module 205 is specifically configured to:
针对各个视角对应的中间分割植物前景提取多尺度特征;Extract multi-scale features for the intermediate segmented plant foreground corresponding to each perspective;
利用ICP配准算法依据所述多尺度特征,对各个视角的中间分割植物前景进行空间对齐,并将空间对齐后的中间分割植物前景进行特征加权平均处理,得到对齐后的特征数据。Based on the multi-scale features, the ICP registration algorithm is used to spatially align the intermediate segmented plant foregrounds of each viewing angle, and feature-weighted averaging is performed on the spatially aligned intermediate segmented plant foregrounds to obtain the aligned feature data.
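As an illustration, the following sketch performs the spatial alignment with Open3D's point-to-point ICP; the distance threshold and identity initialization are assumed illustrative values, not parameters fixed by the present application.

```python
# A minimal sketch of the alignment step using Open3D's ICP registration.
import numpy as np
import open3d as o3d

def align_views(source_points, target_points, threshold=0.02):
    """Spatially align one view's foreground points to a reference view."""
    source = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(source_points))
    target = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(target_points))
    init = np.eye(4)  # initial guess: identity transform
    result = o3d.pipelines.registration.registration_icp(
        source, target, threshold, init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    # Apply the estimated rigid transform to bring the source into alignment
    return source.transform(result.transformation)
```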
在一些实施例中,高斯渲染模块207,具体被配置为:In some embodiments, the Gaussian rendering module 207 is specifically configured to:
根据所述稀疏点云数据确定初始高斯点,基于K-Means聚类算法对所述初始高斯点进行聚类均值处理,得到高斯点的均值结果;Determine initial Gaussian points according to the sparse point cloud data, perform clustering mean processing on the initial Gaussian points based on the K-Means clustering algorithm, and obtain a mean result of the Gaussian points;
利用放缩变换和旋转变换矩阵确定初始化的三维协方差矩阵,并根据三维协方差矩阵进行二维投影计算得到二维协方差矩阵;An initial three-dimensional covariance matrix is determined using the scaling and rotation transformation matrices, and a two-dimensional covariance matrix is obtained by projecting the three-dimensional covariance matrix into two dimensions;
将所述高斯点的均值结果乘以所述二维协方差矩阵,进行高斯处理,得到图像强度;The mean result of the Gaussian points is multiplied by the two-dimensional covariance matrix and Gaussian processing is performed to obtain an image intensity;
确定每个高斯点的不透明度,根据所述不透明度确定图像颜色;Determine the opacity of each Gaussian point, and determine the image color according to the opacity;
基于所述高斯处理得到的图像强度和所述图像颜色,利用高斯渲染模型进行处理,并确定高斯渲染模型的损失函数,并对损失函数优化参数后对高斯渲染模型进行反向传播处理,使得利用高斯渲染模型能够基于高斯处理得到的图像强度和图像颜色对分割植物前景结果进行颜色渲染,得到各个视角的三维植物渲染结果。Based on the image intensity obtained by the Gaussian processing and the image color, processing is performed using a Gaussian rendering model. A loss function of the Gaussian rendering model is determined; after its parameters are optimized, back-propagation is performed on the Gaussian rendering model, so that the model can color-render the segmented plant foreground results based on the image intensity and image color obtained by the Gaussian processing, yielding three-dimensional plant rendering results for each viewing angle.
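The covariance construction described above can be sketched in a few lines of numpy; the 2x3 Jacobian J of the projective transform and the 3x3 view matrix W are placeholders for the camera model, which the present application does not fix to particular values here.

```python
# A numpy sketch of the 3D covariance built from scaling and rotation
# matrices, and its projection to the 2D covariance used for rendering.
import numpy as np

def covariance_3d(scale, rotation):
    """Sigma = R S S^T R^T, from a scale vector and a 3x3 rotation matrix."""
    S = np.diag(scale)
    return rotation @ S @ S.T @ rotation.T

def covariance_2d(sigma3d, J, W):
    """Project the 3D covariance into image space: Sigma' = J W Sigma W^T J^T."""
    return J @ W @ sigma3d @ W.T @ J.T

# Example with an axis-aligned Gaussian and an orthographic-like projection
sigma = covariance_3d(np.array([0.1, 0.2, 0.05]), np.eye(3))
sigma2d = covariance_2d(sigma, np.eye(3)[:2], np.eye(3))  # J drops the z-row
```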
在一些实施例中,面块处理模块208,具体被配置为:In some embodiments, the face block processing module 208 is specifically configured to:
针对多个视角中的每个视角:For each of the multiple perspectives:
利用正则项的算法优化高斯渲染模型的损失函数,利用该视角的三维植物渲染结果对高斯模型进行二次训练处理,得到与植物表面贴合的高斯分布图像;The loss function of the Gaussian rendering model is optimized with a regularization term, and the Gaussian model is retrained using the three-dimensional plant rendering result of this viewing angle, to obtain a Gaussian distribution image that fits the plant surface;
基于所述与植物表面贴合的高斯分布图像的密度进行上采样处理,得到稠密植物特征点;Perform upsampling processing based on the density of the Gaussian distribution image that fits the plant surface to obtain dense plant feature points;
利用泊松重建的方式确定3D网格模型,在所述3D网格模型上绑定所述稠密植物特征点,得到该视角对应的植物三维网格面片图像。A 3D mesh model is determined by Poisson reconstruction, and the dense plant feature points are bound to the 3D mesh model to obtain the plant three-dimensional mesh patch image corresponding to the viewing angle.
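A minimal sketch of the meshing step follows, assuming Open3D's Poisson surface reconstruction; the octree depth and normal-estimation parameters are illustrative assumptions, and dense_points stands for the upsampled dense plant feature points described above.

```python
# A minimal sketch of building a mesh from dense plant feature points
# via Open3D's Poisson surface reconstruction.
import open3d as o3d

def poisson_mesh(dense_points, depth=9):
    """Reconstruct a 3D mesh model from dense plant feature points."""
    pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(dense_points))
    # Poisson reconstruction requires oriented normals
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))
    mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=depth)
    return mesh
```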
为了描述的方便,描述以上装置时以功能分为各种模块分别描述。当然,在实施本申请时可以把各模块的功能在同一个或多个软件和/或硬件中实现。For the convenience of description, the above device is described in terms of functions divided into various modules. Of course, when implementing the present application, the functions of each module can be implemented in the same or multiple software and/or hardware.
上述实施例的装置用于实现前述任一实施例中相应的方法,并且具有相应的方法实施例的有益效果,在此不再赘述。The device of the above embodiment is used to implement the corresponding method in any of the above embodiments, and has the beneficial effects of the corresponding method embodiment, which will not be described in detail here.
基于同一发明构思,与上述任意实施例方法相对应的,本申请还提供了一种电子设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,所述处理器执行所述程序时实现上任意一实施例所述的方法。Based on the same inventive concept, corresponding to any of the above-mentioned embodiments and methods, the present application also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method described in any of the above embodiments when executing the program.
图11示出了本实施例所提供的一种更为具体的电子设备硬件结构示意图,该设备可以包括:处理器1010、存储器1020、输入/输出接口1030、通信接口1040和总线1050。其中处理器1010、存储器1020、输入/输出接口1030和通信接口1040通过总线1050实现彼此之间在设备内部的通信连接。FIG. 11 shows a more specific schematic diagram of the hardware structure of an electronic device provided in this embodiment. The device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040 are communicatively connected to each other within the device through the bus 1050.
处理器1010可以采用通用的CPU(Central Processing Unit,中央处理器)、微处理器、应用专用集成电路(Application Specific Integrated Circuit,ASIC)、或者一个或多个集成电路等方式实现,用于执行相关程序,以实现本说明书实施例所提供的技术方案。The processor 1010 can be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute relevant programs to implement the technical solutions provided in the embodiments of this specification.
存储器1020可以采用ROM(Read Only Memory,只读存储器)、RAM(Random AccessMemory,随机存取存储器)、静态存储设备,动态存储设备等形式实现。存储器1020可以存储操作系统和其他应用程序,在通过软件或者固件来实现本说明书实施例所提供的技术方案时,相关的程序代码保存在存储器1020中,并由处理器1010来调用执行。The memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), static storage device, dynamic storage device, etc. The memory 1020 may store an operating system and other application programs. When the technical solutions provided in the embodiments of this specification are implemented by software or firmware, the relevant program codes are stored in the memory 1020 and are called and executed by the processor 1010.
输入/输出接口1030用于连接输入/输出模块,以实现信息输入及输出。输入输出/模块可以作为组件配置在设备中(图中未示出),也可以外接于设备以提供相应功能。其中输入设备可以包括键盘、鼠标、触摸屏、麦克风、各类传感器等,输出设备可以包括显示器、扬声器、振动器、指示灯等。The input/output interface 1030 is used to connect the input/output module to realize information input and output. The input/output module can be configured in the device as a component (not shown in the figure), or it can be externally connected to the device to provide corresponding functions. The input device may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output device may include a display, a speaker, a vibrator, an indicator light, etc.
通信接口1040用于连接通信模块(图中未示出),以实现本设备与其他设备的通信交互。其中通信模块可以通过有线方式(例如USB、网线等)实现通信,也可以通过无线方式(例如移动网络、WIFI、蓝牙等)实现通信。The communication interface 1040 is used to connect a communication module (not shown) to realize communication interaction between the device and other devices. The communication module can realize communication through wired means (such as USB, network cable, etc.) or wireless means (such as mobile network, WIFI, Bluetooth, etc.).
总线1050包括一通路,在设备的各个组件(例如处理器1010、存储器1020、输入/输出接口1030和通信接口1040)之间传输信息。The bus 1050 includes a path that transmits information between the various components of the device (eg, the processor 1010 , the memory 1020 , the input/output interface 1030 , and the communication interface 1040 ).
需要说明的是,尽管上述设备仅示出了处理器1010、存储器1020、输入/输出接口1030、通信接口1040以及总线1050,但是在具体实施过程中,该设备还可以包括实现正常运行所必需的其他组件。此外,本领域的技术人员可以理解的是,上述设备中也可以仅包含实现本说明书实施例方案所必需的组件,而不必包含图中所示的全部组件。It should be noted that, although the above device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in the specific implementation process, the device may also include other components necessary for normal operation. In addition, it can be understood by those skilled in the art that the above device may also only include the components necessary for implementing the embodiments of the present specification, and does not necessarily include all the components shown in the figure.
上述实施例的电子设备用于实现前述任一实施例中相应的方法,并且具有相应的方法实施例的有益效果,在此不再赘述。The electronic device of the above embodiment is used to implement the corresponding method in any of the above embodiments, and has the beneficial effects of the corresponding method embodiment, which will not be described in detail here.
基于同一发明构思,与上述任意实施例方法相对应的,本申请还提供了一种非暂态计算机可读存储介质,所述非暂态计算机可读存储介质存储计算机指令,所述计算机指令用于使所述计算机执行如上任一实施例所述的方法。Based on the same inventive concept, corresponding to any of the above-mentioned embodiments, the present application also provides a non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores computer instructions, and the computer instructions are used to enable the computer to execute the method described in any of the above embodiments.
本实施例的计算机可读介质包括永久性和非永久性、可移动和非可移动媒体,可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括,但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带、磁带磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。The computer-readable medium of this embodiment includes permanent and non-permanent, removable and non-removable media, in which information storage can be implemented by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
上述实施例的存储介质存储的计算机指令用于使所述计算机执行如上任一实施例所述的方法,并且具有相应的方法实施例的有益效果,在此不再赘述。The computer instructions stored in the storage medium of the above embodiments are used to enable the computer to execute the method described in any of the above embodiments, and have the beneficial effects of the corresponding method embodiments, which will not be repeated here.
基于同一构思,与上述任意实施例方法相对应的,本申请还提供了一种计算机程序产品,包括计算机程序指令,当所述计算机程序指令在计算机上运行时,使得所述计算机执行如上任一实施例所述的方法,具有相应的方法实施例的有益效果,在此不再赘述。Based on the same concept, corresponding to any of the above-mentioned embodiments, the present application also provides a computer program product, including computer program instructions. When the computer program instructions are run on a computer, the computer executes the method described in any of the above embodiments, which has the beneficial effects of the corresponding method embodiments and will not be repeated here.
可以理解的是,在使用本公开中各个实施例的技术方案之前,均会通过恰当的方式对所涉及的个人信息的类型、使用范围、使用场景等告知用户,并获得用户的授权。It is understandable that before using the technical solutions of each embodiment of the present disclosure, the type, scope of use, usage scenarios, etc. of the personal information involved will be informed to the user in an appropriate manner, and the user's authorization will be obtained.
例如,在响应于接收到用户的主动请求时,向用户发送提示信息,以明确的提示用户,其请求执行的操作将需要获取和使用到用户的个人信息。从而,使得用户可以根据提示信息来自主的选择是否向执行本公开技术方案的操作的电子设备、应用程序、服务器或存储介质等软件或硬件提供个人信息。For example, in response to receiving an active request from a user, a prompt message is sent to the user to clearly remind the user that the operation requested to be performed will require obtaining and using the user's personal information. Thus, the user can independently choose whether to provide personal information to software or hardware such as an electronic device, application, server, or storage medium that performs the operation of the technical solution of the present disclosure according to the prompt message.
作为一种可选的但非限定的实现方式,响应于接受到用户的主动请求,向用户发送提示信息的方式例如可以是弹窗的方式,弹窗中可以以文字的方式呈现提示信息。此外,弹窗中还可以承载供用户选择“同意”或者“不同意”向电子设备提供个人信息的选择控件。As an optional but non-limiting implementation, in response to receiving the user's active request, the prompt information may be sent to the user in the form of a pop-up window, in which the prompt information may be presented in text form. In addition, the pop-up window may also carry a selection control for the user to choose "agree" or "disagree" to provide personal information to the electronic device.
可以理解的是,上述通知和获取用户授权过程仅是示意性的,不对本公开的实现方式构成限定,其他满足相关法律法规的方式也可应用于本公开的实现方式中。It is understandable that the above notification and the process of obtaining user authorization are merely illustrative and do not constitute a limitation on the implementation of the present disclosure. Other methods that meet relevant laws and regulations may also be applied to the implementation of the present disclosure.
所属领域的普通技术人员应当理解:以上任何实施例的讨论仅为示例性的,并非旨在暗示本申请的范围(包括权利要求)被限于这些例子;在本申请的思路下,以上实施例或者不同实施例中的技术特征之间也可以进行组合,步骤可以以任意顺序实现,并存在如上所述的本申请实施例的不同方面的许多其它变化,为了简明它们没有在细节中提供。A person skilled in the art should understand that the discussion of any of the above embodiments is merely illustrative and is not intended to imply that the scope of the present application (including the claims) is limited to these examples. In line with the concept of the present application, the technical features in the above embodiments or different embodiments may be combined, the steps may be implemented in any order, and there are many other variations of the different aspects of the embodiments of the present application as described above, which are not provided in detail for the sake of simplicity.
另外,为简化说明和讨论,并且为了不会使本申请实施例难以理解,在所提供的附图中可以示出或可以不示出与集成电路(IC)芯片和其它部件的公知的电源/接地连接。此外,可以以框图的形式示出装置,以便避免使本申请实施例难以理解,并且这也考虑了以下事实,即关于这些框图装置的实施方式的细节是高度取决于将要实施本申请实施例的平台的(即,这些细节应当完全处于本领域技术人员的理解范围内)。在阐述了具体细节(例如,电路)以描述本申请的示例性实施例的情况下,对本领域技术人员来说显而易见的是,可以在没有这些具体细节的情况下或者这些具体细节有变化的情况下实施本申请实施例。因此,这些描述应被认为是说明性的而不是限制性的。In addition, to simplify the description and discussion, and in order not to make the embodiments of the present application difficult to understand, the well-known power/ground connections to the integrated circuit (IC) chip and other components may or may not be shown in the provided drawings. In addition, the device may be shown in the form of a block diagram to avoid making the embodiments of the present application difficult to understand, and this also takes into account the fact that the details of the implementation of these block diagram devices are highly dependent on the platform on which the embodiments of the present application are to be implemented (that is, these details should be fully within the scope of understanding of those skilled in the art). Where specific details (e.g., circuits) are set forth to describe exemplary embodiments of the present application, it is obvious to those skilled in the art that the embodiments of the present application can be implemented without these specific details or with changes in these specific details. Therefore, these descriptions should be considered illustrative rather than restrictive.
尽管已经结合了本申请的具体实施例对本申请进行了描述,但是根据前面的描述,这些实施例的很多替换、修改和变型对本领域普通技术人员来说将是显而易见的。例如,其它存储器架构(例如,动态RAM(DRAM))可以使用所讨论的实施例。Although the present application has been described in conjunction with specific embodiments of the present application, many alternatives, modifications and variations of these embodiments will be apparent to those skilled in the art from the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the discussed embodiments.
本申请实施例旨在涵盖落入所附权利要求的宽泛范围之内的所有这样的替换、修改和变型。因此,凡在本申请实施例的精神和原则之内,所做的任何省略、修改、等同替换、改进等,均应包含在本申请的保护范围之内。The embodiments of the present application are intended to cover all such substitutions, modifications and variations that fall within the broad scope of the appended claims. Therefore, any omissions, modifications, equivalent substitutions, improvements, etc. made within the spirit and principles of the embodiments of the present application should be included in the scope of protection of the present application.