CN114782996A


Info

Publication number: CN114782996A
Application number: CN202210506046.5A
Authority: CN
Inventors: 贺俊东; 赵婷婷; 王彬; 周材权
Original assignee: China West Normal University
Current assignee: China West Normal University
Priority date: 2022-05-10
Filing date: 2022-05-10
Publication date: 2022-07-22

Abstract

Embodiments of the invention disclose an image recognition processing method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring an image to be recognized and preprocessing it; performing unsupervised pre-training on the image to be recognized to determine its feature representation, and classifying the image to be recognized; and performing supervised training on the image to be recognized according to its feature representation, so as to identify and determine a target image. The technical solution introduces an unsupervised pre-training process and uses contrastive learning to reduce the dependence on labels and strengthen the representation of samples, so that images can be recognized effectively with fewer samples.

Description


Image recognition processing method, apparatus, electronic device and storage medium

Technical Field

Embodiments of the present invention relate to the technical field of image recognition, and in particular to an image recognition processing method, apparatus, electronic device, and storage medium.

Background

In animal conservation work, cameras are deployed in places where protected animals are likely to appear, and whenever the camera's picture changes, the corresponding image data is recorded. Among the animal images captured by the cameras, the proportion showing the target animal may not be high, so labeling 100 images of the target animal may require browsing tens of thousands of images before a good result can be achieved.

Therefore, how to reduce image labeling while recognizing target animal images accurately, automatically, and efficiently is a technical problem that those skilled in the art urgently need to solve.

Summary of the Invention

Embodiments of the present invention provide an image recognition processing method, apparatus, electronic device, and storage medium, which reduce the recognition system's dependence on labeled training samples and improve image recognition efficiency.

In a first aspect, an embodiment of the present invention provides an image recognition processing method, including:

acquiring an image to be recognized and preprocessing the image to be recognized;

performing unsupervised pre-training on the image to be recognized to determine a feature representation of the image to be recognized, and classifying the image to be recognized;

performing supervised training on the image to be recognized according to the feature representation of the image to be recognized, and identifying and determining a target image.

In a second aspect, an embodiment of the present invention further provides an image recognition processing apparatus, including:

an image acquisition module, configured to acquire an image to be recognized and preprocess the image to be recognized;

an image classification module, configured to perform unsupervised pre-training on the image to be recognized to determine a feature representation of the image to be recognized, and to classify the image to be recognized;

an image recognition module, configured to perform supervised training on the image to be recognized according to the feature representation of the image to be recognized, and to identify and determine a target image.

In a third aspect, an embodiment of the present invention further provides an electronic device, including:

one or more processors;

a storage device for storing one or more programs;

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the image recognition processing method described in any embodiment of the present invention.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, and the program, when executed by a processor, implements the image recognition processing method described in any embodiment of the present invention.

Embodiments of the present invention provide an image recognition processing method, apparatus, electronic device, and storage medium, which acquire an image to be recognized and preprocess it; perform unsupervised pre-training on the image to be recognized to determine its feature representation, and classify the image to be recognized; and perform supervised training on the image to be recognized according to its feature representation, so as to identify and determine a target image. The technical solution introduces an unsupervised pre-training process and uses contrastive learning to reduce the dependence on annotations and strengthen the representation of samples, so that images can be recognized effectively with fewer samples.

Description of the Drawings

Other features, objects, and advantages of the present invention will become more apparent upon reading the detailed description of non-limiting embodiments given with reference to the following drawings. The drawings serve only to illustrate preferred embodiments and are not to be regarded as limiting the invention. Throughout the drawings, the same reference symbols denote the same components. In the drawings:

FIG. 1A is a flowchart of an image recognition processing method provided in Embodiment 1 of the present invention;

FIG. 1B is a schematic diagram of unsupervised pre-training image classification provided by an embodiment of the present invention;

FIG. 2A is a flowchart of an image recognition processing method provided in Embodiment 2 of the present invention;

FIG. 2B is an abstract schematic diagram of an unsupervised pre-training denoising process provided by an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an image recognition processing apparatus provided in Embodiment 3 of the present invention;

FIG. 4 is a schematic structural diagram of an electronic device provided in Embodiment 4 of the present application.

Detailed Description

The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.

Before the exemplary embodiments are discussed in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations (or steps) as a sequential process, many of them may be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations may be rearranged. A process may be terminated when its operations are completed, but it may also have additional steps not included in the drawings. A process may correspond to a method, function, procedure, subroutine, subprogram, and so on.

The Sichuan golden snub-nosed monkey is a rare and endangered primate species unique to China. It lives in mountain systems such as the Qinling Mountains in Shaanxi, western Sichuan, southern Gansu, and Shennongjia in Hubei. It is currently listed as a national first-class protected animal. Together with the giant panda it is known as one of the "two treasures", and it is a representative, flagship, and umbrella species for wildlife conservation in Sichuan, so research on its conservation is of great significance. In conservation work for the species, analysis of group density and of living habits provides important baseline information. How to recognize Sichuan golden snub-nosed monkeys accurately, automatically, and efficiently from cameras and similar equipment has therefore become a key problem.

In the prior art, samples of the Sichuan golden snub-nosed monkey are picked out of a massive number of camera samples by supervised learning, computer vision techniques, and manual labeling. Classic algorithms such as ResNet and DenseNet all need a large amount of data to achieve good results. The general process is as follows: cameras are deployed in places where the monkeys are likely to appear, and whenever the camera's picture changes, the corresponding image data is recorded; the pictures in which a monkey appears are labeled manually in the collected data; the labeled image data is used to train a ResNet (or a network of the same type), and that network is used to recognize the monkeys automatically. Labeling for supervised learning is very expensive, for two main reasons. First, among the animals captured by the cameras, the proportion of Sichuan golden snub-nosed monkeys may not be high, so labeling 100 monkey images may require browsing tens of thousands of pictures. Second, the mechanism of supervised learning means that a massive number of samples must be provided for it to achieve good results. Recognizing images by supervised learning therefore uses many images and requires a large number of them to be labeled manually. For this reason, the embodiments of the present invention provide an image recognition processing method.

Embodiment 1

FIG. 1A is a flowchart of an image recognition processing method provided in Embodiment 1 of the present invention. This embodiment is applicable to the case of performing recognition processing on images. The method of this embodiment may be executed by an image recognition processing apparatus, which may be implemented in hardware and/or software and may be configured in a server for image recognition processing. The method specifically includes the following steps:

S110: Acquire an image to be recognized and preprocess the image to be recognized.

The image to be recognized may be an image captured in real time by a terminal device or selected from the album of the terminal device. The images to be recognized include, but are not limited to, the target image; for example, in an optional solution of the embodiment of the present invention, the images to be recognized include, but are not limited to, images of Sichuan golden snub-nosed monkeys and images of other animals.

Optionally, acquiring the image to be recognized and preprocessing the image to be recognized includes:

acquiring an image to be recognized that is captured in real time by a terminal device or selected from the album of the terminal device;

preprocessing the image to be recognized, where the preprocessing includes image size and format adjustment, denoising, matting, and removal of background influence.

Optionally, the images to be recognized are obtained through multiple channels, including but not limited to real-time shooting with mobile smart terminal devices such as mobile phones, tablet computers, or cameras, downloading from the network, and public collection, which ensures a rich variety of image sources.

Optionally, the image to be recognized is preprocessed, where the preprocessing includes, but is not limited to, image size and format adjustment, denoising, matting, and removal of background influence. In preprocessing, the size and format of the image to be recognized are adjusted so that it is suitable as a training image for unsupervised pre-training; denoising removes image noise so that the feature information contained in the image is more accurate.
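The size adjustment and denoising steps described above can be sketched as follows. This is a minimal illustration in plain Python on a grayscale image stored as a list of rows; the nearest-neighbour resize, the 3x3 mean filter, the scaling to [0, 1], and the function names `resize_nearest`, `mean_filter`, and `preprocess` are all assumptions made for the example, since the text does not specify which resizing or denoising method is used.

```python
def resize_nearest(img, out_h, out_w):
    """Resize a 2-D grayscale image (list of rows) by nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    return [[img[r * h // out_h][c * w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def mean_filter(img):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [img[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def preprocess(img, size=4):
    """Resize to size x size, denoise, and scale 8-bit values to [0, 1]."""
    smoothed = mean_filter(resize_nearest(img, size, size))
    return [[v / 255.0 for v in row] for row in smoothed]


# A 6 x 8 synthetic image standing in for a camera frame (assumed data).
image = [[(r * 8 + c) % 256 for c in range(8)] for r in range(6)]
result = preprocess(image, size=4)
```

In practice an image library would handle format conversion as well; the sketch only shows where resizing and denoising sit before the pre-training stage.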

S120: Perform unsupervised pre-training on the image to be recognized to determine the feature representation of the image to be recognized, and classify the image to be recognized.

Here, unsupervised pre-training may mean automatically creating a classification task by means of image augmentation, where the goal of the task is to place an image and its copy in one class and all other images in another class. FIG. 1B is a schematic diagram of unsupervised pre-training image classification provided by an embodiment of the present invention. In FIG. 1B, the pictures on the left are copies of the original images and the pictures along the top are the original images. The values 1 and 0 are the training targets: 1 means predicted to be the same class, and 0 means not the same class. For example, the "panda" at the top of FIG. 1B and the "panda" copy are the same class, so the target value is 1; the "panda" at the top of FIG. 1B and the "rhino" copy are not the same class, so the target value is 0. The image copy may be a copy of the image to be recognized whose pixel values differ from those of the image to be recognized but whose semantics are the same; "the same semantics" means that, apart from the pixel values, all other image content is identical to that of the image to be recognized.
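The target matrix of FIG. 1B can be written down directly. The sketch below assumes that copy i is paired with original i, so the targets form an identity matrix; the helper name `contrastive_targets` and the three example image names are illustrative only.

```python
def contrastive_targets(n):
    """n x n target matrix: 1 where copy i meets original i, otherwise 0."""
    return [[1 if row == col else 0 for col in range(n)] for row in range(n)]


names = ["panda", "rhino", "monkey"]
targets = contrastive_targets(len(names))
for name, row in zip(names, targets):
    # Each row belongs to one copy; each column belongs to one original image.
    print(f"copy of {name}: {row}")
```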

The feature representation may refer to features of the image to be recognized, such as its pixels, image content, and image category. Unsupervised pre-training is performed on the image to be recognized to determine its feature representation, and the image to be recognized is classified.

S130: Perform supervised training on the image to be recognized according to the feature representation of the image to be recognized, and identify and determine a target image.

Here, supervised training may mean learning a function (for example, model parameters) from a given training data set, so that when new data arrives, a result can be predicted with that function. The training set for supervised learning must contain both inputs and outputs, in other words features and targets, and the targets in the training set are labeled by people. Classification is the most common supervised learning problem: an optimal model is trained from existing training samples (that is, known data and the corresponding outputs), the model is then used to map every input to a corresponding output, and a simple decision on the output achieves the classification. For example, supervised training is performed on the images to be recognized using the feature representations obtained through unsupervised pre-training, which yields a supervised model; this model then processes the input images to be recognized, classifies them, and determines the target image.

The embodiment of the present invention provides an image recognition processing method that acquires an image to be recognized and preprocesses it; performs unsupervised pre-training on the image to be recognized to determine its feature representation, and classifies the image to be recognized; and performs supervised training on the image to be recognized according to its feature representation, so as to identify and determine a target image. The technical solution introduces an unsupervised pre-training process and uses contrastive learning to reduce the dependence on annotations and strengthen the representation of samples; supervised training is then used to recognize the image to be recognized, so that images can be recognized effectively with fewer labeled samples.

Embodiment 2

FIG. 2A is a flowchart of an image recognition processing method provided in Embodiment 2 of the present invention. This embodiment further refines the foregoing embodiment and may be combined with the optional solutions in one or more of the foregoing embodiments. As shown in FIG. 2A, the image recognition processing method provided in this embodiment may include the following steps:

S210: Acquire an image to be recognized and preprocess the image to be recognized.

S220: Perform unsupervised pre-training on the image to be recognized to determine the feature representation of the image to be recognized, and classify the image to be recognized.

Optionally, performing unsupervised pre-training on the image to be recognized to determine the feature representation of the image to be recognized, and classifying the image to be recognized, includes:

performing unsupervised pre-training on the image to be recognized, and obtaining a copy of a first image to be recognized whose pixels differ from those of the first image to be recognized but whose image content is the same;

obtaining the feature representations of the first image to be recognized, the copy of the first image to be recognized, and a second image to be recognized, where the images to be recognized include the first image to be recognized, the copy of the first image to be recognized, and the second image to be recognized;

classifying the first image to be recognized, the copy of the first image to be recognized, and the second image to be recognized according to the feature representations of the images to be recognized.

In an optional solution of the embodiment of the present invention, there is a data set X = {x_1, x_2, x_3, ..., x_n} acquired by a terminal device, where x denotes a single picture and X denotes the whole data set, and it is known that the data acquired by the terminal device contains a number of target images. The unsupervised pre-training process includes:

z_j = Resnet(Augment(x))

z_i = Resnet(x)

Here, Augment denotes the image augmentation method, which uses operations including but not limited to cropping and flipping to produce a copy of the image to be recognized whose pixels differ but whose semantics are the same. z_i and z_j denote the feature representation of the first image to be recognized and the feature representation of the copy of the first image to be recognized, respectively.
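A minimal sketch of such an Augment step, using the two operations named in the text (cropping and flipping), might look as follows. The image is a plain list of rows; the crop size, the 50% flip probability, and the function name `augment` are assumptions made for the example.

```python
import random


def augment(img, crop_h, crop_w, rng):
    """Random crop followed by a random horizontal flip of a 2-D image."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - crop_h)    # randint is inclusive at both ends
    left = rng.randint(0, w - crop_w)
    patch = [row[left:left + crop_w] for row in img[top:top + crop_h]]
    if rng.random() < 0.5:              # flip half of the time (assumed rate)
        patch = [row[::-1] for row in patch]
    return patch


rng = random.Random(0)
x = [[r * 10 + c for c in range(6)] for r in range(6)]
copy = augment(x, 4, 4, rng)
```

The copy has different pixel positions from `x` but shows the same content, which is the property the pre-training task relies on.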

Optionally, performing unsupervised pre-training on the image to be recognized to determine the feature representation of the image to be recognized, and classifying the image to be recognized, further includes:

determining the robustness of the unsupervised pre-training model according to a loss function, where the loss function is as follows:

Here, z_i and z_j denote the feature representation of the first image to be recognized and the feature representation of the copy of the first image to be recognized, respectively; sim denotes the similarity between the feature representations of the first image to be recognized and of its copy; and z_k denotes the feature representation of the second image to be recognized.
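The loss formula itself is not reproduced in this text. As an illustration only, the sketch below implements a common contrastive loss that is consistent with the symbols defined above: the positive pair (z_i, z_j) in the numerator and the other images z_k in the denominator. The cosine form of sim, the temperature parameter, and the function names are assumptions and are not taken from the source.

```python
import math


def cosine_sim(a, b):
    """Cosine similarity between two feature vectors (assumed form of sim)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def contrastive_loss(z_i, z_j, negatives, temperature=0.5):
    """-log( e^{sim(z_i,z_j)/t} / (e^{sim(z_i,z_j)/t} + sum_k e^{sim(z_i,z_k)/t}) )."""
    pos = math.exp(cosine_sim(z_i, z_j) / temperature)
    neg = sum(math.exp(cosine_sim(z_i, z_k) / temperature) for z_k in negatives)
    return -math.log(pos / (pos + neg))


z_i = [1.0, 0.0]   # feature of the first image
z_j = [0.9, 0.1]   # feature of its augmented copy
z_k = [0.0, 1.0]   # feature of a second, different image
aligned = contrastive_loss(z_i, z_j, [z_k])    # copy treated as the positive
confused = contrastive_loss(z_i, z_k, [z_j])   # wrong image treated as the positive
```

The loss is small when the copy is closer to the original than the other images are, and large otherwise, which is the behaviour the target matrix of FIG. 1B asks for.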

Referring to FIG. 1B, which illustrates the optimization objective, the pairwise similarities between the images to be recognized should match the matrix in FIG. 1B whose diagonal entries are 1. This unsupervised pre-training approach can be understood as automatically creating a classification task by means of image augmentation, the goal of which is to place an image and its copy in one class and all other images in another class. In FIG. 1B, the pictures on the left are the copies and the pictures along the top are the original images. The values 1 and 0 are the training targets: 1 means predicted to be the same class, and 0 means not the same class.

In an optional solution of the embodiment of the present invention, performing unsupervised pre-training on the image to be recognized to determine the feature representation of the image to be recognized, and classifying the image to be recognized, further includes:

performing unsupervised pre-training on the classified images to be recognized, and re-determining the feature representations of the images to be recognized, where the classified images to be recognized include one class formed by the first image to be recognized and the copy of the first image to be recognized, and another class formed by the second image to be recognized;

re-determining the categories of the images to be recognized through a clustering algorithm and the feature representations of the images to be recognized.

Unsupervised pre-training usually has to cope with fairly strong noise, because the algorithm in essence treats an image and its augmented copy as one class and treats every other image as a different class. This is clearly not always correct (for example, two images of the same kind may appear in the same training batch). The embodiment of the present invention therefore proposes a denoising algorithm that mines the relationships between samples and thereby improves the training effect.

The core of the denoising algorithm is that, before each training batch begins, the feature representation of every sample in the training set is computed, and an offline clustering algorithm divides the training data set into several parts. During unsupervised pre-training, two samples that fall in the same part are treated as belonging to the same class.

FIG. 2B is an abstract schematic diagram of the unsupervised pre-training denoising process provided by an embodiment of the present invention, in which samples inside the same circle are assigned to the same class. In FIG. 2B the feature representations of the images to be recognized are abstracted to two dimensions. The figure shows that images whose feature representations are close to one another are assigned to the same class. Combined with this process, the pseudocode for unsupervised training consists of steps A1 to A3:

A1: Train the unsupervised model on the data set whose categories have already been assigned. This data set includes the first image to be recognized, the copy of the first image to be recognized, and the second image to be recognized, where the first image to be recognized and its copy form one class and the second image to be recognized forms another class;

A2: When the epoch ends, recompute the feature representation of every image to be recognized in the training set. One epoch means training once on all samples in the training set, and the number of epochs is the number of times the whole training data set is reused;

A3: Obtain the category of each image to be recognized from the K-means algorithm and the feature representations of the images to be recognized.
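Steps A2 and A3 can be sketched as below. This is a simplified illustration: `encode` is a hypothetical placeholder for the pre-trained ResNet (here the inputs are already feature vectors), the K-means routine is a plain textbook version with a deterministic, evenly spaced initialisation chosen for reproducibility, and the number of clusters is an assumed parameter. Step A1, the training epoch itself, is omitted.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain K-means; returns one cluster index per point."""
    # Deterministic initialisation: pick k evenly spaced points as centres.
    centers = [list(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def encode(x):
    """Hypothetical stand-in for Resnet(x): the input is already a feature."""
    return list(x)


def refresh_clusters(dataset, k):
    features = [encode(x) for x in dataset]   # A2: recompute all features
    return kmeans(features, k)                # A3: cluster them into classes


dataset = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
           [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
labels = refresh_clusters(dataset, k=2)
```

The resulting `labels` would then serve as the class assignment for the next training epoch (step A1).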

At the same time, the contrastive learning loss is modified to:

where M denotes the samples in the current training batch that belong to the same class as z_i. The unsupervised pre-training denoising process provided by the embodiment of the present invention thus expands the positive examples associated with an image (for example, to samples of the same class).
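The modified formula is likewise not reproduced in this text. One plausible reading, shown here purely as an assumption, averages the contrastive term over all M samples that share a cluster with the anchor, in the manner of a multi-positive contrastive loss. The cosine similarity, the temperature, and the function names are illustrative choices rather than the patent's definition.

```python
import math


def cosine_sim(a, b):
    """Cosine similarity between two feature vectors (assumed form of sim)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def cluster_contrastive_loss(features, labels, i, temperature=0.5):
    """Average, over the M positives of anchor i, of the contrastive term."""
    others = [k for k in range(len(features)) if k != i]
    positives = [k for k in others if labels[k] == labels[i]]
    if not positives:
        return 0.0  # no same-class sample in this batch
    denom = sum(math.exp(cosine_sim(features[i], features[k]) / temperature)
                for k in others)
    total = 0.0
    for p in positives:
        num = math.exp(cosine_sim(features[i], features[p]) / temperature)
        total += -math.log(num / denom)
    return total / len(positives)


features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
good = cluster_contrastive_loss(features, [0, 0, 1, 1], i=0)  # clusters match geometry
bad = cluster_contrastive_loss(features, [0, 1, 0, 1], i=0)   # clusters contradict it
```

With cluster labels that agree with the feature geometry the loss is lower, so training pulls same-cluster samples together instead of treating them as negatives.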

S230: Perform supervised training on the image to be recognized according to the feature representation of the image to be recognized, and identify and determine a target image.

Optionally, performing supervised training on the image to be recognized according to the feature representation of the image to be recognized, and identifying and determining the target image, includes:

performing supervised training on the image to be recognized according to the feature representation of the image to be recognized, and determining a category score of the image to be recognized;

recognizing the image to be recognized according to its category score and its label value, and determining the target image, where the images to be recognized include the target image.

The unsupervised pre-training task described above yields a deep residual network (ResNet). Through unsupervised pre-training this network has already learned a considerable amount of semantic information, so only a small number of samples need to be labeled to obtain good results.

The supervised training process can be expressed as:

z_i = Resnet(x)

logits_i = MLP(z_i)

Loss_sup = CE(logits_i, y_i)

Here, MLP is the fully connected layer used for classification, which links the feature representation output by the ResNet to the final category; CE denotes cross entropy, which is the final training loss; y_i is the label value of sample x (0 means it is not a target image, and 1 means it is a target image); and logits_i denotes the category scores. Both the MLP and the ResNet are updated during training. The ResNet outputs the feature representation of the image to be recognized, and a fully connected layer is needed to map that representation to a specific category. During training, the fully connected layer takes the ResNet output and directly outputs the probability of each image category, thereby linking the feature representation output by the ResNet to the final category.
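The forward pass and loss of the supervised stage can be sketched as follows. The feature vector stands in for Resnet(x), the classification head is reduced to a single fully connected layer, and the weight values are made up for the example; none of these numbers come from the source. Updating the weights (and the ResNet) by gradient descent is omitted.

```python
import math


def linear(z, weights, bias):
    """Fully connected layer: one row of weights and one bias per class."""
    return [sum(w * v for w, v in zip(row, z)) + b
            for row, b in zip(weights, bias)]


def cross_entropy(logits, y):
    """Softmax cross-entropy for integer label y, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[y]


z_i = [0.2, 0.8]                        # stand-in for Resnet(x)
weights = [[1.0, -1.0], [-1.0, 1.0]]    # class 0 = not target, class 1 = target
bias = [0.0, 0.0]

logits_i = linear(z_i, weights, bias)   # category scores
loss_target = cross_entropy(logits_i, 1)   # label y_i = 1 (target image)
loss_other = cross_entropy(logits_i, 0)    # label y_i = 0 (not a target image)
```

Because the scores here favour class 1, the loss is lower when the label is 1 than when it is 0, which is the signal used to fine-tune the network on the few labeled samples.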

An embodiment of the present invention provides an image recognition processing method: an image to be recognized is acquired and preprocessed; unsupervised pre-training is performed on the image to be recognized to determine its feature representation; and supervised training is performed on the image according to that feature representation to recognize and determine the target image. With this technical solution, the denoising step of unsupervised pre-training expands the positive examples corresponding to each image, and the unsupervised pre-training learns considerable semantic information, so only a small number of samples need to be labeled to obtain good results; supervised training is then used to recognize the image to be recognized and determine the target image. Introducing the unsupervised pre-training process and using contrastive learning reduces the dependence on annotations and strengthens the representation of samples, so that images can be recognized effectively with fewer samples, which improves image recognition efficiency.

Embodiment 3

FIG. 3 is a schematic structural diagram of an image recognition processing apparatus according to Embodiment 3 of the present invention. The apparatus includes an image acquisition module 310, an image classification module 320 and an image recognition module 330, wherein:

the image acquisition module 310 is configured to acquire an image to be recognized and preprocess the image to be recognized;

the image classification module 320 is configured to perform unsupervised pre-training on the image to be recognized to determine the feature representation of the image to be recognized, and to classify the image to be recognized;

the image recognition module 330 is configured to perform supervised training on the image to be recognized according to the feature representation of the image to be recognized, and to recognize and determine the target image.
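The way the three modules pass data to one another can be sketched as follows. This is only an illustrative outline under stated assumptions: the class name RecognitionApparatus, the type aliases, and the three toy functions (pixel scaling, mean brightness, thresholding) are hypothetical placeholders and do not represent the preprocessing, pre-training or supervised training of the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[float]     # hypothetical: an image as a flat list of pixel values
Features = List[float]  # hypothetical: a feature representation

@dataclass
class RecognitionApparatus:
    acquire: Callable[[List[Image]], List[Image]]       # image acquisition module 310
    represent: Callable[[List[Image]], List[Features]]  # image classification module 320
    recognize: Callable[[List[Features]], List[int]]    # image recognition module 330

    def run(self, raw_images: List[Image]) -> List[int]:
        images = self.acquire(raw_images)     # acquire and preprocess
        features = self.represent(images)     # determine feature representations
        return self.recognize(features)       # 1 = target image, 0 = not a target image

# Toy stand-ins, for illustration only.
def scale_pixels(raw: List[Image]) -> List[Image]:
    return [[p / 255.0 for p in img] for img in raw]

def mean_brightness(images: List[Image]) -> List[Features]:
    return [[sum(img) / len(img)] for img in images]

def threshold(features: List[Features]) -> List[int]:
    return [1 if f[0] > 0.5 else 0 for f in features]

apparatus = RecognitionApparatus(scale_pixels, mean_brightness, threshold)
labels = apparatus.run([[0, 0, 255], [255, 255, 255]])
```

Keeping each module behind a plain callable means any one stage can be swapped out without changing the other two.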

On the basis of the above embodiment, optionally, the image acquisition module includes:

On the basis of the above embodiment, optionally, the image classification module includes:

On the basis of the above embodiment, optionally, the image classification module further includes:

wherein Z_i and Z_j respectively denote the feature representation of the first image to be recognized and the feature representation of the copy of the first image to be recognized; sim denotes the similarity between the feature representations of the first image to be recognized and of its copy; and Z_k denotes the feature representation of the second image to be recognized.
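The loss formula to which these symbols belong is not reproduced in this text. For orientation only, the following sketch shows one common contrastive loss that is consistent with the symbols described (an InfoNCE-style loss). The use of cosine similarity for sim, the temperature parameter, and the exact form of the denominator are assumptions and may differ from the formula of the embodiment.

```python
import math

def cosine_sim(a, b):
    """Assumed form of sim: cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def contrastive_loss(z_i, z_j, negatives, temperature=0.5):
    """InfoNCE-style loss: pulls z_i toward its copy z_j and away from each z_k.

    The temperature value is a hypothetical hyperparameter, not taken from the source.
    """
    pos = math.exp(cosine_sim(z_i, z_j) / temperature)
    neg = sum(math.exp(cosine_sim(z_i, z_k) / temperature) for z_k in negatives)
    return -math.log(pos / (pos + neg))

# Hypothetical 2-dimensional feature representations.
z_i = [1.0, 0.0]                      # first image to be recognized
z_j = [0.9, 0.1]                      # copy (augmented view) of the first image
z_ks = [[0.0, 1.0], [-1.0, 0.0]]      # second images to be recognized (negatives)

loss_similar_copy = contrastive_loss(z_i, z_j, z_ks)
loss_dissimilar_copy = contrastive_loss(z_i, [0.0, 1.0], z_ks)
```

Under these assumptions the loss is smaller when the copy's representation is close to that of the original image, which is the behavior contrastive pre-training relies on.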

On the basis of the above embodiment, optionally, the image recognition module includes:

The above apparatus can execute the image recognition processing method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to executing that method.

Embodiment 4

FIG. 4 is a schematic structural diagram of an electronic device provided in Embodiment 4 of the present application. The embodiment of the present application provides an electronic device in which the interactive apparatus for image recognition processing provided by the embodiments of the present application can be integrated. As shown in FIG. 4, this embodiment provides an electronic device 400, which includes: one or more processors 420; and a storage device 410 for storing one or more programs which, when executed by the one or more processors 420, cause the one or more processors 420 to implement the image recognition processing method provided by the embodiments of the present application, the method including:

Of course, those skilled in the art will understand that the processor 420 can also implement the technical solution of the image recognition processing method provided by any embodiment of the present application.

The electronic device 400 shown in FIG. 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.

As shown in FIG. 4, the electronic device 400 includes a processor 420, a storage device 410, an input device 430 and an output device 440. The number of processors 420 in the electronic device may be one or more; one processor 420 is taken as an example in FIG. 4. The processor 420, the storage device 410, the input device 430 and the output device 440 in the electronic device may be connected by a bus or in other ways; connection by a bus 450 is taken as an example in FIG. 4.

As a computer-readable storage medium, the storage device 410 may be used to store software programs, computer-executable programs and module units, such as the program instructions corresponding to the image recognition processing method in the embodiments of the present application.

The storage device 410 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal, and the like. In addition, the storage device 410 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the storage device 410 may further include memory located remotely from the processor 420, and such remote memory may be connected through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

The input device 430 may be used to receive input numeric, character or voice information, and to generate key signal inputs related to user settings and function control of the electronic device. The output device 440 may include electronic devices such as a display screen and a speaker.

The electronic device provided by the embodiment of the present application can achieve the technical effects of effectively solving the problem of image recognition processing, reducing the recognition system's dependence on labeled training samples, and improving image recognition efficiency.

Embodiment 5

Embodiment 5 of the present invention further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to execute an image recognition processing method, the method including:

The computer storage medium in the embodiments of the present invention may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above. A computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device.

A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), and the like, or any suitable combination of the above.

Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to those embodiments and may include other equivalent embodiments without departing from the concept of the present invention; the scope of the present invention is determined by the scope of the appended claims.

Claims

1. An image recognition processing method, characterized in that the method comprises:

2. The method according to claim 1, characterized in that the acquiring an image to be recognized and preprocessing the image to be recognized comprises:

3. The method according to claim 1, characterized in that the performing unsupervised pre-training on the image to be recognized to determine the feature representation of the image to be recognized, and classifying the image to be recognized, comprises:

4. The method according to claim 1, characterized in that the performing unsupervised pre-training on the image to be recognized to determine the feature representation of the image to be recognized, and classifying the image to be recognized, further comprises:

5. The method according to claim 1, characterized in that the performing unsupervised pre-training on the image to be recognized to determine the feature representation of the image to be recognized, and classifying the image to be recognized, further comprises:

6. The method according to claim 1, characterized in that the performing supervised training on the image to be recognized according to the feature representation of the image to be recognized, and recognizing and determining the target image, comprises:

7. An image recognition processing apparatus, characterized in that the apparatus comprises:

8. The apparatus according to claim 7, characterized in that the image acquisition module comprises:

preprocessing the image to be recognized, wherein the preprocessing includes image size and format adjustment, denoising, matting, and removal of background effects.

9. An electronic device, characterized by comprising:

one or more processors;

when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the image recognition processing method according to any one of claims 1-6.

10. A storage medium containing computer-executable instructions, characterized in that the computer-executable instructions, when executed by a computer processor, are used to execute the image recognition processing method according to any one of claims 1-6.