CN113486204A - Picture marking method, device, medium and equipment - Google Patents

Picture marking method, device, medium and equipment

Info

Publication number
CN113486204A
Authority
CN
China
Prior art keywords
labeling
picture
task
description information
annotation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110715697.0A
Other languages
Chinese (zh)
Inventor
雷晨雨
刘胜坤
韩茂琨
刘玉宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202110715697.0A
Publication of CN113486204A
Legal status: Pending


Abstract

(Translated from Chinese)


The invention relates to the field of artificial-intelligence image detection and provides a picture labeling method, device, medium, and equipment. The method includes: obtaining the picture to be labeled corresponding to a labeling task; obtaining description information of the labeling task; comparing the description information of the labeling task with the description information of preset labeling models to obtain the similarity between the two; determining, according to that similarity, the labeling model corresponding to the labeling task from a database storing a plurality of preset labeling models; and labeling the picture to be labeled with the determined labeling model to obtain a labeling result. The invention realizes automatic labeling of pictures through the labeling model and effectively improves the efficiency of picture labeling.


Description

Picture marking method, device, medium and equipment
Technical Field
The invention relates to the field of artificial-intelligence image detection, and in particular to a picture marking method, apparatus, medium, and equipment.
Background
In the technical field of computer vision based on deep learning, a large number of labeled pictures are needed to train a model before it can serve a deep-learning algorithm. Picture marking is a technique for adding information to a picture; text, indication arrows, dimensions, and other information can be added. Labeling a large amount of data requires a large investment of manpower and material resources, so the workload of data labeling is heavy, its efficiency is low, and the labeling cost is high.
Disclosure of Invention
The invention provides a picture marking method, a picture marking device, a picture marking medium and picture marking equipment, and mainly aims to realize automatic marking of pictures through a marking model and effectively improve the picture marking efficiency.
In order to achieve the above object, the present invention provides a method for labeling a picture, the method comprising:
acquiring a picture to be labeled corresponding to the labeling task;
acquiring description information of the labeling task;
comparing the description information of the labeling task with the description information of a plurality of preset labeling models stored in a database to obtain the similarity between the description information of the labeling task and the description information of the preset labeling models;
determining a labeling model corresponding to the labeling task from a database in which a plurality of preset labeling models are stored according to the similarity between the description information of the labeling task and the description information of the preset labeling model;
and marking the picture to be marked by using the marking model corresponding to the marking task to obtain a marking result.
Optionally, the method further comprises: and storing the labeling result.
Optionally, the storing the annotation result includes: and storing the labeling result by utilizing a label group associated with the labeling task, wherein the label group comprises at least one type of tree label structure, and the tree label structure is used for storing the labeling result.
Optionally, in the process of labeling the picture, the labeling time and the time points of opening and closing the picture are recorded.
Optionally, the annotation model comprises: a yolov5-based target detection model, a resnet-based classification model, a Mask R-CNN-based instance segmentation model, and a densenet-based model.
Optionally, labeling the picture to be labeled by using the yolov5-based target detection model includes:
acquiring a picture to be marked;
detecting the picture to be marked by utilizing a yolov5 algorithm to obtain a target to be marked;
labeling the target to be labeled to obtain a labeling result;
and storing the labeling result into a label group.
Optionally, the method further comprises: rechecking the labeling result of the labeled picture.
In addition, to achieve the above object, the present invention further provides a picture labeling apparatus, including:
the image acquisition module is used for acquiring the image to be labeled corresponding to the labeling task;
the comparison module is used for acquiring the description information of the labeling task and comparing the description information of the labeling task with the description information of a plurality of preset labeling models stored in a database to obtain the similarity between the description information of the labeling task and the description information of the preset labeling models;
the annotation model determining module is used for determining an annotation model corresponding to the annotation task from a database in which a plurality of preset annotation models are stored according to the similarity between the description information of the annotation task and the description information of the preset annotation model;
and the marking module is used for marking the picture to be marked by using the marking model corresponding to the marking task to obtain a marking result.
In addition, to achieve the above object, the present invention further provides a picture labeling device, comprising a processor coupled to a memory, the memory storing program instructions which, when executed by the processor, implement the method.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium including a program, which, when run on a computer, causes the computer to execute the picture labeling method.
As described above, the method, the apparatus, the medium, and the device for labeling a picture provided by the present invention have the following advantages: the method comprises the steps of obtaining a picture to be marked corresponding to a marking task; acquiring description information of the labeling task; comparing the description information of the labeling task with the description information of a preset labeling model to obtain the similarity between the description information of the labeling task and the description information of the preset labeling model; determining a labeling model corresponding to the labeling task from a database in which a plurality of preset labeling models are stored according to the similarity between the description information of the labeling task and the description information of the preset labeling model; and marking the picture to be marked by using the marking model corresponding to the marking task to obtain a marking result. According to the invention, the automatic labeling of the picture is realized through the labeling model, and the efficiency of picture labeling is effectively improved.
Drawings
FIG. 1 is a flowchart illustrating a method for tagging a picture according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating automatic labeling of pictures according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a tree tag structure according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a picture labeling apparatus according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
As shown in fig. 1, an embodiment of the present application provides a method for annotating a picture, where the method can be performed by an annotation device, and the method includes:
s11, acquiring a picture to be annotated corresponding to the annotation task;
the picture marking means to label a picture, and the label printed on a picture is usually the characteristics or attributes of the picture, which can be used to know the picture. For example, a target label may be labeled for a target in a picture, for example, the target in the picture may be a person, a building, a tree, etc., and then the label of the labeled person, the label of the building, and the label of the tree. For example, a picture of a human face may be labeled with a gender label, an age label, and a skin color label, and these labeled pictures may be used to train a machine learning model of a corresponding type as a labeling model. For example, a picture labeled with a face label can be used to train a machine learning model for face detection, and a picture labeled with an age label can be used to train a machine learning model for identifying age from a face.
The pictures to be labeled can be pictures collected by a camera, pictures received from other terminals, or pictures downloaded over a network. The picture to be annotated can also be obtained by receiving a selection instruction from the user and taking the picture the user selected as the picture to be annotated. In an embodiment, obtaining the picture to be annotated includes: receiving a picture selection instruction input by a user and determining the picture selected by the user as the picture to be annotated. That is, the user can select any picture as the picture to be labeled; the picture is then labeled, and the labeling result for it is finally obtained.
The annotation task can be created by a terminal, a server, or other equipment apart from the annotation device and then sent to the annotation device, which executes the corresponding annotation method; of course, the annotation task can also be created by the annotation device itself. The creation of an annotation task can be implemented by means of a flowchart: for example, a labeling-task editing page is opened in the page editing area of the terminal, a flowchart representing the labeling task is created in that page, and the flowchart is sent to the labeling device through a task sending button. The labeling task may specify the objects to be labeled in the picture, the labeling attributes (for example, a face label, a gender label, or an age label), the labeling mode (a rectangular box or a round box), and the like.
After receiving the labeling task uploaded by the terminal, the labeling device acquires the corresponding picture to be labeled from a picture repository, which can be located on a terminal or a server. Storing the pictures together with an index table provides efficient storage and query capability.
After receiving the corresponding labeling task, the labeling personnel call the corresponding pictures to be labeled from the picture repository to complete the labeling. The pictures to be labeled can be imported into the terminal or the server in advance and, before import, can be preprocessed according to the labeling task.
For example, if the labeling task includes labeling the skin color of a face in the picture, the picture needs relatively high definition; if the resolution of the picture to be labeled acquired from the picture repository does not meet this requirement, the resolution is increased to meet the subsequent labeling requirement. If the labeling task only involves labeling a certain target in the picture, a very high-resolution picture is not needed; if the resolution of the picture acquired from the picture repository is too high, the picture is compressed to reduce its resolution, which shortens the transmission time of the picture and improves labeling efficiency.
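As an illustration of this task-dependent preprocessing, the following is a minimal sketch using Pillow; the thresholds and task names are hypothetical, since the description gives no concrete values:

```python
from PIL import Image

# Hypothetical thresholds; the patent specifies no concrete resolution values.
MIN_SIDE_FOR_SKIN_COLOR = 512
MAX_SIDE_FOR_TARGET_DETECTION = 1024

def preprocess_for_task(path: str, task: str) -> Image.Image:
    """Upscale low-resolution pictures for fine-grained tasks and downscale
    oversized pictures for coarse tasks to shorten transmission time."""
    img = Image.open(path)
    w, h = img.size
    if task == "skin_color" and min(w, h) < MIN_SIDE_FOR_SKIN_COLOR:
        scale = MIN_SIDE_FOR_SKIN_COLOR / min(w, h)
        img = img.resize((round(w * scale), round(h * scale)), Image.LANCZOS)
    elif task == "target_detection" and max(w, h) > MAX_SIDE_FOR_TARGET_DETECTION:
        scale = MAX_SIDE_FOR_TARGET_DETECTION / max(w, h)
        img = img.resize((round(w * scale), round(h * scale)), Image.LANCZOS)
    return img
```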
S12, the description information of the labeling task is obtained, and the description information of the labeling task is compared with the description information of a plurality of preset labeling models stored in a database, so that the similarity between the description information of the labeling task and the description information of the preset labeling models is obtained.
The description information may be, for example, the gender of a person in the picture.
It can be understood that, a plurality of annotation models are pre-stored in the database, and each annotation model has corresponding description information, for example, the description information of some annotation models is: marking the gender of the figure in the picture; some description information of the labeling model is as follows: and labeling the face in the picture, and the like. In the process of determining the annotation model corresponding to the annotation task, the description information of the annotation task may be sequentially compared with the description information of a plurality of preset annotation models stored in the database, and the similarity between the task description information of the annotation task and the description information of each annotation model may be calculated.
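As a sketch of this matching process: the patent does not name a similarity measure, so the standard-library SequenceMatcher and the model registry below are illustrative assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical registry mapping each preset model's description to a model id.
PRESET_MODELS = {
    "label the gender of the person in the picture": "resnet_classifier",
    "label the face in the picture": "yolov5_detector",
    "segment the object instances in the picture": "mask_rcnn_segmenter",
}

def select_labeling_model(task_description: str) -> str:
    """Compare the task description with every preset model description and
    return the model whose description is most similar."""
    def similarity(a: str, b: str) -> float:
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    best = max(PRESET_MODELS, key=lambda desc: similarity(task_description, desc))
    return PRESET_MODELS[best]

print(select_labeling_model("label the face in this picture"))  # yolov5_detector
```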
The plurality of preset annotation models may include: a yolov5-based target detection model, a resnet-based classification model, a Mask R-CNN-based instance segmentation model, and a densenet-based model. Each labeling model is used to complete the labeling of the pictures corresponding to its labeling tasks so as to generate labeling results.
It is to be understood that any two of the plurality of annotation models can be similar or widely different. For example, both may be models for annotating pictures, or one may be a model for annotating pictures and the other a model for annotating text.
Picture labeling can also use trained target detection models, classification models, segmentation models, and the like to obtain candidate targets, which are then labeled manually, realizing semi-automatic labeling. Here, a neural-network algorithm or the like may be used to build the labeling model from training samples, for example the target detection algorithm yolov5, the object classification algorithm families resnet and densenet, or the instance segmentation algorithm Mask R-CNN.
Of course, automatic annotation can be realized in addition to the annotation of the pictures by the method. The method comprises the steps of obtaining a training sample in advance, enabling the training sample to comprise at least one sample picture and corresponding picture labels, enabling the at least one sample picture to serve as an input variable of a labeling model, enabling the corresponding picture labels to serve as output variables of the labeling model, and establishing the labeling model based on the training sample. When the pictures are labeled, the pictures to be labeled are input into the labeling model, and then corresponding labeling results are output, so that automatic labeling is realized.
In this embodiment, the process of labeling pictures is described with a target detection model based on yolov5 as a labeling model.
As shown in fig. 2, labeling the picture to be labeled by using the yolov5-based target detection model includes:
s21, acquiring a picture to be annotated;
s22, detecting the picture to be labeled by using yolov5 algorithm to obtain a target to be labeled;
s23, labeling the target to be labeled to obtain a labeling result;
s24, storing the labeling result into a label group.
In one embodiment, the training step of the yolov5-based target detection model may include:
Step 1, obtaining a plurality of pictures to form a sample set, and dividing the sample set proportionally into a training set and a validation set. The training set and the validation set are randomly divided from the obtained pictures; a typical ratio of training set to validation set is 7:3.
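As a quick illustration of the 7:3 division, a random split might look like this (file names are placeholders):

```python
import random

def split_sample_set(pictures, train_ratio=0.7):
    """Randomly divide the sample set into a training set and a validation set."""
    shuffled = list(pictures)
    random.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

training_set, validation_set = split_sample_set(f"img_{i}.jpg" for i in range(100))
```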
Step 2, marking the pictures in the training set by using a marking tool, and setting a marking label for each target in the pictures;
and 3, sending the marked picture to an input end of a target detection model based on yolov 5. Since all pictures may be different in size, and the yolov5 model requires a unified picture to generate a feature layer, the picture needs to be preprocessed, i.e., adaptively scaled. The picture is first enlarged or reduced according to the input size required by yolov5, and then the black lines added for the shorter sides are squared to meet the input specification of 608 pixels by 608 pixels.
If the pictures in the training set are insufficient, data enhancement needs to be carried out on the training set. Specifically, the data can be enhanced in a random scaling, random clipping, random arrangement and other ways to achieve the goal of enriching the data set.
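A sketch of the adaptive scaling described in step 3, assuming OpenCV and NumPy; the picture is scaled to fit and the shorter side is padded with black to a square 608 x 608 input:

```python
import cv2
import numpy as np

def adaptive_scale(img: np.ndarray, size: int = 608) -> np.ndarray:
    """Enlarge or reduce the picture to the required input size, then pad the
    shorter side with black borders so the result is size x size."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    resized = cv2.resize(img, (round(w * scale), round(h * scale)))
    canvas = np.zeros((size, size, 3), dtype=img.dtype)   # black background
    top = (size - resized.shape[0]) // 2
    left = (size - resized.shape[1]) // 2
    canvas[top:top + resized.shape[0], left:left + resized.shape[1]] = resized
    return canvas
```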
The YOLOv5 network model is composed of three main components:
Backbone: a convolutional neural network that aggregates and forms image features at different image granularities.
Neck: a series of network layers that mix and combine image features and pass them to the prediction layer.
Head: predicts on the image features, generates bounding boxes, and predicts categories.
The Backbone and the Neck mainly extract image features, i.e., the features of the various targets in the input picture's prediction boxes, while the Head performs detection on those features and predicts categories.
The preprocessed pictures are input into the Backbone to generate feature layers of different sizes. These feature layers are input into the Neck, which after a series of processing generates new feature layers; the new feature layers are passed to the output end, where the Head makes predictions.
From the newly generated feature layers, the Head obtains a series of rectangular boxes (x, y, w, h) and confidences c. Here x and y are the coordinates of the rectangular box in the image coordinate system, w and h are the width and height of the rectangle, and the confidence c indicates the degree of certainty that an object exists within the rectangular box and that the box covers all of the object's features.
Repeated rectangular boxes are screened out by non-maximum suppression: the boxes are first sorted by confidence score, the box with the highest confidence is added to the final output list and removed from the candidate list, the areas of all boxes are calculated, and the intersection-over-union IoU (the ratio of the intersection area of two boxes to their union area, representing how much the boxes overlap) between the highest-confidence box and each remaining candidate is computed. Candidates whose IoU exceeds a threshold are deleted, and the process repeats until the candidate list is empty. The remaining rectangular boxes are the prediction boxes. Each prediction box is compared with the previously labeled box (the real box), and the loss is calculated with the GIoU loss function, which maps the difference between the prediction box and the real box; the weights can be continually adjusted through this loss function to reduce the difference. The loss is then back-propagated to adjust the weights of yolov5.
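The non-maximum suppression procedure just described, sketched in plain Python (the corner-coordinate box format and the 0.5 threshold are illustrative assumptions):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def non_max_suppression(boxes, scores, threshold=0.5):
    """Keep the highest-confidence box, delete candidates whose IoU with it
    exceeds the threshold, and repeat until no candidates remain."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return keep
```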
GIoU = IoU - (Area(C) - Area(A ∪ B)) / Area(C), and the GIoU loss is L_GIoU = 1 - GIoU,
where A is the labeled box, B is the prediction box, and C is the minimum circumscribed rectangle of A and B, i.e., the smallest box that simultaneously contains the prediction box and the real box.
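A direct transcription of the formula above, with boxes as (x1, y1, x2, y2) corner coordinates (an assumed convention):

```python
def giou_loss(pred, target):
    """GIoU loss: 1 - GIoU, where GIoU = IoU - (Area(C) - Area(A U B)) / Area(C)
    and C is the minimum circumscribed rectangle of the two boxes."""
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_target = (target[2] - target[0]) * (target[3] - target[1])
    union = area_pred + area_target - inter
    cx1, cy1 = min(pred[0], target[0]), min(pred[1], target[1])
    cx2, cy2 = max(pred[2], target[2]), max(pred[3], target[3])
    area_c = (cx2 - cx1) * (cy2 - cy1)
    giou = inter / union - (area_c - union) / area_c
    return 1.0 - giou
```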
Step 4, repeating the above process so that yolov5 gradually converges, and continually adjusting parameters through tests on the validation set so that yolov5 gains generalization ability and improved precision.
The method uses the yolov5 model to detect targets, directly giving class probabilities and object positions; it has a high recognition speed and can process large batches of pictures. With sufficient training, yolov5 can also achieve high accuracy.
S13, determining a labeling model corresponding to the labeling task from a database in which a plurality of preset labeling models are stored according to the similarity between the description information of the labeling task and the description information of the preset labeling models;
specifically, the maximum similarity is determined from the plurality of similarity values calculated in step S12, and then the annotation model corresponding to the maximum similarity is used as the annotation model corresponding to the annotation task. Therefore, automatic matching of the labeling model and the labeling task is realized.
And S14, labeling the picture to be labeled by using the labeling model corresponding to the labeling task to obtain a labeling result.
The method comprises the steps of obtaining a picture to be marked corresponding to a marking task; acquiring description information of the labeling task; comparing the description information of the labeling task with the description information of a preset labeling model to obtain the similarity between the description information of the labeling task and the description information of the preset labeling model; determining a labeling model corresponding to the labeling task from a database in which a plurality of preset labeling models are stored according to the similarity between the description information of the labeling task and the description information of the preset labeling model; and marking the picture to be marked by using the marking model corresponding to the marking task to obtain a marking result. According to the invention, the automatic labeling of the picture is realized through the labeling model, and the efficiency of picture labeling is effectively improved.
In an embodiment, the method further comprises: and storing the labeling result.
Specifically, the storing the labeling result includes: and storing the labeling result by utilizing a label group associated with the labeling task, wherein the label group comprises at least one type of tree label structure, and the tree label structure is used for storing the labeling result.
FIG. 3 shows a label tree; as shown in FIG. 3, each node of the tree label structure is a type of label. The label tree structure contains multi-level labels, i.e., labels that can be subdivided: for example, a mammal label can be further divided into human, cat, and cattle labels; a bird label can be divided into crow and eagle labels; and a tree label can be divided into oak and beech labels.
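One possible in-memory form of such a tree label structure; the patent prescribes no concrete data model, so this class is only a sketch:

```python
class LabelNode:
    """A node of the tree label structure; each node is one type of label."""
    def __init__(self, name):
        self.name = name
        self.children = []
        self.results = []       # labeling results stored under this label

    def add_child(self, name):
        child = LabelNode(name)
        self.children.append(child)
        return child

# Multi-level labels from the example above.
root = LabelNode("labels")
mammal = root.add_child("mammal")
for sub in ("human", "cat", "cattle"):
    mammal.add_child(sub)
bird = root.add_child("bird")
bird.add_child("crow")
bird.add_child("eagle")
```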
In this embodiment, since the annotation system supports multi-level tags, processing of finer-grained annotation tasks can be achieved.
In one embodiment, during the labeling of a picture, the labeling time and the time points at which the picture is opened and closed are recorded. The labeling device records the labeling time point of each annotator and the opening and closing times of each picture, and the annotators' work efficiency is computed from these statistics. Recording these time points makes the annotators' work observable, has a supervisory effect, and helps labeling tasks be completed more efficiently: for example, a short interval between opening and closing a picture suggests high working efficiency, while a long interval suggests low working efficiency.
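A sketch of how the labeling device might record these time points (the class and method names are hypothetical):

```python
import time

class AnnotationTimer:
    """Record when an annotator opens and closes each picture."""
    def __init__(self):
        self.events = []        # (picture_id, event, unix_timestamp)

    def open_picture(self, picture_id):
        self.events.append((picture_id, "open", time.time()))

    def close_picture(self, picture_id):
        self.events.append((picture_id, "close", time.time()))

    def time_spent(self, picture_id):
        """Total seconds between matching open/close events for one picture."""
        opens = [t for p, e, t in self.events if p == picture_id and e == "open"]
        closes = [t for p, e, t in self.events if p == picture_id and e == "close"]
        return sum(c - o for o, c in zip(opens, closes))
```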
In an embodiment, the method further comprises: and auditing the labeling result to obtain an auditing result, and generating an auditing conclusion according to the auditing result.
The audit result may include which annotation results are erroneous, which are correct, and which pictures were not annotated. The audit conclusion is summary information derived from the audit result, for example, a judgment, based on the accuracy of the annotation results, of whether a given annotator has completed the annotation task properly.
As shown in fig. 4, an embodiment of the present application further provides an image annotation device, where the device includes an image obtaining module 41, a comparing module 42, an annotation model determining module 43, and an annotation module 44;
the image obtaining module 41 is configured to obtain an image to be annotated corresponding to an annotation task;
the comparison module 42 obtains description information of an annotation task, and compares the description information of the annotation task with description information of a plurality of preset annotation models stored in a database to obtain similarity between the description information of the annotation task and the description information of the preset annotation models;
the description information may be a gender of a person in the picture.
It can be understood that, a plurality of annotation models are pre-stored in the database, and each annotation model has corresponding description information, for example, the description information of some annotation models is: marking the gender of the figure in the picture; some description information of the labeling model is as follows: and labeling the face in the picture, and the like. In the process of determining the annotation model corresponding to the annotation task, the description information of the annotation task may be sequentially compared with the description information of a plurality of preset annotation models stored in the database, and the similarity between the task description information of the annotation task and the description information of each annotation model may be calculated.
The annotation model determining module 43 determines an annotation model corresponding to the annotation task from a database in which a plurality of preset annotation models are stored according to the similarity between the description information of the annotation task and the description information of the preset annotation model;
specifically, the maximum similarity is determined from the multiple similarity values obtained by the comparison module, and then the labeling model corresponding to the maximum similarity is used as the labeling model corresponding to the labeling task. Therefore, automatic matching of the labeling model and the labeling task is realized.
The labeling module 44 labels the to-be-labeled picture by using the labeling model corresponding to the labeling task to obtain a labeling result.
The method comprises the steps that a picture to be marked corresponding to a marking task is obtained through a picture obtaining module; obtaining description information of a labeling task by using a comparison module, and comparing the description information of the labeling task with description information of a preset labeling model to obtain similarity between the description information of the labeling task and the description information of the preset labeling model; then, determining a labeling model corresponding to the labeling task from a database storing a plurality of preset labeling models through a labeling model determining module according to the similarity between the description information of the labeling task and the description information of the preset labeling models; and finally, the labeling module labels the picture to be labeled by using a labeling model corresponding to the labeling task to obtain a labeling result. According to the invention, the automatic labeling of the picture is realized through the labeling model, and the efficiency of picture labeling is effectively improved.
The device provided in this embodiment can perform the steps, and has the beneficial effects, of the corresponding functional modules of any embodiment of the invention. For technical details not described in detail here, reference may be made to the picture labeling method provided in any embodiment of the present invention.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, devices and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
It should be noted that, from the above description of the embodiments, it is clear to those skilled in the art that part or all of the present application can be implemented by software in combination with a necessary general hardware platform. The functions, if implemented in the form of software functional units and sold or used as a separate product, may also be stored in a computer-readable storage medium. With this understanding, embodiments of the present invention provide a computer-readable storage medium including a program which, when run on a computer, causes the computer to perform the method shown in fig. 1.
An embodiment of the present invention provides a picture labeling apparatus, which includes a processor, and the processor is coupled to a memory, where the memory stores program instructions, and when the program instructions stored in the memory are executed by the processor, the method shown in fig. 1 is implemented.
With this understanding in mind, the technical solutions of the present application and/or portions thereof that contribute to the prior art may be embodied in the form of a software product that may include one or more machine-readable media having stored thereon machine-executable instructions that, when executed by one or more machines such as a computer, network of computers, or other electronic devices, may cause the one or more machines to perform operations in accordance with embodiments of the present application. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs (compact disc-read only memories), magneto-optical disks, ROMs (read only memories), RAMs (random access memories), EPROMs (erasable programmable read only memories), EEPROMs (electrically erasable programmable read only memories), magnetic or optical cards, flash memory, or other type of media/machine-readable medium suitable for storing machine-executable instructions. The storage medium may be located in a local server or a third-party server, such as a third-party cloud service platform. The specific cloud service platform is not limited herein, such as the Ali cloud, Tencent cloud, etc. The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: a personal computer, dedicated server computer, mainframe computer, etc. configured as a node in a distributed system.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, for example, the division of the units is only one logical functional division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims (10)

1. A picture labeling method is characterized by comprising the following steps:
acquiring a picture to be labeled corresponding to the labeling task;
acquiring description information of a labeling task, and comparing the description information of the labeling task with description information of a plurality of preset labeling models stored in a database to obtain similarity between the description information of the labeling task and the description information of the preset labeling models;
determining a labeling model corresponding to the labeling task from a database in which a plurality of preset labeling models are stored according to the similarity between the description information of the labeling task and the description information of the preset labeling model;
and marking the picture to be marked by using the marking model corresponding to the marking task to obtain a marking result.
2. The picture annotation method of claim 1, further comprising: and storing the labeling result.
3. The method for labeling pictures according to claim 2, wherein the storing the labeling result comprises: and storing the labeling result by utilizing a label group associated with the labeling task, wherein the label group comprises at least one type of tree label structure, and the tree label structure is used for storing the labeling result.
4. The method for annotating pictures according to claim 1, wherein in the process of annotating pictures, the annotation time and the time point of opening and closing the pictures are recorded.
5. The picture annotation method of claim 1, wherein the annotation model comprises: a yolov5-based target detection model, a resnet-based classification model, a Mask R-CNN-based instance segmentation model and a Densenet-based model.
6. The method for labeling pictures according to claim 5, wherein labeling the picture to be labeled by using the yolov5-based target detection model comprises the following steps:
acquiring a picture to be marked;
detecting the picture to be marked by utilizing a yolov5 algorithm to obtain a target to be marked;
labeling the target to be labeled to obtain a labeling result;
and storing the labeling result into a label group.
7. The picture annotation method of claim 1, further comprising: and rechecking the labeling result of the labeled picture to be labeled.
8. A picture labeling apparatus, comprising:
the image acquisition module is used for acquiring the image to be labeled corresponding to the labeling task;
the comparison module is used for acquiring the description information of the labeling task and comparing the description information of the labeling task with the description information of a plurality of preset labeling models stored in a database to obtain the similarity between the description information of the labeling task and the description information of the preset labeling models;
the annotation model determining module is used for determining an annotation model corresponding to the annotation task from a database in which a plurality of preset annotation models are stored according to the similarity between the description information of the annotation task and the description information of the preset annotation model;
and the marking module is used for marking the picture to be marked by using the marking model corresponding to the marking task to obtain a marking result.
9. A picture marking device comprising a processor coupled to a memory, the memory storing program instructions that, when executed by the processor, implement the method of any one of claims 1-7.
10. A computer-readable storage medium, characterized by comprising a program which, when run on a computer, causes the computer to perform the method of any one of claims 1-7.
CN202110715697.0A (filed 2021-06-25, priority 2021-06-25): Picture marking method, device, medium and equipment. Status: Pending. Publication: CN113486204A (en).

Priority Applications (1)

Application Number: CN202110715697.0A
Priority Date: 2021-06-25
Filing Date: 2021-06-25
Title: Picture marking method, device, medium and equipment

Applications Claiming Priority (1)

Application Number: CN202110715697.0A
Priority Date: 2021-06-25
Filing Date: 2021-06-25
Title: Picture marking method, device, medium and equipment

Publications (1)

Publication Number: CN113486204A
Publication Date: 2021-10-08

Family

ID: 77937214

Family Applications (1)

Application Number: CN202110715697.0A (Pending)
Publication: CN113486204A (en)
Priority Date: 2021-06-25
Filing Date: 2021-06-25

Country Status (1)

Country: CN
Link: CN113486204A (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CN106897361A (en)* | 2017-01-10 | 2017-06-27 | 中电科华云信息技术有限公司 | Shipping Options Page system for managing in groups and method based on tree
CN111695613A (en)* | 2020-05-28 | 2020-09-22 | 平安科技(深圳)有限公司 | Data annotation system, computer-readable storage medium, and electronic device

Cited By (2)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CN117115557A (en)* | 2023-09-18 | 2023-11-24 | 上海蜜度信息技术有限公司 | Automatic image labeling method, system, electronic equipment and storage medium
CN119580251A (en)* | 2024-10-28 | 2025-03-07 | 浪潮智慧科技有限公司 | A method, system, device and medium for image data annotation and quality detection

Similar Documents

CN111259889B (en) Image text recognition method, device, computer equipment and computer storage medium
Kadam et al. Detection and localization of multiple image splicing using MobileNet V1
CN112070135B (en) Power equipment image detection method and device, power equipment and storage medium
CN113762326B (en) A data identification method, device, equipment and readable storage medium
CN111626177B (en) A PCB component identification method and device
CN113761259A (en) Image processing method and device and computer equipment
CN113780116B (en) Invoice classification method, device, computer equipment and storage medium
CN113762303B (en) Image classification method, device, electronic equipment and storage medium
CN111986259A (en) Training method of character and face detection model, auditing method of video data and related device
CN112101346A (en) Verification code identification method and device based on target detection
CN113486204A (en) Picture marking method, device, medium and equipment
CN115631374A (en) Control operation method, control detection model training method, device and equipment
CN111124863A (en) Intelligent equipment performance testing method and device and intelligent equipment
CN118227773A (en) Question answering method and device based on multi-mode large model
CN111539390A (en) Small target image identification method, equipment and system based on Yolov3
CN113723093B (en) Personnel management policy recommendation method and device, computer equipment and storage medium
Mulyana et al. Optimization of Text Mining Detection of Tajweed Reading Laws Using the Yolov8 Method on the Qur'an
CN115565181A (en) Character recognition method, electronic device and computer-readable storage medium
CN119169430A (en) Method, system, device and medium based on multimodal product service application
CN119091200A (en) An AI computing training system
CN115035129A (en) Goods identification method and device, electronic equipment and storage medium
CN117351505A (en) Information code identification method, device, equipment and storage medium
CN117409430A (en) Medical bill information extraction method, device, equipment and storage medium thereof
Castillo et al. Object Detection in Digital Documents based on Machine Learning Algorithms
CN117009595A (en) Text paragraph acquisition method and device, storage medium and program product thereof

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication (application publication date: 2021-10-08)

