Disclosure of Invention
The application provides an image processing method to solve the problem that existing methods for identifying a target object in a video frame are complex. The application also provides an image processing apparatus, and an electronic device and a computer storage medium corresponding to the image processing apparatus.
The embodiment of the application provides an image processing method, which comprises the following steps:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
identifying a plurality of part areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain feature information of the target object;
associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and carrying out recognition processing on the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a recognition result for the target object.
Optionally, the carrying out recognition processing on the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a recognition result for the target object includes:
carrying out recognition processing on the plurality of sub-images corresponding to the target object respectively, based on the feature information of the target object, to obtain a plurality of recognition results;
and obtaining the recognition result for the target object according to the plurality of recognition results.
Optionally, the image to be processed refers to a video frame image;
the video frame image is obtained by:
sending a request for obtaining a video file to a video acquisition device;
and receiving the video file sent by the video acquisition device, and obtaining a video frame image according to the video file.
Optionally, the associating the plurality of sub-images corresponding to the same target object with the target object to obtain the plurality of sub-images corresponding to the target object includes:
obtaining a current target object;
determining a plurality of sub-images matched with the current target object based on the feature information of the target object contained in the sub-images;
and establishing an association relation between the plurality of sub-images matched with the current target object and the current target object to obtain a plurality of sub-images corresponding to the target object.
Optionally, the determining a plurality of sub-images matched with the current target object based on the feature information of the target object contained in the sub-images includes:
traversing each sub-image in the plurality of sub-images based on the feature information of the target object contained in the sub-image to obtain matching degree information of each sub-image with the current target object;
and determining a plurality of sub-images matched with the current target object according to the matching degree information of each sub-image with the current target object and a preset matching degree threshold condition.
Optionally, the determining a plurality of sub-images matched with the current target object according to the matching degree information of each sub-image with the current target object and the matching degree threshold condition includes:
judging whether the matching degree information of a sub-image with the current target object satisfies the matching degree threshold condition;
and if the sub-image satisfies the matching degree threshold condition, determining the sub-image as a sub-image matched with the current target object, and determining a plurality of sub-images matched with the current target object in the same way.
Optionally, the identifying a plurality of part areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images includes:
taking the image to be processed as input data of a convolutional neural network and an image area candidate frame generation network to obtain a partitioning result of the image to be processed, wherein the convolutional neural network and the image area candidate frame generation network together serve as a neural network for partitioning the image to be processed to obtain an image partitioning result;
and obtaining a plurality of sub-images according to the partitioning result of the image to be processed.
Optionally, the plurality of sub-images includes a first sub-image and a second sub-image;
the carrying out recognition processing on the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a plurality of recognition results includes:
performing quality evaluation on the first sub-image and the second sub-image, and performing first key point positioning and second key point positioning on the feature information contained in the first sub-image and the feature information contained in the second sub-image respectively, to obtain first key feature information and second key feature information satisfying a preset image quality evaluation condition;
and recognizing the first key feature information and the second key feature information satisfying the preset image quality evaluation condition to obtain a first recognition result and a second recognition result, and confirming the first recognition result and the second recognition result as the plurality of recognition results.
Optionally, the plurality of sub-images further includes a third sub-image;
the carrying out recognition processing on the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a plurality of recognition results includes:
performing third key point positioning on the feature information contained in the third sub-image to obtain third key feature information;
classifying the feature information contained in the third sub-image to obtain the category of the feature information of the third sub-image;
and obtaining a third recognition result based on the third key feature information and the feature information category of the third sub-image, and confirming the first recognition result, the second recognition result and the third recognition result as the plurality of recognition results.
Optionally, the method further comprises judging whether the target objects overlap;
and if the target objects overlap, carrying out segmentation processing on the image to be processed.
Optionally, the method further comprises judging whether a target object in the current image satisfies a preset condition;
and if the target object in the current image does not satisfy the preset condition, re-acquiring an image in which the target object satisfies the preset condition, and taking the image in which the target object satisfies the preset condition as the image to be processed.
Optionally, the method further comprises displaying the recognition result for the target object and obtaining feedback information for the recognition result;
and judging whether to perform the recognition again according to the feedback information.
Correspondingly, an embodiment of the present application provides an image processing apparatus, including:
a to-be-processed image obtaining unit, configured to obtain an image to be processed, wherein the image to be processed comprises at least two target objects;
a sub-image obtaining unit, configured to identify a plurality of part areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain feature information of the target object;
an association unit, configured to associate a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and a target object recognition unit, configured to carry out recognition processing on the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a recognition result for the target object.
The embodiment of the application provides an image processing method, which comprises the following steps:
obtaining an image to be processed, wherein the image to be processed comprises a target object;
identifying the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain feature information of the target object;
associating a plurality of sub-images corresponding to the same target object to obtain a plurality of sub-images corresponding to the target object;
and carrying out recognition processing on the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a recognition result for the target object.
The embodiment of the application provides a traffic image processing method, which comprises the following steps:
obtaining a traffic image to be processed, wherein the traffic image to be processed comprises vehicles;
identifying the traffic image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain feature information of a vehicle;
associating a plurality of sub-images corresponding to the same vehicle to obtain a plurality of sub-images corresponding to the vehicle;
and carrying out recognition processing on the plurality of sub-images corresponding to the vehicle based on the feature information of the vehicle to obtain a recognition result for the vehicle.
The embodiment of the application provides image processing equipment, which comprises an acquisition device and an identification result acquisition device;
the acquisition device is used for acquiring an image to be processed, wherein the image to be processed comprises a target object;
The recognition result obtaining device is used for identifying the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain feature information of the target object; associating the plurality of sub-images corresponding to the same target object to obtain a plurality of sub-images corresponding to the target object; and carrying out recognition processing on the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a recognition result for the target object.
Optionally, the display device is further included;
the display device is used for displaying the identification result aiming at the target object.
Correspondingly, the embodiment of the application provides electronic equipment, which comprises:
a processor;
a memory for storing a computer program to be executed by the processor to perform an image processing method, the method comprising the following steps:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
identifying a plurality of part areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain feature information of the target object;
associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and carrying out recognition processing on the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a recognition result for the target object.
Correspondingly, an embodiment of the present application provides a computer storage medium storing a computer program to be executed by a processor to perform an image processing method, the method comprising the steps of:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
identifying a plurality of part areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain feature information of the target object;
associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
and carrying out recognition processing on the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a recognition result for the target object.
Compared with the prior art, the application has the following advantages:
The embodiment of the application provides an image processing method, which comprises: obtaining an image to be processed, wherein the image to be processed comprises at least two target objects; identifying a plurality of part areas corresponding to the target objects based on the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain feature information of the target objects; associating the plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object; and carrying out recognition processing on the plurality of sub-images corresponding to the target object based on the feature information of the target object to obtain a recognition result for the target object. When analyzing an image that includes target objects to obtain recognition results for those objects, the application associates the plurality of sub-images corresponding to the same target object with that object to obtain a plurality of sub-images corresponding to the object. In this way, when a given target object is recognized, multiple pieces of its feature information already correspond to it, and the subsequent recognition of the target object is simpler and more convenient. This solves the problem that existing ways of identifying a target object in an image are complex.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The present application may be embodied in many other forms than those herein described, and those skilled in the art will readily appreciate that the present application may be similarly embodied without departing from the spirit or essential characteristics thereof, and therefore the present application is not limited to the specific embodiments disclosed below.
In order to more clearly show the image processing method provided by the application, an application scene of the image processing method provided by the application is introduced. The image processing method provided by the application can be applied to animal identification in zoos for helping animal managers or breeders manage or breed animal scenes.
Fig. 1-A and fig. 1-B are schematic diagrams of application scenarios of the image processing method provided by the present application. The application scenario is described with reference to fig. 1-A and fig. 1-B. First, an image to be processed is obtained, where the image to be processed may refer to an image to be identified in the present application, and is represented by the image to be analyzed in fig. 1-A and fig. 1-B. The image to be identified may be a video frame of a video file captured by a video camera. The video camera may refer to a camera device installed in a certain space to collect real-time behavior data of a target object. For example, in a monkey house in a zoo, in order to observe the daily physical health status and eating habits of a plurality of monkeys, imaging devices are installed in the monkey house, and the number of imaging devices is sufficient to ensure that all the monkeys in the monkey house can be captured; the imaging devices can collect audio data and image data of all the monkeys in the monkey house. In this scenario, the target objects are all the monkeys in the monkey house. After the video file of the image pickup device is acquired, framing processing is performed on the video file to obtain a video frame, namely the image to be identified. As shown in fig. 1-A, two monkeys are included in the image to be recognized; of course, an image including more monkeys or an image including one monkey may also be used as the image to be recognized. After the image to be identified is obtained, area identification is performed on it to obtain a plurality of sub-images. Specifically, the regions including the face, head, and upper body of each monkey in the image to be recognized may be recognized and marked to obtain a plurality of sub-images.
Meanwhile, the plurality of sub-images may be further classified: for example, a sub-image corresponding to a monkey's face may be regarded as a first sub-image, a sub-image corresponding to a monkey's head as a second sub-image, and a sub-image corresponding to a monkey's upper body as a third sub-image. Here, the monkey head may include the monkey face, and the monkey upper body may include the monkey head and the monkey face.
After the plurality of sub-images are obtained, each sub-image is associated with the target object it corresponds to. That is, each sub-image corresponds to a monkey, and the sub-images are associated with the monkeys. Specifically, the plurality of sub-images corresponding to the same monkey are associated with that monkey. As shown in fig. 1-A, a first sub-image of a first monkey, a second sub-image of the first monkey, and a third sub-image of the first monkey are obtained from the image to be identified. At the same time, a first sub-image of a second monkey, a second sub-image of the second monkey, and a third sub-image of the second monkey may also be obtained. The first sub-image, second sub-image, and third sub-image of the first monkey are all sub-images associated with the first monkey. Similarly, the first sub-image, second sub-image, and third sub-image of the second monkey are sub-images associated with the second monkey.
Then, the plurality of sub-images associated with each monkey are identified. Specifically, taking the recognition of one monkey in the image as an example, recognition processing is performed on the plurality of sub-images corresponding to the monkey based on the feature information of the monkey in the sub-images, to obtain a recognition result for the monkey. More specifically, performing recognition processing on the plurality of sub-images corresponding to the monkey to obtain a recognition result for the monkey means that, first, recognition processing is performed on the plurality of sub-images corresponding to the monkey respectively, based on the feature information of the monkey, to obtain a plurality of recognition results; then, a recognition result for the monkey is obtained based on the plurality of recognition results.
Specifically, based on the feature information of the monkey, a plurality of sub-images corresponding to the monkey are respectively identified, and a plurality of identification results are obtained, in the following manner:
firstly, performing quality evaluation on a first sub-image corresponding to the monkey and a second sub-image corresponding to the monkey, and performing first key point positioning and second key point positioning on feature information contained in the first sub-image corresponding to the monkey and feature information contained in the second sub-image corresponding to the monkey respectively to obtain first key feature information and second key feature information meeting preset image quality evaluation conditions.
And then, identifying the first key feature information and the second key feature information which meet the preset image quality evaluation condition to obtain a first identification result and a second identification result.
The process of identifying the third sub-image, which may be performed while the first and second sub-images are identified, is described below. First, third key point positioning is performed on the feature information included in the third sub-image corresponding to the monkey to obtain third key feature information, and the feature information included in the third sub-image corresponding to the monkey is classified to obtain the category of the feature information of the third sub-image.
And then, obtaining a third identification result based on the third key feature information and the feature information category of the third sub-image. Finally, the first recognition result, the second recognition result and the third recognition result are confirmed to be a plurality of recognition results.
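For ease of understanding, the per-part recognition flow described above — quality evaluation, key point positioning, then recognition of the key feature information — can be sketched as follows. This is an illustrative Python sketch only: every function is a hypothetical stand-in for a real model, and the sketch shows the control flow, not any actual recognition algorithm.

```python
# Illustrative sketch of the per-part recognition flow. All functions
# are hypothetical stand-ins for real models.

QUALITY_THRESHOLD = 0.5   # preset image quality evaluation condition

def quality_ok(sub_image):
    """Quality evaluation: keep only sub-images that satisfy the
    preset image quality evaluation condition."""
    return sub_image["quality"] >= QUALITY_THRESHOLD

def locate_key_points(sub_image):
    # Stand-in for a key point positioning model; returns key
    # feature information extracted around the located points.
    return {"part": sub_image["part"], "key_points": sub_image["points"]}

def recognize(key_features):
    # Stand-in for the recognition model applied to key features.
    return f"result-for-{key_features['part']}"

sub_images = [
    {"part": "face", "quality": 0.9, "points": [(1, 2), (3, 4)]},
    {"part": "head", "quality": 0.8, "points": [(0, 1)]},
]

# Evaluate quality, locate key points, then recognize each sub-image.
results = [recognize(locate_key_points(s)) for s in sub_images if quality_ok(s)]
print(results)   # ['result-for-face', 'result-for-head']
```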
The first recognition result may be an expression result of the monkey recognized based on the monkey facial feature information, the second recognition result may be a monkey identity result recognized based on the monkey head feature information, and the third recognition result may be a monkey action result recognized based on the feature information of the upper body of the monkey. As shown in fig. 1-A, taking the first monkey as an example, a first recognition result, a second recognition result, and a third recognition result of the first monkey can be obtained from the first sub-image, the second sub-image, and the third sub-image of the first monkey, and these three results are collectively used as the recognition result of the first monkey. Similarly, the first, second, and third recognition results of the second monkey can be obtained from the first, second, and third sub-images of the second monkey, and these three results are collectively used as the recognition result of the second monkey. Of course, no matter how many target objects are contained in the image to be recognized, the recognition result corresponding to each target object may be obtained in the manner shown in fig. 1-A and fig. 1-B.
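The gathering of the expression, identity, and action results into a single recognition result for one monkey can be sketched as follows. This is an illustrative Python sketch; `combine_results` and the example result values are hypothetical, not part of the application.

```python
# Illustrative sketch: merge per-part recognition results into one
# recognition result per target object. All names are hypothetical.

def combine_results(first, second, third):
    """Combine the expression (face), identity (head), and action
    (upper-body) results into a single recognition result."""
    return {
        "identity": second,     # second result: identity from head features
        "expression": first,    # first result: expression from face features
        "action": third,        # third result: action from upper-body features
    }

# Usage: hypothetical per-part results for the first monkey.
result = combine_results("calm", "monkey-01", "eating")
print(result)
# {'identity': 'monkey-01', 'expression': 'calm', 'action': 'eating'}
```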
The application scenario of the image processing method described above is merely an embodiment of the application scenario of the image processing method provided by the present application, and the purpose of the application scenario embodiment is to facilitate understanding of the image processing method provided by the present application, and is not intended to limit the image processing method provided by the present application. Other application scenarios of the image processing method in the embodiment of the present application will not be described in detail.
The application provides an image processing method, an image processing device, electronic equipment and a computer storage medium, and the following is a specific embodiment.
Fig. 2 is a flowchart of an embodiment of an image processing method according to the first embodiment of the present application. The method comprises the following steps.
Step S201, obtaining an image to be processed, wherein the image to be processed comprises at least two target objects.
As the first step of the image processing method of the first embodiment, an image to be processed is first obtained, wherein the image to be processed contains at least two target objects. The image to be processed may refer to a video frame image. When obtaining the image to be processed, a request for obtaining a video file may first be sent to the video acquisition apparatus; then the video file sent by the video acquisition apparatus is received, and a video frame image is obtained according to the video file.
Specifically, taking the application of the image processing method to a zoo scenario as an example, the image to be identified may be a video frame of a video file captured by a video camera. The video camera may refer to a camera device installed in a certain space to collect real-time behavior data of a target object. For example, in a monkey house in a zoo, in order to observe the daily physical health status and eating habits of a plurality of monkeys, imaging devices are installed in the monkey house, and the number of imaging devices is sufficient to ensure that all the monkeys in the monkey house can be captured; the imaging devices can collect audio data and image data of all the monkeys in the monkey house. In this scenario, the target objects are all the monkeys in the monkey house. After the video file of the image pickup device is acquired, framing processing is performed on the video file to obtain a video frame, namely the image to be identified.
In order to timely send the physical health status and the eating habits of all the monkeys in the monkey house to the client of the breeder, real-time shooting monitoring is required to be carried out on a plurality of monkeys in the monkey house, and a plurality of shooting devices can be installed at a plurality of positions in the monkey house so as to ensure that the plurality of shooting devices can monitor all the monkeys in the monkey house.
After the plurality of camera devices collect the image data and the audio data in the monkey house for a period of time, a plurality of video files of the plurality of stored camera devices are sent to a server. Specifically, the breeder may use the client to issue a request for obtaining a video file to each of the plurality of image capturing apparatuses, and each image capturing apparatus, after receiving the request, transmits the respective video file to the client or the server, thereby further realizing the obtaining of the image to be processed in step S201.
When a plurality of video files are transmitted to the client, the client may transmit the plurality of video files to the server. When a plurality of video files are sent to the server, the server can directly obtain the plurality of video files. In either way, multiple video files need to be finally sent to the server for processing of the video files.
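The acquisition flow above — requesting a video file from each image capturing apparatus and framing it into images to be processed — may be sketched as follows. This is an illustrative Python sketch only: the camera API is simulated, names such as `request_video_file` and `extract_frames` are hypothetical, and a real system might use a library such as OpenCV for the framing step.

```python
# Minimal sketch of the acquisition flow: request the stored video
# file from each camera, then split it into frames. A video is
# modelled here as a list of frames.

def request_video_file(camera):
    # Hypothetical stand-in for sending a request to a video
    # acquisition device and receiving its stored video file.
    return camera["video"]

def extract_frames(video, step=1):
    """Framing: take every `step`-th frame of the video as an image
    to be processed."""
    return video[::step]

cameras = [{"id": 0, "video": ["f0", "f1", "f2", "f3"]},
           {"id": 1, "video": ["g0", "g1"]}]

# Collect frames from every camera in the monkey house.
frames = []
for cam in cameras:
    frames.extend(extract_frames(request_video_file(cam), step=2))
print(frames)   # ['f0', 'f2', 'g0']
```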
Step S202, based on the image to be processed, identifying a plurality of part areas corresponding to the target object to obtain a plurality of sub-images, wherein the plurality of sub-images contain characteristic information of the target object.
After the image to be processed is obtained in step S201, a plurality of location areas corresponding to the target object in the image to be processed are identified, and a plurality of sub-images are obtained, where the plurality of sub-images include feature information of the target object.
Specifically, identifying a plurality of part areas corresponding to the target object in the image to be processed to obtain a plurality of sub-images means that the image to be identified is identified in a partitioned manner, and the plurality of sub-images are obtained based on the partition recognition result.
As an embodiment of identifying a plurality of part areas corresponding to the target object in the image to be processed to obtain a plurality of sub-images, the image to be processed may be used as input data of a convolutional neural network and an image area candidate frame generation network to obtain a partitioning result of the image to be processed. The convolutional neural network and the image area candidate frame generation network together serve as a neural network for partitioning the image to be processed to obtain an image partitioning result.
The convolutional neural network can extract characteristic information of a target object in the image, and the image region candidate frame generation network can partition the image to be identified based on the characteristic information of the target object extracted by the convolutional neural network so as to obtain a partition result. For example, in an image to be recognized including a plurality of monkeys, feature information of the monkeys in the image may be acquired using a convolutional neural network, and facial feature information, head feature information, and upper body feature information of the monkeys may be acquired. After the feature information of the target object is acquired, the image area candidate frame generation network can partition the image to be identified according to the feature information of the target object so as to acquire a partition result. The partitioning result refers to a partitioning result of the image to be recognized, and after the partitioning result of the image to be recognized is obtained, a plurality of sub-images are obtained according to the partitioning result of the image to be recognized. For example, after recognizing the faces, heads, and upper bodies of a plurality of monkeys in an image to be recognized, the recognized faces, heads, and upper bodies are marked in different ways, so that a plurality of sub-images are obtained.
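As an illustration of how a partitioning result can be turned into sub-images, the following Python sketch crops marked regions out of an image. The detector itself (the convolutional neural network and candidate frame generation network) is replaced here by a fixed, hypothetical partitioning result; the `crop` helper and the box format are assumptions made for the example.

```python
# Sketch of turning a partitioning result (candidate boxes with part
# labels) into sub-images by cropping.

def crop(image, box):
    """Crop a region (x0, y0, x1, y1) from an image stored as a list
    of pixel rows."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]

# A toy 4x4 "image"; in practice this would be a video frame.
image = [[(y, x) for x in range(4)] for y in range(4)]

# Hypothetical partitioning result: one candidate box per part area.
partition_result = [
    {"part": "face",       "box": (0, 0, 2, 2)},
    {"part": "head",       "box": (0, 0, 3, 2)},
    {"part": "upper_body", "box": (0, 0, 4, 4)},
]

# One sub-image per marked part area.
sub_images = {p["part"]: crop(image, p["box"]) for p in partition_result}
print(len(sub_images["face"]), len(sub_images["face"][0]))  # 2 2
```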
Step S203, associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object.
After the plurality of part areas corresponding to the target object are identified and the plurality of sub-images are obtained in step S202, for each target object, the plurality of sub-images corresponding to the same target object are associated with that target object to obtain a plurality of sub-images corresponding to the target object.
Specifically, as one way of associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object, first, a current target object is obtained. Then, a plurality of sub-images matched with the current target object are determined based on the characteristic information of the target object contained in the sub-images. And finally, establishing an association relation between the plurality of sub-images matched with the current target object and the current target object, and obtaining a plurality of sub-images corresponding to the target object.
For example, when two monkeys are included in the image to be recognized, if only the face, head, and upper body of the monkey in the image are recognized, six sub-images are obtained, which are respectively a sub-image corresponding to the face of the first monkey, a sub-image corresponding to the face of the second monkey, a sub-image corresponding to the head of the first monkey, a sub-image corresponding to the head of the second monkey, a sub-image corresponding to the upper body of the first monkey, and a sub-image corresponding to the upper body of the second monkey.
For each monkey, taking monkey A of the two monkeys as an example, the plurality of sub-images corresponding to monkey A may be obtained by associating those sub-images with monkey A. Specifically, a plurality of sub-images matching monkey A are determined based on the feature information of the monkey contained in the sub-images, which may be done as follows.
First, based on the feature information of the monkey contained in the sub-images, each of the six sub-images is traversed to obtain matching degree information of each sub-image with monkey A. Of course, all the feature information of monkey A may also be stored in a database in advance, and the feature information of the monkey in the plurality of sub-images may be matched against the feature information of monkey A stored in the database.
Then, a plurality of sub-images matching monkey A are determined according to the matching degree information of each sub-image with monkey A and a preset matching degree threshold condition.
As one embodiment of determining a plurality of sub-images matching monkey A based on the matching degree information of each sub-image with monkey A and a preset matching degree threshold condition, it is judged whether the matching degree information of a sub-image with monkey A satisfies the threshold condition. If it does, the sub-image is determined to be a sub-image matching monkey A, and the plurality of sub-images matching monkey A are determined in the same manner.
For example, by calculating the matching degrees among the six sub-images, it is found that the matching degrees of the sub-image corresponding to the face of the first monkey, the sub-image corresponding to the head of the first monkey, and the sub-image corresponding to the upper body of the first monkey with monkey A satisfy the matching degree threshold condition. These three sub-images are therefore determined to be the plurality of sub-images matching monkey A, while the sub-images corresponding to the face, head, and upper body of the second monkey are determined not to match monkey A.
Similarly, when monkey B is the current target object, the calculated matching degrees of the sub-images corresponding to the face, head, and upper body of the first monkey with monkey B do not satisfy the matching degree threshold condition, so those sub-images are determined not to match monkey B, while the sub-images corresponding to the face, head, and upper body of the second monkey are determined to be the plurality of sub-images matching monkey B.
In this way, for each target object, the plurality of sub-images corresponding to the same target object can be associated with that target object, obtaining the plurality of sub-images corresponding to the target object.
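The matching-degree association just described can be sketched as below. The feature vectors and the 0.9 threshold are illustrative assumptions; the embodiment leaves the matching-degree measure open, and cosine similarity is used here only as one plausible choice.

```python
# Minimal sketch of the association step: each sub-image carries a feature
# vector, and a sub-image is associated with the current target object when
# its matching degree (here, cosine similarity) meets a preset threshold.
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def associate(sub_images, target_feature, threshold=0.9):
    """Return the sub-images whose matching degree with the target meets the threshold."""
    return [s for s in sub_images if cosine(s["feature"], target_feature) >= threshold]

monkey_a = [1.0, 0.0]                                # stored feature of monkey A
sub_images = [
    {"part": "face", "feature": [0.99, 0.05]},       # first monkey
    {"part": "head", "feature": [0.97, 0.10]},       # first monkey
    {"part": "face", "feature": [0.05, 0.99]},       # second monkey
]
matched = associate(sub_images, monkey_a)
print([s["part"] for s in matched])
```

Running the same `associate` call with monkey B's stored feature would select the second monkey's sub-images instead, mirroring the monkey A / monkey B example above.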
Step S204, based on the characteristic information of the target object, the plurality of sub-images corresponding to the target object are identified, and an identification result for the target object is obtained.
After the plurality of sub-images corresponding to each target object is obtained in step S203, for each target object, the plurality of sub-images corresponding to that target object is subjected to recognition processing based on the feature information of the target object, obtaining a recognition result for the target object.
Specifically, based on the feature information of the target object, for each target object, the plurality of sub-images corresponding to the target object are subjected to the recognition processing, and the recognition result for the target object may be obtained in the manner described below.
First, based on feature information of a target object, a plurality of sub-images corresponding to the target object are respectively identified, and a plurality of identification results are obtained.
Since the plurality of sub-images corresponding to each target object includes a first sub-image and a second sub-image, the first sub-image and the second sub-image corresponding to the target object may be recognized in the following manner.
And respectively identifying the first sub-image and the second sub-image corresponding to the target object based on the characteristic information of the target object to obtain a first identification result and a second identification result corresponding to the target object.
First, quality evaluation is performed on the first sub-image and the second sub-image, and first key point positioning and second key point positioning are performed respectively on the feature information contained in the first sub-image and the feature information contained in the second sub-image, obtaining first key feature information and second key feature information that meet a preset image quality evaluation condition. Then, the first key feature information and the second key feature information that meet the preset image quality evaluation condition are recognized, obtaining a first recognition result and a second recognition result, and the first recognition result and the second recognition result are confirmed as recognition results among the plurality of recognition results.
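The quality-evaluation gate described here can be sketched as follows. Pixel variance is used as a crude stand-in for a real image-quality score (the embodiment does not specify one); sub-images that fall below the threshold are filtered out before key point positioning and recognition.

```python
# Hedged sketch of the quality-evaluation step: blurred sub-images are
# filtered out before recognition, using pixel variance as an illustrative
# proxy for image quality (low variance ~ flat, blurred content).

def variance(pixels):
    """Variance of all pixel values in a 2-D nested-list image."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return sum((p - mean) ** 2 for p in flat) / len(flat)

def quality_filter(sub_images, min_variance=10.0):
    """Keep only sub-images sharp enough (by this proxy) for recognition."""
    return [s for s in sub_images if variance(s["pixels"]) >= min_variance]

sharp = {"part": "face", "pixels": [[0, 50], [100, 150]]}     # high contrast
blurred = {"part": "head", "pixels": [[80, 81], [80, 82]]}    # nearly uniform
kept = quality_filter([sharp, blurred])
print([s["part"] for s in kept])
```

A production system would more likely use a learned quality model or a Laplacian-based sharpness measure; the point here is only the gate itself, which discards sub-images before the more expensive recognition step.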
For example, when a face sub-image and a head sub-image of a monkey are recognized, the expression recognition result and the identity recognition result of the monkey can be obtained from them. In practice, after the face sub-image and the head sub-image of the monkey are obtained, quality evaluation is performed on them, and images unsuitable for expression recognition or identity recognition can be filtered out. For example, blurred images are unsuitable for expression recognition or identity recognition, and such images can be filtered out in advance by quality evaluation so as to improve recognition accuracy.
When expression recognition or identity recognition is carried out, key point information in the sub-images, i.e., key feature information, is used for recognition. Thus, after quality evaluation, the first key feature information and the second key feature information are obtained, and recognition is then performed based on them. For example, for expression recognition, the first key feature information may include key feature information of the eyes, mouth, and so on; for identity recognition, the second key feature information may include key feature information of the eyes, nose, ears, mouth, and so on. The expression recognition result of the monkey, such as happy, unhappy, or hungry, can be obtained from the first key feature information of the face, and the identity of the monkey, e.g., whether it is monkey A or monkey B, can be obtained from the second key feature information of the head.
Since the plurality of sub-images corresponding to each target object further includes a third sub-image, the third sub-image corresponding to the target object may be recognized in the following manner.
And identifying a third sub-image corresponding to the target object based on the characteristic information of the target object, and obtaining a third identification result corresponding to the target object.
As one embodiment of recognizing the third sub-image corresponding to the target object based on the feature information of the target object to obtain a third recognition result, first, third key point positioning is performed on the feature information contained in the third sub-image to obtain third key feature information. Then, the feature information contained in the third sub-image is classified to obtain the category of the feature information of the third sub-image, i.e., action classification is performed. Finally, the third recognition result is obtained based on the third key feature information and the feature information category of the third sub-image.
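The two-part structure of this step, key point positioning plus action classification combined into one result, can be sketched as below. The landmark names and the classifier rule are purely hypothetical placeholders for the trained models the embodiment assumes.

```python
# Illustrative sketch of the third recognition step: third key point
# positioning and action classification are combined into a third
# recognition result. The landmark names and the decision rule are
# placeholders, not real model behavior.

def locate_keypoints(feature_info):
    """Stand-in key point positioning: pick the named upper-body landmarks."""
    return {k: feature_info[k] for k in ("shoulder", "elbow") if k in feature_info}

def classify_action(feature_info):
    """Stand-in classifier: an elbow raised above the shoulder reads as 'reaching'."""
    raised = feature_info.get("elbow", 0) > feature_info.get("shoulder", 0)
    return "reaching" if raised else "resting"

def third_recognition(sub_image):
    """Combine key feature information and action category into one result."""
    return {
        "keypoints": locate_keypoints(sub_image["features"]),
        "action": classify_action(sub_image["features"]),
    }

result = third_recognition({"features": {"shoulder": 0.4, "elbow": 0.7}})
print(result["action"])
```

The combined result mirrors the text: the third key feature information and the feature information category are returned together as the third recognition result.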
The present embodiment can simultaneously recognize the behavior of each of a plurality of target objects in a plurality of continuous video frames. Fig. 3 shows an algorithm framework diagram for this recognition, which can be applied to a scene in which the upper-body behavior of each of a plurality of monkeys is recognized simultaneously. In practice, the behavior of each of the plurality of target objects in the continuous video frames is identified based on the video file. First, the plurality of continuous video frames is input into the convolutional neural network and the region candidate frame generation network to obtain feature maps of the frames; target similarity is then obtained by aligning region features and performing target similarity matching. That is, the algorithm captures not just the static posture of a monkey: the dynamic upper-body behavior of a monkey can be identified across continuous video frames. During recognition, a feature extractor extracts features, and the features of the plurality of video frames undergo feature fusion, yielding the action classification of a monkey across the continuous frames.
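The feature-fusion step can be sketched as follows. Averaging per-frame feature vectors is only one simple fusion strategy, used here as an assumption; the embodiment does not commit to a particular fusion operator, and the frame features and classifier rule are illustrative.

```python
# Sketch of the temporal step: per-frame features are extracted, fused
# (here by simple averaging), and the fused feature is classified, so the
# action label reflects motion across consecutive frames rather than a
# single frame. Frame features and the decision rule are illustrative.

def fuse(frame_features):
    """Average corresponding feature components over consecutive frames."""
    n = len(frame_features)
    dim = len(frame_features[0])
    return [sum(f[i] for f in frame_features) / n for i in range(dim)]

def classify(fused, threshold=0.5):
    """Toy rule: high mean activation in the first component means 'climbing'."""
    return "climbing" if fused[0] > threshold else "sitting"

frames = [[0.2, 0.9], [0.6, 0.8], [1.0, 0.7]]  # one feature vector per video frame
fused = fuse(frames)
print(classify(fused))
```

Note that no single frame's feature alone determines the label; the rising first component across the three frames is what pushes the fused value over the threshold, which is the sense in which fusion captures dynamic rather than static behavior.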
The present embodiment can perform action classification by using the algorithm shown in fig. 3; after action classification, the third recognition result can be obtained based on the third key feature information and the action category.
After the first, second, and third recognition results are obtained, the first, second, and third recognition results are taken as a plurality of recognition results.
When the image processing method of this embodiment is adopted, whether the target objects in the image to be processed overlap can be judged in advance, and if they overlap, the image to be processed is subjected to segmentation processing. Specifically, the image to be processed may be segmented by means of bounding box regression, resolving the overlap between target objects so as to obtain the plurality of sub-images.
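The overlap judgment can be sketched with a standard intersection-over-union (IoU) test between two target bounding boxes; the embodiment does not name a specific criterion, so IoU is an assumption here.

```python
# Hedged sketch of the overlap check: intersection-over-union between two
# target bounding boxes decides whether segmentation of the image to be
# processed is needed. Boxes are (x1, y1, x2, y2).

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def targets_overlap(box_a, box_b, threshold=0.0):
    """True when the boxes share any area beyond the threshold."""
    return iou(box_a, box_b) > threshold

print(targets_overlap((0, 0, 10, 10), (5, 5, 15, 15)))    # overlapping boxes
print(targets_overlap((0, 0, 10, 10), (20, 20, 30, 30)))  # disjoint boxes
```

When `targets_overlap` returns true, the segmentation processing described above would be triggered for that pair of targets.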
In addition, before the image to be processed is obtained, whether the target object in the current image meets the preset condition can be judged in advance, if the target object in the current image does not meet the preset condition, the image of the target object meeting the preset condition is obtained again, and the image of the target object meeting the preset condition is taken as the image to be processed.
Specifically, since the image to be processed of the present embodiment includes at least two target objects, the number of target objects in the image may be set as one preset condition. It can therefore be determined in advance whether the number of target objects in the current image meets a preset number condition. If it does not, an image whose number of target objects meets the preset number condition is obtained again and taken as the image to be processed. For example, when the current image contains only one target object, it is not processed in the manner of steps S201 to S204; processing in that manner begins once an image containing at least two target objects is obtained.
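The number-condition gate can be sketched as below. `detect_count` is a hypothetical hook standing in for whatever detector supplies the target count; the stream of images is likewise illustrative.

```python
# Minimal sketch of the preset-condition check: images are skipped until
# one containing at least two target objects arrives. `detect_count` is a
# stand-in for the detector that counts target objects in an image.

def first_processable(images, detect_count, min_targets=2):
    """Return the first image whose target-object count meets the preset condition."""
    for img in images:
        if detect_count(img) >= min_targets:
            return img
    return None  # no image in the stream met the condition

stream = [{"id": 1, "targets": 1}, {"id": 2, "targets": 0}, {"id": 3, "targets": 2}]
chosen = first_processable(stream, detect_count=lambda img: img["targets"])
print(chosen["id"])
```

The returned image is the one taken as the image to be processed; frames with too few targets are simply passed over rather than run through steps S201 to S204.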
In addition, definition information of the target object in the image may be used as a preset condition: when the definition of the target object meets a pixel condition sufficient for recognition, the image is taken as the image to be processed. It can be understood that the preset condition may also be other conditions related to the target object, all of which fall within the protection scope of the present application and are not described here.
Meanwhile, in order to improve the accuracy of the recognition result, the recognition result for the target object can be displayed to a user, and the user's feedback information on the recognition result can be obtained. For example, when the user's feedback is that the recognition result is correct, the recognition result may be directly used as the recognition result for the target object in the image to be processed; when the feedback is that the recognition result is wrong, the image to be processed can be recognized again.
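The feedback loop can be sketched as follows. The recognizer, the feedback source, and the bound on retries are all illustrative stand-ins for the interactive flow described in this paragraph.

```python
# Sketch of the user-feedback loop: a correct result is accepted; an
# incorrect one triggers re-identification. `recognize` and `get_feedback`
# are hypothetical hooks, and the retry bound is an illustrative choice.

def finalize(recognize, get_feedback, image, max_rounds=3):
    """Re-run recognition until the user confirms the result (or rounds run out)."""
    result = None
    for _ in range(max_rounds):
        result = recognize(image)
        if get_feedback(result) == "correct":
            return result
    return result  # last attempt, even if unconfirmed

attempts = iter(["monkey B", "monkey A"])  # first pass wrong, second pass right
result = finalize(
    recognize=lambda img: next(attempts),
    get_feedback=lambda r: "correct" if r == "monkey A" else "wrong",
    image={"frame": 0},
)
print(result)
```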
The embodiment of the application provides an image processing method. Specifically, an image to be processed is obtained; a plurality of part areas corresponding to the target object is identified based on the image to be processed, obtaining a plurality of sub-images that contain feature information of the target object; the plurality of sub-images corresponding to the same target object is associated with that target object, obtaining a plurality of sub-images corresponding to the target object; and the plurality of sub-images corresponding to the target object is recognized based on the feature information of the target object, obtaining a recognition result for the target object. When an image comprising at least two target objects is analyzed to obtain recognition results for the target objects in the image, the plurality of sub-images corresponding to the same target object is associated with that target object for each target object in the image, so that when a given target object is recognized, the plural pieces of feature information of the target object already correspond to it, making the subsequent recognition of the target object simpler and more convenient. This solves the problem that the existing manner of identifying a target object in an image is complex.
In the first embodiment described above, an image processing method is provided, and correspondingly, the present application also provides an image processing apparatus. Fig. 4 is a schematic diagram of an image processing apparatus according to a second embodiment of the present application. Since the apparatus embodiments are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
The present embodiment provides an image processing apparatus including:
A to-be-processed image obtaining unit 401, configured to obtain a to-be-processed image, where the to-be-processed image includes at least two target objects;
A sub-image obtaining unit 402, configured to identify a plurality of location areas corresponding to the target object based on the image to be processed, and obtain a plurality of sub-images, where the plurality of sub-images include feature information of the target object;
an associating unit 403, configured to associate a plurality of sub-images corresponding to the same target object with the target object, and obtain a plurality of sub-images corresponding to the target object;
And the target object recognition unit 404 is configured to perform recognition processing on the multiple sub-images corresponding to the target object based on the feature information of the target object, so as to obtain a recognition result for the target object.
Optionally, the target object identifying unit is specifically configured to:
based on the characteristic information of the target object, respectively carrying out recognition processing on the plurality of sub-images corresponding to the target object to obtain a plurality of recognition results;
and obtaining the recognition result aiming at the target object according to the plurality of recognition results.
Optionally, the image to be processed refers to a video frame image;
The image obtaining unit to be processed is specifically configured to:
sending a request for obtaining a video file to a video obtaining device;
And receiving the video file sent by the video acquisition device, and acquiring a video frame image according to the video file.
Optionally, the association unit is specifically configured to:
Obtaining a current target object;
Determining a plurality of sub-images matched with the current target object based on the characteristic information of the target object contained in the sub-images;
And establishing an association relation between the plurality of sub-images matched with the current target object and the current target object to obtain a plurality of sub-images corresponding to the target object.
Optionally, the association unit is specifically configured to:
traversing each sub-image in the plurality of sub-images based on the characteristic information of the target object contained in the sub-image to obtain the matching degree information of each sub-image and the current target object;
And determining a plurality of sub-images matched with the current target object according to the matching degree information of each sub-image and the current target object and a preset matching degree threshold condition.
Optionally, the association unit is specifically configured to:
Judging whether the matching degree information of the sub-image and the current target object meets the matching degree threshold condition or not;
And determining the sub-image as the sub-image matched with the current target object if the sub-image meets the matching degree threshold condition, and determining a plurality of sub-images matched with the current target object in the same way.
Optionally, the sub-image obtaining unit is specifically configured to:
taking the image to be processed as input data of a convolutional neural network and an image region candidate frame generation network to obtain a partitioning result of the image to be processed, wherein the convolutional neural network and the image region candidate frame generation network jointly serve as a neural network that partitions the image to be processed to obtain the image partitioning result;
And obtaining a plurality of sub-images according to the partitioning result of the image to be processed.
Optionally, the plurality of sub-images includes a first sub-image and a second sub-image;
The target object identification unit is specifically configured to:
Performing quality evaluation on the first sub-image and the second sub-image, and performing first key point positioning and second key point positioning on feature information contained in the first sub-image and feature information contained in the second sub-image respectively to obtain first key feature information and second key feature information meeting preset image quality evaluation conditions;
And identifying the first key feature information and the second key feature information which meet the preset image quality evaluation condition to obtain a first recognition result and a second recognition result, and confirming the first recognition result and the second recognition result as the plurality of recognition results.
Optionally, the plurality of sub-images further includes a third sub-image;
The target object identification unit is specifically configured to:
performing third key point positioning on the characteristic information contained in the third sub-image to obtain third key characteristic information;
Classifying the characteristic information contained in the third sub-image to obtain the category of the characteristic information of the third sub-image;
and obtaining a third recognition result based on the third key feature information and the feature information category of the third sub-image, and confirming the first recognition result, the second recognition result and the third recognition result as the plurality of recognition results.
Optionally, the apparatus further comprises a first judging unit and an image segmentation unit, wherein the first judging unit is used for judging whether the target objects overlap;
the image segmentation unit is used for carrying out segmentation processing on the image to be processed if the target objects are overlapped.
Optionally, the apparatus further comprises a second judging unit and an image selection unit, wherein the second judging unit is used for judging whether the target object in the current image meets a preset condition;
The image selection unit is used for re-acquiring an image of which the target object meets the preset condition if the target object in the current image does not meet the preset condition, and taking the image of which the target object meets the preset condition as the image to be processed.
Optionally, the apparatus further comprises a display unit, a feedback information obtaining unit and a re-identification unit, wherein the display unit is used for displaying the recognition result for the target object to a user, and the feedback information obtaining unit is used for obtaining the user's feedback information on the recognition result;
The re-identification unit is used for judging whether to re-identify the identification result according to the feedback information.
The present application also provides an image processing method, as shown in fig. 5, which is a flowchart of an embodiment of an image processing method according to a third embodiment of the present application. The method comprises the following steps.
Step S501, obtaining an image to be processed, wherein the image to be processed comprises a target object.
Step S502, identifying the image to be processed based on the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain characteristic information of the target object.
Step S503, associating the plurality of sub-images corresponding to the same target object to obtain a plurality of sub-images corresponding to the target object.
Step S504, based on the characteristic information of the target object, identifying a plurality of sub-images corresponding to the target object to obtain an identification result for the target object.
It should be noted that the steps in this embodiment are substantially similar to those in the first embodiment. The differences are that step S502 directly identifies the image to be processed to obtain a plurality of sub-images, rather than identifying a plurality of part areas corresponding to the target object, and step S503 associates the plurality of sub-images corresponding to the same target object, rather than associating them with the target object. For the parts of this embodiment that are the same as the first embodiment, reference is made to the relevant description of the first embodiment, which is not repeated here.
Based on the third embodiment, a fourth embodiment of the present application provides a traffic image processing method, as shown in fig. 6, which is a flowchart of an embodiment of the traffic image processing method provided by the fourth embodiment of the present application. The method comprises the following steps.
And step S601, obtaining a traffic image to be processed, wherein the traffic image to be processed comprises vehicles.
Step S602, identifying the traffic image to be processed based on the traffic image to be processed, and obtaining a plurality of sub-images, wherein the plurality of sub-images contain characteristic information of the vehicle.
Step S603, associating the plurality of sub-images corresponding to the same vehicle to obtain a plurality of sub-images corresponding to the vehicle.
Step S604, based on the characteristic information of the vehicle, performing recognition processing on a plurality of sub-images corresponding to the vehicle to obtain a recognition result for the vehicle.
This embodiment makes it possible to identify vehicle information, such as faulty vehicles or offending vehicles, from live traffic images obtained by an image pickup device in a traffic scene.
Based on the third embodiment, a fifth embodiment of the present application provides an image processing apparatus including an acquisition device, a recognition result acquisition device, and a display device.
The acquisition device is used for acquiring an image to be processed, wherein the image to be processed comprises a target object.
The recognition result obtaining device is used for recognizing the image to be processed based on the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images comprise characteristic information of a target object, associating the plurality of sub-images corresponding to the same target object to obtain a plurality of sub-images corresponding to the target object, and recognizing the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result aiming at the target object.
The display device is used for displaying the identification result aiming at the target object.
In the first embodiment described above, an image processing method is provided, and correspondingly, a sixth embodiment of the present application provides an electronic device corresponding to the method of the first embodiment. As shown in fig. 7, a schematic diagram of the electronic device provided in the present embodiment is shown.
A sixth embodiment of the present application provides an electronic apparatus including:
a processor 701;
A memory 702 for storing a computer program to be executed by the processor to perform an image processing method, the method comprising the following steps:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
Identifying a plurality of part areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain characteristic information of the target object;
Associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
And carrying out recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result aiming at the target object.
In the first embodiment described above, there is provided an image processing method, and correspondingly, a seventh embodiment of the present application provides a computer storage medium corresponding to the method of the first embodiment.
A seventh embodiment of the present application provides a computer storage medium storing a computer program that is executed by a processor to perform an image processing method, the method comprising the steps of:
obtaining an image to be processed, wherein the image to be processed comprises at least two target objects;
Identifying a plurality of part areas corresponding to the target object based on the image to be processed to obtain a plurality of sub-images, wherein the plurality of sub-images contain characteristic information of the target object;
Associating a plurality of sub-images corresponding to the same target object with the target object to obtain a plurality of sub-images corresponding to the target object;
And carrying out recognition processing on the plurality of sub-images corresponding to the target object based on the characteristic information of the target object to obtain a recognition result aiming at the target object.
While the application has been described in terms of preferred embodiments, it is not intended to be limiting, but rather, it will be apparent to those skilled in the art that various changes and modifications can be made herein without departing from the spirit and scope of the application as defined by the appended claims.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.