CN110414514B - Image processing method and device - Google Patents

Image processing method and device

Info

Publication number
CN110414514B
CN110414514B (application CN201910702791.5A)
Authority
CN
China
Prior art keywords
image frame
determining
target object
area
next image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910702791.5A
Other languages
Chinese (zh)
Other versions
CN110414514A (en)
Inventor
王旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority claimed from CN201910702791.5A
Publication of CN110414514A
Application granted
Publication of CN110414514B
Legal status: Active


Abstract

The embodiments of the disclosure provide an image processing method and device. A first region is determined according to the positions of the key points of a target object in a current image frame; the first region is enlarged by a preset multiple to obtain a second region; the predicted positions of the key points in the next image frame are determined according to the second region; and the accurate positions of the key points in the next image frame are determined according to their positions in the current image frame and their predicted positions in the next image frame. In this process the key points in the next image frame are tracked from the key points of the current image frame, so the key-point positions of the next image frame depend on those of the current image frame instead of each image frame being processed independently. The accurate positions of the key points in the next image frame can therefore be determined reliably, which reduces jitter of the occluded area and improves its stability.

Description

Image processing method and device
Technical Field
The embodiment of the disclosure relates to the technical field of video processing, in particular to an image processing method and device.
Background
With the continuous development of technology, more and more users shoot videos of interesting moments in daily life and upload them to the network for other users to watch.
In general, when a user uploads a recorded video to the network, the user may wish to block some relatively private content in the video, such as a logo or a vehicle's license plate, to protect privacy. Taking license-plate blocking as an example, before uploading the video the user blocks the vehicle's license plate based on an image-segmentation method or the like: the region where the vehicle is located is segmented from each image frame of the video, the license plate is extracted from that region, and the rectangle formed by the four key points of the license plate is then covered with a sticker or similar overlay, thereby blocking the license plate. The user then uploads the video with the blocked license plate to the network.
In practice, however, the vehicle in the video is moving, and because this segmentation-based method must segment the license plate in every image frame independently, the blocked area in the processed video jitters noticeably and is unstable.
Disclosure of Invention
The embodiments of the present disclosure provide an image processing method and device that reduce jitter of the occluded area and improve its stability.
In a first aspect, an embodiment of the present disclosure provides an image processing method, including:
determining a first region based on the position of a key point of a target object in a current image frame in a video, wherein the first region is a region containing the key point of the target object in the current image frame;
expanding the first area by a preset multiple to obtain a second area containing the first area;
determining a predicted location of the keypoint in a next image frame in the video within the second region of the next image frame, the next image frame being an image frame of a sequence of image frames of the video that is adjacent to the current image frame;
determining an exact location of a keypoint in the next image frame based on the location of the keypoint in the current image frame and the predicted location of the keypoint in the next image frame;
and blocking the target object in the next image frame based on the accurate position of the key point in the next image frame.
In a second aspect, an embodiment of the present disclosure provides an image processing apparatus, including:
a first determining module, configured to determine a first region based on a location of a keypoint of a target object in a current image frame in a video, the first region being a region containing the keypoint of the target object in the current image frame;
the amplifying module is used for expanding the first area by preset times to obtain a second area containing the first area;
a prediction module to determine a predicted location of the keypoint in a next image frame in the video within the second region of the next image frame, the next image frame being an image frame of a sequence of image frames of the video that is adjacent to the current image frame;
a second determining module for determining an accurate location of a keypoint in the next image frame based on the location of the keypoint in the current image frame and the predicted location of the keypoint in the next image frame;
and the processing module is used for shielding the target object in the next image frame based on the accurate position of the key point in the next image frame.
In a third aspect, an embodiment of the present disclosure provides an electronic device comprising a processor and a memory, where the memory stores a computer program executable on the processor; when executed by the processor, the computer program causes the electronic device to implement the method according to the first aspect or any of its possible implementations.
In a fourth aspect, an embodiment of the present disclosure provides a storage medium storing instructions which, when executed on an electronic device, cause the electronic device to perform the method according to the first aspect or any of its possible implementations.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product which, when run on an electronic device, causes the electronic device to perform the method according to the first aspect or any of its possible implementations.
The image processing method and device provided by the embodiments of the disclosure determine a first region according to the positions of the key points of a target object in a current image frame, enlarge the first region by a preset multiple to obtain a second region, determine the predicted positions of the key points in the next image frame according to the second region, and determine the accurate positions of the key points in the next image frame according to their positions in the current image frame and their predicted positions in the next image frame. The current image frame and the next image frame are two adjacent frames in an image frame sequence obtained by dividing a video into frames, and the key points in the current image frame correspond one-to-one to the key points in the next image frame. In this process the key points in the next image frame are tracked from the key points of the current image frame, so the key-point positions of the next image frame depend on those of the current image frame instead of each image frame being processed independently. The accurate positions of the key points in the next image frame can therefore be determined reliably, which reduces jitter of the occluded area and improves its stability.
Drawings
To illustrate the embodiments of the present disclosure or the technical solutions of the prior art more clearly, the drawings needed in their description are briefly introduced below. The drawings described below show some embodiments of the present disclosure; those skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of an operating environment of an image processing method provided by an embodiment of the present disclosure;
fig. 2 is a flowchart of an image processing method provided by an embodiment of the present disclosure;
FIG. 3 is a flow chart of another image processing method provided by the disclosed embodiments;
fig. 4 is a schematic diagram of a current image frame and a next image frame in an image processing method provided by an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of another image processing apparatus according to an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions are described below clearly and completely with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments derived by a person skilled in the art from the disclosed embodiments without creative effort fall within the protection scope of the present disclosure.
Before uploading a video to the network, a user may wish that certain private content, such as a license plate number, be obscured. In the image segmentation-based method, for each image frame of the video, the region where the vehicle is located is segmented from the frame, the position of the license plate is extracted from that region, and the license plate is then covered with a sticker or similar overlay, thereby blocking it.
However, the vehicle in the video is moving, and because this segmentation-based method segments the license plate in every image frame independently, the blocked area in the processed video jitters noticeably and is unstable.
In view of this, the embodiments of the present disclosure provide an image processing method and apparatus in which the key points in the next image frame are tracked from the key points of the current image frame to obtain the accurate positions of the target object's key points in the next image frame. The area where the target object is located is then determined from these accurate positions and occluded, which reduces jitter of the occluded area and improves its stability.
Fig. 1 is a schematic operating-environment diagram of an image processing method according to an embodiment of the present disclosure. Referring to fig. 1, an electronic device 10 establishes a network connection with a server 20, and the electronic device 10 has video-shooting capability. A user shoots a video with the electronic device 10 and, before uploading it to the server 20, occludes the target object in the video by key-point tracking; alternatively, the user shoots a video with the electronic device 10 and uploads it to the server 20, and the server 20 occludes the target object in the video using the image processing method of the embodiment and then publishes the video for other users to watch. The electronic device 10 may be, for example, a user's computer, notebook, or mobile phone; the embodiments of the present disclosure are not limited in this respect. The server 20 may be, for example, a server of a video-sharing platform.
Next, an image processing method according to an embodiment of the present disclosure is described in detail with reference to fig. 2.
Fig. 2 is a flowchart of an image processing method according to an embodiment of the present disclosure. This embodiment describes the method from the perspective of the electronic device and includes:
At block 101, a first region is determined based on the locations of key points of a target object in a current image frame in a video, the first region being a region containing the key points of the target object in the current image frame.
In some embodiments, the first region may be a minimum region containing all the keypoints of the target object in the current image frame.
For example, a video may be divided into frames to obtain an image frame sequence; of two adjacent frames, the earlier one is referred to as the current image frame and the later one as the next image frame. That is, in the embodiments of the present disclosure, "current image frame" and "next image frame" are relative, not absolute, designations. For example, a 1-minute video may include 1500 image frames, namely the 1st image frame through the 1500th image frame; when the 1st image frame is the current image frame, the 2nd image frame is the next image frame, and when the 2nd image frame is the current image frame, the 3rd image frame is the next image frame.
For the current image frame, a region, hereinafter referred to as the first region, is determined based on the position of the target object in the frame. The target object is an object the user wishes to block, such as a vehicle's license plate or a logo. Taking a license plate as the target object, its key points are the 4 vertices of the plate, and the first region is the rectangle formed by these 4 vertices.
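As a sketch of this step (not from the patent; the helper name and the plain-tuple representation of points and rectangles are illustrative), the first region can be computed as the minimal axis-aligned rectangle around the detected key points:

```python
def first_region(keypoints):
    """Minimal axis-aligned rectangle (x0, y0, x1, y1) containing all key points.

    keypoints: iterable of (x, y) pixel coordinates, e.g. the four
    license-plate corners detected in the current frame.
    """
    xs = [p[0] for p in keypoints]
    ys = [p[1] for p in keypoints]
    return min(xs), min(ys), max(xs), max(ys)
```

For four plate corners such as `(10, 20), (50, 22), (48, 40), (12, 38)` this yields the rectangle `(10, 20, 50, 40)`.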
At block 102, the first region is expanded by a preset multiple to obtain a second region comprising the first region.
The preset multiple may be set in advance, for example to 1.2, 1.4, or 1.5. Taking a preset multiple of 1.5 as an example, the length and the width of the first region are increased within the current image frame to obtain a second region whose area is 1.5 times that of the first region and which contains the first region. Because the second region occupies a larger area of the current image frame, it reliably covers the key points of the target object.
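One way to realize this enlargement (an assumption — the patent does not fix exactly how length and width are increased) is to scale both sides about the rectangle's center by the square root of the preset multiple, so that the area grows by exactly that multiple; the clamping parameters are likewise illustrative:

```python
import math

def expand_region(region, area_multiple=1.5, frame_w=None, frame_h=None):
    """Expand a rectangle about its center so its area grows by `area_multiple`.

    region: (x0, y0, x1, y1). Scaling each side by sqrt(area_multiple)
    multiplies the area by exactly area_multiple. Optionally clamp the
    result to the frame bounds.
    """
    x0, y0, x1, y1 = region
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    s = math.sqrt(area_multiple)
    half_w, half_h = (x1 - x0) * s / 2.0, (y1 - y0) * s / 2.0
    nx0, ny0, nx1, ny1 = cx - half_w, cy - half_h, cx + half_w, cy + half_h
    if frame_w is not None:
        nx0, nx1 = max(nx0, 0.0), min(nx1, float(frame_w))
    if frame_h is not None:
        ny0, ny1 = max(ny0, 0.0), min(ny1, float(frame_h))
    return nx0, ny0, nx1, ny1
```

For the first region `(10, 20, 50, 40)` this returns a rectangle that contains it and whose area is 1.5 times larger (clamping aside).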
At block 103, within the second region of a next image frame in the video, a predicted location of the keypoint in the next image frame is determined.
Wherein the next image frame is an image frame adjacent to the current image frame in a sequence of image frames of the video.
Illustratively, the current image frame and the next image frame have the same resolution; for example, if the current image frame is a 640 × 480 image, the next image frame is also a 640 × 480 image. Projecting the second region into the next image frame therefore amounts to marking out the same second region in the next image frame. The key points of the target object are then extracted within this second region, and their positions constitute the predicted positions of the key points.
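Because adjacent frames share a resolution, "projecting" the second region into the next frame can be sketched as reading the same pixel coordinates from it (an illustrative helper; frames are modeled as plain nested lists rather than a real image type):

```python
def crop_region(frame, region):
    """Crop `region` (x0, y0, x1, y1) from a frame stored as rows of pixels.

    Since the current and next frames have identical resolution, the second
    region's coordinates in the current frame index the same pixel window
    in the next frame; key-point extraction then runs on this crop only.
    """
    x0, y0, x1, y1 = (int(round(v)) for v in region)
    return [row[x0:x1] for row in frame[y0:y1]]
```

For a 5 × 6 frame, `crop_region(frame, (1, 1, 4, 3))` returns the 2-row, 3-column window whose top-left pixel is at row 1, column 1.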
At block 104, an accurate location of a keypoint in the next image frame is determined based on the location of the keypoint in the current image frame and the predicted location of the keypoint in the next image frame.
For example, the positions of the key points in the current image frame are accurate, while the predicted positions may not be; therefore, the accurate positions of the target object's key points in the next image frame are determined from the positions of the key points in the current image frame together with their predicted positions in the next image frame.
At block 105, a target object in the next image frame is occluded based on the exact location of the keypoints in the next image frame.
Illustratively, after the accurate positions of the target object's key points are determined, the region where the target object is located can be obtained from these positions and then occluded.
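A minimal illustration of the occlusion step, covering the box spanned by the key points with a solid "sticker" value (a hypothetical helper; a real implementation would draw over image pixels, e.g. with an overlay graphic):

```python
def occlude(frame, keypoints, fill=0):
    """Cover the axis-aligned box spanned by `keypoints` with a solid value.

    frame: mutable H x W grid (list of lists of pixel values).
    fill:  the value used as the "sticker" covering the target object.
    """
    xs = [int(round(p[0])) for p in keypoints]
    ys = [int(round(p[1])) for p in keypoints]
    for r in range(min(ys), max(ys) + 1):
        for c in range(min(xs), max(xs) + 1):
            frame[r][c] = fill
    return frame
```

Pixels inside the box take the fill value while pixels outside it are untouched, which is the whole effect the patent requires of the covering step.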
The image processing method provided by the embodiment of the disclosure determines a first region according to the positions of the key points of a target object in a current image frame, enlarges the first region by a preset multiple to obtain a second region, determines the predicted positions of the key points in the next image frame according to the second region, and determines the accurate positions of the key points in the next image frame according to their positions in the current image frame and their predicted positions in the next image frame. The current image frame and the next image frame are two adjacent frames in an image frame sequence obtained by dividing a video into frames, and the key points in the current image frame correspond one-to-one to the key points in the next image frame. In this process the key points in the next image frame are tracked from the key points of the current image frame, so the key-point positions of the next image frame depend on those of the current image frame instead of each image frame being processed independently. The accurate positions of the key points in the next image frame can therefore be determined reliably, which reduces jitter of the occluded area and improves its stability.
Fig. 3 is a flowchart of another image processing method provided in the embodiment of the present disclosure, where the embodiment includes:
In block 201, a target carrier is detected from the current image frame, the target carrier carrying the target object.
Illustratively, before a segment of video is uploaded to the network, it is divided into frames to obtain an image frame sequence. A target carrier is then detected from the current image frame. The current image frame may be an original frame from the framing step, or a preprocessed version of it, where preprocessing includes rotation, cropping, grayscale conversion, and the like.
In detecting the target carrier from the current image frame, a preset model may be used. The preset model can be obtained by training, for example through machine learning. Taking a vehicle as the target carrier, the preset model is trained as follows: first, a number of vehicle pictures are obtained and the vehicle in each picture is labeled to produce training samples; then the training samples are fed into a preset model so that it detects the vehicle in each picture, the model parameters are corrected according to the detection results, and training stops once the model accurately detects a preset proportion of the training samples, yielding the preset model of the embodiment of the disclosure.
It should be noted that this training procedure is only an example; the embodiments of the present disclosure are not limited to it, and other training methods may also be used to obtain the preset model.
At block 202, it is determined whether the direction of the target carrier is a preset direction that makes the target object visible in the current image frame. If it is, block 203 is executed; if it is not, the process ends.
For example, in a video the target carrier is moving and the target object is carried on it, so the target object needs to be blocked only when the user can see it, and whether it is visible is determined by the direction of the target carrier. Therefore, for each image frame in the sequence that contains the target carrier, the direction of the target carrier is determined. If the direction is a preset direction, block 203 is executed to detect the target object on the target carrier; otherwise the target object on the carrier is invisible to the user, the frame does not need to be processed by the method of the present disclosure, and the process ends.
Taking a vehicle as the target carrier, the directions of the vehicle include, without limitation, facing forward, tilted forward, sideways, facing backward, and tilted backward. When the vehicle is sideways, its license plate is not visible to the user; when the vehicle faces forward, is tilted forward, faces backward, or is tilted backward, the license plate is visible. The direction of the target carrier may be detected and identified with a classification model, which can be pre-trained as follows: first, a number of vehicle pictures are obtained and the direction of the vehicle in each picture is labeled (forward, tilted forward, sideways, tilted backward, and so on); then the labeled pictures are fed into the model as training samples so that it predicts the vehicle direction in each picture, the model parameters are corrected according to the detection results, and training stops once the model accurately classifies a preset proportion of the training samples, yielding the classification model of the embodiment of the disclosure.
It should be noted that this training procedure is only an example; the embodiments of the present disclosure are not limited to it, and other training methods may also be used to obtain the classification model.
In response to the direction of the target carrier being the preset direction, the target object is detected from the target carrier at block 203.
At block 204, based on the target object, the locations of the key points of the target object are determined from the current image frame, and a first region is determined based on the locations.
For example, if the target object is a license plate, the key points of the target object are four vertices of the license plate, and the positions of the vertices in the current image frame are determined respectively.
It should be noted that in the embodiments of the present disclosure the current image frame falls into one of two cases. Case one: the current image frame is the first frame in the sequence that contains the target object, or none of the one or more frames before it contains the target object. Case two: one or more image frames before the current image frame contain the target object.
Take a 1-minute video as an example: it contains a moving target carrier, such as a vehicle, which carries a target object, such as a license plate. When the license plate needs to be blocked, the video is divided into 1500 image frames. Some of these frames contain the target object and some do not; the latter include frames in which the target carrier appears but its target object is invisible to the user, as well as frames that do not contain the target carrier at all. Number the frames the 1st through the 1500th. When a frame satisfying case one — the first frame containing the target object, or one whose preceding frames do not contain it — serves as the current image frame, blocks 201-203 must be executed: the target carrier is detected in the current image frame, the target object is detected on the carrier, and the key points of the target object are determined from its position. For a frame satisfying case two, blocks 201-203 need not be executed when it serves as the current image frame; instead, the accurate positions already determined for that frame by the image processing method of this embodiment are used as the positions of the target object's key points, and the first region is determined from those positions.
For example, suppose the 1st frame is the first frame containing the target object and the 2nd and every subsequent frame also contain it. Then the 2nd frame is the next image frame when the 1st frame is the current image frame; after that, the 2nd frame becomes the current image frame and the 3rd frame the next image frame, and so on. Blocks 201-203 are executed only for the 1st frame; when any other frame serves as the current image frame, the accurate positions previously determined for it by the image processing method of this embodiment are used as the positions of the target object's key points, and the first region is determined from those positions.
For another example, among the 1500 image frames, if none of the 1st through 100th frames includes the target object and the 101st frame does, then blocks 201 to 203 are performed on the 101st image frame.
For another example, if the 1st through 100th image frames include the target object, the 101st through 200th do not, and the 201st through 1500th do, then blocks 201 to 203 need to be executed for the 1st and the 201st image frames, and need not be executed for the 2nd through 100th or the 202nd through 1500th image frames.
At block 205, the first region is expanded by a preset multiple to obtain a second region comprising the first region.
Illustratively, the first region is enlarged by 1.2, 1.4, or 1.5 times to obtain a second region, so that when the second region is projected onto the next image frame, the target object falls within it. Fig. 4 is a schematic diagram of a current image frame and a next image frame in an image processing method provided by an embodiment of the present disclosure. Referring to fig. 4, on the vehicle shown there the solid rectangle is the first region and the dashed rectangle is the second region; the first region contains the license plate, and the second region contains the first region, i.e. it covers the license plate plus its surrounding area. In fig. 4, the black dots in the first region are the key points.
At block 206, within the second region of a next image frame, a predicted location of the keypoint in the next image frame is determined.
Illustratively, for the next image frame, the key points of the target object are extracted in the second area, and the predicted positions of the key points of the target object in the next image frame are obtained.
At block 207, a first area of a first graph formed by the key points in the current image frame is determined.
At block 208, a second area of a second graph formed by the predicted locations of the key points in the next image frame is determined.
Illustratively, the second area of the second graph formed by the predicted positions of the key points in the next image frame is calculated.
At block 209, a first ratio is determined from the first area of the first graph, the second area of the second graph, and the third area of a third graph.
Here the third graph is the overlapping region of the first graph and the second graph, and the first ratio = third area / (first area + second area).
Referring to fig. 4, (a) is the current image frame: the key points all lie in the first region and, connected in order, form the first graph, which has the first area. Fig. 4 (b) shows the second graph, with the second area, formed by the predicted positions of the key points in the next image frame. When the two frames are aligned, the movement of the vehicle changes the position of the license plate, so the first graph and the second graph do not coincide exactly but have a partial overlap; the graph formed by this overlapping region is the third graph.
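The first ratio from block 209 can be sketched as follows for axis-aligned rectangles (a simplification — the patent's graphs are polygons formed by the key points; the helper name is illustrative). Note that with this definition the ratio is at most 0.5, unlike the usual intersection-over-union:

```python
def overlap_ratio(rect_a, rect_b):
    """First ratio as defined in the text: overlap_area / (area_a + area_b).

    rect_a, rect_b: rectangles as (x0, y0, x1, y1). For identical
    rectangles the ratio is 0.5; for disjoint rectangles it is 0.0.
    """
    ax0, ay0, ax1, ay1 = rect_a
    bx0, by0, bx1, by1 = rect_b
    # Width and height of the overlapping region (clamped at zero).
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    return inter / (area_a + area_b)
```

A large ratio means the graph barely moved between frames; a small ratio means fast motion, which block 212 below uses to weight the blend.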
At block 210, a first weight is determined from the first ratio.
The first weight is obtained by raising the first ratio to a power.
Illustratively, the first ratio is raised to the sixth power to obtain the first weight.
At block 211, a second weight is determined from the first weight.
Illustratively, the first weight is subtracted from 1 to obtain the second weight.
At block 212, the accurate position of each key point in the next image frame is determined based on the first weight, the second weight, the coordinates of the key point in the current image frame, and the coordinates of the key point in the next image frame.
Illustratively, the key points in the current image frame correspond one-to-one to the key points in the next image frame. For any one key point, hereinafter referred to as a target key point, the coordinates of the target key point in the current image frame are multiplied by the first weight to obtain a first product; the coordinates of the target key point in the next image frame are multiplied by the second weight to obtain a second product; and the first product and the second product are summed to obtain the coordinates corresponding to the accurate position of the target key point in the next image frame.
Taking a license plate as the target object as an example, the license plate has at least 4 key points, which are the 4 vertices of the license plate. Assuming that the target key point is the key point at the upper left corner, and its coordinates comprise an abscissa and an ordinate, when determining the accurate position of this key point, first determine the product of the abscissa of the target key point in the current image frame and the first weight to obtain a first product; then determine the product of the abscissa of the target key point in the next image frame and the second weight to obtain a second product; and finally determine the abscissa of the target key point in the next image frame as the sum of the first product and the second product. The process of determining the ordinate of the target key point is similar to that of determining the abscissa. Finally, the accurate position of the target key point can be determined from the abscissa and the ordinate.
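The weighting in blocks 210 through 212 can be sketched as follows. This is a minimal sketch: the exponent 6 follows the illustrative text above, taking the second weight as 1 minus the first weight is an assumed reading of block 211 (so the two weights form a convex combination), and the function name is illustrative.

```python
def smooth_keypoints(current_kps, predicted_kps, ratio, power=6):
    # current_kps and predicted_kps are one-to-one lists of (x, y) tuples.
    # First weight: the first ratio raised to the illustrative power of 6.
    w1 = ratio ** power
    # Second weight: assumed to be 1 minus the first weight (block 211).
    w2 = 1.0 - w1
    # Accurate position = w1 * current coordinate + w2 * predicted coordinate,
    # applied to abscissa and ordinate alike (block 212).
    return [(w1 * cx + w2 * nx, w1 * cy + w2 * ny)
            for (cx, cy), (nx, ny) in zip(current_kps, predicted_kps)]
```

Since the first ratio is at most 0.5, the first weight is small after exponentiation, so the accurate position stays close to the prediction while still being anchored to the current frame, which is what damps the frame-to-frame jitter of the occluded region.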
213. And shielding the target object in the next image frame according to the accurate position of the key point in the next image frame.
In addition, in the above embodiment, when determining the predicted position of the key point of the target object in the next image frame according to the second region, it is first necessary to determine whether the next image frame includes the target object; only when the next image frame includes the target object is the predicted position of the key point of the target object in the next image frame determined according to the second region.
For example, a probability model may be preset, and the next image frame may be input to the probability model. If the probability output by the probability model is greater than a preset probability, it is determined that the next image frame includes the target object; otherwise, it is determined that the next image frame does not include the target object.
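A minimal sketch of this gating step, with all names and the threshold value illustrative (the text only requires comparing the model's output probability against a preset probability):

```python
def track_or_redetect(next_frame, second_region, detect_prob, predict_keypoints,
                      threshold=0.5):
    # detect_prob: callable returning the probability that the frame contains
    # the target object (the preset probability model).
    # predict_keypoints: callable that predicts keypoint positions within the
    # enlarged second region. Both callables are illustrative placeholders.
    if detect_prob(next_frame) > threshold:
        # Target present: predict the keypoints inside the second region.
        return predict_keypoints(next_frame, second_region)
    # Target absent: signal that tracking cannot continue on this frame.
    return None
```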
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
Fig. 5 is a schematic structural diagram of an image processing apparatus provided in an embodiment of the present disclosure, where the image processing apparatus may be implemented by software and/or hardware. As shown in fig. 5, the image processing apparatus 100 includes:
a first determining module 11, configured to determine a first region based on a location of a key point of a target object in a current image frame in a video, where the first region is a region containing the key point of the target object in the current image frame;
an amplifying module 12, configured to expand the first region by a preset multiple to obtain a second region including the first region;
a prediction module 13, configured to determine a predicted position of the key point in a next image frame in the video within the second region of the next image frame, where the next image frame is an image frame adjacent to the current image frame in an image frame sequence of the video;
a second determining module 14, configured to determine an accurate location of the key point in the next image frame based on the location of the key point in the current image frame and the predicted location of the key point in the next image frame;
and a processing module 15, configured to block the target object in the next image frame based on the accurate position of the key point in the next image frame.
In a possible design, the second determining module 14 is configured to determine a first area of a first graph formed by the key points in the current image frame; determine a second area of a second graph formed by the predicted positions of the key points in the next image frame; determine a first ratio according to the first area of the first graph, the second area of the second graph and a third area of a third graph, wherein the third graph is the overlapping region of the first graph and the second graph; determine a first weight according to the first ratio; determine a second weight according to the first weight; and determine the accurate position of the key point in the next image frame based on the first weight, the second weight, the coordinates of the key point in the current image frame and the coordinates of the key point in the next image frame.
In a possible design, before determining the first region according to the position of the key point of the target object in the current image frame, the first determining module 11 is further configured to: detect a target carrier from the current image frame, the target carrier carrying the target object; determine whether the direction of the target carrier is a preset direction that makes the target object visible in the current image frame; in response to the direction of the target carrier being the preset direction, detect the target object from the target carrier; and determine the position of the key point of the target object from the current image frame according to the target object.
In one possible design, the current image frame and the next image frame have the same resolution.
In one possible design, the target object includes a license plate.
In one possible design, the first region is the smallest region that contains all the keypoints of the target object in the current image frame.
For the image processing apparatus provided in this embodiment of the present disclosure, reference may be made to the above method embodiments for the implementation principle and technical effect, and details are not described herein again.
Fig. 6 is a schematic structural diagram of another image processing apparatus according to an embodiment of the present disclosure. As shown in fig. 6, the image processing apparatus 200 includes:
at least one processor 21 and a memory 22;
the memory 22 stores computer-executable instructions;
the at least one processor 21 executes the computer-executable instructions stored by the memory 22 to cause the at least one processor 21 to perform the image processing method as described above.
For a specific implementation process of the processor 21, reference may be made to the above method embodiments, which have similar implementation principles and technical effects, and this embodiment is not described herein again.
Optionally, the image processing apparatus 200 further includes a communication unit 23. The processor 21, the memory 22, and the communication unit 23 may be connected by a bus 24.
The embodiment of the present disclosure also provides a storage medium, in which computer-executable instructions are stored; when executed by a processor, the computer-executable instructions are used to implement the image processing method as described above.
The embodiment of the present disclosure also provides a computer program product, which, when running on an electronic device, causes the electronic device to execute the image processing method as described above.
In the above embodiments, it should be understood that the described apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules is only one logical division, and other divisions may be realized in practice, for example, a plurality of modules may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present disclosure may be integrated into one processing unit, or each module may exist alone physically, or two or more modules are integrated into one unit. The unit formed by the modules can be realized in a hardware form, and can also be realized in a form of hardware and a software functional unit.
The integrated module implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable an electronic device (which may be a personal computer, a server, or a network device) or a processor to execute some of the blocks of the methods according to the embodiments of the present disclosure.
It should be understood that the processor may be a Central Processing Unit (CPU), other general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The blocks of the disclosed methods may be implemented directly in a hardware processor or in a combination of hardware and software modules within a processor.
The memory may comprise a high-speed RAM memory, and may further comprise a non-volatile memory (NVM), such as at least one disk memory; the memory may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, or an optical disk, etc.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, the buses in the figures of the present disclosure are not limited to only one bus or one type of bus.
The storage medium may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). Of course, the processor and the storage medium may also reside as discrete components in a terminal or server.
Those of ordinary skill in the art will understand that all or a portion of the blocks implementing the above-described method embodiments may be accomplished by hardware associated with program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the blocks of the method embodiments described above; and the aforementioned storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present disclosure, and not for limiting the same; while the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.

Claims (8)

the second determining module is specifically configured to determine a first area of a first graph formed by the key points in the current image frame; determine a second area of a second graph formed by the predicted positions of the key points in the next image frame; determine a first ratio of a third area of a third graph to a sum of the first area and the second area, the third graph being the overlapping region of the first graph and the second graph; determine a first weight according to the first ratio; determine a second weight according to the first weight; determine a first product of the coordinates of the key point in the current image frame and the first weight and a second product of the coordinates of the key point in the next image frame and the second weight; and determine the accurate position of the key point in the next image frame according to the sum of the first product and the second product.
CN201910702791.5A | 2019-07-31 | 2019-07-31 | Image processing method and device | Active | CN110414514B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910702791.5A | CN110414514B (en) | 2019-07-31 | 2019-07-31 | Image processing method and device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201910702791.5A | CN110414514B (en) | 2019-07-31 | 2019-07-31 | Image processing method and device

Publications (2)

Publication Number | Publication Date
CN110414514A (en) | 2019-11-05
CN110414514B (en) | 2021-12-07

Family

ID=68364792

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201910702791.5A | Active | CN110414514B (en) | 2019-07-31 | 2019-07-31 | Image processing method and device

Country Status (1)

Country | Link
CN (1) | CN110414514B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111222509B (en)* | 2020-01-17 | 2023-08-18 | Beijing ByteDance Network Technology Co Ltd | Target detection method and device and electronic equipment
CN111985419B (en)* | 2020-08-25 | 2022-10-14 | Tencent Technology (Shenzhen) Co Ltd | Video processing method and related equipment
CN112258556B (en)* | 2020-10-22 | 2024-11-08 | Beijing Zitiao Network Technology Co Ltd | Method, device, readable medium and electronic device for tracking designated area in video
CN113223083B (en)* | 2021-05-27 | 2023-08-15 | Beijing QIYI Century Science and Technology Co Ltd | Position determining method and device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104376576A (en)* | 2014-09-04 | 2015-02-25 | Huawei Technologies Co Ltd | Target tracking method and device
CN108230357A (en)* | 2017-10-25 | 2018-06-29 | Beijing SenseTime Technology Development Co Ltd | Critical point detection method, apparatus, storage medium, computer program and electronic equipment
CN108427918A (en)* | 2018-02-12 | 2018-08-21 | Hangzhou Dianzi University | Face method for secret protection based on image processing techniques
CN109034086A (en)* | 2018-08-03 | 2018-12-18 | Beijing Megvii Technology Co Ltd | Vehicle recognition methods, apparatus and system again
CN109684920A (en)* | 2018-11-19 | 2019-04-26 | Tencent Technology (Shenzhen) Co Ltd | Localization method, image processing method, device and the storage medium of object key point
CN109788190A (en)* | 2018-12-10 | 2019-05-21 | Beijing QIYI Century Science and Technology Co Ltd | A kind of image processing method, device, mobile terminal and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP4482795B2 (en)* | 2004-03-05 | 2010-06-16 | Sony Corporation | Image processing apparatus, moving object tracking method, moving object tracking program, monitoring apparatus, and game apparatus
JP5495934B2 (en)* | 2010-05-18 | 2014-05-21 | Canon Inc | Image processing apparatus, processing method thereof, and program


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A keypoint-based fast object tracking algorithm; Weihua Cao et al.; 2016 35th Chinese Control Conference (CCC); 2016-08-29; pp. 4102-4106 *
Fast keypoint detection in video sequences; Luca Baroffio et al.; 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2016-05-29; pp. 1342-1346 *
Multi-object tracking based on keypoint modeling and weakly supervised appearance update; Zhang Guoping et al.; Computer Engineering; 2016-08-31; vol. 42, no. 8, pp. 261-265 *
Facial action recognition based on skin-color history images; Xu Guoqing et al.; Computer Engineering and Design; 2015-10-31; vol. 36, no. 10, pp. 2813-2817 *

Also Published As

Publication number | Publication date
CN110414514A (en) | 2019-11-05

Similar Documents

Publication | Publication Date | Title
CN110414514B (en) | Image processing method and device
EP3457683B1 (en) | Dynamic generation of image of a scene based on removal of undesired object present in the scene
US10134165B2 (en) | Image distractor detection and processing
CN104903934B (en) | Reproduce augmented reality based on foreground objects
CN109215037B (en) | Target image segmentation method, device and terminal device
US11568631B2 (en) | Method, system, and non-transitory computer readable record medium for extracting and providing text color and background color in image
CN108762505B (en) | Gesture-based virtual object control method and device, storage medium and equipment
US20140354540A1 (en) | Systems and methods for gesture recognition
CN110781823B (en) | Screen recording detection method and device, readable medium and electronic equipment
CN110310224B (en) | Light effect rendering method and device
López-Rubio et al. | Foreground detection for moving cameras with stochastic approximation
CN112862856A (en) | Method, device and equipment for identifying illegal vehicle and computer readable storage medium
CN113592781B (en) | Background image generation method, device, computer equipment and storage medium
CN112200035A (en) | Image acquisition method and device for simulating crowded scene and visual processing method
US9256792B2 (en) | Image processing apparatus, image processing method, and program
CN112990197A (en) | License plate recognition method and device, electronic equipment and storage medium
CN110309721B (en) | Video processing method, terminal and storage medium
Xiong et al. | Snap angle prediction for 360 panoramas
CN108053447A (en) | Method for relocating, server and storage medium based on image
CN113743219B (en) | Moving object detection method and device, electronic equipment and storage medium
CN107543507A (en) | The determination method and device of screen profile
CN115830604A (en) | Surface single image correction method, device, electronic apparatus, and readable storage medium
CN113709389A (en) | Video rendering method and device, electronic equipment and storage medium
CN111494947B (en) | Method and device for determining movement track of camera, electronic equipment and storage medium
CN108776959B (en) | Image processing method and device and terminal equipment

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
