CN115689967A

Info

Publication number: CN115689967A
Application number: CN202110830772.8A
Authority: CN
Inventors: 翟世平; 高雪松; 陈维强; 曲磊
Original assignee: Hisense Group Holding Co Ltd
Current assignee: Hisense Group Holding Co Ltd
Priority date: 2021-07-22
Filing date: 2021-07-22
Publication date: 2023-02-03

Abstract

The application discloses an image processing method, system, apparatus, device and medium for solving the problem that the background region of an image is determined inaccurately in existing approaches. An edge detection model is trained in advance, and this model takes into account the correlation between the regions containing the target person in two adjacent frames when determining the first pixels, that is, the pixels in the image that belong to the target person. After the first region containing the target person is obtained from the image, the model determines the first pixels in the first region based on both the first region and the second region containing the target person in the previous frame, which makes the first pixels more accurate. The background region subsequently determined from the first pixels is therefore also more accurate, so the target person and the background region in the image are segmented effectively and accurately.

Description

Translated from Chinese

An image processing method, system, apparatus, device and medium

Technical field

The present application relates to the technical field of image processing, and in particular to an image processing method, system, apparatus, device and medium.

Background

In the field of video calls, to protect user privacy or make calls more entertaining, more and more applications use target segmentation technology to replace the regions of the video's images other than the target person (denoted as the background).

To replace the background in each frame of a video, the existing approach processes each acquired frame as follows. It determines the first region of the target person in the currently acquired image and then, using a pre-trained image segmentation model and based on the first region, determines the first pixels in the first region that belong to the target person. The background region of the image is determined from the first pixels. The pixels in the background region are then updated with the corresponding pixels of a pre-stored target background image. However, because a video consists of consecutive frames, the background region may change continuously from frame to frame. As a result, the determined first pixels belonging to the target person are inaccurate, the background region determined from them is inaccurate as well, and the quality of the subsequent background replacement suffers.

Summary of the invention

The present application provides an image processing method, system, apparatus, device and medium for solving the problem that existing approaches determine the background region of an image inaccurately.

In a first aspect, the present application provides an image processing method, the method comprising:

determining a first region containing a target person in an acquired image;

determining, through a pre-trained edge detection model and based on the first region and a second region containing the target person in the previous frame, the first pixels in the first region that belong to the target person;

determining a background region in the image according to the first pixels.

In a second aspect, the present application provides an image processing system, the system comprising a server for executing the above method and a smart device for capturing and sending images.

In a third aspect, the present application provides an image processing apparatus, the apparatus comprising:

a preprocessing unit, configured to determine a first region containing a target person in an acquired image;

a first processing unit, configured to determine, through a pre-trained edge detection model and based on the first region and a second region containing the target person in the previous frame, the first pixels in the first region that belong to the target person;

a second processing unit, configured to determine a background region in the image according to the first pixels.

In a fourth aspect, the present application provides an electronic device comprising at least a processor and a memory, the processor being configured to implement the steps of the above image processing method when executing a computer program stored in the memory.

In a fifth aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above image processing method.

An edge detection model is trained in advance, and this model can take into account the correlation between the regions containing the target person in two adjacent frames when determining the first pixels that belong to the target person. After the first region containing the target person is obtained from the image, the first pixels determined by the model, based on the first region and the second region containing the target person in the previous frame, are therefore more accurate. The background region subsequently determined from the first pixels is in turn more accurate, so the target person and the background region in the image are segmented effectively and accurately.

Description of drawings

To illustrate the technical solution of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of an image processing process provided by some embodiments of the present application;

FIG. 2 is a schematic flowchart of a specific process for determining the region containing the target person in an image, provided by some embodiments of the present application;

FIG. 3 is a schematic flowchart of a specific process for determining the pixels belonging to the target person in an image, provided by some embodiments of the present application;

FIG. 4 is a schematic flowchart of a specific image processing process provided by some embodiments of the present application;

FIG. 5 is a schematic flowchart of another specific image processing process provided by some embodiments of the present application;

FIG. 6 is a schematic structural diagram of an image processing system provided by some embodiments of the present application;

FIG. 7 is a schematic structural diagram of an image processing apparatus provided by some embodiments of the present application;

FIG. 8 is a schematic structural diagram of an electronic device provided by some embodiments of the present application.

Detailed description

To improve the accuracy of determining the background region of an image, the present application provides an image processing method, apparatus, device and medium.

To make the purpose, technical solution and advantages of the present application clearer, the application is described in further detail below with reference to the drawings. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the application.

In practice, when a user wants to replace the background of each frame during a video call, the user can operate a background replacement icon on the smart device's display, or give the smart device a voice command for changing the background, so that the smart device directly processes the images captured by its camera and replaces their background. Alternatively, the user can operate a background entry icon on the display, or give a voice command for background replacement, so that the smart device sends the acquired images to an electronic device, which processes the received images and replaces their background.

FIG. 1 is a schematic diagram of an image processing process provided by some embodiments of the present application. The process includes:

S101: Determine a first region containing a target person in an acquired image.

The image processing method provided by the present application is applied to an electronic device, which may be a smart device such as a fitting mirror, a smartphone or a television, or a server such as a home brain.

In the present application, the image acquired by the electronic device may be captured by its own camera or sent by another smart device. After acquiring the image, the electronic device processes it according to the image processing method provided by the present application.

In a possible implementation, the electronic device is a server. When a user wants to replace the background of each frame during a video call, the user submits a background replacement request to the smart device, either by operating the background replacement icon on its display or by giving it a voice command for changing the background. After receiving the request, the smart device sends each frame captured by its camera to the server in turn. After acquiring an image, the server processes it according to the image processing method provided by the present application and determines the background region contained in the image.

Because the background is replaced and only the target person is kept, the electronic device can first identify the pixels in the acquired image that belong to the target person (denoted as the first pixels). To reduce computation and help determine the first pixels quickly and accurately, the acquired image can first be preprocessed to obtain the region of the image that contains the target person (denoted as the first region). The first region may contain both first pixels belonging to the target person and pixels that do not belong to the target person.

It should be noted that the first region may have a regular shape such as a rectangle, circle or ellipse, or an irregular shape; it can be set flexibly according to actual requirements in a specific implementation.

In a possible implementation, the first region contained in the image can be obtained with a traditional vision algorithm such as LBP (local binary patterns). However, traditional vision algorithms do not perform well on images with a large shooting range; when the first region must be obtained from such images, a pre-trained human body detector can be used instead.

The first region can be located by its position information in the image, for example the image coordinates of the pixel at the top-left corner of the first region and of the pixel at its bottom-right corner.
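The corner-based description of the first region can be illustrated with a minimal sketch. It assumes a rectangular first region, an image stored as a list of pixel rows, and inclusive corner coordinates; the function name `crop_region` is hypothetical and not taken from the application.

```python
def crop_region(image, top_left, bottom_right):
    """Return the sub-image bounded by two corner pixels (both inclusive).

    image is a list of rows; corners are (x, y) pixel coordinates.
    """
    x0, y0 = top_left
    x1, y1 = bottom_right
    if not (0 <= x0 <= x1 < len(image[0]) and 0 <= y0 <= y1 < len(image)):
        raise ValueError("corners must lie inside the image, top-left first")
    # Slice the rows first, then the columns of each kept row.
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]


# Toy 5x4 image whose pixel value encodes its position as 10 * y + x.
image = [[10 * y + x for x in range(5)] for y in range(4)]
first_region = crop_region(image, (1, 1), (3, 2))
```

Passing only the cropped region to later stages is what reduces the amount of computation mentioned above.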

It should be noted that the human body detector can be trained with a deep-learning object recognition method; the training process belongs to the prior art and is not repeated here.

S102: Determine, through a pre-trained edge detection model and based on the first region and a second region containing the target person in the previous frame, the first pixels in the first region that belong to the target person.

Because a video consists of consecutive frames, the background behind the target person may change from frame to frame, and the region containing the target person may also differ between frames. If the sub-image containing the first pixels is determined from the first region of the current frame alone, the result is poor: for example, when the target person is occluded by an object such as a chair, the person's edge in the sub-image may be incomplete, contain many noise pixels, or jitter. For two adjacent frames, however, the region containing the target person generally changes little. Therefore, after the first region of the current frame is obtained as described above, the first pixels belonging to the target person in the first region can be determined based on the first region together with the region containing the target person in the previous frame (denoted as the second region).

In a possible implementation, an edge detection model is trained in advance so that the first pixels belonging to the target person in the first region can be determined accurately. The first region contained in the current frame and the second region contained in the previous frame are obtained and input into the pre-trained edge detection model, which processes them and determines the first pixels belonging to the target person in the first region. The edge detection model may be a decision tree, logistic regression (LR), a naive Bayes (NB) classification algorithm, a random forest (RF) algorithm, a support vector machine (SVM) classification algorithm, histogram of oriented gradients (HOG), a deep learning algorithm, and so on.

In a possible implementation, determining, through the pre-trained edge detection model and based on the first region and the second region containing the target person in the previous frame, the first pixels in the first region that belong to the target person can be expressed by the following formula:

$$L_i = f(R_i, R_{i-1})$$

where $L_i$ denotes the first pixels belonging to the target person in the first region contained in the i-th frame, $R_i$ denotes the first region contained in the i-th frame, $R_{i-1}$ denotes the second region containing the target person in the previous frame, $f(\cdot)$ denotes the pre-trained edge detection model, and i is an integer greater than or equal to 1.

The pre-trained edge detection model processes the input first region and second region, computes the difference between each pixel in the first region and the corresponding pixel in the second region, and determines the first pixels belonging to the target person in the first region.

After the first pixels belonging to the target person in the first region have been determined, the saved second region (the region containing the target person in the previous frame) can be updated with the first region, so that the first pixels in the next frame can be determined conveniently.
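The per-frame flow of S102, including the update of the saved second region, can be sketched as a small generator. This is only an illustration under stated assumptions: `edge_model` is a hypothetical callable standing in for the trained model f(), and pairing the first frame with itself is an assumption made here, since this passage does not say what the model receives when no previous frame exists.

```python
def segment_stream(first_regions, edge_model):
    """Yield the model output for each frame's first region, in order.

    edge_model(first_region, second_region) stands in for the trained model f().
    """
    second_region = None  # saved region from the previous frame
    for first_region in first_regions:
        if second_region is None:
            # Assumption: the first frame has no predecessor, so it is paired with itself.
            second_region = first_region
        yield edge_model(first_region, second_region)
        # Update the saved second region so the next frame can use it.
        second_region = first_region


# Toy model that only records which pair of regions it was given.
pairs = list(segment_stream(["r1", "r2", "r3"], lambda cur, prev: (cur, prev)))
```

The recorded pairs show that each frame after the first is processed together with the region saved from the frame immediately before it.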

S103: Determine a background region in the image according to the first pixels.

After the first pixels belonging to the target person in the first region have been determined by the method of the above embodiments, the pixels in the image that do not belong to the target person can be determined from the first pixels. The background region of the image is then determined from those pixels.
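A minimal sketch of S103 follows, together with the pixel update from a pre-stored target background image that the background section describes. It assumes the first pixels are given as (x, y) offsets inside a rectangular first region; the function names `background_mask` and `replace_background` and the data layout are hypothetical.

```python
def background_mask(width, height, region_origin, first_pixels):
    """Return rows of booleans, True where a pixel belongs to the background.

    first_pixels holds (x, y) offsets inside the first region, whose top-left
    corner sits at region_origin in full-image coordinates.
    """
    ox, oy = region_origin
    person = {(ox + x, oy + y) for x, y in first_pixels}
    # Every pixel that is not a first pixel is treated as background.
    return [[(x, y) not in person for x in range(width)] for y in range(height)]


def replace_background(image, mask, target_background):
    """Copy target-background pixels into every background position."""
    return [
        [bg if is_bg else px for px, is_bg, bg in zip(row, mask_row, bg_row)]
        for row, mask_row, bg_row in zip(image, mask, target_background)
    ]


mask = background_mask(3, 2, (1, 0), {(0, 0), (1, 1)})
result = replace_background([[1, 2, 3], [4, 5, 6]], mask, [[0, 0, 0], [0, 0, 0]])
```

In this toy run the two person pixels keep their original values (2 and 6) and every other pixel takes the value of the target background.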

The frames of one video acquired by the electronic device fall into two cases: the first frame and the non-first frames. How the first region containing the target person is determined in each case is described in detail below.

Case 1. For the first frame of a video acquired by the electronic device, a face selection condition is preset so that the first region containing the target person can be determined accurately. After acquiring the first frame, the electronic device can first detect every face contained in it and then check, sequentially or in random order, whether each detected face satisfies the preset face selection condition. The region of the person to whom the face satisfying the face selection condition belongs is determined as the first region.

In a possible implementation, determining the face in the first frame that satisfies the face selection condition includes:

obtaining a preset target face image; determining, through a face recognition algorithm, the similarity between each face in the first frame and the target face image; and determining a face whose similarity is greater than a preset similarity threshold as the face satisfying the face selection condition; or

determining the priority of each face in the first frame and determining the face with the highest priority as the face satisfying the face selection condition, where the priority is determined according to at least one of the distance between the face and the camera, the position of the face in the image, the face size, the expression score of the face, and the face deflection angle.

In a possible implementation, when submitting the background replacement request through the smart device, the user also enters information about the target person, so that the electronic device can later identify the first region containing the target person in the acquired images. This information may be a face image of the target person, a body image of the target person, a region marked by the user in an image captured by the smart device to indicate where the target person is, and so on. After receiving the information, the smart device sends it, together with the captured first frame, to the electronic device. The electronic device processes the first frame based on this information and determines the first region containing the target person in the first frame.

For example, when the information entered by the user is a target face image, the electronic device obtains the preset target face image when determining the first region in the first frame. Through a face recognition algorithm, it determines the similarity between each face in the first frame and the target face image. A face whose similarity is greater than the preset similarity threshold is determined as the face satisfying the face selection condition, and the region of the person to whom that face belongs in the first frame is determined as the first region.

The similarity threshold can be set differently for different scenarios. To make the identification of the target person more accurate, the threshold can be set higher; to avoid mistaking the target person for a non-target person, the threshold can be set lower.

It should be noted that the similarity can be expressed in terms of cosine distance, Euclidean distance, Hamming distance, and so on. It can be set flexibly according to actual requirements and is not specifically limited here.
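The three measures named above can be written out in a few lines. The sketch assumes faces have already been converted to equal-length feature vectors by some face recognition algorithm, which the application does not specify; the vectors and function names here are illustrative only. Note that cosine similarity grows as faces become more alike, whereas Euclidean and Hamming distances shrink, so a threshold comparison has to be oriented accordingly.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming_distance(a, b):
    """Number of positions that differ; suited to binary feature codes."""
    return sum(x != y for x, y in zip(a, b))


same_direction = cosine_similarity([1, 0], [1, 0])
orthogonal = cosine_similarity([1, 0], [0, 1])
distance = euclidean_distance([0, 0], [3, 4])
differing_bits = hamming_distance([1, 0, 1], [1, 1, 0])
```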

As another possible implementation, for the first frame acquired by the electronic device, the electronic device can determine each face region contained in that frame through a face detection algorithm, and then determine the priority of each face region according to a preset priority rule. The priority rule may be at least one of the following: the closer the face, the higher its priority; the nearer the face is to the image center, the higher its priority; the larger the face, the higher its priority; the higher the expression score, the higher the priority; and the smaller the face deflection angle, the higher the priority. In other words, the priority of a face is determined according to at least one of the distance between the face and the camera, the position of the face in the image, the face size, the expression score of the face, and the face deflection angle. The face with the highest priority is then determined as the face satisfying the face selection condition, and the region of the person to whom that face belongs in the first frame is determined as the first region.
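One way to turn the priority rule into code is to compare faces on a tuple of criteria. The sketch below is an assumption-laden illustration: the application only says that at least one of the five factors is used, so the lexicographic ordering (distance first, then centrality, size, expression, deflection), the `Face` fields and their units are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Face:
    name: str
    distance_m: float      # distance between face and camera
    center_offset: float   # normalized distance from the image center
    size: float            # fraction of the image covered by the face
    expression: float      # expression score
    yaw_deg: float         # face deflection angle


def priority(face):
    """Sort key where a larger tuple means a higher priority.

    Assumed ordering: closer, then more central, then larger, then higher
    expression score, then smaller deflection angle.
    """
    return (-face.distance_m, -face.center_offset, face.size,
            face.expression, -abs(face.yaw_deg))


faces = [
    Face("a", 1.5, 0.4, 0.1, 0.5, 10.0),
    Face("b", 1.5, 0.1, 0.1, 0.5, 10.0),
    Face("c", 3.0, 0.0, 0.3, 0.9, 0.0),
]
selected = max(faces, key=priority)
```

Here faces a and b are equally close, so the tie is broken by centrality and b is selected; c is farther away and loses despite its better remaining attributes. A weighted sum of the factors would be an equally valid reading of the rule.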

Case 2. For a non-first frame acquired by the electronic device, the electronic device can determine the first region containing the target person in that frame in the following ways.

Way 1. When submitting the background replacement request through the smart device, the user can enter information about the target person, so that the electronic device can later identify the first region containing the target person in the acquired images. This information may be a face image of the target person, a body image of the target person, a region marked by the user in an image captured by the smart device to indicate where the target person is, and so on. After receiving the information, the smart device sends it to the electronic device. The electronic device processes the acquired non-first frame based on this information and determines the first region containing the target person in that frame.

For example, when the information entered by the user is a target face image, the electronic device obtains the preset target face image when determining the first region containing the target person in the non-first frame. Through a face recognition algorithm, it determines the similarity between each face in the non-first frame and the target face image. A face whose similarity is greater than the preset similarity threshold is determined as the face satisfying the face selection condition, and the region of the person to whom that face belongs in the non-first frame is determined as the first region.

Way 2. The regions occupied by the same person in different frames of one video are highly similar. Therefore, to ensure that the region of the same target person is recognized in every frame of the video, the second region containing the target person in the previous frame can be used for a non-first frame: the person contained in the second region serves as a reference for determining the first region containing the target person in the current non-first frame.

In a possible implementation, determining the first region containing the target person in the acquired image includes:

obtaining candidate regions containing the target person in the image;

determining the similarity between each candidate region and the second region corresponding to the previous frame, where, if the previous frame is the first frame, the second region corresponding to the first frame is determined from the region of the person to whom the face satisfying the face selection condition in the first frame belongs;

determining one candidate region whose similarity satisfies a preset matching condition as the first region.

To identify the first pixels belonging to the target person accurately in every frame, after a non-first frame is acquired, the human bodies contained in it can first be detected to find every region of the frame that contains a person; these regions are taken as the candidate regions that may contain the target person. The regions can be found with a human body detection algorithm, such as the LBP algorithm, or with a human body detection model.

After the candidate regions contained in the non-first frame have been obtained as described above, the similarity between each candidate region and the second region corresponding to the previous frame is calculated. The similarity can be determined by cosine distance, Hamming distance, Euclidean distance, and so on; the calculation belongs to the prior art and is not specifically limited here.

A matching condition is preset so that the first region containing the target person in the non-first frame can be determined accurately. The matching condition may be that the similarity is greater than a preset similarity threshold and is the maximum among the similarities of all candidate regions contained in the non-first frame. After the similarity of each candidate region has been obtained as described above, the similarities are checked, sequentially or in random order, against the preset matching condition. One candidate region whose similarity satisfies the matching condition is determined as the first region.

后续通过预先训练的边缘检测模型，基于该第一区域以及上一帧图像中包含目标人物的第二区域，确定第一区域中归属于目标人物的第一像素点。Subsequently, the pre-trained edge detection model is used to determine the first pixel points belonging to the target person in the first region based on the first region and the second region containing the target person in the previous frame of image.
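The matching step above can be sketched as follows. This is a minimal illustration, assuming each region has already been reduced to a feature vector, that cosine similarity is the chosen measure, and that the matching condition requires the similarity to be both above the threshold and the maximum over all candidates. The function names and the default threshold of 0.8 are hypothetical and do not come from the application.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_first_region(candidate_features, reference_feature, threshold=0.8):
    """Return the index of the best-matching candidate region, or None.

    Assumed matching condition: the similarity exceeds `threshold` and is
    the largest among all candidates of the current frame.
    """
    best_index, best_score = None, threshold
    for index, feature in enumerate(candidate_features):
        score = cosine_similarity(feature, reference_feature)
        if score > best_score:
            best_index, best_score = index, score
    return best_index
```

Here `reference_feature` stands for the feature of the second region from the previous frame. A return value of `None` means no candidate satisfied the assumed condition.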

下面通过具体的实施例对本申请提供的确定图像中包含目标人物的区域的过程进行详细的说明，图2为本申请一些实施例提供的具体的确定图像中包含目标人物的区域的流程示意图，该流程包括：The following is a detailed description of the process of determining the region containing the target person in the image provided by this application through specific embodiments. FIG. 2 is a schematic flowchart of the specific determination of the region containing the target person in the image provided by some embodiments of the application. The process includes:

S201：对于获取到的第一帧图像，确定该第一帧图像中包含的满足人脸选取条件的人脸所归属的人物的区域，将该区域确定为目标图像中包含目标人物的区域。S201: For the acquired first frame of image, determine an area of a person to which a face that satisfies a face selection condition contained in the first frame of image belongs, and determine this area as an area in the target image that contains the target person.

Specifically, determining the region of the person to whom a face in the first frame image that satisfies the face selection condition belongs includes: obtaining a preset target face image; determining, through a face recognition algorithm, the similarity between each face in the first frame image and the target face image; and determining a face whose similarity is greater than a preset similarity threshold as the face that satisfies the face selection condition; or, determining the priority of each face in the first frame image, and determining the face with the highest priority as the face that satisfies the face selection condition. The priority is determined according to at least one of: the distance between the face and the camera, the position of the face in the image, the face size, the expression score of the face, and the face deflection angle.

S202：对于获取到的非第一帧图像，获取该非第一帧图像中包含目标人物的候选区域。S202: For the acquired non-first frame image, acquire a candidate area including the target person in the non-first frame image.

S203：对候选区域进行特征提取。S203: Perform feature extraction on the candidate region.

S204：确定所述候选区域与上一帧图像所对应的第二区域之间的相似度。S204: Determine the similarity between the candidate area and the second area corresponding to the previous frame image.

其中，若上一帧图像为第一帧图像，所述第一帧图像对应的第二区域是根据所述第一帧图像中包含的满足人脸选取条件的人脸所归属的人物的区域确定的。Wherein, if the previous frame image is the first frame image, the second area corresponding to the first frame image is determined according to the area of the person to which the face that satisfies the face selection condition contained in the first frame image belongs of.

S205: Determine one candidate region whose similarity satisfies the preset matching condition as the first region containing the target person in the non-first frame image.
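The face selection of step S201 can be sketched as follows. This is only an illustration under stated assumptions: each detected face is represented as a dictionary with precomputed `similarity` (to the preset target face image) and `priority` values, and when several faces exceed the threshold the most similar one is taken. These field names and the tie-breaking rule are hypothetical.

```python
def select_target_face(faces, similarity_threshold=None):
    """Pick the face that satisfies the face selection condition.

    With a threshold, use the similarity rule; without one, fall back to
    the highest-priority rule. Returns None when no face qualifies.
    """
    if similarity_threshold is not None:
        matches = [f for f in faces if f["similarity"] > similarity_threshold]
        # Assumption: among several matches, prefer the most similar face.
        return max(matches, key=lambda f: f["similarity"]) if matches else None
    return max(faces, key=lambda f: f["priority"]) if faces else None
```

The region of the person to whom the returned face belongs would then serve as the second region for the next frame.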

In practical applications, the target person in the acquired image may be occluded by other objects. As a result, after the first pixel points belonging to the target person in the first region have been determined by the above method, the sub-image containing all the first pixel points may exhibit holes. The pixel points inside such a hole are all treated as not belonging to the target person, which reduces the accuracy of the determined first pixel points and, in turn, the accuracy of the background region determined in the image. Therefore, to further improve the accuracy of determining the background region in the image, after determining the first pixel points belonging to the target person in the first region and before determining the background region in the image according to the first pixel points, the method further includes:

从所述图像中确定包含所述第一像素点的子图像；determining a sub-image containing the first pixel from the image;

通过预先训练的连通域检测模型，基于所述子图像，确定所述子图像中归属于所述目标人物的第二像素点，并根据所述第二像素点对所述第一像素点进行更新。Using a pre-trained connected domain detection model, based on the sub-image, determine a second pixel point belonging to the target person in the sub-image, and update the first pixel point according to the second pixel point .

为了进一步提高确定图像中背景区域的准确性，在本申请中，预先训练有连通域检测模型。当基于上述的实施例确定了第一区域中归属于目标人物的第一像素点之后，首先从获取到的图像中确定包含该第一像素点的子图像并剪切。将剪切出的子图像输入到预先训练的连通域检测模型中。通过该连通域检测模型，对输入的子图像进行相应的处理，确定该子图像中归属于目标人物的第二像素点。其中，该连通域检测模型可以是决策树、逻辑回归(logistic regression，LR)，朴素贝叶斯(naive bayes，NB)分类算法，随机森林(random forest，RF)算法，支持向量机(support vector machines，SVM)分类算法、方向梯度直方图(histogram of oriented gradients,HOG)、深度学习算法等。In order to further improve the accuracy of determining the background region in the image, in this application, a connected domain detection model is pre-trained. After the first pixel point belonging to the target person in the first area is determined based on the above-mentioned embodiment, firstly, a sub-image containing the first pixel point is determined from the acquired image and cut out. Input the cropped sub-images into a pre-trained connected domain detection model. Through the connected domain detection model, corresponding processing is performed on the input sub-image, and the second pixel points belonging to the target person in the sub-image are determined. Wherein, the connected domain detection model may be decision tree, logistic regression (logistic regression, LR), naive bayes (naive bayes, NB) classification algorithm, random forest (random forest, RF) algorithm, support vector machine (support vector machines, SVM) classification algorithm, histogram of oriented gradients (HOG), deep learning algorithm, etc.

在一种可能的实施方式中，通过预先训练的连通域检测模型，基于输入的子图像，确定该子图像中归属于目标人物的第二像素点，可通过如下公式表示：In a possible implementation manner, the second pixel point belonging to the target person in the sub-image is determined based on the input sub-image through the pre-trained connected domain detection model, which can be expressed by the following formula:

Here, P_I denotes the second pixel points belonging to the target person in the I-th sub-image, α is the pre-trained edge coefficient, L_I is the input I-th sub-image, and the function in the formula denotes the pre-trained connected domain detection model.

基于上述的实施例获取到子图像中归属于目标人物的第二像素点后，根据获取到的第二像素点对确定的第一像素点进行更新。根据更新后的第一像素点，确定该图像中不归属于目标人物的像素点。根据该图像中不归属于目标人物的像素点，确定该图像中的背景区域。After the second pixel points belonging to the target person in the sub-image are acquired based on the above-mentioned embodiment, the determined first pixel points are updated according to the acquired second pixel points. According to the updated first pixel points, the pixels in the image that do not belong to the target person are determined. The background area in the image is determined according to the pixels in the image that do not belong to the target person.

Because the connected domain detection model is trained in advance, the input sub-image can be processed by this model so that the second pixel points belonging to the target person are accurately obtained from the sub-image. This avoids holes appearing inside the body of the target person in the sub-image, as well as pixel points of non-target persons being misidentified as pixel points of the target person. The accuracy of the first pixel points determined in the first region is thereby further improved, and the background region subsequently determined in the image according to those first pixel points is also more accurate.
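The application uses a trained connected domain detection model for this step. As a rough illustration of the effect it aims at, the sketch below swaps in a classical technique instead: a flood fill from the image border, which marks every background pixel that is enclosed by person pixels as a hole and fills it. This is a stand-in under that assumption, not the trained model described here, and `fill_holes` is a hypothetical name.

```python
from collections import deque


def fill_holes(mask):
    """Return a copy of a binary mask (rows of 0/1) with enclosed holes set to 1."""
    height, width = len(mask), len(mask[0])
    reachable = [[False] * width for _ in range(height)]
    queue = deque()
    # Seed the flood fill with every background pixel on the border.
    for r in range(height):
        for c in range(width):
            on_border = r in (0, height - 1) or c in (0, width - 1)
            if on_border and mask[r][c] == 0:
                reachable[r][c] = True
                queue.append((r, c))
    # 4-connected breadth-first search over background pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and mask[nr][nc] == 0 and not reachable[nr][nc]:
                reachable[nr][nc] = True
                queue.append((nr, nc))
    # Background pixels that the border cannot reach are holes.
    return [
        [1 if mask[r][c] == 1 or not reachable[r][c] else 0 for c in range(width)]
        for r in range(height)
    ]
```

In this sketch the filled mask plays the role of the second pixel points, and replacing the original mask with it corresponds to updating the first pixel points.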

下面通过具体的实施例对本申请提供的确定图像中属于目标人物的像素点的过程进行详细的说明，图3为本申请一些实施例提供的具体的确定图像中属于目标人物的像素点的流程示意图，该流程包括：The process of determining the pixels belonging to the target person in the image provided by the present application will be described in detail below through specific embodiments. FIG. 3 is a schematic flow diagram of specific determination of the pixels belonging to the target person in the image provided by some embodiments of the present application. , the process includes:

S301：确定获取到的图像中包含目标人物的第一区域。S301: Determine the first region in the acquired image that contains the target person.

S302：通过预先训练的边缘检测模型，基于第一区域以及上一帧图像中包含目标人物的第二区域，确定第一区域中归属于目标人物的第一像素点。S302: Using the pre-trained edge detection model, based on the first region and the second region containing the target person in the previous image frame, determine the first pixel points in the first region that belong to the target person.

S303：从图像中确定包含第一像素点的子图像。S303: Determine a sub-image including the first pixel from the image.

S304：通过预先训练的连通域检测模型，基于子图像，确定子图像中归属于目标人物的第二像素点。S304: Using the pre-trained connected domain detection model, based on the sub-image, determine the second pixel points belonging to the target person in the sub-image.

S305：根据所述第二像素点，对第一像素点进行更新并输出更新后的第一像素点。S305: According to the second pixel, update the first pixel and output the updated first pixel.

在一种实施例中，为了方便且准确地确定子图像中归属于目标人物的第二像素点，在本申请中，要预先专门训练连通域检测模型，以使连通域检测模型可以对基于边缘检测模型的输出结果所确定的子图像进行处理。由于连通域检测模型的输入数据是基于边缘检测模型的输出结果所确定的，因此，可以对连通域检测模型和边缘检测模型进行联合训练。具体的，所述连通域检测模型和所述边缘检测模型通过如下方式训练：In one embodiment, in order to conveniently and accurately determine the second pixel points belonging to the target person in the sub-image, in this application, the connected domain detection model should be specially trained in advance, so that the connected domain detection model can The sub-images identified by the output of the detection model are processed. Since the input data of the connected domain detection model is determined based on the output result of the edge detection model, joint training of the connected domain detection model and the edge detection model can be performed. Specifically, the connected domain detection model and the edge detection model are trained in the following manner:

获取样本集中的任一样本数据；所述样本数据包括第一样本图像、第二样本图像以及样本人物在所述第二样本图像中的实际像素点，所述第一样本图像和所述第二样本图像为相邻的两帧图像，所述第一样本图像的采集时间早于所述第二样本图像的采集时间；Acquire any sample data in the sample set; the sample data includes the first sample image, the second sample image and the actual pixel points of the sample person in the second sample image, the first sample image and the The second sample image is two adjacent frames of images, and the acquisition time of the first sample image is earlier than the acquisition time of the second sample image;

通过原始边缘检测模型，基于所述第一样本图像以及所述第二样本图像，确定所述第二样本图像中归属于所述样本人物的第一样本像素点；determining, based on the first sample image and the second sample image, the first sample pixel points belonging to the sample person in the second sample image through an original edge detection model;

从所述第二样本图像中确定包含所述第一样本像素点的样本子图像；determining a sample sub-image including the first sample pixel from the second sample image;

通过原始连通域检测模型，基于所述样本子图像，确定所述样本子图像中归属于所述样本人物的第二样本像素点；Determining a second sample pixel point belonging to the sample person in the sample sub-image based on the sample sub-image through the original connected domain detection model;

根据所述第二样本像素点以及对应的实际像素点，对所述原始边缘检测模型以及所述原始连通域检测模型进行训练，以获取训练完成的边缘检测模型以及连通域检测模型。The original edge detection model and the original connected domain detection model are trained according to the second sample pixel points and corresponding actual pixel points, so as to obtain trained edge detection models and connected domain detection models.

To facilitate joint training of the connected domain detection model and the edge detection model, a sample set for training both models needs to be collected in advance. The sample set contains a large amount of sample data. Each item of sample data includes two adjacent frames of images of the same sample person: the image acquired earlier is the first sample image, and the image acquired later is the second sample image. Each item of sample data also includes the actual pixel points of the sample person in the second sample image. The connected domain detection model and the edge detection model can subsequently be jointly trained on this pre-collected sample data.

在一种可能的实施方式中，为了保证收取到的样本数据的多样性，从而保证训练完成的连通域检测模型和边缘检测模型的鲁棒性，包含同一样本人物的样本图像中该样本人物的外形、动作、拍摄角度等可以是不同的，比如样本人物在样本图像A穿卫衣，样本人物在样本图像B穿西服、样本人物在样本图像A向右15度、样本人物在样本图像B向左15度等。In a possible implementation, in order to ensure the diversity of the collected sample data, thereby ensuring the robustness of the trained connected domain detection model and edge detection model, the The shape, action, shooting angle, etc. can be different. For example, the sample character is wearing a sweater in sample image A, the sample character is wearing a suit in sample image B, the sample character is 15 degrees to the right in sample image A, and the sample character is facing left in sample image B. 15 degrees etc.

可选的，为了进一步增强连通域检测模型和边缘检测模型的鲁棒性，可以对预先从智能设备的工作环境中收集的一个或多个包含有样本人物的样本图像(记为原始样本图像)，进行增强处理。比如，对原始样本图像进行裁剪，采用一个或多个不同的卷积核对原始样本图像进行高斯模糊处理，对原始样本图像进行下采样处理等，然后将增强处理后的样本图像，和/或，原始样本图像均确定为样本集中的样本数据，以根据增强处理后的样本图像，和/或，原始样本图像，共同对待训练的连通域检测模型和待训练的边缘检测模型进行训练。Optionally, in order to further enhance the robustness of the connected domain detection model and the edge detection model, one or more sample images (denoted as original sample images) that contain sample characters collected in advance from the working environment of the smart device , for enhanced processing. For example, crop the original sample image, use one or more different convolution kernels to perform Gaussian blur processing on the original sample image, perform downsampling processing on the original sample image, etc., and then enhance the processed sample image, and/or, The original sample images are all determined as sample data in the sample set, so that the connected domain detection model to be trained and the edge detection model to be trained are jointly trained according to the enhanced sample images and/or the original sample images.

具体实施过程中，从预先收集的样本集中获取任一样本数据。将该样本数据中的第一样本图像以及第二样本图像输入到原始边缘检测模型。通过原始边缘检测模型，基于输入的第一样本图像以及第二样本图像，进行相应的处理，确定第二样本图像中归属于样本人物的第一样本像素点。然后从第二样本图像中确定包含该第一样本像素点的样本子图像，并将该样本子图像输入到原始连通域检测模型。通过原始连通域检测模型，基于输入的样本子图像，确定样本子图像中归属于样本人物的第二样本像素点。根据获取到的第二样本像素点以及样本数据中对应的实际像素点，确定损失值。基于该损失值，对原始边缘检测模型以及原始连通域检测模型进行训练，以获取训练完成的边缘检测模型以及连通域检测模型。During specific implementation, any sample data is obtained from a pre-collected sample set. The first sample image and the second sample image in the sample data are input to the original edge detection model. Through the original edge detection model, corresponding processing is performed based on the input first sample image and the second sample image, and the first sample pixel points belonging to the sample person in the second sample image are determined. Then determine a sample sub-image containing the first sample pixel from the second sample image, and input the sample sub-image to the original connected domain detection model. Using the original connected domain detection model, based on the input sample sub-image, determine the second sample pixel points belonging to the sample person in the sample sub-image. The loss value is determined according to the obtained second sample pixel and the corresponding actual pixel in the sample data. Based on the loss value, the original edge detection model and the original connected domain detection model are trained to obtain the trained edge detection model and connected domain detection model.

在一种可能的实施方式中，设置有边缘系数，初始化的边缘系数可以是随机设置的，也可以是根据人工经验进行设置的，该边缘系数可以在对边缘检测模型以及连通域检测模型进行联合训练的过程中也进行训练。具体的，可以将该边缘系数以及样本子图像同时输入到原始连通域检测模型中。通过原始连通域检测模型，基于输入的样本子图像以及边缘系数，确定样本子图像中归属于样本人物的第二样本像素点。根据获取到的第二样本像素点以及样本数据中对应的实际像素点，确定损失值。基于该损失值，对原始边缘检测模型、原始连通域检测模型以及边缘系数进行训练，以获取训练完成的边缘检测模型、连通域检测模型以及边缘系数。In a possible implementation, edge coefficients are set, and the initialized edge coefficients can be set randomly or based on manual experience. The edge coefficients can be combined with the edge detection model and the connected domain detection model. Training is also performed during training. Specifically, the edge coefficient and the sample sub-image can be simultaneously input into the original connected domain detection model. Using the original connected domain detection model, based on the input sample sub-image and the edge coefficient, determine the second sample pixel points belonging to the sample person in the sample sub-image. The loss value is determined according to the obtained second sample pixel and the corresponding actual pixel in the sample data. Based on the loss value, the original edge detection model, the original connected domain detection model and edge coefficients are trained to obtain the trained edge detection model, connected domain detection model and edge coefficients.

由于预先收集的样本集中包含大量的样本数据，对每个样本数据都进行上述操作，当满足预设的收敛条件时，确定该边缘检测模型以及连通域检测模型训练完成。Since the pre-collected sample set contains a large amount of sample data, the above operations are performed on each sample data, and when the preset convergence condition is met, it is determined that the edge detection model and the connected domain detection model have been trained.

其中，满足预设的收敛条件可以为在当前迭代过程中样本集中的每个样本数据分别对应的损失值的和小于预设的损失值阈值，或对原始边缘检测模型以及原始连通域检测模型进行训练的迭代次数达到设置的最大迭代次数等。具体实施中可以灵活进行设置，在此不做具体限定。Among them, satisfying the preset convergence condition can be that the sum of the loss values corresponding to each sample data in the sample set in the current iteration process is less than the preset loss value threshold, or the original edge detection model and the original connected domain detection model. The number of iterations of training reaches the set maximum number of iterations, etc. It can be set flexibly in specific implementation, and no specific limitation is made here.
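The two convergence conditions can be sketched as a generic training loop. This is a minimal outline, assuming a caller-supplied `train_step(sample)` that runs both models on one item of sample data, updates them (and optionally the edge coefficient), and returns that sample's loss value. `train_step` and `train_until_converged` are hypothetical names.

```python
def train_until_converged(samples, train_step, loss_threshold, max_iterations):
    """Iterate over the sample set until one of the convergence conditions holds.

    Stops when the sum of per-sample losses in the current iteration is below
    `loss_threshold`, or when `max_iterations` iterations have been run.
    Returns (iterations_run, last_total_loss).
    """
    total_loss = float("inf")
    for iteration in range(1, max_iterations + 1):
        total_loss = sum(train_step(sample) for sample in samples)
        if total_loss < loss_threshold:
            return iteration, total_loss
    return max_iterations, total_loss
```

Whether the loss threshold, the iteration cap, or both are used is left open here, as in the text.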

在一种可能的实施方式中，在对原始边缘检测模型以及原始连通域检测模型进行训练时，可以把样本集中的样本数据分为训练样本和测试样本，先基于训练样本对原始边缘检测模型以及原始连通域检测模型进行训练，再基于测试样本对上述已训练的边缘检测模型以及连通域检测模型的可靠程度进行验证。In a possible implementation, when training the original edge detection model and the original connected domain detection model, the sample data in the sample set can be divided into training samples and test samples, and the original edge detection model and The original connected domain detection model is trained, and then the reliability of the above trained edge detection model and connected domain detection model is verified based on the test samples.

为了保护用户的隐私，提高用户的体验，在上述各实施例的基础上，在本申请中，所述方法还包括：In order to protect user privacy and improve user experience, on the basis of the above embodiments, in this application, the method further includes:

获取目标背景图像；Get the target background image;

根据所述背景区域在所述目标背景图像中对应的像素点，对所述背景区域中包含的像素点进行更新。Update the pixels included in the background area according to the corresponding pixels of the background area in the target background image.

由于在视频通话过程中，用户可能会希望不将当前所处的环境分享给他人。因此，为了保护用户的隐私，提高用户的体验，在本申请中，可以根据目标背景图像对视频中的图像的背景区域进行替换。其中，目标背景图像可以是用户设置的，也可以是预先设置的。During the video call, the user may wish not to share the current environment with others. Therefore, in order to protect the user's privacy and improve the user's experience, in this application, the background area of the image in the video may be replaced according to the target background image. Wherein, the target background image may be set by the user, or may be preset.

例如，用户在通过智能设备输入背景替换的请求时，还可以向智能设备输入目标背景图像，以使电子设备后续可以根据目标背景图像对视频中的图像的背景区域进行替换。智能设备接收到用户输入的目标背景图像之后，将该目标背景图像以及采集到的图像发送至电子设备。电子设备基于上述的实施例确定了采集到的图像中的背景图像之后，可以根据获取到的目标背景图像对该采集到的图像进行处理，从而实现对视频中的图像的背景区域进行替换。For example, when the user inputs a background replacement request through the smart device, the user can also input a target background image to the smart device, so that the electronic device can subsequently replace the background area of the image in the video according to the target background image. After receiving the target background image input by the user, the smart device sends the target background image and the collected images to the electronic device. After the electronic device determines the background image in the captured image based on the above embodiment, it may process the captured image according to the acquired target background image, so as to replace the background area of the image in the video.

具体实施过程中，获取到目标背景图像以及获取到的图像中的背景区域后。确定背景区域在目标背景图像中对应的像素点。然后根据背景区域在目标背景图像中对应的像素点，对该背景区域中包含的像素点进行更新。During the specific implementation process, after the target background image and the background area in the acquired image are acquired. Determine the corresponding pixel points of the background area in the target background image. Then, according to the corresponding pixel points of the background area in the target background image, the pixels included in the background area are updated.

在一种可能的实施方式中，根据背景区域在目标背景图像中对应的像素点，对该背景区域中包含的像素点进行更新，可通过如下公式表示：In a possible implementation manner, the pixels contained in the background area are updated according to the corresponding pixels of the background area in the target background image, which can be expressed by the following formula:

out_L = F + (1 - F)B

where out_L is the L-th frame image after the background region is updated, F denotes the portrait region of the L-th frame image (everything other than the background region), B denotes the target background image, and 1 - F denotes the background region of the L-th frame image.
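The background update can be sketched per pixel as follows. This sketch assumes one reading of the formula: the portrait term F corresponds to a binary mask m multiplied by the original image, so that each output pixel is m * image + (1 - m) * background. The mask representation, the grayscale 2-D lists, and the name `replace_background` are assumptions made for illustration.

```python
def replace_background(image, mask, background):
    """Composite per pixel: out = m * image + (1 - m) * background.

    `image` and `background` are same-sized 2-D lists of grayscale values;
    `mask` holds 1 for target-person pixels and 0 for background pixels.
    """
    return [
        [m * p + (1 - m) * b for p, m, b in zip(img_row, mask_row, bg_row)]
        for img_row, mask_row, bg_row in zip(image, mask, background)
    ]
```

Pixels where the mask is 1 keep their original value, and all others take the value at the same position in the target background image.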

To avoid the situation where the target person is outside the image acquisition range of the smart device, on the basis of the above embodiments, in this application, after determining the first region containing the target person in the acquired image, the method further includes:

Determining first position information of a first preset pixel point on the image, and second position information of a second preset pixel point on the first region;

确定所述第一位置信息与对应的第二位置信息之间的目标距离；determining a target distance between the first location information and the corresponding second location information;

若确定所述目标距离满足预设的调整条件，则根据预设的距离与调整方向及角度的对应关系，确定所述目标距离对应的目标调整方向以及目标角度并发送至智能设备，以使所述智能设备通过所述智能设备的云台控制所述智能设备的摄像头向所述目标调整方向调整所述目标角度。If it is determined that the target distance satisfies the preset adjustment condition, then according to the corresponding relationship between the preset distance and the adjustment direction and angle, the target adjustment direction and the target angle corresponding to the target distance are determined and sent to the smart device, so that all The smart device controls the camera of the smart device to adjust the target angle to the target adjustment direction through the pan/tilt of the smart device.

In practical application scenarios, the smart device generally stays where it has been placed. As a result, in an image of the target person captured by the smart device, the region containing the target person may lie in a corner of the image, or part of the target person's body may be missing from the image; that is, the target person is not fully within the image acquisition range of the smart device, so the body of the target person in the image is incomplete. Therefore, to ensure that the smart device can capture images containing the complete target person, and that the target person stays within the image acquisition range of the smart device, in this application, after the electronic device has determined the first region containing the target person in the image based on the above embodiments, it can control the pan/tilt of the smart device to adjust the shooting angle of the smart device's camera according to the relative position of the first region in the image.

After the first region containing the target person in the image has been determined based on the above embodiment, the position information of a first preset pixel point on the image (denoted as first position information) and the position information of a second preset pixel point on the first region (denoted as second position information) can be determined. There may be one or more first preset pixel points on the image; they may include at least one of the pixel points on the edge of the image, and may also include the pixel point at the center of the image. Likewise, there may be one or more second preset pixel points on the first region; they may include at least one of the pixel points on the edge of the first region, and may also include the pixel point at the center of the first region. The distance between each first preset pixel point and the corresponding second preset pixel point is then determined, that is, the target distance between the first position information and the second position information. Based on the target distance, corresponding processing is performed to control the pan/tilt of the smart device to adjust the shooting angle of the smart device's camera.

为了准确地控制智能设备的云台对智能设备的摄像头进行拍摄角度的调整，预设有调整条件，以及距离与调整方向及角度的对应关系。其中，预设的调整条件可以是目标距离小于预设的距离阈值。当基于上述的实施例获取到目标距离后，可以判断该目标距离是否满足预设的调整条件。In order to accurately control the pan/tilt of the smart device to adjust the shooting angle of the camera of the smart device, the adjustment conditions and the corresponding relationship between the distance and the adjustment direction and angle are preset. Wherein, the preset adjustment condition may be that the target distance is smaller than a preset distance threshold. After the target distance is obtained based on the above-mentioned embodiment, it may be determined whether the target distance satisfies a preset adjustment condition.

当确定该目标距离不满足预设的调整条件时，说明该目标人物在智能设备的图像采集范围内，则无需控制智能设备的云台对智能设备的摄像头进行拍摄角度的调整，以使智能设备的摄像头以当前的拍摄角度采集图像。When it is determined that the target distance does not meet the preset adjustment conditions, it means that the target person is within the image acquisition range of the smart device, and there is no need to control the pan/tilt of the smart device to adjust the shooting angle of the camera of the smart device so that the smart device The camera captures images at the current shooting angle.

当确定该目标距离满足预设的调整条件，说明该目标人物不在智能设备的图像采集范围内，或者是即将离开智能设备的图像采集范围，则根据预设的距离与调整方向及角度的对应关系，确定目标距离对应的目标调整方向以及目标角度，然后将该目标调整方向以及目标角度发送至智能设备。When it is determined that the target distance meets the preset adjustment conditions, indicating that the target person is not within the image collection range of the smart device, or is about to leave the image collection range of the smart device, then adjust the corresponding relationship between the preset distance and the direction and angle , determine the target adjustment direction and target angle corresponding to the target distance, and then send the target adjustment direction and target angle to the smart device.

智能设备接收到了电子设备发送的目标调整方向以及目标角度后，可以通过智能设备的云台控制智能设备的摄像头向目标调整方向调整目标角度。After the smart device receives the target adjustment direction and the target angle sent by the electronic device, it can control the camera of the smart device to adjust the target angle to the target adjustment direction through the pan/tilt of the smart device.
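The adjustment decision can be sketched as follows. This is an illustration under several assumptions: the preset pixel points are taken to be the four edges of the image and of the first region, the target distance is the smallest margin between a region edge and the matching image edge, and the correspondence between distance and angle is a small lookup table. The table values and all names are hypothetical.

```python
# Hypothetical lookup: (upper bound on edge distance in pixels, angle in degrees).
ANGLE_TABLE = ((5, 15), (20, 5))


def plan_camera_adjustment(image_size, region_box, angle_table=ANGLE_TABLE):
    """Return (direction, angle) if the region is too close to an image edge, else None.

    image_size: (width, height); region_box: (left, top, right, bottom) in pixels.
    """
    width, height = image_size
    left, top, right, bottom = region_box
    # Distance from each edge of the first region to the matching image edge.
    margins = {
        "left": left,
        "right": width - right,
        "up": top,
        "down": height - bottom,
    }
    direction = min(margins, key=margins.get)
    # Adjustment condition: the smallest margin falls below a table bound.
    for max_distance, angle in angle_table:
        if margins[direction] < max_distance:
            return direction, angle
    return None
```

A result such as `("left", 15)` would be sent to the smart device, which turns its camera through the pan/tilt. `None` means the current shooting angle is kept.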

With the above method, the situation in which the fixed shooting position of the smart device leaves the target person outside the image acquisition range can be avoided. This benefits the subsequent segmentation of the target person from the background region in the image, also benefits tracking of the target person, and improves the quality of that segmentation.

下面通过具体的实施例对本申请提供的图像处理方法进行说明，图4为本申请一些实施例提供的具体的图像处理的流程示意图，如图4所示，该图像处理方法主要分为图像采集、跟踪目标人物、确定背景区域以及背景替换这四个部分。图5为本申请一些实施例提供的具体的图像处理流程示意图，以执行主体为服务器，采集图像的设备为智能设备为例，针对每个部分，结合图5进行详细的介绍：The image processing method provided by the present application is described below through specific embodiments. FIG. 4 is a schematic flow chart of a specific image processing provided by some embodiments of the present application. As shown in FIG. 4 , the image processing method is mainly divided into image acquisition, The four parts are tracking the target person, determining the background area, and background replacement. Fig. 5 is a schematic diagram of a specific image processing flow provided by some embodiments of the present application. Taking the execution subject as a server and the device for collecting images as an example of a smart device, each part is described in detail in conjunction with Fig. 5:

Part 1: Image Acquisition

S501：智能设备在固定拍摄位置，通过智能设备的摄像头采集图像并发送至服务器。S501: The smart device collects images through a camera of the smart device at a fixed shooting position and sends them to a server.

S502：服务器获取图像。S502: The server obtains the image.

Part 2: Tracking the Target Person

S503: The server determines the first region containing the target person in the acquired image.

The process by which the server determines the first region containing the target person in the acquired image includes:

对于获取到的第一帧图像，获取预设的目标人脸图像；通过人脸识别算法，确定第一帧图像中的人脸与目标人脸图像的相似度；将相似度大于预设的相似度阈值的人脸确定为满足人脸选取条件的人脸；或，确定第一帧图像中人脸的优先级，将优先级最高的人脸确定为满足人脸选取条件的人脸；优先级根据人脸与摄像头之间的距离、人脸在图像中的位置、人脸尺寸、人脸的表情分数以及人脸偏转角度中至少一项确定的。For the obtained first frame image, obtain the preset target face image; through the face recognition algorithm, determine the similarity between the face in the first frame image and the target face image; make the similarity greater than the preset similarity The face with the degree threshold is determined as the face that satisfies the face selection condition; or, determine the priority of the face in the first frame image, and determine the face with the highest priority as the face that meets the face selection condition; priority Determined according to at least one of the distance between the face and the camera, the position of the face in the image, the size of the face, the expression score of the face, and the deflection angle of the face.

对于获取到的非第一帧图像，获取该非第一帧图像中包含目标人物的候选区域；确定候选区域与上一帧图像所对应的第二区域之间的相似度；将相似度满足预设匹配条件的一个候选区域确定为第一区域。For the obtained non-first frame image, obtain the candidate area containing the target person in the non-first frame image; determine the similarity between the candidate area and the second area corresponding to the previous frame image; It is assumed that a candidate area matching the condition is determined as the first area.
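The matching step for non-first frames can be sketched as follows. The text above does not state how the similarity between a candidate area and the previous frame's second area is measured, so this sketch assumes intersection-over-union (IoU) of axis-aligned boxes, and it assumes the "preset matching condition" is a minimum IoU. The names `Box`, `iou`, `select_first_region`, and `min_iou` are hypothetical and used only for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class Box(NamedTuple):
    """Axis-aligned area as (left, top, right, bottom) in pixels."""
    left: float
    top: float
    right: float
    bottom: float


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union, used here as an ASSUMED similarity measure."""
    inter_w = max(0.0, min(a.right, b.right) - max(a.left, b.left))
    inter_h = max(0.0, min(a.bottom, b.bottom) - max(a.top, b.top))
    inter = inter_w * inter_h
    area_a = (a.right - a.left) * (a.bottom - a.top)
    area_b = (b.right - b.left) * (b.bottom - b.top)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_first_region(
    candidates: Sequence[Box],
    second_region: Box,
    min_iou: float = 0.5,  # hypothetical "preset matching condition"
) -> Optional[Box]:
    """Return the candidate most similar to the previous frame's second area,
    or None when no candidate reaches the assumed threshold."""
    best: Optional[Box] = None
    best_score = min_iou
    for candidate in candidates:
        score = iou(candidate, second_region)
        if score >= best_score:
            best, best_score = candidate, score
    return best
```

When `select_first_region` returns `None`, the target person was not matched in the current frame; how the method should react in that case is not covered by this sketch.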

S504: The server determines first position information of a first preset pixel point on the image and second position information of a second preset pixel point on the first area.

S505: The server determines the target distance between the first position information and the corresponding second position information.

S506: When the server determines that the target distance satisfies a preset adjustment condition, it determines, according to a preset correspondence between distance and adjustment direction and angle, the target adjustment direction and target angle corresponding to the target distance, and sends them to the smart device.

S507: The smart device acquires the target adjustment direction and the target angle.

S508: The smart device controls its camera, through the pan/tilt head of the smart device, to rotate by the target angle in the target adjustment direction.
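Steps S504 to S506 can be illustrated with a small sketch. Several details are assumptions made only for this example: the first preset pixel point is taken to be the image center, the second preset pixel point is taken to be the center of the first area, only the horizontal offset is considered, and the correspondence between distance and angle is a hypothetical lookup table (`DISTANCE_TO_ANGLE`) with made-up values.

```python
from typing import List, Optional, Tuple

# Hypothetical correspondence table: (minimum offset in pixels, pan angle in degrees),
# sorted by ascending distance. The values are illustrative only.
DISTANCE_TO_ANGLE: List[Tuple[float, float]] = [(80.0, 5.0), (160.0, 10.0), (320.0, 20.0)]


def center(left: float, top: float, right: float, bottom: float) -> Tuple[float, float]:
    """Center point of a rectangle, used as the assumed preset pixel point."""
    return ((left + right) / 2.0, (top + bottom) / 2.0)


def plan_adjustment(
    image_size: Tuple[int, int],
    first_region: Tuple[float, float, float, float],
    table: List[Tuple[float, float]] = DISTANCE_TO_ANGLE,
) -> Optional[Tuple[str, float]]:
    """Return (direction, angle) for the pan/tilt head, or None if no adjustment is needed."""
    width, height = image_size
    image_cx, _ = center(0, 0, width, height)   # first position information (assumed)
    region_cx, _ = center(*first_region)        # second position information (assumed)
    offset = region_cx - image_cx               # signed horizontal target distance
    distance = abs(offset)

    angle: Optional[float] = None
    for min_distance, candidate_angle in table:
        if distance >= min_distance:            # keep the largest bracket reached
            angle = candidate_angle
    if angle is None:
        return None                             # adjustment condition not satisfied
    direction = "right" if offset > 0 else "left"
    return direction, angle
```

In this sketch the server would send the returned pair to the smart device, which then performs S507 and S508. A vertical (tilt) adjustment could be handled the same way with the second coordinate.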

Part 3: Determining the background area

S509: Using the pre-trained edge detection model, and based on the first area and the second area containing the target person in the previous frame image, the server determines the first pixel points in the first area that belong to the target person.

S510: The server determines, from the image, a sub-image containing the first pixel points.

S511: Using the pre-trained connected domain detection model, and based on the sub-image, the server determines the second pixel points in the sub-image that belong to the target person, and updates the first pixel points according to the second pixel points.

S512: The server determines the background area in the image according to the first pixel points determined in S511.
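The data flow of S510 to S512 can be sketched in plain Python. This is not the method described above: the trained connected domain detection model is replaced here by classical 4-connected component labeling that keeps the largest component, purely as a stand-in so the example is runnable. The edge detection model of S509 is not modeled at all; its output (the first pixel points) is assumed to be given as a set of `(row, col)` coordinates. All function names are hypothetical.

```python
from collections import deque
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (row, col)


def bounding_sub_image(pixels: Set[Pixel]) -> Tuple[int, int, int, int]:
    """S510: smallest rectangle (top, left, bottom, right), inclusive, containing the pixels."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


def largest_connected_component(pixels: Set[Pixel]) -> Set[Pixel]:
    """Stand-in for S511: keep the largest 4-connected group of person pixels."""
    remaining = set(pixels)
    best: Set[Pixel] = set()
    while remaining:
        start = remaining.pop()
        component = {start}
        queue = deque([start])
        while queue:  # breadth-first flood fill over the remaining pixels
            r, c = queue.popleft()
            for neighbor in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        if len(component) > len(best):
            best = component
    return best


def background_region(height: int, width: int, person: Set[Pixel]) -> Set[Pixel]:
    """S512: every image pixel that is not a person pixel."""
    return {(r, c) for r in range(height) for c in range(width)} - person
```

For example, with first pixel points `{(1, 1), (1, 2), (2, 1), (2, 2), (0, 4)}` in a 3x5 image, the isolated pixel `(0, 4)` is dropped by the stand-in refinement and ends up in the background area.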

Part 4: Background replacement

S513: Acquire a target background image.

S514: Update the pixel points contained in the background area according to the pixel points in the target background image that correspond to the background area.
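S513 and S514 amount to copying pixels from the target background image into the background area. A minimal sketch follows, assuming images are represented as nested lists of RGB tuples and that the target background image has already been scaled to the same size as the captured image. The function name `replace_background` is hypothetical.

```python
from typing import List, Set, Tuple

RGB = Tuple[int, int, int]
Image = List[List[RGB]]  # rows of RGB pixels


def replace_background(
    image: Image,
    background: Set[Tuple[int, int]],
    target_background: Image,
) -> Image:
    """Return a copy of `image` in which every (row, col) in `background`
    takes the value of the corresponding pixel in `target_background`."""
    same_rows = len(image) == len(target_background)
    if not same_rows or any(len(a) != len(b) for a, b in zip(image, target_background)):
        raise ValueError("image and target background must have the same size")
    result = [list(row) for row in image]  # leave the input image unchanged
    for r, c in background:
        result[r][c] = target_background[r][c]
    return result
```

Pixels outside the background area, that is, the pixels attributed to the target person, are left untouched.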

Optionally, the image processing flow further includes obtaining the trained connected domain detection model and the trained edge detection model. Note that the electronic device used to train the models may be the same as, or different from, the electronic device that subsequently performs image processing. Specifically, the connected domain detection model and the edge detection model are trained as follows:

Acquire any sample data in a sample set. The sample data includes a first sample image, a second sample image, and the actual pixel points of the sample person in the second sample image. The first sample image and the second sample image are two adjacent frames, and the first sample image is captured earlier than the second sample image;

Using the original edge detection model, and based on the first sample image and the second sample image, determine the first sample pixel points in the second sample image that belong to the sample person;

Determine, from the second sample image, a sample sub-image containing the first sample pixel points;

Using the original connected domain detection model, and based on the sample sub-image, determine the second sample pixel points in the sample sub-image that belong to the sample person;

Train the original edge detection model and the original connected domain detection model according to the second sample pixel points and the corresponding actual pixel points, to obtain the trained edge detection model and the trained connected domain detection model.
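The structure of one item of sample data can be made concrete with a short sketch. It only shows how adjacent-frame pairs might be assembled from an annotated frame sequence; the models themselves, the loss function, and the optimizer are not specified in the text above and are not modeled here. `SampleData` and `build_sample_set` are hypothetical names.

```python
from typing import Any, List, NamedTuple, Sequence


class SampleData(NamedTuple):
    """One training sample: two adjacent frames plus the ground truth for the later frame."""
    first_sample_image: Any   # earlier frame
    second_sample_image: Any  # later, adjacent frame
    actual_pixels: Any        # annotated person pixels in the second sample image


def build_sample_set(frames: Sequence[Any], masks: Sequence[Any]) -> List[SampleData]:
    """Pair every frame with its immediate predecessor.

    `frames` must be in capture order and `masks[i]` must annotate `frames[i]`.
    """
    if len(frames) != len(masks):
        raise ValueError("each frame needs exactly one annotation mask")
    return [
        SampleData(frames[i - 1], frames[i], masks[i])
        for i in range(1, len(frames))
    ]
```

A sequence of n annotated frames therefore yields n - 1 samples, and the annotation of the very first frame is never used as a training target.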

Based on the same inventive concept, the present application further provides an image processing system. FIG. 6 is a schematic structural diagram of an image processing system provided by some embodiments of the present application. The system includes a server 61 that executes the method of the above embodiments, and a smart device 62 that captures images and sends them.

In a possible implementation, the smart device 62 is configured to:

acquire the target adjustment direction and the target angle;

control the camera of the smart device 62, through the pan/tilt head of the smart device 62, to rotate by the target angle in the target adjustment direction.

The functions of the server 61 in the image processing system of the present application are similar to those of the electronic device that performs image processing in the above embodiments; repeated details are omitted.

The present application further provides an image processing apparatus. FIG. 7 is a schematic structural diagram of an image processing apparatus provided by some embodiments of the present application. The apparatus includes:

a preprocessing unit 71, configured to determine a first area containing the target person in the acquired image;

a first processing unit 72, configured to determine, using a pre-trained edge detection model and based on the first area and the second area containing the target person in the previous frame image, the first pixel points in the first area that belong to the target person;

a second processing unit 73, configured to determine the background area in the image according to the first pixel points.

In some possible implementations, the preprocessing unit 71 is specifically configured to: acquire the candidate areas containing the target person in the image; determine the similarity between each candidate area and the second area corresponding to the previous frame image, where, if the previous frame image is the first frame image, the second area corresponding to the first frame image is determined according to the area of the person to whom a face that is contained in the first frame image and satisfies the face selection condition belongs; and determine one candidate area whose similarity satisfies the preset matching condition as the first area.

In some possible implementations, the preprocessing unit 71 is specifically configured to: acquire a preset target face image; determine, through a face recognition algorithm, the similarity between each face in the first frame image and the target face image; and determine a face whose similarity is greater than a preset similarity threshold as the face satisfying the face selection condition; or determine the priority of each face in the first frame image and determine the face with the highest priority as the face satisfying the face selection condition, where the priority is determined according to at least one of: the distance between the face and the camera, the position of the face in the image, the face size, the expression score of the face, and the deflection angle of the face.
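The two alternative face selection conditions can be sketched as follows. The face recognition algorithm itself is not modeled: each detected face is assumed to arrive with a precomputed similarity score. The priority formula is an assumption made for this example (larger and more central faces rank higher, using only two of the listed factors), as is the choice to return the most similar face when several exceed the threshold. `Face`, `select_by_similarity`, `priority`, and `select_by_priority` are hypothetical names.

```python
from typing import NamedTuple, Optional, Sequence


class Face(NamedTuple):
    face_id: str
    similarity: float     # similarity to the preset target face image, in [0, 1]
    size: float           # face area in pixels
    center_offset: float  # distance from the image center in pixels


def select_by_similarity(faces: Sequence[Face], threshold: float = 0.8) -> Optional[Face]:
    """Option 1: a face whose similarity exceeds the preset threshold.
    If several qualify, the most similar one is returned (an assumption)."""
    matches = [face for face in faces if face.similarity > threshold]
    return max(matches, key=lambda face: face.similarity) if matches else None


def priority(face: Face) -> float:
    """ASSUMED scoring: larger faces closer to the image center rank higher."""
    return face.size / (1.0 + face.center_offset)


def select_by_priority(faces: Sequence[Face]) -> Optional[Face]:
    """Option 2: the face with the highest priority."""
    return max(faces, key=priority) if faces else None
```

Other listed factors, such as distance to the camera, expression score, or deflection angle, could be folded into `priority` as additional weighted terms.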

In some possible implementations, the first processing unit 72 is further configured to: after determining the first pixel points in the first area that belong to the target person, determine, from the image, a sub-image containing the first pixel points; and, using a pre-trained connected domain detection model and based on the sub-image, determine the second pixel points in the sub-image that belong to the target person, and update the first pixel points according to the second pixel points.

In some possible implementations, the connected domain detection model and the edge detection model are trained in the following manner:

In some possible implementations, the second processing unit 73 is further configured to: acquire a target background image; and update the pixel points contained in the background area according to the pixel points in the target background image that correspond to the background area.

In some possible implementations, the preprocessing unit 71 is further configured to: after determining the first area containing the target person in the acquired image, determine first position information of a first preset pixel point on the image and second position information of a second preset pixel point on the first area; determine the target distance between the first position information and the corresponding second position information; and, if the target distance satisfies a preset adjustment condition, determine, according to a preset correspondence between distance and adjustment direction and angle, the target adjustment direction and target angle corresponding to the target distance and send them to the smart device, so that the smart device controls its camera, through the pan/tilt head of the smart device, to rotate by the target angle in the target adjustment direction.

The technical principle by which the image processing apparatus solves the problem is the same as that described in the above embodiments and is not repeated here.

FIG. 8 is a schematic structural diagram of an electronic device provided by some embodiments of the present application. On the basis of the above embodiments, the present application further provides an electronic device which, as shown in FIG. 8, includes a processor 81, a communication interface 82, a memory 83, and a communication bus 84, where the processor 81, the communication interface 82, and the memory 83 communicate with one another through the communication bus 84;

The memory 83 stores a computer program which, when executed by the processor 81, causes the processor 81 to perform the following steps:

In some possible implementations, the processor 81 is specifically configured to: acquire a preset target face image; determine, through a face recognition algorithm, the similarity between each face in the first frame image and the target face image; and determine a face whose similarity is greater than a preset similarity threshold as the face satisfying the face selection condition; or determine the priority of each face in the first frame image and determine the face with the highest priority as the face satisfying the face selection condition, where the priority is determined according to at least one of: the distance between the face and the camera, the position of the face in the image, the face size, the expression score of the face, and the deflection angle of the face.

In some possible implementations, the processor 81 is further configured to: after determining the first pixel points in the first area that belong to the target person, determine, from the image, a sub-image containing the first pixel points; and, using a pre-trained connected domain detection model and based on the sub-image, determine the second pixel points in the sub-image that belong to the target person, and update the first pixel points according to the second pixel points.

In some possible implementations, the processor 81 is further configured to: acquire a target background image; and update the pixel points contained in the background area according to the pixel points in the target background image that correspond to the background area.

In some possible implementations, the processor 81 is further configured to: after determining the first area containing the target person in the acquired image, determine first position information of a first preset pixel point on the image and second position information of a second preset pixel point on the first area; determine the target distance between the first position information and the corresponding second position information; and, if the target distance satisfies a preset adjustment condition, determine, according to a preset correspondence between distance and adjustment direction and angle, the target adjustment direction and target angle corresponding to the target distance and send them to the smart device, so that the smart device controls its camera, through the pan/tilt head of the smart device, to rotate by the target angle in the target adjustment direction.

Because the principle by which the above electronic device solves the problem is similar to that of the image processing method, the implementation of the electronic device may refer to the implementation of the method; repeated details are omitted.

The communication bus mentioned for the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration it is drawn as a single thick line in the figure, but this does not mean that there is only one bus or only one type of bus.

The communication interface 82 is used for communication between the above electronic device and other devices.

The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.

The processor may be a general-purpose processor, including a central processing unit, a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.

On the basis of the above embodiments, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program executable by a processor. When the program runs on the processor, it causes the processor to implement the following steps:

Because the principle by which the above computer-readable medium solves the problem is similar to that of the image processing method, the steps implemented after the processor executes the computer program in the computer-readable medium may refer to the implementation of the method; repeated details are omitted.

Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.

The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable image processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable image processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable image processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable image processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Obviously, those skilled in the art can make various changes and modifications to the present application without departing from its spirit and scope. If these changes and modifications fall within the scope of the claims of the present application and their technical equivalents, the present application is intended to include them as well.

Claims

1. An image processing method, characterized in that the method comprises:

determining a first region containing a target person in the acquired image;

determining, through a pre-trained edge detection model, a first pixel point belonging to the target person in the first region based on the first region and a second region containing the target person in a previous frame of image;

and determining a background area in the image according to the first pixel point.

2. The method of claim 1, wherein the determining a first region containing a target person in the acquired image comprises:

acquiring a candidate region containing the target person in the image;

determining the similarity between the candidate region and a second region corresponding to the previous frame of image; wherein, if the previous frame of image is a first frame of image, the second region corresponding to the first frame of image is determined according to the region of the person to whom a face that is contained in the first frame of image and meets a face selection condition belongs;

and determining a candidate region with the similarity meeting a preset matching condition as the first region.

3. The method of claim 2, wherein determining the face contained in the first frame of image that meets the face selection condition comprises:

acquiring a preset target face image; determining the similarity between the face in the first frame of image and the target face image through a face recognition algorithm; determining the face with the similarity larger than a preset similarity threshold as the face meeting the face selection condition; or

determining the priority of the face in the first frame of image, and determining the face with the highest priority as the face meeting the face selection condition; the priority is determined according to at least one of the distance between the face and the camera, the position of the face in the image, the size of the face, the expression score of the face and the deflection angle of the face.

4. The method of claim 1, wherein after determining the first pixel point in the first region belonging to the target person and before determining the background area in the image according to the first pixel point, the method further comprises:

determining a sub-image containing the first pixel point from the image;

and determining, through a pre-trained connected domain detection model, a second pixel point belonging to the target person in the sub-image based on the sub-image, and updating the first pixel point according to the second pixel point.

5. The method of claim 4, wherein the connected domain detection model and the edge detection model are trained by:

acquiring any sample data in a sample set; the sample data comprises a first sample image, a second sample image and actual pixel points of a sample person in the second sample image, the first sample image and the second sample image are two adjacent frames of images, and the acquisition time of the first sample image is earlier than that of the second sample image;

determining, by an original edge detection model, a first sample pixel point belonging to the sample person in the second sample image based on the first sample image and the second sample image;

determining a sample sub-image containing the first sample pixel point from the second sample image;

determining, by an original connected domain detection model, a second sample pixel point belonging to the sample person in the sample sub-image based on the sample sub-image;

and training the original edge detection model and the original connected domain detection model according to the second sample pixel points and the corresponding actual pixel points to obtain a trained edge detection model and a trained connected domain detection model.

6. The method of claim 1, further comprising:

acquiring a target background image;

and updating the pixel points contained in the background area according to the corresponding pixel points of the background area in the target background image.

7. The method of claim 1, wherein after determining the first region containing the target person in the acquired image, the method further comprises:

determining first position information of a first preset pixel point on the image and second position information of a second preset pixel point on the first region;

determining a target distance between the first position information and the corresponding second position information;

and if the target distance meets a preset adjustment condition, determining a target adjustment direction and a target angle corresponding to the target distance according to a preset correspondence between distance and adjustment direction and angle, and sending the target adjustment direction and the target angle to a smart device, so that the smart device controls a camera of the smart device, through a pan/tilt head of the smart device, to rotate by the target angle in the target adjustment direction.

8. An image processing system, comprising a server for performing the method of any one of claims 1-7 and a smart device for capturing and transmitting the image.

9. The image processing system of claim 8, wherein the smart device is configured to perform:

acquiring a target adjustment direction and a target angle;

and controlling a camera of the smart device, through a pan/tilt head of the smart device, to rotate by the target angle in the target adjustment direction.

10. An electronic device, characterized in that the electronic device comprises at least a processor and a memory, the processor being adapted to carry out the steps of the image processing method according to any of claims 1-7 when executing a computer program stored in the memory.