CN116403088A


Info

Publication number: CN116403088A
Application number: CN202310320372.1A
Authority: CN
Inventors: 钟贞炎
Original assignee: Shenzhen Shuliantianxia Intelligent Technology Co Ltd
Current assignee: Shenzhen Shuliantianxia Intelligent Technology Co Ltd
Priority date: 2023-03-23
Filing date: 2023-03-23
Publication date: 2023-07-07

Abstract

The embodiments of the present application relate to the technical field of intelligent monitoring and disclose a method for training a static human body detection model, a human body detection method, and a device. The training method obtains several static heat source samples and uses them to iteratively train a preset 3D convolutional neural network until it converges, yielding the static human body detection model. Each static heat source sample comprises k time-sequential frames of infrared images and is marked with a real label, which indicates whether the static heat source in the sample is a static human body heat source or an interference heat source. With this method, the 3D convolutional neural network can learn both the single-frame features and the temporal features of static human body heat sources and interference heat sources, so that the trained model can accurately distinguish a static human body from an interference heat source and accurately detect a human body that is not moving. This effectively addresses the problems that interference heat sources are easily misjudged as people and that static human bodies are missed.

Description


Method for training a static human body detection model, human body detection method, and device

Technical Field

The embodiments of the present application relate to the technical field of intelligent monitoring, and in particular to a method for training a static human body detection model, a human body detection method, and a device.

Background

With the advent of the Internet of Everything era, smart living has become an aspiration of many end users, and how to deliver more intelligent services has become a hot topic. Smart monitoring services built into smart homes are now very popular, and indoor human presence detection plays a vital role in the smart care of vulnerable groups and in smart-home interaction, so it has attracted wide attention in recent years. In particular, using contactless intelligent sensing to accurately sense and recognize human presence in an indoor target area, and thereby monitor whether a person has stayed in the target area for a long time, is of great value for improving users' physical and mental health and the intelligence of the home.

However, when a static interference heat source in the background has a temperature close to that of the human body and the human body is not moving, it is difficult to distinguish the object from the static human body: the interference heat source is easily falsely detected as a human body, and static human bodies are missed.

Summary of the Invention

In view of this, some embodiments of the present application provide a method for training a static human body detection model, a human body detection method, and a device, so that the trained static human body detection model can accurately distinguish a static human body from an interference heat source and accurately detect a human body that has not moved.

In a first aspect, some embodiments of the present application provide a method for training a static human body detection model, including:

obtaining several static heat source samples, where each static heat source sample includes k frames of infrared images, the k frames are time-sequential, and each static heat source sample is marked with a real label indicating whether the static heat source in the sample is a static human body heat source or an interference heat source;

using the several static heat source samples to iteratively train a preset 3D convolutional neural network until the 3D convolutional neural network converges, to obtain the static human body detection model.

In some embodiments, the aforementioned 3D convolutional neural network includes a first feature extraction module, a second feature extraction module, and a classification module, cascaded in that order;

where the first feature extraction module is used to extract the rough shape features of the static heat source in each infrared image of the static heat source sample;

the second feature extraction module is used to extract the temperature distribution features of the static heat source in each infrared image of the static heat source sample, as well as the features of how the static heat source changes along the time dimension;

the classification module is used to classify the feature map output by the second feature extraction module, and to output the probability that the heat source is a static human body heat source and the probability that it is an interference heat source.

In some embodiments, the aforementioned first feature extraction module includes multiple 3D convolutional layers, and each 3D convolutional layer is followed by a max pooling layer.

In some embodiments, the aforementioned second feature extraction module includes multiple cascaded sub-modules, each sub-module includes multiple 3D convolutional layers connected in sequence, and a max pooling layer follows the last 3D convolutional layer.

In some embodiments, the aforementioned classification module includes multiple fully connected layers and a softmax function layer, with a Dropout layer placed between each pair of adjacent fully connected layers.
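To make the cascade concrete, the following sketch traces tensor shapes through a network laid out as described above. It is only an illustration: the text gives no layer counts, channel widths, kernel sizes, or value of k, so every number below (k = 16, 3x3x3 kernels with padding 1, the pooling windows, the channel widths) is a hypothetical choice. Only the 24*32 frame size is taken from the example camera mentioned later in the description.

```python
from dataclasses import dataclass
from typing import List, Tuple

Shape = Tuple[int, int, int, int]  # (channels, frames, height, width)


@dataclass
class Conv3d:
    out_channels: int
    kernel: int = 3   # hypothetical cubic kernel size
    padding: int = 1  # hypothetical padding that preserves the shape

    def out_shape(self, s: Shape) -> Shape:
        _, t, h, w = s
        size = lambda n: n + 2 * self.padding - self.kernel + 1  # stride 1
        return (self.out_channels, size(t), size(h), size(w))


@dataclass
class MaxPool3d:
    kernel: Tuple[int, int, int]  # (frames, height, width); stride equals kernel

    def out_shape(self, s: Shape) -> Shape:
        c, t, h, w = s
        kt, kh, kw = self.kernel
        return (c, t // kt, h // kh, w // kw)


# First module: every 3D convolution is followed by max pooling.
FIRST_MODULE = [Conv3d(16), MaxPool3d((1, 2, 2)), Conv3d(32), MaxPool3d((1, 2, 2))]
# Second module: cascaded sub-modules, each with several 3D convolutions
# and one max pooling layer after the last convolution.
SECOND_MODULE = [
    Conv3d(64), Conv3d(64), MaxPool3d((2, 2, 2)),
    Conv3d(128), Conv3d(128), MaxPool3d((2, 2, 2)),
]
LAYERS = FIRST_MODULE + SECOND_MODULE


def trace(input_shape: Shape, layers) -> List[Shape]:
    """Return the shape after the input and after every layer."""
    shapes = [input_shape]
    for layer in layers:
        shapes.append(layer.out_shape(shapes[-1]))
    return shapes


def flattened_size(s: Shape) -> int:
    """Number of features handed to the first fully connected layer."""
    c, t, h, w = s
    return c * t * h * w


if __name__ == "__main__":
    for shape in trace((1, 16, 24, 32), LAYERS):
        print(shape)
```

With these assumed settings, the classification module (fully connected layers with Dropout in between, then softmax) would receive a flattened feature vector and output two probabilities, one per class.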

In some embodiments, the loss function used in the training process includes:

[formula not reproduced in this text]

where Loss is the loss of a single training branch, N is the number of samples in a single training branch, w_n is the proportion of frames in the n-th static heat source sample in which a hot spot is detected, y_n is the predicted label for the n-th static heat source sample, and the remaining symbol (also not reproduced) denotes the real label for the n-th static heat source sample.
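The loss formula itself does not survive in this text; only its symbols are defined. Purely as an illustration, the sketch below assumes the loss is a binary cross-entropy in which each sample's term is scaled by w_n and the sum is averaged over N. That form is an assumption consistent with the listed symbols, not something the text confirms, and the function name `weighted_cross_entropy` and the clipping constant `eps` are hypothetical.

```python
import math
from typing import Sequence


def weighted_cross_entropy(
    predicted: Sequence[float],  # y_n: predicted probability of "static human"
    real: Sequence[int],         # real label: 1 = static human, 0 = interference
    weights: Sequence[float],    # w_n: share of frames with a detected hot spot
    eps: float = 1e-7,           # hypothetical clipping to avoid log(0)
) -> float:
    """Assumed form: Loss = -(1/N) * sum_n w_n * [t*log(y) + (1-t)*log(1-y)]."""
    n = len(predicted)
    if n == 0:
        raise ValueError("at least one sample is required")
    total = 0.0
    for y, t, w in zip(predicted, real, weights):
        y = min(max(y, eps), 1.0 - eps)
        total += -w * (t * math.log(y) + (1 - t) * math.log(1.0 - y))
    return total / n
```

Under this reading, a sample in which few frames contain a hot spot contributes less to the loss than one in which the heat source is visible in every frame.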

In a second aspect, some embodiments of the present application provide a human body detection method, including:

obtaining a test sample, where the test sample includes k time-sequential frames of infrared images;

determining, from the test sample, whether a heat source is present in the test sample, and if no heat source is present, outputting that no human body is present;

if a heat source is present, determining, from the test sample, whether the heat source has moved, and if it has moved, outputting that a human body is present;

if it has not moved, inputting the test sample into the static human body detection model and outputting whether the heat source in the test sample is a static human body heat source or an interference heat source, where the static human body detection model is trained by the method of the first aspect.
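The three-stage decision flow above can be sketched as a small dispatcher. This is a minimal illustration, assuming the heat source check, the motion check, and the trained model are supplied as callables; the names `detect_human`, `has_heat_source`, `has_moved`, and `static_model`, and the returned strings, are hypothetical.

```python
from typing import Callable, List, Tuple

Frame = List[List[float]]   # one infrared image: rows of temperature values
Sample = List[Frame]        # k time-sequential frames


def detect_human(
    sample: Sample,
    has_heat_source: Callable[[Sample], bool],
    has_moved: Callable[[Sample], bool],
    static_model: Callable[[Sample], Tuple[float, float]],
) -> str:
    """Apply the checks in order and stop at the first conclusive one."""
    if not has_heat_source(sample):
        return "no human body"
    if has_moved(sample):
        return "human body present"
    # The heat source is static: let the trained model decide.
    # static_model is assumed to return (P(static human), P(interference)).
    p_human, p_interference = static_model(sample)
    if p_human >= p_interference:
        return "static human body heat source"
    return "interference heat source"
```

Ordering the checks this way means the 3D convolutional model is only invoked for the hard case, a heat source that is present but not moving.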

In some embodiments, the aforementioned determining, from the test sample, whether the heat source has moved includes:

performing a difference calculation on the test sample to obtain a difference sample;

traversing the difference sample and determining the temperature threshold corresponding to each frame of difference image;

for the pixels in a difference image, selecting the pixels whose difference is not less than the temperature threshold to form a moving area, and if the largest connected area of the moving area is greater than or equal to an area threshold, determining that the heat source moves in that difference image;

after the traversal of the difference sample is complete, if the number of difference images in which the heat source moves is greater than or equal to a count threshold, determining that the heat source has moved.
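The motion check above can be sketched in plain Python as follows. Several details are assumptions made for illustration: the difference is taken as the absolute difference between consecutive frames, connectivity is 4-neighbour, and the per-frame temperature threshold is supplied by a hypothetical `threshold_fn`, since the rule for choosing it is not given in this passage.

```python
from collections import deque
from typing import Callable, List

Image = List[List[float]]


def frame_differences(frames: List[Image]) -> List[Image]:
    """Absolute per-pixel difference between consecutive frames (assumed)."""
    return [
        [[abs(b - a) for a, b in zip(row0, row1)] for row0, row1 in zip(f0, f1)]
        for f0, f1 in zip(frames, frames[1:])
    ]


def largest_connected_area(mask: List[List[bool]]) -> int:
    """Size of the largest 4-connected region of True pixels (BFS)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, area)
    return best


def heat_source_moved(
    frames: List[Image],
    threshold_fn: Callable[[Image], float],  # hypothetical per-frame threshold rule
    area_threshold: int,
    count_threshold: int,
) -> bool:
    moving_frames = 0
    for diff in frame_differences(frames):
        threshold = threshold_fn(diff)
        # Pixels whose difference is not less than the threshold form the moving area.
        mask = [[value >= threshold for value in row] for row in diff]
        if largest_connected_area(mask) >= area_threshold:
            moving_frames += 1
    return moving_frames >= count_threshold
```

Requiring both a minimum connected area and a minimum number of moving frames filters out isolated noisy pixels and one-off fluctuations.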

In a third aspect, some embodiments of the present application provide an electronic device, including:

at least one processor; and

a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method of the first aspect or the second aspect.

In a fourth aspect, some embodiments of the present application provide a computer-readable storage medium, where the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to cause a computer device to perform the method of the first aspect or the second aspect.

Beneficial effects of the embodiments of the present application: unlike the prior art, the method for training a static human body detection model provided by some embodiments of the present application obtains several static heat source samples and uses them to iteratively train a preset 3D convolutional neural network until the network converges, yielding the static human body detection model. Each static heat source sample includes k time-sequential frames of infrared images and is marked with a real label, which indicates whether the static heat source in the sample is a static human body heat source or an interference heat source.

In this embodiment, a large and varied set of static heat source samples is used to train the 3D convolutional neural network, so that the network can learn the single-frame features and time-domain features of static human body heat sources and interference heat sources. The trained static human body detection model is thus able to accurately distinguish a static human body from an interference heat source and to accurately detect a human body that has not moved, which effectively addresses the difficulty that interference heat sources are easily misjudged as people and static human bodies are missed. In addition, the trained model can be applied to monitoring devices with infrared cameras: on the one hand, it is highly robust to interference from heat sources such as sunlight, hot water, and heated objects, and achieves high detection accuracy; on the other hand, it effectively avoids exposing user privacy.

Description of the Drawings

One or more embodiments are illustrated by the figures in the corresponding drawings. These illustrations do not limit the embodiments. Elements with the same reference numerals in the drawings denote similar elements, and unless otherwise stated, the figures are not drawn to scale.

FIG. 1 is a schematic diagram of an infrared image in some embodiments of the present application;

FIG. 2 is a schematic diagram of an application scenario of a human body detection system in some embodiments of the present application;

FIG. 3 is a schematic structural diagram of an electronic device in some embodiments of the present application;

FIG. 4 is a schematic flowchart of a method for training a static human body detection model in some embodiments of the present application;

FIG. 5 is a schematic diagram of collecting static heat source samples in some embodiments of the present application;

FIG. 6 is a schematic structural diagram of a 3D convolutional neural network in some embodiments of the present application;

FIG. 7 is a schematic flowchart of a human body detection method in some embodiments of the present application.

Detailed Description

The present application is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the present application, but do not limit it in any form. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, all of which fall within its scope of protection.

To make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, not to limit it.

It should be noted that, where no conflict arises, the features of the embodiments of the present application may be combined with each other, and all such combinations fall within the scope of protection of the present application. In addition, although functional modules are divided in the device schematic and a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed with a module division different from that of the device, or in an order different from that of the flowchart. Furthermore, words such as "first", "second", and "third" used herein do not limit the data or the execution order; they only distinguish identical or similar items that have basically the same function and effect.

Unless otherwise defined, all technical and scientific terms used in this specification have the same meaning as commonly understood by those skilled in the technical field of the present application. The terms used in this specification are only for the purpose of describing specific embodiments and are not intended to limit the present application. The term "and/or" used in this specification includes any and all combinations of one or more of the associated listed items.

In addition, the technical features involved in the various embodiments of the present application described below may be combined with each other as long as they do not conflict.

To make the methods provided in the embodiments of the present application easier to understand, the terms involved in the embodiments are introduced first:

(1) Infrared thermal imaging technology

Infrared thermal imaging technology uses photoelectric technology to detect signals in a specific infrared band of an object's thermal radiation, converts those signals into an infrared image that human vision can distinguish, and can further calculate temperature values. The value of each pixel in the infrared image is the temperature of the corresponding object in the world coordinate system. Infrared thermal imaging thus lets people go beyond the limits of vision and "see" the temperature distribution on the surface of objects.

For example, an infrared camera is installed on an indoor ceiling and captures the human bodies in the room to obtain an infrared video; it can be understood that the infrared video consists of consecutive frames of infrared images. Referring to FIG. 1, FIG. 1 is one frame of infrared image captured by a low-resolution (24*32) infrared camera installed on the ceiling above the floor. In FIG. 1, the highlighted area in the lower right corner is the temperature distribution of a human body heat source, and the highlighted area in the upper left corner is the temperature distribution of an interference heat source.

(2) Quantile

For a sequence of N numbers x₁, x₂, …, x_(N-1), x_N, quantiles from 0% to 100% are defined: the 0% quantile is the minimum of the sequence, and the 100% quantile is the maximum. An intermediate quantile such as the 50% quantile Q₅₀ means that 50% of the numbers in the sequence are less than or equal to Q₅₀; likewise, for the 75% quantile Q₇₅, 75% of the numbers are less than or equal to Q₇₅. Note that Q₅₀ is not necessarily the middle element when the numbers are sorted in ascending order; it may be a value close to it. Standard algorithms exist for computing quantiles, so they are not described here.

QL: the lower quartile, i.e., the 25% quantile, meaning that a quarter of all values are smaller than QL.

QU: the upper quartile, i.e., the 75% quantile, meaning that a quarter of all values are larger than QU.

IQR: the interquartile range, i.e., QU - QL = 75% quantile - 25% quantile; this interval contains half of all the values.
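As a small illustration of these definitions, the quartiles and the interquartile range can be computed with Python's standard library. The helper name `quartile_summary` is hypothetical, and the "inclusive" interpolation method is just one of the standard conventions; other conventions can give slightly different values on small inputs.

```python
import statistics
from typing import Dict, Sequence


def quartile_summary(values: Sequence[float]) -> Dict[str, float]:
    """Return the minimum, QL, Q50, QU, maximum, and IQR of the values."""
    # quantiles(..., n=4) returns the three cut points at 25%, 50%, and 75%.
    ql, q50, qu = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),   # 0% quantile
        "QL": ql,             # lower quartile
        "Q50": q50,           # median
        "QU": qu,             # upper quartile
        "max": max(values),   # 100% quantile
        "IQR": qu - ql,       # interquartile range
    }


if __name__ == "__main__":
    print(quartile_summary([1, 2, 3, 4, 5]))
```

For a 24*32 infrared frame, the same function could be applied to the 768 flattened pixel temperatures.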

Infrared thermal imaging devices capture infrared images at a certain frequency and can capture and analyze the dynamic behavior of a target human body while protecting privacy. Each pixel in an infrared image reflects the temperature of the object at the corresponding position in space. When monitoring a target human body, the human body needs to be identified in the infrared image to facilitate monitoring.

Before the embodiments of the present application are introduced, the human body detection methods known to the inventors are briefly described, to make the embodiments easier to understand.

In some schemes, indoor human presence is detected using millimeter-wave Doppler radar. First, the millimeter-wave radar transmits an electromagnetic signal and receives the echo signal reflected by targets in the space. Second, the distance and orientation of the reflecting target relative to the device are determined from the echo signal, and the power spectrum of the echo signal is determined from the distance and orientation information. Finally, the target echo signal reflected by the human body is selected from the echo signal according to the power spectrum, and on that basis it is determined whether a human body is present in the target space.

In this scheme, millimeter-wave radar relies mainly on fluctuating signals such as target movement and heartbeat to detect human presence, and it is sensitive to disturbances in the environment. For example, moving curtains, fans, and falling objects are easily misreported as people, which readily triggers false long-stay warnings. In addition, millimeter-wave radar is strongly affected by multipath effects, and targets outside the beam range are easily lost, so a static human body present in the target area may go unreported.

In some schemes, human presence in the target area is detected using infrared thermal imaging sensors. First, the temperature data of an initial frame is obtained from the infrared thermal imaging sensor, and the difference D1 between its maximum and its mean is calculated. If D1 is less than a threshold T1, the target area is taken to be unoccupied; otherwise, it is checked whether the difference D2 between the maximum and the mean of the temperature data of the next two frames is less than T1. If D2 is less than T1, the output is that no one is in the target area; otherwise the next step of the decision process is entered: the coordinates of the maximum temperature in the current frame are marked, and the difference D3 between the maximum and the mean of the 3*3 block of pixels centered on those coordinates is calculated. If D3 is greater than a threshold T2, the output is that no one is in the target area; otherwise, the output is that a human body is present in the target area at the current moment.
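The threshold scheme described above can be sketched as follows. The description leaves some details open, so this sketch makes labeled assumptions: D2 is evaluated separately for each of the two subsequent frames, the "current frame" is the last of the three, and the 3*3 block is clipped at the image border. The function names are hypothetical.

```python
from typing import List, Sequence

Frame = List[List[float]]  # rows of pixel temperatures


def max_minus_mean(values: Sequence[float]) -> float:
    return max(values) - sum(values) / len(values)


def flatten(frame: Frame) -> List[float]:
    return [value for row in frame for value in row]


def peak_neighbourhood(frame: Frame) -> List[float]:
    """Values of the 3*3 block centered on the hottest pixel (clipped at borders)."""
    rows, cols = len(frame), len(frame[0])
    r0, c0 = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: frame[rc[0]][rc[1]],
    )
    return [
        frame[r][c]
        for r in range(max(0, r0 - 1), min(rows, r0 + 2))
        for c in range(max(0, c0 - 1), min(cols, c0 + 2))
    ]


def person_present(frames: Sequence[Frame], t1: float, t2: float) -> bool:
    if len(frames) < 3:
        raise ValueError("three consecutive frames are required")
    first, second, third = frames[:3]
    # D1: a nearly uniform initial frame means no heat source stands out.
    if max_minus_mean(flatten(first)) < t1:
        return False
    # D2: assumed to be checked on each of the next two frames.
    for frame in (second, third):
        if max_minus_mean(flatten(frame)) < t1:
            return False
    # D3: a sharp peak inside the 3*3 block suggests a small hot object.
    d3 = max_minus_mean(peak_neighbourhood(third))
    return d3 <= t2
```

The D3 test is what rejects small hot objects: a single hot pixel stands far above its 3*3 neighbourhood mean, while a larger warm region does not.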

This scheme performs a simple difference calculation on the heat source temperature information obtained by the infrared thermal imaging sensor and compares the difference with preset thresholds to decide whether a human body is present. Although it can rule out false alarms from small heat sources such as cups and teapots, home scenes (such as bathrooms) contain many complex interference heat sources, and this scheme tends to misreport a somewhat larger static interference heat source as a person, giving users a poor experience.

To address the above problems, the embodiments of the present application provide a method for training a static human body detection model, a human body detection method, and a device. Several static heat source samples are obtained and used to iteratively train a preset 3D convolutional neural network until the network converges, yielding a static human body detection model. Each static heat source sample includes k time-sequential frames of infrared images and is marked with a real label, which indicates whether the static heat source in the sample is a static human body heat source or an interference heat source.

Exemplary applications of the electronic device provided by the embodiments of the present application, used for training the static human body detection model or for human body detection, are described below. The electronic device may be a server, for example a server deployed in the cloud. In some embodiments it may instead be a terminal of various types, such as a notebook computer, a desktop computer, or a mobile device.

As an example, refer to FIG. 2, which is a schematic diagram of an application scenario of the human body detection system provided by an embodiment of the present application. The terminal 10 is connected to the server 20 through a network, which may be a wide area network, a local area network, or a combination of the two.

The terminal 10 can be used to obtain training data and build a neural network. For example, a person skilled in the art downloads prepared training data onto the terminal and builds the network structure of the neural network, where the training data includes several static heat source samples. The terminal 10 can also be used to obtain test samples: for example, an infrared camera sends the test samples it has captured to the terminal 10, so that the terminal 10 obtains them. In some embodiments, the terminal 10 may be integrated with the infrared camera.

In some embodiments, the terminal 10 locally executes the method for training a static human body detection model provided by the embodiments of the present application: it trains the designed neural network with the training data and determines the final model parameters, and configuring the neural network with those parameters yields the static human body detection model. In other embodiments, the terminal 10 sends the training data stored on the terminal and the constructed neural network to the server 20 through the network. The server 20 receives the training data and the neural network, trains the neural network with the training data, determines the final model parameters, and sends them to the terminal 10. The terminal 10 saves the final model parameters and configures the neural network with them to obtain the static human body detection model.

The structure of the electronic device in the embodiments of the present application is described below. FIG. 3 is a schematic structural diagram of the electronic device 500 in an embodiment of the present application. The electronic device 500 includes at least one processor 510, a memory 550, at least one network interface 520, and a user interface 530. The components of the electronic device 500 are coupled together through a bus system 540, which provides the connections and communication between them. In addition to a data bus, the bus system 540 includes a power bus, a control bus, and a status signal bus. For clarity, however, the various buses are all labeled as the bus system 540 in FIG. 3.

The processor 510 may be an integrated circuit chip with signal processing capability, such as a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, where the general-purpose processor may be a microprocessor or any conventional processor.

The user interface 530 includes one or more output devices 531 that enable the presentation of media content, including one or more speakers and/or one or more visual displays. The user interface 530 also includes one or more input devices 532, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, and other input buttons and controls.

The memory 550 includes volatile memory or non-volatile memory, and may include both. The non-volatile memory may be a read-only memory (ROM), and the volatile memory may be a random access memory (RAM). The memory 550 described in the embodiments of the present application is intended to include any suitable type of memory. The memory 550 optionally includes one or more storage devices physically remote from the processor 510.

在一些实施例中,存储器550能够存储数据以支持各种操作,这些数据的示例包括程序、模块和数据结构或者其子集或超集,下面示例性说明。In some embodiments, memory 550 is capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.

操作系统551,包括用于处理各种基本系统服务和执行硬件相关任务的系统程序，例如框架层、核心库层、驱动层等,用于实现各种基础业务以及处理基于硬件的任务；Operating system 551, including system programs for processing various basic system services and performing hardware-related tasks, such as framework layer, core library layer, driver layer, etc., for implementing various basic services and processing hardware-based tasks;

A network communication module 552, for reaching other computing devices via one or more (wired or wireless) network interfaces 520; exemplary network interfaces 520 include Bluetooth, Wireless Fidelity (WiFi), and Universal Serial Bus (USB);

A display module 553, for enabling presentation of information (e.g., a user interface for operating peripheral devices and displaying content and information) via one or more output devices 531 (e.g., display screens, speakers) associated with the user interface 530;

输入处理模块554,用于对一个或多个来自一个或多个输入装置532之一的一个或多个用户输入或互动进行检测以及翻译所检测的输入或互动。The input processing module 554 is configured to detect one or more user inputs or interactions from one or more of the input devices 532 and translate the detected inputs or interactions.

根据上文可以理解，本申请实施例提供的训练静态人体检测模型的方法可以由各种类型具有处理能力的电子设备实施，例如由电子设备的处理器实施执行或由其它具有计算处理能力的设备实施执行等。其它具有计算处理能力的设备可以是与电子设备通信连接的智能终端或服务器等。According to the above, it can be understood that the method for training a static human body detection model provided by the embodiment of the present application can be implemented by various types of electronic devices with processing capabilities, for example, implemented by a processor of an electronic device or by other devices with computing processing capabilities implementation, etc. Other devices with computing and processing capabilities may be smart terminals or servers that are communicatively connected to electronic devices.

下面结合本申请实施例提供的电子设备的示例性应用和实施，说明本申请实施例提供的训练静态人体检测模型的方法。请参阅图4，图4是本申请实施例提供的训练静态人体检测模型的方法的流程示意图。可以理解的是，该训练方法的执行主体可以是电子设备的一个或多个处理器。The method for training a static human body detection model provided by the embodiment of the present application will be described below in conjunction with the exemplary application and implementation of the electronic device provided in the embodiment of the present application. Please refer to FIG. 4 . FIG. 4 is a schematic flowchart of a method for training a static human detection model provided by an embodiment of the present application. It can be understood that the training method may be executed by one or more processors of the electronic device.

请再次参阅图4，该方法S100具体可以包括如下步骤：Please refer to FIG. 4 again, the method S100 may specifically include the following steps:

S10：获取若干个静态热源样本。S10: Obtain several static heat source samples.

其中，静态热源样本包括k帧红外图像，该k帧红外图像具有时序性。各静态热源样本均标注有真实标签，真实标签反映静态热源样本中的静态热源属于静态人体热源或干扰热源。Wherein, the static heat source samples include k frames of infrared images, and the k frames of infrared images are time-sequential. Each static heat source sample is marked with a real label, and the real label reflects that the static heat source in the static heat source sample belongs to a static human body heat source or an interference heat source.

可以理解的是，静态热源样本是红外摄像头对某一区域内进行采集得到的红外视频，该红外视频包括k帧按时序排列的红外图像。在一些实施例中，k＝f*t,其中，f为红外摄像头的采集频率，t为时间(秒)。在一些实施例中，1≤t≤60，1≤f≤32。It can be understood that the static heat source sample is an infrared video collected by an infrared camera in a certain area, and the infrared video includes k frames of infrared images arranged in time sequence. In some embodiments, k=f*t, where f is the acquisition frequency of the infrared camera, and t is time (seconds). In some embodiments, 1≤t≤60, 1≤f≤32.

Among these static heat source samples, some contain only interference heat sources, such as a basin of hot water, a heater or a heated toilet, and no human body; the others contain interference heat sources and a static human body, where a static human body is a human body that has not moved substantially.

每个静态热源标注有反映热源类别的真实标签。在一些实施例中，可以采用编码标注标签，例如，干扰热源用0表示，静态人体热源用1表示，则只包括干扰热源的静态热源样本标注0，包括静态人体热源的静态热源样本标注1。Each static heat source is annotated with a ground-truth label reflecting the heat source category. In some embodiments, coding labels can be used. For example, interference heat sources are represented by 0, and static human body heat sources are represented by 1, then static heat source samples that only include interference heat sources are marked with 0, and static heat source samples that include static human body heat sources are marked with 1.

In some embodiments, referring to FIG. 5, the infrared camera is installed at the center of the ceiling above the indoor target area, facing straight down at the target area, at a height h of 2.0 to 3.0 meters above the floor. As shown in FIG. 5, the infrared camera collects infrared video data of the target area. During data collection, a person enters the target area, may remain still or move within it, may also create static interference heat sources (for example, by placing a basin of hot water, a hot towel or a heater in the target area), and leaves after a period of time. The person may enter and leave the target area repeatedly during data collection and create different static interference heat sources in it. In some embodiments, data collection is carried out in a real application scene; for example, the infrared camera is installed at the center of a bathroom ceiling to collect data on the bathroom scene in real time, with data collected and stored in 24-hour cycles.

Each time the infrared camera has collected and buffered infrared video data of a certain duration (for example, 1 minute), it sends the buffered data to the electronic device (such as a computer or a server), clears the buffer, and continues collecting and buffering. In some embodiments, the number of frames of infrared video data collected in a single time period is much greater than the k frames required for a static heat source sample. By reviewing the infrared video, the frame numbers at which a person enters and leaves the target area are recorded, and these frame numbers make it possible to annotate whether a given data sequence in the infrared video data contains a human body heat source. Based on the recorded frame numbers, occupied segments and unoccupied segments are extracted from the infrared video data of a single time period. The occupied and unoccupied segments are then each divided into candidate samples of a fixed duration of t seconds (that is, k frames).
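The final segmentation step can be sketched as follows. This is a minimal illustration, assuming non-overlapping windows and that a trailing remainder shorter than k frames is dropped; the function name `split_into_samples` and the example frame numbers are hypothetical and do not come from the application.

```python
def split_into_samples(start_frame, end_frame, k):
    """Split the frame range [start_frame, end_frame) into consecutive k-frame windows.

    Assumption: windows do not overlap and an incomplete trailing window is dropped.
    """
    samples = []
    first = start_frame
    while first + k <= end_frame:
        samples.append((first, first + k))
        first += k
    return samples


f, t = 8, 2          # hypothetical acquisition frequency (Hz) and duration (s)
k = f * t            # k = f * t frames per sample
# A person was recorded entering at frame 100 and leaving at frame 150.
occupied = split_into_samples(100, 150, k)
```

Each returned pair is a half-open frame range that would become one candidate sample.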

在一些实施例中，可通过检测人体移动的算法筛选出人体存在移动的备选样本，剔除人体存在移动的备选样本，从而，剩余的有效备选样本包括只含有干扰热源的静态热源样本(负样本)和含有静态人体热源的静态热源样本(正样本)。In some embodiments, the algorithm for detecting the movement of the human body can be used to screen out the candidate samples in which the human body moves, and eliminate the candidate samples in which the human body moves, so that the remaining effective candidate samples include static heat source samples containing only interfering heat sources ( Negative samples) and static heat source samples containing static human heat sources (positive samples).

After obtaining the valid candidate samples, the electronic device labels them, either manually or with a labeling tool, to obtain the static heat source samples, each annotated with a real label indicating whether it contains an interference heat source or a static human body heat source.

可以理解的是，上述采集数据的实施例中，静态热源样本是模仿真实居家场景中获取的，如固定的干扰热源或者人体长时间在一个位置站立或静坐时，会产生重复性样本。为了使用于训练的数据尽可能保证样本的多样性，在一些实施例中，通过对样本进行筛选，丢弃重复性样本，使得训练得到的静态人体检测模型具有更好的泛化性。It can be understood that in the above data collection example, the static heat source samples are obtained in imitation of real home scenes, such as fixed interference heat sources or when the human body stands or sits in one position for a long time, repetitive samples will be generated. In order to make the data used for training as diverse as possible, in some embodiments, the samples are screened and repetitive samples are discarded, so that the trained static human detection model has better generalization.

Specifically, at least one sample is first chosen from the static heat source samples as a selected sample, and the rest are treated as candidate samples. The middle frame imge2 of a candidate sample and the middle frame imge1 of a selected sample are subtracted to obtain the difference frame diff_imge = imge2 - imge1. Pixels of the difference frame whose values are not less than a threshold threshold_1 are segmented out. If the area of the largest connected region among the segmented pixels reaches a set area threshold S1, the candidate sample is added to the selected samples; otherwise, the candidate sample is discarded.
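The screening rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: samples are NumPy arrays of shape (k, H, W), connectivity is 4-neighbour, and a candidate is compared against every sample selected so far. The text fixes none of these choices, and the helper names are hypothetical.

```python
from collections import deque

import numpy as np


def largest_connected_area(mask):
    """Pixel count of the largest 4-connected region of True values in a 2-D mask."""
    rows, cols = mask.shape
    seen = np.zeros(mask.shape, dtype=bool)
    best = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or seen[r, c]:
                continue
            seen[r, c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            best = max(best, area)
    return best


def is_distinct(candidate, selected, threshold_1, s1):
    """Apply the difference-frame test to the middle frames of two samples."""
    imge2 = candidate[candidate.shape[0] // 2]
    imge1 = selected[selected.shape[0] // 2]
    diff_imge = imge2 - imge1
    return largest_connected_area(diff_imge >= threshold_1) >= s1


def drop_repetitive(samples, threshold_1, s1):
    """Keep the first sample, then keep each candidate that differs from all kept ones."""
    selected = [samples[0]]
    for candidate in samples[1:]:
        if all(is_distinct(candidate, kept, threshold_1, s1) for kept in selected):
            selected.append(candidate)
    return selected
```

Because the difference is signed, only regions that are warmer in the candidate than in the selected sample count toward the area, which follows the wording of the text literally.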

In some embodiments, data augmentation is applied to the static heat source samples that remain after repetitive samples are discarded; for example, each frame of infrared image in a static heat source sample is flipped vertically, flipped horizontally, or rotated by 180°. In some embodiments, to increase the number of samples that contain a static human body heat source, further data augmentation is applied to those samples, for example by translating the static human body heat source, taking its centroid as the reference, so that the centroid position moves accordingly.
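The flip and rotation operations can be sketched with NumPy as below, assuming a sample is stored as an array of shape (k, H, W) with time as the first axis. The function name `augment` is hypothetical, and the centroid-based translation is not covered here.

```python
import numpy as np


def augment(sample):
    """Return flipped and rotated copies of a (k, H, W) sample; frame order is unchanged."""
    return {
        "flip_up_down": np.flip(sample, axis=1),              # mirror each frame vertically
        "flip_left_right": np.flip(sample, axis=2),           # mirror each frame horizontally
        "rotate_180": np.rot90(sample, k=2, axes=(1, 2)),     # rotate each frame by 180 degrees
    }


sample = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
augmented = augment(sample)
```

All three transforms preserve the frame size, so augmented samples keep the same shape as the original.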

在一些实施例中，在样本输入神经网络之前先进行异常值处理，对温度值为空值或温度值偏高与偏低的异常像素点进行处理，针对像素点对应温度值为空值的情况，采用当前样本数据中所有像素点温度值的中值进行填充处理；而针对像素点对应温度值偏高与偏低的情况，通过设定温度数值的上下限阈值进行截断处理。In some embodiments, the outlier processing is performed before the samples are input into the neural network, and the abnormal pixel points whose temperature value is null or the temperature value is high or low are processed, and the corresponding temperature value of the pixel point is null , using the median temperature value of all pixel points in the current sample data for filling processing; and for the case where the temperature value corresponding to the pixel point is too high or too low, the upper and lower limit thresholds of the temperature value are set for truncation processing.
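A minimal sketch of this outlier handling, assuming null temperature values are represented as NaN and that the limits `t_low` and `t_high` are chosen by the implementer (the text gives no values):

```python
import numpy as np


def clean_sample(sample, t_low, t_high):
    """Fill null (NaN) pixels with the sample median, then clip to [t_low, t_high]."""
    cleaned = np.array(sample, dtype=float)      # copy, so the input is left untouched
    median = np.nanmedian(cleaned)               # median over all non-null pixels of the sample
    cleaned[np.isnan(cleaned)] = median
    return np.clip(cleaned, t_low, t_high)


raw = np.array([[[20.0, np.nan], [150.0, -40.0]]])   # one 2x2 frame with a null and two outliers
cleaned = clean_sample(raw, t_low=0.0, t_high=60.0)
```

Here the median is taken before clipping, so it is computed from the raw non-null readings; computing it after clipping would be an equally defensible choice.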

由于各季节对应环境温度值差异性较大，从而红外摄像头在不同季节获取到的相同热源温度分布数据差异性也较大，为了能使得模型训练时更快收敛且保证模型的鲁棒性，对静态热源样本数据进行归一化处理，即将每个静态热源样本数据映射到固定区间范围：Due to the large difference in the ambient temperature values corresponding to each season, the temperature distribution data of the same heat source obtained by the infrared camera in different seasons also has a large difference. In order to make the model converge faster and ensure the robustness of the model, the Static heat source sample data is normalized, that is, each static heat source sample data is mapped to a fixed interval range:

其中，imges₀为当前进行归一化处理的静态热源样本，T_min与T_max分别为静态热源样本imges₀中的温度最小值与最大值，imges₁为归一化后的静态热源样本。Among them, imges₀ is the static heat source sample currently undergoing normalization processing, T_min and T_max are the minimum and maximum temperature values in the static heat source sample imges₀ respectively, and imges₁ is the normalized static heat source sample.
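The normalization formula itself is not reproduced in this text. Assuming the fixed interval is [0, 1] and the mapping is ordinary min-max scaling (both are assumptions, chosen to be consistent with the definitions of T_min and T_max above), it would take the form:

```latex
\mathrm{imges}_1 = \frac{\mathrm{imges}_0 - T_{\min}}{T_{\max} - T_{\min}}
```

Under this form, the coldest pixel of a sample maps to 0 and the hottest to 1, regardless of the season in which the sample was recorded.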

The static heat source samples obtained after the above duplicate removal, data augmentation, outlier handling and/or normalization are used as the training data input to the neural network. This not only lets the neural network learn from a rich set of samples, but also speeds up convergence and improves the robustness and generalization ability of the model.

S20：采用若干个静态热源样本，对预先设置的3D卷积神经网络进行迭代训练，直至3D卷积神经网络收敛，得到静态人体检测模型。S20: Using several static heat source samples, iteratively train the preset 3D convolutional neural network until the 3D convolutional neural network converges to obtain a static human body detection model.

It can be understood that, in some embodiments, the static heat source samples here may be the samples obtained after duplicate removal, data augmentation, outlier handling and/or normalization.

将若干个静态热源样本作为输入3D神经网络的训练数据，对预先设置的3D神经网络进行训练，不断调整3D神经网络的参数，在损失函数的约束下，3D神经网络输出的预测标签会越来越接近真实标签。当由损失函数计算的损失在一定范围内波动或达到某一值时，3D神经网络收敛，将收敛时的参数作为模型参数，得到静态人体检测模型。Several static heat source samples are used as the training data of the input 3D neural network, the preset 3D neural network is trained, and the parameters of the 3D neural network are continuously adjusted. Under the constraint of the loss function, the predicted label output by the 3D neural network will become more and more closer to the real label. When the loss calculated by the loss function fluctuates within a certain range or reaches a certain value, the 3D neural network converges, and the parameters at the time of convergence are used as model parameters to obtain a static human detection model.

其中，3D卷积神经网络是包括3D卷积层的神经网络，更适用于时空特征的学习，从而，3D卷积神经网络除了学习静态热源样本的单帧特征外，还能够学习到静态热源样本的时域特征，有益于训练得到的静态人体检测模型通过区分时域特征，检测静态人体。Among them, the 3D convolutional neural network is a neural network including a 3D convolutional layer, which is more suitable for the learning of spatiotemporal features. Therefore, in addition to learning the single-frame features of static heat source samples, the 3D convolutional neural network can also learn static heat source samples. The time-domain features are beneficial to the static human detection model trained to detect static human bodies by distinguishing time-domain features.

In this embodiment, a large and varied set of static heat source samples is used to train the 3D convolutional neural network, so that it learns the single-frame features and the time-domain features of static human body heat sources and interference heat sources. The trained static human body detection model can therefore accurately distinguish a static human body from an interference heat source and accurately detect a human body that has not moved, which effectively addresses two difficulties: an interference heat source being misjudged as a person, and a static human body that is present being missed. In addition, the trained static human body detection model can be applied to monitoring devices with infrared cameras. On the one hand, it is robust to interference from heat sources such as sunlight, hot water and heated objects, and has high detection accuracy; on the other hand, it avoids exposing user privacy.

在一些实施例中，3D卷积神经网络包括依次级联的第一特征提取模块、第二特征提取模块和分类模块。In some embodiments, the 3D convolutional neural network includes a first feature extraction module, a second feature extraction module and a classification module cascaded in sequence.

For any static heat source sample, the k frames of images in the sample are input into the 3D convolutional neural network. The first feature extraction module extracts features, its output is passed to the second feature extraction module for further feature extraction, and that output is passed to the classification module, which outputs the probability that the static heat source is a static human body heat source and the probability that it is an interference heat source.

其中，第一特征提取模块用于提取静态热源样本中各个红外图像中的静态热源的大致形状特征。第二特征提取模块用于提取静态热源样本中各个红外图像中的静态热源的温度分布特征以及静态热源样本中静态热源在时间维度上的变化特征。分类模块用于对第二特征提取模块输出的特征图进行分类，输出属于静态人体热源的概率和属于所述干扰热源的概率。Wherein, the first feature extraction module is used to extract the general shape features of the static heat source in each infrared image of the static heat source sample. The second feature extraction module is used to extract the temperature distribution feature of the static heat source in each infrared image in the static heat source sample and the change feature of the static heat source in the time dimension in the static heat source sample. The classification module is used to classify the feature map output by the second feature extraction module, and output the probability of belonging to the static human body heat source and the probability of belonging to the interference heat source.

在此实施例中，通过设置第一特征提取模块和第二特征提取模块，使得3D神经网络能够从特征的粒度和维度两方面，提取静态热源的形状特征、温度分布特征和其在时间维度上的变化特征(例如轻微晃动等时域特征)，从而，分类模块能够基于静态热源的形状特征、温度分布特征和在时间维度上的变化特征，对静态热源进行分类，有利于提高分类结果的准确性，一方面，能够加快3D神经网络收敛，另一方面，能够得到检测准确的静态人体检测模型。In this embodiment, by setting the first feature extraction module and the second feature extraction module, the 3D neural network can extract the shape feature, temperature distribution feature and its time dimension of the static heat source from the granularity and dimension of the feature. Therefore, the classification module can classify static heat sources based on the shape characteristics, temperature distribution characteristics and change characteristics in the time dimension of static heat sources, which is conducive to improving the accuracy of classification results. On the one hand, it can speed up the convergence of the 3D neural network, and on the other hand, it can obtain a static human detection model with accurate detection.

在一些实施例中，第一特征提取模块包括多个3D卷积层，各3D卷积层后均设置有最大池化层。In some embodiments, the first feature extraction module includes multiple 3D convolutional layers, and each 3D convolutional layer is followed by a maximum pooling layer.

Referring to FIG. 6, the first feature extraction module includes 2 3D convolutional layers and 2 max pooling layers, stacked alternately. The activation function of each 3D convolutional layer is the ReLU function. Referring again to FIG. 6, a static heat source sample consisting of k frames of 24*32 infrared images is taken as input, with dimensions (1, 24, 32, k): the first dimension, 1, is the number of channels of a single infrared frame, the second dimension, 24, and the third dimension, 32, are the size of a single infrared frame, and the fourth dimension, k, is the number of infrared frames.

In FIG. 6, F of a 3D convolutional layer is the number of filters, i.e., the number of output channels (the value of the first dimension after the convolution), and S is the stride of the window in a convolutional layer or max pooling layer. Because low-resolution static heat source samples are used to protect user privacy, the second and third dimensions are small. The first max pooling layer therefore does not reduce the dimensions of the sample data; reduction of the second and third dimensions begins at the second max pooling layer, which ensures that basic features such as the approximate shape of the heat source can be learned.

在此实施例中，通过将3D卷积层和最大池化层交叉堆叠设置，使得第一特征提取模块能够提取静态热源样本中各个红外图像中的静态热源的大致形状特征。In this embodiment, by cross-stacking the 3D convolution layer and the maximum pooling layer, the first feature extraction module can extract the general shape feature of the static heat source in each infrared image of the static heat source sample.

在一些实施例中，第二特征提取模块包括多个级联的子模块，各子模块包括依次连接的多个3D卷积层，其中，最后一个3D卷积层后设置有最大池化层。In some embodiments, the second feature extraction module includes a plurality of cascaded sub-modules, and each sub-module includes a plurality of 3D convolutional layers connected in sequence, wherein the last 3D convolutional layer is followed by a maximum pooling layer.

在图6所示的实施例中，第二特征提取模块包括2个级联的子模块，一个子模块包括依次设置的2个3D卷积层和1个最大池化层。第一特征提取模块输出的特征图输入第二特征提取模块进一步进行卷积、池化处理以提取特征，使得输入数据的第二维度逐渐降至3，第三维度逐渐降至4以及第四维度逐渐降至k′，k′的值与样本帧数k相关，第一维度对应的通道数升维至64。In the embodiment shown in FIG. 6 , the second feature extraction module includes two cascaded sub-modules, and one sub-module includes two 3D convolutional layers and one maximum pooling layer arranged in sequence. The feature map output by the first feature extraction module is input to the second feature extraction module for further convolution and pooling processing to extract features, so that the second dimension of the input data is gradually reduced to 3, the third dimension is gradually reduced to 4 and the fourth dimension Gradually decrease to k', the value of k' is related to the number of sample frames k, and the number of channels corresponding to the first dimension is increased to 64.

在此实施例中，通过设置多个级联的重复的子模块，使得输入的数据经过3D卷积层与池化层等操作后映射到隐层特征空间，能够提取静态热源样本中各个红外图像中的静态热源的温度分布特征以及静态热源样本中静态热源在时间维度上的变化特征。In this embodiment, by setting multiple cascaded repeated sub-modules, the input data is mapped to the hidden layer feature space after operations such as 3D convolutional layer and pooling layer, and each infrared image in the static heat source sample can be extracted The temperature distribution characteristics of the static heat source in and the change characteristics of the static heat source in the time dimension in the static heat source sample.

在一些实施例中，分类模块包括多个全连接层和softmax函数层，其中，相邻两个全连接层之间设置有Dropout层。In some embodiments, the classification module includes multiple fully connected layers and softmax function layers, wherein a Dropout layer is arranged between two adjacent fully connected layers.

The fully connected layers flatten the input feature map into a one-dimensional vector. The softmax function layer classifies based on the input vector and outputs the probabilities that the static heat source is a static human body heat source and an interference heat source. A Dropout layer is placed between every two adjacent fully connected layers to randomly drop some neurons in each training branch, which helps avoid overfitting during training and thereby improves the robustness of the model.

Referring again to FIG. 6, the classification module includes, for example, three fully connected layers (Flatten, FC) connected in series, with 128, 32 and 2 neurons respectively: the first fully connected layer, FC-128, outputs a vector of length 128, the second, FC-32, outputs a vector of length 32, and the third, FC-2, outputs a vector of length 2.

The vector output by the third fully connected layer, FC-2, is then input into the softmax layer, which outputs the probabilities that the static heat source is a static human body heat source and an interference heat source. In some embodiments, whether the static heat source in the target area is a human body heat source is determined by comparing the output probability p₁ with a preset threshold p₀, namely:

其中，presence为目标区域人体存在状态值，presence为0表示目标区域内的静态热源为干扰热源，presence为1表示目标区域中的静态热源为人体热源。Among them, presence is the state value of the human body in the target area, presence being 0 indicates that the static heat source in the target area is an interference heat source, and presence being 1 indicates that the static heat source in the target area is a human body heat source.
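The decision formula is not reproduced in this text. A minimal sketch of the comparison is given below, assuming that presence is 1 when p₁ is greater than or equal to p₀ (whether the original comparison is strict is not recoverable here), that p₁ is the softmax output at index 1 (matching the label coding in which 1 denotes a static human body heat source), and a hypothetical default p₀ of 0.5.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def presence_state(logits, p0=0.5, human_index=1):
    """Return 1 if the human-class probability p1 reaches the threshold p0, else 0.

    Assumptions: '>=' comparison, human class at index 1, p0 = 0.5 by default.
    """
    p1 = softmax(logits)[human_index]
    return 1 if p1 >= p0 else 0
```

Raising p₀ trades missed detections for fewer false alarms from interference heat sources, so the value would normally be tuned on held-out data.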

在此实施例中，通过将分类模块设置为包括多个全连接层和softmax函数层，相邻两个全连接层之间设置有Dropout层，能够递进式进行降维，避免过拟合，增加模型的鲁棒性。In this embodiment, by setting the classification module to include multiple fully connected layers and softmax function layers, and a Dropout layer is provided between two adjacent fully connected layers, the dimensionality reduction can be performed progressively to avoid overfitting, increase the robustness of the model.
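The three modules described above can be sketched in PyTorch as follows. This is not the network of FIG. 6: the filter counts of the early layers, the kernel sizes, the pooling windows, the dropout rate and the class name `StaticHumanNet` are all assumptions, chosen only so that a 24*32 frame is reduced to 3*4 with 64 channels and the classifier uses FC-128, FC-32 and FC-2 as stated in the text. PyTorch expects the layout (batch, channels, frames, height, width), so the time dimension is placed before the spatial dimensions rather than last.

```python
import torch
from torch import nn


def conv_relu(c_in, c_out):
    """3D convolution that preserves frame count and size, followed by ReLU."""
    return nn.Sequential(nn.Conv3d(c_in, c_out, kernel_size=3, padding=1), nn.ReLU())


class StaticHumanNet(nn.Module):
    def __init__(self, k=8, dropout=0.5):
        super().__init__()
        # First module: 2 conv layers alternating with 2 max pooling layers.
        self.features1 = nn.Sequential(
            conv_relu(1, 16),
            nn.MaxPool3d(kernel_size=3, stride=1, padding=1),   # no size reduction
            conv_relu(16, 32),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),                # 24x32 -> 12x16
        )
        # Second module: 2 cascaded sub-modules of 2 conv layers + 1 max pooling layer.
        self.features2 = nn.Sequential(
            conv_relu(32, 64), conv_relu(64, 64), nn.MaxPool3d(2),   # k -> k//2, 12x16 -> 6x8
            conv_relu(64, 64), conv_relu(64, 64), nn.MaxPool3d(2),   # -> k//4, 6x8 -> 3x4
        )
        k_prime = k // 4                                         # requires k >= 4
        # Classification module: FC-128, FC-32, FC-2 with Dropout in between.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * k_prime * 3 * 4, 128), nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(128, 32), nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(32, 2),
        )

    def forward(self, x):
        # x: (batch, 1, k, 24, 32); returns class probabilities of shape (batch, 2)
        logits = self.classifier(self.features2(self.features1(x)))
        return torch.softmax(logits, dim=1)


model = StaticHumanNet(k=8).eval()
with torch.no_grad():
    probs = model(torch.zeros(2, 1, 8, 24, 32))
```

With these assumed pooling windows, k′ equals k // 4; a different choice of temporal pooling would change k′ and the input size of the first fully connected layer.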

Here, Loss is the loss of a single training branch, N is the number of samples in a single training branch, w_n is the proportion of frames in the n-th static heat source sample in which a heat source is detected, and y_n is the predicted label for the n-th static heat source sample, which the loss compares with the real label of that sample.

在此实施例中，将每个静态热源样本中检测到存在热源的帧数占比作为预测标签和真实标签之间差异的加权系数，一方面，能够有效避免训练过程中难以学习的困难样本所产生的损失被简单样本所稀释而导致困难样本未被充分学习的问题；另一方面，能够有效避免热源无法被分割的红外图像造成的干扰。In this embodiment, the proportion of the number of frames in which heat sources are detected in each static heat source sample is used as the weighting coefficient of the difference between the predicted label and the real label. The resulting loss is diluted by simple samples, resulting in the problem that difficult samples are not fully learned; on the other hand, it can effectively avoid the interference caused by infrared images where heat sources cannot be segmented.
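The loss formula is not reproduced in this text, so the following is only a sketch of the weighting idea. It assumes a binary cross-entropy term per sample and an average over the N samples of one training branch (read here as a mini-batch, which is an interpretation); only the per-sample weight w_n comes from the description. The function name `weighted_loss` is hypothetical.

```python
import numpy as np


def weighted_loss(y_pred, y_true, w, eps=1e-7):
    """Mean of w_n * BCE(y_pred_n, y_true_n) over the samples of one batch.

    y_pred: predicted probability of the human class per sample
    y_true: real label per sample (0 = interference, 1 = static human)
    w:      fraction of frames per sample in which a heat source was detected
    Assumption: cross-entropy is used as the per-sample difference term.
    """
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    y_true = np.asarray(y_true, dtype=float)
    w = np.asarray(w, dtype=float)
    per_sample = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(w * per_sample))
```

A sample in which no frame contains a segmentable heat source has w_n = 0 and contributes nothing to the loss, which matches the stated aim of suppressing interference from such frames.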

In summary, the method for training a static human body detection model provided by some embodiments of the present application trains the 3D convolutional neural network on a large and varied set of static heat source samples, so that it learns the single-frame features and the time-domain features of static human body heat sources and interference heat sources. The trained static human body detection model can therefore accurately distinguish a static human body from an interference heat source and accurately detect a human body that has not moved, which effectively addresses two difficulties: an interference heat source being misjudged as a person, and a static human body that is present being missed. In addition, the trained static human body detection model can be applied to monitoring devices with infrared cameras. On the one hand, it is robust to interference from heat sources such as sunlight, hot water and heated objects, and has high detection accuracy; on the other hand, it avoids exposing user privacy.

在通过本申请实施例提供的训练静态人体检测模型的方法训练得到静态人体检测模型后，可利用该静态人体检测模型应用至人体检测。本申请实施例提供的人体检测方法可以由各种类型具有计算处理能力的电子设备实施,例如智能终端、服务器或具有红外摄像头的监控设备等。After the static human body detection model is obtained through training through the method for training the static human body detection model provided in the embodiment of the present application, the static human body detection model can be used for human body detection. The human body detection method provided in the embodiment of the present application can be implemented by various types of electronic devices with computing and processing capabilities, such as smart terminals, servers, or monitoring devices with infrared cameras.

下面结合本申请实施例提供的终端的示例性应用和实施,说明本申请实施例提供的人体检测方法。参见图7,图7是本申请实施例提供的人体检测方法的流程示意图。该方法S200包括如下步骤：The following describes the human body detection method provided in the embodiment of the present application in combination with the exemplary application and implementation of the terminal provided in the embodiment of the present application. Referring to FIG. 7, FIG. 7 is a schematic flowchart of a human body detection method provided in an embodiment of the present application. The method S200 includes the following steps:

S201：获取测试样本，该测试样本包括具有时序性的k帧红外图像。S201: Acquire a test sample, where the test sample includes time-sequential k frames of infrared images.

可以理解的是，该测试样本是红外摄像头在实际应用场景中采集得到的红外视频。红外摄像头将测试样本发送给终端，从而，终端获取到测试样本。It can be understood that the test sample is an infrared video collected by an infrared camera in an actual application scene. The infrared camera sends the test sample to the terminal, so that the terminal obtains the test sample.

终端内置有人体检测应用程序，人体检测模型封装于该应用程序中，调用人体检测模型对前述测试样本进行人体检测，经过一系列的计算处理后，输出是否存在人体。The terminal has a built-in human body detection application program, and the human body detection model is encapsulated in the application program. The human body detection model is called to perform human body detection on the aforementioned test samples. After a series of calculations, it outputs whether there is a human body.

S202：根据测试样本，确定测试样本中是否存在热源，若不存在热源，则输出不存在人体。S202: According to the test sample, determine whether there is a heat source in the test sample, and if there is no heat source, output that there is no human body.

首先，根据测试样本中的红外图像，确定测试样本中是否存在热源。可以理解的是，若不存在热源，则说明也不存在人体。First, according to the infrared image in the test sample, it is determined whether there is a heat source in the test sample. It can be understood that if there is no heat source, then the human body also does not exist.

在一些实施例中，采用以下方式确定测试样本中是否存在热源：In some embodiments, the presence or absence of a heat source in a test sample is determined in the following manner:

(1)遍历该测试样本中的每一帧红外图像，确定每一帧红外图像对应的温度阈值threshold_2。(1) Traverse each frame of infrared images in the test sample, and determine the temperature threshold threshold_2 corresponding to each frame of infrared images.

(2)针对红外图像中的像素点，筛选出温度大于或等于温度阈值threshold_2的像素点，构成热斑区域，若热斑区域的最大连通面积大于或等于面积阈值S2，则确定红外图像存在热源。(2) For the pixels in the infrared image, screen out the pixels whose temperature is greater than or equal to the temperature threshold threshold_2 to form a hot spot area. If the maximum connected area of the hot spot area is greater than or equal to the area threshold S2, it is determined that there is a heat source in the infrared image .

(3)当测试样本遍历完成后，若测试样本中存在热源的红外图像的数量大于或等于数量阈值M1，则确定测试样本中存在热源。(3) After the test sample traversal is completed, if the number of infrared images with heat sources in the test sample is greater than or equal to the number threshold M1, it is determined that there is a heat source in the test sample.

It can be understood that each frame of infrared image has its own temperature threshold threshold_2; that is, the thresholds used for the individual infrared images in the test sample are generally not the same. In a real scene, the measured temperature is affected by the environment, the type of object, and the sensor itself, so the temperature distribution differs from one infrared image to the next. For example, the same scene (with the same actual temperature distribution) may show different temperature distributions in two infrared images. If a uniform absolute threshold were used, the threshold would be unreasonable for some infrared images, which would impair the extraction of the hot spot area. Therefore, in this embodiment, each frame of infrared image has its own temperature threshold threshold_2, which helps the subsequent segmentation and extraction of the hot spot area to be accurate.

在一些实施例中，每帧红外图像对应的温度阈值threshold_2采用如下公式进行计算：In some embodiments, the temperature threshold threshold_2 corresponding to each frame of infrared image is calculated using the following formula:

threshold_2 = Q3 + α * (Q3 - Q1)

其中，α∈[0.5,2.0]为参数因子，Q1为每帧红外图像对应的温度值的下四分位数；Q3为每帧红外图像对应的温度值的上四分位数。Among them, α∈[0.5,2.0] is a parameter factor, Q1 is the lower quartile of the temperature value corresponding to each frame of infrared image; Q3 is the upper quartile of the temperature value corresponding to each frame of infrared image.

在此实施例中，根据红外图像中各像素点的温度分布，确定温度阈值threshold_2，使得温度阈值threshold_2能够与该红外图像相匹配，准确分割提取出热斑区域。In this embodiment, the temperature threshold threshold_2 is determined according to the temperature distribution of each pixel in the infrared image, so that the temperature threshold threshold_2 can be matched with the infrared image, and the hot spot area can be accurately segmented and extracted.
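The per-frame threshold formula above can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the function name `frame_threshold`, the representation of a frame as a list of rows of temperatures, and the choice of the "inclusive" (linear interpolation) quartile method are all assumptions, since the text does not say how Q1 and Q3 are estimated.

```python
from statistics import quantiles


def frame_threshold(frame, alpha=1.0):
    """Return Q3 + alpha * (Q3 - Q1) for one infrared frame.

    `frame` is a list of rows of temperature values (hypothetical layout).
    `alpha` is the parameter factor, which the text places in [0.5, 2.0].
    """
    # Flatten the 2D frame into one list of temperatures.
    temps = [t for row in frame for t in row]
    # Quartile estimation method is an assumption; the source does not fix it.
    q1, _median, q3 = quantiles(temps, n=4, method="inclusive")
    return q3 + alpha * (q3 - q1)


if __name__ == "__main__":
    demo = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(frame_threshold(demo, alpha=1.0))
```

Because the threshold is derived from the interquartile range of the frame itself, a frame that is uniformly warmer or cooler shifts the threshold with it, which is the behaviour the embodiment relies on.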

将红外图像中的各个像素点对应的温度值，分别与温度阈值threshold_2进行比较，若某一像素点的温度值大于或等于温度阈值threshold_2，则将该像素点划分至热斑区域。可以理解的是，当对红外图像中的各个像素点完成筛选后，得到热斑区域。The temperature value corresponding to each pixel in the infrared image is compared with the temperature threshold threshold_2, and if the temperature value of a certain pixel is greater than or equal to the temperature threshold threshold_2, the pixel is divided into a hot spot area. It can be understood that, after the screening of each pixel in the infrared image is completed, the hot spot area is obtained.

然后，计算热斑区域的最大连通面积，即最大连通区域中像素点的个数。面积阈值S2为判断红外图像是否包括热源的临界值，可以排除异常像素点。在一些实施例中，面积阈值S2可以基于面积较小的热源确定，能够大致区分热源和异常像素点。Then, calculate the maximum connected area of the hot spot area, that is, the number of pixels in the maximum connected area. The area threshold S2 is a critical value for judging whether the infrared image includes a heat source, and abnormal pixels can be excluded. In some embodiments, the area threshold S2 can be determined based on a heat source with a smaller area, which can roughly distinguish a heat source from an abnormal pixel.

由此，若热斑区域的最大连通面积大于或等于面积阈值S2，可以有效排除异常像素点，区分热源。从而，可以确定红外图像中存在热源。Therefore, if the maximum connected area of the hot spot area is greater than or equal to the area threshold S2, abnormal pixels can be effectively eliminated and heat sources can be distinguished. Thus, it can be determined that there is a heat source in the infrared image.
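The hot spot screening and maximum connected area check described above can be sketched as follows. The sketch assumes 4-connectivity (the text does not state which connectivity is used), takes the temperature threshold and the area threshold S2 as plain arguments, and uses hypothetical names `max_connected_area` and `frame_has_heat_source`.

```python
from collections import deque


def max_connected_area(mask):
    """Pixel count of the largest 4-connected region of True cells in `mask`."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited hot pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best


def frame_has_heat_source(frame, threshold, area_threshold):
    """True if the largest hot spot region covers at least `area_threshold` pixels."""
    # Pixels at or above the temperature threshold form the hot spot area.
    mask = [[t >= threshold for t in row] for row in frame]
    return max_connected_area(mask) >= area_threshold
```

In this sketch an isolated hot pixel forms a region of size 1, so any area threshold above 1 discards it, which matches the stated purpose of S2 as a filter for abnormal pixels.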

可以理解的是，考虑到单帧红外图像进行热源检测的偶发误差，这里，采用k帧红外图像进行热源检测。即，当测试样本遍历完成后，若测试样本中存在热源的红外图像的数量大于或等于数量阈值M1，则确定测试样本中存在热源。It can be understood that, considering the occasional error of heat source detection with a single frame of infrared images, here, k frames of infrared images are used for heat source detection. That is, after the traversal of the test sample is completed, if the number of infrared images of the heat source in the test sample is greater than or equal to the number threshold M1, it is determined that the heat source exists in the test sample.

The number threshold M1 can be determined according to the total number of frames of infrared images in the test sample. In some embodiments, M1 = β*k, where β takes a value in [0.4, 0.8] and k is the total number of frames in the test sample.

在此实施例中，通过上述方式，能够规避采用单帧红外图像进行热源检测的偶发误差，使得热源检测准确。In this embodiment, through the above method, the occasional error of using a single frame infrared image for heat source detection can be avoided, so that the heat source detection is accurate.
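The frame-count vote over the k frames can be illustrated as below. This is a sketch only: `sample_has_heat_source` is a hypothetical name, and the per-frame decisions are assumed to have been computed already and passed in as a list of booleans.

```python
def sample_has_heat_source(frame_flags, beta=0.6):
    """Decide at sample level from per-frame heat source decisions.

    `frame_flags` holds one boolean per frame (True = heat source found).
    `beta` is the ratio used for M1 = beta * k; the text gives [0.4, 0.8].
    """
    k = len(frame_flags)
    m1 = beta * k  # number threshold M1
    # The sample contains a heat source if enough frames agree.
    return sum(frame_flags) >= m1
```

For example, with k = 10 and beta = 0.5, five or more positive frames are needed, so a single spurious frame cannot trigger a detection on its own.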

S203：若存在热源，则根据测试样本，确定热源是否发生移动，若发生移动，则输出存在人体。S203: If there is a heat source, according to the test sample, determine whether the heat source has moved, and if it has moved, output that there is a human body.

If a heat source is present, it may be a static interference heat source or a human body heat source. To distinguish further, whether the heat source has moved is determined according to the test sample. It can be understood that if movement occurs, the heat source is a human body heat source; if no movement occurs, the heat source may be a static interference heat source or a static human body heat source.

在一些实施例中，前述“根据测试样本，确定热源是否发生移动”具体包括：In some embodiments, the aforementioned "according to the test sample, determine whether the heat source has moved" specifically includes:

(1)对测试样本进行差分计算，得到差分样本。(1) Perform differential calculation on the test samples to obtain differential samples.

(2)遍历差分样本，确定每一帧差分图像分别对应的温度阈值threshold_3。(2) Traverse the differential samples, and determine the temperature threshold threshold_3 corresponding to each frame of the differential image.

(3)针对差分图像中的像素点，筛选出差值不小于温度阈值threshold_3的像素点，构成移动区域，若移动区域的最大连通面积大于或等于面积阈值S3，则确定差分图像中热源存在移动。(3) For the pixels in the difference image, filter out the pixels whose difference is not less than the temperature threshold threshold_3 to form a moving area. If the maximum connected area of the moving area is greater than or equal to the area threshold S3, it is determined that the heat source in the difference image has moved .

(4)当差分样本遍历完成后，若差分样本中存在热源移动的差分图像的数量大于或等于数量阈值M2，则确定热源发生移动。(4) After the differential sample traversal is completed, if the number of differential images in which the heat source moves in the differential sample is greater than or equal to the number threshold M2, it is determined that the heat source has moved.

其中，差分计算是指将两个图像按对应像素点位置进行像素值相减运算。可以采用如下公式进行差分计算：Wherein, the difference calculation refers to subtracting pixel values of two images according to corresponding pixel point positions. The following formula can be used for differential calculation:

D_(i,j) = P_(i,j) - H_(i,j)

where P_(i,j) is the pixel value of the pixel in row i, column j of one infrared image in the test sample, H_(i,j) is the pixel value of the pixel in row i, column j of another infrared image in the test sample, and D_(i,j) is the pixel value (i.e., the difference value) of the pixel in row i, column j of the difference image.

可以理解的是，测试样本的两幅图像中未发生重合部分的像素区域对应的差分值较大，发生重合部分的像素区域的差分值接近于0。由于测试样本中每帧红外图像中的静态热源的位置近乎不会发生变化，从而，差分计算可以消除测试样本中的静态热源。It can be understood that, in the two images of the test sample, the difference value corresponding to the pixel area of the non-overlapping part is larger, and the difference value of the pixel area of the overlapping part is close to 0. Since the position of the static heat source in each frame of the infrared image in the test sample hardly changes, the differential calculation can eliminate the static heat source in the test sample.

在一些实施例中，从第2帧起依次计算每帧红外图像与第1帧红外图像的差值，即得到包括k-1帧差分图像的差分样本。In some embodiments, the difference between each frame of infrared image and the first frame of infrared image is sequentially calculated from the second frame, that is, difference samples including k-1 frames of difference images are obtained.
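The differencing against the first frame can be sketched as follows. The function name `difference_sample` and the nested-list frame layout are assumptions for illustration. The sketch keeps the signed difference D = P - H exactly as the formula is written; whether an absolute value is taken in practice is not stated in the text.

```python
def difference_sample(frames):
    """Subtract the first frame from every later frame, pixel by pixel.

    `frames` is a list of k frames, each a list of rows of pixel values.
    Returns k - 1 difference images with D[i][j] = P[i][j] - H[i][j],
    where H is the first frame and P is a later frame.
    """
    first = frames[0]
    return [
        [
            [p - h for p, h in zip(row_p, row_h)]
            for row_p, row_h in zip(frame, first)
        ]
        for frame in frames[1:]
    ]
```

Pixels covered by a heat source that stays in place cancel out to values near 0, so only regions where the heat source has shifted remain prominent in the difference images.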

可以理解的是，对于任意一帧差分图像，对应有自己的温度阈值threshold_3。在一些实施例中，也可以采用公式threshold_3＝Q3+β*(Q3-Q1)计算温度阈值threshold_3。其中，β∈[0.5,2.0]为参数因子，Q1为每帧差分图像对应的温度值的下四分位数；Q3为每帧差分图像对应的温度值的上四分位数。It can be understood that for any frame of difference image, it has its own temperature threshold threshold_3. In some embodiments, the temperature threshold threshold_3 may also be calculated using the formula threshold_3=Q3+β*(Q3-Q1). Among them, β∈[0.5,2.0] is a parameter factor, Q1 is the lower quartile of the temperature value corresponding to each frame of differential image; Q3 is the upper quartile of the temperature value corresponding to each frame of differential image.

筛选出差值不小于温度阈值的像素点，构成移动区域，若移动区域的最大连通面积大于或等于面积阈值，则确定差分图像中热源存在移动。Pixels whose difference is not less than the temperature threshold are screened out to form a moving area. If the maximum connected area of the moving area is greater than or equal to the area threshold, it is determined that the heat source in the differential image has moved.

将差分图像中的各个像素点对应的温度值，分别与温度阈值threshold_3进行比较，若某一像素点的温度值大于或等于温度阈值threshold_3，则将该像素点划分至移动区域。可以理解的是，当对差分图像中的各个像素点完成筛选后，得到移动区域。The temperature value corresponding to each pixel in the difference image is compared with the temperature threshold threshold_3 respectively, and if the temperature value of a certain pixel is greater than or equal to the temperature threshold threshold_3, the pixel is divided into a moving area. It can be understood that, after filtering each pixel in the difference image, a moving area is obtained.

然后，计算移动区域的最大连通面积，即最大连通区域中像素点的个数。面积阈值S3为判断差分图像中热源发生移动的临界值，可以排除异常像素点。Then, calculate the maximum connected area of the moving region, that is, the number of pixels in the maximum connected region. The area threshold S3 is a critical value for judging the movement of the heat source in the difference image, and can exclude abnormal pixels.

由此，若移动区域的最大连通面积大于或等于面积阈值S3，则说明热源发生移动。若移动区域的最大连通面积小于面积阈值S3，则说明热源未发生移动。Thus, if the maximum connected area of the moving region is greater than or equal to the area threshold S3, it indicates that the heat source has moved. If the maximum connected area of the moving area is smaller than the area threshold S3, it means that the heat source has not moved.

可以理解的是，考虑到运动在时间上的连续性以及单帧差值图像进行移动检测的偶发误差，这里，采用包括k-1帧差分图像的差分样本进行移动检测。即，当遍历完成后，若差分样本中存在热源移动的差分图像的数量大于或等于数量阈值M2，说明运动存在持续性，则确定热源发生了移动，否则，确定热源未发生移动。其中，数量阈值M2可以根据差值样本的总帧数确定。在一些实施例中，也可与数量阈值M1相同的方式，确定数量阈值M2。It can be understood that, considering the continuity of motion in time and the occasional error of motion detection in single-frame difference images, here, difference samples including k-1 frame difference images are used for motion detection. That is, after the traversal is completed, if the number of difference images with heat source movement in the difference samples is greater than or equal to the number threshold M2, it indicates that the motion is persistent, and it is determined that the heat source has moved; otherwise, it is determined that the heat source has not moved. Wherein, the number threshold M2 may be determined according to the total number of frames of difference samples. In some embodiments, the quantity threshold M2 can also be determined in the same manner as the quantity threshold M1.

In this embodiment, difference calculation is performed on the test sample, the area of the moving region in each difference image is calculated, and the difference images whose maximum connected moving-region area is greater than or equal to the area threshold S3 are selected. Whether the heat source has moved can then be accurately determined by checking whether the number of selected difference images is greater than or equal to the number threshold M2.

若热源发生移动，说明热源是人体热源。若热源未发生移动，则热源可能是静态干扰热源，也有可能是静态的人体热源，则继续进行步骤S204。If the heat source moves, it means that the heat source is the heat source of the human body. If the heat source does not move, the heat source may be a static interference heat source, or a static human body heat source, and proceed to step S204.

S204：若未发生移动，则将测试样本输入静态人体检测模型，输出测试样本中的热源属于静态人体热源还是属于干扰热源。其中，静态人体检测模型是采用上述训练方法实施例中任意一项训练静态人体检测模型的方法训练得到的。S204: If there is no movement, input the test sample into the static human body detection model, and output whether the heat source in the test sample belongs to the static human body heat source or the interference heat source. Wherein, the static human body detection model is trained by using any one of the methods for training the static human body detection model in the above training method embodiments.

可以理解的是，该静态人体检测模型是通过上述实施例中训练静态人体检测模型的方法训练得到，与上述实施例中静态人体检测模型具有相同的结构和功能，在此不再一一赘述。It can be understood that the static human body detection model is trained by the method for training the static human body detection model in the above embodiment, and has the same structure and function as the static human body detection model in the above embodiment, and will not be repeated here.

在此实施例中，先根据测试样本中各个红外图像的温度分布，判断是否存在热源，若不存在热源，说明也不存在人体。若存在热源，则进一步判断热源是否发生移动，若发生移动，说明存在人体。若未发生移动，则进一步采用静态人体检测模型检测热源是静态人体还是干扰热源。通过此方式，能够准确检测出静态的人体，使得人体检测更加准确。In this embodiment, firstly, according to the temperature distribution of each infrared image in the test sample, it is judged whether there is a heat source. If there is no heat source, it means that there is no human body. If there is a heat source, it is further judged whether the heat source moves, and if it moves, it means that there is a human body. If there is no movement, a static human body detection model is further used to detect whether the heat source is a static human body or an interference heat source. In this manner, a static human body can be accurately detected, making human body detection more accurate.
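The three-stage decision flow of steps S202 to S204 can be summarised in a short sketch. The three stages are passed in as callables so the example stays independent of any particular implementation; the function name `detect_human`, the callable signatures, and the returned label strings are all hypothetical.

```python
def detect_human(sample, has_heat_source, has_moved, static_model):
    """Cascade: heat source check, then movement check, then static model.

    `has_heat_source(sample)` and `has_moved(sample)` return booleans.
    `static_model(sample)` returns True when the static heat source is
    classified as a human body, False for an interference heat source.
    """
    if not has_heat_source(sample):
        return "no_human"        # S202: no heat source, so no human body
    if has_moved(sample):
        return "moving_human"    # S203: a moving heat source is a human body
    # S204: only static heat sources reach the trained model.
    return "static_human" if static_model(sample) else "interference"
```

One consequence of this ordering is that the trained 3D convolutional model is only invoked for samples that contain a non-moving heat source, while the two cheaper threshold-based checks handle the remaining cases.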

An embodiment of the present application further provides a computer-readable storage medium, for example a memory including program code, where the program code can be executed by a processor to carry out the method for training a static human body detection model or the human body detection method in the above embodiments. For example, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.

本申请实施例还提供了一种计算机程序产品，该计算机程序产品包括一条或多条程序代码，该程序代码存储在计算机可读存储介质中。电子设备的处理器从计算机可读存储介质读取该程序代码，处理器执行该程序代码，以完成上述实施例中提供的训练静态人体检测模型的方法的步骤或人体检测方法的步骤。The embodiment of the present application also provides a computer program product, where the computer program product includes one or more pieces of program codes, and the program codes are stored in a computer-readable storage medium. The processor of the electronic device reads the program code from the computer-readable storage medium, and the processor executes the program code to complete the steps of the method for training a static human detection model or the steps of the human detection method provided in the above embodiments.

It should be noted that the device embodiments described above are only illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

通过以上的实施方式的描述，本领域普通技术人员可以清楚地了解到各实施方式可借助软件加通用硬件平台的方式来实现，当然也可以通过硬件。本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程是可以通过计算机程序来指令相关的硬件来完成，所述的程序可存储于一计算机可读取存储介质中，该程序在执行时，可包括如上述各方法的实施例的流程。其中，所述的存储介质可为磁碟、光盘、只读存储记忆体(Read-Only Memory,ROM)或随机存储记忆体(Random Access Memory,RAM)等。Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus a general hardware platform, and of course also by hardware. Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by instructing related hardware through computer programs, and the programs can be stored in a computer-readable storage medium. During execution, it may include the processes of the embodiments of the above-mentioned methods. Wherein, the storage medium may be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM) or a random access memory (Random Access Memory, RAM), etc.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Within the idea of the present application, the technical features of the above embodiments or of different embodiments can also be combined, the steps can be implemented in any order, and there are many other variations of the different aspects of the present application as described above, which are not provided in detail for the sake of brevity. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements for some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims

1.一种训练静态人体检测模型的方法，其特征在于，包括：1. A method for training a static human detection model, comprising:

获取若干个静态热源样本，所述静态热源样本包括k帧红外图像，所述k帧红外图像具有时序性，各所述静态热源样本均标注有真实标签，所述真实标签反映所述静态热源样本中的静态热源属于静态人体热源或干扰热源；Obtain several static heat source samples, the static heat source samples include k frames of infrared images, the k frames of infrared images are sequential, each of the static heat source samples is marked with a real label, and the real label reflects the static heat source sample The static heat source in is a static human body heat source or an interference heat source;

采用所述若干个静态热源样本，对预先设置的3D卷积神经网络进行迭代训练，直至所述3D卷积神经网络收敛，得到所述静态人体检测模型。Using the several static heat source samples, the preset 3D convolutional neural network is iteratively trained until the 3D convolutional neural network converges to obtain the static human body detection model.

2. The method according to claim 1, wherein the 3D convolutional neural network comprises a first feature extraction module, a second feature extraction module and a classification module cascaded in sequence;

其中，所述第一特征提取模块用于提取所述静态热源样本中各个红外图像中的静态热源的大致形状特征；Wherein, the first feature extraction module is used to extract the general shape features of the static heat source in each infrared image in the static heat source sample;

所述第二特征提取模块用于提取所述静态热源样本中各个红外图像中的静态热源的温度分布特征以及所述静态热源样本中静态热源在时间维度上的变化特征；The second feature extraction module is used to extract the temperature distribution characteristics of the static heat sources in each infrared image in the static heat source samples and the change characteristics of the static heat sources in the time dimension in the static heat source samples;

所述分类模块用于对所述第二特征提取模块输出的特征图进行分类，输出属于所述静态人体热源的概率和属于所述干扰热源的概率。The classification module is used to classify the feature maps output by the second feature extraction module, and output the probability of belonging to the static human body heat source and the probability of belonging to the interference heat source.

3.根据权利要求2所述的方法，其特征在于，所述第一特征提取模块包括多个3D卷积层，各所述3D卷积层后均设置有最大池化层。3. The method according to claim 2, wherein the first feature extraction module comprises a plurality of 3D convolutional layers, and a maximum pooling layer is arranged after each of the 3D convolutional layers.

4. The method according to claim 2, wherein the second feature extraction module comprises a plurality of cascaded sub-modules, each sub-module comprising a plurality of 3D convolutional layers connected in sequence, wherein a max pooling layer is arranged after the last 3D convolutional layer.

5. The method according to claim 2, wherein the classification module comprises a plurality of fully connected layers and a softmax function layer, wherein a Dropout layer is arranged between two adjacent fully connected layers.

6. The method according to claim 1, wherein the loss function adopted in the training process comprises:

[Loss function: equation image not reproduced in the source text]

where Loss is the loss of a single training branch, N is the number of samples in a single training branch, w_n is the proportion of frames in the n-th static heat source sample in which a hot spot is detected, y_n is the predicted label corresponding to the n-th static heat source sample, and the remaining symbol in the formula (its image is likewise not reproduced) is the real label corresponding to the n-th static heat source sample.

7.一种人体检测方法，其特征在于，包括：7. A human detection method, characterized in that, comprising:

获取测试样本，所述测试样本包括具有时序性的k帧红外图像；Obtain a test sample, the test sample includes time-sequential k frames of infrared images;

根据所述测试样本，确定所述测试样本中是否存在热源，若不存在热源，则输出不存在人体；According to the test sample, determine whether there is a heat source in the test sample, and if there is no heat source, output that there is no human body;

若存在热源，则根据所述测试样本，确定所述热源是否发生移动，若发生移动，则输出存在人体；If there is a heat source, then according to the test sample, determine whether the heat source moves, and if it moves, output that there is a human body;

If no movement occurs, input the test sample into a static human body detection model, and output whether the heat source in the test sample is a static human body heat source or an interference heat source, wherein the static human body detection model is obtained by training with the method for training a static human body detection model according to any one of claims 1-6.

8.根据权利要求7所述的方法，其特征在于，所述根据所述测试样本，确定所述热源是否发生移动，包括：8. The method according to claim 7, wherein the determining whether the heat source moves according to the test sample comprises:

对所述测试样本进行差分计算，得到差分样本；performing difference calculation on the test sample to obtain a difference sample;

遍历所述差分样本，确定每一帧差分图像分别对应的温度阈值；Traverse the differential samples, and determine the temperature thresholds corresponding to each frame of the differential image;

For the pixels in the difference image, select the pixels whose difference value is not less than the temperature threshold to form a moving region, and if the maximum connected area of the moving region is greater than or equal to an area threshold, determine that the heat source has moved in the difference image;

当所述差分样本遍历完成后，若所述差分样本中存在热源移动的差分图像的数量大于或等于数量阈值，则确定所述热源发生移动。After the traversal of the differential samples is completed, if the number of differential images in which the heat source moves in the differential sample is greater than or equal to a quantity threshold, it is determined that the heat source has moved.

9.一种电子设备，其特征在于，包括：9. An electronic device, characterized in that it comprises:

至少一个处理器；和at least one processor; and

a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of claims 1-8.

10. A computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to cause a computer device to perform the method according to any one of claims 1-8.