
Method for placing detection equipment on display screen based on multilayer convolutional neural network

Info

Publication number
CN112947872A
Authority
CN
China
Prior art keywords
display screen
image
neural network
convolutional neural
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911262034.7A
Other languages
Chinese (zh)
Other versions
CN112947872B (en)
Inventor
雷秀洋
赵国荣
陆飞
易典
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pinqi Technology Co ltd
Original Assignee
PQ LABS Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PQ LABS Inc
Priority to CN201911262034.7A
Publication of CN112947872A
Application granted
Publication of CN112947872B
Legal status: Active

Abstract

The invention relates to a method, based on a multilayer convolutional neural network, for detecting whether a device is placed on a display screen, comprising the following steps: placing an intelligent device equipped with an image acquisition device on a display screen in a lit state, with the image acquisition device facing the display screen; constructing a detection model based on a multilayer convolutional neural network and loading a set of weight parameters; acquiring an image through the image acquisition device facing the display screen, inputting the image into the detection model for inference, and obtaining classification information and a probability value for the image; and, given the classification information and probability value, considering the intelligent device to be placed on the display screen if the probability of the image belonging to the category placed on the display screen is the highest or exceeds a set threshold. Compared with the prior art, the method has the advantages of a low error recognition rate, high detection precision, and high detection speed.

Description

Method for placing detection equipment on display screen based on multilayer convolutional neural network
Technical Field
The invention relates to a multi-screen interaction technology, in particular to a method for placing detection equipment on a display screen based on a multilayer convolutional neural network.
Background
Multi-screen interaction means that operations such as transmission, parsing, display, and control of multimedia content (audio, video, and pictures) can be performed across different multimedia terminal devices (such as mobile phones and televisions) over a wireless network connection, so that the same content can be displayed on different terminal devices and content is shared among all of them.
The prior art WO2016066079A1 discloses a multi-screen interaction method and system, comprising: acquiring the position of a fixed terminal and monitoring the position of a mobile terminal; and judging whether the position of the mobile terminal is within a set range of the position of the fixed terminal, and if so, automatically docking the mobile terminal with the fixed terminal for multi-screen interaction.
In existing interaction technology, detecting whether an intelligent device is placed on a display screen mainly relies on sensors such as gyroscopes and gravity sensors, identifying placement by checking whether changes in the sensor values match pre-designed characteristics. This approach suffers from a high error recognition rate when multiple intelligent devices interfere with one another.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a method, based on a multilayer convolutional neural network, for detecting whether a device is placed on a display screen, with a low error recognition rate, high detection precision, and high detection speed.
The purpose of the invention can be realized by the following technical scheme:
a method of placing a multi-layer convolutional neural network-based detection device on a display screen, comprising:
placing an intelligent device with an image acquisition device on a display screen in a lighting state, wherein the image acquisition device faces the display screen;
constructing a detection model based on a multilayer convolutional neural network, and loading a weight parameter set;
acquiring an image through an image acquisition device facing to the display screen, inputting the image into a detection model based on a multilayer convolution neural network for reasoning, and giving classification information and a probability value of the image;
and regarding the given classification information and the probability value, if the probability value of the image belonging to the category placed on the display screen is the highest or exceeds a set threshold value, the intelligent device is considered to be placed on the display screen.
Preferably, constructing the detection model based on a multilayer convolutional neural network and loading the set of weight parameters specifically comprises:
acquiring, with the intelligent device, images when the device is placed on a display screen and images when it is not placed on a display screen, and labeling the images with category information;
screening the images;
constructing a multilayer convolutional neural network;
and inputting the images into the multilayer convolutional neural network, training the network until convergence, and obtaining the final set of weight parameters, which yields the final classification network model.
Preferably, acquiring images with the intelligent device placed on the display screen specifically comprises:
placing the intelligent device in different areas of the display screen while the displayed content changes continuously; varying the exposure and shutter parameters of the device's camera, the backlight brightness and motion compensation parameters of the display screen, and the ambient brightness; collecting sample images of the intelligent device placed on the display screen under these different conditions, ensuring sample diversity; and labeling the samples as the category placed on the display screen.
Preferably, acquiring images with the intelligent device not placed on the display screen specifically comprises:
placing the intelligent device on the surface of a common non-luminous object, on transparent glass, or on the surface of a common luminous object; varying the exposure and shutter parameters of the device's camera and the ambient brightness; collecting sample images of the intelligent device not placed on a display screen under these different conditions; and labeling the samples as the category not placed on the display screen.
Preferably, the screening of the images is specifically: judging whether the similarity of continuously shot images exceeds a set threshold, and if so, keeping only a set number of images from the same scene, ensuring a balance of images across the various scenes.
Preferably, the multilayer convolutional neural network comprises convolutional layers, depthwise separable convolutional layers, batch normalization layers, pooling layers, a global average pooling layer, and fully connected layers.
Preferably, the training specifically comprises the following steps:
s401, selecting an optimization method of a training model;
s402, setting hyper-parameters required by a training model;
s403, according to the characteristics of the training sample, data enhancement is carried out during model training; (ii) a
S404, selecting a softmax function to compute the final classification probability, where the softmax is calculated as:

$$\hat{y}_{ij} = \frac{e^{x_{ij}}}{\sum_{c=1}^{C} e^{x_{ic}}}$$

where $x_{ij}$ is the jth output of the last layer of the neural network for the ith sample, $C$ is the number of classes, and $\hat{y}_{ij}$ is the probability that the ith sample belongs to the jth class;
s405, selecting the difference between the predicted value and the true value of the cross entropy loss function measurement model, wherein the calculation formula of the cross entropy loss function is as follows:
Figure BDA0002311842180000033
wherein,
Figure BDA0002311842180000034
for loss value, n is the number of batches, yijA true tag indicating whether the ith sample belongs to the jth class, and if so, yij1, otherwise yijWhere 0, C is the number of categories,
Figure BDA0002311842180000035
is the probability that the ith sample belongs to the jth class;
S406, loading a pre-trained model;
and S407, training the whole model based on the training samples until convergence to obtain a final classification model.
Preferably, the data augmentation methods include random flipping, random cropping, and random hue variation.
Preferably, in step 3), the image may be transmitted over a network to a cloud server or other devices, where inference is performed with the multilayer convolutional neural network.
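The patent leaves the transport for this cloud variant unspecified; the sketch below shows one plausible way to ship a captured frame to a remote inference endpoint. The URL and the JSON response schema are hypothetical, not part of the patent:

```python
import requests

def classify_remotely(image_path: str,
                      url: str = "http://example.com/classify") -> dict:
    """Send a captured frame to a hypothetical inference endpoint.

    The server is assumed to run the multilayer CNN and to reply with
    JSON such as {"on_screen": 0.93, "not_on_screen": 0.07}; that
    schema is an illustration only.
    """
    with open(image_path, "rb") as f:
        resp = requests.post(url, files={"image": f}, timeout=5.0)
    resp.raise_for_status()
    return resp.json()
```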
Compared with the prior art, the invention has the following advantages:
1) low error recognition rate: identification with a multilayer convolutional neural network effectively reduces misidentification when multiple intelligent devices interfere with one another;
2) high detection precision: collecting varied samples at the source ensures sample diversity, which improves the training precision of the model and in turn the detection precision;
3) high detection speed: a lightweight multilayer convolutional neural network model preserves detection precision while improving detection speed.
Drawings
FIG. 1 is a diagram of the BasicBlock in the ShuffleNetV2 network structure;
FIG. 2 is a schematic diagram of the DownSampleBlock in the ShuffleNetV2 network structure;
FIG. 3 is a flow chart of a neural network model training step;
fig. 4 is a comparison of images of a display captured by a mobile phone camera at different distances, wherein S501 is an image captured with the phone farther from the display and S502 is an image captured with the phone close to the display.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of the present invention.
The invention discloses a method for placing detection equipment on a display screen based on a multilayer convolutional neural network, which specifically comprises the following steps:
1. the intelligent equipment is provided with an image acquisition device; the display screen is in a lit state; and when the intelligent equipment is placed on the display screen, the image acquisition device faces the display screen;
2. constructing a detection model based on a multilayer convolutional neural network, and loading a weight parameter set, wherein the steps are as follows:
a) the intelligent equipment is placed in different areas of the display screen while the displayed content changes continuously; parameters such as the exposure and shutter of the device's camera are varied; parameters such as the backlight brightness and motion compensation of the display screen are varied; and the ambient light brightness is varied, for example by shining a strong lamp at the display screen, turning off the lights, outdoor environments, and the like; sample images of the intelligent equipment placed on the display screen are collected under all of these conditions, ensuring sample diversity, and the samples are labeled as the category placed on the display screen;
b) with the intelligent device not placed on any display screen, it is left on no surface at all, or placed on the surface of a common non-luminous object, on transparent glass, on the surface of a common luminous object, etc.; parameters such as the camera's exposure and shutter are varied, and the ambient light brightness is changed, for example by shining a strong lamp, turning off the lights, outdoor environments, and the like; sample images of the intelligent device not placed on a display screen are collected under all of these conditions, ensuring sample diversity, and the samples are labeled as the category not placed on the display screen (a conventional folder layout for this labeling is sketched below);
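One conventional way to realize the labeling in a) and b), assumed here rather than taken from the patent, is a directory-per-class layout that common training frameworks (e.g. torchvision's ImageFolder) read directly; the folder names are illustrative:

```python
from pathlib import Path
from shutil import copy2

# Illustrative layout (folder names are assumptions):
# dataset/
#   on_screen/       <- samples collected as in a)
#   not_on_screen/   <- samples collected as in b)
def add_sample(src: str, on_screen: bool, root: str = "dataset") -> None:
    """File a collected sample image under its class folder."""
    cls = "on_screen" if on_screen else "not_on_screen"
    dst = Path(root) / cls
    dst.mkdir(parents=True, exist_ok=True)
    copy2(src, dst / Path(src).name)
```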
c) screening the sample pictures: for continuously shot images, judging empirically whether their similarity is high, and if so, keeping only a small number of images from the same scene; this keeps the various scenes balanced, reduces the number of sample images, and speeds up training (see the sketch after this item);
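The patent does not fix a similarity measure for this screening; a histogram-correlation filter over consecutively shot frames, as sketched below with OpenCV, is one plausible reading of c). The 0.98 threshold and the per-scene cap of 5 are invented values:

```python
import cv2

def screen_samples(paths, sim_threshold=0.98, keep_per_scene=5):
    """Drop near-duplicate consecutive shots, keeping at most
    keep_per_scene images per run of mutually similar frames."""
    kept, prev_hist, run = [], None, 0
    for p in sorted(paths):
        img = cv2.imread(p)
        hist = cv2.calcHist([img], [0, 1, 2], None, [8, 8, 8],
                            [0, 256, 0, 256, 0, 256])
        hist = cv2.normalize(hist, hist).flatten()
        similar = (prev_hist is not None and
                   cv2.compareHist(prev_hist, hist,
                                   cv2.HISTCMP_CORREL) > sim_threshold)
        run = run + 1 if similar else 0
        if run < keep_per_scene:
            kept.append(p)
        prev_hist = hist
    return kept
```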
d) a multilayer neural network is constructed; the network includes, but is not limited to, convolutional layers (Conv), depthwise separable convolutional layers (DWConv), batch normalization layers, pooling layers, global average pooling layers, fully connected layers, and the like. Table 1, fig. 1, and fig. 2 show ShuffleNetV2, a typical network structure that can run on an intelligent device: table 1 is the basic structure of ShuffleNetV2, fig. 1 is its basic building block, and fig. 2 is its downsampling block (a construction sketch follows Table 1).
TABLE 1: basic structure of ShuffleNetV2 (reproduced as an image in the original publication)
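Since Table 1 survives only as an image, the following sketch instead builds the same family of network from torchvision's stock ShuffleNetV2 x1.0 and swaps in a two-class head; the torchvision (>= 0.13) API and the 224x224 input size are assumptions, not specified by the patent:

```python
import torch
import torchvision

def build_detector(num_classes: int = 2, pretrained: bool = True) -> torch.nn.Module:
    """ShuffleNetV2 x1.0 backbone with a two-class head
    (placed on a display screen / not placed)."""
    weights = (torchvision.models.ShuffleNet_V2_X1_0_Weights.DEFAULT
               if pretrained else None)
    model = torchvision.models.shufflenet_v2_x1_0(weights=weights)
    model.fc = torch.nn.Linear(model.fc.in_features, num_classes)
    return model

model = build_detector()
dummy = torch.randn(1, 3, 224, 224)   # assumed input resolution
print(model(dummy).shape)             # torch.Size([1, 2])
```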
e) the images are input into the multilayer convolutional neural network; parameters such as the batch size, learning rate, and number of training iterations are set according to the resources of the training equipment; and the network is trained until convergence, yielding good accuracy and the final classification network model. The classification model may be trained from scratch, or fine-tuned from a model pre-trained on another data set, such as ImageNet; fine-tuning from a pre-trained model can speed up convergence. As shown in fig. 3, taking as an example training a ShuffleNetV2-based classification model on a machine with a 1080ti GPU, the training steps for the whole model are as follows (a consolidated code sketch follows step S407):
s401, selecting an optimization method of a training model, such as a batch random gradient descent method with momentum;
and S402, setting hyper-parameters required by the training model, wherein the batch processing number of the training is set to be 32, the initial learning rate is set to be 0.001, and the momentum is set to be 0.9. The training algebra of the model is 100 × N, wherein N is the number of training samples, the weight attenuation is set to be 4e-5, and the learning rate is attenuated by 10 times after every 30 × N training algebras;
and S403, in order to improve the generalization of the model and prevent overfitting, selecting the following data enhancement method during model training according to the characteristics of the training sample: random turning, random cutting, random tone variation and the like;
and S404, selecting a softmax function as the calculation of the final classification probability, wherein the calculation formula of the softmax is as follows:
Figure BDA0002311842180000052
wherein x isijIs the ith sample and the jth output of the last layer of the neural network, C is the number of classes,
Figure BDA0002311842180000061
is the probability that the ith sample belongs to the jth class;
s405, selecting the difference between the predicted value and the true value of the cross entropy loss function measurement model, wherein the calculation formula of the cross entropy loss function is as follows:
Figure BDA0002311842180000062
wherein,
Figure BDA0002311842180000063
for loss value, n is the number of batches, yijA true tag indicating whether the ith sample belongs to the jth class, and if so, yij1, otherwise yijWhere 0, C is the number of categories,
Figure BDA0002311842180000064
is the probability that the ith sample belongs to the jth class;
S406, optionally, loading a pre-trained model;
and S407, training the whole model based on the training samples until convergence to obtain a final classification model.
3. on the intelligent equipment, an image is acquired through the image acquisition device facing the display screen; the image is input into the detection model based on the multilayer convolutional neural network for inference; and classification information and a probability value for the image are given. Optionally, the image may be transmitted over a network to a cloud server or other devices, where inference is performed with the multilayer convolutional neural network;
4. given the classification information and probability value output by the detection model, if the probability of the image belonging to the category placed on the display screen is the highest or exceeds a set threshold, the intelligent device is considered to be placed on the display screen (see the sketch below).
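Steps 3 and 4 together as a hedged sketch: run the trained model on one captured frame and apply the highest-probability-or-threshold rule. The preprocessing, the 0.5 threshold, and the class index (with the ImageFolder layout above, sorted class names put not_on_screen at index 0 and on_screen at index 1) are assumptions:

```python
import torch
from PIL import Image
from torchvision import transforms

infer_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

@torch.no_grad()
def is_on_screen(model: torch.nn.Module, frame: Image.Image,
                 on_idx: int = 1, threshold: float = 0.5) -> bool:
    """Step 3: inference on one frame; step 4: placement decision."""
    model.eval()
    logits = model(infer_tf(frame).unsqueeze(0))
    probs = torch.softmax(logits, dim=1)[0]
    # Placed on screen if that class has the highest probability
    # or its probability exceeds the set threshold.
    return probs.argmax().item() == on_idx or probs[on_idx].item() > threshold

# Usage: frame = Image.open("capture.jpg"); is_on_screen(model, frame)
```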
Fig. 4 compares images of a display captured by a mobile phone camera at different distances: S501 is an image captured with the phone farther from the display, and S502 is an image captured with the phone close to the display. It can be seen that S502 exhibits obvious grid-like patterns.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A method for placing a detection device on a display screen based on a multilayer convolutional neural network is characterized by comprising the following steps:
placing an intelligent device with an image acquisition device on a display screen in a lit state, wherein the image acquisition device faces the display screen;
constructing a detection model based on a multilayer convolutional neural network, and loading a weight parameter set;
acquiring an image through the image acquisition device facing the display screen, inputting the image into the detection model based on the multilayer convolutional neural network for inference, and giving classification information and a probability value of the image;
and regarding the given classification information and the probability value, if the probability value of the image belonging to the category placed on the display screen is the highest or exceeds a set threshold value, the intelligent device is considered to be placed on the display screen.
2. The method according to claim 1, wherein the constructing a detection model based on a multilayer convolutional neural network and loading the set of weight parameters specifically comprises:
respectively acquiring, with the intelligent device, images when the device is placed on a display screen and images when it is not placed on the display screen, and labeling the images with category information;
screening the image;
constructing a multilayer convolutional neural network;
and inputting the image into a multilayer convolution neural network, training the multilayer convolution network until convergence, and acquiring a final weight parameter set to obtain a final classification network model.
3. The method according to claim 2, wherein the capturing of the image when placed on the display screen using the smart device is specifically:
the intelligent equipment is placed in different areas on the display screen, the display content of the display screen continuously changes, exposure and shutter parameters of a camera of the intelligent equipment are changed, backlight brightness and motion compensation parameters of the display screen are changed, ambient light brightness is changed, sample images of the intelligent equipment placed on the display screen under different conditions are collected, and the sample images are marked as types placed on the display screen.
4. The method according to claim 2, wherein the capturing of the image without being placed on the display screen using the smart device is specifically:
the intelligent device is placed on the surface of a common non-luminous object, transparent glass or the surface of the common luminous object, the exposure and shutter parameters of a camera of the intelligent device are changed, the brightness of the environment is changed, sample images of the intelligent device which are not placed on the display screen under different conditions are collected, and the sample images are marked as classes which are not placed on the display screen.
5. The method according to claim 2, wherein said screening of the images is in particular: and judging whether the similarity of the continuously shot images exceeds a set threshold, if so, only keeping the images in the same scene in a set number, and ensuring the balance of the images in various scenes.
6. The method of claim 2, wherein the multilayer convolutional neural network comprises convolutional layers, depthwise separable convolutional layers, batch normalization layers, pooling layers, global average pooling layers, and fully connected layers.
7. The method according to claim 2, characterized in that it comprises in particular the steps of:
s401, selecting an optimization method of a training model;
s402, setting hyper-parameters required by a training model;
s403, according to the characteristics of the training sample, data enhancement is carried out during model training;
and S404, selecting a softmax function as the calculation of the final classification probability, wherein the calculation formula of the softmax is as follows:
Figure FDA0002311842170000021
wherein x isijIs the jth output of the ith sample at the last layer of the neural network, C is the number of classes,
Figure FDA0002311842170000024
is the probability that the ith sample belongs to the jth class;
s405, selecting the difference between the predicted value and the true value of the cross entropy loss function measurement model, wherein the calculation formula of the cross entropy loss function is as follows:
Figure FDA0002311842170000022
wherein l is a loss value, n is a number of batches, yijA real label indicating whether the ith sample belongs to the jth class, and if so, yij1, otherwise yijWhere 0, C is the number of categories,
Figure FDA0002311842170000023
is the probability that the ith sample belongs to the jth class;
S406, loading a pre-trained model;
and S407, training the whole model based on the training samples until convergence to obtain a final classification model.
8. The method of claim 1, wherein the data enhancement method comprises random flipping, random cropping, and random tone variation.
9. The method of claim 1, wherein the detection model based on the multi-layer convolutional neural network can transmit the image to a cloud server, a smart device or other devices through a network, and the inference is performed by using the multi-layer convolutional neural network.
CN201911262034.7A | 2019-12-10 | 2019-12-10 | Method for placing detection equipment on display screen based on multilayer convolutional neural network | Active | CN112947872B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201911262034.7A | 2019-12-10 | 2019-12-10 | Method for placing detection equipment on display screen based on multilayer convolutional neural network

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201911262034.7A | 2019-12-10 | 2019-12-10 | Method for placing detection equipment on display screen based on multilayer convolutional neural network

Publications (2)

Publication Number | Publication Date
CN112947872A | 2021-06-11
CN112947872B (en) | 2022-10-04

Family

ID=76226073

Family Applications (1)

Application Number | Status | Priority Date | Filing Date | Title
CN201911262034.7A | Active (granted as CN112947872B) | 2019-12-10 | 2019-12-10 | Method for placing detection equipment on display screen based on multilayer convolutional neural network

Country Status (1)

Country | Link
CN | CN112947872B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN114693621A (en)* | 2022-03-21 | 2022-07-01 | 圣山集团有限公司 | Vision-based automatic cloth inspecting method for white ultra-fine denier fabric

Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2016066079A1 (en)* | 2014-10-30 | 2016-05-06 | 深圳市九洲电器有限公司 | Multi-screen interaction method and system
CN105573688A (en)* | 2014-10-10 | 2016-05-11 | 广州杰赛科技股份有限公司 | Multi-screen interoperation method based on image capture
CN105931217A (en)* | 2016-04-05 | 2016-09-07 | 李红伟 | Image processing technology-based airport pavement FOD (foreign object debris) detection method
CN106127204A (en)* | 2016-06-30 | 2016-11-16 | 华南理工大学 | A multi-direction meter reading region detection algorithm using fully convolutional neural networks
CN106383679A (en)* | 2016-08-31 | 2017-02-08 | 联想(北京)有限公司 | Locating method and terminal device using same
CN107085696A (en)* | 2016-10-15 | 2017-08-22 | 安徽百诚慧通科技有限公司 | A vehicle location and type identification method based on checkpoint images


Also Published As

Publication numberPublication date
CN112947872B (en)2022-10-04


Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
TA01 | Transfer of patent application right
Effective date of registration: 2021-10-13
Address after: 201901 room 118, building 20, No. 1-42, Lane 83, Hongxiang North Road, Lingang New District, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai
Applicant after: Shanghai Pinqi Technology Co., Ltd.
Address before: 201801 area B, 3rd floor, No. 7, Lane 1015, Boxue South Road, Jiading District, Shanghai
Applicant before: PQ LABS, Inc.
GR01 | Patent grant
