CN110619339A - Target detection method and device - Google Patents

Target detection method and device

Info

Publication number
CN110619339A
Authority
CN
China
Prior art keywords
output characteristic
characteristic data
output
processing
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810626961.1A
Other languages
Chinese (zh)
Other versions
CN110619339B (en)
Inventor
彭劲璋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xilinx Technology Beijing Ltd
Original Assignee
Beijing Shenjian Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenjian Intelligent Technology Co Ltd
Priority to CN201810626961.1A
Publication of CN110619339A
Application granted
Publication of CN110619339B
Legal status: Active (current)
Anticipated expiration

Links

Classifications

Landscapes

Abstract

A target detection method and apparatus are provided. The target detection method (100) comprises: inputting image data (S110); passing the input image data through a convolutional neural network to obtain output characteristic data of a target detection frame for a detection target (S120); performing approximation processing on the output characteristic data (S130); and outputting the approximated output characteristic data (S140). To address the prior-art problem of target detection frame jitter, the technique applies a constraint design to the network output characteristics within the image detection method itself, improving the stability of the displayed target detection frame without combining a tracking method. The technique is simple to implement and requires no extra computational cost.

Description

Target detection method and device
Technical Field
The present invention relates to image recognition, and more particularly, to a target detection method and apparatus.
Background
In recent years, with the development of deep learning, deep-learning methods have achieved great success in classification, target detection, and segmentation in computer vision. Target detection is a fundamental research direction of computer vision, and detection quality underpins many downstream tasks. Most existing detection methods train a neural network on a manually annotated image data set and then use it to predict results. Methods designed this way have a latent problem: in video tests, prediction errors between two adjacent frames cause the displayed target detection frame to jitter. The existing remedy for this problem is to add a tracking algorithm and combine detection with tracking. While this eliminates or reduces jitter errors, the added algorithm incurs extra computational overhead.
Disclosure of Invention
Embodiments of the present invention provide a target detection method and apparatus that address the detection-frame jitter described in the background by applying a constraint design to the network output characteristics within the image detection method itself, improving the stability of the displayed target detection frame without combining a tracking method. The method is simple to implement and requires no extra computational cost.
To achieve the object of the present invention, according to a first aspect of the present invention, there is provided an object detection method. The target detection method may include: inputting image data; enabling the input image data to pass through a convolutional neural network to obtain output characteristic data of a target detection frame of a detection target; carrying out approximate processing on the output characteristic data; and outputting the output characteristic data after the approximate processing.
However, in the above method, the approximation process may have an error. To further eliminate this error, preferably, the step of approximating the output characteristic data may further include: normalizing the output characteristic data based on the adjacent output characteristic data; and performing approximate processing on the normalized output characteristic data.
Preferably, the step of approximating comprises: approximating each value by keeping only a certain number of digits after the decimal point.
Further, the object detection method according to the first aspect of the present invention may further include: displaying the target detection frame on the image according to the outputted characteristic data.
In order to achieve the object of the present invention, according to a second aspect of the present invention, there is provided an object detection apparatus. The object detection apparatus may include: an input module for inputting image data; the network module is used for enabling the input image data to pass through a convolutional neural network to obtain output characteristic data of a target detection frame related to a detection target; the approximate processing module is used for carrying out approximate processing on the output characteristic data; and the output module is used for outputting the output characteristic data after the approximate processing.
Likewise, in order to further eliminate the error in the approximation process, preferably, the approximation processing module may further include: the normalization submodule is used for performing normalization processing on the output characteristic data based on the adjacent output characteristic data; and the normalization approximate processing submodule is used for carrying out approximate processing on the output characteristic data after normalization processing.
Preferably, the approximation processing module or the normalized approximation processing sub-module may be further configured to approximate each value by keeping only a certain number of digits after the decimal point.
In addition, the object detection apparatus according to the second aspect of the present invention may further include an object detection frame display module for displaying the object detection frame on the image according to the output feature data after the output.
To achieve the object of the present invention, according to a third aspect of the present invention, there is provided a computer readable medium for recording instructions executable by a processor, the instructions, when executed by the processor, causing the processor to perform an object detection method, comprising the operations of: inputting image data; enabling the input image data to pass through a convolutional neural network to obtain output characteristic data of a target detection frame of a detection target; carrying out approximate processing on the output characteristic data; and outputting the output characteristic data after the approximate processing.
Similarly, in order to further eliminate the error in the approximation process, preferably, the operation of approximating the output characteristic data may further include: normalizing the output characteristic data based on the adjacent output characteristic data; and performing approximate processing on the normalized output characteristic data.
Based on this target detection technology, the prior-art problem of detection-frame jitter is addressed without introducing a tracking algorithm or extra computing power: the prediction errors that cause jitter are eliminated by approximation processing alone, so the technology is simple to realize and effective. In addition, the normalization operation reduces the error of the approximation processing, so that the target detection technology achieves an even better anti-jitter effect.
Drawings
The invention is described below with reference to embodiments and the accompanying drawings.
Fig. 1 shows a flow chart of a target detection method according to an embodiment of the invention.
Fig. 2 shows detailed steps of the approximation processing in the object detection method of fig. 1.
Fig. 3 shows a schematic block diagram of an object detection arrangement according to an embodiment of the invention.
Fig. 4 illustrates a process of approximating a network output characteristic according to a specific embodiment of the present invention.
Fig. 5 illustrates a more detailed process for approximating a network output characteristic, according to a specific embodiment of the present invention.
Detailed Description
The drawings are only for purposes of illustration and are not to be construed as limiting the invention. The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Fig. 1 shows a flow chart of a target detection method according to an embodiment of the invention.
As shown in fig. 1, the object detection method 100 according to the present invention starts at step S110, where image data is input. Those skilled in the art will understand that the input image data is typically one frame of a dynamic video. The target detection method 100 is used to find a suitable target object in such a frame. The target object may be determined in advance or determined in the image according to some rule.
In step S120, the input image data is passed through a convolutional neural network to obtain feature data about the position of the detection target. The position of the target in the image is then predicted from this feature data and marked with a rectangular frame.
As mentioned above, one practical problem is this: between two adjacent frames of a dynamic video the target hardly changes, yet the predicted target detection frame jitters because of prediction error. For example, suppose a video displays 20 frames per second and the target detection frame of each frame is displayed. The detection frame then appears 20 times per second, and because of prediction error the displayed frame shifts by a small but perceptible amount between frames, producing a visually jittery and uncomfortable effect for the viewer.
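To make the numbers concrete, here is a small, purely illustrative simulation (the box coordinates, noise level, and frame rate are hypothetical, not taken from the patent): a stationary target whose per-frame box prediction carries a small random error, producing exactly the frame-to-frame displacement described above.

```python
import random

random.seed(0)
true_box = (100.0, 50.0, 40.0, 30.0)  # (x, y, w, h); the target never moves

def predict(box, noise=0.3):
    """Simulate one frame's network prediction with a small random error."""
    return tuple(v + random.uniform(-noise, noise) for v in box)

# 20 predictions = one second of video at 20 frames per second.
frames = [predict(true_box) for _ in range(20)]

# Frame-to-frame displacement of the box's x coordinate: small but nonzero,
# so the drawn detection frame visibly jitters although the target is still.
displacements = [abs(frames[i][0] - frames[i - 1][0]) for i in range(1, len(frames))]
print(max(displacements))  # nonzero, hence jitter
```

Even though every individual error is well under a pixel here, the box moves on every frame, which is the perceptible jitter the method aims to remove.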
To solve this problem, next, in step S130, approximation processing is performed on the output characteristic data.
Fig. 2 shows detailed steps of the approximation processing in the object detection method of fig. 1.
Specifically, the approximation of the output characteristic data described in step S130 may further comprise two sub-steps: in step S130a, normalization processing is performed on the output feature data based on the adjacent output feature data; then, in step S130b, approximation processing is performed on the normalized output feature data. The normalization operation reduces the error of the approximation, so that the target detection method achieves an even better anti-jitter effect.
More specifically, the approximation processing (S130 or S130b) referred to here means approximating each numerical value by keeping only a certain number of digits after the decimal point. For example, the precision of each value of the output characteristic data is kept only to the 3rd, 5th, or generally the nth digit after the decimal point.
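A minimal sketch of this approximation step, assuming the output characteristic data is a flat list of floats (the sample values below are hypothetical):

```python
def approximate(features, n=3):
    """Step S130: keep only the first n digits after the decimal point.

    Two nearly identical inputs whose feature values differ by less than
    the rounding resolution map to exactly the same output, removing the
    small prediction differences that make the detection frame jitter.
    """
    return [round(v, n) for v in features]

# Feature vectors from two adjacent, nearly identical frames:
frame_a = [12.3004, 45.7002, 88.1998, 63.4001]
frame_b = [12.3001, 45.6998, 88.2002, 63.3999]
print(approximate(frame_a) == approximate(frame_b))  # True: identical after rounding
```

Note that Python's `round` uses round-half-to-even on exact ties; for noisy feature data this edge case is practically irrelevant, and fixed-point truncation would serve the same purpose.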
Finally, in step S140, the output characteristic data after the approximation processing is output.
After step S140, the object detection method 100 of the present invention may further include: displaying an object detection frame on the image according to the outputted feature data (not shown in fig. 1). For the core purpose and core scheme of the present invention, however, the method may end at step S140.
Fig. 3 shows a schematic block diagram of an object detection arrangement according to an embodiment of the invention.
As shown in fig. 3, the object detection apparatus 300 according to the present invention may include: an input module 301, a network module 302, an approximation processing module 303 and an output module 304.
The input module 301 is used for inputting image data. Those skilled in the art will appreciate that the operation performed by the input module 301 corresponds to step S110 in fig. 1.
The network module 302 is configured to pass the input image data through a convolutional neural network to obtain output feature data of a target detection frame of a detection target. Those skilled in the art will appreciate that the operation performed by the network module 302 corresponds to step S120 in fig. 1.
The approximation module 303 is configured to perform approximation processing on the output feature data. Those skilled in the art will appreciate that the operation performed by the approximation processing module 303 corresponds to step S130 in fig. 1.
The approximation processing module 303 may further include a normalization sub-module 303a and a normalized approximation processing sub-module 303 b. The normalization submodule 303a is configured to perform normalization processing on the output feature data based on adjacent output feature data. The normalization approximate processing submodule 303b is configured to perform approximate processing on the normalized output feature data. It will be appreciated by those skilled in the art that the operations performed by the normalization submodule 303a and the normalization approximation submodule 303b correspond to steps S130a and S130b in fig. 2, respectively.
The approximation processing module 303, or more specifically the normalized approximation processing sub-module 303b, may be specifically configured to approximate each value by keeping only a certain number of digits after the decimal point.
The output module 304 is used for outputting the output characteristic data after the approximation processing.
Although not shown in fig. 3, the object detection apparatus according to the present invention may further include an object detection frame display module for displaying an object detection frame on the image according to the output feature data after the output.
Fig. 4 illustrates a process of approximating a network output characteristic according to a specific embodiment of the present invention. We may call the feature approximation process of fig. 4 the first stage of the present invention. Specifically, an input image is passed through the convolutional neural network to obtain output features, and approximation is performed by rounding the feature data to a certain number of decimal places. That is, each value of the feature data is approximated by keeping only a certain number of digits after the decimal point, as described above. Rounding at several decimal places does not affect the usefulness of the result, and it forces two extremely similar images to give the same output, thereby eliminating the root cause of detection-frame jitter. As shown in fig. 4, the four front-most data values are each approximated (the subsequent data are then processed in order), thereby obtaining the final network output.
Fig. 5 illustrates a more detailed process of approximating a network output characteristic according to a specific embodiment of the present invention. We may call the feature approximation process of fig. 5 the second stage of the present invention, which can be seen as a further improvement on the first stage. In practice, the first-stage approximation may introduce an error for each data value, and these errors may be inconsistent across values, identical or opposite in direction, which can amplify the overall error. The second stage therefore normalizes each value against its neighboring data before the approximation, reducing this aspect of the first-stage error. For example, as shown in fig. 5, for each square of four adjacent data values, a normalized value of the upper-left datum is computed from the values of its neighbors; the calculation proceeds by analogy until normalized values of all the data are obtained. Performing the approximation on the normalized values further reduces the approximation error, so that the target detection method achieves an even better anti-jitter effect.
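The exact normalization formula is not given in the text beyond "based on the values of the adjacent data"; the sketch below assumes, purely for illustration, that each value is divided by the mean of its 2x2 block (and that the feature map has even height and width) before the first stage's decimal rounding is applied.

```python
def normalize_then_round(feat, n=3):
    """Second-stage sketch: per-2x2-block normalization, then rounding.

    `feat` is a 2D list of floats with even height and width. Dividing
    each value by its block mean is an assumed reading of the patent's
    normalization "based on the adjacent output characteristic data".
    """
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            mean = sum(feat[i + di][j + dj] for di in (0, 1) for dj in (0, 1)) / 4.0
            for di in (0, 1):
                for dj in (0, 1):
                    # Normalize against the neighbourhood, then approximate.
                    out[i + di][j + dj] = round(feat[i + di][j + dj] / mean, n)
    return out

# Two near-identical 2x2 feature maps agree exactly after both stages:
a = [[1.0004, 2.0002], [3.0001, 4.0003]]
b = [[0.9998, 1.9999], [2.9997, 3.9999]]
print(normalize_then_round(a) == normalize_then_round(b))  # True
```

Because normalization puts neighboring values on a common scale before rounding, errors of opposite sign within a block are damped rather than amplified, which matches the motivation given for the second stage.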
Those skilled in the art will appreciate that the methods of the present invention may be implemented as computer programs. As described above in connection with figs. 1, 2, 4, and 5, the methods according to the above embodiments may be implemented as one or more programs including instructions that cause a computer or processor to perform the algorithms described in connection with the figures. These programs may be stored and provided to a computer or processor using various types of non-transitory computer-readable media, which include various types of tangible storage media. Examples of non-transitory computer-readable media include magnetic recording media (such as floppy disks, magnetic tapes, and hard disk drives), magneto-optical recording media (such as magneto-optical disks), CD-ROMs (compact disc read-only memories), CD-R, CD-R/W, and semiconductor memories (such as ROMs, PROMs (programmable ROMs), EPROMs (erasable PROMs), flash ROMs, and RAMs (random access memories)). Further, these programs can be provided to the computer using various types of transitory computer-readable media, examples of which include electric signals, optical signals, and electromagnetic waves. A transitory computer-readable medium can provide the program to the computer through a wired communication path, such as an electric wire or optical fiber, or through a wireless communication path.
Therefore, according to the present invention, it is also proposed a computer program or a computer readable medium for recording instructions executable by a processor, the instructions, when executed by the processor, causing the processor to perform an object detection method, comprising the operations of: inputting image data; enabling the input image data to pass through a convolutional neural network to obtain output characteristic data of a target detection frame of a detection target; carrying out approximate processing on the output characteristic data; and outputting the output characteristic data after the approximate processing.
In the above computer program or computer readable medium, more specifically, the operation of performing approximate processing on the output characteristic data may further include: normalizing the output characteristic data based on the adjacent output characteristic data; and performing approximate processing on the normalized output characteristic data.
Various embodiments and implementations of the present invention have been described above. However, the spirit and scope of the present invention are not limited thereto. Those skilled in the art will be able to devise many more applications in accordance with the teachings of the present invention, all of which are within its scope.
That is, the above examples are only intended to illustrate the present invention clearly and do not limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to exhaustively list all embodiments. Any modification, replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims.

Claims (10)

CN201810626961.1A | Priority date: 2018-06-19 | Filing date: 2018-06-19 | Target detection method and device | Active | Granted as CN110619339B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201810626961.1A (CN110619339B) | 2018-06-19 | 2018-06-19 | Target detection method and device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201810626961.1A (CN110619339B) | 2018-06-19 | 2018-06-19 | Target detection method and device

Publications (2)

Publication Number | Publication Date
CN110619339A | 2019-12-27
CN110619339B | 2022-07-15

Family

ID=68919890

Family Applications (1)

Application Number: CN201810626961.1A (Active; granted as CN110619339B)

Country Status (1)

Country | Link
CN | CN110619339B (en)


Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20140133718A1 (en)* | 2012-11-14 | 2014-05-15 | Varian Medical Systems, Inc. | Method and Apparatus Pertaining to Identifying Objects of Interest in a High-Energy Image
US20140169663A1 (en)* | 2012-12-19 | 2014-06-19 | Futurewei Technologies, Inc. | System and Method for Video Detection and Tracking
CN105264570A (en)* | 2013-06-14 | 2016-01-20 | Qualcomm Inc. | Tracker assisted image capture
US20140369555A1 (en)* | 2013-06-14 | 2014-12-18 | Qualcomm Incorporated | Tracker assisted image capture
WO2015089436A1 (en)* | 2013-12-13 | 2015-06-18 | Intel Corporation | Efficient facial landmark tracking using online shape regression method
US20150178943A1 (en)* | 2013-12-21 | 2015-06-25 | Qualcomm Incorporated | System and method to stabilize display of an object tracking box
CN105830430A (en)* | 2013-12-21 | 2016-08-03 | Qualcomm Inc. | Systems and methods to stabilize display of object tracking boxes
US20160203388A1 (en)* | 2015-01-13 | 2016-07-14 | Arris Enterprises, Inc. | Automatic detection of logos in video sequences
CN105184278A (en)* | 2015-09-30 | 2015-12-23 | Shenzhen SenseTime Technology Co., Ltd. | Human face detection method and device
CN205014978U (en)* | 2015-10-12 | 2016-02-03 | Nanjing University of Information Science and Technology | Digital spirit level based on MEMS gravity detection
CN105447459A (en)* | 2015-11-18 | 2016-03-30 | Shanghai Maritime University | Automatic UAV target detection and tracking method
CN106874825A (en)* | 2015-12-10 | 2017-06-20 | Spreadtrum Communications (Tianjin) Co., Ltd. | Face detection training method, detection method and device
CN107343145A (en)* | 2017-07-12 | 2017-11-10 | Shanghai Institute of Technical Physics, Chinese Academy of Sciences | Camera electronic image stabilization method based on robust feature points

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
HAO ZHU et al.: "Target-focused video stabilization for human computer interaction", 2017 29th Chinese Control and Decision Conference (CCDC), 17 July 2017, pages 7688-7693 *
HONG ZHANG et al.: "On The Stability of Video Detection and Tracking", Computer Vision and Pattern Recognition, 6 April 2017, pages 1-9 *
卫磊: "Research on stable detection technology for extended targets", China Master's Theses Full-text Database, Information Science and Technology, no. 2017, 15 December 2017, pages 138-437 *
陈慧杰: "Research on pedestrian detection and tracking methods in video surveillance", China Master's Theses Full-text Database, Information Science and Technology, no. 08, 15 August 2014, pages 138-1263 *

Also Published As

Publication numberPublication date
CN110619339B (en)2022-07-15

Similar Documents

Publication | Title
CN109117831B (en) | Training method and device of object detection network
JP6581068B2 (en) | Image processing apparatus, image processing method, program, operation control system, and vehicle
US11645735B2 (en) | Method and apparatus for processing image, device and computer readable storage medium
US10810721B2 (en) | Digital image defect identification and correction
JP6362333B2 (en) | Image processing apparatus, image processing method, and program
US11144817B2 (en) | Device and method for determining convolutional neural network model for database
US20190156499A1 (en) | Detection of humans in images using depth information
US20210342642A1 (en) | Machine learning training dataset optimization
CN110717919A (en) | Image processing method, device, medium and computing equipment
CN113436100A (en) | Method, apparatus, device, medium and product for repairing video
CN112597918B (en) | Text detection method and device, electronic equipment and storage medium
US9558534B2 (en) | Image processing apparatus, image processing method, and medium
CN109377508B (en) | Image processing method and device
US11156968B2 (en) | Adaptive control of negative learning for limited reconstruction capability auto encoder
CN111179276A (en) | Image processing method and device
CN113643260A (en) | Method, apparatus, device, medium and product for detecting image quality
CN111798376B (en) | Image recognition method, device, electronic equipment and storage medium
US20200319140A1 (en) | Automated analysis of analytical gels and blots
CN109035167A (en) | Method, apparatus, device and medium for processing multiple human faces in an image
CN113657317A (en) | Cargo location identification method, system, electronic device and storage medium
CN110390344A (en) | Alternative frame update method and device
CN113392241B (en) | Method, device, medium and electronic equipment for identifying definition of well logging images
CN110705633A (en) | Target object detection and target object detection model establishing method and device
CN110619339B (en) | Target detection method and device
CN113902001A (en) | Model training method and device, electronic equipment and storage medium

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
TA01 | Transfer of patent application right
    Effective date of registration: 2020-09-07
    Address after: Unit 01-19, 10/F, 101, 6/F, Building 5, Yard 5, Anding Road, Chaoyang District, Beijing 100029
    Applicant after: Xilinx Electronic Technology (Beijing) Co., Ltd.
    Address before: 17/F, Building 4, No. 1 Wangzhuang Road, Haidian District, Beijing 100083
    Applicant before: BEIJING DEEPHI INTELLIGENT TECHNOLOGY Co., Ltd.
GR01 | Patent grant
