CN108471543A


Info

Publication number: CN108471543A
Application number: CN201810200729.1A
Authority: CN
Inventors: 刘莹; 于珊
Original assignee: Beijing Sohu New Media Information Technology Co Ltd
Current assignee: Beijing Sohu New Media Information Technology Co Ltd
Priority date: 2018-03-12
Filing date: 2018-03-12
Publication date: 2018-08-31

Abstract

Embodiments of the present invention disclose a method and device for adding promotional information. Image recognition is performed on a video file to generate an image recognition result, which includes the products contained in the video file and the time points at which each product appears in it. Based on those time points, the optimal display time for each product's promotional information is determined, and each product is labeled by category, according to mainstream industry product categories, in the scene corresponding to that optimal display time. The promotional product selected by the user from the products contained in the video file is then obtained, and when playback reaches the scene where that product is labeled, its promotional information is added to that scene for display. The method and device can improve the application efficiency of video.

Description

Translated from Chinese

Method and device for adding promotional information

Technical field

The invention relates to the technical field of video recognition, and in particular to a method and device for adding promotional information.

Background

Image recognition refers to the use of computers to process, analyze, and understand images in order to identify targets and objects in a variety of patterns. As the technology has matured, its range of applications has grown. Image recognition can now accurately determine the category, location, and confidence of objects in a picture, but it has not yet been applied at scale to video. Because video images cannot be recognized at scale, video serves no purpose other than being watched, so the application efficiency of video is low.

Summary of the invention

In view of this, embodiments of the present invention provide a method and device for adding promotional information, which can improve the application efficiency of video.

To achieve the above purpose, embodiments of the present invention provide the following technical solutions:

A method for adding promotional information, including:

Performing image recognition on a video file to generate an image recognition result, the image recognition result including the products contained in the video file and the time points at which each product appears in the video file;

According to the time points at which each product appears in the video file, determining the optimal display time for each product's promotional information, and, according to mainstream industry product categories, labeling each product by category in the scene corresponding to the optimal display time of its promotional information;

Obtaining the promotional product selected by the user from the products contained in the video file, and, when playback of the video file reaches the scene where the promotional product is labeled, adding the promotional information of the promotional product to that scene for display.

Optionally, performing image recognition on the video file to generate an image recognition result includes:

Identifying the products contained in the images in the video file.

Optionally, before identifying the products contained in the images in the video file, the method further includes:

Performing deep learning on an image data set with the Google Inception V3 algorithm to obtain an image classification model.

Optionally, the process of performing deep learning on the image data set with the Google Inception V3 algorithm further includes:

Refining the deep learning model based on the Annotator Open Images image data set.

Optionally, identifying the products contained in the images in the video file specifically includes:

Extracting and saving the video key frames in the video file, based on the edge detection algorithm of the open source computer vision library OpenCV;

Filtering the video key frames with the RandomForest algorithm;

According to the image classification model, recognizing the filtered video key frames with the Detector SSD algorithm to determine the products they contain.

A device for adding promotional information, including:

An image recognition module, configured to perform image recognition on a video file to generate an image recognition result, the image recognition result including the products contained in the video file and the time points at which each product appears in the video file;

A screening and classification module, configured to determine, according to the time points at which each product appears in the video file, the optimal display time for each product's promotional information, and, according to mainstream industry product categories, to label each product by category in the scene corresponding to the optimal display time of its promotional information;

A promotional information delivery module, configured to obtain the promotional product selected by the user from the products contained in the video file, and, when playback of the video file reaches the scene where the promotional product is labeled, to add the promotional information of the promotional product to that scene for display.

Optionally, the image recognition module is specifically configured to:

Identify the products contained in the images in the video file.

Optionally, the device further includes:

An image classification model acquisition module, configured to perform deep learning on an image data set with the Google Inception V3 algorithm, before the products contained in the images in the video file are identified, to obtain an image classification model.

Optionally, the image classification model acquisition module is specifically configured to:

Refine the deep learning model based on the Annotator Open Images image data set during the deep learning performed on the image data set with the Google Inception V3 algorithm.

Based on the above technical solutions, embodiments of the present invention disclose a method and device for adding promotional information. Image recognition is performed on a video file to generate an image recognition result that includes the products contained in the video file and the time points at which each product appears in it. The optimal display time for each product's promotional information is determined from those time points, and each product is labeled by category, according to mainstream industry product categories, in the scene corresponding to that optimal display time. The promotional product selected by the user from the products contained in the video file is obtained, and when playback reaches the scene where that product is labeled, its promotional information is added to that scene for display. The method and device can improve the application efficiency of video.

Brief description of the drawings

To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of a method for adding promotional information provided by an embodiment of the present invention;

FIG. 2 is a schematic diagram of the Inception V3 module provided by an embodiment of the present invention;

FIG. 3 is a schematic diagram of the grid structure of Inception V3 provided by an embodiment of the present invention;

FIG. 4 is a schematic flowchart of a method for identifying the products contained in images in a video file provided by an embodiment of the present invention;

FIG. 5 is a schematic diagram of the basic structure of the OpenCV core provided by an embodiment of the present invention;

FIG. 6 is a schematic diagram of the principle of OpenCV-based video detection of moving objects provided by an embodiment of the present invention;

FIG. 7 is a schematic diagram of the SSD object detection method provided by an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of a device for adding promotional information disclosed in an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.

Refer to FIG. 1, a schematic flowchart of a method for adding promotional information provided by an embodiment of the present invention. The method includes the following steps:

Step S100: perform image recognition on a video file to generate an image recognition result, the image recognition result including the products contained in the video file and the time points at which each product appears in the video file;

This step includes identifying the products contained in the images in the video file. Before the products are identified, the step further includes performing deep learning on an image data set with the Google Inception V3 algorithm to obtain an image classification model. During that deep learning, the model is further refined based on the Annotator Open Images image data set.

Inception is a CNN model open-sourced by Google. Four versions have been released so far, each trained on data from the large image database ImageNet, so Google's Inception model can be used directly for image classification. This embodiment is based on the Inception V3 model, which has about 25 million parameters and uses about 5 billion multiply-add operations to classify one image, yet completes the classification almost instantly. The Inception V3 module is shown in FIG. 2, and the grid structure of Inception V3 is shown in FIG. 3.

Step S110: according to the time points at which each product appears in the video file, determine the optimal display time for each product's promotional information, and, according to mainstream industry product categories, label each product by category in the scene corresponding to the optimal display time of its promotional information;

The mainstream industry product categories comprise 28 product categories across 11 industries, as follows:

Automotive: [SUV], [MPV], [Sedan], [Sports car], [Other models]

Electronics and home appliances: [Mobile phones and accessories], [Household appliances], [Photographic equipment]

IT: [Computers], [Software]

Cosmetics: [Personal care products], [Makeup]

Daily necessities: [Cleaning products], [Other daily necessities]

Alcohol: [Beer], [Red wine], [Baijiu], [Fruit wine], [Other alcohol]

Food and beverage: [Food], [Beverages]

Pharmaceuticals: [Cold medicine], [Skin medicine]

Real estate: [Agency]

Catering: [Convenience store], [Restaurant]

Clothing and accessories: [Clothing], [Accessories]
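The category list above maps naturally onto a lookup table. The following is a minimal sketch of one possible representation; the names `CATEGORIES`, `INDUSTRY_OF`, and `industry_of` are illustrative assumptions and do not appear in the patent.

```python
from typing import Dict, List, Optional

# Industry -> product categories, transcribed from the list above.
CATEGORIES: Dict[str, List[str]] = {
    "Automotive": ["SUV", "MPV", "Sedan", "Sports car", "Other models"],
    "Electronics and home appliances": [
        "Mobile phones and accessories",
        "Household appliances",
        "Photographic equipment",
    ],
    "IT": ["Computers", "Software"],
    "Cosmetics": ["Personal care products", "Makeup"],
    "Daily necessities": ["Cleaning products", "Other daily necessities"],
    "Alcohol": ["Beer", "Red wine", "Baijiu", "Fruit wine", "Other alcohol"],
    "Food and beverage": ["Food", "Beverages"],
    "Pharmaceuticals": ["Cold medicine", "Skin medicine"],
    "Real estate": ["Agency"],
    "Catering": ["Convenience store", "Restaurant"],
    "Clothing and accessories": ["Clothing", "Accessories"],
}

# Reverse index: product category -> industry.
INDUSTRY_OF: Dict[str, str] = {
    category: industry
    for industry, categories in CATEGORIES.items()
    for category in categories
}


def industry_of(category: str) -> Optional[str]:
    """Return the industry a product category belongs to, or None if unknown."""
    return INDUSTRY_OF.get(category)
```

A recognized product whose detector label has been mapped to one of these category strings can then be tagged with its industry through `industry_of`.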

Step S120: obtain the promotional product selected by the user from the products contained in the video file, and, when playback of the video file reaches the scene where the promotional product is labeled, add the promotional information of the promotional product to that scene for display.

Once the promotional information has been added to the scene where the promotional product is labeled, users watching the video can be guided to click through and view it.

The promotional information may specifically be a creative overlay banner advertisement.
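Steps S110 and S120 can be sketched in a few lines. The patent does not say how the optimal display time is computed, so this sketch assumes it is the longest continuous run of appearance time points; that rule, the `max_gap` parameter, and the names `Annotation`, `longest_run`, `annotate`, and `banner_at` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Annotation:
    product: str
    category: str
    start: float  # start of the display window, in seconds
    end: float    # end of the display window, in seconds


def longest_run(times: List[float], max_gap: float = 1.0) -> Tuple[float, float]:
    """Return (start, end) of the longest run of time points whose
    consecutive gaps are at most max_gap seconds (assumed rule)."""
    if not times:
        raise ValueError("no appearance time points")
    ts = sorted(times)
    best = (ts[0], ts[0])
    run_start = prev = ts[0]
    for t in ts[1:]:
        if t - prev > max_gap:  # gap too large: a new run begins here
            run_start = t
        prev = t
        if prev - run_start > best[1] - best[0]:
            best = (run_start, prev)
    return best


def annotate(appearances: Dict[str, List[float]],
             category_of: Dict[str, str]) -> List[Annotation]:
    """S110: one category-labeled display window per recognized product."""
    return [
        Annotation(product, category_of[product], *longest_run(times))
        for product, times in appearances.items()
    ]


def banner_at(playback_time: float, annotations: List[Annotation],
              selected: str) -> Optional[Annotation]:
    """S120: the annotation to overlay at playback_time for the product
    the user selected, or None when no banner should be shown."""
    for a in annotations:
        if a.product == selected and a.start <= playback_time <= a.end:
            return a
    return None
```

A player would call `banner_at` on each time update and show the banner only while it returns an annotation.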

This embodiment discloses a method for adding promotional information. Image recognition is performed on a video file to generate an image recognition result that includes the products contained in the video file and the time points at which each product appears in it. The optimal display time for each product's promotional information is determined from those time points, and each product is labeled by category, according to mainstream industry product categories, in the scene corresponding to that optimal display time. The promotional product selected by the user from the products contained in the video file is obtained, and when playback reaches the scene where that product is labeled, its promotional information is added to that scene for display. The method can improve the application efficiency of video.

Refer to FIG. 4, a schematic flowchart of a method for identifying the products contained in images in a video file disclosed in an embodiment of the present invention. The method includes:

Step S200: extract and save the video key frames in the video file, based on the edge detection algorithm of the open source computer vision library OpenCV;

A frame is the smallest unit of an animation, a single still image, equivalent to one frame of motion picture film. On the timeline of animation software, a frame appears as a cell or a marker. A key frame, equivalent to a key drawing in two-dimensional animation, is the frame in which a key action in the movement or change of a character or object occurs.

OpenCV stands for Open Source Computer Vision Library. It is a cross-platform computer vision library released under the (open source) BSD license. It is highly portable and general-purpose, and runs on Linux, Windows, Mac OS, and other operating systems. It consists of many functions and a small number of classes, and to broaden its applicability it provides interfaces for languages such as Python, Ruby, and MATLAB. It implements many general-purpose algorithms for image processing and computer vision, and so supports image analysis and processing across a wide range of machine vision tasks. The basic structure of the OpenCV core is shown in FIG. 5.

In OpenCV, the main image format used is the IplImage structure.

Moving object detection is the first stage of detecting and tracking moving objects in video: moving objects in the monitored scene are detected in real time and extracted. Four methods are commonly used: consecutive inter-frame differencing, background subtraction, optical flow, and motion energy. OpenCV-based video detection of moving objects relies mainly on characteristic information about the target object, such as its contour, color, or shape, and uses that information to separate the moving target from a complex background image. FIG. 6 shows the principle of OpenCV-based video detection of moving objects according to an embodiment of the present invention.

Extracting a target object from an image is essentially a matter of detecting the object's contour and then segmenting it. The extraction process as a whole amounts to exposing the differences between successive frames.
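The idea that key-frame extraction exposes differences between frames can be illustrated with a small sketch. It substitutes plain frame differencing on grayscale pixel grids for the OpenCV edge detection named in step S200, and the `threshold` value and function names are assumptions chosen for illustration. In practice the frames would be decoded from the video, for example with OpenCV's `cv2.VideoCapture`.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # grayscale image as rows of 0-255 values


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two same-sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def select_key_frames(frames: List[Frame], threshold: float = 20.0) -> List[int]:
    """Return indices of frames that differ from the most recently kept
    key frame by more than threshold. The first frame is always kept."""
    if not frames:
        return []
    keys = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[keys[-1]], frames[i]) > threshold:
            keys.append(i)
    return keys
```

Comparing against the last kept key frame, rather than the immediately preceding frame, keeps slow gradual changes from being missed.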

Step S210: filter the video key frames with the RandomForest algorithm;

The Random Forest algorithm filters and cleans the key frames. In machine learning, a random forest is a classifier comprising multiple decision trees, whose output class is the mode of the classes output by the individual trees.

The random forest algorithm proceeds roughly as follows:

1) Select n samples from the sample set by random sampling with replacement;

2) Randomly select k features from all features, and build a decision tree on the selected samples using those features (usually CART, though other tree types or a mixture may be used);

3) Repeat the two steps above m times to generate m decision trees, which form the random forest;

4) For new data, let every tree make a decision, then vote to determine the final class.
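The four steps above can be sketched in pure Python. To stay short, this sketch uses one-level decision stumps in place of full CART trees, and it assumes each key frame has already been reduced to a numeric feature vector with a keep/drop label; the patent specifies neither the features nor the labels, so those and all function names are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label, i.e. the mode used for voting."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(X, y, features):
    """Find the (feature, threshold) split among `features` with the fewest
    training errors. Returns (feature, threshold, left_label, right_label)."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in features:
        for thr in sorted({x[f] for x in X}):
            left = [lab for x, lab in zip(X, y) if x[f] <= thr]
            right = [lab for x, lab in zip(X, y) if x[f] > thr]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    if best is None:  # no valid split: always predict the majority label
        lab = majority(y)
        return (features[0], float("inf"), lab, lab)
    return best[1:]


def train_forest(X, y, m=25, k=1, seed=0):
    """Steps 1-3: m stumps, each on a bootstrap sample and k random features."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(m):
        idx = [rng.randrange(n) for _ in range(n)]  # step 1: sample with replacement
        feats = rng.sample(range(d), k)             # step 2: choose k features
        forest.append(train_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, x):
    """Step 4: every tree votes and the mode of the votes wins."""
    votes = [l_lab if x[f] <= thr else r_lab for f, thr, l_lab, r_lab in forest]
    return majority(votes)
```

With hypothetical features such as sharpness and contrast, `predict` would decide whether a key frame is kept for recognition or dropped.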

Step S220: according to the image classification model, recognize the filtered video key frames with the Detector SSD algorithm to determine the products they contain.

Detector SSD recognizes and classifies the key frame images. SSD is a regression-based object detection method built on a deep convolutional neural network. FIG. 7 is a schematic diagram of the SSD object detection method according to an embodiment of the present invention. As shown in FIG. 7, when the SSD network convolves the input image, it evaluates a small set of default boxes with different aspect ratios at each location of feature maps of size 8x8 or 4x4. For each default box, it predicts the shape offsets and the confidences for all object categories. During training, the default boxes are first matched to the ground truth boxes. For example, two default boxes are matched to the cat and the dog; these boxes are treated as positive and the rest as negative. The model loss is a weighted sum of the localization loss and the confidence loss.
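For reference, the weighted sum described above takes the following form in the original SSD formulation. The symbols are not defined in this patent and are stated here as assumptions: $N$ is the number of matched (positive) default boxes, $x$ the match indicators, $c$ the class confidences, $l$ the predicted boxes, $g$ the ground truth boxes, and $\alpha$ the weight on the localization term.

```latex
L(x, c, l, g) = \frac{1}{N}\left( L_{conf}(x, c) + \alpha \, L_{loc}(x, l, g) \right)
```

When no default box is matched ($N = 0$), the loss is taken to be 0.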

The SSD method is based on a feed-forward convolutional neural network that produces a fixed-size collection of bounding boxes and scores for the object categories present in those boxes, followed by a non-maximum suppression step that produces the final detections.
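The non-maximum suppression step can be sketched as follows. This is a minimal, class-agnostic version in pure Python; the box layout `(x1, y1, x2, y2)` and the 0.5 overlap threshold are assumptions, and a real detector would typically apply the step per object category.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), x1 < x2 and y1 < y2


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes to keep, highest score first. A box is
    dropped when it overlaps an already kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

Of two heavily overlapping detections of the same product, only the higher-scoring one survives, while a distant detection is unaffected.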

SSD combines the RPN scoring mechanism of Faster R-CNN with the regression idea of YOLO, regressing on multi-scale regional features at every position of the whole image. It is not only fast at detection but also greatly improves the accuracy of bounding box prediction.

Refer to FIG. 8, a schematic structural diagram of a device for adding promotional information disclosed in an embodiment of the present invention. The device includes:

An image recognition module 10, configured to perform image recognition on a video file to generate an image recognition result, the image recognition result including the products contained in the video file and the time points at which each product appears in the video file;

A screening and classification module 11, configured to determine, according to the time points at which each product appears in the video file, the optimal display time for each product's promotional information, and, according to mainstream industry product categories, to label each product by category in the scene corresponding to the optimal display time of its promotional information;

A promotional information delivery module 12, configured to obtain the promotional product selected by the user from the products contained in the video file, and, when playback of the video file reaches the scene where the promotional product is labeled, to add the promotional information of the promotional product to that scene for display.

Optionally, the device is further configured to:

Filter the video key frames with the Random Forest algorithm;

In summary:

Embodiments of the present invention disclose a method and device for adding promotional information. Image recognition is performed on a video file to generate an image recognition result that includes the products contained in the video file and the time points at which each product appears in it. The optimal display time for each product's promotional information is determined from those time points, and each product is labeled by category, according to mainstream industry product categories, in the scene corresponding to that optimal display time. The promotional product selected by the user from the products contained in the video file is obtained, and when playback reaches the scene where that product is labeled, its promotional information is added to that scene for display. The method and device can improve the application efficiency of video.

The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the others, and the parts that are the same or similar can be cross-referenced between embodiments. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, it is described only briefly; see the description of the method for the relevant details.

Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate this interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms according to their functions. Whether these functions are carried out in hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each specific application, but such implementations should not be regarded as going beyond the scope of the present invention.

The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The above description of the disclosed embodiments enables a person skilled in the art to make or use the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. The present invention is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.