CN107683469A

Info

Publication number: CN107683469A
Application number: CN201580001265.6A
Authority: CN
Inventors: 樊春玲; 张巍; 姜青山
Original assignee: Shenzhen Institute of Advanced Technology of CAS
Current assignee: Shenzhen Institute of Advanced Technology of CAS
Priority date: 2015-12-30
Filing date: 2015-12-30
Publication date: 2018-02-09
Also published as: WO2017113232A1

Abstract

A product classification method and device based on deep learning. The method includes the following steps: extracting text features of a product from the product's description text; extracting image features of the product from the product's image based on a pre-trained convolutional neural network model; fusing the text features and the image features of the product to obtain feature information of the product; and processing the feature information with a pre-trained product classification model to obtain a classification result for the product. Because the scheme considers both the text features and the image features of the product to be classified, classification accuracy is improved compared with classifying products based on their text information alone.

Description

A product classification method and device based on deep learning

Technical field

The invention relates to the technical field of pattern recognition, and in particular to a product classification method and device based on deep learning.

Background

With the rapid development of e-commerce, online shopping has gradually become an everyday activity for internet users. Online products are numerous and varied, and e-commerce websites must invest considerable effort in item management to provide users with a good shopping experience. Product classification is the primary problem of item management, yet at present it relies mainly on manual labeling of product categories. Most existing intelligent classification methods classify products using their text information. However, text cannot fully describe the content of a picture, and when the text description is inaccurate the product is misclassified, after which considerable labor is needed to correct the category. The classification accuracy of existing product classification methods is therefore poor.

Summary of the invention

An embodiment of the present invention provides a product classification method based on deep learning, which solves the technical problem in the prior art that product classification based on product text information has poor accuracy. The product classification method includes:

extracting text features of a product from the description text of the product;

extracting image features of the product from an image of the product based on a pre-trained convolutional neural network model;

fusing the text features of the product with the image features of the product to obtain feature information of the product;

processing the feature information of the product based on a pre-trained product classification model to obtain a classification result for the product.

An embodiment of the present invention also provides a product classification device based on deep learning, which solves the technical problem in the prior art that product classification based on product text information has poor accuracy. The product classification device includes:

a text feature extraction module, used to extract text features of a product from the description text of the product;

an image feature extraction module, used to extract image features of the product from an image of the product based on a pre-trained convolutional neural network model;

a feature information acquisition module, used to fuse the text features of the product with the image features of the product to obtain feature information of the product;

a classification module, used to process the feature information of the product based on a pre-trained product classification model to obtain a classification result for the product.

In the embodiments of the present invention, the text features and image features of a product are extracted and then fused to obtain the feature information of the product, and this feature information is used to classify the product and obtain a classification result. Because both the text features and the image features of the product to be classified are taken into account, classification accuracy is improved compared with classifying products based on their text information alone.

Brief description of the drawings

The drawings described here provide a further understanding of the present invention and form part of this application; they do not limit the present invention. In the drawings:

Fig. 1 is a flow chart of a product classification method based on deep learning provided by an embodiment of the present invention;

Fig. 2 is a flow chart of a text feature extraction method provided by an embodiment of the present invention;

Fig. 3 is a schematic diagram of a pre-training network provided by an embodiment of the present invention;

Fig. 4 is a schematic flow chart of an image feature extraction method provided by an embodiment of the present invention;

Fig. 5 is a flow chart of model training and product prediction provided by an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a product classification device based on deep learning provided by an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of a text feature extraction module provided by an embodiment of the present invention.

Detailed description

To make the purpose, technical solution, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the embodiments and the accompanying drawings. The exemplary embodiments and their descriptions serve to explain the present invention and do not limit it.

Existing product classification methods use only the text information of a product. If the text description is inaccurate, the product is misclassified, considerable labor is needed to correct the category, and classification accuracy is poor. Using the text information and the image information of a product together addresses this problem. On this basis, the present invention proposes a product classification method and device based on deep learning.

Fig. 1 is a flow chart of a product classification method based on deep learning provided by an embodiment of the present invention. As shown in Fig. 1, the method includes:

This embodiment classifies Internet products into C categories, such as knitwear, T-shirts, coats, trousers, shirts, dresses, shoulder bags, single shoes, business bags, and boots, with 500 products in each category.

The description text of a product is the text used to describe the product to be classified, including words, symbols, numbers, and so on. The description text can be stored in a product text document, so that each product text corresponds to one product text document.

Step 101: Extract the text features of the product from the description text of the product. The specific process is shown in Fig. 2 and includes:

Given an input product p_j and its text Text, apply the text feature extraction method to extract the corresponding text features, obtaining T_j.

Step 1: Segment the product description text into words to obtain candidate words.

Each product's information is treated as a document, which is first segmented into a sequence of words. The present invention performs Chinese word segmentation with ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System), the Chinese lexical analysis system based on a multi-layer hidden Markov model developed by the Institute of Computing Technology of the Chinese Academy of Sciences; its segmentation accuracy reaches 98.45%.

Step 2: Select product feature words from the candidate words according to a preset evaluation function.

The present invention uses five evaluation functions for feature extraction: the term frequency function, the document frequency function, the information gain function, the mutual information function, and the chi-square test function. Any one of the five may be used, or several in combination; preferably, all five are used to obtain five sets of product feature words, which are then used together.

1) Term frequency function (Term Frequency, TF):

Count the number of times each candidate word occurs in the sample documents, and take the candidate words whose count is greater than or equal to a count threshold as product feature words.

Specifically, first traverse all candidate words (useful words) and count the occurrences of each in the sample documents. Set a threshold (for example, 10), delete the words whose count falls below the threshold, since they contribute little to classification, and select the remaining words as product feature words.
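The term frequency filter can be sketched as follows. This is a minimal illustration rather than the patent's implementation: it assumes each sample document is already a list of segmented words, and the function name `select_by_term_frequency` and the toy documents are hypothetical.

```python
from collections import Counter

def select_by_term_frequency(documents, min_count=10):
    """Return candidate words whose total count over all documents is >= min_count."""
    # Count every occurrence of every word across the whole sample set.
    counts = Counter(word for doc in documents for word in doc)
    return {word for word, count in counts.items() if count >= min_count}

# Toy sample set (hypothetical); a small threshold is used so the effect is visible.
docs = [["cotton", "shirt", "cotton"], ["cotton", "dress"], ["shirt", "sale"]]
tf_features = select_by_term_frequency(docs, min_count=2)
```

With the example threshold of 10 from the text, only words that occur at least ten times in the sample set would be kept.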

2) Document frequency function (Document Frequency, DF):

Compute the proportion of sample documents that contain each candidate word, and take the candidate words whose proportion lies within a preset range as product feature words.

Specifically, the document frequency P_t of each candidate word (useful word) t is calculated according to formula (1):

P_t = n_t / n    (1)

where n_t is the number of sample documents containing the candidate word (useful word) t, and n is the total number of sample documents.

Set a document frequency range for feature words (for example, (0.005, 0.08)), and select the candidate words (useful words) t whose document frequency falls within this range as product feature words.
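A small sketch of the document frequency filter follows. It assumes documents are lists of segmented words and treats the range as an open interval, which the text does not state explicitly; the function name is hypothetical.

```python
def select_by_document_frequency(documents, low=0.005, high=0.08):
    """Return words whose document frequency n_t / n lies strictly between low and high."""
    n = len(documents)
    doc_counts = {}
    for doc in documents:
        # set(doc) ensures a word is counted once per document.
        for word in set(doc):
            doc_counts[word] = doc_counts.get(word, 0) + 1
    return {word for word, n_t in doc_counts.items() if low < n_t / n < high}

# Toy sample set (hypothetical) with a wide range so that something is selected.
docs = [["shirt", "cotton"], ["shirt", "cotton"], ["shirt", "sale"], ["shirt"]]
df_features = select_by_document_frequency(docs, low=0.3, high=0.8)
```

Here "shirt" appears in every document and "sale" in only a quarter of them, so both fall outside the range and only "cotton" is kept.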

3) Information gain function (Information Gain, IG):

Compute the information gain weight of each candidate word, and take the candidate words whose information gain weight is greater than an information gain threshold as product feature words.

Specifically, the information gain weight of each candidate word (useful word) t is calculated according to formula (2):

IG(t) = -Σ_{i=1..m} P(C_i) log P(C_i) + P(t) Σ_{i=1..m} P(C_i|t) log P(C_i|t) + P(¬t) Σ_{i=1..m} P(C_i|¬t) log P(C_i|¬t)    (2)

where t is the candidate word (useful word), C is the document category, m is the number of categories, P(C_i) is the probability that a document of category C_i appears in the training sample set, P(t) is the probability that a document in the training sample set contains the term t, P(C_i|t) is the conditional probability that a document belongs to category C_i given that it contains the term t, P(¬t) is the probability that a document in the training sample set does not contain the term t, and P(C_i|¬t) is the conditional probability that a document belongs to category C_i given that it does not contain the term t.

After the weights are computed, set a threshold (for example, 0.006) and select the useful words whose weight is greater than the threshold as product feature words.
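The information gain weight can be illustrated with the standard formulation below. This sketch assumes the natural logarithm (the text does not give a base), treats 0·log 0 as 0, and uses hypothetical names and toy data.

```python
import math
from collections import Counter

def information_gain(term, documents, labels):
    """Information gain of `term` with respect to the document categories."""
    n = len(documents)
    classes = sorted(set(labels))
    has_term = [term in doc for doc in documents]
    n_t = sum(has_term)
    p_t = n_t / n

    class_count = Counter(labels)
    with_t = Counter(lab for lab, h in zip(labels, has_term) if h)
    without_t = Counter(lab for lab, h in zip(labels, has_term) if not h)

    def plogp(p):
        # Convention: 0 * log(0) is taken as 0.
        return p * math.log(p) if p > 0 else 0.0

    ig = -sum(plogp(class_count[c] / n) for c in classes)
    if n_t > 0:
        ig += p_t * sum(plogp(with_t[c] / n_t) for c in classes)
    if n_t < n:
        ig += (1 - p_t) * sum(plogp(without_t[c] / (n - n_t)) for c in classes)
    return ig

# Toy data (hypothetical): "shirt" perfectly separates the two categories,
# while "sale" appears in every document and carries no information.
docs = [["shirt", "sale"], ["shirt", "sale"], ["boot", "sale"], ["boot", "sale"]]
labels = ["tops", "tops", "shoes", "shoes"]
ig_shirt = information_gain("shirt", docs, labels)
ig_sale = information_gain("sale", docs, labels)
```

A term that perfectly predicts the category attains the full class entropy, while a term present in every document scores zero and would be discarded by any positive threshold.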

4) Mutual information function (Mutual Information, MI):

Compute the mutual information value of each candidate word, and take the candidate words whose mutual information value is greater than a mutual information threshold as product feature words.

Specifically, the mutual information value between each candidate word (useful word) t_k and each category C_i is calculated according to formula (3) or (4):

MI(t_k, C_i) = log [ P(t_k, C_i) / (P(t_k) P(C_i)) ]    (3)

which can also be expressed as

MI(t_k, C_i) = log P(t_k|C_i) - log P(t_k)    (4)

where P(t_k, C_i) is the probability that feature t_k and category C_i occur together in the training sample set, P(t_k) is the probability that t_k appears in the whole training sample set, P(C_i) is the probability that a sample document of category C_i appears in the whole training sample set, and P(t_k|C_i) is the conditional probability that t_k appears in sample documents of category C_i.

From the calculated mutual information values, the useful words whose value is greater than the threshold 1.54 are selected as feature words.
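The mutual information value of formula (3) can be estimated from document counts as in the sketch below. It assumes the natural logarithm and document-level presence of a word; the function name and toy data are hypothetical.

```python
import math

def mutual_information(term, category, documents, labels):
    """MI(t_k, C_i) = log( P(t_k, C_i) / (P(t_k) * P(C_i)) ), estimated from counts."""
    n = len(documents)
    n_t = sum(term in doc for doc in documents)
    n_c = sum(label == category for label in labels)
    n_tc = sum(term in doc and label == category
               for doc, label in zip(documents, labels))
    if n_tc == 0:
        # The term never occurs in this category; the logarithm diverges to -infinity.
        return float("-inf")
    return math.log((n_tc / n) / ((n_t / n) * (n_c / n)))

# Toy data (hypothetical).
docs = [["shirt"], ["shirt"], ["boot"], ["boot"]]
labels = ["tops", "tops", "shoes", "shoes"]
mi = mutual_information("shirt", "tops", docs, labels)

# Formula (4) gives the same value: log P(t|C) - log P(t).
mi_alt = math.log(2 / 2) - math.log(2 / 4)
```

The two forms agree because P(t_k, C_i) / P(C_i) = P(t_k|C_i).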

5) Chi-square test function (Chi-square, CHI):

Compute the correlation between each candidate word and the preset categories, and take the candidate words whose correlation is greater than a correlation threshold as product feature words.

Specifically, the correlation between each candidate word (useful word) t_k and each category C_i is calculated according to formula (5), and its value is defined as

χ²(t_k, C_i) = n [P(t_k, C_i) P(¬t_k, ¬C_i) - P(t_k, ¬C_i) P(¬t_k, C_i)]^2 / [P(t_k) P(C_i) P(¬t_k) P(¬C_i)]    (5)

where n is the number of sample documents in the training sample set, P(t_k, C_i) is the probability of a sample document in which feature t_k appears and which belongs to category C_i, P(¬t_k, ¬C_i) is the probability of a sample document in which feature t_k does not appear and which does not belong to category C_i, P(t_k, ¬C_i) is the probability of a sample document in which feature t_k appears and which does not belong to category C_i, and P(¬t_k, C_i) is the probability of a sample document in which feature t_k does not appear and which belongs to category C_i.

Set a correlation threshold (for example, 10), and select the useful words whose value is greater than the threshold as feature words.
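The chi-square statistic can be sketched from a 2×2 contingency table of term presence against category membership. The version below uses the standard text-categorization form of the statistic, which is an assumption about the exact shape of formula (5); names and data are hypothetical.

```python
def chi_square(term, category, documents, labels):
    """Chi-square association between a term and a category, from document counts."""
    n = len(documents)
    a = b = c = d = 0
    for doc, label in zip(documents, labels):
        present, in_cat = term in doc, label == category
        if present and in_cat:
            a += 1          # term present, in category
        elif present:
            b += 1          # term present, not in category
        elif in_cat:
            c += 1          # term absent, in category
        else:
            d += 1          # term absent, not in category
    p_tc, p_t_nc, p_nt_c, p_nt_nc = a / n, b / n, c / n, d / n
    p_t, p_c = (a + b) / n, (a + c) / n
    denom = p_t * p_c * (1 - p_t) * (1 - p_c)
    if denom == 0:
        # Term or category is constant over the sample set: no measurable association.
        return 0.0
    return n * (p_tc * p_nt_nc - p_t_nc * p_nt_c) ** 2 / denom

# Toy data (hypothetical).
docs = [["shirt", "red"], ["shirt"], ["boot", "red"], ["boot"]]
labels = ["tops", "tops", "shoes", "shoes"]
chi_shirt = chi_square("shirt", "tops", docs, labels)  # perfectly associated
chi_red = chi_square("red", "tops", docs, labels)      # independent of category
```

For a perfectly associated term the statistic equals the number of documents, which is why a threshold such as 10 is only meaningful on a reasonably large sample set.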

Methods 1) to 5) above produce five sets of product feature words, corresponding to five kinds of product text features. This significantly improves the ability of the text features to describe the product to be classified, and thus the classification accuracy.

In a specific implementation, before Step 2 the method further includes: filtering out the candidate words that appear in a preset stop-word list.

The candidate words may include characters or words (stop words) that interfere with classification and carry no value for it, such as modal particles and auxiliary words. A stop-word list containing such words is therefore prepared in advance, and candidate words found in the list are filtered out. This avoids unnecessary computation and shortens the time needed for product classification.
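Stop-word filtering amounts to a set-membership test, as in this short sketch. The stop words shown are illustrative examples of particles, not the list used in the patent.

```python
def remove_stop_words(candidates, stop_words):
    """Drop candidate words that appear in the stop-word list, preserving order."""
    return [word for word in candidates if word not in stop_words]

# Hypothetical stop-word list containing common Chinese particles.
stop_words = {"的", "了", "吗"}
candidates = ["纯棉", "的", "衬衫", "了"]
filtered = remove_stop_words(candidates, stop_words)
```

Storing the list as a set keeps each lookup at constant time, which matters when every candidate word of every document is checked.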

Step 3: Determine the weight of each product feature word from the frequency with which it occurs in a sample document, the total number of sample documents, and the number of sample documents containing it.

Specifically, after product feature words have been selected by the five methods above, the weight of each product feature word in each set is calculated according to formula (6):

W_i = TF_i(t, d) × n / DF(t)    (6)

where W_i is the weight of the i-th product feature word, TF_i(t, d) is the frequency of product feature word t in document d, n is the number of documents, and DF(t) is the number of documents containing product feature word t.

Step 4: Generate the product text features of the product to be classified from the product feature word weights.

Specifically, once the weight of every product feature word obtained by each method has been calculated according to formula (6), the description text of each product can be converted into a vector whose dimensions are the product feature words and whose value in each dimension is the weight of the corresponding feature word. Each method yields one vector, that is, one product text feature. For one product text, methods 1) to 5) therefore yield five vectors, that is, five product text features, which together form the product text features of the product to be classified. Using five product text features improves the accuracy of product classification.
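Formula (6) and the conversion to a vector can be sketched as follows. The sketch assumes TF is the raw count of the word in the document (the text says only "frequency") and that a fixed, ordered list of feature words defines the vector dimensions; all names are hypothetical.

```python
from collections import Counter

def text_feature_vector(doc, feature_words, documents):
    """Weight vector with W = TF(t, d) * n / DF(t) for each feature word, in order."""
    n = len(documents)
    counts = Counter(doc)
    vector = []
    for word in feature_words:
        df = sum(word in d for d in documents)   # documents containing the word
        vector.append(counts[word] * n / df if df else 0.0)
    return vector

# Toy sample set (hypothetical).
docs = [["cotton", "shirt", "cotton"], ["cotton", "dress"], ["shirt", "sale"]]
feature_words = ["cotton", "shirt"]
vec = text_feature_vector(docs[0], feature_words, docs)
```

Running this once per feature-word set (TF, DF, IG, MI, CHI) gives the five vectors per product that the text describes.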

Step 102: Based on the pre-trained convolutional neural network model, extract the image features of the product from the image of the product.

In recent years, deep learning has performed outstandingly in image classification; convolutional neural networks in particular learn image features automatically, and the extracted features are stable and reliable. This embodiment collects picture information for ten categories of Internet products for classification: knitwear, T-shirts, coats, trousers, shirts, dresses, shoulder bags, single shoes, business bags, and boots, with 300 products in each category. Each product comprises one textual description and one picture, and this embodiment uses the pre-trained convolutional neural network to learn the product's image features automatically.

A product image is an image that contains the product to be classified. Color features (such as a color histogram), texture features, or shape features of the product image can be extracted as product image features.

Specifically, this embodiment first adopts a convolutional neural network model. Because the network has a very large number of parameters and needs a large amount of training data, data augmentation (image augmentation) is necessary. The augmentation used in this embodiment is as follows: first, each product image is scaled proportionally so that its short side is 256 pixels; next, the image is flipped; finally, illumination noise is added at random, randomly changing the contrast, brightness, and so on of the image.
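The three augmentation steps can be sketched on a toy grayscale image stored as nested lists. This is only an illustration: the horizontal flip direction and the jitter ranges are assumptions, since the text specifies neither, and a real pipeline would use an image library for resampling.

```python
import random

def scaled_size(width, height, short_side=256):
    """Target size that keeps the aspect ratio and makes the short side `short_side`."""
    scale = short_side / min(width, height)
    return round(width * scale), round(height * scale)

def horizontal_flip(image):
    """Mirror each row of a row-major image (assumed flip direction)."""
    return [row[::-1] for row in image]

def jitter(image, rng, max_contrast=0.2, max_brightness=20):
    """Randomly perturb contrast and brightness; ranges are hypothetical."""
    contrast = 1.0 + rng.uniform(-max_contrast, max_contrast)
    brightness = rng.uniform(-max_brightness, max_brightness)
    return [[min(255, max(0, round(p * contrast + brightness))) for p in row]
            for row in image]

image = [[0, 64, 128], [192, 255, 32]]
new_size = scaled_size(512, 1024)          # short side becomes 256
flipped = horizontal_flip(image)
noisy = jitter(flipped, random.Random(0))  # seeded for reproducibility
```

Each augmented copy is treated as an additional training sample with the same category label as the original image.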

Then, the convolutional neural network is pre-trained:

This embodiment pre-trains the convolutional neural network on the ImageNet 2012 dataset; a schematic of the network is shown in Fig. 3. The network has five convolutional layers C_i{N, S}, i = 1, ..., 5, where N is the number of convolution kernels and S is the kernel size, and every convolutional layer uses the rectified linear unit (ReLU) activation function. The layer parameters used in this embodiment are C1{48, 5*5}, C2{128, 3*3}, C3{192, 3*3}, C4{128, 3*3}, and C5{128, 3*3}. Each of the first four convolutional layers is followed by a max pooling layer, which selects the largest element within a local region. The fifth convolutional layer is followed by a multi-scale spatial pyramid pooling (Spatial Pyramid Pooling, SPP) layer with pooling scales (6*6, 3*3, 2*2), which pools the differently sized feature maps obtained by convolving images of different sizes into feature vectors of the same length. Specifically, whatever the size of an image, it is divided evenly into 6*6, 3*3, and 2*2 sub-blocks, and the feature of each sub-block is extracted by max pooling, giving a feature vector of (6*6 + 3*3 + 2*2) * Feature = 49 * Feature dimensions, where Feature is the size of the feature map output by the fifth convolutional layer.
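The fixed-length property of the SPP layer can be demonstrated on a single feature map. The sketch below is a plain-Python illustration with hypothetical names; the bin boundaries use floor and ceiling, which is one common choice and not necessarily the one used in the patent.

```python
import math

def spp_max_pool(feature_map, levels=(6, 3, 2)):
    """Max-pool a 2-D feature map into 6*6 + 3*3 + 2*2 = 49 values, whatever its size."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for bins in levels:
        for i in range(bins):
            r0 = math.floor(i * h / bins)
            r1 = max(math.ceil((i + 1) * h / bins), r0 + 1)   # at least one row per bin
            for j in range(bins):
                c0 = math.floor(j * w / bins)
                c1 = max(math.ceil((j + 1) * w / bins), c0 + 1)
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled

# Two feature maps of different sizes produce vectors of the same length.
map_a = [[r * 13 + c for c in range(13)] for r in range(13)]
map_b = [[r + c for c in range(10)] for r in range(7)]
vec_a, vec_b = spp_max_pool(map_a), spp_max_pool(map_b)
```

Applying this to every channel and concatenating the results gives 49 values per channel; if Feature is read as the 128 feature maps output by C5 (an interpretation, not stated in the text), the vector has 49 * 128 = 6272 dimensions.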

The convolutional layers are followed by three fully connected layers. The first two, FC1 and FC2, have 2048 nodes each, and the last is a softmax classifier with 1000 outputs. The network is trained by stochastic gradient descent; to avoid overfitting, dropout with a drop ratio of 0.5 is applied to the first two fully connected layers.

Next, the pre-trained convolutional neural network is fine-tuned:

Because the ImageNet 2012 dataset used for pre-training has 1000 categories, the output of the trained convolutional neural network is 1000-way, whereas this embodiment classifies Internet products into C categories in total. The last fully connected layer is therefore changed to C nodes, and this last layer is then fine-tuned on Internet product data. Fine-tuning uses stochastic gradient descent with momentum 0.9, weight decay 0.0005, and an initial learning rate of 0.01 that is gradually reduced as the number of iterations increases.
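The fine-tuning hyperparameters can be made concrete with one common formulation of SGD with momentum and weight decay. The step-decay schedule (`step`, `gamma`) is an assumption, since the text says only that the learning rate is gradually reduced; all names are hypothetical.

```python
def sgd_step(weights, grads, velocity, lr, momentum=0.9, weight_decay=0.0005):
    """One SGD update: v <- momentum * v - lr * (grad + weight_decay * w); w <- w + v."""
    new_velocity = [momentum * v - lr * (g + weight_decay * w)
                    for w, g, v in zip(weights, grads, velocity)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity

def learning_rate(iteration, base_lr=0.01, step=1000, gamma=0.1):
    """Step decay (assumed schedule): multiply the rate by gamma every `step` iterations."""
    return base_lr * gamma ** (iteration // step)

# A single update of one weight, starting from zero velocity.
w, v = sgd_step([1.0], [0.5], [0.0], lr=learning_rate(0))
```

In fine-tuning as described, only the parameters of the replaced last fully connected layer would be passed through such an update; the earlier layers keep their pre-trained values.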

Finally, image features are extracted from the product image using the fine-tuned convolutional neural network.

This embodiment feeds the test image into the pre-trained convolutional neural network, which abstracts the features of the image. The five convolutional layers extract higher-level image features, and the fully connected layers flatten them into a one-dimensional vector. This embodiment takes the output of the second fully connected layer as the image feature t_j; see Fig. 4.

Step 103: Fuse the text features of the product with the image features of the product to obtain the feature information of the product.

The text feature and the image feature of a product are each a one-dimensional vector. This embodiment concatenates the two to form the feature of the j-th product, P_j = {x_j, t_j}.
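Fusion by concatenation is simple enough to show directly. In this sketch the vectors are short toy lists; in the embodiment the image feature would be the 2048-dimensional FC2 output, and the function name is hypothetical.

```python
def fuse_features(text_feature, image_feature):
    """Concatenate the text feature x_j and the image feature t_j into P_j."""
    return list(text_feature) + list(image_feature)

x_j = [3.0, 1.5]          # toy text feature
t_j = [0.2, 0.0, 0.7]     # toy image feature
p_j = fuse_features(x_j, t_j)
```

The fused vector has as many dimensions as the two inputs combined, so the classifier sees both modalities at once.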

Step 104: Process the feature information of the product with the pre-trained product classification model to obtain the classification result of the product.

Among the many intelligent classification methods available today, the support vector machine (Support Vector Machine, SVM) classifies and trains quickly and generalizes well, and has become a focus of research in machine learning; this embodiment adopts the SVM. Its basic idea is to construct one or a series of hyperplanes in a high-dimensional space such that the distance between the hyperplane and the nearest training samples is maximized. An important task in applying an SVM is choosing the kernel function. When the sample features contain heterogeneous information, the sample size is large, or the multi-dimensional data is irregular or unevenly distributed in the high-dimensional feature space, mapping all samples with a single kernel is not reasonable; several kernel functions need to be combined, which is the multiple kernel learning approach.

The most common way to construct a multiple kernel learning model is to take a convex combination of several kernel functions, of the form:

K = Σ_{j=1..M} β_j K_j, with β_j ≥ 0 and Σ_{j=1..M} β_j = 1

where K_j is a basic kernel function, M is the total number of basic kernels, and β_j is the weight coefficient.

There are many ways to synthesize kernels. This embodiment adopts the sparse-coding-based multiple kernel learning method proposed by Francesco; in some cases, increased sparsity reduces redundancy and improves computational efficiency.
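A convex combination of kernels can be illustrated with two familiar basic kernels. The sketch below uses fixed weights purely for illustration; in multiple kernel learning the weights β_j are learned, and the choice of a linear and an RBF kernel here is an assumption rather than the patent's configuration.

```python
import math

def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def combined_kernel(x, y, kernels, betas):
    """K(x, y) = sum_j beta_j * K_j(x, y), with beta_j >= 0 and sum_j beta_j = 1."""
    if any(b < 0 for b in betas) or abs(sum(betas) - 1.0) > 1e-9:
        raise ValueError("betas must be non-negative and sum to 1")
    return sum(b * k(x, y) for b, k in zip(betas, kernels))

# Fixed illustrative weights; a learned sparse solution would set some betas to 0.
k_value = combined_kernel([1.0, 0.0], [1.0, 0.0],
                          [linear_kernel, rbf_kernel], [0.3, 0.7])
```

Because each basic kernel is positive semi-definite and the weights are non-negative, the combination is itself a valid kernel and can be passed to an SVM in place of a single kernel.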

Specifically, the present invention pre-trains the product classification model with the multiple kernel learning algorithm as follows:

extracting text features of product samples from the description text of the product samples in the training sample set;

extracting image features of the product samples from the images of the product samples in the training sample set, based on the pre-trained convolutional neural network model;

fusing the text features and the image features of the product samples to obtain feature information of the product samples;

training on the feature information of the product samples to obtain a product classification model based on a support vector machine;

where the training sample set includes a plurality of product samples of preset categories, and each product sample includes the description text and the image of the product sample.

According to the multiple kernel learning algorithm, the feature information of the products is input into the product classification model Model for processing to obtain the classification label label_j of each product; the flow chart is shown in Fig. 5.

In a specific implementation, because each Internet product has one text description and multiple product images, this embodiment uses a convolutional neural network that places no restriction on input image size to learn product image features automatically, and fuses the image features with the text features. Finally, cross-sample max pooling is applied to the prediction results of the different image samples of each product: the prediction with the strongest category response is selected as the product's predicted category. This automatically discards noisy information and improves the accuracy of automatic classification of Internet products.
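Cross-sample max pooling over the per-image predictions of one product can be sketched as follows. The sketch assumes each image sample yields a score per category (for example, classifier decision values); the data structure and names are hypothetical.

```python
def predict_product(image_scores):
    """Max-pool category scores over a product's image samples, then pick the best category."""
    categories = image_scores[0].keys()
    # For each category, keep the strongest response seen in any image sample.
    pooled = {c: max(scores[c] for scores in image_scores) for c in categories}
    return max(pooled, key=pooled.get), pooled

# Toy scores for two images of the same product (hypothetical values).
scores = [
    {"shirt": 0.6, "dress": 0.4},
    {"shirt": 0.2, "dress": 0.9},
]
label, pooled = predict_product(scores)
```

An image that shows packaging or a cluttered scene tends to produce weak responses for every category, so taking the maximum lets the clearest image decide the product's label.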

本方法在AMAX服务器平台上已经做实验，在产品分类中能够获得比使用人工制定的图像特征与文本信息结合分类更高的准确率。This method has been tested on the AMAX server platform, and it can obtain higher accuracy in product classification than the combination of manually formulated image features and text information.

Based on the same inventive concept, an embodiment of the present invention also provides a deep learning-based product classification device, as described in the following embodiments. Because the device solves the problem on the same principle as the deep learning-based product classification method, its implementation may refer to the implementation of the method, and repeated content is not described again. As used below, the term "unit" or "module" refers to a combination of software and/or hardware that realizes a predetermined function. Although the devices described in the following embodiments are preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.

Fig. 6 is a structural block diagram of a deep learning-based product classification device according to an embodiment of the present invention. As shown in Fig. 6, the product classification device includes:

a text feature extraction module 601, configured to extract the text features of a product from the description text of the product;

an image feature extraction module 602, configured to extract the image features of the product from the images of the product based on a pre-trained convolutional neural network model;

a feature information obtaining module 603, configured to fuse the text features of the product with the image features of the product to obtain the feature information of the product;

a classification module 604, configured to process the feature information of the product based on a pre-trained product classification model to obtain the classification result of the product.
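The way the four modules hand data to one another can be sketched as a small composition. The class name, field names, and type signatures below are hypothetical and chosen only for illustration; each field stands in for one numbered module and accepts any callable with the matching role.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class ProductClassifier:
    """Illustrative wiring of modules 601 to 604 (names are assumptions)."""
    extract_text_features: Callable[[str], List[float]]        # module 601
    extract_image_features: Callable[[Sequence], List[float]]  # module 602
    fuse: Callable[[List[float], List[float]], List[float]]    # module 603
    classify: Callable[[List[float]], str]                     # module 604

    def __call__(self, description, image):
        text_vec = self.extract_text_features(description)
        image_vec = self.extract_image_features(image)
        feature_info = self.fuse(text_vec, image_vec)
        return self.classify(feature_info)
```

Keeping each module behind a plain callable mirrors the statement above that a module may be software, hardware, or a combination of both.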

This structure is described below.

In a specific implementation, as shown in Fig. 7, the text feature extraction module 601 specifically includes:

a word segmentation module 701, configured to segment the description text of the product into words to obtain candidate words;

a feature word screening module 702, configured to select product feature words from the candidate words according to a preset evaluation function;

a feature word weight determination module 703, configured to determine the weight of each product feature word according to the frequency with which the product feature word occurs in the sample document, the total number of sample documents, and the number of sample documents that contain the product feature word;

a text feature generation module 704, configured to generate the product text features of the product to be classified according to the product feature word weights;

wherein the description text of the product is stored in a sample document.
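The three quantities used by the weight determination module 703 (term frequency in the document, total number of sample documents, and number of documents containing the word) are the inputs of a TF-IDF weight. The sketch below uses the common form tf × log(N / n) as an assumption; the exact formula, any smoothing, and any normalization are not stated in this passage, and the function name is hypothetical.

```python
import math


def tfidf_vector(doc_tokens, feature_words, corpus):
    """Build a text feature vector with one TF-IDF weight per feature word.

    doc_tokens: tokens of the document being described.
    feature_words: ordered list of selected product feature words.
    corpus: list of token lists, one per sample document.
    """
    n_docs = len(corpus)
    vector = []
    for word in feature_words:
        tf = doc_tokens.count(word)                      # frequency in this document
        df = sum(1 for doc in corpus if word in doc)     # documents containing the word
        idf = math.log(n_docs / df) if df else 0.0       # unseen words get weight 0
        vector.append(tf * idf)
    return vector
```

A word that occurs in every sample document receives an inverse document frequency of zero under this form, so it contributes nothing to the text feature vector.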

In a specific implementation, the text feature extraction module 601 further includes:

a candidate word filtering module, configured to filter out the candidate words that are included in a preset stop word list.

In a specific implementation, the feature word screening module 702 is specifically configured to:

determine the number of times each candidate word occurs in the sample documents, and take the candidate words whose number of occurrences is greater than or equal to a count threshold as product feature words; and/or,

determine the proportion of sample documents containing each candidate word relative to the total number of sample documents, and take the candidate words whose proportion falls within a preset range as product feature words; and/or,

determine the information gain weight of each candidate word, and take the candidate words whose information gain weight is greater than an information gain weight threshold as product feature words; and/or,

determine the mutual information value of each candidate word, and take the candidate words whose mutual information value is greater than a mutual information threshold as product feature words; and/or,

determine the relevance of each candidate word to the preset category, and take the candidate words whose relevance is greater than a relevance threshold as product feature words.
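The first two screening criteria listed above (occurrence count and document proportion) can be sketched as follows. The thresholds are hypothetical defaults, and the two criteria are combined with a logical AND here even though the text allows them to be used alone or together.

```python
from collections import Counter


def screen_feature_words(corpus, min_count=2, min_ratio=0.0, max_ratio=1.0):
    """Select candidate words by occurrence count and document proportion.

    corpus: list of token lists, one per sample document.
    A word is kept when it occurs at least min_count times overall and the
    share of documents containing it lies within [min_ratio, max_ratio].
    """
    term_counts = Counter(tok for doc in corpus for tok in doc)
    doc_counts = Counter(tok for doc in corpus for tok in set(doc))
    n_docs = len(corpus)
    selected = []
    for word, count in term_counts.items():
        ratio = doc_counts[word] / n_docs
        if count >= min_count and min_ratio <= ratio <= max_ratio:
            selected.append(word)
    return sorted(selected)
```

An upper bound on the document proportion removes words that appear in almost every description and therefore carry little category information, while the count threshold removes rare words.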

In a specific implementation, the feature word screening module 702 determines the relevance of a candidate word to the preset category as follows:

The relevance of the candidate word to the preset category is determined from the probabilities that the candidate word does or does not appear in the training sample set and does or does not belong to the preset category.
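One statistic built from exactly these four presence/absence and membership cases is the chi-square statistic of a 2×2 contingency table. The sketch below assumes that the relevance is such a chi-square score; the passage itself does not name the statistic, so this is an illustration under that assumption, and the function name and input layout are hypothetical.

```python
def chi_square(docs, labels, word, category):
    """Chi-square relevance of a word to a category from a 2x2 table.

    docs: list of token lists; labels: category label of each document.
    a: word present, in category      b: word present, other category
    c: word absent, in category       d: word absent, other category
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_word = word in tokens
        in_category = label == category
        if has_word and in_category:
            a += 1
        elif has_word:
            b += 1
        elif in_category:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0  # the word or the category never varies; no evidence either way
    return n * (a * d - b * c) ** 2 / denominator
```

A word spread evenly across categories scores zero, while a word that occurs only in documents of the given category scores highest.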

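In a specific implementation, the classification module 604 is specifically configured to obtain the product classification model in the same manner as described for the method embodiment, that is, by training on the fused feature information of the product samples in the training sample set.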

In a specific implementation, the product classification device further includes:

an image enhancement module 605, configured to perform image enhancement on the product images of the product to be classified;

the image feature extraction module 602 is further configured to extract the image features of the product from the enhanced product images based on the pre-trained convolutional neural network model.

In a specific implementation, the image enhancement module 605 is specifically configured to:

scale the image of the product by a preset ratio;

flip the scaled image of the product;

add lighting noise to the flipped image of the product;

change the contrast and/or brightness of the image of the product to which lighting noise has been added.
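The four enhancement steps can be sketched on a small grayscale image stored as a list of rows with pixel values from 0 to 255. Everything specific here is an assumption made for illustration: nearest-neighbour scaling, a horizontal flip, per-pixel Gaussian noise as the lighting noise, and a linear contrast/brightness adjustment around mid-gray, along with all parameter values and function names.

```python
import random


def clamp(value):
    """Round to an integer pixel value and keep it within 0..255."""
    return max(0, min(255, int(round(value))))


def scale(img, ratio):
    """Nearest-neighbour resize of a 2-D grayscale image by a preset ratio."""
    h, w = len(img), len(img[0])
    new_h, new_w = max(1, round(h * ratio)), max(1, round(w * ratio))
    return [[img[min(h - 1, int(r / ratio))][min(w - 1, int(c / ratio))]
             for c in range(new_w)] for r in range(new_h)]


def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def add_lighting_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel (assumed noise model)."""
    return [[clamp(p + rng.gauss(0, sigma)) for p in row] for row in img]


def adjust_contrast_brightness(img, contrast, brightness):
    """Stretch pixel values around mid-gray, then shift them."""
    return [[clamp(contrast * (p - 128) + 128 + brightness) for p in row]
            for row in img]


def augment(img, ratio=2.0, sigma=5.0, contrast=1.2, brightness=10, seed=0):
    """Apply the four steps in the order listed above."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    out = scale(img, ratio)
    out = flip_horizontal(out)
    out = add_lighting_noise(out, sigma, rng)
    return adjust_contrast_brightness(out, contrast, brightness)
```

Calling `augment` several times with different seeds or parameters would produce several enhanced variants of the same product image, each of which could then be passed to the image feature extraction module.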

In summary, the present invention proposes a deep learning-based (multi-feature) product classification method and device. It departs from the traditional approach of extracting image features with hand-crafted image descriptors: the raw product image data is input directly into a convolutional neural network, which learns the image features automatically. The image features are then fused with the text features, and an SVM classifier predicts the product category, thereby realizing automatic product classification and improving the accuracy of intelligent classification.

Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means, and the instruction means implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

The specific embodiments described above further explain the purpose, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.