CN109582963A - A kind of archives automatic classification method based on extreme learning machine - Google Patents

A kind of archives automatic classification method based on extreme learning machine

Info

Publication number
CN109582963A
Authority
CN
China
Prior art keywords
text
sample
bottom layer
classification
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811438592.XA
Other languages
Chinese (zh)
Inventor
曾伟波
张建辉
林培煜
潘淑英
陈泰隆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Linewell Software Co Ltd
Original Assignee
Fujian Linewell Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Linewell Software Co Ltd
Priority to CN201811438592.XA
Publication of CN109582963A
Legal status: Pending (current)

Links

Classifications

Landscapes

Abstract

The present invention relates to an automatic archive classification method based on an extreme learning machine. The method comprises a learning stage and an operating stage, and the first step of both stages passes the data through a preprocessing module, which standardizes the data and removes information irrelevant to the task. The preprocessing module first unifies the text content into the UTF-8 encoding format; it then filters illegal characters by regular-expression matching; word segmentation and part-of-speech tagging are then performed with the ICTCLAS Chinese lexical analysis system; finally, the Baidu stop-word list is used to filter words that occur frequently in the text but contribute little to text analysis. The invention can accurately understand the archive content of a text while constructing an efficient, stable, low-dimensional archive dictionary, and at the same time guarantees high classification accuracy.

Description

Automatic file classification method based on extreme learning machine
Technical Field
The invention belongs to the technical field of text classification, and particularly relates to an automatic file classification method based on an extreme learning machine.
Background
In the face of massive electronic file information, the current management mode relies on professionals with extensive archival experience to classify files manually and supervise the classification in an archive management system. However, as the number of electronic files grows explosively, the manpower consumed by manual classification increasingly exceeds what archive staff can handle; in addition, different archive professionals classify the same archival material in unpredictably different ways, which can leave parts of the archive inconsistently classified over time. Automatic computer text classification is therefore the best way to manage electronic files effectively and use them efficiently.
Several difficulties in the field of text classification still urgently need to be solved, mainly: (1) how to construct an efficient and stable semantic classification dictionary; (2) how to break the independence assumption between words in the vector space model; and (3) how to effectively balance classification accuracy against training speed on massive data.
The invention provides an automatic file classification method based on an extreme learning machine. The method comprises a preprocessing module, a text feature extraction module, a feature fusion module and a classification module based on an extreme learning machine. The text feature extraction module comprises two sub-modules: the bottom layer feature extraction module and the middle layer feature autonomous learning module. The invention can effectively solve the problems in the text classification field.
Disclosure of Invention
The invention aims to solve the problem that the existing file text classification is not efficient and stable enough, and provides an automatic file classification method based on an extreme learning machine.
In order to achieve the purpose, the technical scheme of the invention is as follows: an automatic file classification method based on an extreme learning machine comprises the following steps:
step S1, training sample preprocessing: carrying out standardization processing on a text training sample set for model learning;
step S2, extracting the bottom-layer features of the text training samples: the preprocessed samples are sent to the bottom-layer feature extraction module to extract the text bottom-layer features, which involves two processes, construction of the archive dictionary and corpus and formation of the bottom-layer feature expression of each training sample; the bottom-layer features are expressed in a vector space model in which each dimension is a normalized TF-IDF weight (a sketch of this extraction is given after step S12);
step S3, autonomous learning of the middle-layer features of the text training samples: the archive dictionary and corpus generated in step S2 are used to train a Skip-gram model in an unsupervised way, and the trained model generates word vectors for the training samples; finally, a pooling technique forms the middle-layer feature expression of each training document;
step S4, combining the bottom layer and middle layer features of the text training sample: the bottom layer characteristics calculated in the step S2 and the middle layer characteristics calculated in the step S3 are weighted and connected in series to form the final fusion characteristic expression of the document;
step S5, training a file classification model based on the extreme learning machine: respectively training three archive classification models based on the extreme learning machine by adopting a supervised training mode based on the bottom layer feature calculated in the step S2, the middle layer feature calculated in the step S3 and the fusion feature calculated in the step S4, wherein the three archive classification models correspond to the bottom layer feature archive classification model, the middle layer feature archive classification model and the fusion feature archive classification model;
step S6, preprocessing of the sample to be judged: the sample to be judged is subjected to standardization processing;
step S7, extracting bottom layer features of the sample to be judged: sending the preprocessed samples into a bottom layer feature extraction module to extract text bottom layer features, and directly forming a bottom layer feature expression based on the archive dictionary generated in the step S2, wherein the bottom layer features are expressed by selecting a vector space model, and the features of each dimension in the vector are normalized TF-IDF weights;
step S8, extracting the middle-layer features of the sample to be judged: the Skip-gram model learned in step S3 generates word vectors for the sample to be judged, and a pooling technique then forms its middle-layer feature expression;
step S9, combining the bottom layer and middle layer features of the sample text to be judged: the text bottom layer features calculated in the step S7 and the text middle layer features calculated in the step S8 are weighted and connected in series to form the final combined feature expression of the document to be judged;
step S10, automatically classifying the sample files to be determined: respectively sending the bottom layer features, the middle layer features and the combined features which are calculated in the steps S7, S8 and S9 into the three extreme learning machine-based archive classification models which are learned in the step S5 for classification, and synthesizing the classification results of the three classification models to obtain the archive category to which the sample to be judged belongs;
step S11, steps S6-S10 are run continuously to complete the classification of the text samples;
and step S12, inputting a new sample to be judged, and operating steps S6-S10 to finish automatic file classification of the new text sample.
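A minimal Python sketch of the bottom-layer feature extraction of steps S2 and S7 is given below; the dictionary size, the IDF form and the L2 normalization are illustrative assumptions, since the patent only states that each dimension carries a normalized TF-IDF weight.

    import math
    from collections import Counter

    def build_dictionary(tokenized_docs, max_terms=5000):
        # archive dictionary: top candidate words by document frequency (size is an assumption)
        df = Counter()
        for doc in tokenized_docs:
            df.update(set(doc))
        vocab = {w: i for i, (w, _) in enumerate(df.most_common(max_terms))}
        return vocab, df

    def tfidf_vector(doc, vocab, df, n_docs):
        # bottom-layer feature: normalized TF-IDF weights over the archive dictionary
        tf = Counter(w for w in doc if w in vocab)
        vec = [0.0] * len(vocab)
        for w, f in tf.items():
            vec[vocab[w]] = f * math.log(n_docs / (1 + df[w]))
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]   # L2 normalization, one possible choice

In the operating stage (step S7) the vocabulary and document frequencies obtained during training are reused, so the archive dictionary stays fixed.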
In an embodiment of the present invention, the preprocessing of the samples in steps S1 and S6 comprises four processes: coding-format standardization, illegal-character removal, word segmentation with part-of-speech tagging, and stop-word removal. Coding-format standardization unifies the text content into the UTF-8 encoding format; illegal characters are filtered by regular-expression matching; word segmentation and part-of-speech tagging are performed with the ICTCLAS Chinese lexical analysis system; and stop-word removal uses the Baidu stop-word list to filter words that occur frequently in text but carry little meaning for archive analysis.
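The following Python sketch illustrates these four preprocessing steps; jieba's part-of-speech tagger stands in for the ICTCLAS lexical analysis system, and the character whitelist and stop-word list are illustrative assumptions rather than the patent's exact choices.

    import re
    import jieba.posseg as pseg   # stand-in tokenizer; the patent uses ICTCLAS

    # keep CJK characters, ASCII letters/digits and basic punctuation; everything else is "illegal"
    ILLEGAL = re.compile(r"[^\u4e00-\u9fa5A-Za-z0-9，。！？、；：（）\s]")

    def preprocess(raw_bytes, stopwords):
        text = raw_bytes.decode("utf-8", errors="ignore")    # unify content as UTF-8
        text = ILLEGAL.sub("", text)                         # regex-based illegal-character filtering
        tokens = [(t.word, t.flag) for t in pseg.cut(text)]  # segmentation with part-of-speech tags
        return [(w, pos) for w, pos in tokens if w.strip() and w not in stopwords]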
In an embodiment of the present invention, in step S2, construction of the archive dictionary comprises two processes, part-of-speech selection and bottom-layer feature selection. Part-of-speech selection takes nouns, verbs, adjectives and adverbs together as candidate words, combining words of different parts of speech to capture the latent semantic information of a document. Bottom-layer feature selection then applies a chi-square-statistic criterion on top of the part-of-speech selection to retain the feature words that are most representative of the archive classes.
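A sketch of the chi-square selection under common assumptions (X is a binary term-presence matrix over the part-of-speech-filtered candidate words, y holds the archive labels; the statistic variant and the number of retained words are not fixed by the patent):

    import numpy as np

    def chi_square_scores(X, y):
        # per-term chi-square statistic against the archive classes; higher = more class-indicative
        classes = np.unique(y)
        N = X.shape[0]
        scores = np.zeros(X.shape[1])
        for c in classes:
            in_c = (y == c)
            A = X[in_c].sum(axis=0)            # term present, class c
            B = X[~in_c].sum(axis=0)           # term present, other classes
            C = in_c.sum() - A                 # term absent, class c
            D = (~in_c).sum() - B              # term absent, other classes
            chi2 = N * (A * D - B * C) ** 2 / ((A + C) * (B + D) * (A + B) * (C + D) + 1e-12)
            scores = np.maximum(scores, chi2)  # take the best score over all classes (one common choice)
        return scores

    # e.g. keep the highest-scoring words as the archive dictionary (the cut-off is an assumption)
    # selected = np.argsort(chi_square_scores(X, y))[::-1][:5000]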
In an embodiment of the present invention, in steps S4 and S9, the fusion strategy for the bottom-layer and middle-layer features is a weighted combination: F = αL | (1−α)M, where L is the bottom-layer feature vector, M is the middle-layer feature vector, α is the weight of the bottom-layer features and is set to 0.2, and | denotes concatenation.
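A two-line sketch of this weighted concatenation; splitting the weight as α and 1−α is one plausible reading, since the patent only fixes the bottom-layer weight at 0.2:

    import numpy as np

    def fuse(bottom, middle, alpha=0.2):
        # F = alpha * L  concatenated with  (1 - alpha) * M
        return np.concatenate([alpha * np.asarray(bottom), (1.0 - alpha) * np.asarray(middle)])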
In an embodiment of the present invention, in steps S3 and S8, the concrete pooling process is as follows: the k dimensions of the bottom-layer text feature vector are divided into N equal parts; within each part, the word vectors of the feature words that appear there are accumulated, and if no word falls in a part, its accumulated vector is set to all zeros; after the word vectors of every part have been accumulated, the partial vectors are concatenated in order of the parts to obtain a brand-new vector representing the document.
In an embodiment of the present invention, in step S10, the concrete process of deciding the archive class of the sample to be judged is as follows: the bottom-layer features, middle-layer features and combined features of the sample to be judged are sent to the corresponding trained extreme-learning-machine archive classification models, the output vectors of the three models are added to obtain the final decision vector, and the label corresponding to the maximum value of this vector is the final archive class.
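A sketch of this decision fusion, assuming each trained model returns an m-dimensional score vector with one entry per archive class:

    import numpy as np

    def fuse_decisions(score_bottom, score_middle, score_fused):
        # add the three models' output vectors and take the arg-max as the final archive class
        combined = np.asarray(score_bottom) + np.asarray(score_middle) + np.asarray(score_fused)
        return int(np.argmax(combined))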
Compared with the prior art, the invention has the following beneficial effects:
1. in the archive classification method based on the extreme learning machine, because part-of-speech selection and bottom-layer feature selection are adopted, the archive content of a text can be understood accurately while an efficient, stable archive dictionary of low dimensionality is constructed;
2. in the automatic archive classification method based on the extreme learning machine, because feature fusion and classifier fusion are adopted, the classification accuracy can be effectively improved;
3. the automatic archive classification method based on the extreme learning machine randomly selects the hidden nodes of the network structure and computes the output weights in a single least-squares step, thereby completing network training. Network training is therefore greatly simplified, and training is tens to hundreds of times faster than classical algorithms such as the SVM.
Drawings
FIG. 1 is a general block diagram of an extreme learning machine-based automatic document classification method.
FIG. 2 is a flow chart of a middle layer feature pooling step.
FIG. 3 is a flow chart of a combined classifier.
FIG. 4 is a schematic diagram of an extreme learning machine-based archival classification model.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
The invention provides an automatic file classification method based on an extreme learning machine, which specifically comprises the following steps:
(1) training sample pretreatment: carrying out standardization processing on a text training sample set for model learning, and removing information irrelevant to the task;
(2) extracting bottom layer features of the text training sample: sending the sample processed by the preprocessing module into a bottom layer feature extraction module to extract text bottom layer features, wherein the module comprises two processes of constructing a file dictionary and forming bottom layer feature expression of a training sample in a model learning stage, wherein the bottom layer features are expressed by selecting a vector space model, and the features of each dimension in the vector are normalized TF-IDF weights;
(3) autonomous learning of the middle-layer features of the text training samples: the archive dictionary generated in step (2) is combined with the large-scale corpus to train a Skip-gram model in an unsupervised way, and the trained model generates word vectors for the training samples (a training sketch is given after step (12)); finally, a pooling technique forms the middle-layer feature expression of each training document;
(4) combining the bottom layer and middle layer features of the text training sample: weighting and connecting the bottom layer characteristics calculated in the step (2) and the middle layer characteristics calculated in the step (3) in series to form final characteristic expression of the document;
(5) training a file classification model based on an extreme learning machine: respectively training three archive classification models based on an extreme learning machine by adopting a supervised training mode based on the sample bottom layer features calculated in the step (2), the middle layer features calculated in the step (3) and the fusion features calculated in the step (4), wherein the three archive classification models correspond to a bottom layer feature archive classification model, a middle layer feature archive classification model and a fusion feature archive classification model;
(6) preprocessing a sample to be judged: carrying out standardization processing on a sample to be judged, and removing information irrelevant to the task;
(7) extracting bottom layer features of a sample to be judged: sending the sample processed by the preprocessing module into a bottom layer feature extraction module to extract text bottom layer features, and directly forming a bottom layer feature expression based on the archive dictionary generated in the step (2), wherein the bottom layer features are expressed by selecting a vector space model, and the features of each dimension in the vector are normalized TF-IDF weights;
(8) extracting the middle-layer features of the sample to be judged: generating word vectors for the sample to be judged with the Skip-gram model learned in step (3), and finally forming the middle-layer feature expression of the sample to be judged by a pooling technique;
(9) combining the bottom layer and middle layer features of the sample text to be judged: weighting and connecting the text bottom layer features calculated in the step (7) and the text middle layer features calculated in the step (8) in series to form the final feature expression of the document to be judged;
(10) automatically classifying the sample files to be judged: respectively sending the bottom layer features, the middle layer features and the combined features calculated in the steps (7), (8) and (9) into the three extreme learning machine-based archive classification models learned in the step (5) for classification, and synthesizing the classification results of the three classification models to obtain the archive category to which the sample to be judged belongs;
(11) in the model operation phase, steps (6)-(10) are executed continuously to complete the classification of the text samples.
(12) Inputting a new sample to be judged, and executing the steps (6) to (10) to finish the automatic file classification of the new text sample.
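A minimal sketch of the unsupervised Skip-gram training of step (3), with gensim's Word2Vec standing in as the implementation (gensim 4.x API; the vector size, window, minimum count and epoch count are illustrative assumptions):

    from gensim.models import Word2Vec

    def train_skipgram(tokenized_corpus, dim=100):
        # sg=1 selects the Skip-gram architecture; training is unsupervised
        return Word2Vec(sentences=tokenized_corpus, vector_size=dim,
                        window=5, min_count=2, sg=1, epochs=10)

    # word vectors for one preprocessed document (words outside the model are skipped)
    # model = train_skipgram(corpus_tokens)
    # doc_vectors = [model.wv[w] for w in doc_tokens if w in model.wv]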
In the method for automatically classifying archives based on the extreme learning machine, the preprocessing of the samples in step (1) and step (6) comprises four processes: coding-format standardization, illegal-character removal, word segmentation with part-of-speech tagging, and stop-word removal. Process 1 unifies the text content into the UTF-8 encoding format; process 2 filters illegal characters by regular-expression matching; process 3 performs word segmentation and part-of-speech tagging with the ICTCLAS Chinese lexical analysis system; process 4 uses the Baidu stop-word list to filter words that occur frequently in text but carry little meaning for archive analysis.
In the method for automatically classifying archives based on the extreme learning machine, in step (2), construction of the archive classification dictionary comprises two processes, part-of-speech selection and bottom-layer feature selection. Process 1 takes nouns, verbs, adjectives and adverbs together as candidate words and combines words of different parts of speech to capture the latent semantic information of a document, which maximizes the coverage of the archive dictionary while preserving the semantic information of the document. Process 2 then applies a chi-square-statistic criterion on top of process 1 to retain the feature words that are most representative of the archive classes.
In the archive classification method based on the extreme learning machine, in step (4) and step (9), the fusion strategy for the bottom-layer and middle-layer features is a weighted combination: F = αL | (1−α)M, where L is the bottom-layer feature vector, M is the middle-layer feature vector, α is the weight of the bottom-layer features and is set to 0.2, and | denotes concatenation.
In the automatic archive classification method based on the extreme learning machine, the concrete pooling process in step (3) and step (8) is as follows: the k dimensions of the bottom-layer text feature vector are divided equally into N parts; within each part, the word vectors of the feature words that appear there are accumulated, and if no word falls in a part, its accumulated vector is set to all zeros; after every part has been accumulated, the partial vectors are concatenated in order of the parts to obtain a brand-new vector representing the document.
In the method for automatically classifying archives based on the extreme learning machine, in step (10), the concrete process of deciding the archive class of the sample to be judged is as follows: the bottom-layer features, middle-layer features and combined features of the sample to be judged are sent to the corresponding trained extreme-learning-machine archive classification models, the output vectors of the three models are added to obtain the final decision vector, and the label corresponding to the maximum value of this vector is the final archive class.
The following are specific examples of the present invention.
The first embodiment is as follows: see fig. 1. The automatic file classification method based on the extreme learning machine mainly comprises two stages: a model learning phase and a model operating phase. Each stage contains four modules: the system comprises a preprocessing module, a text feature extraction module, a bottom layer feature and middle layer feature fusion module and an extreme learning machine-based archive classification module. The text feature extraction module comprises two sub-modules: the bottom layer feature extraction module and the middle layer feature autonomous learning module. The method comprises the following steps:
(1) training sample pretreatment: carrying out standardization processing on a text training sample set for model learning, and removing information irrelevant to the task;
(2) extracting bottom layer features of the text training sample: sending the sample processed by the preprocessing module into a bottom layer feature extraction module to extract text bottom layer features, wherein the module comprises two processes of constructing a file dictionary and forming bottom layer feature expression of a training sample in a model learning stage, wherein the bottom layer features are expressed by selecting a vector space model, and the features of each dimension in the vector are normalized TF-IDF weights;
(3) the middle-layer feature of the text training sample is independently learned: and (3) combining the archive classification dictionary generated in the step (2) and the large-scale corpus to train a Skip-gram model in an unsupervised mode, and generating a training sample word vector by using the trained model. Finally, forming a middle-layer characteristic expression of each training document by adopting a pooling technology;
(4) combining the bottom layer and middle layer features of the text training sample: weighting and connecting the bottom layer characteristics calculated in the step (2) and the middle layer characteristics calculated in the step (3) in series to form final characteristic expression of the document;
(5) training a file classification model based on an extreme learning machine: respectively training three archive classification models based on an extreme learning machine by adopting a supervised training mode based on the sample bottom layer features calculated in the step (2), the middle layer features calculated in the step (3) and the fusion features calculated in the step (4), wherein the three archive classification models correspond to a bottom layer feature archive classification model, a middle layer feature archive classification model and a fusion feature archive classification model;
(6) preprocessing a sample to be judged: carrying out standardization processing on a sample to be judged, and removing information irrelevant to the task;
(7) extracting bottom layer features of a sample to be judged: sending the sample processed by the preprocessing module into a bottom layer feature extraction module to extract text bottom layer features, and directly forming a bottom layer feature expression based on the archive dictionary generated in the step (2), wherein the bottom layer features adopt a TF-IDF algorithm to calculate the weight;
(8) extracting the middle-layer features of the sample to be judged: generating word vectors for the sample to be judged with the Skip-gram model learned in step (3), and finally forming the middle-layer feature expression of the sample to be judged by a pooling technique;
(9) combining the bottom layer and middle layer features of the sample text to be judged: weighting and connecting the text bottom layer features calculated in the step (7) and the text middle layer features calculated in the step (8) in series to form the final feature expression of the document to be judged;
(10) automatically classifying the sample files to be judged: respectively sending the bottom layer features, the middle layer features and the combined features calculated in the steps (7), (8) and (9) into the three extreme learning machine-based archive classification models learned in the step (5) for classification, and synthesizing the classification results of the three classification models to obtain the archive category to which the sample to be judged belongs;
(11) in the model operation stage, steps (6) to (10) are executed continuously to complete the automatic archive classification of the text samples.
Example two: see fig. 1, 2. This embodiment of the automatic archive classification method based on the extreme learning machine further details the pooling technique used in step (3) and step (8). The flow of this step is as follows:
(1) suppose the archive document contains x words, of which t words remain after bottom-layer feature extraction; the text is then represented as T = {w_1, w_2, ..., w_t}, where each word w_i has a word vector v(w_i) with k-dimensional features;
(2) the word vectors in the text T are divided equally into N parts, forming N word-vector groups, each group corresponding to t/N word vectors;
(3) for each word-vector group the following operation is performed: all word vectors in the group are accumulated, so that each group z finally forms a feature vector v(z) whose dimension is also k;
(4) the feature vectors of the N word-vector groups are concatenated to obtain the feature vector of the whole document, as shown in the formula:
V(T) = v(z_1) | v(z_2) | ... | v(z_N),
where | denotes concatenation.
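A numpy sketch of this pooling, following the grouping of the document's t word vectors into N consecutive parts described in this example (N and the word-vector dimension k are parameters; empty parts contribute zero vectors):

    import numpy as np

    def pool_middle_features(word_vectors, n_groups, k):
        # split the word-vector sequence into N roughly equal consecutive groups,
        # sum the vectors inside each group, and concatenate the N partial sums
        vecs = np.asarray(word_vectors, dtype=float).reshape(-1, k)
        groups = np.array_split(vecs, n_groups)
        pooled = [g.sum(axis=0) for g in groups]   # an empty group sums to the zero vector
        return np.concatenate(pooled)              # middle-layer feature of dimension N * k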
Example three: see fig. 1, 3. This embodiment of the automatic archive classification method based on the extreme learning machine further details the technical solution of step (10). The details of classifying the sample to be judged in step (10) are as follows:
the algorithm consists of the following steps:
(1) respectively extracting bottom layer characteristics, middle layer characteristics and fusion characteristics of the text sample;
(2) respectively sending the three characteristics into a trained file classification model based on the bottom layer characteristics, a trained file classification model based on the middle layer characteristics and a trained file classification model based on the fusion characteristics;
(3) adding the output result vectors of the three classification models (wherein each dimension of the vector corresponds to one class of archive category, and the numerical value of each dimension represents the probability that the text sample belongs to the archive category) to obtain a final output vector;
(4) the entry with the maximum value in the final output vector is found; its corresponding archive class is the class of the sample to be judged.
Example four: referring to fig. 4, the automatic document classification method based on the extreme learning machine according to this embodiment further details the technical solution of the document classification model based on the extreme learning machine in step (5). The details are as follows:
An archive classification model based on the extreme learning machine. The extreme learning machine is a single-hidden-layer feedforward neural network (SLFN) consisting of an input layer, a hidden layer and an output layer; the input layer is fully connected to the hidden layer, and the hidden layer is fully connected to the output layer. The input layer X is the sample feature vector, the hidden layer contains L hidden neurons (typically L is much smaller than the number of samples N), and the output layer produces an m-dimensional vector, one dimension per archive class. Unlike a traditional neural network, the weights between the input layer and the hidden layer of the extreme learning machine are generated randomly, so only the connection weights between the hidden layer and the output layer need to be determined. The optimization minimizes both the training error and the norm of the hidden-layer output weights, which gives the model good generalization ability; the optimization objective is:
minimize ||Hβ − T|| and ||β||        formula (1)

where

H = [h(x_1); h(x_2); ...; h(x_N)]        formula (2)

is the hidden-layer output matrix over the training samples: x denotes the set of N training-sample text expressions, and the size of H is determined by the number of training samples N and the number of hidden nodes L (typically L is much smaller than N). T is the label matrix formed by the training sample set; each row corresponds to one sample and is stored in one-hot form.

β        formula (3)

is the matrix of connection weights between the hidden layer and the output layer. Finally, the analytic solution of formula (1) is

β̂ = H†T        formula (4)

where H† is the Moore-Penrose generalized inverse of H.
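A minimal numpy sketch of this training scheme; the sigmoid activation, the plain Moore-Penrose pseudo-inverse solution and the hidden-layer size are assumptions that the patent leaves open:

    import numpy as np

    class ELMClassifier:
        # single-hidden-layer feedforward network trained in one least-squares step

        def __init__(self, n_hidden=500, seed=0):
            self.n_hidden = n_hidden
            self.rng = np.random.default_rng(seed)

        def fit(self, X, T):
            # X: (N, d) feature matrix; T: (N, m) one-hot label matrix
            d = X.shape[1]
            self.W = self.rng.normal(size=(d, self.n_hidden))   # random input weights, never updated
            self.b = self.rng.normal(size=self.n_hidden)        # random hidden biases
            H = 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))    # hidden-layer output matrix H
            self.beta = np.linalg.pinv(H) @ T                   # output weights: beta = pinv(H) @ T
            return self

        def predict_scores(self, X):
            H = 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))
            return H @ self.beta                                # one score per archive class

        def predict(self, X):
            return np.argmax(self.predict_scores(X), axis=1)

The three archive classification models of step (5) are three such networks, trained on the bottom-layer, middle-layer and fused features respectively.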
The above are preferred embodiments of the present invention; any change made according to the technical scheme of the present invention that produces equivalent functional effects, without exceeding the scope of the technical scheme, belongs to the protection scope of the present invention.

Claims (6)

CN201811438592.XA | 2018-11-29 | 2018-11-29 | A kind of archives automatic classification method based on extreme learning machine | Pending | CN109582963A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201811438592.XA | CN109582963A (en) | 2018-11-29 | 2018-11-29 | A kind of archives automatic classification method based on extreme learning machine

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201811438592.XA | CN109582963A (en) | 2018-11-29 | 2018-11-29 | A kind of archives automatic classification method based on extreme learning machine

Publications (1)

Publication Number | Publication Date
CN109582963A (en) | 2019-04-05

Family

ID=65924925

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201811438592.XA | Pending | CN109582963A (en) | 2018-11-29 | 2018-11-29 | A kind of archives automatic classification method based on extreme learning machine

Country Status (1)

Country | Link
CN (1) | CN109582963A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN105160290A (en) * | 2015-07-03 | 2015-12-16 | 东南大学 | Mobile boundary sampling behavior identification method based on improved dense locus
US20180284747A1 (en) * | 2016-05-09 | 2018-10-04 | StrongForce IoT Portfolio 2016, LLC | Methods and systems for optimization of data collection and storage using 3rd party data from a data marketplace in an industrial internet of things environment
CN107451278A (en) * | 2017-08-07 | 2017-12-08 | 北京工业大学 | Chinese Text Categorization based on more hidden layer extreme learning machines
CN107590134A (en) * | 2017-10-26 | 2018-01-16 | 福建亿榕信息技术有限公司 | Text sentiment classification method, storage medium and computer
CN108733653A (en) * | 2018-05-18 | 2018-11-02 | 华中科技大学 | A kind of sentiment analysis method of the Skip-gram models based on fusion part of speech and semantic information
CN108875961A (en) * | 2018-06-11 | 2018-11-23 | 中国石油大学(华东) | A kind of online weighting extreme learning machine method based on pre- boundary's mechanism

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111007068A (en) * | 2019-11-21 | 2020-04-14 | 中国兵器工业信息中心 | Yellow cultivation diamond grade classification method based on deep learning
CN111007068B (en) * | 2019-11-21 | 2022-05-13 | 中国兵器工业信息中心 | Yellow cultivated diamond grade classification method based on deep learning
CN112632971B (en) * | 2020-12-18 | 2023-08-25 | 上海明略人工智能(集团)有限公司 | Word vector training method and system for entity matching
CN112632971A (en) * | 2020-12-18 | 2021-04-09 | 上海明略人工智能(集团)有限公司 | Word vector training method and system for entity matching
CN112785266A (en) * | 2021-01-22 | 2021-05-11 | 广西安怡臣信息技术有限公司 | Electronic archive detection management system
US20220245401A1 (en) * | 2021-01-29 | 2022-08-04 | Beijing Dajia Internet Information Technology Co., Ltd. | Method and apparatus for training model
CN113191123A (en) * | 2021-04-08 | 2021-07-30 | 中广核工程有限公司 | Indexing method and device for engineering design archive information and computer equipment
CN113434639A (en) * | 2021-07-08 | 2021-09-24 | 中国银行股份有限公司 | Audit data processing method and device
CN113609361A (en) * | 2021-08-20 | 2021-11-05 | 东北大学 | Data classification method based on Gaia system
CN113609361B (en) * | 2021-08-20 | 2023-11-14 | 东北大学 | Data classification method based on Gaia system
CN113610194A (en) * | 2021-09-09 | 2021-11-05 | 重庆数字城市科技有限公司 | Automatic classification method for digital files
CN113610194B (en) * | 2021-09-09 | 2023-08-11 | 重庆数字城市科技有限公司 | Automatic classification method for digital files
US12429842B2 (en) | 2021-10-07 | 2025-09-30 | Saudi Arabian Oil Company | Method and system for managing plant safety using machine learning

Similar Documents

Publication | Publication Date | Title
CN109582963A (en) | A kind of archives automatic classification method based on extreme learning machine
CN110580292A (en) | Text label generation method and device and computer readable storage medium
US20200394509A1 (en) | Classification Of Sparsely Labeled Text Documents While Preserving Semantics
US20170344822A1 (en) | Semantic representation of the content of an image
CN110188195A (en) | A kind of text intension recognizing method, device and equipment based on deep learning
CN110704616B (en) | Equipment alarm work order identification method and device
CN111143840B (en) | Method and system for identifying abnormity of host operation instruction
CN114627282A (en) | Target detection model establishing method, target detection model application method, target detection model establishing device, target detection model application device and target detection model establishing medium
CN115098690B (en) | Multi-data document classification method and system based on cluster analysis
CN112257425A (en) | Power data analysis method and system based on data classification model
CN113886562A (en) | An AI resume screening method, system, device and storage medium
CN112989058A (en) | Information classification method, test question classification method, device, server and storage medium
CN113722439A (en) | Cross-domain emotion classification method and system based on antagonism type alignment network
CN117494051A (en) | Classification processing method, model training method and related device
Athavale et al. | Predicting algorithm classes for programming word problems
Alam et al. | Social media content categorization using supervised based machine learning methods and natural language processing in bangla language
WO2025066156A1 (en) | Method and system for interpreting common interaction utility amongst multiple blackbox artificial intelligence models
CN119416874B (en) | Knowledge enhancement and self-adaptive fine tuning based power large language model construction method
Alshahrani et al. | Applied linguistics with red-tailed hawk optimizer-based ensemble learning strategy in natural language processing
CN118051592A (en) | Question answering method, device, equipment and storage medium
CN119782503A (en) | LLM-based document structured automatic processing method and system
CN116226747A (en) | Training method of data classification model, data classification method and electronic device
CN111046934A (en) | A kind of SWIFT message soft clause identification method and device
CN120068134A (en) | Sensitive word auditing method
CN115392254A (en) | Interpretable cognitive prediction and discrimination method and system based on target task

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
RJ01 | Rejection of invention patent application after publication

Application publication date: 2019-04-05

