CN109255014A - Recognition method for improving file keyword accuracy based on multiple algorithms - Google Patents

Publication number
CN109255014A
Authority
CN
China
Prior art keywords
keyword
model
module
calculated result
extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811210049.4A
Other languages
Chinese (zh)
Inventor
张永静
张彤
郝佳
高晓琼
李世成
郑春
郑春一
李景田
司敬
徐海
左晓辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jinghang Computing Communication Research Institute
Original Assignee
Beijing Jinghang Computing Communication Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jinghang Computing Communication Research Institute
Priority to CN201811210049.4A
Publication of CN109255014A
Legal status: Pending


Abstract

The invention belongs to the technical field of keyword retrieval, and specifically relates to a recognition method that improves the accuracy of file keyword recognition based on multiple algorithms. The method compares the number of times each algorithm hits a keyword. The weight ratio assigned to the algorithms can be configured by the user or left at the default configuration, and the hit counts are combined according to that weight ratio to produce the final result. The algorithms are a Chinese keyword extraction algorithm using a separation model, a Chinese keyword extraction algorithm based on high-dimensional clustering, a semantic-based Chinese text keyword extraction algorithm, and a Chinese keyword extraction algorithm based on a naive Bayes model; together they are used to accurately identify and extract keywords from files. In this way, the accuracy of file keyword recognition in the keyword retrieval field is improved by using multiple algorithms.

Description

Recognition method for improving file keyword accuracy based on multiple algorithms
Technical field
The invention belongs to the technical field of keyword retrieval, and specifically relates to a recognition method that improves the accuracy of file keyword recognition based on multiple algorithms.
Background art
In natural language processing, the key to handling massive numbers of text files is extracting the information that users care about most. For both long and short texts, a few keywords are often enough to reveal the theme of the entire text. At the same time, both text-based recommendation and text-based search depend heavily on text keywords, and the accuracy of keyword extraction directly affects the final performance of a recommender system or search system. Keyword extraction is therefore an important part of text mining.
Keyword recognition and retrieval is a technology that, under a unified policy, uses deep content analysis to identify, monitor, and protect static data, dynamic data, and data in use in real time.
Most current solutions mainly use the separation-model algorithm to extract keyword terms and keyword strings. Because existing solutions rely on a single algorithm, and each algorithm has its own strengths and characteristics, computing keywords with one algorithm cannot avoid the drawbacks of that algorithm. The accuracy of the keyword recognition technology currently on the market therefore needs to be improved.
Summary of the invention
(1) Technical problem to be solved
The technical problem to be solved by the present invention is that, because current approaches use a single algorithm, the results of multiple scans cannot be combined for accurate, comprehensive analysis.
(2) Technical solution
To solve the above technical problem, the present invention provides a recognition method that improves the accuracy of file keyword recognition based on multiple algorithms. The recognition method is implemented on a recognition system, and the recognition system includes: an original text input module, a text preprocessing module, a Chinese keyword extraction module based on the separation model, a Chinese keyword extraction module based on high-dimensional clustering, a semantic-based Chinese keyword extraction module, a Chinese keyword extraction module based on the naive Bayes model, an algorithm weight ratio distribution module, and a keyword recognition result generation module. Specifically,
The recognition method includes the following steps:
Step 1: The original text on which keyword recognition is to be performed is input through the original text input module;
Step 2: The text preprocessing module performs text format conversion preprocessing on the original text supplied by the original text input module, forming candidate words for the subsequent recognition algorithms to process;
Step 3: The Chinese keyword extraction module based on the separation model applies the separation model to the candidate words from the text preprocessing module, extracts keyword terms and keyword strings, generates the separation-model calculation result, and obtains keyword count information;
Step 4: The Chinese keyword extraction module based on high-dimensional clustering applies high-dimensional clustering to the candidate words from the text preprocessing module, extracts keyword terms and keyword strings, generates the high-dimensional clustering calculation result, and obtains keyword count information;
Step 5: The semantic-based Chinese keyword extraction module applies the semantic-based Chinese text keyword extraction algorithm to the candidate words from the text preprocessing module, extracts keyword terms and keyword strings, generates the semantic-based calculation result, and obtains keyword count information;
Step 6: The Chinese keyword extraction module based on the naive Bayes model applies the naive Bayes model to the candidate words from the text preprocessing module, extracts keyword terms and keyword strings, generates the naive Bayes calculation result, and obtains keyword count information;
Step 7: The algorithm weight ratio distribution module configures the weight ratio that the separation-model calculation result, the high-dimensional clustering calculation result, the semantic-based calculation result, and the naive Bayes calculation result each carry in computing the final keyword result;
Step 8: The keyword recognition result generation module compares the number of times each keyword is hit in the separation-model calculation result, the high-dimensional clustering calculation result, the semantic-based calculation result, and the naive Bayes calculation result, and combines these hit counts according to the preconfigured weight ratio to obtain the final keyword recognition result.
Wherein the Chinese keyword extraction module based on the separation model uses the separation-model Chinese keyword extraction algorithm, which treats keyword recognition and extraction as a classification problem and classifies each candidate keyword in the text as a keyword or a non-keyword.
Wherein the separation model builds separate models for keyword terms and keyword strings, and each model selects different keyword features.
Wherein the Chinese keyword extraction module based on high-dimensional clustering extracts keywords in four steps: fast word segmentation based on a small dictionary, secondary word segmentation, high-dimensional clustering, and keyword selection.
Wherein the semantic-based Chinese keyword extraction module incorporates word semantic features into the keyword extraction process, constructs a word semantic similarity network, and uses betweenness density to measure the semantic importance of words.
Wherein the Chinese keyword extraction module based on the naive Bayes model first obtains the parameters of the naive Bayes model through a training process and then, on that basis, completes keyword extraction in the test process.
Wherein the algorithm weight ratio distribution module determines, according to the ratio 2:3:4:3, the weights that the separation-model calculation result, the high-dimensional clustering calculation result, the semantic-based calculation result, and the naive Bayes calculation result each carry in computing the final keyword result.
Wherein the 2:3:4:3 weight ratio is the default configuration.
Wherein the weight ratio can be configured according to the specific application scenario.
Wherein the formats of the original text include WORD and PDF.
(3) Beneficial effects
Compared with the prior art, the present invention combines the Chinese keyword extraction algorithm using the separation model, the Chinese keyword extraction algorithm based on high-dimensional clustering, the semantic-based Chinese text keyword extraction algorithm, and the Chinese keyword extraction algorithm based on the naive Bayes model, and judges by comprehensive matching across them to improve the accuracy of keyword extraction and recognition.
The number of times each algorithm hits a keyword is compared. The weight ratio assigned to the algorithms defaults to 2:3:4:3 when computing the recognition result, and the weights can be configured according to the specific application scenario. The hit counts are combined according to each algorithm's weight, and the outcome is taken as the final result.
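The text does not give a formula for combining hit counts with the weight ratio. One plausible reading, stated here as an assumption rather than as the patent's definition, is a weighted sum: with h_i(k) the number of times algorithm i hits keyword k and w the default weights in the order separation model, high-dimensional clustering, semantic, naive Bayes. The hit counts in the worked example are hypothetical.

```latex
\[
S(k) = \sum_{i=1}^{4} w_i \, h_i(k), \qquad (w_1, w_2, w_3, w_4) = (2, 3, 4, 3)
\]
% Hypothetical example: k_1 is hit once each by algorithms 1, 2 and 4;
% k_2 is hit twice by algorithm 2 and once by algorithm 3.
\begin{align*}
S(k_1) &= 2\cdot 1 + 3\cdot 1 + 4\cdot 0 + 3\cdot 1 = 8 \\
S(k_2) &= 2\cdot 0 + 3\cdot 2 + 4\cdot 1 + 3\cdot 0 = 10
\end{align*}
```

Under this reading k_2 ranks above k_1 even though fewer algorithms hit it, because the higher-weighted algorithms contribute more.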
In this way, the accuracy of file keyword recognition in the keyword retrieval field is improved by using multiple algorithms.
Brief description of the drawings
Fig. 1 is a schematic diagram of the technical solution of the present invention.
Detailed description of the embodiments
To make the purpose, content, and advantages of the present invention clearer, specific embodiments of the invention are described in further detail below with reference to the accompanying drawing and embodiments.
The present invention also provides a recognition system that improves the accuracy of file keyword recognition based on multiple algorithms. As shown in Fig. 1, the recognition system includes:
The original text input module, which is used to input the original text on which keyword recognition is to be performed;
The text preprocessing module, which is used to perform text format conversion preprocessing on the original text supplied by the original text input module, forming candidate words for the subsequent recognition algorithms to process;
The Chinese keyword extraction module based on the separation model, which is used to apply the separation model to the candidate words from the text preprocessing module, extract keyword terms and keyword strings, generate the separation-model calculation result, and obtain keyword count information;
The Chinese keyword extraction module based on high-dimensional clustering, which is used to apply high-dimensional clustering to the candidate words from the text preprocessing module, extract keyword terms and keyword strings, generate the high-dimensional clustering calculation result, and obtain keyword count information;
The semantic-based Chinese keyword extraction module, which is used to apply the semantic-based Chinese text keyword extraction (SKE) algorithm to the candidate words from the text preprocessing module, extract keyword terms and keyword strings, generate the semantic-based calculation result, and obtain keyword count information;
The Chinese keyword extraction module based on the naive Bayes model, which is used to apply the naive Bayes model to the candidate words from the text preprocessing module, extract keyword terms and keyword strings, generate the naive Bayes calculation result, and obtain keyword count information;
The algorithm weight ratio distribution module, which is used to configure, for the specific application scenario, the weight ratio that the separation-model calculation result, the high-dimensional clustering calculation result, the semantic-based calculation result, and the naive Bayes calculation result each carry in computing the final keyword result;
The keyword recognition result generation module, which is used to compare the number of times each keyword is hit in the separation-model calculation result, the high-dimensional clustering calculation result, the semantic-based calculation result, and the naive Bayes calculation result, and to combine these hit counts according to the preconfigured weight ratio to obtain the final keyword recognition result.
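The module layout above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the patent's implementation: the class name `RecognitionSystem`, the `Extractor` signature, and `toy_frequency_extractor` are all hypothetical, and each real extraction module is replaced by a pluggable function that returns hit counts per keyword.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical interface: an extractor maps candidate words to {keyword: hit_count}.
Extractor = Callable[[List[str]], Dict[str, int]]


@dataclass
class RecognitionSystem:
    preprocess: Callable[[str], List[str]]        # text preprocessing module
    extractors: Dict[str, Extractor]              # the extraction modules
    weights: Dict[str, float] = field(default_factory=dict)  # weight ratio module

    def recognize(self, original_text: str) -> Dict[str, float]:
        candidates = self.preprocess(original_text)           # candidate words
        scores: Dict[str, float] = {}
        for name, extract in self.extractors.items():         # run each algorithm
            weight = self.weights.get(name, 1.0)               # unlisted: weight 1
            for keyword, hits in extract(candidates).items():
                # Result generation: weighted sum of hit counts (an assumption).
                scores[keyword] = scores.get(keyword, 0.0) + weight * hits
        return scores


def toy_frequency_extractor(candidates: List[str]) -> Dict[str, int]:
    """Stand-in extractor: counts how often each candidate word occurs."""
    counts: Dict[str, int] = {}
    for word in candidates:
        counts[word] = counts.get(word, 0) + 1
    return counts
```

A real system would register four extractors, one per algorithm, and pass the 2:3:4:3 ratio as `weights`; keeping the extractors behind one function signature is what lets the weight module treat them uniformly.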
Wherein the Chinese keyword extraction module based on the separation model uses the separation-model Chinese keyword extraction algorithm, which treats keyword recognition and extraction as a classification problem and classifies each candidate keyword in the text as a keyword or a non-keyword;
Wherein the separation model builds separate models for keyword terms and keyword strings, and each model selects different keyword features.
Extracting keyword terms and keyword strings with different features improves extraction accuracy. This is the most commonly used algorithm for keyword recognition, and its calculation result accounts for 2/10 of the final result computation.
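The idea of separate models for keyword terms and keyword strings, each with its own features, can be illustrated as follows. This is a sketch under loud assumptions: the source names neither the features nor the classifier, so the features (relative frequency, first position, string length), the linear scorer, and every weight below are hypothetical and untrained.

```python
import math


def term_features(candidate, doc_tokens):
    """Features for a single-word candidate: relative frequency and earliness."""
    tf = doc_tokens.count(candidate) / len(doc_tokens)
    first = doc_tokens.index(candidate) / len(doc_tokens)
    return [tf, 1.0 - first]


def string_features(candidate, doc_tokens):
    """Features for a multi-word candidate (a tuple of tokens)."""
    n = len(candidate)
    occurrences = sum(
        1 for i in range(len(doc_tokens) - n + 1)
        if tuple(doc_tokens[i:i + n]) == candidate
    )
    return [occurrences / len(doc_tokens), float(n)]


class LinearClassifier:
    """Tiny logistic scorer; weights are illustrative, not learned."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def is_keyword(self, features):
        z = sum(w * x for w, x in zip(self.weights, features)) + self.bias
        return 1.0 / (1.0 + math.exp(-z)) >= 0.5


# Two separate models, one per candidate type, each with its own feature set.
TERM_MODEL = LinearClassifier([10.0, 1.0], -3.0)
STRING_MODEL = LinearClassifier([10.0, 0.5], -2.5)


def classify(candidate, doc_tokens):
    """Route the candidate to the model that matches its type."""
    if isinstance(candidate, tuple):
        return STRING_MODEL.is_keyword(string_features(candidate, doc_tokens))
    return TERM_MODEL.is_keyword(term_features(candidate, doc_tokens))
```

The point of the split is that a feature that is informative for phrases, such as length in tokens, has no meaning for single words, so the two problems are scored independently.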
Wherein the Chinese keyword extraction module based on high-dimensional clustering uses the Chinese keyword extraction algorithm based on high-dimensional clustering, which was proposed to address the low accuracy of keyword extraction methods based on statistical information. It extracts keywords in four steps: fast word segmentation based on a small dictionary, secondary word segmentation, high-dimensional clustering, and keyword selection.
Theoretical analysis and experiments show that the Chinese keyword extraction method based on high-dimensional clustering has better stability, higher efficiency, and more accurate results. The algorithm is fast and its recognition accuracy is high, and its calculation result accounts for 3/10 of the final result computation.
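Only the first of the four steps, fast word segmentation with a small dictionary, is sketched here. The source does not say which segmentation technique is used; forward maximum matching is swapped in as a common dictionary-based method, and the dictionary contents and `max_len` are hypothetical.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Segment text greedily, taking the longest dictionary word at each position.

    Characters that start no dictionary word are emitted as single-character
    tokens, which a later (secondary) segmentation pass could regroup.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest window first and shrink until a match is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens
```

For example, with the dictionary `{"关键词", "提取", "算法"}`, the string `"关键词提取算法"` is split into three tokens, one per dictionary word. A small dictionary keeps the lookup fast at the cost of leaving unknown words as single characters.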
Wherein the semantic-based Chinese keyword extraction module uses the semantic-based Chinese text keyword extraction (SKE) algorithm, which incorporates word semantic features into the keyword extraction process, constructs a word semantic similarity network, and uses betweenness density to measure the semantic importance of words.
Compared with keyword extraction algorithms based on statistical features, the SKE algorithm extracts keywords with better performance. Its keyword recognition accuracy is high, and its calculation result accounts for 4/10 of the final result computation.
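A semantic similarity network and a betweenness-style importance score can be sketched as below. The source does not define its betweenness density metric, so this example substitutes plain betweenness centrality (Brandes' algorithm on an unweighted, undirected graph) as a stand-in. The `similarity` function and `threshold` are hypothetical inputs, and `words` is assumed to contain no duplicates.

```python
from collections import deque


def build_similarity_graph(words, similarity, threshold):
    """Connect two words when their semantic similarity reaches the threshold."""
    graph = {w: set() for w in words}
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if similarity(a, b) >= threshold:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def betweenness(graph):
    """Betweenness centrality of every node (Brandes' algorithm)."""
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from source
        dist = {v: -1 for v in graph}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:                     # breadth-first search from source
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                     # accumulate dependencies in reverse order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each undirected pair is counted from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in centrality.items()}
```

Words that sit on many shortest paths between other words bridge otherwise separate semantic groups, which is the intuition behind using betweenness to rank candidates.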
Wherein the Chinese keyword extraction module based on the naive Bayes model uses the naive Bayes Chinese keyword extraction algorithm, which first obtains the parameters of the naive Bayes model through a training process and then, on that basis, completes keyword extraction in the test process. Experiments show that, compared with traditional methods, the algorithm can extract more accurate keywords from small-scale document sets and can flexibly add feature items that characterize word importance, giving it better scalability. The algorithm's keyword recognition accuracy is very high on small documents, and its calculation result accounts for 3/10 of the final result computation.
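The train-then-classify flow described above can be sketched with a small categorical naive Bayes classifier. The feature choice (a frequency bucket and an in-title flag), the Laplace smoothing, and the class name `NaiveBayesKeywordModel` are assumptions for illustration; the source does not list its features or smoothing.

```python
import math
from collections import defaultdict


class NaiveBayesKeywordModel:
    """Categorical naive Bayes over discrete features, with Laplace smoothing."""

    def fit(self, samples, labels):
        # Training process: estimate the model parameters from labelled candidates.
        self.class_counts = defaultdict(int)
        self.feature_counts = defaultdict(int)   # (label, index, value) -> count
        self.feature_values = defaultdict(set)   # index -> values seen in training
        for features, label in zip(samples, labels):
            self.class_counts[label] += 1
            for i, value in enumerate(features):
                self.feature_counts[(label, i, value)] += 1
                self.feature_values[i].add(value)
        self.total = len(labels)
        return self

    def log_posterior(self, features, label):
        # log P(label) + sum of log P(feature_i | label), up to a shared constant.
        score = math.log(self.class_counts[label] / self.total)
        for i, value in enumerate(features):
            count = self.feature_counts.get((label, i, value), 0)
            denom = self.class_counts[label] + len(self.feature_values[i])
            score += math.log((count + 1) / denom)
        return score

    def is_keyword(self, features):
        # Test process: pick the class with the larger posterior.
        return self.log_posterior(features, True) > self.log_posterior(features, False)
```

Adding another feature item only means appending a value to each feature tuple, which matches the text's point that new indicators of word importance can be added flexibly.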
Wherein the algorithm weight ratio distribution module determines, according to the ratio 2:3:4:3, the weights that the separation-model calculation result, the high-dimensional clustering calculation result, the semantic-based calculation result, and the naive Bayes calculation result each carry in computing the final keyword result.
Wherein the 2:3:4:3 weight ratio is the default configuration.
Wherein the weight ratio can be configured according to the specific application scenario.
Wherein the formats of the original text include WORD and PDF.
Embodiment 1
This embodiment provides a method for improving the recognition accuracy of file keywords based on multiple algorithms. The method processes and parses the keywords of a file with the Chinese keyword extraction algorithm using the separation model, the Chinese keyword extraction algorithm based on high-dimensional clustering, the semantic-based Chinese text keyword extraction (SKE) algorithm, and the Chinese keyword extraction algorithm based on the naive Bayes model, and judges by weight to improve accuracy.
Wherein the Chinese keyword extraction algorithm based on the separation model extracts keyword terms and keyword strings. For these two problems, keyword term extraction and keyword string extraction, the algorithm designs different features to improve extraction accuracy.
Wherein the Chinese keyword extraction algorithm based on high-dimensional clustering was proposed to address the low accuracy of keyword extraction methods based on statistical information. The algorithm extracts keywords in four steps: fast word segmentation based on a small dictionary, secondary word segmentation, high-dimensional clustering, and keyword selection. Theoretical analysis and experiments show that the Chinese keyword extraction method based on high-dimensional clustering has better stability, higher efficiency, and more accurate results.
Wherein the semantic-based Chinese text keyword extraction (SKE) algorithm incorporates word semantic features into the keyword extraction process, constructs a word semantic similarity network, and uses betweenness density to measure the semantic importance of words. Compared with keyword extraction algorithms based on statistical features, the SKE algorithm extracts keywords with better performance.
Wherein the Chinese keyword extraction algorithm based on the naive Bayes model first obtains the parameters of the naive Bayes model through a training process and then, on that basis, completes keyword extraction in the test process. Experiments show that, compared with the traditional tf*idf method, the algorithm can extract more accurate keywords from small-scale document sets and can flexibly add feature items that characterize word importance, giving it better scalability.
Each algorithm extracts keywords so that keyword count information for the file or folder is obtained accurately. The number of times each algorithm hits a keyword is compared. The weight ratio assigned to the algorithms defaults to 2:3:4:3 when computing the recognition result, and the weights can be configured according to the specific application scenario. The hit counts are combined according to each algorithm's weight, and the outcome is taken as the final result.
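The final combination step can be sketched as follows. The weighted-sum rule is an assumption, since the text only says that hit counts are calculated according to the weight ratio. Note also that 2:3:4:3 sums to 12, while the description states the shares as tenths (2/10, 3/10, 4/10, 3/10); this sketch normalizes by the sum of the weights. The algorithm names used as dictionary keys are hypothetical.

```python
from collections import defaultdict

# Default ratio from the text, in the order separation model, high-dimensional
# clustering, semantic (SKE), naive Bayes.
DEFAULT_WEIGHTS = {"separation": 2, "clustering": 3, "semantic": 4, "bayes": 3}


def fuse_keywords(hits_by_algorithm, weights=None, top_n=5):
    """Rank keywords by the weighted sum of per-algorithm hit counts.

    hits_by_algorithm maps an algorithm name to {keyword: hit_count}.
    """
    weights = weights or DEFAULT_WEIGHTS
    total = sum(weights.values())
    scores = defaultdict(float)
    for name, hits in hits_by_algorithm.items():
        share = weights[name] / total          # normalized weight of this algorithm
        for keyword, count in hits.items():
            scores[keyword] += share * count
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]
```

Passing a different `weights` dictionary corresponds to configuring the ratio for a specific application scenario, for example raising the naive Bayes weight when the input is mostly small documents.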
The above is only a preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art can make several improvements and modifications without departing from the technical principles of the invention, and these improvements and modifications should also be regarded as falling within the protection scope of the invention.

Claims (10)

Application number
CN201811210049.4A
Priority date
2018-10-17
Filing date
2018-10-17
Title
Recognition method for improving file keyword accuracy based on multiple algorithms
Status
Pending
Publication number
CN109255014A (en)
Publication date
2019-01-22
Family ID
65045874
Country
CN

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
Inventors after: Zhang Yongjing, Xu Hai, Zuo Xiaohui, Wang Jun, Zhang Tong, Hao Jia, Gao Xiaoqiong, Li Shicheng, Zheng Chunyi, Li Jingtian, Si Jing
Inventors before: Zhang Yongjing, Zuo Xiaohui, Zhang Tong, Hao Jia, Gao Xiaoqiong, Li Shicheng, Zheng Chunyi, Li Jingtian, Si Jing, Xu Hai
RJ01 Rejection of invention patent application after publication
Application publication date: 2019-01-22

