CN107656919B - An automatic optimal LDA model selection method based on minimum average similarity between topics - Google Patents


Info

Publication number: CN107656919B
Authority: CN (China)
Prior art keywords: theme, temp, similarity, num, array
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201710815144.6A
Other languages: Chinese (zh)
Other versions: CN107656919A (en)
Inventors: 汪洋, 孙启超, 韩宁
Current Assignee: CHINA SOFTWARE AND TECHNOLOGY SERVICE Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: CHINA SOFTWARE AND TECHNOLOGY SERVICE Co Ltd
Priority date: 2017-09-12 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2017-09-12
Publication date: 2018-10-26

Abstract

The invention discloses an automatic optimal LDA model selection method based on the minimum average similarity between topics. The method is as follows. Vary K within an initially set interval and, for each K value: set the initial number of topics of the target document collection to the current K, train an LDA model on the target document collection, and obtain K topic-word probability distribution vectors; compute the average similarity AC_K between the vectors and store it in a global average similarity array. Select the minimum average similarity in the array as the temporary minimum average similarity and, taking the position of the current temporary minimum average similarity in the array as the center, determine the optimal number of topics of the target document collection from that center. The corresponding LDA model is the optimal LDA model for the target document collection. The proposed method is more intuitive and reliable in practical applications.

Description

An automatic optimal LDA model selection method based on minimum average similarity between topics
Technical Field
The present invention relates to natural language processing and machine learning within computer science, and specifically to a method for determining the optimal number of topics of an LDA model. Because the optimal number of topics directly determines the optimal LDA model, this method is also called a method for determining the optimal LDA model.
Background Art
Since it was proposed by David Blei et al. in 2003 (see D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993-1022, 2003), the LDA (Latent Dirichlet Allocation) topic model has been widely applied in fields that involve text semantic analysis, such as text mining, information retrieval, computational advertising, recommender systems, question answering systems, and knowledge graphs. LDA is a generative probabilistic model (see Zhao Xin, Research on User Topic Interest Modeling and Mining in Social Media, outstanding doctoral dissertation of Peking University, 2014). Unlike the traditional vector space model, it does not treat a document merely as a representation in dictionary space; it introduces the concept of a topic space and represents text in that space. Introducing topics brings two benefits. (1) It yields a low-dimensional representation of text, which greatly benefits subsequent computations such as text classification and avoids the "curse of dimensionality". (A text is represented by a vector in topic space whose dimension, the dimension of the topic space, is determined by the number of topics. This is usually much lower than in the conventional vector space model, where the text vector dimension is determined by the size of the dictionary obtained from the text collection and is typically far larger than the number of topics.) (2) It mines the semantic information hidden behind the text collection, namely the topics, and is a powerful tool for text semantic modeling.
Because LDA has a solid mathematical foundation and good extensibility, exploration of the model itself and of its combination with other methods has long been a hot research topic in fields such as natural language processing and machine learning. Determining the optimal number of topics, a parameter of the LDA model, is one specific research difficulty.
A literature search shows that the main methods for determining the optimal number of topics of an LDA model are the following:
(1) Empirical setting. In text semantic analysis tasks, researchers often tune the number of topics repeatedly and observe the quality of the experimental results, for example the quality of the high-probability topic words and whether they are semantically consistent (see Zhao Xin, Research on User Topic Interest Modeling and Mining in Social Media, outstanding doctoral dissertation of Peking University, 2014). Empirical setting requires human participation and judgment based on experience, so the result is not necessarily accurate, and judgment criteria differ between people. If the document collection is huge and contains tens or hundreds of topics, judging them one by one by hand is almost impossible. It is also not an automated method.
(2) Perplexity-based determination. For a document collection, after LDA text modeling and training, the Perplexity value is computed from the results. A lower Perplexity corresponds to a better LDA model, but no automated method for determining the lowest Perplexity has yet been proposed. In practice, the minimum point is usually determined by hand from the curve of Perplexity against the number of topics, which gives the optimal number of topics.
(3) Extensions based on nonparametric Bayesian methods. The most representative work is the Hierarchical Dirichlet Processes model, which to some extent solves the problem of automatically determining the number of topics in a topic model. However, because the model is complex, it is more complicated to run in practice and its cost is too high (see Teh, Y. W.; Jordan, M. I.; Beal, M. J.; Blei, D. M. (2006). Hierarchical Dirichlet Processes. Journal of the American Statistical Association. 101: pp. 1566–1581).
(4) Cao Juan et al. proposed an optimal LDA model selection method based on the principle of minimum average similarity between topics. They proved the conclusion that "the topic model is optimal when the average similarity between topics is minimal" and also proposed a density-based algorithm for selecting the optimal number of topics. The algorithm is modeled on the idea of the density-based clustering algorithm DBSCAN and is a relatively good automated method. However, because the algorithm's assumptions, its convergence condition, and its way of determining the computation step size deviate in practical application, its results are not necessarily accurate or reliable (see Cao Juan, Zhang Yongdong, Li Jintao, Tang Sheng, A density-based method for adaptive optimal LDA model selection).
Summary of the Invention
In view of the technical problems in the prior art, the object of the present invention is to provide a method for automatically determining the optimal number of topics. Compared with existing methods, the proposed method is more intuitive and reliable in practical applications.
The technical solution of the present invention is:
An automatic optimal LDA model selection method based on minimum average similarity between topics, the steps of which include:
1) Vary K within the initially set interval [K0, KMAX]. For each selected K: set the initial number of topics of the target document collection to the current K, train an LDA model on the target document collection, and obtain K topic-word probability distribution vectors; compute the average similarity AC_K between these K topic-word probability distribution vectors and store it in the global average similarity array AC_Array, which is a one-dimensional array;
2) Select the minimum average similarity in the global average similarity array AC_Array as the temporary minimum average similarity; the number of topics corresponding to the temporary minimum average similarity is TEMP_Kbest;
3) In the global average similarity array AC_Array, take the position of the current temporary minimum average similarity as the center; denote the total number of array elements to the right of the center as NUM_R_TEMP_Kbest and the total number of array elements to the left of the center as NUM_L_TEMP_Kbest;
4) If NUM_R_TEMP_Kbest is greater than N × NUM_L_TEMP_Kbest, output the number of topics Kbest = TEMP_Kbest. If NUM_L_TEMP_Kbest is greater than NUM_R_TEMP_Kbest, set KMAX = KMAX + m, K0 = Km, r = r0, and repeat steps 1) to 4), where Km is the largest K value reached in step 1). If NUM_L_TEMP_Kbest is less than NUM_R_TEMP_Kbest and NUM_R_TEMP_Kbest is less than N × NUM_L_TEMP_Kbest, set KMAX = N × NUM_L_TEMP_Kbest, K0 = Km, r = r1, and repeat steps 1) to 4), where r1 is greater than r0;
5) Take Kbest as the optimal number of topics of the target document collection; the LDA model corresponding to Kbest is the optimal LDA model of the target document collection.
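The three-way test in step 4) can be sketched as a small function. This is only an illustration: the function name `next_action` and the labels `"stop"`, `"extend"` and `"coarsen"` are hypothetical, and the text does not say what happens when the counts are exactly equal, so the sketch reports those cases as `"unspecified"` rather than guessing.

```python
def next_action(num_l: int, num_r: int, n: int = 4) -> str:
    """Classify the position of the temporary minimum in AC_Array.

    num_l / num_r are the element counts left / right of the "center"
    (NUM_L_TEMP_Kbest and NUM_R_TEMP_Kbest); n is the ratio N.
    The returned labels are hypothetical names for the three branches.
    """
    if num_r > n * num_l:
        return "stop"         # output Kbest = TEMP_Kbest
    if num_l > num_r:
        return "extend"       # KMAX = KMAX + m, K0 = Km, r = r0
    if num_l < num_r < n * num_l:
        return "coarsen"      # KMAX = N * NUM_L, K0 = Km, r = r1
    return "unspecified"      # ties are not covered by the text


print(next_action(9, 0))   # minimum at the right edge of the array
print(next_action(3, 13))  # enough points to the right of the minimum
```

With N = 4, a minimum that has 3 elements to its left needs at least 13 elements to its right before the search stops.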
Further, K is varied according to K = K0 + r*n, where r is the topic-number increment step and n is a positive integer.
Further, the average similarity AC_K is computed as follows: first compute the pairwise similarity between the K topic-word probability distribution vectors, then take the mean of these similarities to obtain the average similarity AC_K.
Further, N is 4.
Further, r0 is the initial value of the topic-number increment step; r0 = 3, the initial value of KMAX is 30, and r1 = 10.
Further, m = 30.
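The "pairwise similarity, then mean" computation of AC_K can be sketched as below. The text only says "similarity"; cosine similarity is assumed here, and the function names `cosine` and `average_similarity` are hypothetical. Each topic is given as a list of word probabilities of equal length.

```python
import math
from itertools import combinations
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two topic-word vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def average_similarity(topics: Sequence[Sequence[float]]) -> float:
    """Mean similarity over all unordered pairs of topics (AC_K)."""
    pairs = list(combinations(topics, 2))
    if not pairs:
        raise ValueError("at least two topics are needed")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Three topics that share no words are maximally dissimilar.
print(average_similarity([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]))
```

For K topics the mean runs over K(K-1)/2 pairs, so the cost of this step grows quadratically in K, on top of the LDA training itself.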
The invention is described below in several parts: the statement of the problem of automatically finding the optimal number of topics of an LDA model, the approach to solving it, and the specific algorithm.
Part One: statement of the problem of automatically determining the optimal number of topics of an LDA model.
When an LDA model is used, the number of topics must be specified in advance. Different numbers of topics give different trained results. Cao Juan et al. have proved that the LDA model is optimal when the average similarity between topics is minimal, and the optimal number of topics is obtained at that point. Based on this conclusion, how to design an algorithm that automatically finds the minimum similarity between topics is a research subject of great practical value.
The following illustrates the common method of manually determining the optimal number of topics of a document collection. For a specific document collection, compute and plot the trend of the average similarity between topics against the number of topics; the optimal number of topics can then be judged by hand from the plot. The detailed process is as follows. First, set the computation range for the number of topics, including the computation interval and the increment step. Then, for each configured number of topics, train the corresponding LDA model and compute the average similarity between topics. After all computations finish, plot the average similarity between topics (vertical axis) against the number of topics (horizontal axis), as shown in Fig. 1 to Fig. 3 (experiments on several data sets of different scales). From the trend plot, the position of the global minimum of the average similarity can be judged by hand to determine the optimal number of topics.
Designing an algorithm that automates the above process is the problem the present invention solves.
Part Two: approach of the automatic optimal LDA model selection algorithm based on minimum average similarity between topics.
Fig. 1 to Fig. 3 show certain examples. On different data sets, the trend of the average similarity against the number of topics varies widely. Faced with a specific document collection, if one wants the number of topics corresponding to the minimum average similarity, the key is how to obtain by computation a reliable, referable plot of average similarity against number of topics. To do this, the crucial point is to determine the horizontal-axis range of the plot (the interval of topic numbers examined, with its start and end) and the spacing (the topic-number increment step). In the present invention, the specific design is as follows:
In the present invention, the start (minimum) of the examined interval of topic numbers is set to 2. If the start is set too large, the minimum average similarity may be missed.
Consider the end of the topic-number computation interval. For an LDA model, the maximum number of topics that can be specified is, in theory, no more than the size of the dictionary obtained from the document collection (in the extreme case one word represents one topic, so a document collection has as many distinct topics as it has distinct words). However, test data on various kinds of documents show that the optimal number of topics is far from this value (because a topic is typically represented by tens or hundreds of words). On the contrary, relative to this theoretical maximum, the optimal number of topics is usually close to the start (minimum) end. So in experiments the usual way is to start computing from the start (minimum) end, increase by a certain step, and stop at some estimated maximum; this ensures that the optimal number of topics is found, and found fastest. Meanwhile, because an LDA model must be trained for each number of topics during the search, the training cost must be considered. The larger the specified number of topics, the greater the LDA training computation, and the computation time may grow several-fold or tens of times. Taking these factors together, the ideal is for the interval end to be large enough that the interval contains the point corresponding to the minimum average similarity plus suitably many more points, so that the subsequent trend can be determined and the point found is the global minimum or approximately the global minimum. In the present invention, the end (maximum) of the topic-number computation interval is increased while judging. The end is determined with the number of topics at the minimum average similarity as the center: by the method below, the maximum is 5 times the number of topics on the "right" of the center (the average similarity corresponding to the topic numbers on the "right" of the center is larger than the minimum average similarity at the center). We believe that this largely captures the trend of the average similarity between topics against the number of topics, ensures that the optimal number of topics obtained is the global or approximately global optimum, and meets the needs of practical business applications to the greatest extent at reasonable cost.
Setting the topic-number increment step. So as not to miss the global minimum average similarity because the step is too large, while also taking the amount of computation into account, the present invention sets the initial increment step to 3.
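With the start value 2 and the increment step 3 chosen above, and an end value of 30 (the initial KMAX used later in the algorithm), the topic numbers examined in the first pass can be listed with a one-line sketch. The helper name `candidate_topic_numbers` is hypothetical.

```python
from typing import List


def candidate_topic_numbers(k0: int = 2, k_max: int = 30, r: int = 3) -> List[int]:
    """All K = k0 + r*n (n = 0, 1, 2, ...) that lie inside [k0, k_max]."""
    return list(range(k0, k_max + 1, r))


grid = candidate_topic_numbers()
print(grid)      # the first-pass grid
print(grid[-1])  # largest K reached, i.e. Km
```

The first pass therefore trains ten models, for K = 2, 5, 8, ..., 29, and Km is 29 because the next candidate, 32, falls outside the interval.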
Part Three: description of the specific algorithm, including the algorithm steps and parameters.
The algorithm designed by the present invention for the optimal number of topics of an LDA model is as follows:
Algorithm input: the target document collection, the threshold (start value) K0 and end value KMAX of the examined interval of topic numbers, and the topic-number increment step r.
Algorithm output: the optimal number of topics Kbest and the output data obtained from the optimal LDA model.
Steps:
1. Set K0 = 2; set the initial value r0 of the topic-number increment step to 3 and the initial value of KMAX to 30;
2. Specify the number of topics K as K = K0 + r*n, n = 0, 1, 2, ..., with K in the interval [K0, KMAX] and step r = r0. Then loop over the following computation steps (2.1-2.3) until all K have been traversed (note: the largest topic number reached by incrementing is denoted Km):
2.1 For the target document collection, set its initial number of topics to K and train an LDA model on it. After each training completes, K topic-word probability distribution vectors are obtained. Each vector is denoted z_i, i = 1, 2, ..., K, with z_i = {β_{i,1}, β_{i,2}, ..., β_{i,V}}, where β_{i,j} is the probability of the j-th word of the document collection under topic i and V is the size of the dictionary obtained from the document collection (for LDA model training, see Heinrich Gregor, Parameter estimation for text analysis, 2009).
2.2 After the training above completes, compute the average similarity between the topics obtained by the model according to the formulas below, denoted AC_K, where the subscript K indicates that the number of topics is K. Append it in order to a global average similarity array AC_Array, which is a one-dimensional array.
First, compute the similarity between each pair of topics; the formula is as follows:
Then, compute the average similarity between the topics; the formula is as follows:
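The two formulas are missing from this text. A plausible reconstruction, assuming cosine similarity between topic-word vectors (the text itself only says "similarity") and using the symbols z_i, β_{i,v}, V and K defined in step 2.1, would be:

```latex
% Assumed pairwise similarity (cosine) between topics i and j
\mathrm{sim}(z_i, z_j) =
  \frac{\sum_{v=1}^{V} \beta_{i,v}\,\beta_{j,v}}
       {\sqrt{\sum_{v=1}^{V} \beta_{i,v}^{2}}\;\sqrt{\sum_{v=1}^{V} \beta_{j,v}^{2}}}

% Average over all K(K-1)/2 unordered pairs of topics
AC_K = \frac{2}{K(K-1)} \sum_{i=1}^{K-1} \sum_{j=i+1}^{K} \mathrm{sim}(z_i, z_j)
```

The second formula follows directly from the description "compute the pairwise similarities, then take their mean"; only the choice of cosine in the first formula is an assumption.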
3. Compute the temporary minimum average similarity. Take the minimum of the average similarity array AC_Array[2, KMAX] from step 2.2 to obtain the temporary minimum average similarity and the corresponding number of topics TEMP_Kbest.
4. After step 3, determine whether the algorithm continues to run or terminates.
Examine the position of the temporary minimum average similarity (corresponding number of topics TEMP_Kbest) within the average similarity array AC_Array:
In the array AC_Array, take the position of the temporary minimum average similarity as the "center"; denote the total number of array elements to the right of the "center" as NUM_R_TEMP_Kbest and the total number of array elements to the left of the "center" as NUM_L_TEMP_Kbest.
(1) If NUM_L_TEMP_Kbest is greater than NUM_R_TEMP_Kbest, set KMAX = KMAX + m (m = 30 in the present invention), K0 = Km, r = r0, and go to step 2;
(2) If NUM_L_TEMP_Kbest is less than NUM_R_TEMP_Kbest and NUM_R_TEMP_Kbest is less than 4 × NUM_L_TEMP_Kbest, set KMAX = 4 × NUM_L_TEMP_Kbest, K0 = Km, r = 10, and go to step 2;
(3) If NUM_R_TEMP_Kbest is greater than 4 × NUM_L_TEMP_Kbest, terminate the computation and set Kbest = TEMP_Kbest.
5. For the target document collection, output the optimal number of topics Kbest. The LDA model corresponding to Kbest is taken to be the optimal LDA model. This LDA model has already been trained during step 2, so the data obtained from its training can be output, including the topic-word probability distribution vectors and the document-topic probability distribution vectors (these data are the latent semantic information that the LDA model mines from the target document collection), and can be used in other computations.
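Steps 1 to 5 can be sketched end to end as follows. This is a minimal illustration under several assumptions that are not in the text: `avg_sim` is a hypothetical callable that trains an LDA model with K topics and returns AC_K; a K that was already evaluated is not retrained; ties are folded into the nearest branch; where the literal update KMAX = 4 × NUM_L would not enlarge the interval, the sketch falls back to the extension rule of case (1); and `hard_limit` is a safety cap so the loop always ends.

```python
from typing import Callable, Dict, Tuple


def select_topic_number(
    avg_sim: Callable[[int], float],  # hypothetical: train LDA with K topics, return AC_K
    k0: int = 2, k_max: int = 30,
    r0: int = 3, r1: int = 10,
    n_ratio: int = 4, m: int = 30,
    hard_limit: int = 1000,           # assumption: safety cap, not in the source
) -> Tuple[int, Dict[int, float]]:
    ac: Dict[int, float] = {}  # K -> AC_K; sorted by K it plays the role of AC_Array
    r = r0
    while True:
        # Step 2: K = K0 + r*n inside [K0, KMAX]; k_m is the largest K reached
        k = k_m = k0
        while k <= k_max:
            if k not in ac:    # assumption: skip K values evaluated earlier
                ac[k] = avg_sim(k)
            k_m = k
            k += r
        # Step 3: temporary minimum and its topic number TEMP_Kbest
        ks = sorted(ac)
        temp_k_best = min(ks, key=ac.__getitem__)
        # Step 4: element counts left and right of the "center"
        num_l = ks.index(temp_k_best)
        num_r = len(ks) - num_l - 1
        if num_r > n_ratio * num_l or k_max >= hard_limit:
            return temp_k_best, ac                # case (3), or safety cap
        if num_l < num_r and n_ratio * num_l > k_max:
            k_max, r = n_ratio * num_l, r1        # case (2), only if it enlarges KMAX
        else:
            k_max, r = k_max + m, r0              # case (1), and assumed fallback
        k0 = k_m


# Toy stand-in for LDA training: a curve whose minimum sits at K = 29.
best, table = select_topic_number(lambda k: abs(k - 29) / 100 + 0.1)
print(best, len(table))
```

In the toy run the first pass ends with the minimum at the right edge of the array, so the interval is extended several times until enough larger topic numbers have been examined to the right of the minimum. In real use `avg_sim` would wrap an LDA trainer and cache the trained models, so that the model for Kbest can be output in step 5 without retraining.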
Beneficial Effects of the Invention
To verify the practical effect of the proposed algorithm, experiments were carried out on 3 different data sets. The experiments show (see Fig. 4 to Fig. 6) that the optimal number of topics can be determined accurately and automatically without human intervention. Meanwhile, the plotted curves of the average similarity between topics against the number of topics show the trend and confirm that the number found is indeed optimal.
Brief Description of the Drawings
Fig. 1: document collection 1, average similarity versus number of topics;
Fig. 2: document collection 2, average similarity versus number of topics;
Fig. 3: document collection 3, average similarity versus number of topics;
Fig. 4: document collection 1, average similarity versus number of topics, plot 2 (optimal number of topics 29);
Fig. 5: document collection 2, average similarity versus number of topics, plot 2 (optimal number of topics 131);
Fig. 6: document collection 3, average similarity versus number of topics, plot 2 (optimal number of topics 50);
Fig. 7 is a flow chart of the method of the present invention.
Detailed Description
To make the above features and advantages of the present invention clearer and easier to understand, embodiments are described in detail below with reference to the accompanying drawings.
The invention is simple to implement: it suffices to write a program that follows the algorithm steps. The method flow is shown in Fig. 7. The detailed algorithm (inputs, outputs, and steps 1 to 5) is identical to the one set out in Part Three above.
The above embodiments merely illustrate the technical solution of the present invention and do not limit it. A person of ordinary skill in the art may modify the technical solution of the present invention or replace it with equivalents without departing from the spirit and scope of the present invention; the scope of protection of the present invention shall be as set out in the claims.

Claims (5)

CN201710815144.6A | Priority date 2017-09-12 | Filing date 2017-09-12 | An automatic optimal LDA model selection method based on minimum average similarity between topics | Active | CN107656919B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201710815144.6A (CN107656919B (en)) | 2017-09-12 | 2017-09-12 | An automatic optimal LDA model selection method based on minimum average similarity between topics


Publications (2)

Publication Number | Publication Date
CN107656919A (en) | 2018-02-02
CN107656919B (en) | 2018-10-26

Family

ID=61129681

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201710815144.6A (Active, CN107656919B (en)) | An automatic optimal LDA model selection method based on minimum average similarity between topics | 2017-09-12 | 2017-09-12

Country Status (1)

Country | Link
CN (1) | CN107656919B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN118484665B (en) * | 2024-07-16 | 2024-09-27 | 中国民用航空飞行学院 | Intelligent extraction method and system of text topics based on NLP technology

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN103984681A (en) * | 2014-03-31 | 2014-08-13 | 同济大学 | News event evolution analysis method based on time sequence distribution information and topic model
CN106599181A (en) * | 2016-12-13 | 2017-04-26 | 浙江网新恒天软件有限公司 | Hot news detecting method based on topic model

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US9135242B1 (en) * | 2011-10-10 | 2015-09-15 | The University Of North Carolina At Charlotte | Methods and systems for the analysis of large text corpora
US9542477B2 (en) * | 2013-12-02 | 2017-01-10 | Qbase, LLC | Method of automated discovery of topics relatedness
CN105740354B (en) * | 2016-01-26 | 2018-11-30 | 中国人民解放军国防科学技术大学 | Method and device for adaptive latent Dirichlet model selection


Also Published As

Publication number | Publication date
CN107656919A (en) | 2018-02-02


Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant