Detailed Description
The principle and spirit of the invention are described below with reference to several illustrative embodiments. It should be understood that these embodiments are provided only so that those skilled in the art can better understand and thereby implement the invention, and that they do not limit the scope of the invention in any way. Rather, these embodiments are provided so that the disclosure is thorough and complete and fully conveys the scope of the disclosure to those skilled in the art.
Those skilled in the art will appreciate that embodiments of the invention may be implemented as a system, apparatus, device, method, or computer program product. Accordingly, the disclosure may take the form of entirely hardware, entirely software (including firmware, resident software, microcode, and the like), or a combination of hardware and software.
Before the embodiments of the invention are explained in detail, some related technologies that the embodiments may involve are briefly described below.
CPT (cost per time, i.e., time-based charging) advertising
CPT is an advertising model that appeared relatively early on the internet. As described in the background, in CPT the resource provided by the advertising platform is usually the exclusive right to display on a specific ad slot during a certain period. Fig. 1 shows a schematic diagram of a typical CPT advertisement display.
CPT advertisements are usually priced according to the estimated traffic of a specific ad slot over a specific future period, so the charging rules are relatively simple and advertisers can easily understand them. In addition, the display of a CPT advertisement is relatively deterministic: within the purchased period, the advertiser can check at any time whether the advertisement is being exposed, which increases the advertiser's trust in the advertising platform. Easily understood rules and deterministic display can in turn attract advertisers to participate in the advertisement auction process.
The drawback of CPT advertising is that its delivery effect is difficult to assess. Taking the advertisement shown in Fig. 1 as an example, although the website can count the number of exposures by technical means, there is currently no mature technical solution that can accurately trace how many users converted as a result of seeing the advertisement. For this reason, traditional CPT advertising is mainly aimed at brand advertisers and is not suitable for small and medium-sized advertisers.
CPC (cost per click, i.e., click-based charging) advertising
As described in the background, CPC is an advertising model that emerged as search engines became widespread, and most companies that currently provide general or commodity information search engine services sell advertisements in this way. In this model, when a user searches for a keyword, the advertising platform calculates a quality score for each advertisement according to the relevance between the promotion object and the current user request, combines it with each advertiser's bid per click, ranks the promotion objects, and feeds them back to the user. Fig. 2 shows a schematic diagram of a typical CPC advertisement display.
Compared with CPT, one of the main advantages of CPC is that the delivery effect is easy to assess. In CPC, core delivery metrics such as impressions, clicks, and conversions are relatively easy to obtain, so advertisers can adjust their delivery plans at any time according to the performance data. This advantage is especially evident in industries where conversion can be completed online to form a closed transaction loop. For example, for CPC advertisements placed by an application download service provider, it is easy to count which users searched for a keyword and then downloaded from the download page provided by the search engine. As another example, for CPC advertisements placed by catering and hotel service businesses, data on users who searched for a keyword and then placed group-buying or reservation orders on the advertisement page provided by the search engine is also easy to obtain.
However, the logic by which the CPC model ranks promotion objects is complex, mainly in the calculation of the quality score. This imposes a certain learning cost on advertisers and in turn causes some advertisers to leave.
Keyword search advertising
Keyword search advertising refers to selling keywords to advertisers so that, when the advertising platform receives a query word entered by a user, it feeds back advertisement content according to the matched keyword. As mentioned above, the industry currently generally sells and delivers this type of advertisement using CPC. In some specific industries, however, the delivery efficiency of CPT has proven in practice to be higher than that of CPC.
Specifically, in some industries it is difficult to attract users to complete online conversion actions such as placing orders through internet advertising, so no online closed loop can be formed and the delivery effect of an advertisement is difficult to assess. Take wedding merchants as an example. Although they can also offer online closed-loop transactions (such as group purchase of wedding photo packages), such merchants often leave considerable room for in-store price negotiation, and users' individual needs mean that the information provided online is insufficient to guide a decision. The online conversion rate therefore remains low, and the delivery effect cannot be assessed. As mentioned above, ease of assessing the delivery effect is one of the main advantages of CPC over CPT, but in industries where an online closed loop is hard to form, CPC cannot exploit this advantage. With its simple charging rules and strongly deterministic display, CPT advertising is therefore more applicable in this scenario.
Existing approaches that provide keyword search advertising using CPT follow the CPC practice of taking a single keyword as the pricing unit: the advertising platform sells to advertisers a combination of keyword, ad slot, and time. Under this CPT model, keyword search advertising works, for example, as follows: for a query word entered by a user within a specific time period, the advertising platform feeds back advertisement content to the user at a specific position according to the matching keyword purchased by the advertiser.
It can be seen that keyword search advertising currently has two charging models, CPC and CPT, and both sell to advertisers with a single keyword as the pricing unit. However, selling by single keyword requires a large amount of manual effort to process keywords, is difficult to scale, and is usually suitable only for keywords with a large search volume. Long-tail words, which have a small search volume but are numerous, are difficult to use efficiently when sold as single keywords. This wastes traffic for both the advertising platform and advertisers, and also reduces the probability that users obtain accurate feedback results.
To solve the problems of the above schemes, embodiments of the disclosure provide a query feedback method based on keyword aggregation, a corresponding apparatus, a storage medium, and an electronic device. The principle and spirit of the invention are explained in detail below with reference to several representative embodiments.
Fig. 3 is a flow chart of a query feedback method based on keyword aggregation according to an embodiment of the disclosure. As shown in the figure, the method of this embodiment includes the following steps 301-303. In one embodiment, the method may be executed by a server that provides a general or commodity information search engine.
In step 301, a keyword set generated based on an estimated intent, and a promotion object corresponding to the keyword set, are obtained.
In one embodiment, a keyword set includes multiple keywords that share a similar characteristic. Here, a similar characteristic can be understood to mean that the keywords reflect the same user intent. In other words, by obtaining keywords generated based on an estimated intent, keywords that reflect the same intent can be aggregated together.
Taking the application of embodiments of the disclosure to advertising as an example, for advertisements of the wedding type, the estimated intents may include but are not limited to wedding photography, wedding banquets, wedding rings and jewelry, and honeymoon travel. By aggregating the corresponding keyword sets, the advertiser's purchase target changes from a single keyword to a keyword set.
In one embodiment, a promotion object includes any content that can be shown to the user. Taking advertising as an example, promotion objects here include but are not limited to merchant pages, commodity pages, and download pages.
In one embodiment, a corresponding promotion object is preset for each keyword set. In this way, once a matching keyword set is subsequently located based on the user's query word, the corresponding promotion object can be fed back to the user.
In step 302, the keyword set to which a query word belongs is determined according to the query word received from the user.
A traditional search engine, on receiving a query word from a user, directly feeds back the query result with the highest matching degree for that query word, or feeds back query results ranked by matching degree.
In contrast, in embodiments of the disclosure, in response to the user entering a query word, the query result (i.e., the promotion object described above) is not fed back directly. Instead, the keyword set matching the query word is determined first, and the corresponding promotion object is then fed back based on the determined keyword set. In other words, because keyword sets are generated from estimated intents, embodiments of the disclosure first predict the user intent from the query word and then feed back the corresponding promotion object according to the predicted intent.
In one embodiment, the keyword set determined according to the query word may be the keyword set containing a keyword identical to the query word, or the keyword set containing the keyword with the highest similarity to the query word.
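The two matching options described above, together with the preset mapping from each keyword set to its promotion object, can be illustrated with a minimal Python sketch. The character-overlap similarity, the `min_similarity` cutoff, and all names and sample data are assumptions made for illustration only; the disclosure does not prescribe a particular similarity measure or threshold for this step.

```python
from typing import Callable, Dict, Optional, Set


def char_jaccard(a: str, b: str) -> float:
    """Hypothetical similarity: Jaccard overlap of the two character sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def find_keyword_set(
    query: str,
    keyword_sets: Dict[str, Set[str]],
    similarity: Callable[[str, str], float] = char_jaccard,
    min_similarity: float = 0.5,  # assumed cutoff, not from the disclosure
) -> Optional[str]:
    """Return the name of the keyword set the query word belongs to."""
    # Option 1: a keyword identical to the query word exists in some set.
    for name, keywords in keyword_sets.items():
        if query in keywords:
            return name
    # Option 2: fall back to the set holding the most similar keyword.
    best_name, best_score = None, 0.0
    for name, keywords in keyword_sets.items():
        for keyword in keywords:
            score = similarity(query, keyword)
            if score > best_score:
                best_name, best_score = name, score
    return best_name if best_score >= min_similarity else None


# Illustrative data: one preset promotion object per keyword set.
keyword_sets = {
    "wedding photography": {"wedding photo", "bridal photo shoot"},
    "wedding banquet": {"wedding banquet hall", "wedding catering"},
}
promotion_objects = {
    "wedding photography": "merchant_page_A",
    "wedding banquet": "merchant_page_B",
}


def feed_back(query: str) -> Optional[str]:
    """Map a query word to a keyword set, then to its promotion object."""
    set_name = find_keyword_set(query, keyword_sets)
    return promotion_objects.get(set_name) if set_name else None
```

In this sketch, the query words "wedding photo" and "bridal photo shoot" fall into the same keyword set and therefore receive the same promotion object, while a query word that resembles no keyword receives none.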
Taking advertising as an example, after the keyword sets generated in step 301 are sold to advertisers, the advertising platform, on receiving a query word entered by a user, first determines the keyword set matching the query word, so that the advertisement content of the corresponding advertiser can subsequently be fed back to the user.
In step 303, the corresponding promotion object is fed back to the user based on the determined keyword set.
As mentioned above, a traditional search engine directly feeds back the query result with the highest matching degree for the query word, or query results ranked by matching degree. In other words, the matching degree with each query result is calculated for a single query word.
In contrast, in embodiments of the disclosure, after the keyword set to which the query word belongs is determined in step 302, what is fed back is the promotion object corresponding to that keyword set. In other words, the matching between query words and promotion objects is considered per keyword set: different query words that belong to the same keyword set can be fed back exactly the same promotion object.
In one embodiment, step 302 may determine multiple keyword sets, in which case multiple promotion objects may be fed back. Embodiments of the disclosure may further include a step of ranking the multiple promotion objects according to preset rules, which is not described further here.
According to the keyword-aggregation-based query feedback method of the above embodiment, the query word entered by the user is matched against pre-generated keyword sets, and the corresponding promotion object is fed back according to the matching result. Compared with a single keyword, this promotes objects more accurately based on user intent, and also lets promotion objects make full use of the traffic generated by the combined query words.
As described above, compared with traditional schemes that feed back search results based on a single keyword, embodiments of the disclosure feed back promotion objects to the user based on keyword sets generated from intents. Fig. 4 and Fig. 9 each show an example flow chart for generating keyword sets.
In one embodiment, as shown in Fig. 4, generating the keyword sets may include steps 401-402.
In step 401, the similarity between keywords is obtained.
This embodiment builds the keyword sets based on the similarity between keywords.
In one embodiment, so that the keyword sets reflect the estimated intents, a step of selecting seed words may also be included before the similarity is obtained. Each seed word can be regarded as corresponding to one kind of estimated intent. There need not be many seed words, but they need to correlate strongly with user intent.
Taking advertising as an example, for advertisements of the wedding type, keywords including but not limited to "photography", "wedding banquet", "wedding ring", and "honeymoon" may be chosen as seed words, corresponding to several broad categories of user intent.
In one embodiment, seed words may be chosen manually, or obtained automatically based on keyword frequency or existing historical data. Embodiments of the disclosure place no restriction on this, and it is not described further here.
Once seed words are chosen, keyword sets can be built around them. Each remaining keyword (non-seed word) can be added in a subsequent step to the keyword set of the seed word with which it has the greatest similarity.
In one embodiment, the similarity between keywords may be calculated based on the association between keywords and source documents and on the similarity between source documents; an example is described with reference to Fig. 5. Here, a source document can be understood as the source of a keyword. In different application scenarios, keywords and source documents may be selected according to different strategies. Taking advertising as an example, popular query words and popular documents from the advertiser's industry may be selected to build a keyword library and a source document library, respectively, as the basis for the similarity calculation.
In one embodiment, the similarity between keywords may also be calculated based on the degree of overlap of their source documents; an example is described with reference to Fig. 7. This embodiment is applicable when keywords are not easy to obtain in advance. Keywords are therefore first obtained by natural language processing of the source documents, and a matching set of source documents is then built for each keyword. For any two keywords, the similarity between them can then be calculated from the degree of overlap of their two source document sets.
In step 402, the keyword sets are generated based on each seed word and the non-seed words that have the highest similarity with that seed word.
As mentioned above, once seed words are chosen, keyword sets can be generated based on them. In this step, each non-seed word is added to the keyword set of the seed word with which it has the greatest similarity.
In one embodiment, the similarity between any two keywords is available from step 401. After the seed words are chosen, the similarity between each non-seed word and each seed word can be collected from these results and sorted by size. Each initial keyword set then contains only its seed word, and each non-seed word is added to the keyword set of the seed word with which its similarity is greatest.
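The assignment of non-seed words to seed words can be sketched as follows. This is a minimal illustration that assumes the similarities from step 401 are already available as a dictionary keyed by (non-seed word, seed word); the function name, the data layout, and the similarity values are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def build_keyword_sets(
    seeds: List[str],
    non_seeds: List[str],
    sim: Dict[Tuple[str, str], float],
) -> Dict[str, Set[str]]:
    """Group every non-seed word under its most similar seed word."""
    # Each initial keyword set contains only its seed word.
    sets = {seed: {seed} for seed in seeds}
    for word in non_seeds:
        # Missing pairs are treated as similarity 0.0 (an assumption).
        best_seed = max(seeds, key=lambda s: sim.get((word, s), 0.0))
        sets[best_seed].add(word)
    return sets


# Made-up similarity values for illustration only.
seeds = ["photography", "wedding banquet"]
non_seeds = ["bridal portrait", "banquet hall", "pre-wedding shoot"]
sim = {
    ("bridal portrait", "photography"): 0.8,
    ("bridal portrait", "wedding banquet"): 0.2,
    ("banquet hall", "photography"): 0.1,
    ("banquet hall", "wedding banquet"): 0.9,
    ("pre-wedding shoot", "photography"): 0.7,
    ("pre-wedding shoot", "wedding banquet"): 0.3,
}
keyword_sets_by_seed = build_keyword_sets(seeds, non_seeds, sim)
```

Because only similarities between non-seed words and seed words are looked up, the same sketch works whether step 401 computed all pairwise similarities or only those involving seed words.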
In another embodiment, where the seed words are chosen in advance, step 401 may calculate only the similarity between each non-seed word and each seed word, which is sufficient to obtain the ranking required above.
Fig. 5 shows an example flow chart for calculating keyword similarity. As shown in Fig. 5, calculating the similarity between keywords in this embodiment may include steps 501-503.
In step 501, for any two keywords, the average of the similarities between their associated source documents is calculated.
In one embodiment, the association between keywords and source documents is also obtained before step 501. For example, the association may be established based on user behavior in historical access data (such as click operations after a query word is entered).
Further, in one embodiment, the association between keywords and source documents may be expressed by building a bipartite graph.
A bipartite graph is a graph whose nodes can be divided into two subsets such that the two nodes joined by any edge come from different subsets. Fig. 6 shows an example of a bipartite graph: the nodes K1-K3 on the left represent keywords, the nodes D1-D4 on the right represent source documents of the keywords, and the edges between nodes represent the associations between keywords and source documents.
Based on the bipartite graph, this embodiment may calculate the similarity between keywords using the SimRank iterative algorithm. The basic idea of SimRank is that if two objects are similar, the objects related to them should also be similar. For example, in Fig. 6, if D1 is similar to D4, then K1 and K3 should also be similar, because K1 is related to D1 and K3 is related to D4. SimRank is an iterative algorithm: each iteration extends the similarity relation between nodes by one layer. The main iteration consists of two steps, which calculate the similarity between keywords and the similarity between source documents, respectively.
This step may calculate the similarity between any two keywords based on the following formula (1):
$$Sim(K_{i1}, K_{i2}) = \frac{1}{|O(K_{i1})|\,|O(K_{i2})|} \sum_{D_{j1} \in O(K_{i1})} \sum_{D_{j2} \in O(K_{i2})} Sim(D_{j1}, D_{j2}) \quad \ldots (1)$$
where Sim(K_{i1}, K_{i2}) denotes the similarity between any two keywords, O(K) denotes the set of source documents associated with keyword K, |O(K_{i1})| and |O(K_{i2})| denote the sizes of the document sets associated with keywords K_{i1} and K_{i2}, respectively, and Sim(D_{j1}, D_{j2}) denotes the similarity between any two source documents chosen from O(K_{i1}) and O(K_{i2}), respectively. The right side of formula (1) can thus be understood as the average similarity between any two source documents associated with keywords K_{i1} and K_{i2}, respectively.
In step 502, for any two source documents, the average of the similarities between their associated keywords is calculated.
Based on the SimRank algorithm, this step may calculate the similarity between any two source documents based on the following formula (2):
$$Sim(D_{j1}, D_{j2}) = \frac{1}{|I(D_{j1})|\,|I(D_{j2})|} \sum_{K_{i1} \in I(D_{j1})} \sum_{K_{i2} \in I(D_{j2})} Sim(K_{i1}, K_{i2}) \quad \ldots (2)$$
where Sim(D_{j1}, D_{j2}) denotes the similarity between any two source documents, I(D) denotes the set of keywords associated with source document D, |I(D_{j1})| and |I(D_{j2})| denote the sizes of the keyword sets associated with documents D_{j1} and D_{j2}, respectively, and Sim(K_{i1}, K_{i2}) denotes the similarity between any two keywords chosen from I(D_{j1}) and I(D_{j2}), respectively. The right side of formula (2) can thus be understood as the average similarity between any two keywords associated with source documents D_{j1} and D_{j2}, respectively.
In step 503, the similarity between the keywords is obtained based on the SimRank iterative algorithm.
As described above, step 501 calculates the similarity between keywords from the similarity between source documents, and step 502 conversely calculates the similarity between source documents from the similarity between keywords. Therefore, in step 503, by introducing preset initial conditions and a convergence condition, the similarity between any two keywords can be obtained based on the SimRank iterative algorithm.
The SimRank iterative algorithm itself is not the focus of the disclosure and is not described further here.
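For orientation only, the alternation between steps 501 and 502 can be sketched in Python as below. Several choices here are assumptions rather than part of the disclosure: the initial condition (each node has similarity 1 with itself and 0 with every other node), a fixed number of iterations standing in for the convergence condition, the `decay` factor customary in SimRank (setting it to 1.0 gives the plain averages described in steps 501 and 502), and the sample edges, which are not the edges of Fig. 6.

```python
from itertools import product
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def _average_update(
    nodes: List[str],
    neighbors: Dict[str, Set[str]],
    other_sim: Dict[Pair, float],
    decay: float,
) -> Dict[Pair, float]:
    """Score each node pair by the average similarity of their neighbors."""
    updated = {}
    for a, b in product(nodes, repeat=2):
        if a == b:
            updated[(a, b)] = 1.0  # a node is fully similar to itself
            continue
        total = sum(other_sim[(x, y)] for x in neighbors[a] for y in neighbors[b])
        updated[(a, b)] = decay * total / (len(neighbors[a]) * len(neighbors[b]))
    return updated


def simrank_keywords(
    edges: Iterable[Pair], decay: float = 0.8, iterations: int = 10
) -> Dict[Pair, float]:
    """Return keyword-to-keyword similarities on a keyword/document graph."""
    kw_docs: Dict[str, Set[str]] = {}
    doc_kws: Dict[str, Set[str]] = {}
    for kw, doc in edges:
        kw_docs.setdefault(kw, set()).add(doc)
        doc_kws.setdefault(doc, set()).add(kw)
    keywords, docs = sorted(kw_docs), sorted(doc_kws)
    # Assumed initial condition: identity similarity on both sides.
    sim_k = {(a, b): float(a == b) for a, b in product(keywords, repeat=2)}
    sim_d = {(a, b): float(a == b) for a, b in product(docs, repeat=2)}
    for _ in range(iterations):
        new_k = _average_update(keywords, kw_docs, sim_d, decay)  # step 501
        new_d = _average_update(docs, doc_kws, sim_k, decay)      # step 502
        sim_k, sim_d = new_k, new_d
    return sim_k


# Hypothetical click-derived edges (keyword, source document).
edges = [("K1", "D1"), ("K1", "D2"), ("K2", "D2"), ("K3", "D3")]
keyword_sim = simrank_keywords(edges)
```

In this example K1 and K2 share document D2 and therefore end up with a positive similarity, whereas K3 is connected to neither and keeps a similarity of zero with both.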
Fig. 7 shows another example flow chart for calculating keyword similarity. As shown in Fig. 7, calculating the similarity between keywords in this embodiment may include steps 701-703.
In step 701, the source document set is preprocessed to obtain the corresponding keyword library.
This embodiment is applicable when keywords are not easy to obtain in advance.
Therefore, this step first preprocesses the source documents to obtain keywords. In one embodiment, preprocessing may include but is not limited to a series of steps such as word segmentation, stop word removal, conversion between simplified and traditional characters, and case conversion. Preprocessing ultimately yields the keyword library.
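A simplified version of such a preprocessing chain is sketched below for space-delimited English text. The stop word list, the regular-expression tokenizer, and the function name are assumptions for illustration. Segmenting Chinese text and converting between simplified and traditional characters would require dedicated tools and are omitted here.

```python
import re
from typing import List, Set

# Hypothetical stop word list; a real one would be much longer.
STOP_WORDS: Set[str] = {"the", "a", "an", "of", "for", "and", "in"}


def preprocess(document: str, stop_words: Set[str] = STOP_WORDS) -> List[str]:
    """Turn one source document into a list of candidate keywords."""
    # Case conversion: fold everything to lower case.
    text = document.lower()
    # Word segmentation: naive split into runs of letters and digits.
    tokens = re.findall(r"[a-z0-9]+", text)
    # Stop word removal.
    return [token for token in tokens if token not in stop_words]


candidate_keywords = preprocess("The Best Wedding Photo Packages in Town")
```

Collecting the output of `preprocess` over all source documents would give the raw material for the keyword library.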
In step 702, the two document sets that respectively match any two keywords are obtained.
In one embodiment, based on the keyword library formed in step 701, the association between each keyword and its source documents can be obtained by building an inverted index.
In other words, step 701 obtains the corresponding keywords from each source document, whereas the inverted index is used to look up, for a given keyword, the source documents in which it occurs. Fig. 8 shows an example of an inverted index: the nodes 1-n on the left correspond to the keywords in the keyword library, and the boxes on the right represent the keywords K1-Kn stored in the nodes and their corresponding source document indexes.
In this way, for any two keywords in the keyword library, the two source document sets that each matches can be found through the inverted index.
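As a minimal sketch of this lookup structure, the inverted index can be represented as a dictionary from keyword to the set of identifiers of the source documents in which it occurs. The document identifiers, the sample keywords, and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_inverted_index(doc_keywords: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Invert a document -> keywords mapping into keyword -> documents."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, keywords in doc_keywords.items():
        for keyword in keywords:
            index[keyword].add(doc_id)
    return dict(index)


# Keywords extracted per source document (illustrative output of step 701).
doc_keywords = {
    "D1": ["wedding", "photo"],
    "D2": ["wedding", "banquet"],
    "D3": ["photo"],
}
inverted_index = build_inverted_index(doc_keywords)
# Document sets matching two keywords, as needed by step 702.
docs_wedding = inverted_index["wedding"]
docs_photo = inverted_index["photo"]
```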
In step 703, the similarity between the two keywords is calculated based on the ratio of the intersection to the union of the two document sets.
In one embodiment, based on the document sets found in step 702, this step may calculate the similarity between any two keywords based on the following formula (3):
$$Sim(K1, K2) = \frac{|D_{k1} \cap D_{k2}|}{|D_{k1} \cup D_{k2}|} \quad \ldots (3)$$
where Sim(K1, K2) denotes the similarity between any two keywords K1 and K2, and D_{k1} and D_{k2} denote the document sets matching keywords K1 and K2, respectively. The denominator of the fraction on the right side of formula (3) is the size of the union of D_{k1} and D_{k2}, and the numerator is the size of their intersection.
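The intersection-over-union ratio described for formula (3) is commonly known as the Jaccard coefficient and can be sketched in a few lines. Returning 0.0 when both document sets are empty is an assumption, since the text does not cover that case, and the document identifiers are illustrative.

```python
from typing import Set


def jaccard_similarity(docs_k1: Set[str], docs_k2: Set[str]) -> float:
    """Ratio of intersection size to union size of two document sets."""
    union = docs_k1 | docs_k2
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(docs_k1 & docs_k2) / len(union)


# Two of four distinct documents are shared, so the similarity is 0.5.
similarity_k1_k2 = jaccard_similarity({"D1", "D2", "D3"}, {"D2", "D3", "D4"})
```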
In another embodiment, as shown in Fig. 9, generating the keyword sets may include steps 901-903.
In step 901, the association between the keywords and the source documents, and the label information of the source documents, are obtained.
This embodiment is applicable to source documents that already carry label information. In this case, label information and historical user behavior can be combined to aggregate keywords in a more concise way.
In one embodiment, in step 901 the association between keywords and source documents may be established based on user behavior in historical access data (such as click operations after a query word is entered).
In one embodiment, label information includes any information that can be used to classify source documents. Because label information is usually generated based on the usage experience or browsing habits of users, it can directly reflect user intent to a certain extent. In this embodiment, keyword aggregation can therefore be built directly on label information, without calculating the similarity between keywords.
In step 902, the matching degree between the keywords and the label information is obtained based on the association.
In one embodiment, the association among keywords, source documents, and label information can be obtained based on step 901. Fig. 10 shows an example of this association: the node K on the left represents a keyword, the nodes D1-D3 in the middle represent the source documents associated with the keyword, and the nodes T1-T3 on the right represent the label information of the source documents.
In one embodiment, to determine under which label information a keyword is finally classified, this step may also obtain, based on user behavior, the weight of the association between the keyword and each source document. Taking click operations after a query word is entered as an example, suppose users produce 10 clicks in total after entering keyword K, of which 6 point to document D1, 3 to document D2, and 1 to document D3. The weights of the three associations can then be determined as 0.6, 0.3, and 0.1. As shown in Fig. 10, the weight may be indicated by the thickness of the line between K and D1-D3.
Next, in one embodiment, the matching probability between the keyword and each piece of label information can be calculated based on these weights and the association between each source document and the label information. Continuing the example of Fig. 10, the weights of the associations between keyword K and documents D1-D3 are 0.6, 0.3, and 0.1, respectively. Because label T1 is assigned to documents D1 and D3, T2 to documents D1 and D2, and T3 to document D3, the association values between the keyword and labels T1-T3 are obtained by summation as 0.7 (0.6+0.1), 0.9 (0.6+0.3), and 0.1, respectively. After normalization, the probabilities are approximately 0.41 (7/17), 0.53 (9/17), and 0.06 (1/17).
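The worked example above can be reproduced with a short Python sketch. The function name and data layout are assumptions; the click counts and label assignments are the ones given in the example of Fig. 10.

```python
from collections import defaultdict
from typing import Dict, List


def label_probabilities(
    click_counts: Dict[str, int], doc_labels: Dict[str, List[str]]
) -> Dict[str, float]:
    """Matching probability between one keyword and each label."""
    total_clicks = sum(click_counts.values())
    scores: Dict[str, float] = defaultdict(float)
    for doc, clicks in click_counts.items():
        weight = clicks / total_clicks  # weight of the keyword-document link
        for label in doc_labels.get(doc, []):
            scores[label] += weight  # sum the weights per label
    norm = sum(scores.values())
    if norm == 0:
        return {}
    # Normalize so that the probabilities over all labels sum to 1.
    return {label: score / norm for label, score in scores.items()}


# Keyword K: 6 clicks on D1, 3 on D2, 1 on D3.
click_counts = {"D1": 6, "D2": 3, "D3": 1}
# T1 is assigned to D1 and D3, T2 to D1 and D2, T3 to D3.
doc_labels = {"D1": ["T1", "T2"], "D2": ["T2"], "D3": ["T1", "T3"]}
probabilities = label_probabilities(click_counts, doc_labels)
best_label = max(probabilities, key=probabilities.get)
```

Rounded to two decimals, the result matches the figures in the text: 0.41 for T1, 0.53 for T2, and 0.06 for T3, with T2 as the best-matching label.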
In step 903, according to the obtained matching degree, the keyword set corresponding to the label information is generated based on the keywords.
Based on the matching degree calculated in step 902, this step determines, according to the size of the matching degree, which label's keyword set the keyword is finally assigned to.
Continuing the example of Fig. 10, because the matching probability between keyword K and label T2 is the highest at 0.53, keyword K can be added to the keyword set corresponding to label T2.
In one embodiment, step 903 may also consider the purity of the current keyword when generating the keyword sets. The higher the purity, the clearer the intent the keyword expresses, and the more suitable the keyword is for assignment to the corresponding label. Otherwise, the intent it expresses is not clear enough, and the keyword may be discarded without being assigned.
For example, the entropy of keyword K over the multiple pieces of label information may be calculated based on the following formula (4) and used as a measure of the purity of keyword K.
$$Entropy(K) = -\sum_{i} p_i \log(p_i) \quad \ldots (4)$$
where Entropy(K) denotes the entropy of keyword K, and p_i denotes the matching probability between keyword K and label i.
Continuing the example of Fig. 10, the matching probabilities between keyword K and labels T1-T3 are 0.41, 0.53, and 0.06, respectively. Based on formula (4), and taking the logarithm to base 10, the entropy of keyword K is Entropy(K) = -0.41*log(0.41) - 0.53*log(0.53) - 0.06*log(0.06) ≈ 0.38.
Taking the entropy calculated by formula (4) as an example, the lower the entropy, the higher the purity of the keyword and the clearer the user intent it reflects. If the entropy is below a preset threshold, the keyword can be assigned directly to the label information with the highest matching probability (label T2 in the above example), that is, added directly to the keyword set corresponding to that label information. Conversely, the higher the entropy, the lower the purity of the keyword and the less clear the user intent it reflects, so the keyword may be treated as an invalid word and left unprocessed.
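The entropy check can be sketched as follows. The sketch uses a base-10 logarithm, which reproduces the value of about 0.38 in the worked example. It also assumes that a lower entropy indicates a clearer intent, because entropy is smallest when the probability mass sits on a single label and largest when it is spread evenly. The threshold value of 0.4 and the function names are hypothetical.

```python
import math
from typing import Dict, Iterable, Optional


def keyword_entropy(probabilities: Iterable[float]) -> float:
    """Entropy of a keyword's label distribution, base-10 logarithm."""
    return -sum(p * math.log10(p) for p in probabilities if p > 0)


def assign_label(probs: Dict[str, float], max_entropy: float) -> Optional[str]:
    """Return the best label, or None if the keyword is too ambiguous."""
    if keyword_entropy(probs.values()) > max_entropy:
        return None  # intent too unclear: leave the keyword unassigned
    return max(probs, key=probs.get)


# Keyword K from the example: probabilities for labels T1-T3.
probs_k = {"T1": 0.41, "T2": 0.53, "T3": 0.06}
entropy_k = keyword_entropy(probs_k.values())
label_k = assign_label(probs_k, max_entropy=0.4)  # assumed threshold

# A keyword spread evenly over three labels has the maximum entropy.
uniform = {"T1": 1 / 3, "T2": 1 / 3, "T3": 1 / 3}
label_uniform = assign_label(uniform, max_entropy=0.4)
```

With this assumed threshold, keyword K is assigned to label T2, while the evenly spread keyword, whose entropy is log10(3) or about 0.48, is left unassigned.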
Embodiments of the disclosure further provide a query feedback apparatus based on keyword aggregation.
Fig. 11 is a schematic structural diagram of a query feedback apparatus based on keyword aggregation according to an embodiment of the disclosure. As shown in Fig. 11, the apparatus of this embodiment includes an aggregation module 1110, a determination module 1120, and a feedback module 1130.
The aggregation module 1110 is configured to obtain a keyword set generated based on an estimated intent, and a promotion object corresponding to the keyword set.
The determination module 1120 is configured to determine, according to a query word received from a user, the keyword set to which the query word belongs.
The feedback module 1130 is configured to feed back the corresponding promotion object to the user based on the determined keyword set.
In one embodiment, the aggregation module 1110 is configured to obtain the similarity between keywords and to generate the keyword sets based on each seed word and the non-seed words having the highest similarity with that seed word. The seed words here may include the keywords that best reflect user intent. To obtain the similarity, the aggregation module 1110 may obtain the association between the keywords and source documents and derive the similarity between the keywords from the similarity between the source documents. Alternatively, the aggregation module 1110 may obtain the keywords from the source documents, generate the document sets matching the keywords, and derive the similarity between the keywords from those document sets.
In another embodiment, the aggregation module 1110 may also be configured to obtain the association between keywords and source documents and the label information of the source documents, obtain the matching degree between the keywords and the label information based on the association, and generate the keyword set corresponding to the label information based on the keywords according to the result of comparing the matching degree with a preset threshold.
According to the keyword-aggregation-based query feedback apparatus of the above embodiment, the query word entered by the user is matched against pre-generated keyword sets, and the corresponding promotion object is fed back according to the matching result. Compared with a single keyword, this promotes objects more accurately based on user intent, and also lets promotion objects make full use of the traffic generated by the combined query words.
Figure 12 is a schematic structural diagram of a keyword-aggregation-based query feedback device according to another embodiment of the disclosure. As shown in Figure 12, on the basis of the structure shown in Figure 11, the aggregation module 1110 of the query feedback device of this embodiment includes an association acquiring unit 1111, an iterative calculation unit 1112 and a set generation unit 1113.
The association acquiring unit 1111 is configured to obtain the association relationships between the keywords and the source documents. In one embodiment, the association acquiring unit 1111 may establish these association relationships based on user behavior in historical access data (for example, click operations performed after a query word is input).
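As a rough sketch of how such association relationships might be derived from click behavior, the snippet below counts how often each document was clicked after each query word. The log format (a list of query/document pairs) and the function name `build_associations` are assumptions for illustration; the disclosure does not prescribe a data layout.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def build_associations(
    click_log: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, int]]:
    """Map each keyword to the documents clicked after it, with click counts."""
    associations: Dict[str, Counter] = defaultdict(Counter)
    for query_word, clicked_doc in click_log:
        associations[query_word][clicked_doc] += 1
    # Convert to plain dicts so the result is easy to inspect and serialize.
    return {kw: dict(counts) for kw, counts in associations.items()}


# Hypothetical historical access data: (query word, clicked document id).
click_log = [("roses", "d1"), ("roses", "d1"), ("roses", "d2"), ("tulips", "d2")]
associations = build_associations(click_log)
```

The click counts can later serve as association weights, while the dictionary keys alone give the unweighted keyword-to-document graph.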
The iterative calculation unit 1112 is configured to: for any two keywords, calculate the average of the similarities between their associated source documents; for any two source documents, calculate the average of the similarities between their associated keywords; and obtain the similarity between the keywords based on the SimRank iterative algorithm.
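The two averaging steps above correspond to SimRank on a bipartite keyword-document graph. The following is a minimal sketch under stated assumptions: the decay factor 0.8 and the fixed iteration count are conventional SimRank choices that the text does not specify, and the function name `simrank_bipartite` is hypothetical.

```python
from typing import Dict, Set, Tuple

Sim = Dict[Tuple[str, str], float]


def simrank_bipartite(kw_docs: Dict[str, Set[str]],
                      decay: float = 0.8,
                      iterations: int = 5) -> Sim:
    """Return keyword-keyword similarities from a keyword -> documents map."""
    # Build the reverse map: document -> associated keywords.
    doc_kws: Dict[str, Set[str]] = {}
    for kw, docs in kw_docs.items():
        for doc in docs:
            doc_kws.setdefault(doc, set()).add(kw)

    kws, docs = sorted(kw_docs), sorted(doc_kws)
    # Initial state: every node is fully similar to itself only.
    kw_sim = {(a, b): float(a == b) for a in kws for b in kws}
    doc_sim = {(a, b): float(a == b) for a in docs for b in docs}

    def step(nodes, neighbours, other_sim) -> Sim:
        new: Sim = {}
        for a in nodes:
            for b in nodes:
                na, nb = neighbours[a], neighbours[b]
                if a == b:
                    new[(a, b)] = 1.0
                elif not na or not nb:
                    new[(a, b)] = 0.0
                else:
                    # Average similarity over all pairs of neighbours,
                    # scaled by the decay factor.
                    total = sum(other_sim[(x, y)] for x in na for y in nb)
                    new[(a, b)] = decay * total / (len(na) * len(nb))
        return new

    for _ in range(iterations):
        # Both sides are updated from the previous iteration's values.
        kw_sim, doc_sim = (step(kws, kw_docs, doc_sim),
                           step(docs, doc_kws, kw_sim))
    return kw_sim


similarities = simrank_bipartite({"roses": {"d1"}, "tulips": {"d1"}, "phone": {"d2"}})
```

In this toy graph, "roses" and "tulips" share a document and therefore become similar, while "phone" stays unrelated to both.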
The set generation unit 1113 is configured to generate, according to the similarity calculation results of the iterative calculation unit 1112, the corresponding keyword set based on each seed word and the non-seed words having the highest similarity with that seed word.
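One possible reading of this set generation step is that each non-seed word joins the set of the seed word it is most similar to. The sketch below follows that reading. The names `pair_similarity` and `generate_keyword_sets`, the pair-keyed similarity dictionary, and the choice to leave words with zero similarity to every seed unassigned are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def pair_similarity(sim: Dict[Tuple[str, str], float], a: str, b: str) -> float:
    """Look up a symmetric similarity stored under either key order."""
    return sim.get((a, b), sim.get((b, a), 0.0))


def generate_keyword_sets(keywords: List[str],
                          seeds: List[str],
                          sim: Dict[Tuple[str, str], float]) -> Dict[str, Set[str]]:
    """Group each non-seed word with its most similar seed word."""
    sets: Dict[str, Set[str]] = {seed: {seed} for seed in seeds}
    for kw in keywords:
        if kw in sets:
            continue  # seed words already anchor their own set
        best = max(seeds, key=lambda seed: pair_similarity(sim, kw, seed))
        # Assumption: words unrelated to every seed stay unassigned.
        if pair_similarity(sim, kw, best) > 0.0:
            sets[best].add(kw)
    return sets


# Hypothetical similarity values for illustration only.
sim = {("roses", "flowers"): 0.7, ("roses", "phone"): 0.1,
       ("smartphone", "phone"): 0.9}
keyword_sets = generate_keyword_sets(
    ["flowers", "phone", "roses", "smartphone"], ["flowers", "phone"], sim)
```

When two seed words tie, `max` keeps the first one in the `seeds` list, so the seed order acts as a tie-breaker in this sketch.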
Figure 13 is a schematic structural diagram of a keyword-aggregation-based query feedback device according to yet another embodiment of the disclosure. As shown in Figure 13, on the basis of the structure shown in Figure 11, the aggregation module 1110 of the query feedback device of this embodiment includes a document processing unit 1114, a matching calculation unit 1115 and a set generation unit 1113.
The document processing unit 1114 is configured to obtain the keywords from the source documents and to generate the document sets matching the keywords.
The matching calculation unit 1115 is configured to obtain, according to the processing results of the document processing unit 1114, the two document sets respectively matching any two keywords, and to calculate the similarity between those two keywords as the ratio of the intersection of the two document sets to their union.
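The ratio of intersection to union is the Jaccard coefficient. A minimal sketch follows; the rule that a document matches a keyword when the keyword occurs as a whitespace-separated word in its text is an assumption for illustration, as are the names `match_documents` and `jaccard`.

```python
from typing import Dict, Iterable, Set


def match_documents(keywords: Iterable[str],
                    documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each keyword to the ids of the documents whose text contains it."""
    return {
        kw: {doc_id for doc_id, text in documents.items()
             if kw.lower() in text.lower().split()}
        for kw in keywords
    }


def jaccard(docs_a: Set[str], docs_b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = docs_a | docs_b
    if not union:
        return 0.0  # two empty sets: define the similarity as zero
    return len(docs_a & docs_b) / len(union)


# Hypothetical source documents.
documents = {
    "d1": "cheap red roses",
    "d2": "red roses delivery",
    "d3": "roses and tulips",
}
doc_sets = match_documents(["red", "roses", "tulips"], documents)
similarity = jaccard(doc_sets["red"], doc_sets["roses"])
```

Here "red" matches two documents and "roses" matches three, with two in common, so the similarity is 2/3.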
The set generation unit 1113 is configured to generate, according to the similarity calculation results of the matching calculation unit 1115, the corresponding keyword set based on each seed word and the non-seed words having the highest similarity with that seed word.
Figure 14 is a schematic structural diagram of a keyword-aggregation-based query feedback device according to still another embodiment of the disclosure. As shown in Figure 14, on the basis of the structure shown in Figure 11, the aggregation module 1110 of the query feedback device of this embodiment includes a label association unit 1116, a matching degree unit 1117 and a set generation unit 1113.
The label association unit 1116 is configured to obtain the association relationships between the keywords and the source documents, as well as the label information of the source documents.
The matching degree unit 1117 is configured to obtain the matching degree between the keywords and the label information based on the association relationships. In one embodiment, the matching degree unit 1117 is configured to determine the weight between a keyword and each source document based on the association relationships, and to calculate the entropy of the keyword with respect to the label information according to those weights and the label information corresponding to the source documents.
The set generation unit 1113 is configured to generate, according to the matching degree calculated by the matching degree unit 1117 and the result of comparing that matching degree with a predetermined threshold, the keyword set corresponding to the label information from the keywords.
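The label-based grouping can be illustrated with a short sketch. The text does not spell out how the entropy is computed or compared, so the following is only one plausible reading: the association weights of a keyword are summed per label and normalized into a distribution, the Shannon entropy of that distribution serves as the matching degree, and a keyword joins the set of its dominant label when the entropy does not exceed the threshold. All function names, the single-label-per-document layout and the threshold direction are assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, Set


def label_distribution(doc_weights: Dict[str, float],
                       doc_labels: Dict[str, str]) -> Dict[str, float]:
    """Normalize a keyword's document weights into a distribution over labels."""
    totals: Dict[str, float] = defaultdict(float)
    for doc, weight in doc_weights.items():
        label = doc_labels.get(doc)
        if label is not None and weight > 0:
            totals[label] += weight
    total = sum(totals.values())
    return {label: w / total for label, w in totals.items()} if total > 0 else {}


def entropy(distribution: Dict[str, float]) -> float:
    """Shannon entropy in bits; low values mean one label dominates."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)


def keyword_sets_by_label(kw_doc_weights: Dict[str, Dict[str, float]],
                          doc_labels: Dict[str, str],
                          max_entropy: float) -> Dict[str, Set[str]]:
    """Assign each sufficiently focused keyword to its dominant label."""
    sets: Dict[str, Set[str]] = defaultdict(set)
    for kw, weights in kw_doc_weights.items():
        dist = label_distribution(weights, doc_labels)
        # Assumption: the threshold is an upper bound on the entropy.
        if dist and entropy(dist) <= max_entropy:
            sets[max(dist, key=dist.get)].add(kw)
    return dict(sets)


# Hypothetical weights (e.g. click counts) and document labels.
weights = {"roses": {"d1": 3, "d2": 1}, "cheap": {"d1": 1, "d3": 1}}
labels = {"d1": "flowers", "d2": "flowers", "d3": "phones"}
label_sets = keyword_sets_by_label(weights, labels, max_entropy=0.5)
```

Under this reading, a keyword such as "cheap" that is spread evenly over two labels has an entropy of 1 bit and is excluded, while "roses" is concentrated on a single label and is kept.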
With regard to the devices in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the related method and is not elaborated here.
It should be noted that although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied by multiple modules or units. Components displayed as modules or units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present disclosure. Those of ordinary skill in the art can understand and implement this without creative effort.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described above may be implemented by software, or by software combined with the necessary hardware.
For example, in an example embodiment, a computer-readable storage medium is also provided, on which a computer program is stored; when executed by a processor, the program implements the steps of the method described in any one of the above embodiments. For the specific steps of the method, reference may be made to the detailed description in the foregoing embodiments, which is not repeated here. The computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In another example embodiment, a computing device is also provided. The device may be a mobile terminal such as a mobile phone or tablet computer, or a terminal device such as a desktop computer or server; this example embodiment places no restriction on this. Figure 15 shows a schematic diagram of a computing device 1500 according to an example embodiment of the disclosure. For example, the device 1500 may be provided as a mobile terminal. Referring to Figure 15, the device 1500 includes a processing component 1510, which further includes one or more processors, and memory resources represented by a memory 1520 for storing instructions executable by the processing component 1510, such as application programs. The application programs stored in the memory 1520 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 1510 is configured to execute the instructions so as to perform the keyword-aggregation-based query feedback method described above. For the steps of the method, reference may be made to the detailed description in the foregoing method embodiments, which is not repeated here.
The device 1500 may also include a power supply component 1530 configured to perform power management of the device 1500, a wired or wireless network interface 1540 configured to connect the device 1500 to a network, and an input/output (I/O) interface 1550. The device 1500 may operate based on an operating system stored in the memory 1520, such as Android, iOS or the like.
After considering the specification and practicing the invention disclosed herein, those skilled in the art will readily conceive of other embodiments of the disclosure. This application is intended to cover any variations, uses or adaptations of the disclosure that follow its general principles and include common knowledge or conventional technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the disclosure being indicated by the appended claims.
Although the disclosure has been described with reference to several exemplary embodiments, it should be understood that the terms used are illustrative and exemplary rather than restrictive. Since the disclosure can be embodied in various forms without departing from the spirit or essence of the application, it should be understood that the above embodiments are not limited to any of the foregoing details, but should be interpreted broadly within the spirit and scope defined by the appended claims. All changes and modifications falling within the claims or their equivalents should therefore be covered by the appended claims.