Movatterモバイル変換


[0]ホーム

URL:


CN102103615B - Three-segment sequential collecting method and system for retrieval results - Google Patents

Three-segment sequential collecting method and system for retrieval results
Download PDF

Info

Publication number
CN102103615B
CN102103615BCN200910243807.7ACN200910243807ACN102103615BCN 102103615 BCN102103615 BCN 102103615BCN 200910243807 ACN200910243807 ACN 200910243807ACN 102103615 BCN102103615 BCN 102103615B
Authority
CN
China
Prior art keywords
result
retrieval
head section
capacity
interlude
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN200910243807.7A
Other languages
Chinese (zh)
Other versions
CN102103615A (en
Inventor
童征宇
徐剑波
赵东岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New Founder Holdings Development Co ltd
Peking University
Founder Apabi Technology Ltd
Original Assignee
Peking University
Peking University Founder Group Co Ltd
Beijing Founder Apabi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University, Peking University Founder Group Co Ltd, Beijing Founder Apabi Technology Co LtdfiledCriticalPeking University
Priority to CN200910243807.7ApriorityCriticalpatent/CN102103615B/en
Publication of CN102103615ApublicationCriticalpatent/CN102103615A/en
Application grantedgrantedCritical
Publication of CN102103615BpublicationCriticalpatent/CN102103615B/en
Expired - Fee Relatedlegal-statusCriticalCurrent
Anticipated expirationlegal-statusCritical

Links

Images

Landscapes

Abstract

Translated fromChinese

本发明公开了一种检索结果三段式有序收集方法及系统,属于检索技术领域。现有收集方式当检索结果较多时需要占用大量的内存空间,而且排序性能低。本发明所述方法及系统首先根据检索请求进行检索并获得检索结果;然后根据预先确定的头段、尾段和中间段的容量将检索结果中排序在最前的部分收集在头段中,排序在最后的部分收集在尾段中,排序在中间的部分收集在中间段中;最后根据检索结果返回请求从头段、尾段和/或中间段中返回检索结果。采用本发明能够有效地减少检索结果的存储空间,增强检索结果的排序性能,综合提高检索系统的整体性能。

Figure 200910243807

The invention discloses a three-stage orderly collection method and system for retrieval results, belonging to the technical field of retrieval. The existing collection method needs to occupy a large amount of memory space when there are many retrieval results, and the sorting performance is low. The method and system of the present invention first perform retrieval according to the retrieval request and obtain the retrieval results; then according to the capacity of the predetermined head section, tail section and middle section, the parts sorted at the top of the retrieval results are collected in the head section, sorted in the first section. The last part is collected in the tail section, and the middle part is collected in the middle section; finally, the retrieval result is returned from the head section, the tail section and/or the middle section according to the retrieval result return request. The invention can effectively reduce the storage space of the retrieval results, enhance the sorting performance of the retrieval results, and comprehensively improve the overall performance of the retrieval system.

Figure 200910243807

Description

A kind of result for retrieval syllogic ordered collection method and system
Technical field
The invention belongs to retrieval technique field, be specifically related to a kind of result for retrieval syllogic ordered collection method and system, be particularly suitable for the ordered collection to full-text search result.
Background technology
Text retrieval system, after retrieving according to retrieval request, also needs result for retrieval to sort, and then returns to the Ordinal in user's specified scope.Sort by is the degree of correlation of result for retrieval and retrieval request normally, can be also other foundations, such as the significance level of result for retrieval, the order of certain field value of result for retrieval etc.The object that result for retrieval is sorted, the result that makes to meet user's needs appears at the foremost of result sequence.
Text retrieval system foreground is generally to result for retrieval paging displaying, and once an exposition result, constantly shows subsequent result by page turning.When page turning, or by retrieval again or obtain the content of this displaying by reading the buffer memory generating while retrieving last time.Therefore the result for retrieval that text retrieval system returns is generally the data volume of one page or multipage, and along with the carrying out of page turning, the result for retrieval scope of obtaining constantly moves.
The subsequent treatment band of the result for retrieval of text retrieval system performance issue: the increase of the data volume of processing along with text retrieval system, in the situation that hitting large result, the storage of result for retrieval and sequence can take a large amount of storage spaces and processing time, cause retrieval performance not good, height is processed result for retrieval concomitantly fast.
At present, the solution that the search engine on network adopts is: only support to return front some maximally related results, surpass and limit not returning of quantity, thereby avoided this problem.As Baidu, Google all only provide less than the browsing of 800 result for retrieval, page turning only provides prompting after surpassing this quantity, no longer continues to show later retrieval result.But in the application of enterprise-level text retrieval system, requirement can be returned to whole results that retrieval is hit, and when the continuous page turning of user, can browse to whole result for retrieval.
User, when browsing result for retrieval, generally only browses former page, but also constantly page turning sometimes, or checks the last few pages.Therefore need to be according to user's demand, returning part result for retrieval.These result for retrieval may sort the most forward, may be also after leaning on most, or the result in the middle of being positioned at.
In prior art, in order to reduce the pressure of result for retrieval storage and sequence, for user, only browse the situation of former pages, use Top-K method.The best result for retrieval of K item before described Top-K method refers to and only collects, abandons all the other results.The quantity K that the default needs of text retrieval system are collected returns to the part that user requires from this in K item result.If Ahmed Metwally etc. is described in " EfficientComputation of Frequent and Top-k Elements in Data Streams " (In Proceedingsof the 10th ICDT International Conference on Database Theory.398-412.): some results sorting before collecting, when new result is added in set, replace minimum item, and finally from set, return to Top-K item value.Needed memory headroom while having saved in this way processing large result, thus raise the efficiency.But the countermeasure when result for retrieval that does not propose user's request exceeds front K item.Therefore, the method is only applicable to only need the situation of the best result for retrieval of returning part.
When the result for retrieval of user's request exceeds front K item, in prior art, generally there are following two kinds of collection modes:
1. collect whole results and carry out partial orderedly, return to the desired part of user.
The result that unordered collection is all hit, the range of results of returning according to user's requirement is carried out partial ordered, obtains between fruiting area, and the result in this interval is carried out to complete sequence, thereby obtain required result and return to user.The defect that this mode exists is: need a large amount of storage spaces, and the sequence performance of large result is lower.
2. dynamic expansion K value, to the K ' that comprises user and require the result return, is collected the best result for retrieval of K ' item.
In the famous text retrieval system Lucene that increases income (http://lucene.apache.org/), used this mode.As Top-K method, the best result for retrieval of K ' item before only collecting, abandons all the other results, finally obtains required result and returns to user.This mode has good performance when K ' approaches K, but along with the expansion of K ' value, its performance also constantly declines.The result for retrieval scope requiring user, when whole result for retrieval last, just become and collects whole result for retrieval, and storage space is now the same with the 1st kind of mode, and owing to all sorting and causing performance lower.
Above two kinds of existing modes all need to preserve a large amount of result for retrieval, thereby have taken a large amount of memory headrooms, and the performance that sorts when result for retrieval is more is low, make the hydraulic performance decline of text retrieval system.
Summary of the invention
For the defect existing in prior art, fundamental purpose of the present invention is to provide that a kind of can to return to sequence according to user's needs the most forward, or after leaning on most, or the result for retrieval syllogic ordered collection method and system of some middle result for retrieval.On this basis, the present invention can also improve the sequence performance of result for retrieval, reduces the storage space of result for retrieval, reduces taking of memory headroom, improves on the whole the performance of searching system.
To achieve these goals, the technical solution used in the present invention is as follows:
A result for retrieval syllogic ordered collection method, first the method retrieves and obtains result for retrieval according to retrieval request; Then according to the capacity of predetermined head section, rear and interlude, sequence in result for retrieval is collected in head section at head, sequence is collected in rear in last part, and sequence is collected in interlude in middle part; Finally according to result for retrieval, return to request and from the beginning in section, rear and/or interlude, return to result for retrieval.
Result for retrieval syllogic ordered collection method as above, wherein, a head section capacity comprises the first capacity K and the second capacity K ', the second capacity K ' is between 1.5 times to 10 times of the first capacity K; Rear capacity M is between 0.5 times to 1 times of the first capacity K; The result for retrieval scope that interlude capacity N returns according to request and head section capacity are determined.Head section the first capacity K is generally the result for retrieval quantity that 5 to 10 result for retrieval displayed pages comprise, and is preferably the result for retrieval quantity that 8 result for retrieval displayed pages comprise; The second capacity K ' is preferably the result for retrieval quantity that 50 result for retrieval displayed pages comprise, and rear capacity M is preferably the result for retrieval quantity that 4 result for retrieval displayed pages comprise.
Result for retrieval syllogic ordered collection method as above, wherein, the result for retrieval descending sort in head section, the result for retrieval ascending order in rear is arranged, and the result for retrieval in interlude does not sort.
Result for retrieval syllogic ordered collection method as above, wherein, sequence in result for retrieval is collected in head section at head, and sequence is collected in rear in last part, and the process that sequence is collected in interlude in middle part comprises the following steps:
(1) determine head section the first capacity K and the second capacity K ', rear capacity M;
(2) according to result for retrieval, return to the result for retrieval scope that in request, requirement is returned and determine collection mode;
Described collection mode comprises only to be collected head section result for retrieval and collects head section, rear and interlude result for retrieval simultaneously; If collect head section, rear and interlude result for retrieval simultaneously, determine the distance D between interlude capacity N, interlude and head section;
(3) according to collection mode definite in step (2), result for retrieval is collected to section to the end, or head section, rear and interlude.
Result for retrieval syllogic ordered collection method as above, the result for retrieval scope of returning as requested described in step (2) determines that the process of collection mode is as follows:
The result for retrieval scope that request is returned is by reference position SI and return to amount R C and represent;
If (SI+RC) ∈ [0, K], head section capacity adopts the first capacity K, only collects head section result for retrieval;
If (SI+RC) ∈ (K, K '], a head section capacity adopts the second capacity K ', only collects a head section result for retrieval;
If (SI+RC) > K ' and SI < K ', head section capacity adopts the second capacity K ', N=SI+RC-K ', and D=0, collects head section, rear and interlude result for retrieval;
If (SI+RC) > K ' and SI >=K ', head section capacity adopts the second capacity K ', N=RC, and D=SI-K ', collects head section, rear and interlude result for retrieval.
Result for retrieval syllogic ordered collection method as above, described in step (3), result for retrieval is collected to section to the end, or the process of head section, rear and interlude comprises the following steps:
1. from result for retrieval set, take out a result for retrieval X;
2. whether judgement head section is full; As less than, result for retrieval X is added in head section, and in correct section, result for retrieval sorts, and goes to step 7.;
If 3. head section is full, result for retrieval X is compared with the minterm in head section; If be less than minterm, using result for retrieval X as the result excluding; Otherwise result for retrieval X is added in head section and sequence, using minterm as the result excluding;
4. judge that whether rear is full; As less than, the result excluding is added in rear, and result for retrieval in rear is sorted, go to step 7.;
If 5. rear is full, the maximal term in the result excluding and rear is compared; If be less than maximal term, the result excluding added in rear, and the result for retrieval in rear is sorted, using maximal term as the result excluding; Otherwise the result excluding is constant;
6. judge the quantity of result for retrieval in the value of D and interlude; If D > 0, successively decreases one by D, abandon the result excluding; If D=0, and in interlude, the quantity of result for retrieval is less than N, the result excluding is added in interlude, according to the order adding, preserves result for retrieval; If the quantity of result for retrieval equals N in interlude, abandon the result excluding;
7. judge whether result for retrieval set is empty; In this way, finish; As no, go to step 1..
Result for retrieval syllogic ordered collection method as above, wherein, according to result for retrieval, return and ask the process of returning to result for retrieval to comprise the following steps:
(a) the relatively quantity of result for retrieval and a capacity for head section in head section;
(b) if the quantity of result for retrieval is less than its capacity in head section, the RC bar result for retrieval a Duan Zhongcong SI being started is as returning results; If RC has surpassed a Duan Zhongcong SI, start to the result for retrieval quantity of ending, a Duan Zhongcong SI is started to the result for retrieval ending up as returning results; If SI is greater than the quantity of result for retrieval in head section, return results as sky;
(c) if the quantity of result for retrieval equals its capacity in head section, from SI, start [MAX ((K '-SI) minimum head section, 0)] in the whole result for retrieval in bar result for retrieval, interlude and rear, maximum [the whole result for retrieval quantity in RC-MAX ((K '-SI), 0)-interlude] bar record conduct returns results.
A kind of result for retrieval syllogic ordered collection system, comprise for retrieve and obtain the indexing unit of result for retrieval according to retrieval request, for result for retrieval sequence being collected in to the head section collection device in head section at head according to predetermined head section capacity; For result for retrieval sequence being collected in to the rear collection device in rear in last part according to predetermined rear capacity; For result for retrieval sequence being collected in to the interlude collection device in interlude in middle part according to predetermined interlude capacity; And the result for retrieval return mechanism that returns to result for retrieval for return to the result for retrieval scope of asking requirement to be returned according to result for retrieval.
The method of the invention and system, by the mode that adopts three sections to collect result for retrieval, not only realized that according to user's needs, to return to sequence the most forward, or after leaning on most, or some middle result for retrieval, and the effectively less storage space of result for retrieval, strengthened the sequence efficiency of result for retrieval, comprehensively improved the overall performance of searching system.
Accompanying drawing explanation
Fig. 1 is the graph of a relation between three sections in embodiment;
Fig. 2 is the structured flowchart of syllogic ordered collection system in embodiment;
Fig. 3 adopts system shown in Figure 2 to collect the process flow diagram of result for retrieval in embodiment;
Fig. 4 adds result for retrieval respectively the process flow diagram of section, rear and interlude to the end according to the order of sequence in embodiment.
Embodiment
Core concept of the present invention is: first according to retrieval request, retrieve and obtain result for retrieval; Then according to the capacity of predetermined head section (Top-K), rear (Bottom-M) and interlude (Middle-N), sequence in result for retrieval is collected in head section at head, sequence is collected in rear in last part, and sequence is collected in interlude in middle part; Finally according to result for retrieval, return to request and from the beginning in section, rear and/or interlude, return to result for retrieval.Wherein, the result for retrieval descending sort in head section, the result for retrieval ascending order in rear is arranged, and the result for retrieval in interlude does not sort.Relation between head section, interlude and rear as shown in Figure 1.In the time of in the front or the most last ranking results that the range of results that user's request is returned is being collected, return to ranking results accurately; When range of results that request is returned is not exclusively in the orderly range of results in front and back, return to front and rear part Ordinal and middle not ranking results; When range of results that request is returned is completely not in the orderly range of results in front and back, the not ranking results in the middle of returning.Below in conjunction with embodiment and accompanying drawing, describe the present invention.
Fig. 2 has shown the structure of the preferred implementation of result for retrieval syllogic ordered collection system of the present invention.This system comprisesindexing unit 11, headsection collection device 12,rear collection device 13,interlude collection device 14 and result forretrieval return mechanism 15.
Indexing unit 11 is for retrieving and obtain result for retrieval according to retrieval request.Headsection collection device 12 for according to predetermined head section capacity by result for retrieval sequence in some the most front collection in head section, the result for retrieval descending sort in head section.Rear collection device 13 for according to predetermined rear capacity by result for retrieval sequence in some last collection at rear, the result for retrieval ascending order in rear is arranged.Interlude collection device 14 for according to predetermined interlude capacity by result for retrieval sequence in some collection of centre in interlude, the result for retrieval of interlude does not sort.Result forretrieval return mechanism 15 requires the result for retrieval scope of returning to return to result for retrieval for return to request according to result for retrieval.
Fig. 3 adopts system shown in Figure 1 to collect the flow process of result for retrieval, comprises the following steps:
(1) determine head section the first capacity K and the second capacity K ', rear capacity M, and result for retrieval returns to the result for retrieval scope that in request, requirement is returned.Wherein, the value of K, K ' and M is according to user's service condition analysis is determined afterwards, K ' be generally K 1.5 times to 10 between, M is generally between 0.5 to 1 times of K.K is generally the result for retrieval quantity that 5-10 result for retrieval displayed page comprises, and the best is the result for retrieval quantity that 8 result for retrieval displayed pages comprise.K ' adjusts according to the value of K, and optimum value is the result for retrieval quantity that 50 result for retrieval displayed pages comprise.M generally remains on the result for retrieval quantity that 4 left and right result for retrieval displayed pages comprise.
(2) the result for retrieval scope of returning is as requested determined collection mode, and described collection mode comprises only to be collected head section result for retrieval and collect head section, rear and interlude result for retrieval simultaneously.If need to collect head section, rear and interlude result for retrieval simultaneously, determine the distance D (with reference to Fig. 1) of interlude capacity N (being the fruiting quantities that interlude need to be collected), interlude and head section.
The result for retrieval scope that requirement is returned is by reference position StartIndex (being abbreviated as SI) and return to amount R eturnCount (being abbreviated as RC) expression.When result for retrieval is multipage, along with user's page turning, SI constantly changes; RC value is exactly the result for retrieval quantity that one page is shown, is generally 10.
When result for retrieval scope that needs return is in [0, K] interval, during (SI+RC) ∈ [0, K], a head section capacity adopts K.When result for retrieval scope that needs return (K, K '] in interval, (SI+RC) ∈ (K, K '] time, a head section capacity adopts K '.Under both of these case, only need to collect head section result for retrieval, without collecting rear and interlude.
When result for retrieval scope that needs return not all in a segment limit (now a head section capacity is K '), i.e. (SI+RC) > K ' time, if existing, both scopes partly overlap, i.e. SI < K ', N=SI+RC-K ', D=0.If it is overlapping that both do not exist, i.e. SI >=K ', N=RC, D=SI-K '.Under both of these case, need to collect head section, rear and interlude simultaneously.
(3) according to collection mode definite in step (2), result for retrieval is collected to section to the end, or head section, rear and interlude, detailed process as shown in Figure 4, comprises the following steps:
1. from result for retrieval set, take out a result for retrieval X;
2. whether judgement head section is full; As less than, result for retrieval X is added in head section, and in correct section, result for retrieval carries out descending sort, goes to step 7.;
If 3. head section is full, result for retrieval X is compared with the minterm in head section; If be less than minterm, using result for retrieval X as the result excluding; Otherwise result for retrieval X is added in head section and descending sort, using minterm as the result excluding; Described minterm refers to the item that comes minimum position according to ordering rule;
4. judge that whether rear is full; As less than, the result excluding is added in rear, and result for retrieval in rear is carried out to ascending order arrangement, go to step 7.;
If 5. rear is full, the maximal term in the result excluding and rear is compared; If be less than maximal term, the result excluding added in rear, and the result for retrieval in rear is carried out to ascending order arrangement, using maximal term as the result excluding; Otherwise the result excluding is constant; Described maximal term refers to the item that comes maximum position according to ordering rule;
6. judge the quantity of result for retrieval in the value of D and interlude; If D > 0, successively decreases one by D, abandon the result excluding; If D=0, and in interlude, the quantity of result for retrieval is less than N, the result excluding is added in interlude, according to the order adding, preserves result for retrieval; If the quantity of result for retrieval equals N in interlude, abandon the result excluding;
7. judge whether result for retrieval set is empty; In this way, finish; As no, go to step 1..
(4) according to result for retrieval, return to request and require the result for retrieval scope of returning to return to result for retrieval, detailed process comprises the following steps:
(a) the relatively quantity of result for retrieval and a capacity for head section in head section.
(b) if the quantity of result for retrieval is less than its capacity in head section, illustrate that now interlude, rear are all empty, the RC bar result for retrieval that a Duan Zhongcong SI is started is as returning results.The result for retrieval quantity that Duan Zhongcong SI starts may not enough RC bar, also may ask the result for retrieval scope of returning to exceed the result for retrieval scope of collecting in head section.If the result for retrieval amount R C that request is returned has surpassed a Duan Zhongcong SI, start to the result for retrieval quantity of ending, a Duan Zhongcong SI is started to the result for retrieval ending up as returning results; If SI is greater than the quantity of result for retrieval in head section, return results as sky.
(c) if the quantity of result for retrieval equals its capacity in head section, from SI, by MAX ((K '-SI) minimum in head section, 0) in the whole result for retrieval in bar result for retrieval, interlude and rear, maximum [the whole result for retrieval quantity in RC-MAX ((K '-SI), 0)-interlude] bar record conduct returns results.If interlude is non-conterminous with head section, the result of returning in head section is for empty; If interlude and rear are non-conterminous, rear returns results as sky; If when request only returns results therein in a section, other sections returns results as sky.
Embodiment
The result that the collection of take is below retrieved press index storehouse is example, and said method is illustrated.In the retrieve application in press index storehouse, a result for retrieval displayed page is shown 10 records, 8 pages of results of primary retrieval buffer memory just can be satisfied the demand substantially, therefore K gets the fruiting quantities 80 that 8 result for retrieval displayed pages comprise, K ' gets 500, it is 40 that M gets 0.5 of K, and these parameters are that user is according to the feature configured in advance of searching system.
Inquiry " content: China " in the press index storehouse of approximately 1,000 ten thousand data volumes, and according to the order sequence from back to front of publications time, hit altogether 2,201,853 results.
When getting follow-up 10 results since the 10th, i.e. SI=10, RC=10.SI+RC < K, requires the result for retrieval returning in head section.Now, directly front K item is sorted to maximum collection in head section, all the other results are abandoned.Finally from the beginning in section, obtain needed result.
When getting follow-up 10 results since the 495th, i.e. SI=495, RC=10.SI+RC > K ' and SI < K ', require the result for retrieval returning in head section and interlude, N=SI+RC-K '=5, D=0.
For every result for retrieval, according to method as shown in Figure 4, first add to the end in section, if there is result to be excluded, join in rear again, if there is result to be excluded, according to the situation of interlude, add interlude again, all the other results are abandoned.
Finally from the beginning in section, read from the minimum result of [MAX ((K '-SI), 0)=5] bar, and from interlude, obtain all results (N=5 bar), as returning results, the result that rear returns is for empty.
When getting follow-up 10 results since the 510th, i.e. SI=510, RC=10.SI+RC > K ', requires the result for retrieval returning in interlude, N=RC=10, D=SI-K '=10.
For every result for retrieval, according to method as shown in Figure 4, first add to the end in section, if there is result to be excluded, join in rear again, if there is result to be excluded, according to the situation of interlude, add interlude again, all the other results are abandoned.
Finally from interlude, obtain all results (N=10 bar), as returning results, the returning results as sky of other sections.
When from the 2nd, 201,810 while starting to get follow-up 10 results, i.e. SI=2,201,810, RC=10.SI+RC > K ', requires the result for retrieval returning in interlude and rear, N=RC=10, D=SI-K '=2,201,310.
For every result for retrieval, according to method as shown in Figure 4, first add to the end in section, if there is result to be excluded, join in rear again, if there is result to be excluded, according to the situation of interlude, add interlude again, all the other results are abandoned.
Finally from interlude, obtain all results (less than, only have 3), bar (totally 7) the maximum result that sorts is as returning results from rear, to obtain [whole fruiting quantities in RC-MAX ((K '-SI), 0)-interlude], and head section returns results as sky.
When from the 2nd, 201,820 while starting to get follow-up 10 results, i.e. SI=2,201,820, RC=10.SI+RC > K ', requires the result for retrieval returning in rear, N=RC=10, D=SI-K '=2,201,320.
For every result for retrieval, according to method as shown in Figure 4, first add to the end in section, if there is result to be excluded, join in rear again, if there is result to be excluded, according to the situation of interlude, add interlude again, all the other results are abandoned.
Finally from rear, obtain and return results, other sections return results as sky.
Obviously, those skilled in the art can carry out various changes and modification and not depart from the spirit and scope of the present invention the present invention.Like this, if within of the present invention these are revised and modification belongs to the scope of the claims in the present invention and equivalent technology thereof, the present invention is also intended to comprise these changes and modification interior.

Claims (8)

1. a result for retrieval syllogic ordered collection method, is characterized in that: first the method retrieves and obtain result for retrieval according to retrieval request; Then according to the capacity of predetermined head section, rear and interlude, sequence in result for retrieval is collected in head section at head, sequence is collected in rear in last part, and sequence is collected in interlude in middle part; Finally according to result for retrieval, return to request and from the beginning in section, rear and/or interlude, return to result for retrieval;
A head section capacity comprises the first capacity K and the second capacity K ', and the second capacity K ' is between 1.5 times to 10 times of the first capacity K; Rear capacity M is between 0.5 times to 1 times of head section the first capacity K; The result for retrieval scope that interlude capacity N returns according to request and head section capacity are determined;
Sequence in result for retrieval is collected in head section at head, and sequence is collected in rear in last part, and the process that sequence is collected in interlude in middle part comprises the following steps:
(1) determine head section the first capacity K and the second capacity K ', rear capacity M;
(2) according to result for retrieval, return to the result for retrieval scope that in request, requirement is returned and determine collection mode;
Described collection mode comprises only to be collected head section result for retrieval and collects head section, rear and interlude result for retrieval simultaneously; If collect head section, rear and interlude result for retrieval simultaneously, determine the distance D between interlude capacity N, interlude and head section;
(3) according to collection mode definite in step (2), result for retrieval is collected to section to the end, or head section, rear and interlude.
2. result for retrieval syllogic ordered collection method as claimed in claim 1, is characterized in that: described head section the first capacity K is 5 to 10 result for retrieval quantity that result for retrieval displayed page comprises.
3. result for retrieval syllogic ordered collection method as claimed in claim 2, it is characterized in that: described head section the first capacity K is the result for retrieval quantity that 8 result for retrieval displayed pages comprise, the second capacity K ' is the result for retrieval quantity that 50 result for retrieval displayed pages comprise, and rear capacity M is the result for retrieval quantity that 4 result for retrieval displayed pages comprise.
4. the result for retrieval syllogic ordered collection method as described in one of claims 1 to 3, is characterized in that: the result for retrieval descending sort in described head section, and the result for retrieval ascending order in rear is arranged, and the result for retrieval in interlude does not sort.
5. result for retrieval syllogic ordered collection method as claimed in claim 1, is characterized in that, the result for retrieval scope of returning as requested described in step (2) determines that the process of collection mode is as follows:
The result for retrieval scope that request is returned is by reference position SI and return to amount R C and represent;
If (SI+RC) ∈ [0, K], head section capacity adopts the first capacity K, only collects head section result for retrieval;
If (SI+RC) ∈ (K, K '], a head section capacity adopts the second capacity K ', only collects a head section result for retrieval;
If (SI+RC) > K ' and SI < K ', head section capacity adopts the second capacity K ', N=SI+RC-K ', and D=0, collects head section, rear and interlude result for retrieval;
If (SI+RC) > K ' and SI >=K ', head section capacity adopts the second capacity K ', N=RC, and D=SI-K ', collects head section, rear and interlude result for retrieval.
6. result for retrieval syllogic ordered collection method as claimed in claim 5, is characterized in that, described in step (3), result for retrieval is collected to section to the end, or the process of head section, rear and interlude comprises the following steps:
1. from result for retrieval set, take out a result for retrieval X;
2. whether judgement head section is full; As less than, result for retrieval X is added in head section, and in correct section, result for retrieval sorts, and goes to step 7.;
If 3. head section is full, result for retrieval X is compared with the minterm in head section; If be less than minterm, using result for retrieval X as the result excluding; Otherwise result for retrieval X is added in head section and sequence, using minterm as the result excluding;
4. judge that whether rear is full; As less than, the result excluding is added in rear, and result for retrieval in rear is sorted, go to step 7.;
If 5. rear is full, the maximal term in the result excluding and rear is compared; If be less than maximal term, the result excluding added in rear, and the result for retrieval in rear is sorted, using maximal term as the result excluding; Otherwise the result excluding is constant;
6. judge the quantity of result for retrieval in the value of D and interlude; If D>0, successively decreases one by D, abandon the result excluding; If D=0, and in interlude, the quantity of result for retrieval is less than N, the result excluding is added in interlude, according to the order adding, preserves result for retrieval; If the quantity of result for retrieval equals N in interlude, abandon the result excluding;
7. judge whether result for retrieval set is empty; In this way, finish; As no, go to step 1..
7. result for retrieval syllogic ordered collection method as claimed in claim 6, is characterized in that, according to result for retrieval, returns and asks the process of returning to result for retrieval to comprise the following steps:
(a) the relatively quantity of result for retrieval and a capacity for head section in head section;
(b) if the quantity of result for retrieval is less than its capacity in head section, the RC bar result for retrieval a Duan Zhongcong SI being started is as returning results; If RC has surpassed a Duan Zhongcong SI, start to the result for retrieval quantity of ending, a Duan Zhongcong SI is started to the result for retrieval ending up as returning results; If SI is greater than the quantity of result for retrieval in head section, return results as sky;
(c) if the quantity of result for retrieval equals its capacity in head section, from SI, start [MAX ((K '-SI) minimum head section, 0)] maximum in the whole result for retrieval in bar result for retrieval, interlude and rear [RC-MAX ((K '-SI), 0) whole result for retrieval quantity] in-interlude bar record is as returning results.
8. a result for retrieval syllogic ordered collection system, comprise for retrieve and obtain the indexing unit (11) of result for retrieval according to retrieval request, it is characterized in that: described system also comprises for result for retrieval sequence being collected in to the head section collection device (12) in head section at head according to predetermined head section capacity; For result for retrieval sequence being collected in to the rear collection device (13) in rear in last part according to predetermined rear capacity; For result for retrieval sequence being collected in to the interlude collection device (14) in interlude in middle part according to predetermined interlude capacity; And the result for retrieval return mechanism (15) that returns to result for retrieval for return to the result for retrieval scope of asking requirement to be returned according to result for retrieval;
A described head section capacity comprises the first capacity K and the second capacity K ', and the second capacity K ' is between 1.5 times to 10 times of the first capacity K; Described rear capacity M is between 0.5 times to 1 times of head section the first capacity K; The result for retrieval scope that described interlude capacity N returns according to request and head section capacity are determined;
Head section collection device (12) by sequence in result for retrieval at head, be collected in head section, rear collection device (13) by sequence in last part is collected in rear and interlude collection device (14) concrete mode that sequence is collected in to interlude in middle part be:
(1) determine head section the first capacity K and the second capacity K ', rear capacity M;
(2) according to result for retrieval, return to the result for retrieval scope that in request, requirement is returned and determine collection mode;
Described collection mode comprises only to be collected head section result for retrieval and collects head section, rear and interlude result for retrieval simultaneously; If collect head section, rear and interlude result for retrieval simultaneously, determine the distance D between interlude capacity N, interlude and head section;
(3) according to collection mode definite in step (2), result for retrieval is collected to section to the end, or head section, rear and interlude.
CN200910243807.7A2009-12-212009-12-21Three-segment sequential collecting method and system for retrieval resultsExpired - Fee RelatedCN102103615B (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
CN200910243807.7ACN102103615B (en)2009-12-212009-12-21Three-segment sequential collecting method and system for retrieval results

Applications Claiming Priority (1)

Application NumberPriority DateFiling DateTitle
CN200910243807.7ACN102103615B (en)2009-12-212009-12-21Three-segment sequential collecting method and system for retrieval results

Publications (2)

Publication NumberPublication Date
CN102103615A CN102103615A (en)2011-06-22
CN102103615Btrue CN102103615B (en)2014-03-26

Family

ID=44156392

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN200910243807.7AExpired - Fee RelatedCN102103615B (en)2009-12-212009-12-21Three-segment sequential collecting method and system for retrieval results

Country Status (1)

CountryLink
CN (1)CN102103615B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN101140588A (en)*2007-10-102008-03-12华为技术有限公司 A sorting method and device for relational search results
CN101158971A (en)*2007-11-152008-04-09深圳市迅雷网络技术有限公司 Method and device for sorting search results based on search engine
CN101425068A (en)*2007-10-302009-05-06国际商业机器公司Method for ordering search result and ordering device
CN101556603A (en)*2009-05-062009-10-14北京航空航天大学Coordinate search method used for reordering search results

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN101196916A (en)*2007-12-272008-06-11腾讯科技(深圳)有限公司Method and device for fragment storage file

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN101140588A (en)*2007-10-102008-03-12华为技术有限公司 A sorting method and device for relational search results
CN101425068A (en)*2007-10-302009-05-06国际商业机器公司Method for ordering search result and ordering device
CN101158971A (en)*2007-11-152008-04-09深圳市迅雷网络技术有限公司 Method and device for sorting search results based on search engine
CN101556603A (en)*2009-05-062009-10-14北京航空航天大学Coordinate search method used for reordering search results

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JP特开2009-140161A 2009.06.25
吴广君等.网络数据流分段存储模型的研究与实现.《通信学报》.2007,第28卷(第12期),第66-71页.*
王志勇.网络数据流存储算法分析与实现.《计算机应用与软件》.2009,第26卷(第11期),第186-188,220页.*

Also Published As

Publication numberPublication date
CN102103615A (en)2011-06-22

Similar Documents

PublicationPublication DateTitle
US8965904B2 (en)Apparatus and method for information access, search, rank and retrieval
CN101246499B (en)Network information search method and system
CN102725759B (en)Semantic directory for search results
US9652558B2 (en)Lexicon based systems and methods for intelligent media search
JP5550669B2 (en) SEARCH DEVICE, SEARCH METHOD, AND PROGRAM
US9235643B2 (en)Method and system for generating search results from a user-selected area
CN1979469A (en)Index and its extending and searching method
CN102902688A (en)Method and device for displaying results of keyword search
CN105183897A (en)Method and system for ranking video retrieval
CN102542052A (en)Priority hash index
JP2011523731A5 (en)
CN102004782A (en)Search result sequencing method and search result sequencer
US20120130994A1 (en)Matching funnel for large document index
CN103678491A (en)Method based on Hadoop small file optimization and reverse index establishment
US20120130996A1 (en)Tiering of posting lists in search engine index
JP2012533819A (en) Method and system for document indexing and data querying
CN105589929A (en)Image retrieval method and device
US20080059432A1 (en)System and method for database indexing, searching and data retrieval
CN102831224B (en)Generation method and device are suggested in a kind of method for building up in data directory library, search
US20160041975A1 (en)Document tagging and retrieval using per-subject dictionaries including subject-determining-power scores for entries
CN106874402A (en)Searching method and device
CN104361109A (en)Method and device for determining picture screening result
CN103942232B (en)For excavating the method and apparatus being intended to
CN102103615B (en)Three-segment sequential collecting method and system for retrieval results
US20140280050A1 (en)Term searching based on context

Legal Events

DateCodeTitleDescription
C06Publication
PB01Publication
C10Entry into substantive examination
SE01Entry into force of request for substantive examination
GR01Patent grant
GR01Patent grant
TR01Transfer of patent right
TR01Transfer of patent right

Effective date of registration:20220620

Address after:3007, Hengqin international financial center building, No. 58, Huajin street, Hengqin new area, Zhuhai, Guangdong 519031

Patentee after:New founder holdings development Co.,Ltd.

Patentee after:FOUNDER APABI TECHNOLOGY Ltd.

Patentee after:Peking University

Address before:100871, fangzheng building, 298 Fu Cheng Road, Beijing, Haidian District

Patentee before:PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before:FOUNDER APABI TECHNOLOGY Ltd.

Patentee before:Peking University

CF01Termination of patent right due to non-payment of annual fee
CF01Termination of patent right due to non-payment of annual fee

Granted publication date:20140326


[8]ページ先頭

©2009-2025 Movatter.jp