Detailed Description of the Embodiments
To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
In the embodiments of the invention, index data is generated from the online community data that has been obtained. When a keyword entered by a user is received, the corresponding text data is searched for using the generated index data, and the text data related to that user is displayed.
The embodiments of the invention provide a fast search method and a fast search system for online community data.
The method comprises: obtaining online community data and generating corresponding index data, the index data comprising text index data, permission index data, and user index data;
generating relationship chain index data from the text index data and the user index data, the relationship chain index data being index data of the relationship chains between texts and users;
receiving a keyword entered by a user, and searching the relationship chain index data for text data that matches the keyword;
filtering the matched text data according to the permission index data, and displaying the filtered text data.
The system comprises: an index data generation unit, configured to obtain online community data and generate corresponding index data, the index data comprising text index data, permission index data, and user index data;
a user relationship chain index data generation unit, configured to generate relationship chain index data from the text index data and the user index data, the relationship chain index data being index data of the relationship chains between texts and users;
a text data search unit, configured to receive a keyword entered by a user and to search the relationship chain index data for text data that matches the keyword;
a text data display unit, configured to filter the matched text data according to the permission index data and to display the filtered text data.
In the embodiments of the invention, text index data, permission index data, and user index data are generated from the online community data that has been obtained, and index data of the relationship chains between texts and users is generated from the text index data and the user index data. After a keyword entered by a user is received, the text data found in the relationship chain index data is arranged and displayed. In an online community, a user usually searches for text data published by users with whom he or she has some relationship. Arranging and displaying the text data in this way therefore lets the user quickly retrieve the text data being sought, which saves search time and improves the user experience.
To illustrate the technical solution of the invention, specific embodiments are described below.
Embodiment one:
Fig. 1 shows a flowchart of the fast search method for online community data provided by the first embodiment of the invention. In this embodiment, index data is generated from the online community data that has been obtained; when a keyword entered by a user is received, the corresponding text data is searched for using the generated index data, and the text data related to the user is arranged and displayed. The details are as follows:
In step S11, online community data is obtained and corresponding index data is generated. The index data comprises text index data, permission index data, and user index data.
In this embodiment, the online community data that is obtained comprises text data, permission data, and user data. The permission data describes the permission each user has over the obtained text data, for example that users inside the community have permission to view text data 1 while users outside the community do not. The user data describes the owner of each piece of text data, as well as the relationships between that owner and the other users in the community. For example, suppose user A published text data 4, user B published text data 5, and user A and user B are friends in online community X. The user data then records that the owner of text data 4 is user A, that the owner of text data 5 is user B, and that user B and user A are friends in online community X.
To illustrate the relationship among text data, permission data, and user data more clearly, it is presented in tabular form, as shown in Table 1:
Table 1:
As shown in Table 1, user 1 and user 2 can view the content of text 1, while user 3 cannot. Likewise, user 1, user 2, and user 3 all have permission to view text 2. Only user 3 has permission to view text 3; user 1 and user 2 do not.
The step of obtaining online community data and generating corresponding index data specifically comprises:
A. Obtaining the online community data. In this embodiment, all online community data, comprising text data, permission data, and user data, is obtained from the community's network data provider.
B. Parsing the online community data, generating text index data from the text data, generating permission index data from the permission data, and generating user index data from the user data. In this embodiment, the text data, permission data, and user data are parsed to generate text index data, permission index data, and user index data that conform to the index format; data that conforms to the index format is binary input data.
The step of generating text index data from the text data specifically comprises:
1. Deduplicating the obtained text data. In this embodiment, because much of the text data in a community network is reposted repeatedly, the obtained text data may contain multiple identical pieces of text data. To save storage space, the obtained text data needs to be deduplicated: if multiple identical pieces of text data exist, only one of them is kept. Specifically, a fingerprint algorithm is used to compute the MD5 value of each piece of text data; pieces of text data with the same MD5 value are judged to be identical, and only one piece of text data is kept for each distinct MD5 value. In this embodiment, the MD5 value serves as the fingerprint of the text data and is used to distinguish different pieces of text data.
2. Generating text index data from the deduplicated text data. In this embodiment, once the obtained text data has been deduplicated, every remaining piece of text data is different, so distinct text index data is generated for each distinct piece of text data, which saves storage space.
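The deduplication step above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `fingerprint` and `deduplicate` and the sample posts are hypothetical, and the sketch treats two texts as identical only when they are byte-for-byte equal after UTF-8 encoding, which is what a plain MD5 comparison implies.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return the MD5 hex digest of a text, used here as its fingerprint."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def deduplicate(texts):
    """Keep one text per distinct fingerprint (the first one encountered).

    Returns a dict mapping fingerprint -> retained text.
    """
    unique = {}
    for text in texts:
        # setdefault stores the text only if this fingerprint is new.
        unique.setdefault(fingerprint(text), text)
    return unique


# Hypothetical sample data: the second and third posts are reposts of each other.
posts = ["hello community", "a reposted article", "a reposted article"]
text_index = deduplicate(posts)
```

Keying the retained texts by fingerprint means the same value can later be reused to cluster index entries that refer to the same underlying text.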
The step of generating permission index data from the permission data specifically comprises:
Pushing the permission data to the search backend, so that the search backend stores the permission data using specific data structures and infrastructure components, generates the permission index data, updates the generated permission index data in real time, and provides a real-time query service to users. The specific data structures used to store the permission data include a hash table or a bitmap, a dynamic memory manager, and the like.
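One way to combine the hash table and bitmap mentioned above is sketched below. It is an assumption made for illustration, not the layout used by the search backend: the class `PermissionIndex` and its methods are hypothetical, users are assumed to have small integer ids, and each text id maps to an integer whose bit `u` is set when user `u` may view the text. The sample data follows the permissions described for Table 1.

```python
class PermissionIndex:
    """Hash table of bitmaps: text id -> int, where bit u means user u may view."""

    def __init__(self):
        self._bitmaps = {}

    def grant(self, text_id: int, user_id: int) -> None:
        """Set the user's bit for this text."""
        self._bitmaps[text_id] = self._bitmaps.get(text_id, 0) | (1 << user_id)

    def revoke(self, text_id: int, user_id: int) -> None:
        """Clear the user's bit for this text (a real-time permission update)."""
        self._bitmaps[text_id] = self._bitmaps.get(text_id, 0) & ~(1 << user_id)

    def can_view(self, text_id: int, user_id: int) -> bool:
        """Check the user's bit; unknown texts are treated as not viewable."""
        return bool((self._bitmaps.get(text_id, 0) >> user_id) & 1)


# Permissions as described for Table 1.
permissions = PermissionIndex()
for text_id, users in {1: [1, 2], 2: [1, 2, 3], 3: [3]}.items():
    for user_id in users:
        permissions.grant(text_id, user_id)
```

With this layout a permission check is one dictionary lookup plus one bit test, which suits the real-time query service the text describes.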
The step of generating user index data from the user data specifically comprises: converting the obtained user data into binary data that conforms to the index format.
In step S12, relationship chain index data is generated from the text index data and the user index data. The relationship chain index data is index data of the relationship chains between texts and users.
The step of generating the relationship chain index data from the text index data and the user index data specifically comprises:
clustering the text index data according to the fingerprints of the text data;
clustering the already clustered text index data again according to the user index data, thereby generating the relationship chain index data, which is index data of the relationship chains between texts and users.
In this embodiment, the text index data is the text index data obtained after deduplication. The user index data includes user ids, and the clustered text index data is clustered again by user id.
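The two clustering passes described above can be sketched as follows. The record layout is an assumption made for illustration: each posting is a hypothetical `(fingerprint, owner_id, post_id)` tuple, and the resulting index is a nested mapping from owner id to fingerprint to post ids. The source does not specify the concrete format of the relationship chain index data.

```python
from collections import defaultdict


def build_relation_chain_index(postings):
    """Cluster postings by text fingerprint, then by owning user id.

    postings: iterable of (fingerprint, owner_id, post_id) tuples (assumed layout).
    Returns {owner_id: {fingerprint: [post_id, ...]}}.
    """
    # Pass 1: cluster the postings by text fingerprint.
    by_fingerprint = defaultdict(list)
    for fp, owner_id, post_id in postings:
        by_fingerprint[fp].append((owner_id, post_id))

    # Pass 2: regroup each fingerprint cluster by the owner's user id.
    index = defaultdict(lambda: defaultdict(list))
    for fp, entries in by_fingerprint.items():
        for owner_id, post_id in entries:
            index[owner_id][fp].append(post_id)

    # Convert to plain dicts so the result is easy to inspect and compare.
    return {owner: dict(groups) for owner, groups in index.items()}


# Hypothetical sample: users A and B each posted; B also reposted A's text.
sample_postings = [("fp1", "A", 4), ("fp2", "B", 5), ("fp1", "B", 6)]
relation_chain_index = build_relation_chain_index(sample_postings)
```

Grouping by owner last means a search can go straight to the postings of users related to the searcher, while the fingerprint level still records which postings share the same deduplicated text.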
Further, after the step of generating the relationship chain index data from the text index data and the user index data, the method also comprises the following step:
Storing the generated text index data, permission index data, user index data, and relationship chain index data. In this embodiment, every piece of generated text index data, permission index data, user index data, and relationship chain index data is stored. If the text index data has been deduplicated, only the deduplicated text index data is stored, together with every piece of permission index data, every piece of user index data, and every piece of relationship chain index data. This is because, once the same text data has been reposted, it usually corresponds to different owners and to different permitted users. Therefore, even if the text index data has been deduplicated, all permission index data and all relationship chain index data still need to be stored.
Further, after the step of storing the generated text index data, permission index data, user index data, and relationship chain index data, the method comprises the following step:
If the online community data changes, editing the stored index data. In this embodiment, if new text data is subsequently added, or text data that has already been obtained is deleted, the stored text index data is updated and the updated text index data is stored. If permission data that has already been obtained subsequently changes, the stored permission index data is updated in real time and the updated permission index data is stored. Likewise, if the text index data or the user index data changes, the relationship chain index data is updated and the updated relationship chain index data is stored.
In step S13, a keyword entered by a user is received, and the relationship chain index data is searched for text data that matches the keyword.
In this embodiment, a user who needs to find certain text data in the online community enters a keyword related to that text data. After receiving the keyword, the computer matches it against the relationship chain index data. If the match succeeds, the text data corresponding to the keyword is looked up through the matching relationship chain index data. If the match fails, a message such as "No related records found" is output to notify the user.
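The matching step above can be sketched with a small inverted index. This is a simplification under stated assumptions: the functions `build_keyword_index` and `lookup` are hypothetical, keywords are matched against whitespace-separated, lower-cased tokens, and the index maps tokens directly to post ids rather than going through a full relationship chain index.

```python
from collections import defaultdict

NOT_FOUND = "No related records found"


def build_keyword_index(posts):
    """Build an inverted index {token: set of post ids} from {post_id: text}."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for token in text.lower().split():
            index[token].add(post_id)
    return index


def lookup(index, keyword):
    """Return the sorted ids of matching posts, or a not-found message."""
    hits = index.get(keyword.lower())
    if not hits:
        return NOT_FOUND
    return sorted(hits)


# Hypothetical sample posts.
keyword_index = build_keyword_index({4: "Hiking trip photos", 5: "Weekend hiking plan"})
```

Here `lookup(keyword_index, "hiking")` returns both post ids, while a keyword that appears in no post yields the not-found message that is shown to the user.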
In step S14, the matched text data is filtered according to the permission index data, and the filtered text data is displayed.
In this embodiment, when a keyword entered by a user is received, the corresponding text data is found using the keyword; from the text data found, only the text data that the user has permission to view is retained, and the retained text data is arranged and displayed in a certain order. When the owner of a piece of found text data has a particular relationship with the searching user, that text data is displayed at the front of the search results; text data published by owners who have no particular relationship with the searching user is displayed further back. A particular relationship means that, besides belonging to the same online community, the owner of the text data and the searching user have at least one other relationship, such as being friends, classmates, or schoolmates.
The step of filtering the matched text data according to the permission index data and displaying the filtered text data specifically comprises:
Determining, from the user data, the relationship between the searching user and the owner of the text data found; determining, from that relationship and the permission index data, whether the searching user has permission to view the text data; when the searching user has permission to view the text data found, preferentially displaying the text data published by owners who have a relationship with the searching user; and when the searching user does not have permission to view the text data found, not displaying that text data. Because search results are displayed according to whether the searching user has permission to view the text data, users can conveniently find text data published by related users in the same online community, and each user's privacy is effectively protected.
For example, suppose that after user C enters a keyword, multiple pieces of text data related to the keyword are found, and some of them were published by user D, a friend of user C, who has set the permission "visible to friends only" on the text data he or she publishes. When the text data found by user C's search is displayed, the text data published by user D is displayed first, and the text data published by other users who have no relationship with user C is displayed further back in the search results, so that user C can quickly find the text data published by related users in the same online community. Because user D has set the permission "visible to friends only", when a user who is not a friend of user D searches the online community with the same keyword, the text data published by user D is not displayed. Likewise, if user D sets a "private" permission on some of the text data he or she publishes in the online community, only user D can view that text data and all other users have no right to view it, which effectively protects user D's privacy.
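The filtering and ordering in this example can be sketched as follows. All names are hypothetical and the permission model is reduced to three assumed levels, "public", "friends", and "private", which correspond to the settings mentioned in the example; the real permission index data may be richer.

```python
def filter_and_rank(hits, searcher, friends, visibility):
    """Drop posts the searcher may not view, then list related owners' posts first.

    hits:       list of (post_id, owner) pairs matched by the keyword.
    friends:    {user: set of that user's friends} (assumed relationship data).
    visibility: {post_id: "public" | "friends" | "private"} (assumed levels).
    """
    related = friends.get(searcher, set())

    def can_view(post_id, owner):
        if owner == searcher:
            return True  # owners can always view their own posts
        level = visibility[post_id]
        if level == "public":
            return True
        if level == "friends":
            return owner in related
        return False  # "private": visible to the owner only

    visible = [(p, o) for p, o in hits if can_view(p, o)]
    # sorted() is stable, so posts keep their match order within each group;
    # the key is False (sorts first) for owners related to the searcher.
    return sorted(visible, key=lambda hit: hit[1] not in related)


# Hypothetical data: C and D are friends; E has no friends; F is unrelated to C.
friends = {"C": {"D"}, "D": {"C"}, "E": set()}
hits = [(1, "F"), (2, "D"), (3, "D")]
visibility = {1: "public", 2: "friends", 3: "private"}
results_for_c = filter_and_rank(hits, "C", friends, visibility)
results_for_e = filter_and_rank(hits, "E", friends, visibility)
```

In this sample, user C sees user D's friends-only post ahead of the unrelated public post, user E sees only the public post, and user D's private post is shown to neither.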
In the first embodiment of the invention, text index data, permission index data, and user index data are generated from the online community data that has been obtained, and relationship chain index data is generated from the text index data and the user index data. After a keyword entered by a user is received, the text data found in the relationship chain index data is preferentially displayed. In an online community, a user usually searches for text data published by users with whom he or she has some relationship. Preferentially displaying text data published by users who have some relationship with the searching user therefore lets the user quickly retrieve the text data being sought, which saves search time and improves the user experience. Moreover, because only the text data that the searching user has permission to view is displayed, and text data that the searching user has no permission to view is not displayed, the privacy of the text data is effectively protected.
Embodiment two:
Fig. 2 shows a structural diagram of the fast search system for online community data provided by the second embodiment of the invention. For ease of explanation, only the parts relevant to the embodiment of the invention are shown.
The fast search system for online community data can be used in various information processing terminals that connect to a server over a wired or wireless network, such as a mobile phone, a Pocket Personal Computer (PPC), a palmtop computer, a desktop computer, a notebook computer, or a Personal Digital Assistant (PDA). It can be a software unit running in such a terminal, a hardware unit, or a unit combining software and hardware, and it can also be integrated into such a terminal as an independent plug-in or run within an application system of such a terminal. In the system:
The index data generation unit 21 is configured to obtain online community data and generate corresponding index data, the index data comprising text index data, permission index data, and user index data.
The index data generation unit 21 comprises an online community data acquisition module 211 and a plurality of index data generation modules 212.
The online community data acquisition module 211 is configured to obtain online community data, the online community data comprising text data, permission data, and user data.
The plurality of index data generation modules 212 are configured to parse the online community data, generate text index data from the text data, generate permission index data from the permission data, and generate user index data from the user data.
In this embodiment, after the online community data acquisition module 211 has obtained the text data of the online community, the permission data corresponding to the text data, and the user data of each user in the online community, the plurality of index data generation modules 212 generate the corresponding text index data, permission index data, and user index data from the online community data obtained by the online community data acquisition module 211. Specifically: (1) the text index data is generated by deduplicating the obtained text data to save storage space and then generating the text index data from the deduplicated text data; (2) the permission index data is generated by pushing the permission data to the search backend, so that the search backend stores the permission data using a specific data structure, generates the permission index data, and provides a real-time query service to users, the specific data structure including a hash table, a bitmap, or the like; (3) the user index data is generated by converting the obtained user data into binary data that conforms to the index format. The generated text index data, permission index data, and user index data are all binary input data.
The user relationship chain index data generation unit 22 is configured to generate relationship chain index data from the text index data and the user index data, the relationship chain index data being index data of the relationship chains between texts and users.
The user relationship chain index data generation unit 22 comprises a text clustering module 221 and a user clustering module 222.
The text clustering module 221 is configured to cluster the text index data according to the fingerprints of the text data.
The user clustering module 222 is configured to cluster the already clustered text index data again according to the user index data, thereby generating the relationship chain index data, which is index data of the relationship chains between texts and users.
Further, the fast search system for online community data also comprises a storage unit 25 and an editing unit 26.
The storage unit 25 is configured to store the generated text index data, permission index data, user index data, and relationship chain index data.
The editing unit 26 is configured to edit the stored index data when the online community data changes.
The text data search unit 23 is configured to receive a keyword entered by a user and to search the relationship chain index data for text data that matches the keyword.
The text data display unit 24 is configured to filter the matched text data according to the permission index data and to display the filtered text data.
Further, the text data display unit 24 comprises a user relationship determination module 241, a text data permission determination module 242, and a text data selection and display module 243.
The user relationship determination module 241 is configured to determine, from the user data, the relationship between the searching user and the owner of the text data found.
The text data permission determination module 242 is configured to determine, from the relationship between the searching user and the owner of the text data and from the permission index data, whether the searching user has permission to view the text data found.
The text data selection and display module 243 is configured to preferentially display, when the searching user has permission to view the text data found, the text data published by owners who have a relationship with the searching user.
In the second embodiment of the invention, after the user relationship chain index data generation unit 22 has generated the relationship chain index data from the text index data and user index data generated by the index data generation unit 21, the text data search unit 23 searches for the text data related to the keyword entered by the user, and the text data display unit 24 arranges and displays it. In an online community, a user usually searches for text data published by users with whom he or she has some relationship. Preferentially displaying text data published by users who have some relationship with the searching user therefore lets the user quickly retrieve the text data being sought, which saves search time and improves the user experience. Moreover, because only the text data that the searching user has permission to view is displayed, and text data that the searching user has no permission to view is not displayed, the privacy of the text data is effectively protected.
In summary, the present invention generates text index data, permission index data, and user index data from the online community data that has been obtained, and generates index data of the relationship chains between texts and users from the text index data and the user index data. After a keyword entered by a user is received, the text data found in the relationship chain index data is preferentially displayed, so that the user can quickly retrieve text data published by related users, while text data that the searching user has no permission to view is never displayed.
The above describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.