Summary of the invention
The object of the invention is to overcome the shortcomings and defects of the prior art described above and to provide a computer-aided method for generating reports and a knowledge base. More particularly, the method draws on both unstructured and structured information sources: it fuses a full-network Internet search (web pages and structured websites) with searches of offline third-party databases (for example, patent databases), monitors the user's points of interest continuously and in real time, and assists the user in producing a thematic report on a theme set around the enterprise.
The technical scheme adopted by the present invention to solve its technical problem is as follows. The computer-aided report and knowledge base generation method comprises the following steps:
1.1) The server-side system receives a user's search request, searches the Internet and third-party databases for all structured and unstructured web pages and websites matching the request, classifies, deduplicates, and sorts the results, and returns them to the user's client digital terminal system; at the same time it automatically matches corresponding information in third-party databases such as the Dun & Bradstreet enterprise database, patent databases, public security case databases, and industry databases;
1.2) The server-side system automatically records the user's search behavior, actively monitors the information sources for updates, and captures and classifies updated information in real time; when the user logs in, it notifies the user of the updates to the information sources and also sends a reminder e-mail to the user's mailbox;
1.3) From the search results, the user selects the information needed and collects it into the system; using a knowledge mining method, the system analyzes the collected information, generates a report, and exports it as a file;
1.4) The server-side system applies machine learning to the user's behavior, actively mines the user's search needs, and automatically prompts the user with information points worth searching.
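By way of illustration only, the classification, deduplication, and sorting of step 1.1 might be sketched as follows. The dictionary keys "url", "title", and "category", the function names, and the URL normalization rule are all assumptions made for this sketch; the description does not specify how duplicates are detected.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to host + path so trivially different links compare equal.

    Lower-casing the whole URL is a simplification chosen for this sketch.
    """
    parts = urlsplit(url.strip().lower())
    return parts.netloc + parts.path.rstrip("/")


def merge_results(web_hits, third_party_hits):
    """Merge two hit lists, drop duplicates, and group the rest by category.

    Each hit is a dict with the hypothetical keys "url", "title", "category".
    """
    seen = set()
    grouped = defaultdict(list)
    for hit in list(web_hits) + list(third_party_hits):
        key = normalize_url(hit["url"])
        if key in seen:  # deduplication: keep only the first copy
            continue
        seen.add(key)
        grouped[hit["category"]].append(hit)  # classification by category
    # Sorting: order each category by title for a stable presentation.
    return {cat: sorted(hits, key=lambda h: h["title"])
            for cat, hits in grouped.items()}
```

A hit from a third-party database that points at an address already returned by the web search is dropped, so each item reaches the client once.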
In the present invention the user can also perform deep searches on the information retrieved and contact parties by instant communication; users are managed by means of a USB key, and an interface to third-party databases is provided.
The beneficial effects of the present invention are:
1) The efficiency with which users search, collect, and classify information is greatly improved, and the decision-making departments and managers of an enterprise are given more channels and more convenient business decision tools;
2) Through human-computer interaction, unstructured information is converted into structured information while the semantic associations within it are preserved, such as the association between a "team" and a "project". This information is fused with the other structured information retrieved to form the material needed for a report. The composition of the information is enriched, making it more complete and more accurate, which markedly improves the efficiency with which the enterprise obtains and uses information and saves the costs of collecting, storing, and mining it;
3) A deep loop search is provided: by continually repeating a process of divergence and aggregation, the user can keep searching the next level of information from points of interest at different levels;
4) The user's points of interest are tracked and monitored continuously in real time, and new developments concerning them are captured and collected so that the user can make timely use of them at the next login. This keeps the information fresh and lets the enterprise respond quickly to changes among competitors and in the market;
5) The knowledge mining function follows the subject analysis logic given by the user. With manual assistance and correction, it organizes the data at scale and analyzes the information and intelligence data, drawing simple inferred conclusions and relatively detailed reports that are not available on the Internet or in the third-party databases. These support the user's higher-level reasoning, for example about the future development trend of a product or a proposal for a potential innovative product. The user can also add retrieved data to a report, and a report export function is provided for convenient offline browsing;
6) The companies and persons found by a search can be contacted by instant communication, which makes it easier to verify the authenticity of the information and to pursue cooperation.
Embodiment
The invention is described further below with reference to the drawings and embodiments:
In this description, where steps or features bearing the same reference numeral are cited in any one or more of the drawings, those steps or features have substantially the same function or operation.
Fig. 1 is a system block diagram of the computer-aided report and knowledge base generation system in the exemplary embodiment. The system comprises client I 100, client II 110, calling terminal I 120, calling terminal II 130, digital network 140, third-party data source 150, server-side system 160, application program 170, database records 180, and database server-side system 190. Each part is described in detail below with reference to Fig. 1.
Client I 100 and client II 110 are two different forms of client. In terms of functional composition, both the clients and the server-side system can be regarded as client machine systems.
Client machine system: the client machine system of the present invention can be implemented by, but is not limited to, a digital terminal system that executes the application program carrying out the processing of the present invention. The client machine system can be a digital terminal or a terminal connected to a digital terminal. In general, to implement the system of the present invention, the digital terminal referred to here needs to comprise at least a display device, audio input and output devices, a user input device, a memory, and a CPU, and is assumed to be able to run an application program capable of implementing the system of the present invention, such as the web browser Internet Explorer.
It will be appreciated that the client machine system is not limited to a digital terminal system and can also be another device such as a mobile phone, as a person skilled in the art will readily understand.
Client I 100: client I represents a form of client that accesses digital network 140 to communicate with server-side system 160. The purpose of the communication is to request and receive search information from server-side system 160. Client I comprises cohort 1 (101) and cohort 2 (102), which are connected by local network 103 and are likewise two different client machine systems. Cohort 1 and cohort 2 can be located in the same local network or in different local networks. Together, cohort 101 and cohort 102 connected by local network 103 constitute client I.
Cohort: a cohort can be a set united by individuals, departments, subsidiaries, partners, or in other ways, and can also represent an industry, such as the financial industry or manufacturing.
Local network 103: includes a local area network (LAN) confined to a limited geographic area, as well as a wide area network (WAN) or metropolitan area network (MAN) not confined to a limited geographic area.
Client II 110: unlike client I, client II represents another form of client that can communicate with server-side system 160 through digital network 140. Client II represents a single independent client machine system 110.
It will be appreciated that another embodiment may include only one, or both, of the client forms client I and client II; the combination of client forms does not affect the implementation of the system of the present invention.
Calling terminal I 120: represents calling terminals in the form of fixed-line telephones, such as cordless telephones, wired telephones, and video telephones.
Calling terminal II 130: represents calling terminals in the form of mobile telephones, such as mobile phones and Personal Handyphone System (PHS) handsets.
It should be noted that, with advanced technology, a calling terminal and a client can each perform the other's functions, for example a mobile phone with Internet access or a client on which a network telephony platform is installed.
Digital network 140: a wired or wireless network for transmitting digital information or signals. It can be understood as, but is not limited to, a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a virtual private network, or the Internet. Client I, client II, calling terminal I, calling terminal II, and other network terminal entities can connect to server-side system 160 through a network of any form, and they need not all connect to server-side system 160 through the same network.
Third-party data source 150: can be implemented by one or more servers similar to server-side system 160. Its role is to make third-party information sources outside server-side system 160, such as patent databases, available for querying. Application program 170 accesses the information content these sources provide, generates the associated information, and returns it to client I 100, client II 110, calling terminal I 120, and calling terminal II 130.
Server-side system 160: the server-side system is implemented by one or more servers. It can be a combination of one or more of database server 161, web server 162, and application server 163, or a single server that incorporates the functions of one or more of these servers.
Server: used to execute, in response to requests, the computer programs stored on the server.
Database server 161: stores all the electronic information of database records 180 and performs access to database records 180.
Database server-side system 190: consists of crawler-side system I 191, crawler-side system II 192, and crawler-side system III 193, which are three different client machine systems and can be any combination of digital client machine systems. In merge search module 171, the system uses a web crawler, a computer program also known as a "web spider" or network robot, to fetch the information the user searches for from digital network 140 and third-party data source 150. Because the workload of the crawler program is very large, distributed crawlers are also needed to keep the overall data complete and the process efficient; that is, a number of crawler-side systems are needed to carry out distributed fetching and distributed uploading of the data.
Crawler-side system: a client machine system that simulates the operation of a browser, carries out data crawling tasks, and uploads the crawled data and information to the central database on the database server.
Database records 180: store the various information content and data of all users or client machine systems associated with server-side system 160, such as index 181, search results 182, and user learning model 183. This content and data include the fields of the database records in the exemplary embodiment shown in Fig. 3-1, Fig. 3-2, and Fig. 3-3.
Fig. 3-1 illustrates an example of the structure of index 181, which comprises a number of fields. The data in index 181 are: keyword 311, number 312 of the article in which the keyword occurs, occurrence count 313, and occurrence frequency 314. Unlike a conventional index, the present invention builds an inverted index over the data from all data sources, which makes the database easier to query and improves search efficiency. Keyword 311 is a word that represents the core meaning of an article or an item of information. One item of information may contain several keywords. Keywords are the basis on which the user searches for information and also the basis on which server-side system 160 searches digital network 140 and third-party data source 150. Article number 312 is the number of an article in which a particular keyword occurs, or of a location within it such as a page or chapter; it indicates where the keyword occurs, and server-side system 160 uses this correspondence to look up and store the information found. Occurrence count 313 is the total number of times a particular keyword occurs while server-side system 160 searches digital network 140 and third-party data source 150. Occurrence frequency 314 is the number of times a particular keyword occurs within a given period of time during such a search. The system ranks the retrieved information by keyword occurrence count and frequency: the higher the count and the frequency, the nearer the top a search result is placed.
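By way of illustration only, an inverted index of the kind described for index 181 might be sketched as follows. The class and method names are hypothetical, and ranking by per-article occurrence count alone is a simplification assumed for this sketch; the time-windowed occurrence frequency 314 is left out.

```python
from collections import defaultdict


class InvertedIndex:
    """Maps each keyword to the articles that contain it."""

    def __init__(self):
        # keyword -> {article number: occurrences of the keyword in that article}
        self.postings = defaultdict(dict)

    def add_article(self, article_number, keywords):
        """Record every keyword occurrence of one article."""
        for kw in keywords:
            per_article = self.postings[kw]
            per_article[article_number] = per_article.get(article_number, 0) + 1

    def occurrence_count(self, keyword):
        """Total occurrences of a keyword across all indexed articles."""
        return sum(self.postings.get(keyword, {}).values())

    def search(self, keyword):
        """Article numbers ordered by descending occurrence count."""
        hits = self.postings.get(keyword, {})
        return sorted(hits, key=lambda a: (-hits[a], a))
```

Looking a keyword up is a single dictionary access followed by a sort of its postings, which is the efficiency gain an inverted index offers over scanning every article.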
Fig. 3-2 illustrates an example of the structure of search results 182, which comprises a number of fields. A search result, as the term is used in the present invention, is an item of information that the user has selected from the information returned by the system and collected into the system platform. Search results 182 consist of search result number 321 and search result content 322. Search result number 321 is the unique number of the search result and is used to associate it with the user learning model. Search result content 322 varies with the category of information: a search result in the person category includes information such as name, age, date of birth, contact details, projects in charge, and technologies mastered, whereas a search result in the technology category includes information such as inventor, technical background, affiliated company or institution, and technical advancement.
Fig. 3-3 illustrates an example of the structure of user learning model 183, which comprises a number of fields. The data in user learning model 183 are: user number 331, user name 332, other registration information 333, historical search information 334, historical search results 335, and historical collected information sources 336. User number 331 is the unique number under which the user's information is stored in the database, so that server-side system 160 can update and retrieve the user learning model. User name 332 is the credential with which the user logs in to the system. Other registration information 333 is the information, other than user name 332, that the user supplies when registering or being set up as a registered user of the system, such as login password, industry, and company name. Historical search information 334 is the set of key words and phrases the user has searched for since registering, together with the search results; server-side system 160 uses historical search information 334 for user behavior learning. Historical search results 335 are the items the user has selected from the retrieved information and collected into the system platform. Historical collected information sources 336 are the information sources that the user has set up independently after becoming a system user, such as a particular third-party database or website; the system records these collection points and uses them as a basis for user behavior learning.
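By way of illustration only, the records of Fig. 3-2 and Fig. 3-3 might be modeled as follows. The class names, field names, and types are assumptions made for this sketch; the description gives only the reference numerals noted in the comments.

```python
from dataclasses import dataclass, field


@dataclass
class SearchResult:
    number: int    # search result number 321, unique per result
    content: dict  # search result content 322; fields vary by category


@dataclass
class UserLearningModel:
    user_number: int                                              # 331
    user_name: str                                                # 332
    other_registration_info: dict = field(default_factory=dict)   # 333
    search_history: list = field(default_factory=list)            # 334
    collected_result_numbers: list = field(default_factory=list)  # 335
    collected_sources: list = field(default_factory=list)         # 336

    def record_search(self, phrase: str) -> None:
        """Append a searched phrase to the history used for behavior learning."""
        self.search_history.append(phrase)

    def collect(self, result: SearchResult) -> None:
        """Associate a collected result with this user by its unique number."""
        if result.number not in self.collected_result_numbers:
            self.collected_result_numbers.append(result.number)
```

Storing only the result number in the user model reflects the stated role of number 321 as the link between a search result and the user learning model.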
Returning now to Fig. 1.
Web server 162: communicates with clients such as client I 100, client II 110, calling terminal I 120, and calling terminal II 130, for example sending information to and receiving information from them, and carries out the associated tasks.
Application server 163: according to the exemplary embodiment, the application server stores and executes computer programs such as application program 170.
Application program 170: in this description, the one or more computer programs that implement the system of the present invention are referred to collectively as the application program. Some of the processing in the application program can, of course, be carried out by client I 100, client II 110, calling terminal I 120, and calling terminal II 130. Application program 170 comprises the following main modules: merge search module 171, user behavior learning module 172, report export module 173, call center module 174, and back-end administration module 175.
Merge search module 171: the merge search module is the application component that works as follows. After the user enters key words and phrases, the system performs Chinese word segmentation on them. (This system uses a mixed segmentation mode that combines a terminology dictionary, common segmentation, stop words, and joint segmentation to arrive at a correct segmentation.) The input is thereby cut into separate words by the various segmentation methods. The system then starts the web crawler program, which crawls the unstructured web pages and structured websites of digital network 140 and third-party data source 150 for data related to the key words and phrases. The data are analyzed intelligently; after repeated content and unwanted information are removed, index file 181 is written to database server 161, and the information is reordered and returned to the user for browsing. The user can select any word on the page and add it to the search task, whereupon the system performs a further, deeper search on the selected words. The user can also collect the information needed into the system platform; the collected information is stored in search results 182, and user learning model 183 is updated at the same time.
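By way of illustration only, one ingredient of such a segmentation step might be sketched as follows. The sketch uses forward maximum matching against a term dictionary followed by stop-word removal; this particular technique, the function name, and the parameters are assumptions, since the description does not say how the mixed segmentation mode is implemented.

```python
def segment(text, dictionary, stop_words, max_len=4):
    """Forward maximum matching: at each position take the longest dictionary word.

    Characters not covered by any dictionary word are emitted one at a time.
    Stop words and whitespace are dropped from the output.
    """
    words = []
    i = 0
    while i < len(text):
        match = text[i]  # fall back to a single character
        # Try the longest candidate first, down to two characters.
        for size in range(min(max_len, len(text) - i), 1, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                match = candidate
                break
        i += len(match)
        if match not in stop_words and not match.isspace():
            words.append(match)
    return words
```

With a dictionary containing 专利, 数据库, and 搜索 and the stop word 的, the phrase 专利数据库的搜索 is cut into those three words, which can then be handed to the crawler as separate search terms.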
User behavior learning module 172: the data basis of the user behavior learning module is user learning model 183. While the user searches for information, the system automatically records the key words and phrases searched, the information related to them, and the search results collected, and also tracks the information sources the user has set up for collection. After machine learning analysis, the system learns the user's search habits and topics of interest, so that during a search it can actively push content likely to interest the user, based on the user learning model.
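By way of illustration only, the idea of deriving topics of interest from the recorded search history might be sketched as follows. Simple word-frequency counting stands in here for the machine learning analysis, whose algorithm the description does not specify; the function name and parameters are hypothetical.

```python
from collections import Counter


def suggest_topics(search_history, already_shown=(), top_n=3):
    """Return the words the user has searched for most often.

    search_history: list of previously searched phrases (whitespace-separated).
    already_shown: words that should not be suggested again.
    """
    counts = Counter(
        word
        for phrase in search_history
        for word in phrase.lower().split()
    )
    ranked = [word for word, _ in counts.most_common() if word not in already_shown]
    return ranked[:top_n]
```

A word that recurs across many searches rises to the top of the list and can be used to select content to push to the user.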
Report export module 173: the report export module provides a reporting function for the keywords the user has searched for. It can use the knowledge mining function to analyze the search results and draw up a preliminary report. With this module the user can export the information and reports needed in HTML or Word format for convenient offline browsing later.
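By way of illustration only, the HTML export of module 173 might be sketched as follows. The function name and the representation of a report as a title plus (heading, paragraph) pairs are assumptions made for this sketch.

```python
from html import escape


def render_report_html(title, sections):
    """Render a report as a standalone HTML page for offline browsing.

    sections: list of (heading, paragraph) pairs of plain text.
    All text is escaped so that retrieved content cannot break the markup.
    """
    parts = [
        "<!DOCTYPE html>",
        '<html><head><meta charset="utf-8">',
        f"<title>{escape(title)}</title></head><body>",
        f"<h1>{escape(title)}</h1>",
    ]
    for heading, body in sections:
        parts.append(f"<h2>{escape(heading)}</h2>")
        parts.append(f"<p>{escape(body)}</p>")
    parts.append("</body></html>")
    return "\n".join(parts)
```

The returned string can be written to a file with an .html extension and opened in any browser without a network connection.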
Call center module 174: much of the information in search results 182 includes contact details that can be used for communication, such as fixed-line and mobile telephone numbers. Using a user input device of client I 100, client II 110, calling terminal I 120, or calling terminal II 130, such as a mouse or keyboard, the user asks the system to communicate with a contact in the search results. The system starts the call center module and connects the user to the fixed-line telephone terminal or network telephone terminal of the organization concerned. Using audio input and output devices such as an earphone and microphone, the user can immediately get in touch with the contact of interest, and the called party can answer using a fixed-line telephone, a mobile phone, or an earphone and microphone. In this way, a user who has a question need not call the other party with a communication device such as a fixed-line telephone but can complete the consultation directly online. For VoIP, signaling uses the Session Initiation Protocol (SIP) and the voice stream is carried over RTP, with a choice of several audio formats. Both signaling and streaming media are encrypted. Call records are saved automatically by the system.
Back-end administration module 175: with the back-end administration module, users can be managed, for example by modifying user information. The information sources used by merge search module 171 can be maintained and managed, including the search interface address, the number of result pages to search, and the search engine name. Data can also be organized and statistical information managed. To provide users with a convenient search interface and service, multiple search engines can be configured simply in the back-end administration module so as to offer a multi-engine search service. Multiple themes can also be configured flexibly, each with its own keyword dictionary; when the user indexes files, index data and index weights for particular keywords can be drawn from the keyword dictionary, helping the user find the required data more accurately.
It should be understood that Fig. 1 shows just one exemplary system, in order to illustrate the present invention more clearly, and does not mean that the present invention is confined to this scope.
Turning now to Fig. 2, which illustrates the processing procedure of the exemplary embodiment. The dashed portions of the figure are steps carried out in the system background or steps invisible to the user.
First, in step 210, the user logs in to the system platform over the network. Because the platform can be accessed only after registration, every user who logs in to the system platform is treated by default as a registered user.
Next, in step 211, the user enters the keyword or sentence to be searched in the search input box and clicks the search button; the search for the required information is performed in step 212. The system automatically performs step 213 and determines whether this user has searched for related information before. If the result is "yes", that is, the user has searched for related information before, the system performs step 214, retrieving the index file and the search results corresponding to the index, and then step 215, updating the user learning model, and returns the corresponding search results to the user, which is step 220. If the user has not searched for related information through this system before, the system performs step 216, creating an index in the user learning model for this user, and step 217, in which the merge search method gathers the related information from the whole network and the third-party databases, deduplicates and integrates it, and returns it to the user; at the same time it performs step 218, updating the index file.
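By way of illustration only, the branch at step 213 might be sketched as follows. The function name, the dictionary used as an index cache, and the shape of the learning model are all assumptions; the step numbers in the comments refer to Fig. 2.

```python
def handle_search(user_id, query, index_cache, learning_models, fused_search):
    """Serve a query from stored results when possible, else run a merged search.

    index_cache: dict mapping (user_id, query) to stored results.
    learning_models: dict mapping user_id to {"history": [...]}.
    fused_search: callable taking a query and returning a list of results.
    """
    key = (user_id, query)
    if key in index_cache:  # step 213: has this user searched this before?
        results = index_cache[key]  # step 214: reuse index and stored results
    else:
        learning_models.setdefault(user_id, {"history": []})  # step 216
        results = fused_search(query)  # step 217: merged, deduplicated search
        index_cache[key] = results  # step 218: update the index file
    # Step 215: update the learning model. Recording the query on both
    # branches is a simplification made for this sketch.
    learning_models.setdefault(user_id, {"history": []})["history"].append(query)
    return results  # step 220: return results to the user
```

On a repeated query the expensive merged search is skipped entirely, while the user learning model still records that the query was made.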
The user then browses the search results returned by the system and can select a keyword in a web page and run a tracking search on it, which is step 230. The system thereupon performs step 217 and its subsequent steps together with step 215: it returns the information the user needs by means of the merge search and updates the user learning model accordingly.
Optionally, the user can select contact information in the search results, such as a telephone number, and in step 240 contact the other party by instant communication so as to get in touch at the earliest opportunity; this triggers step 215, updating the user learning model.
Optionally, the user can drag the parts of the search results that are needed into the report's explorer, which is step 250. The user can then choose step 251, exporting the report in doc format, or step 252, exporting the report in HTML format. Meanwhile, the system performs step 215, updating the user learning model.
Optionally, the user can select information that is needed and add it to the monitored information sources, which is step 260. The system automatically performs step 261 and determines whether the information source has been updated. If a monitored information source has been updated, the system performs step 262 and step 263, that is, it sends a reminder e-mail to the user and, when the user logs in to the system, prompts the user to browse the updated information. After the user has browsed the information flagged by the prompt, the system performs step 215, updating the user learning model.
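By way of illustration only, the update check of step 261 might be sketched as follows. Comparing content fingerprints is an assumption made for this sketch, as are the function names; the description does not say how an update is detected.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Stable digest of a source's content, used to detect changes."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def check_sources(monitored, fetch):
    """Return the ids of monitored sources whose content has changed.

    monitored: dict mapping source id to its last fingerprint (None if the
               source has never been fetched); updated in place.
    fetch: callable taking a source id and returning its current content.
    """
    updated = []
    for source_id, old in monitored.items():
        new = fingerprint(fetch(source_id))
        if new != old:
            monitored[source_id] = new
            if old is not None:  # the first fetch only sets the baseline
                updated.append(source_id)
    return updated
```

Each id returned would then trigger the reminder e-mail of step 262 and the login prompt of step 263.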
It will readily be seen that this flow need not be carried out in the order described above; it is a process that loops repeatedly, and differences in the order of the steps do not affect the implementation of the system described by the present invention. The present invention is therefore not limited to the flow chart drawn for this exemplary embodiment.
Turning now to Fig. 4-1, Fig. 4-2, Fig. 4-3, and Fig. 4-4, these four figures show screenshots of four key steps in the embodiment: the user main interface, the information collection user interface, the deep loop vertical search and call center user interface, and the report export user interface.
Fig. 4-1 illustrates a screenshot of the user main interface created and generated in the exemplary embodiment. User main interface 410 mainly comprises: explorer 411, search phrase input box 412, search button 413, search results 414, and information monitoring window 415. Explorer 411 is similar to Windows Explorer. It lists the structure of the search result data the user has collected and the content corresponding to each theme, and includes functional modules such as back-end administration and report export. In search phrase input box 412 the user enters the key words and phrases to be queried and clicks search button 413; server-side system 160 searches digital network 140 and third-party data source 150 for related information, starts user behavior learning module 172, and returns the search results that meet the user's requirements for browsing. Search results 414 are the information the system returns to the user, including the title, summary, and link address of each result; the user can click any search result to expand it and obtain the detailed full text. Information monitoring window 415 monitors the information the user has collected for updates and changes, crawls the changed parts, and pushes them to the user, so that the user can browse them promptly and keep up with developments. Based on the conclusions learned by user learning model 183, the information monitoring window can also actively fetch information that the user may need or find interesting and push it to the user for browsing.
Fig. 4-2 illustrates a screenshot of the information collection user interface created and generated in the exemplary embodiment. Information collection user interface 420 comprises: explorer 421, add monitoring information point 422, and information full text 423. Explorer 421 has the same function as explorer 411 in the user main interface screenshot of Fig. 4-1: similar to Windows Explorer, it lists the structure of the search result data the user has collected and the content corresponding to each theme, and includes functional modules such as back-end administration and report export. Information full text 423 is the complete information shown after the user clicks a search result title; the user can read the details in information full text 423. Add monitoring information point 422 means that the user can select an information point and, with the add-monitoring-point function in the right-click menu, add it to the list of information points the system monitors automatically, so that when the information is updated or changed, the change is captured promptly and the user is notified of it at the earliest opportunity.
Fig. 4-3 illustrates a screenshot of the deep loop vertical search and call center user interface created and generated in the exemplary embodiment. Deep loop vertical search and call center user interface 430 mainly comprises three items: trace button 431, call center window 432, and deep loop vertical search 433. Trace button 431 works as follows. If the user is interested in an information point in the search results and wants to learn more about it, the user can select the text of the information point in the search results, right-click, and choose the tracking option to run a deep loop vertical search on the selected information point, without entering any keyword in the search phrase input box. The system automatically tallies the words in the search results, computes word frequencies, and segments the selected text. The highest-frequency words are combined with the selected words to form several vectors, each representing a possible direction for the next search, from which the user can choose; these can also be matched against related data in third-party data source 150. When the user finds the direction closest to his or her intention, the user clicks the "trace by this" button, and the system matches the vector representing that direction against the information vectors of the whole network and the offline knowledge bases, returning more precise information for the user to browse. In addition, many search results and items of information contain contact details such as mobile and fixed-line telephone numbers. The system detects these contact details and automatically converts them into a form from which a network call can be placed. When the user clicks a contact detail, the system automatically pops up call center window 432; the user clicks the dial button to connect to the other party's communication terminal and talk by voice.
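By way of illustration only, the way candidate search directions are formed from word frequencies might be sketched as follows. Representing a direction as a tuple of words, splitting result text on whitespace, and the function name are assumptions made for this sketch; the description does not define the vectors further.

```python
from collections import Counter


def search_directions(selected_words, result_texts, top_n=3):
    """Pair the selected words with each of the most frequent result words.

    selected_words: words obtained by segmenting the text the user selected.
    result_texts: texts of the current search results.
    Returns one tuple per candidate direction for the next search.
    """
    counts = Counter(
        word
        for text in result_texts
        for word in text.lower().split()
        if word not in selected_words  # do not pair a word with itself
    )
    return [
        tuple(selected_words) + (word,)
        for word, _ in counts.most_common(top_n)
    ]
```

Each returned tuple can be presented to the user as one option; choosing an option and searching again repeats the divergence and aggregation cycle one level deeper.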
Fig. 4-4 illustrates a screenshot of the report export user interface created and generated in the exemplary embodiment. Report export user interface 440 comprises: doc export button 441, HTML export button 442, and basic analysis report 443. Basic analysis report 443 is a basic report, for example on technology trends, that the system produces by applying the knowledge mining function to the search results the user has collected. Clicking doc export button 441 exports basic analysis report 443 as a Word document. Clicking HTML export button 442 exports basic analysis report 443 as a web page file, for convenient offline browsing.
From the above detailed description with reference to the accompanying drawings, a person skilled in the art can readily understand the principle and mechanism by which the system of the present invention is implemented. The drawings are provided only to better illustrate the method, system, and computer program of the present invention, not to define the scope of protection, which is defined by the appended claims. In addition to the embodiments above, the present invention can have other embodiments. All technical schemes formed by equivalent replacement or equivalent transformation fall within the scope of protection claimed by the present invention.