CN107291916A - Internet Information Integration engine - Google Patents

Internet Information Integration engine

Info

Publication number
CN107291916A
Authority
CN
China
Prior art keywords
integration engine
internet information
information integration
web page
network address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710506555.7A
Other languages
Chinese (zh)
Inventor
赵勇
邢志金
李宇华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Shang Shang Robot Technology Co Ltd
Original Assignee
Shanghai Shang Shang Robot Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Shang Shang Robot Technology Co Ltd
Priority to CN201710506555.7A
Publication of CN107291916A
Legal status: Pending

Abstract

The invention discloses an Internet information integration engine. Its technical scheme comprises: URL deduplication, in which identical URLs are filtered out directly; correlation-based deduplication of structured web page information, in which content whose similarity exceeds a set proportion is filtered out; and interest assessment of structured web page information, in which only content that matches the keyword cloud is retained. The engine greatly improves the working efficiency of news editors and reduces the error rate of the editing system.

Description

Internet Information Integration engine
Technical field
The present invention relates to the fields of robot portal websites and network media, and in particular to an Internet information integration engine.
Background art
The technical scheme provided by Application No. 201610557263.1 discloses a user-assisted vertical search method, comprising: the system performs a general query on industry information and displays the general query results; the user enters query keywords, and the system processes the data and narrows the query scope on the basis of the general query results; the system generates an expandable keyword list, from which the user selects expandable keywords to narrow the query scope further. Correspondingly, a user-assisted vertical search system is also disclosed, comprising a base layer, a user data processing layer, and a user interaction and system self-improvement layer. Applied to a particular field or industry, that scheme can greatly improve the working efficiency of professionals, and is characterized by continuously self-growing knowledge, up-to-date data, fast positioning, and good scalability.
The technical scheme provided by Application No. 201610856709.0 discloses a system integration method and device. The method comprises: receiving an integration instruction for an access system; and, according to the received instruction, integrating the corresponding items of the access system based on a system integration strategy of script injection and web page crawling, wherein the web page crawling includes at least a crawler engine. That invention can integrate an access system without modifying it, and can combine login integration, function integration, form integration, query integration, and message integration, which increases the depth and scope of integration and makes the whole system more compact.
In the prior art, including the schemes listed above, content is reviewed manually item by item. This is time-consuming, laborious, and inefficient, and a reviewer cannot remember a large amount of repeated content. To address these defects, the present invention provides an Internet information integration engine.
Summary of the invention
To solve the above technical problem, the present invention provides an Internet information integration engine, whose specific technical scheme is:
URL deduplication: identical URLs are filtered out directly;
Correlation-based deduplication of structured web page information: content whose similarity exceeds a set proportion is filtered out;
Interest assessment of structured web page information: only content that matches the keyword cloud is retained.
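The first two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the source does not say how similarity is measured or what proportion is used, so the sketch substitutes Python's standard `difflib.SequenceMatcher` ratio and a hypothetical threshold of 0.8. The function names and the `url`/`text` page fields are also assumptions made for the example.

```python
from difflib import SequenceMatcher


def dedupe_by_url(pages):
    """Keep only the first page seen for each exact URL string."""
    seen = set()
    kept = []
    for page in pages:
        if page["url"] not in seen:
            seen.add(page["url"])
            kept.append(page)
    return kept


def dedupe_by_similarity(pages, threshold=0.8):
    """Drop a page whose text is more similar than `threshold` to a kept page.

    `threshold` is an assumed value; the source only says "a certain proportion".
    """
    kept = []
    for page in pages:
        is_duplicate = any(
            SequenceMatcher(None, page["text"], other["text"]).ratio() > threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(page)
    return kept


pages = [
    {"url": "http://a.example/1", "text": "robot company releases new warehouse robot today"},
    {"url": "http://a.example/1", "text": "something else"},
    {"url": "http://b.example/2", "text": "robot company releases new warehouse robot today!"},
    {"url": "http://c.example/3", "text": "weather forecast for the weekend is sunny"},
]
result = dedupe_by_similarity(dedupe_by_url(pages))
print([p["url"] for p in result])
```

The pairwise comparison is quadratic in the number of kept pages. For a large crawl a fingerprinting scheme would probably be needed, but the source does not specify one.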
Compared with the prior art, the present invention has the following advantages:
It greatly improves the working efficiency of news editors and reduces the error rate of the editing system.
Brief description of the drawings
Figs. 1-4 are framework diagrams of the Internet information integration engine.
Embodiment
The Internet information integration engine provided by the present invention uses the following specific technical scheme: URL deduplication, in which identical URLs are filtered out directly; correlation-based deduplication of structured web page information, in which content whose similarity exceeds a set proportion is filtered out; and interest assessment of structured web page information, in which only content that matches the keyword cloud is retained.
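The interest-assessment step could look something like the sketch below. The source does not define how the keyword cloud is represented or how a match is decided, so everything here is an assumption: the cloud is modeled as a hypothetical mapping from term to weight, a page's score is the sum of the weights of the terms it contains, and `min_score` is an invented cut-off. Tokenization with `\w+` suits whitespace-delimited text only. Chinese text would need a word segmenter, which is not shown.

```python
import re


def interest_score(text, keyword_cloud):
    """Sum the weights of the keyword-cloud terms that occur in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return sum(
        weight for term, weight in keyword_cloud.items() if term.lower() in words
    )


def filter_by_interest(pages, keyword_cloud, min_score=1.0):
    """Keep only pages whose interest score reaches `min_score` (assumed cut-off)."""
    return [
        page for page in pages
        if interest_score(page["text"], keyword_cloud) >= min_score
    ]


# Hypothetical keyword cloud: term -> weight.
cloud = {"robot": 1.0, "automation": 0.5}
candidates = [
    {"text": "New warehouse robot announced"},
    {"text": "Weekend weather is sunny"},
    {"text": "Automation trends"},
]
interesting = filter_by_interest(candidates, cloud)
print([p["text"] for p in interesting])
```

With these weights only the first candidate reaches the cut-off. Lowering `min_score` to 0.5 would also keep the third.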
Although the present invention has been disclosed above by way of a preferred embodiment, the embodiment is not intended to limit the invention. Any person skilled in the art may make minor modifications and refinements without departing from the spirit and scope of the invention; the scope of protection of the invention is therefore defined by the claims.

Claims (3)

CN201710506555.7A | 2017-06-28 | 2017-06-28 | Internet Information Integration engine | Pending | CN107291916A (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN201710506555.7A | CN107291916A (en) | 2017-06-28 | 2017-06-28 | Internet Information Integration engine

Publications (1)

Publication Number | Publication Date
CN107291916A (en) | 2017-10-24

Family

ID=60098619

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201710506555.7A | Pending | CN107291916A (en) | 2017-06-28 | 2017-06-28 | Internet Information Integration engine

Country Status (1)

Country | Link
CN (1) | CN107291916A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101520798A (en) * | 2009-03-06 | 2009-09-02 | 苏州锐创通信有限责任公司 | Webpage classification technology based on vertical search and focused crawler
CN101694658A (en) * | 2009-10-20 | 2010-04-14 | 浙江大学 | Method for constructing webpage crawler based on repeated removal of news
CN102156709A (en) * | 2011-02-28 | 2011-08-17 | 奇智软件(北京)有限公司 | Browser engine mode switching method
CN102567473A (en) * | 2011-12-14 | 2012-07-11 | 鸿富锦精密工业(深圳)有限公司 | Network information retrieval system and retrieval method
CN105335509A (en) * | 2015-10-29 | 2016-02-17 | 广州神马移动信息科技有限公司 | Method and device for recommending activity information and server

Similar Documents

Publication | Publication Date | Title
CN102710795B (en) | | Hot spot polymerization method and device
CN102098229B (en) | | Method and device for optimizing and auditing uniform resource locator (URL) as well as network device
US20150074289A1 (en) | | Detecting error pages by analyzing server redirects
CN107885777A (en) | | A control method and system for crawling web page data based on collaborative crawler
CN106844640A (en) | | A kind of web data analysis and processing method
US20180025012A1 (en) | | Web page classification based on noise removal
EP2657854A1 (en) | | Method and system for incremental collection of forum replies
WO2002042863A3 (en) | | A system and process for network site fragmented search
CN102411617B (en) | | Method for storing and inquiring a large quantity of URLs
CN104391978A (en) | | Method and device for storing and processing web pages of browsers
WO2011142979A2 (en) | | Decreasing duplicates and loops in an activity record
CN103412901A (en) | | Method and device for clearing historical records
CN103177022A (en) | | Method and device of malicious file search
CN108228656A (en) | | URL classification method and device based on CART decision trees
CN106161406A (en) | | The method and apparatus obtaining user account
CN106504020A (en) | | A kind of intelligent network marketing system based on SEO
CN107018354A (en) | | A kind of individual soldier's equipment of support case label, method and system
CN105677921A (en) | | Method and system for acquiring Internet public opinion data
CN101944093A (en) | | Method and system for searching network information
CN103942226A (en) | | Method and device for obtaining hot content
CN107291916A (en) | | Internet Information Integration engine
CN106257451A (en) | | The method and device of website visiting
KR100989320B1 (en) | | Non-tree index fast search method and non-tree-based indexing log processor for large web log mining and attack detection
CN108984519B (en) | | Method, device and storage medium for automatic construction of event corpus based on dual mode
CN106339392A (en) | | Method and device for obtaining public sentiment information

Legal Events

Date | Code | Title | Description
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20171024
