Internet Information Integration Engine
Technical field
The present invention relates to the fields of robot portal websites and network media, and more particularly to an Internet information integration engine.
Background technology
The technical solution of Application No. 201610557263.1 discloses a user-assisted vertical search method, comprising: the system performs a general query on industry information and displays the general query results; the user inputs query keywords, and the system processes the data and narrows the query scope on the basis of the general query results; the system generates an expandable keyword list, from which the user selects expandable keywords to narrow the query scope further. Correspondingly, a user-assisted vertical search system is also disclosed, comprising an underlying foundation layer, a user data processing layer, and a user interaction and system self-improvement layer. Applied to a particular field or industry, this solution can greatly improve the working efficiency of professionals, and it is characterized by continuously self-growing knowledge, up-to-date data, fast positioning, and good scalability.
The technical solution of Application No. 201610856709.0 discloses a system integration method and device. The method comprises: receiving an integration instruction for a system to be connected; and, according to the received instruction, performing the corresponding integration processing on that system based on a system integration strategy of script injection and web page crawling, wherein the web page crawling includes at least a crawler engine. That invention can integrate a connected system without modifying it, and can combine login integration, function integration, list integration, query integration, and message integration, which increases the depth and scope of integration and makes the whole system more compact.
In the prior art, including the technical solutions listed above, content is reviewed manually, item by item, which is time-consuming, laborious, and inefficient, and a reviewer cannot remember large amounts of repeated content. To address these defects in the prior art, the present invention provides an Internet information integration engine.
Summary of the invention
To solve the above technical problem, the present invention provides an Internet information integration engine, whose specific technical solution is:
URL deduplication: identical URLs are filtered out directly;
Correlation deduplication of structured web page information: content whose similarity exceeds a certain proportion is filtered out;
Interest assessment of structured web page information: only content that matches the keyword cloud is retained.
Compared with the prior art, the present invention has the following beneficial effects:
It greatly improves the working efficiency of news editors and reduces the error rate of the editing system.
Brief description of the drawings
Figs. 1-4 are framework diagrams of the Internet information integration engine.
Embodiment
The Internet information integration engine provided by the present invention adopts the following specific technical solution: URL deduplication, in which identical URLs are filtered out directly; correlation deduplication of structured web page information, in which content whose similarity exceeds a certain proportion is filtered out; and interest assessment of structured web page information, in which only content that matches the keyword cloud is retained.
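The three stages described above can be illustrated with a minimal Python sketch. It is not the implementation of the invention; it only shows one way the stages could be chained. The names `Page`, `normalize_url`, `dedupe_by_url`, `dedupe_by_similarity`, `filter_by_keywords`, and `integrate` are hypothetical, as are the URL normalization rules, the 0.8 similarity threshold, the use of `difflib.SequenceMatcher` as the similarity measure, and the rule that one keyword hit counts as a match; the text above does not specify any of these.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from urllib.parse import urlsplit, urlunsplit


@dataclass
class Page:
    """Hypothetical container for one crawled page."""
    url: str
    text: str


def normalize_url(url: str) -> str:
    # Assumption: URLs that differ only in scheme/host case, fragment,
    # or a trailing slash are treated as identical.
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def dedupe_by_url(pages):
    # Stage 1: keep only the first page seen for each normalized URL.
    seen = set()
    kept = []
    for page in pages:
        key = normalize_url(page.url)
        if key not in seen:
            seen.add(key)
            kept.append(page)
    return kept


def dedupe_by_similarity(pages, threshold=0.8):
    # Stage 2: drop a page if its text is more similar than `threshold`
    # to any page already kept (pairwise comparison, quadratic cost).
    kept = []
    for page in pages:
        if all(
            SequenceMatcher(None, page.text, other.text).ratio() <= threshold
            for other in kept
        ):
            kept.append(page)
    return kept


def filter_by_keywords(pages, keyword_cloud, min_hits=1):
    # Stage 3: retain only pages sharing at least `min_hits` words
    # with the keyword cloud (whitespace tokenization, lower-cased).
    kept = []
    for page in pages:
        words = set(page.text.lower().split())
        if len(words & keyword_cloud) >= min_hits:
            kept.append(page)
    return kept


def integrate(pages, keyword_cloud, threshold=0.8):
    # Run the three stages in the order given in the description.
    pages = dedupe_by_url(pages)
    pages = dedupe_by_similarity(pages, threshold)
    return filter_by_keywords(pages, keyword_cloud)
```

In this sketch the cheap exact-match stage runs first, so the pairwise similarity comparison only sees pages that survived URL deduplication. A production system would likely replace the quadratic comparison with a scalable near-duplicate technique and use a tokenizer suited to the language of the content.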
Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Any person skilled in the art may make minor modifications and improvements without departing from the spirit and scope of the invention; the scope of protection of the invention shall therefore be as defined by the claims.