CN106294433A - Facility information processing method and processing device - Google Patents

Facility information processing method and processing device
Publication number
CN106294433A
Authority
CN
China
Prior art keywords
information
input text
facility information
participle
bank
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510276430.0A
Other languages
Chinese (zh)
Other versions
CN106294433B (en)
Inventor
涂建超
程搏
蔡林霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tencent Computer Systems Co Ltd
Original Assignee
Shenzhen Tencent Computer Systems Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Tencent Computer Systems Co Ltd
Priority to CN201510276430.0A
Publication of CN106294433A
Application granted
Publication of CN106294433B
Status: Active
Anticipated expiration

Abstract

The invention discloses a device information processing method, including: after a first information base and a second information base have been established, reading device information to be processed from the first information base; splicing the device information to be processed into the URL of a search engine; accessing the URL through the search engine to obtain the device information to be processed as input text, and comparing that input text with the input text of the second information base; and, when the input text matches the input text of the second information base, associating the two. The invention also discloses a device information processing device. The invention makes device information easier to read, improves the accuracy of the acquired device information, and raises the degree of intelligence with which device information is collected and maintained.

Description

Facility information processing method and processing device
Technical field
The present invention relates to the technical field of data processing, and in particular to a device information processing method and device.
Background technology
With the development of terminal technology, more and more terminals have entered daily life and work, and as the number of terminals grows, so does the number of terminal brands, models and systems. Taking the Android system as an example, the openness of the Android platform means that after manual operations such as flashing a ROM or rooting, the hardware parameters of a terminal may become impossible to obtain, or what is obtained is manually modified information that varies widely and follows no standard.
At present, terminal hardware information is usually obtained by sampling through the smartphone API. Besides the insufficient sample size, such sampling is detached from users' actual usage scenarios and cannot cover the complex hardware environments found in practice (for example flashed or rooted devices), so the accuracy and coverage of the collected hardware data are low. Model information is mainly collected and maintained manually; the collected model information cannot be matched directly against the information collected from terminals, so its usability is also very low.
In summary, device information (hardware information, model information and so on) obtained in the existing manner is inaccurate and hard to read, and it must be collected and maintained manually, so the degree of intelligence is low.
Summary of the invention
The embodiments of the present invention provide a device information processing method and device, intended to solve the problem that device information (hardware information, model information and so on) obtained in the existing manner is inaccurate and hard to read and must be collected and maintained manually, with a low degree of intelligence.
To achieve the above object, an embodiment of the present invention proposes a device information processing method, including:
after a first information base and a second information base have been established, reading device information to be processed from the first information base;
splicing the device information to be processed into the URL of a search engine;
accessing the URL through the search engine to obtain the device information to be processed as input text, and comparing the input text with the input text of the second information base;
when the input text matches the input text of the second information base, associating the input text with the input text of the second information base.
To achieve the above object, an embodiment of the present invention further proposes a device information processing device, including:
a read module, configured to read device information to be processed from the first information base after the first information base and the second information base have been established;
a splicing module, configured to splice the device information to be processed into the URL of a search engine;
an acquisition module, configured to access the URL through the search engine and obtain the device information to be processed as input text;
a comparison module, configured to compare the input text with the input text of the second information base;
an association module, configured to associate the input text with the input text of the second information base when the two match.
The present invention splices the information to be processed into the URL of a search engine and, by comparing input texts, associates device information; that is, it establishes a conversion relationship between textual descriptions of device information. This avoids the problems of the existing way of processing device information, in which the acquired device information (hardware information, model information and so on) is inaccurate and hard to read and must be collected and maintained manually with a low degree of intelligence. Device information becomes easier to read, the acquired device information is more accurate, and the degree of intelligence with which device information is collected and maintained is improved.
Brief description of the drawings
Fig. 1 is a schematic diagram of the hardware architecture involved in the device information acquiring apparatus of an embodiment of the present invention;
Fig. 2 is a schematic flowchart of the first embodiment of the device information acquiring method of the present invention;
Fig. 3 is a schematic flowchart of an embodiment of comparing the input text with the input text of the second information base;
Fig. 4 is a schematic flowchart of the second embodiment of the device information acquiring method of the present invention;
Fig. 5 is an overall architecture diagram of an embodiment of the Beacon model library of the present invention;
Fig. 6 is an overall design and architecture diagram of an embodiment of the data processing part of the present invention;
Fig. 7 is a schematic flowchart of an embodiment of the data processing part of the present invention;
Fig. 8 is a schematic diagram of an embodiment of word segmentation result processing and optimization testing of the present invention;
Fig. 9 is a schematic diagram of an embodiment of the brand classification results, with statistics computed after classification;
Fig. 10 is a schematic diagram of an embodiment of the model classification results, with statistics computed after classification;
Fig. 11 is a functional block diagram of the first embodiment of the device information acquiring apparatus of the present invention;
Fig. 12 is a detailed functional block diagram of the first embodiment of the comparison module in Fig. 11;
Fig. 13 is a functional block diagram of the first embodiment of the device information acquiring apparatus of the present invention.
The realization of the object, the functional characteristics and the advantages of the present invention will be further described with reference to the accompanying drawings and in conjunction with the embodiments.
Detailed description of the invention
It should be understood that the specific embodiments described herein are only intended to explain the present invention and are not intended to limit it.
The main solution of the embodiments of the present invention is as follows: after a first information base and a second information base have been established, device information to be processed is read from the first information base; the device information to be processed is spliced into the URL of a search engine; the URL is accessed through the search engine to obtain the device information to be processed as input text, and the input text is compared with the input text of the second information base; when the input text matches the input text of the second information base, the two are associated. By automatically establishing a conversion relationship for the textual descriptions of the device information obtained through the search engine, the semantics of the brand and model fields are normalized, so that device information is easier to read, the acquired device information is more accurate, and the degree of intelligence with which device information is collected and maintained is improved.
Under the existing way of processing device information, the acquired device information (hardware information, model information and so on) is inaccurate and hard to read, and it must be collected and maintained manually with a low degree of intelligence. The embodiments of the present invention therefore provide a device information acquiring apparatus. The apparatus splices the information to be processed into the URL of a search engine and, by comparing input texts, associates device information; that is, it establishes a conversion relationship between textual descriptions of device information. This avoids the problems described above, so that device information is easier to read, the acquired device information is more accurate, and the degree of intelligence with which device information is collected and maintained is improved.
The device information acquiring apparatus of this embodiment can run on a PC, or on a mobile phone, tablet computer or other electronic terminal capable of acquiring and querying device information. The hardware architecture involved in the apparatus can be as shown in Fig. 1.
Fig. 1 shows the hardware architecture involved in the device information acquiring apparatus of an embodiment of the present invention. As shown in Fig. 1, the hardware involved in the apparatus includes a processor 301 (for example a CPU), a network interface 304, a user interface 303, a memory 305 and a communication bus 302. The communication bus 302 provides connection and communication between the components of the apparatus. The user interface 303 can include components such as a display, a keyboard and a mouse; it receives information entered by the user and passes the received information to the processor 301 for processing. The display can be an LCD, an LED display or a touch screen, and shows the data that the device information acquiring apparatus needs to display, for example operation interfaces for device information query and device information acquisition. Optionally, the user interface 303 can also include standard wired and wireless interfaces. The network interface 304 can optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 305 can be high-speed RAM or non-volatile memory, such as disk storage. Optionally, the memory 305 can also be a storage device independent of the processor 301. As shown in Fig. 1, the memory 305, as a computer storage medium, can contain an operating system, a network communication module, a user interface module and a device information acquiring program.
In the hardware shown in Fig. 1, the network interface 304 is mainly used to connect to an application platform and exchange data with it; the user interface 303 is mainly used to connect to a client, exchange data with it, and receive information and instructions entered at the client; and the processor 301 can be used to call the device information acquiring program stored in the memory 305 and perform the following operations:
after a first information base and a second information base have been established, reading device information to be processed from the first information base;
splicing the device information to be processed into the URL of a search engine;
accessing the URL through the search engine to obtain the device information to be processed as input text, and comparing the input text with the input text of the second information base;
when the input text matches the input text of the second information base, associating the input text with the input text of the second information base.
Further, in one embodiment, the processor 301 can call the device information acquiring program stored in the memory 305 to perform the following operations:
segmenting the input text into words to obtain the segmented input text;
obtaining the segmented input text from the second information base, and comparing the segmented input text with the segmented input text from the second information base.
Further, in one embodiment, the processor 301 can call the device information acquiring program stored in the memory 305 to perform the following operations:
when the input text does not match the input text of the second information base, selecting a preset number of input texts from the input text and the input text of the second information base in a preset manner, and saving them;
receiving an association instruction based on the selected input texts, and associating the input texts that correspond to the association instruction.
Further, in one embodiment, the processor 301 can call the device information acquiring program stored in the memory 305 to perform the following operation:
receiving device information reported by the SDK, and saving the reported device information as the first information base.
Further, in one embodiment, the processor 301 can call the device information acquiring program stored in the memory 305 to perform the following operation:
obtaining device information from third-party websites, segmenting the obtained device information into words, and saving the result as segmented input text in the second information base.
According to the above scheme, this embodiment splices the information to be processed into the URL of a search engine and, by comparing input texts, associates device information; that is, it establishes a conversion relationship between textual descriptions of device information. This avoids the problems of the existing way of processing device information, in which the acquired device information (hardware information, model information and so on) is inaccurate and hard to read and must be collected and maintained manually with a low degree of intelligence. Device information becomes easier to read, the acquired device information is more accurate, and the degree of intelligence with which device information is collected and maintained is improved.
Based on the above hardware architecture, embodiments of the device information acquiring method of the present invention are proposed.
As shown in Fig. 2, a first embodiment of the device information acquiring method of the present invention is proposed. The method includes:
Step S10: after a first information base and a second information base have been established, reading device information to be processed from the first information base;
In this embodiment, the first information base and the second information base are established in advance. The first information base contains device information, including but not limited to device hardware information such as brand, model, RAM and ROM. The second information base also contains device information, including but not limited to brand, model, and whether a field is a primary key. The first information base is built as follows: device information reported by the SDK is received and saved as the first information base. Specifically, the API of the smart device is called, and device hardware information such as brand, model, RAM and ROM is reported through the fixed event rqd_model. Everyday experience and analysis of real data show that, in general, the combination brand + model + ROM + network standard of a smart device uniquely identifies a model: when these four parameters agree, the other parameters are essentially identical as well (except in special cases such as knockoff or flashed devices, for which this information can serve as one indicator of flashing). Network standard information, however, depends on the user's everyday usage scenario and, after analysis, is for now not used as a key for identifying a unique model. The ROM parameter is numeric, and its normalization rules are comparatively simple. This embodiment therefore mainly addresses the automatic normalization of the brand and model fields. An example of the first information base is shown in Table 1:
Table 1
The second information base can be created as follows: device information is obtained from third-party websites, the obtained device information is segmented into words, and the result is saved as segmented input text in the second information base. The third-party websites include the official websites of mainstream phone makers, the website of the Ministry of Industry and Information Technology, third-party phone information websites and so on. Device information obtained from these websites forms the network model library data; a word segmentation tool then segments the obtained device information, and the result serves as the segmented input text. An example of the second information base is shown in Table 2:
Field name    Field meaning    Primary key    Example value
Brand         Brand            Y              Samsung
Model         Model            Y              GT-I9100
……
Table 2
After the first information base and the second information base have been established, the device information data to be processed is read from the first information base, i.e. the data that needs to be "standardized". Preferably, the device information to be processed is the keyword information of the device information stored in the first information base, saved line by line in a text file that serves as the input source.
Step S20: splicing the device information to be processed into the URL of a search engine;
In this embodiment, Python provides ready-made HTTP methods, and the input keyword information is spliced into the URL of the search engine as a parameter (for example, for the input "MI 2 手机" ("MI 2 mobile phone"), the spliced URL is http://m.baidu.com/s?word=MI+2+%E6%89%8B%E6%9C%BA).
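The splicing in step S20 can be sketched with the Python standard library. This is a minimal illustration, not the patent's actual script: the function name `build_search_url` is hypothetical, and the `word` parameter name and the base address are taken from the example URL in the text and assumed to be what the search engine expects.

```python
from urllib.parse import urlencode

# Base address taken from the example in the text; treat it as an assumption.
SEARCH_BASE = "http://m.baidu.com/s"


def build_search_url(keyword: str, base: str = SEARCH_BASE, param: str = "word") -> str:
    """Splice a device keyword into the search URL as a query parameter.

    urlencode() percent-encodes non-ASCII characters as UTF-8 and turns
    spaces into '+', which is the form shown in the example URL.
    """
    return f"{base}?{urlencode({param: keyword})}"


if __name__ == "__main__":
    print(build_search_url("MI 2 手机"))
```

Keeping the base address and parameter name as arguments makes it straightforward to point the same function at a different search engine.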
Step S30: accessing the URL through the search engine to obtain the device information to be processed as input text, and comparing the input text with the input text of the second information base;
After the device information has been spliced into the URL, the URL is accessed and the returned data is captured as the input text for word segmentation. The text is segmented into words, and the input text is compared with the input text in the second information base to determine whether the two match.
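Turning the captured response into plain input text can be sketched as follows. The text does not say how the returned data is parsed, so this is only one possible approach under that assumption: it strips markup with the standard-library `html.parser` and skips script and style content. The names `page_to_input_text` and `_TextExtractor` are hypothetical, and fetching the page itself (for example with `urllib.request.urlopen`) is left out.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.parts.append(text)


def page_to_input_text(html: str) -> str:
    """Reduce a returned search page to the plain text used for segmentation."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```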
Specifically, with reference to Fig. 3, the process of comparing the input text with the input text of the second information base includes:
Step S31: segmenting the input text into words to obtain the segmented input text;
Step S32: obtaining the segmented input text from the second information base, and comparing the segmented input text with the segmented input text from the second information base.
The input text is segmented with a word segmentation tool to obtain the segmented input text. The segmentation uses the ready-made keyword extraction provided by the open-source project jieba, for example jieba.analyse.extract_tags(sentence, topK), where sentence is the text from which keywords are to be extracted (in this project, the text returned by searching for a keyword from the first information base) and topK is the number of highest-weight keywords to return (in this project, preferably 5). The returned topK keywords are the keyword information needed for manual classification.
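Steps S31 and S32 can be sketched without third-party dependencies. In the sketch below a simple regular-expression tokenizer stands in for jieba (an assumption made so the example runs anywhere), and an entry of the second information base is treated as matching when all of its tokens occur in the segmented input text. That matching rule and the function names are illustrative assumptions; the text does not fix them.

```python
import re
from typing import List


def segment(text: str) -> List[str]:
    """Stand-in tokenizer: upper-cased alphanumeric runs, keeping inner hyphens.

    A real deployment would use a proper segmenter such as jieba; this only
    handles Latin model strings like "GT-I9100".
    """
    return [t.upper() for t in re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)]


def matching_entries(input_text: str, second_base: List[str]) -> List[str]:
    """Return the second-base entries whose tokens all appear in the input text."""
    input_tokens = set(segment(input_text))
    matches = []
    for entry in second_base:
        entry_tokens = set(segment(entry))
        # Ignore entries that produce no tokens; an empty set would match everything.
        if entry_tokens and entry_tokens <= input_tokens:
            matches.append(entry)
    return matches


if __name__ == "__main__":
    base = ["Samsung GT-I9100", "Xiaomi MI3"]
    print(matching_entries("samsung gt-i9100 galaxy s2 price", base))
```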
Step S40: when the input text matches the input text of the second information base, associating the input text with the input text of the second information base.
When the input text matches the input text of the second information base, i.e. when the segmented input text matches the segmented input text in the second information base, the result is written back to the database automatically: the matched input texts are associated, merged and written back.
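The automatic write-back of a matched pair might look like the following sketch, which uses an in-memory SQLite database purely for illustration. The table name `association`, its columns, and the function names are hypothetical; the text does not describe the actual storage schema.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a hypothetical association table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE association ("
        "raw_value TEXT PRIMARY KEY, brand TEXT NOT NULL, model TEXT NOT NULL)"
    )
    return conn


def write_back(conn: sqlite3.Connection, raw_value: str, brand: str, model: str) -> None:
    """Associate a raw reported value with a standardized brand and model.

    INSERT OR REPLACE keeps one row per raw value, so re-running the daily
    job overwrites an earlier association instead of duplicating it.
    """
    conn.execute(
        "INSERT OR REPLACE INTO association (raw_value, brand, model) VALUES (?, ?, ?)",
        (raw_value, brand, model),
    )
    conn.commit()


if __name__ == "__main__":
    store = open_store()
    write_back(store, "XIAOMIMI3", "XIAOMI", "MI3")
    print(store.execute("SELECT * FROM association").fetchall())
```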
This embodiment splices the information to be processed into the URL of a search engine and, by comparing input texts, associates device information; that is, it establishes a conversion relationship between textual descriptions of device information. This avoids the problems of the existing way of processing device information, in which the acquired device information (hardware information, model information and so on) is inaccurate and hard to read and must be collected and maintained manually with a low degree of intelligence. Device information becomes easier to read, the acquired device information is more accurate, and the degree of intelligence with which device information is collected and maintained is improved.
Further, based on the first embodiment of the device information acquiring method above, a second embodiment of the present invention is proposed. As shown in Fig. 4, after step S30 the method can also include:
Step S50: when the input text does not match the input text of the second information base, selecting a preset number of input texts from the input text and the input text of the second information base in a preset manner, and saving them;
Step S60: receiving an association instruction based on the selected input texts, and associating the input texts that correspond to the association instruction.
In this embodiment, the preset manner is TF-IDF (term frequency-inverse document frequency). TF-IDF is a statistical method for assessing how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases with the frequency at which it appears across the corpus. Search engines often apply various forms of TF-IDF weighting as a measure or rating of the relevance between a document and a user query. The preset number is preferably 5: the 5 keywords with the largest weights are selected and written to the manual classification system for manual screening, and the "relation" is established by hand. That is, an association instruction based on the selected input texts is received, and the input texts that correspond to the association instruction are associated. The above process is automated with Python scripts, configured as a routine scheduled task in the data factory and executed daily.
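The TF-IDF selection described above can be sketched in plain Python. The weighting below uses a smoothed inverse document frequency, log((1 + N) / (1 + df)) + 1, which is only one common variant and is an assumption here; the text does not say which form is used. The function names and the sample corpus are illustrative.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(doc: List[str], corpus: List[List[str]]) -> Dict[str, float]:
    """Score each term of `doc` by term frequency times smoothed inverse document frequency."""
    counts = Counter(doc)
    n_docs = len(corpus)
    scores = {}
    for term, count in counts.items():
        tf = count / len(doc)
        df = sum(1 for other in corpus if term in other)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[term] = tf * idf
    return scores


def top_keywords(doc: List[str], corpus: List[List[str]], k: int = 5) -> List[str]:
    """Return the k highest-weight terms; ties are broken alphabetically."""
    scores = tf_idf(doc, corpus)
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


if __name__ == "__main__":
    corpus = [
        ["xiaomi", "mi3", "phone", "price"],
        ["samsung", "galaxy", "phone", "price"],
        ["huawei", "mate", "phone", "price"],
    ]
    print(top_keywords(corpus[0], corpus, k=2))
```

Terms that occur in every document of the corpus ("phone", "price") receive the lowest weight, so the brand and model tokens rise to the top of the list handed to manual screening.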
In this embodiment, when the input text does not match the input text in the second information base, part of the input text is output in the preset manner so that the association can be established manually, which further ensures the accuracy of the device information.
To better describe the device information processing of the present invention, Beacon is taken as an example. Fig. 5 shows the overall architecture of the Beacon model library. Some terms used in Beacon are explained below.
qimei: the identity ID used in the Beacon project to uniquely identify a mobile terminal. It is computed by a mathematical method from the terminal's various intrinsic IDs (such as IMEI, MAC and IMSI, none of which can reliably identify a unique terminal device in complex real-world scenarios), with the aim of confirming a unique terminal device;
Beacon: a terminal-based operations solution whose functions include user analysis, terminal analysis, network analysis, APP quality optimization and so on; a platform product that provides comprehensive operations services for mobile APPs;
Beacon SDK: the SDK suite in the Beacon solution that is embedded in an APP on the smart terminal and, within the scope authorized by the user, collects information about the smart terminal and about the APP;
Dictionary: a dictionary of domain-specific vocabulary supplied during word segmentation to improve the segmentation success rate for a specific domain. Here it refers to the set of smart terminal brand and model terms that is sorted and filtered from information obtained by web crawlers and organized into a "phone brand dictionary" and a "phone model dictionary".
Beacon reporting: real hardware information from a massive number of user terminals, which forms the Beacon model library;
The Ministry of Industry and Information Technology, review portals, manufacturers' official websites and model libraries maintained by business teams: multiple data sources that cover almost all brand and model information and form the crawler network model library;
Overall design and architecture of the data processing part: with reference to Fig. 6, the automated information normalization process is divided into four parts:
1. Build the Beacon model library and the crawler network model library;
2. Build the "semantic association relation" between keywords of the Beacon model library and the crawler network model library;
3. Manual intervention, to check for omissions and fill gaps;
4. Merge the information of the two libraries.
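The fourth part, merging the two libraries, can be sketched as follows once associations exist. The record layout is an assumption made for illustration: Beacon records carry a raw `raw_model` string, the association maps that string to a `(brand, model)` pair, and the crawler library is keyed by that pair. None of these field names come from the text.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (brand, model)


def merge_libraries(
    beacon_records: List[dict],
    associations: Dict[str, Key],
    crawler_library: Dict[Key, dict],
) -> Tuple[List[dict], List[dict]]:
    """Enrich Beacon records with standardized crawler data.

    Records without an association, or whose associated key is missing from
    the crawler library, are returned separately for manual follow-up.
    """
    merged, unresolved = [], []
    for record in beacon_records:
        key = associations.get(record["raw_model"])
        if key is None or key not in crawler_library:
            unresolved.append(record)
            continue
        merged.append({**record, **crawler_library[key]})
    return merged, unresolved


if __name__ == "__main__":
    beacon = [{"raw_model": "XIAOMIMI3", "rom": "16G"}, {"raw_model": "???", "rom": "8G"}]
    assoc = {"XIAOMIMI3": ("XIAOMI", "MI3")}
    crawler = {("XIAOMI", "MI3"): {"brand": "XIAOMI", "model": "MI3"}}
    print(merge_libraries(beacon, assoc, crawler))
```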
The detailed step-by-step flow is shown in Fig. 7.
Word segmentation result processing and optimization testing are shown in Fig. 8, using the following data as an example: XIAOMIMI3@小米, MI3, XIAOMI, secret, XIAOMIMI3WCDMA. The input search keyword is XIAOMIMI3, and the returned segmentation result is the top 5 words ranked by term frequency and inverse document frequency. The top 5 returned words are matched one by one against the "dictionary". If a match is found, the relation is built successfully; if not, the item goes to the manual matching stage. Fig. 9 shows the brand classification results, with statistics computed after classification; Fig. 10 shows the model classification results.
The crawler in this embodiment can be implemented in other languages (such as Perl or Ruby). For different scenarios and purposes, a customized "dictionary" and "corpus" can be built to tune segmentation accuracy and the TF-IDF index. A different search engine can be substituted, or a search engine can be built in-house. Other similar word segmentation tools, or a self-written tool, can be used. The manual association rules also carry a degree of subjectivity, and more suitable association rules can be formulated for a specific application scenario.
The practical value of the present invention is that, using publicly available technologies and tools and with limited human effort, it builds an automatic "semantic conversion relation" system that substantially improves the accuracy and readability of the terminal model library information while reducing the cost of manual maintenance.
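The dictionary matching and the hand-off to manual matching described above can be sketched as follows. The rule used here, that a relation is built only when one returned word is in the brand dictionary and one is in the model dictionary, is an assumption made for illustration; the text only says the returned words are matched against the "dictionary" one by one. The function name and the dictionary contents are hypothetical.

```python
from typing import Dict, List, Optional, Set


def route(
    query: str,
    top_words: List[str],
    brand_dict: Set[str],
    model_dict: Set[str],
    manual_queue: List[str],
) -> Optional[Dict[str, str]]:
    """Match the returned top words against the dictionaries one by one.

    Returns the built relation on success; otherwise appends the query to the
    manual matching queue and returns None.
    """
    words = [word.upper() for word in top_words]
    brand = next((word for word in words if word in brand_dict), None)
    model = next((word for word in words if word in model_dict), None)
    if brand is None or model is None:
        manual_queue.append(query)
        return None
    return {"query": query, "brand": brand, "model": model}


if __name__ == "__main__":
    brands = {"XIAOMI", "SAMSUNG"}
    models = {"MI3", "GT-I9100"}
    queue: List[str] = []
    print(route("XIAOMIMI3", ["XIAOMIMI3WCDMA", "MI3", "XIAOMI"], brands, models, queue))
    print(route("UNKNOWN123", ["foo", "bar"], brands, models, queue), queue)
```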
Accordingly, a preferred embodiment of the device information acquiring apparatus of the present invention is proposed. With reference to Fig. 11, the apparatus includes a read module 10, a splicing module 20, an acquisition module 30, a comparison module 40 and an association module.
Described read module 10, for after setting up first information storehouse and the second information bank, reads the first letterFacility information pending in breath storehouse;
In the present embodiment, setting up first information storehouse and the second information bank in advance, described first information storehouse is wrappedIncluding facility information, described facility information includes but not limited to that equipment brand, type, RAM, ROM etc. setStandby hardware information, described second information bank is facility information, includes but not limited to brand, type, whetherThe device hardware information such as major key.The process in described structure first information storehouse includes: receives and is reported by SDKFacility information, using the described facility information reported preserve as first information storehouse.Concrete, by adjustingWith the api interface of smart machine, reported by fixing event rqd_model include brand, type, RAM,The device hardware information such as ROM, find according to daily experience and actual data analysis, in general, and oneThe brand of smart machine+type+ROM+ network formats just can uniquely confirm a type, other parameterWhen these four parameters are consistent, other parameter informations the most identical (except the special circumstances such as mountain vallage, brush machine,This information can be used as one of factor of judgment of brush machine).And network formats information, field used in everyday with peopleScape is correlated with, and through analyzing, can temporarily be not used as the crucial KEY value of the judgement of unique type;ROM parameterFor numeric type, standardization arrangement rule is relatively easy, therefore, in the present embodiment, be mainly used in brand,Automatization's specification of type field.First information storehouse represents that example is as shown in table 1:
Creating the second information base may include obtaining device information from third-party websites, segmenting the obtained device information into words, and saving the result as the segmented input text of the second information base. The third-party websites include the official websites of mainstream mobile phone vendors, the website of the Ministry of Industry and Information Technology, third-party mobile phone information websites and the like. Device information obtained from these websites forms the network device-model database, and a word segmentation tool segments the obtained device information to produce the segmented input text. An example of the second information base is shown in Table 2:
After the first information base and the second information base have been established, the device information to be processed is read from the first information base, that is, the data that needs to be normalized is read from the first information base. Preferably, the device information to be processed is the keyword information of the device information stored in the first information base, saved line by line in a text file that serves as the input source.
The splicing module 20 is configured to splice the device information to be processed into the URL of a search engine;
In this embodiment, Python provides ready-made HTTP methods, and the input keyword information is spliced into the URL of the search engine as a parameter (for example, for the input "MI 2 手机" (MI 2 mobile phone), the resulting URL is http://m.baidu.com/s?word=MI+2+%E6%89%8B%E6%9C%BA).
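The splicing step can be sketched with the Python standard library. The function names below are hypothetical, and the lower-case parameter name `word` is an assumption based on the example URL; the endpoint is the one quoted in the example.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "http://m.baidu.com/s"  # endpoint taken from the example in the text


def build_search_url(keyword: str, endpoint: str = SEARCH_ENDPOINT) -> str:
    """Append the keyword to the search URL as the `word` query parameter."""
    # urlencode percent-encodes the UTF-8 bytes and turns spaces into '+'.
    return endpoint + "?" + urlencode({"word": keyword})


def build_urls_from_lines(lines):
    """Turn keyword lines (one keyword per line, as in the input file) into URLs."""
    return [build_search_url(line.strip()) for line in lines if line.strip()]
```

Calling `build_search_url("MI 2 手机")` yields a URL whose query string is `word=MI+2+%E6%89%8B%E6%9C%BA`, where `%E6%89%8B%E6%9C%BA` is the UTF-8 percent-encoding of 手机.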
The acquisition module 30 is configured to access the URL through the search engine and to obtain the result returned for the device information to be processed as the input text;
The comparison module 40 is configured to compare the input text with the input text of the second information base;
After the device information has been spliced into the URL, the URL is accessed and the returned data is captured as the input text for word segmentation. The text is segmented, and the resulting input text is compared with the input text in the second information base to determine whether the two match.
Specifically, with reference to Fig. 12, the comparison module 40 includes a word segmentation unit 41 and a comparison unit 42.
The word segmentation unit 41 is configured to segment the input text to obtain the segmented input text;
The comparison unit 42 is configured to obtain the segmented input text from the second information base and to compare the segmented input text of the search result with it.
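The comparison performed by units 41 and 42 can be illustrated as a token-set overlap. This is a sketch under stated assumptions: the whitespace tokenizer is a dependency-free stand-in for the segmentation tool, the matching rule (at least one shared token) is an assumption since the text does not define the criterion, and all names are hypothetical.

```python
def segment(text: str) -> list:
    """Stand-in tokenizer: split on whitespace and upper-case each token.

    The text uses a word segmentation tool for this step; a whitespace
    split keeps the sketch free of third-party dependencies.
    """
    return [token.upper() for token in text.split()]


def find_matches(input_text: str, second_base: dict) -> list:
    """Return ids of second-base entries sharing at least one token with the input.

    `second_base` maps an entry id to its stored token list, i.e. the
    segmented input text of the second information base.
    """
    input_tokens = set(segment(input_text))
    matches = []
    for entry_id, stored_tokens in second_base.items():
        if input_tokens & {token.upper() for token in stored_tokens}:
            matches.append(entry_id)
    return matches
```

An empty result corresponds to the non-matching case, which the second embodiment hands over to manual association.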
The input text is segmented with a word segmentation tool to obtain the segmented input text. The segmentation step uses the ready-made keyword extraction provided by the open-source project jieba, for example jieba.analyse.extract_tags(sentence, topK), where sentence is the text to be processed, which in this project is the text returned by searching for a keyword from the first information base, and topK is the number of highest-weighted keywords to return, preferably 5 in this project. The topK keywords returned are the keyword information used for manual classification.
The association module 50 is configured to associate the input text with the input text of the second information base when the two match.
When the input text matches the input text of the second information base, that is, when the segmented input text matches the segmented input text in the second information base, the result is written back to the database automatically: the matched input texts are associated, merged and written back.
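A possible shape for the merge-and-write-back step is sketched below. The record layout, the function name `associate` and the rule that curated values from the second information base overwrite raw reported values are assumptions made for illustration; the text only states that matched texts are associated, merged and written back.

```python
def associate(first_base: dict, second_base: dict, matches: dict) -> dict:
    """Merge each matched pair of records into one write-back record.

    `matches` maps a key in `first_base` to the matching key in `second_base`.
    Fields from the second (curated) base overwrite raw reported fields.
    """
    merged = {}
    for raw_key, curated_key in matches.items():
        record = dict(first_base[raw_key])        # copy the reported fields
        record.update(second_base[curated_key])   # prefer the curated values
        record["raw_keyword"] = raw_key           # keep the original keyword
        merged[raw_key] = record
    return merged
```

The input dictionaries are left unchanged, so the raw reports remain available if an association later has to be revised.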
In this embodiment, the information to be processed is spliced into the URL of a search engine, and device information is associated by comparing input texts, which establishes a conversion relationship between device information terms. This effectively avoids the problems of existing device information processing, in which the obtained device information (hardware information, model information and so on) is inaccurate and hard to read and must be collected and maintained manually with little automation. As a result, device information becomes easier to read, the obtained device information is more accurate, and the collection and maintenance of device information become more automated.
Further, based on the first embodiment of the device information acquisition apparatus, a second embodiment is proposed. As shown in Fig. 13, the device information acquisition apparatus may further include a selection module 60, a saving module 70 and a receiving module 80.
The selection module 60 is configured to select, in a preset manner, a preset number of input texts from the input text and the input text of the second information base when the two do not match;
The saving module 70 is configured to save the selected input texts;
The receiving module 80 is configured to receive an association instruction based on the selected input texts;
The association module 50 is further configured to associate the input texts that correspond to the association instruction.
In this embodiment, the preset manner is TF-IDF (term frequency-inverse document frequency). TF-IDF is a statistical method for assessing how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document and decreases with its frequency across the corpus. Search engines often apply various forms of TF-IDF weighting as a measure or ranking of the relevance between a document and a user query. The preset number is preferably 5; that is, the 5 keywords with the largest weights are selected and written to the manual classification system for manual screening, and the relationship is established by hand. In other words, an association instruction based on the selected input texts is received, and the input texts that correspond to the association instruction are associated. The above process is automated with Python scripts and configured as a routine scheduled task in the data factory that runs daily.
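To make the TF-IDF selection concrete, the sketch below ranks the tokens of one document against a small corpus and returns the top k. It uses one common smoothed variant, idf = log(N / (1 + df)); the exact weighting used in the patent is not stated, and the function name and data layout are hypothetical.

```python
import math
from collections import Counter


def tf_idf_top_k(doc_tokens: list, corpus: list, k: int = 5) -> list:
    """Return the k tokens of `doc_tokens` with the highest TF-IDF weight.

    `corpus` is a non-empty list of token lists. TF is the relative
    frequency in the document; IDF is log(N / (1 + df)).
    """
    if not corpus:
        raise ValueError("corpus must not be empty")
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for token, count in counts.items():
        doc_freq = sum(1 for doc in corpus if token in doc)
        idf = math.log(n_docs / (1 + doc_freq))
        scores[token] = (count / total) * idf
    # Sort by descending weight, breaking ties alphabetically for stable output.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]
```

A token that occurs in every corpus document, such as a generic word like "phone", receives a low or negative weight and drops to the bottom of the ranking, which is the behaviour the selection step relies on.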
In this embodiment, when the input text does not match the input text in the second information base, part of the input text is output in the preset manner so that the association can be established manually, which further ensures the accuracy of the device information.
To better describe the device information processing of the present invention, Beacon is taken as an example. Fig. 5 shows the overall architecture of the Beacon device-model base. Some terms used in Beacon are explained below. qimei: the identity ID used in the Beacon project to uniquely identify a mobile terminal. It is computed mathematically from the terminal's various intrinsic IDs (such as IMEI, MAC and IMSI, none of which can reliably identify a single terminal in real, complex scenarios) in order to confirm a unique terminal;
Beacon: a terminal-based operations solution whose functions include user analysis, terminal analysis, network analysis, app quality optimization and so on; a platform product that provides comprehensive operations services for mobile apps;
Beacon SDK: the SDK suite of the Beacon solution that is embedded in an app on the smart terminal and, within the scope authorized by the user, collects information about the smart terminal and the app;
Dictionary: a lexicon of domain-specific terms supplied to the segmenter to improve the segmentation success rate on text from a particular field. Here it refers to the sets of smart terminal brand and model terms that are sorted and filtered from the information obtained by the web crawler and organized into a "mobile phone brand dictionary" and a "mobile phone model dictionary".
Beacon reporting: real hardware information from a massive number of user terminals, which forms the Beacon model base;
Model bases maintained by the Ministry of Industry and Information Technology, review portals, vendor official websites and business teams themselves: multiple data sources that cover almost all brand and model information and form the crawler network model base;
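The role of the brand and model dictionaries in segmentation can be illustrated with forward maximum matching, a simple dictionary-driven technique. This is a stand-in chosen for brevity and is not how jieba segments text internally; the function name and the sample dictionaries are hypothetical.

```python
def forward_max_match(text: str, dictionary: set, max_len: int = 10) -> list:
    """Segment `text` by repeatedly taking the longest dictionary word.

    Characters that start no dictionary word are emitted one at a time.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a word matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += length
                break
    return tokens
```

With a brand dictionary containing XIAOMI and a model dictionary containing MI3, the fused keyword XIAOMIMI3 splits into a brand token and a model token, which is the kind of separation the domain dictionaries are meant to enable.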
The overall design and architecture of the data processing part divides the automated information normalization process into four parts, with reference to Fig. 6:
1. Build the Beacon model base and the crawler network model base;
2. Build the "semantic association relationship" between keywords of the Beacon model base and the crawler model base;
3. Intervene manually to check for omissions and fill gaps;
4. Merge the information of the two bases.
The detailed step flow is shown in Fig. 7:
Word segmentation results and their optimization are shown in Fig. 8, taking the following data as an example: XIAOMIMI3@小米 (Xiaomi), MI3, XIAOMI, secret, XIAOMIMI3WCDMA. The input search keyword is XIAOMIMI3, and the returned segmentation result is the top 5 words ranked by term frequency and inverse document frequency. The top 5 returned words are matched against the "dictionary" one by one. If a word matches, the relationship is built successfully; if none matches, the keyword enters the manual matching stage. Fig. 9 shows the statistics of the brand classification results, and Fig. 10 shows the model classification results. The crawler in this embodiment can be implemented in other languages (such as Perl or Ruby). For different scenarios and purposes, a customized "dictionary" and "corpus" can be built to tune segmentation accuracy and the TF-IDF index. A different search engine can be substituted, or a search engine can be built in-house. Other similar segmentation tools, or a self-developed one, can be used. The manual association rules also involve some human judgment, and more suitable association rules can be formulated for the specific application scenario. The value of the present invention is that, using publicly available technologies and tools and with limited human effort, it builds an automatic "semantic conversion relationship" system that substantially improves the accuracy and readability of the terminal model base information while reducing the cost of manual maintenance.
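The one-by-one dictionary matching and the fallback to manual matching can be sketched as follows. The function names, the data layout and the upper-casing of returned words are assumptions for illustration; the dictionary is assumed to store upper-case terms.

```python
def build_relation(keyword: str, top_words: list, dictionary: set):
    """Match the returned top words against the dictionary one by one.

    Returns (keyword, matched_word) on the first hit, or None when nothing
    matches and the keyword must go to manual matching.
    """
    for word in top_words:
        if word.upper() in dictionary:
            return (keyword, word.upper())
    return None


def process(results: dict, dictionary: set):
    """Split search results into automatic relations and a manual queue."""
    relations, manual_queue = {}, []
    for keyword, top_words in results.items():
        relation = build_relation(keyword, top_words, dictionary)
        if relation is None:
            manual_queue.append((keyword, top_words))
        else:
            relations[keyword] = relation[1]
    return relations, manual_queue
```

Keywords that land in the manual queue keep their top words attached, so a reviewer sees the same candidates the automatic step considered.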
It should be noted that, in this document, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, article or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or apparatus that includes the element.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device or the like) to perform the methods described in the embodiments of the present invention.

Claims (10)

CN201510276430.0A | 2015-05-26 | 2015-05-26 | Equipment information processing method and device | Active | CN106294433B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201510276430.0A CN106294433B (en) | 2015-05-26 | 2015-05-26 | Equipment information processing method and device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201510276430.0A CN106294433B (en) | 2015-05-26 | 2015-05-26 | Equipment information processing method and device

Publications (2)

Publication Number | Publication Date
CN106294433A (en) | 2017-01-04
CN106294433B (en) | 2020-03-03

Family

ID=57634887

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201510276430.0A Active CN106294433B (en) | Equipment information processing method and device | 2015-05-26 | 2015-05-26

Country Status (1)

Country | Link
CN (1) | CN106294433B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US8676778B2 (en)* | 1995-12-14 | 2014-03-18 | Graphon Corporation | Method and apparatus for electronically publishing information on a computer network
CN102591972A (en)* | 2011-12-31 | 2012-07-18 | 北京百度网讯科技有限公司 | Method and device for providing goods search results
US9122730B2 (en)* | 2012-05-30 | 2015-09-01 | International Business Machines Corporation | Free-text search for integrating management of applications
CN103678443A (en)* | 2012-09-19 | 2014-03-26 | 弗里塞恩公司 | Method and system for providing content provider-specified URL keyword navigation

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109284384A (en)* | 2018-10-10 | 2019-01-29 | 拉扎斯网络科技(上海)有限公司 | Text analysis method and device, electronic equipment and readable storage medium
CN109284384B (en)* | 2018-10-10 | 2021-01-01 | 拉扎斯网络科技(上海)有限公司 | Text analysis method and device, electronic equipment and readable storage medium
CN112256862A (en)* | 2020-09-08 | 2021-01-22 | 山东黄金矿业(莱州)有限公司三山岛金矿 | Data mapping relation establishing method

Also Published As

Publication number | Publication date
CN106294433B (en) | 2020-03-03

Similar Documents

Publication | Publication Date | Title
CN104615760B (en) | | Fishing website recognition methods and system
CN112104642B (en) | | Abnormal account number determination method and related device
CN107832468A (en) | | Demand recognition methods and device
CN106713579B (en) | | Telephone number identification method and device
CN109063000A (en) | | Question sentence recommended method, customer service system and computer readable storage medium
CN106446113A (en) | | Mobile big data analysis method and device
CN107590255A (en) | | Information-pushing method and device
CN105718533A (en) | | Information pushing method and device
CN117540803A (en) | | Decision engine configuration method and device based on large model, electronic equipment and medium
CN110020161B (en) | | Data processing method, log processing method and terminal
US20240220772A1 (en) | | Method of evaluating data, training method, electronic device, and storage medium
CN110389941A (en) | | Database method of calibration, device, equipment and storage medium
CN107679141A (en) | | Data storage method, device, equipment and computer-readable recording medium
CN107844595A (en) | | A kind of job hunting website position intelligent recommendation method
CN113568888A (en) | | Index recommendation method and device
CN103064967B (en) | | A kind of method and apparatus for establishing user's binary crelation library
CN115952792A (en) | | Text auditing method and device, electronic equipment, storage medium and product
CN105786858A (en) | | Information search system and method
CN106528566A (en) | | Log file output method, server and client
CN106294433A (en) | | Facility information processing method and processing device
CN108959289B (en) | | Website category acquisition method and device
US20240311433A1 (en) | | Dynamic Website Characterization For Search Optimization
CN118194328A (en) | | Data processing method, data processing device, electronic equipment and storage medium
CN104765747A (en) | | Webpage processing method and device
CN117201146A (en) | | Malicious website identification method, system, electronic equipment and storage medium

Legal Events

Date | Code | Title | Description
 | C06 | Publication |
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | SE01 | Entry into force of request for substantive examination |
 | GR01 | Patent grant |
 | GR01 | Patent grant |
