Summary of the invention
The purpose of this application is to propose a method and an apparatus for automatically collecting and analyzing switch logs, so as to solve the technical problems mentioned in the background section above.
In a first aspect, this application provides a method for automatically collecting and analyzing switch logs. The method comprises: collecting the logs generated by all switches in an Internet data center; filtering the logs according to predetermined rules and retaining the logs that meet the predetermined rules; performing structuring processing on the retained logs to form structured data, wherein the structured data includes a switch ID, a timestamp, a message type, and a detailed message; transmitting the structured data to a storage device; and analyzing the structured data in the storage device.
In some embodiments, performing structuring processing on the retained logs to form structured data includes: parsing the switch ID and the timestamp from the retained logs; removing the switch ID and the timestamp from the retained logs; performing word segmentation and deduplication on the logs using Lucene; and using a clustering algorithm to group logs with the same structure or meaning into one class, so as to extract the message type and the detailed message.
In some embodiments, collecting the logs generated by all switches in the Internet data center includes: collecting the logs generated by all switches in the Internet data center through two core servers, wherein the two core servers are redundant with each other and support resuming transmission from the point of interruption.
In some embodiments, transmitting the structured data to the storage device includes: using a Flume architecture with two transmission nodes, the two transmission nodes sharing one virtual IP address.
In some embodiments, the storage device includes a MySQL database and a Hadoop distributed file system.
In some embodiments, analyzing the structured data in the storage device includes: querying, through the MySQL database, the real-time structured data of the switches within a predetermined time; and analyzing, through the Hadoop distributed file system, the log size of each switch, and raising a switch-anomaly alert if the log size of a switch is greater than a threshold.
In a second aspect, this application provides an apparatus for automatically collecting and analyzing switch logs, characterized in that the apparatus includes: a collection unit configured to collect the logs generated by all switches in an Internet data center; a filtering unit configured to filter the logs according to predetermined rules and retain the logs that meet the predetermined rules; a structuring unit configured to perform structuring processing on the retained logs to form structured data, wherein the structured data includes a switch ID, a timestamp, a message type, and a detailed message; a transmission unit configured to transmit the structured data to a storage device; and an analysis unit configured to analyze the structured data in the storage device.
In some embodiments, the structuring unit is configured to: parse the switch ID and the timestamp from the retained logs; remove the switch ID and the timestamp from the retained logs; perform word segmentation and deduplication on the logs using Lucene; and use a clustering algorithm to group logs with the same structure or meaning into one class, so as to extract the message type and the detailed message.
In some embodiments, the collection unit is further configured to: collect the logs generated by all switches in the Internet data center through two core servers, wherein the two core servers are redundant with each other and support resuming transmission from the point of interruption.
In some embodiments, the transmission unit is configured to: use a Flume architecture with two transmission nodes, the two transmission nodes sharing one virtual IP address.
In some embodiments, the storage device includes a MySQL database and a Hadoop distributed file system.
In some embodiments, the analysis unit is further configured to: query, through the MySQL database, the real-time structured data of the switches within a predetermined time; and analyze, through the Hadoop distributed file system, the log size of each switch, and raise a switch-anomaly alert if the log size of a switch is greater than a threshold.
The method and apparatus for automatically collecting and analyzing switch logs provided by this application collect the logs generated by all switches in an Internet data center, clean and preprocess the logs to form structured data, transmit the structured data to a storage device, and analyze the structured data in the storage device. They can therefore handle the unified collection, structuring, and centralized viewing and analysis of switch logs in a large-scale, multi-model, complex device environment, and can improve the operations and maintenance (O&M) efficiency of O&M engineers.
Detailed description of embodiments
The application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the related invention and do not limit the invention. It should also be noted that, for ease of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments of the application and the features in the embodiments can be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method or the apparatus for automatically collecting and analyzing switch logs of this application can be applied.
As shown in Fig. 1, system architecture 100 may include switches 101, 102, 103, a network 104, and a server 105. Network 104 is the medium that provides communication links between switches 101, 102, 103 and server 105. Network 104 may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.
The logs of switches 101, 102, 103 are transmitted to server 105 through network 104. Clients of various log collection tools may be installed on switches 101, 102, 103, for example rsyslog, which can transmit the internal log information of a switch to a remote log server.
Switches 101, 102, 103 may be network devices that provide a dedicated electrical-signal path between any two network nodes connected to the switch, including but not limited to Ethernet switches, Fast Ethernet switches, Gigabit Ethernet switches, FDDI switches, ATM switches, and Token Ring switches.
Server 105 may be a server that provides various services, for example collecting the logs of switches 101, 102, 103 and structuring, storing, and analyzing the collected logs.
It should be noted that the method for automatically collecting and analyzing switch logs provided by the embodiments of this application is generally executed by server 105; correspondingly, the apparatus for automatically collecting and analyzing switch logs is generally arranged in server 105.
It should be understood that the numbers of switches, networks, and servers in Fig. 1 are merely illustrative. There may be any number of switches, networks, and servers, as the implementation requires.
With continued reference to Fig. 2, a process 200 of one embodiment of the method for automatically collecting and analyzing switch logs according to this application is shown. The method for automatically collecting and analyzing switch logs comprises the following steps:
Step 201, collecting the logs generated by all switches in the Internet data center.
In this embodiment, the electronic device on which the method for automatically collecting and analyzing switch logs runs (for example, the server shown in Fig. 1) may collect switch logs from all switches in the Internet data center through a log collection tool, over a wired or wireless connection.
In some optional implementations of this embodiment, the logs generated by all switches in the Internet data center are collected through two core servers, wherein the two core servers are redundant with each other and support resuming transmission from the point of interruption. For example, when a network failure occurs, the collected switch logs can be cached locally, and transmission is resumed from the point of interruption after the network recovers; the maximum disaster-tolerance capacity may be 200,000 log entries.
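The local caching and resumption described above can be sketched roughly as follows. This is a minimal illustration, not the implementation of the application: the class name `ResumableForwarder`, the `send` callable, and the choice to drop the oldest entries once the buffer is full are all assumptions; only the 200,000-entry capacity comes from the text.

```python
from collections import deque

MAX_BUFFERED = 200_000  # capacity figure taken from the description above


class ResumableForwarder:
    """Hypothetical helper: buffers log lines locally while the uplink is
    down and resumes sending, in order, once it is back."""

    def __init__(self, send, capacity=MAX_BUFFERED):
        # `send` is an assumed callable that raises ConnectionError on failure.
        self.send = send
        # Assumption: when the buffer is full, the oldest entries are dropped.
        self.pending = deque(maxlen=capacity)

    def submit(self, line):
        """Queue a new line and try to send everything that is pending."""
        self.pending.append(line)
        return self.flush()

    def flush(self):
        """Send buffered lines in order; stop at the first failure."""
        while self.pending:
            try:
                self.send(self.pending[0])
            except ConnectionError:
                return False  # keep the unsent line for the next attempt
            self.pending.popleft()
        return True
```

A line is removed from the buffer only after `send` returns without error, so a failure in the middle of a flush leaves the remaining lines in their original order for the next attempt.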
Step 202, filtering the logs according to predetermined rules and retaining the logs that meet the predetermined rules.
In this embodiment, useless logs are filtered out before the logs are transmitted, for example the large volume of useless logs caused by bugs.
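As a rough sketch of rule-based filtering, the snippet below drops any line that matches one of a list of regular expressions. The two patterns are purely hypothetical examples; the application does not specify the actual rules, which would be maintained by whoever operates the system.

```python
import re

# Hypothetical drop rules, for illustration only.
DROP_PATTERNS = [
    re.compile(r"\bDEBUG\b", re.IGNORECASE),
    re.compile(r"last message repeated \d+ times"),
]


def keep(line, drop_patterns=DROP_PATTERNS):
    """Return True if the line matches none of the drop rules."""
    return not any(p.search(line) for p in drop_patterns)


def filter_logs(lines, drop_patterns=DROP_PATTERNS):
    """Retain only the lines that pass every rule, preserving order."""
    return [line for line in lines if keep(line, drop_patterns)]
```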
Step 203, performing structuring processing on the retained logs to form structured data.
In this embodiment, the logs retained after filtering can be structured to form structured data, wherein the structured data includes: a switch ID, a timestamp, a message type, and a detailed message.
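One way to picture a structured record with the four fields named above is a small data class. The field names, the class name `StructuredLog`, and the sample values are assumptions made for illustration; the application names the fields but not their representation.

```python
from dataclasses import dataclass, asdict
from datetime import datetime


@dataclass(frozen=True)
class StructuredLog:
    """Hypothetical record holding the four fields named in the text."""
    switch_id: str       # unique identifier of the switch
    timestamp: datetime  # time at which the log entry was produced
    message_type: str    # class of the message
    detail: str          # remaining detailed message


# Sample values are invented for illustration.
record = StructuredLog(
    switch_id="192.168.0.1/sw-example.Int",
    timestamp=datetime(2016, 1, 1, 12, 0, 0),
    message_type="INTERFACE_STATE",
    detail="Interface ethernet 1/2/2, state up",
)
```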
Step 204, transmitting the structured data to a storage device.
In this embodiment, the structured data is transmitted to a storage device. The storage device may be the server shown in Fig. 1 or another remote server.
In some optional implementations of this embodiment, a Flume architecture with two transmission nodes is used; the two transmission nodes share one virtual IP address to achieve load balancing and disaster tolerance.
In some optional implementations of this embodiment, the storage device includes a MySQL database and a Hadoop distributed file system. For example, the MySQL database retains one week of data; based on experience, one week is also roughly the longest time window needed to investigate online problems.
Step 205, analyzing the structured data in the storage device.
In this embodiment, the structured data in the storage device is analyzed. To improve the troubleshooting efficiency of O&M engineers, a unified entry point for viewing all structured logs can be provided. After the logs of all devices have been gathered in the storage device, a unified log query platform is built, on which an O&M engineer can view the detailed messages of any equipment room, any model, or any single device over any time period. For example, the log volume of each switch device is counted, and switch devices that may have problems are identified through abnormal log volume. If the log volume of a device is abnormal, it is very likely that DEBUG mode has been turned on or that the switch has a bug. These devices therefore need to be handled promptly to reduce the interference of junk logs with engineers' troubleshooting.
In some optional implementations of this embodiment, the real-time structured data of the switches within a predetermined time is queried through the MySQL database; the log size of each switch is analyzed through the Hadoop distributed file system, and a switch-anomaly alert is raised if the log size of a switch is greater than a threshold.
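The threshold check can be sketched as below. This is only an illustration of the comparison itself: it measures log size as the number of entries per switch, which is an assumption (the text does not say whether size means entries or bytes), and it runs in memory rather than on the Hadoop distributed file system.

```python
from collections import Counter


def oversized_switches(records, threshold):
    """records: iterable of (switch_id, log_line) pairs.

    Returns the sorted IDs of switches whose number of log entries exceeds
    `threshold`. Counting entries rather than bytes is an assumption.
    """
    counts = Counter(switch_id for switch_id, _ in records)
    return sorted(s for s, n in counts.items() if n > threshold)
```

With five entries for one switch, two for another, and a threshold of three, only the first switch would be reported.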
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for automatically collecting and analyzing switch logs according to this embodiment. In the application scenario of Fig. 3, log collection unit 302 collects the logs of multiple switches 301; after the collected logs are filtered, log transmission unit 303 transmits them to the storage devices MySQL 304 and HDFS (Hadoop distributed file system) 305. The structured logs in MySQL 304 are available for engineers to query, and the logs in HDFS 305 can be used for log analysis and provide raw offline data for offline mining and log compression algorithms.
Through the automated collection, transmission, storage, structuring, querying, and analysis of switch logs, the method provided by the above embodiment of this application can handle the unified collection, structuring, and centralized viewing of logs in a large-scale, multi-model, complex device environment, and can improve the O&M efficiency of O&M engineers.
With further reference to Fig. 4, a process 400 of another embodiment of the method for automatically collecting and analyzing switch logs is shown. Process 400 of the method for automatically collecting and analyzing switch logs comprises the following steps:
Step 401, collecting the logs generated by all switches in the Internet data center.
Step 402, filtering the logs according to predetermined rules and retaining the logs that meet the predetermined rules.
Steps 401-402 are identical to steps 201-202 and are therefore not described again.
Step 403, parsing the switch ID and the timestamp from the retained logs.
In this embodiment, the switch ID and the timestamp are parsed from the retained logs, as shown in Table 1.
The original switch logs are unstructured data, so the logs cannot be directly classified and counted; the difficulty of structuring lies in the diversity of formats across the many switch models. The structured data obtained after structuring is shown in Table 1:
Table 1
Here, the switch ID is the unique identifier of a switch device. It is generally represented by the management IP and the name, together with its affiliation, as shown in Table 2:
| IDC | Management IP | Name | Area type |
| xxx | 192.168.x.x | xx-xx-xx-xx.Int | INT_SWITCH |
| xxx | 192.168.x.x | xx-xx-xx-xx.Ext | INT_SWITCH |
| xxx | 192.168.x.x | xx-xx-xx-xx.Admin | INT_SWITCH |
Table 2
The management IP and the name can be extracted with a general regular expression, whereas the IDC and the area type need to be added to the log as tags during transmission (using the log formatting function of Rsyslog).
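Extraction with a general regular expression might look like the sketch below. The line layout assumed here (`<timestamp> <management IP> <name> <message>`) and the sample values are hypothetical; real switch logs vary by vendor and by how Rsyslog is configured, so the pattern would need to be adapted.

```python
import re

# Assumed layout: "<timestamp> <management IP> <name> <message>".
HEADER = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s+"
    r"(?P<mgmt_ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<name>\S+)\s+"
    r"(?P<message>.*)$"
)


def parse_header(line):
    """Return a dict with timestamp, mgmt_ip, name, and message,
    or None if the line does not follow the assumed layout."""
    match = HEADER.match(line)
    return match.groupdict() if match else None
```

Lines that do not follow the assumed layout return `None`, so they can be routed to a fallback parser or counted as unparsed instead of being silently mis-split.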
Step 404, removing the switch ID and the timestamp from the retained logs.
In this embodiment, the switch ID and the timestamp are removed from the retained logs.
One of the difficulties of structuring is extracting the message type: the message type format is not uniform across switch models, and even the log formats of different versions of the same model may differ. To solve this problem, we first preprocess the logs of switches of the same model and remove the variables (numbers, management IP, name, timestamp, and so on).
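A minimal sketch of the variable-removal step is given below. It replaces IPv4 addresses and digit runs with placeholder tokens so that messages differing only in such values collapse to the same template. The placeholder names `<IP>` and `<NUM>` are assumptions, and the text speaks of removing variables rather than masking them; masking is used here only to keep the result readable.

```python
import re

_IP = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
_NUM = re.compile(r"\d+")


def mask_variables(message):
    """Replace IPv4 addresses, then any remaining digit runs, with
    placeholders (an illustrative variant of removing them)."""
    message = _IP.sub("<IP>", message)   # addresses first, so their digits
    message = _NUM.sub("<NUM>", message)  # are not split into <NUM> pieces
    return message
```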
Step 405, performing word segmentation and deduplication on the logs using Lucene.
In this embodiment, word segmentation and deduplication are performed on the logs using Lucene. Lucene is a well-known open-source tool that can be used for word segmentation. The logs with variables removed are fed into the tool, which outputs the segmentation result.
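The effect of this step can be illustrated with a plain-Python stand-in. It does not use Lucene and does not reproduce the behavior of Lucene's analyzers; it simply splits messages into lowercase alphabetic tokens and keeps one copy of each distinct token sequence, which is one possible reading of "deduplication" here.

```python
import re

_TOKEN = re.compile(r"[A-Za-z]+")


def tokenize(message):
    """Split a message into lowercase alphabetic tokens (stand-in for an analyzer)."""
    return [t.lower() for t in _TOKEN.findall(message)]


def unique_token_lists(messages):
    """Tokenize every message and keep the first occurrence of each
    distinct token sequence, preserving input order."""
    seen = set()
    result = []
    for msg in messages:
        tokens = tuple(tokenize(msg))
        if tokens and tokens not in seen:
            seen.add(tokens)
            result.append(list(tokens))
    return result
```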
Step 406, using a clustering algorithm to group logs with the same structure or meaning into one class, so as to extract the message type and the detailed message and form structured data.
In this embodiment, Term Frequency-Inverse Document Frequency (TF-IDF) features are extracted from the preprocessed data to convert the log text into numerical values; the K-means algorithm is then used for clustering, so that logs with the same structure or meaning are grouped into one class; the regular expression of each message type is then extracted, as shown in Table 3:
Table 3
Then, the message type in each log is extracted using the regular expression of the message type.
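The TF-IDF and K-means stage can be sketched in plain Python as follows. The application names the two techniques but not their parameters, so the smoothed IDF formula, the deterministic farthest-point initialization, the iteration count, and the value of `k` are all assumptions chosen to keep the example small and repeatable.

```python
import math
from collections import Counter


def tfidf(docs):
    """docs: list of token lists. Returns one dense TF-IDF vector per doc."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[t] / len(d) * idf[t] for t in vocab])
    return vectors


def _dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain K-means with farthest-point initialization (assumed, for repeatability)."""
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(max(vectors, key=lambda v: min(_dist2(v, c) for c in centers)))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest center for every vector
        labels = [min(range(k), key=lambda j: _dist2(v, centers[j])) for v in vectors]
        # Update step: move each center to the mean of its members
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

Given two interface-state messages and two optic-alarm messages, tokenized as in the earlier step, `kmeans(tfidf(docs), k=2)` places each pair in its own cluster. In practice `k` is not known in advance and would have to be chosen or tuned per switch model.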
When extracting the detailed message, all structured parts need to be removed from the original log, and some special characters at the beginning of the log, such as * . %, also need to be handled. Detailed messages are shown in Table 4:
| Detailed message |
| Interface ethernet 1/2/2,state up |
| VLAN 4094 Port 1/2/2 State->BLOCKING(PortDown) |
| 2/3optic rx power low alarm |
| Optic is not Foundry qualified(port 7) |
Table 4
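Extraction of the detailed message can be sketched as follows: the already-parsed structured parts are removed from the raw line, and leading special characters are stripped. The function name, the exact set of leading characters treated as noise, and the sample line are assumptions for illustration.

```python
import re

# Leading whitespace and special characters treated as noise (assumed set).
_LEADING_JUNK = re.compile(r"^[\s*.%:]+")


def extract_detail(raw_line, structured_parts):
    """Remove each structured part once from the raw line, then strip
    leading special characters and surrounding whitespace."""
    detail = raw_line
    for part in structured_parts:
        detail = detail.replace(part, "", 1)
    return _LEADING_JUNK.sub("", detail).strip()
```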
Step 407, transmitting the structured data to a storage device.
Step 408, analyzing the structured data in the storage device.
Steps 407-408 are identical to steps 204-205 and are therefore not described again.
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, process 400 of the method for automatically collecting and analyzing switch logs in this embodiment highlights the step of performing structuring processing on the logs to form structured data. The scheme described in this embodiment can therefore process logs from the different switch devices of multiple vendors and structure them in a unified way, which makes the logs easier to query and analyze.
With further reference to Fig. 5, as an implementation of the methods shown in the figures above, this application provides an embodiment of an apparatus for automatically collecting and analyzing switch logs. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus can be applied in various electronic devices.
As shown in Fig. 5, the apparatus 500 for automatically collecting and analyzing switch logs described in this embodiment includes: a collection unit 501, a filtering unit 502, a structuring unit 503, a transmission unit 504, and an analysis unit 505. Collection unit 501 is configured to collect the logs generated by all switches in the Internet data center; filtering unit 502 is configured to filter the logs according to predetermined rules and retain the logs that meet the predetermined rules; structuring unit 503 is configured to perform structuring processing on the retained logs to form structured data, wherein the structured data includes: a switch ID, a timestamp, a message type, and a detailed message; transmission unit 504 is configured to transmit the structured data to a storage device; and analysis unit 505 is configured to analyze the structured data in the storage device.
In this embodiment, collection unit 501 sends the collected logs to filtering unit 502 for filtering. Structuring unit 503 performs structuring processing on the logs filtered by filtering unit 502, and the result is then transmitted to analysis unit 505 through transmission unit 504.
In some optional implementations of this embodiment, structuring unit 503 is configured to: parse the switch ID and the timestamp from the retained logs; remove the switch ID and the timestamp from the retained logs; perform word segmentation and deduplication on the logs using Lucene; and use a clustering algorithm to group logs with the same structure or meaning into one class, so as to extract the message type and the detailed message.
In some optional implementations of this embodiment, collection unit 501 is further configured to: collect the logs generated by all switches in the Internet data center through two core servers, wherein the two core servers are redundant with each other and support resuming transmission from the point of interruption.
In some optional implementations of this embodiment, transmission unit 504 is configured to: use a Flume architecture with two transmission nodes, the two transmission nodes sharing one virtual IP address.
In some optional implementations of this embodiment, the storage device includes a MySQL database and a Hadoop distributed file system.
In some optional implementations of this embodiment, analysis unit 505 is further configured to: query, through the MySQL database, the real-time structured data of the switches within a predetermined time; and analyze, through the Hadoop distributed file system, the log size of each switch, and raise a switch-anomaly alert if the log size of a switch is greater than a threshold.
Referring now to Fig. 6, a schematic structural diagram of a computer system 600 suitable for implementing the server of the embodiments of this application is shown.
As shown in Fig. 6, computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in read-only memory (ROM) 602 or a program loaded from storage section 608 into random access memory (RAM) 603. RAM 603 also stores the various programs and data required for the operation of system 600. CPU 601, ROM 602, and RAM 603 are connected to one another through bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. Communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on drive 610 as needed, so that a computer program read from it can be installed into storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such embodiments, the computer program may be downloaded and installed from a network through communication section 609, and/or installed from removable medium 611. When the computer program is executed by central processing unit (CPU) 601, the above functions defined in the method of this application are performed.
The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, and the module, program segment, or part of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and any combination of boxes in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of this application may be implemented in software or in hardware. The described units may also be arranged in a processor; for example, this may be described as: a processor including a collection unit, a filtering unit, a structuring unit, a transmission unit, and an analysis unit. The names of these units do not, in certain cases, limit the units themselves; for example, the collection unit may also be described as "a unit that collects the logs generated by all switches in an Internet data center".
In another aspect, this application also provides a non-volatile computer storage medium. The non-volatile computer storage medium may be the one included in the apparatus described in the above embodiments, or it may be a non-volatile computer storage medium that exists separately and is not assembled into a terminal. The non-volatile computer storage medium stores one or more programs which, when executed by a device, cause the device to: collect the logs generated by all switches in an Internet data center; filter the logs according to predetermined rules and retain the logs that meet the predetermined rules; perform structuring processing on the retained logs to form structured data, wherein the structured data includes: a switch ID, a timestamp, a message type, and a detailed message; transmit the structured data to a storage device; and analyze the structured data in the storage device.
The above description covers only the preferred embodiments of this application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in this application is not limited to technical solutions formed by the specific combination of the above technical features; it should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the inventive concept, for example technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in this application.