Summary of the invention
The object of the invention is as solving the problem and providing one can carry out multimedium high-speed parallel forensics analysis, evidence obtaining efficiency is high, can reduce or avoid because of offered load, computing node CPU, internal memory equal distribution is uneven and the distributed evidence obtaining dynamic load leveling dispatching method of overall performance decline problem that causes and device.
For this reason, the invention discloses a kind of distributed evidence obtaining dynamic load leveling dispatching method, comprise the steps:
A1, if computing node is a computing node set P, and to each computing node Picarry out initialization;
A2, according to computing node Piresource situation determination task window threshold values Tm;
A3, calculates current operation node Piload Wi, and to its sequence;
A4, calculates current operation node Piidleness Ridle, and it is sorted;
A5, is set to a set of tasks T, and sorts to set of tasks T according to task priority by task;
A6, takes out least-loaded and the highest computing node P of idleness from computing node set Piif the result obtained is sky, repeat steps A 3 to A6, until obtain;
A7, obtains the highest task T of priority from set of tasks Tiif the result obtained is sky, repeat steps A 4 to A7, until obtain;
A8, obtaining in steps A 7 of task Tidistribute to the computing node P obtained in steps A 6i, upgrade set of tasks T and computing node P simultaneouslyicurrent number of tasks;
A9, repeated execution of steps A2 to A8, until all tasks are all assigned.
Further, in described steps A 3, at least one situation according to the CPU of computing node, internal memory and network condition calculates load Wi.
Further, in described steps A 3, summation is weighted to the CPU of computing node, internal memory and offered load and calculates load Wi.
Further, in described steps A 3, adopt by stages dynamic state feedback mechanism to the load W calculatedifeed back, and to feeding back the load W obtainedisort.
Further, in described steps A 4, described idleness Ridlecomputing method be: establish current operation node Pinumber of tasks be Tn, then the idleness R of each computing nodeidleaccording to formulaCalculate.
Further, in described steps A 5, FCFS policing algorithm is adopted to sort when the priority of task is identical.
Further, also comprise task division step, be specially: task is split by granularity.
Further, also comprise abnormality processing step, be specially:
When computing node breaks down, being distributed on this node of task is reclaimed in time and be assigned on other normal nodes and continue to perform;
The threshold values of task process time-out is set, when a task time-out does not also complete, is re-assigned to other node and goes to perform or wait for complete at this node;
When task occurs mistake when performing, the information of record and feedback error in time.
The invention also discloses a kind of distributed evidence obtaining dynamic load leveling dispatching device, comprising:
Initialization module, for setting computing node as a computing node set P, and to each computing node Picarry out initialization;
Task threshold computing module, for according to computing node Piresource situation determination task window threshold values Tm;
Load calculates order module, for calculating current operation node Piload Wi, and to its sequence;
Idleness calculates order module, for calculating current operation node Piidleness Ridle, and it is sorted;
Task priority order module, for all tasks are set to a set of tasks T, and sorts to set of tasks T according to task priority;
Optimum computing node acquisition module, for taking out least-loaded and the highest computing node P of idleness from computing node set Piif the result obtained is sky, repeats load and calculate order module, idleness calculating order module, task priority order module and optimum computing node acquisition module, until obtain;
Limit priority task acquisition module, for obtaining the highest task T of priority from set of tasks Tiif the result obtained is sky, repeats idleness and calculate order module, task priority order module, optimum computing node acquisition module and limit priority task acquisition module, until obtain;
Task allocating module, for the task T that limit priority task acquisition module is obtainedidistribute to the computing node P that optimum computing node acquisition module obtainsi, upgrade set of tasks T and computing node P simultaneouslyicurrent number of tasks;
Loop module, for repeating task threshold computing module, load calculates order module, idleness calculates order module, task priority order module, optimum computing node acquisition module, limit priority task acquisition module and task allocating module, until all tasks are all assigned.
Further, described load calculates order module, and at least one situation for the CPU according to computing node, internal memory and network condition calculates load Wi.
Further, also comprise task division module, for task being split by granularity.
Further, also comprise abnormality processing module, specifically comprise:
Fault processing module, continues to perform for being distributed on this node of task being reclaimed in time when computing node breaks down and being assigned on other normal nodes;
Timeout treatment module, for arranging the threshold values of task process time-out, when a task time-out also not completing, being re-assigned to other node and going to perform or wait for complete at this node;
Error logging and feedback module, for there is mistake when task when performing, the information of record and feedback error in time.
Advantageous Effects of the present invention:
The present invention can support the operative scenario of multimedium high-speed parallel forensics analysis, it passes through Task-decomposing to the enterprising row relax of multiple computing nodes, not only save a large amount of time, also the efficiency of evidence obtaining is substantially increased, simultaneously, can reduce or avoid because of offered load, computing node CPU, internal memory equal distribution is uneven and the overall performance decline problem that causes, takes full advantage of the arithmetic capability of multiple stage machine.The present invention can be applied in evidence obtaining industry, the operations such as forensics analysis, search, index can not only be carried out fast and effectively to mass data, and also very strong in flexible expansion, the hot-swappable computing node when not halt system, the expansion that the system that realizes is seamless.
Embodiment
Now the present invention is further described with embodiment by reference to the accompanying drawings.
As shown in Figure 1, a kind of distributed evidence obtaining dynamic load leveling dispatching method, comprises the steps:
A1, if computing node is a computing node set P, and to each computing node Picarry out initialization.
In the present embodiment, computing node is computing machine, for performing evidence obtaining task, all computing nodes participating in evidence obtaining is set to a computing node set P{P1, P2, P3pn, and to each computing node Picarry out initialization, just can receive an assignment after initialization success and perform.
A2, according to computing node Piresource situation determination task window threshold values Tm.
Concrete, at computing node Piduring initialization, according to the resource situation determination task window threshold values T such as CPU check figure and internal memory of computing nodem, as Tm=24, then represent that computing node synchronization can only receive at most 24 tasks.
A3, calculates current operation node Piload Wi, and it is sorted from low to high, take out the computing node that load is minimum.
Concrete, at least one situation according to the CPU of computing node, internal memory and network condition calculates load Wi.In the present embodiment, according to computing node Pithe resource situation such as CPU, internal memory, network calculate current load Wi, Wiinterval is [0,100], and concrete grammar is: be located at t computing node Picpu load be Cpu (Pi), offered load Net (Pi), internal memory load is Mem (Pi), the computing formula of load value: Wi=R1* Cpu (Pi)+R2* Net (Pi)+R3* Mem (Pi), wherein R1, R2, R3value set according to the kind of task, as: be set as 0.4,0.5,0.1.
Further, traditional load-balancing mechanism uses the dispatching method of static polling schemas or dynamic realtime load feedback mostly, consider the problem such as network, system resource overhead that the inaccuracy of static scheduling and Real-time Feedback bring, the present invention adopts by stages dynamic state feedback mechanism with head it off, is specially: whole non-load balanced case is divided into different intervals as (0 according to configuration file, 33], (33,66], (66,100] three intervals, work as Wibe worth from the first interval (0,33] change to second (33,66] interval value, the W of computing nodeivalue feeds back, to sort.
In addition, a load threshold V is set in systems in which, works as Wivalue enters more than the task of being assigned to computing node during V value the state of waiting in line, and avoids the extreme case occurring that load is overweight, ensures the stability of system.
A4, calculates current operation node Piidleness Ridle, and it is sorted.
Be specially: establish current operation node Pinumber of tasks be Tn, then the idleness R of each computing nodeidleaccording to formula:calculate, the R that all compute node drawidlerendezvous value sorts from high to low, takes out the computing node P that idleness is the highesti.
A5, is set to a set of tasks T{T by all tasks1, T2, T3tn, and according to task priority, set of tasks T is sorted.
Concrete, first various types of task is divided by granularity, during division, should avoid occurring large task too consuming time, can not occur splitting too thin a large amount of little task.All tasks after dividing are set to a set of tasks T{T1, T2, T3tn, and corresponding priority is configured to it, then sort to set of tasks T according to task priority, the task that priority is identical sorts according to FCFS policing algorithm.FCFS policing algorithm is prior art, and this no longer describes in detail.
A6, takes out least-loaded and the highest computing node P of idleness from computing node set Piif the result obtained is sky, repeat steps A 3 to A6, until obtain.
A7, obtains the highest task T of priority from set of tasks Tiif the result obtained is sky, repeat steps A 4 to A7, until obtain.
A8, obtaining in steps A 7 of task Tidistribute to the computing node P obtained in steps A 6i, upgrade set of tasks T and computing node P simultaneouslyicurrent number of tasks.
A9, repeated execution of steps A2 to A8, until all tasks are all assigned.
In addition, in the present embodiment, also comprise abnormality processing step, be specially:
When computing node breaks down, being distributed on this node of task is reclaimed in time and be assigned on other normal nodes and continue to perform.
The threshold values of task process time-out is set, when a task time-out does not also complete, is re-assigned to other node and goes to perform or wait for complete at this node.
When task occurs mistake when performing, the information of record and feedback error in time.
Step number in the present embodiment is for convenience of description, and the sequencing that not exclusively ride instead of walk is rapid.
Present invention also offers a kind of distributed evidence obtaining dynamic load leveling dispatching device, comprising:
Initialization module, for setting computing node as a computing node set P, and to each computing node Picarry out initialization.
Concrete, initialization module, for being set to a computing node set P{P by all computing nodes participating in evidence obtaining1, P2, P3pn, and to each computing node Picarry out initialization, just can receive an assignment after initialization success and perform.
Task threshold computing module, for according to computing node Piresource situation determination task window threshold values Tm.
Concrete, at computing node Piduring initialization, according to the resource situation determination task window threshold values T such as CPU check figure and internal memory of computing nodem.
Load calculates order module, for calculating current operation node Piload Wi, and to its sequence.
Concrete, load calculates order module according to computing node Pithe resource situation such as CPU, internal memory, network calculate current load Wi, Wiinterval is [0,100], and concrete grammar is: be located at t computing node Picpu load be Cpu (Pi), offered load Net (Pi), internal memory load is Mem (Pi), the computing formula of load value: Wi=R1* Cpu (Pi)+R2* Net (Pi)+R3* Mem (Pi), wherein R1, R2, R3value set according to the kind of task, as: be set as 0.4,0.5,0.1.
Further, load calculates order module and comprises by stages dynamic feedback module, for whole non-load balanced case is divided into different intervals according to configuration file, as (0,33], (33,66], (66,100] three intervals, work as Wibe worth from the first interval (0,33] change to second (33,66] interval value, the W of computing nodeivalue feeds back and calculates order module, to sort to load.
Idleness calculates order module, for calculating current operation node Piidleness Ridle, and it is sorted.
Be specially: establish current operation node Pinumber of tasks be Tn, then the idleness R of each computing nodeidleaccording to formula:calculate, by the R that all compute node drawidlerendezvous value sorts from high to low, takes out the computing node P that idleness is the highesti.
Task priority order module, for all tasks are set to a set of tasks T, and sorts to set of tasks T according to task priority.Concrete, task priority order module also comprises:
Task division module, for dividing by granularity various types of task, should avoid occurring large task too consuming time, can not occur splitting too thin a large amount of little task during division.
Priority configuration module, for configuring corresponding priority to all the letting alone after division.
Optimum computing node acquisition module, for taking out least-loaded and the highest computing node P of idleness from computing node set Piif the result obtained is sky, repeats load and calculate order module, idleness calculating order module, task priority order module and optimum computing node acquisition module, until obtain.
Limit priority task acquisition module, for obtaining the highest task T of priority from set of tasks Tiif the result obtained is sky, repeats idleness and calculate order module, task priority order module, optimum computing node acquisition module and limit priority task acquisition module, until obtain.
Task allocating module, for the task T that limit priority task acquisition module is obtainedidistribute to the computing node P that optimum computing node acquisition module obtainsi, upgrade set of tasks T and computing node P simultaneouslyicurrent number of tasks.
Loop module, for repeating task threshold computing module, load calculates order module, idleness calculates order module, task priority order module, optimum computing node acquisition module, limit priority task acquisition module and task allocating module, until all tasks are all assigned.
In addition, in the present embodiment, device also comprises abnormality processing module, comprising:
Fault processing module, continues to perform for being distributed on this node of task being reclaimed in time when computing node breaks down and being assigned on other normal nodes.
Timeout treatment module, for arranging the threshold values of task process time-out, when a task time-out also not completing, being re-assigned to other node and going to perform or wait for complete at this node.
Error logging and feedback module, for there is mistake when task when performing, the information of record and feedback error in time.
The distributed evidence obtaining dynamic load leveling dispatching device of the embodiment of the present invention can be contained in server, this server is communicated to connect by network and computing node, in the present embodiment, network is provided by the cross-platform TCP communication module of the high-performance realized based on BoostAsio, the IOCP network model used under its windows platform is a kind of technology being applicable to high capacity server, not only scalability is good, and execution efficiency is very high, it is the basis that whole dispatch service carries out task scheduling and exchanges data.
To Figure 2 shows that under the test data mirror image of formed objects different number computing node and conventional individual pattern are collected evidence the performance comparison figure equipped, as can be seen from the figure, the performance of conventional individual pattern evidence obtaining equipment is the poorest, and the performance of the evidence obtaining equipment that computing node number is more is better.
Each module that embodiment disclosed herein describes and algorithm steps, can realize with electronic hardware, computer software or the combination of the two.These functions perform with hardware or software mode actually, depend on application-specific and the design constraint of technical scheme.Professional and technical personnel can use distinct methods to realize described function to each specifically should being used for, but this realization should not thought and exceeds scope of the present invention.
Although specifically show in conjunction with preferred embodiment and describe the present invention; but those skilled in the art should be understood that; not departing from the spirit and scope of the present invention that appended claims limits; can make a variety of changes the present invention in the form and details, be protection scope of the present invention.