CN105335236A


Publication number: CN105335236A
Application number: CN201510915168.XA
Authority: CN
Inventors: 陈俊珊; 苏再添; 吴少华
Original assignee: Xiamen Meiya Pico Information Co Ltd
Current assignee: Xiamen Meiya Pico Information Co Ltd
Priority date: 2015-12-10
Filing date: 2015-12-10
Publication date: 2016-02-17
Anticipated expiration: 2035-12-10
Also published as: CN105335236B

Abstract

The invention relates to the field of electronic media forensics, and in particular to a distributed forensics dynamic load-balancing scheduling method and device. The method and device support scenarios in which multiple storage media are analyzed forensically in parallel at high speed. Because each task is decomposed and processed on multiple computing nodes, a large amount of time is saved and forensic efficiency is greatly improved. The method and device can also reduce or avoid the drop in overall performance caused by uneven distribution of network load, computing-node CPU load, memory load and the like.

Description

A distributed forensics dynamic load-balancing scheduling method and device

Technical field

The invention belongs to the field of electronic media forensics, and in particular relates to a distributed forensics dynamic load-balancing scheduling method.

Background technology

As society develops, more and more data is stored electronically, and data held in computers and other devices has become important evidence in computer-network cases and in litigation. Continuous innovation in information and storage technology keeps increasing the storage capacity of devices. As a result, forensic work faces many case-related storage media, large capacities, heavy analysis workloads and low efficiency, so fast and efficient forensic analysis has become a focus of forensic products.

Most traditional forensic analysis hardware and software uses a stand-alone (single-machine) forensics model, for example the published patent CN203659010U. Such equipment can only support a single user processing one case at a time; when the amount of stored information is large, it needs very powerful computing capacity and still takes a long time. With the development of network and storage technology, traditional forensic analysis equipment can no longer meet requirements such as high-speed data processing and correlation analysis of massive data; it is both complicated to operate and inefficient. To solve the problems of stand-alone forensics, some forensic systems have begun to introduce a distributed network forensics model, in which multiple machines (computing nodes) are connected over a network, tasks are distributed among them, and the computing power of each node is used to improve overall performance. However, performance differences between computing nodes and load imbalance caused by the scheduling algorithm prevent the overall performance of such systems from reaching the desired level. Solving this problem properly therefore requires load-balancing scheduling.

Summary of the invention

The object of the invention is to solve the above problems by providing a distributed forensics dynamic load-balancing scheduling method and device that can perform high-speed parallel forensic analysis of multiple media, achieve high forensic efficiency, and reduce or avoid the drop in overall performance caused by uneven distribution of network load, computing-node CPU load, memory load and the like.

To this end, the invention discloses a distributed forensics dynamic load-balancing scheduling method, comprising the following steps:

A1, define the computing nodes as a computing node set P, and initialize each computing node P_i;

A2, determine the task window threshold T_m according to the resource situation of computing node P_i;

A3, calculate the load W_i of each current computing node P_i, and sort the nodes by load;

A4, calculate the idle rate R_idle of each current computing node P_i, and sort the nodes by idle rate;

A5, place the tasks in a task set T, and sort the task set T by task priority;

A6, take from the computing node set P the computing node P_i with the lowest load and the highest idle rate; if the result is empty, repeat steps A3 to A6 until a node is obtained;

A7, take from the task set T the task T_i with the highest priority; if the result is empty, repeat steps A4 to A7 until a task is obtained;

A8, assign the task T_i obtained in step A7 to the computing node P_i obtained in step A6, and at the same time update the task set T and the current task count of computing node P_i;

A9, repeat steps A2 to A8 until all tasks have been assigned.

Further, in step A3, the load W_i is calculated from at least one of the CPU, memory and network conditions of the computing node.

Further, in step A3, the load W_i is calculated as a weighted sum of the CPU, memory and network loads of the computing node.

Further, in step A3, an interval-based dynamic feedback mechanism is used to feed back the calculated load W_i, and the fed-back loads W_i are sorted.

Further, in step A4, the idle rate R_idle is calculated as follows: let the current number of tasks on computing node P_i be T_n; the idle rate R_idle of each computing node is then calculated according to the formula

R_{idle} = \frac{T_m - T_n}{T_m} \times 100\%

Further, in step A5, tasks with the same priority are ordered using the first-come, first-served (FCFS) policy.

Further, the method also comprises a task division step, specifically: tasks are split by granularity.

Further, the method also comprises exception handling steps, specifically:

When a computing node fails, the tasks assigned to that node are promptly reclaimed and reassigned to other healthy nodes to continue execution;

A task processing timeout threshold is set; when a task has not completed by the timeout, it is either reassigned to another node for execution or left to finish on its current node;

When an error occurs during task execution, the error information is promptly recorded and fed back.

The invention also discloses a distributed forensics dynamic load-balancing scheduling device, comprising:

Initialization module, for defining the computing nodes as a computing node set P and initializing each computing node P_i;

Task threshold calculation module, for determining the task window threshold T_m according to the resource situation of computing node P_i;

Load calculation and sorting module, for calculating the load W_i of each current computing node P_i and sorting the nodes by load;

Idle rate calculation and sorting module, for calculating the idle rate R_idle of each current computing node P_i and sorting the nodes by idle rate;

Task priority sorting module, for placing all tasks in a task set T and sorting the task set T by task priority;

Optimal computing node acquisition module, for taking from the computing node set P the computing node P_i with the lowest load and the highest idle rate; if the result is empty, the load calculation and sorting module, the idle rate calculation and sorting module, the task priority sorting module and the optimal computing node acquisition module are run again until a node is obtained;

Highest-priority task acquisition module, for taking from the task set T the task T_i with the highest priority; if the result is empty, the idle rate calculation and sorting module, the task priority sorting module, the optimal computing node acquisition module and the highest-priority task acquisition module are run again until a task is obtained;

Task assignment module, for assigning the task T_i obtained by the highest-priority task acquisition module to the computing node P_i obtained by the optimal computing node acquisition module, and at the same time updating the task set T and the current task count of computing node P_i;

Loop module, for repeatedly running the task threshold calculation module, the load calculation and sorting module, the idle rate calculation and sorting module, the task priority sorting module, the optimal computing node acquisition module, the highest-priority task acquisition module and the task assignment module until all tasks have been assigned.

Further, the load calculation and sorting module calculates the load W_i from at least one of the CPU, memory and network conditions of the computing node.

Further, the device also comprises a task division module, for splitting tasks by granularity.

Further, the device also comprises an exception handling module, which specifically comprises:

Fault handling module, for promptly reclaiming the tasks assigned to a computing node when that node fails and reassigning them to other healthy nodes to continue execution;

Timeout handling module, for setting a task processing timeout threshold; when a task has not completed by the timeout, it is either reassigned to another node for execution or left to finish on its current node;

Error logging and feedback module, for promptly recording and feeding back the error information when an error occurs during task execution.

Advantageous effects of the invention:

The invention supports scenarios in which multiple media are analyzed forensically in parallel at high speed. By decomposing tasks and processing them on multiple computing nodes, it saves a large amount of time and greatly improves forensic efficiency. At the same time, it can reduce or avoid the drop in overall performance caused by uneven distribution of network load, computing-node CPU load, memory load and the like, and makes full use of the computing power of multiple machines. The invention can be applied in the forensics industry: it can perform forensic analysis, search, indexing and similar operations on massive data quickly and effectively, and it scales flexibly, since computing nodes can be hot-plugged without stopping the system, allowing the system to be expanded seamlessly.

Brief description of the drawings

Fig. 1 is a flow chart of an embodiment of the invention;

Fig. 2 is a forensic performance comparison chart for an embodiment of the invention.

Embodiment

The invention is now further described with reference to the accompanying drawings and embodiments.

As shown in Fig. 1, a distributed forensics dynamic load-balancing scheduling method comprises the following steps:

A1, define the computing nodes as a computing node set P, and initialize each computing node P_i.

In this embodiment, a computing node is a computer used to execute forensic tasks. All computing nodes taking part in the forensic work are placed in a computing node set P = {P₁, P₂, P₃, …, P_n}, and each computing node P_i is initialized; a node can receive and execute tasks only after it has been initialized successfully.

A2, determine the task window threshold T_m according to the resource situation of computing node P_i.

Specifically, when computing node P_i is initialized, the task window threshold T_m is determined from resources such as the node's number of CPU cores and its memory. For example, T_m = 24 means that the computing node can hold at most 24 tasks at any one time.

A3, calculate the load W_i of each current computing node P_i, sort the nodes by load from low to high, and take the computing node with the lowest load.

Specifically, the load W_i is calculated from at least one of the CPU, memory and network conditions of the computing node. In this embodiment, the current load W_i is calculated from resources of computing node P_i such as CPU, memory and network, and W_i lies in the interval [0, 100]. The method is as follows: at time t, let the CPU load of computing node P_i be Cpu(P_i), its network load be Net(P_i) and its memory load be Mem(P_i); the load value is then W_i = R₁ * Cpu(P_i) + R₂ * Net(P_i) + R₃ * Mem(P_i), where the values of R₁, R₂ and R₃ are set according to the type of task, for example 0.4, 0.5 and 0.1.
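The weighted sum above can be illustrated with a minimal Python sketch. The function name `node_load` and the input validation are illustrative assumptions, not part of the disclosure; the sketch also assumes that each component load is expressed as a percentage in [0, 100] and that the weights sum to 1, which is what keeps W_i inside [0, 100] for the example weights 0.4, 0.5 and 0.1.

```python
def node_load(cpu, net, mem, weights=(0.4, 0.5, 0.1)):
    """Return W_i = R1*Cpu + R2*Net + R3*Mem for one computing node.

    cpu, net, mem: component loads, assumed to be percentages in [0, 100].
    weights: (R1, R2, R3); the defaults are the example values from the text.
    """
    r1, r2, r3 = weights
    for value in (cpu, net, mem):
        if not 0 <= value <= 100:
            raise ValueError("each component load must lie in [0, 100]")
    # Assumption: weights sum to 1 so that W_i also stays within [0, 100].
    if abs(r1 + r2 + r3 - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return r1 * cpu + r2 * net + r3 * mem


if __name__ == "__main__":
    # CPU at 50 %, network at 20 %, memory at 80 %.
    print(node_load(50, 20, 80))
```

With these inputs the three terms are 20, 10 and 8, so the network term counts for more than the memory term even though memory is the busiest resource; this reflects the task-dependent choice of weights.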

Further, traditional load-balancing mechanisms mostly use either static round-robin scheduling or scheduling based on real-time dynamic load feedback. Static scheduling is inaccurate, while real-time feedback incurs network and system resource overhead. To address both problems, the invention adopts an interval-based dynamic feedback mechanism: the full load range is divided into intervals according to a configuration file, for example the three intervals (0, 33], (33, 66] and (66, 100]. When the W_i value of a computing node moves from one interval to another, for example from the first interval (0, 33] into the second interval (33, 66], the node feeds back its W_i value so that the nodes can be re-sorted.
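One possible reading of the interval-based feedback mechanism is sketched below in Python. The class name `IntervalFeedback` and its methods are hypothetical; the sketch assumes that a node reports its load once when it first samples it and afterwards only when the load crosses into a different interval, and that a load of exactly 0 is treated as belonging to the first interval.

```python
import bisect


class IntervalFeedback:
    """Decide whether a node should report its load W_i to the scheduler."""

    def __init__(self, upper_bounds=(33, 66, 100)):
        # Upper bounds of the intervals (0, 33], (33, 66], (66, 100].
        self.upper_bounds = list(upper_bounds)
        self.last_interval = None  # nothing reported yet

    def interval_of(self, load):
        # bisect_left puts a value equal to a bound into the lower interval,
        # which matches the half-open form (low, high].
        index = bisect.bisect_left(self.upper_bounds, load)
        return min(index, len(self.upper_bounds) - 1)

    def should_report(self, load):
        interval = self.interval_of(load)
        changed = interval != self.last_interval
        self.last_interval = interval
        return changed


if __name__ == "__main__":
    feedback = IntervalFeedback()
    for sample in (10, 30, 33, 34, 60, 70, 20):
        print(sample, feedback.should_report(sample))
```

Compared with reporting every sample, this reduces the number of feedback messages to one per interval change, at the cost of the scheduler seeing only a coarse view of each node's load.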

In addition, a load threshold V is set in the system. When the W_i value of a computing node exceeds V, tasks assigned to that node enter a waiting queue, which avoids extreme overload and keeps the system stable.
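The load threshold V can be pictured as a gate in front of each node, as in the following Python sketch. The class `NodeInbox`, the example value V = 80 and the rule of starting at most one waiting task per load sample are all assumptions made for illustration; the text only states that tasks wait in a queue while W_i exceeds V.

```python
from collections import deque


class NodeInbox:
    """Hold tasks assigned to one node while its load W_i exceeds V."""

    def __init__(self, threshold_v):
        self.threshold_v = threshold_v
        self.waiting = deque()  # tasks queued in arrival order
        self.running = []       # tasks that have been started

    def submit(self, task, load):
        """Queue a task, then try to start the oldest waiting task."""
        self.waiting.append(task)
        return self.drain(load)

    def drain(self, load):
        """Start at most one waiting task if the sampled load allows it."""
        if self.waiting and load <= self.threshold_v:
            task = self.waiting.popleft()
            self.running.append(task)
            return task
        return None


if __name__ == "__main__":
    inbox = NodeInbox(threshold_v=80)  # hypothetical value of V
    print(inbox.submit("image-hash", load=90))  # overloaded, task waits
    print(inbox.drain(load=70))                 # load dropped, task starts
```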

A4, calculate the idle rate R_idle of each current computing node P_i, and sort the nodes by idle rate.

Specifically: let the current number of tasks on computing node P_i be T_n; the idle rate of each computing node is then calculated according to the formula R_{idle} = \frac{T_m - T_n}{T_m} \times 100\%. The R_idle values obtained for all computing nodes are sorted from high to low, and the computing node P_i with the highest idle rate is taken.
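As a worked example of the idle rate, the following Python sketch computes R_idle and ranks nodes from most idle to least idle. The function names and the node names in the example are hypothetical; the formula is the one given in the text, with T_m the task window threshold and T_n the current task count.

```python
def idle_rate(t_m, t_n):
    """Return R_idle = (T_m - T_n) / T_m * 100 as a percentage."""
    if t_m <= 0:
        raise ValueError("T_m must be positive")
    if not 0 <= t_n <= t_m:
        raise ValueError("T_n must lie in [0, T_m]")
    return (t_m - t_n) / t_m * 100.0


def rank_by_idle_rate(nodes):
    """nodes maps a node name to (T_m, T_n); return names, most idle first."""
    return sorted(nodes, key=lambda name: idle_rate(*nodes[name]), reverse=True)


if __name__ == "__main__":
    cluster = {"node-a": (24, 6), "node-b": (24, 12), "node-c": (8, 0)}
    for name in rank_by_idle_rate(cluster):
        print(name, idle_rate(*cluster[name]))
```

Because R_idle is relative to each node's own T_m, a small node with no tasks ranks above a large node that is a quarter full.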

A5, place all tasks in a task set T = {T₁, T₂, T₃, …, T_n}, and sort the task set T by task priority.

Specifically, tasks of all types are first divided by granularity. The division should avoid large tasks that take too long, and should equally avoid splitting the work so finely that a large number of small tasks results. All tasks obtained from the division are placed in a task set T = {T₁, T₂, T₃, …, T_n} and given corresponding priorities; the task set T is then sorted by task priority, and tasks with the same priority are ordered by the first-come, first-served (FCFS) policy. FCFS is prior art and is not described in detail here.
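Priority ordering with FCFS tie-breaking can be sketched in Python as follows. The `Task` class, the task names and the convention that a larger number means a higher priority are assumptions for illustration; arrival order is recorded with a simple counter.

```python
import itertools
from dataclasses import dataclass, field

_arrival_counter = itertools.count()


@dataclass
class Task:
    name: str
    priority: int  # assumption: a larger value means a higher priority
    # Arrival sequence number, assigned when the task object is created.
    seq: int = field(default_factory=lambda: next(_arrival_counter))


def sort_tasks(tasks):
    """Highest priority first; equal priorities keep arrival (FCFS) order."""
    return sorted(tasks, key=lambda t: (-t.priority, t.seq))


if __name__ == "__main__":
    queue = [Task("index", 1), Task("search", 5), Task("carve", 5), Task("hash", 3)]
    print([t.name for t in sort_tasks(queue)])
```

The explicit sequence number makes the FCFS rule independent of the order in which tasks happen to be stored in the list.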

A6, take from the computing node set P the computing node P_i with the lowest load and the highest idle rate; if the result is empty, repeat steps A3 to A6 until a node is obtained.

A7, take from the task set T the task T_i with the highest priority; if the result is empty, repeat steps A4 to A7 until a task is obtained.

A8, assign the task T_i obtained in step A7 to the computing node P_i obtained in step A6, and at the same time update the task set T and the current task count of computing node P_i.

A9, repeat steps A2 to A8 until all tasks have been assigned.
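Steps A3 to A9 can be put together in a simplified Python sketch. It is not the patented implementation: the names `Node`, `pick_node` and `schedule` are hypothetical, node loads are fixed snapshots rather than live measurements, and "lowest load and highest idle rate" is interpreted as ordering by load first and using the idle rate to break ties, which the text does not specify. Where the method would wait and retry when no node is available, the sketch simply stops.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    t_m: int      # task window threshold T_m
    load: float   # W_i, taken here as a fixed snapshot
    t_n: int = 0  # current number of tasks T_n

    @property
    def idle_rate(self):
        return (self.t_m - self.t_n) / self.t_m * 100.0


def pick_node(nodes):
    """A6: lowest load first, highest idle rate as tie-breaker (assumption)."""
    candidates = [n for n in nodes if n.t_n < n.t_m]
    if not candidates:
        return None
    return min(candidates, key=lambda n: (n.load, -n.idle_rate))


def schedule(nodes, tasks):
    """tasks: list of (priority, name); returns {task name: node name}."""
    # A5: highest priority first, original position as the FCFS tie-breaker.
    pending = sorted(enumerate(tasks), key=lambda item: (-item[1][0], item[0]))
    assignment = {}
    for _, (_, task_name) in pending:       # A7: next highest-priority task
        node = pick_node(nodes)
        if node is None:
            break  # the described method would retry A3 to A6 instead
        node.t_n += 1                       # A8: update the node's task count
        assignment[task_name] = node.name
    return assignment


if __name__ == "__main__":
    cluster = [Node("n1", t_m=2, load=20.0), Node("n2", t_m=2, load=50.0)]
    jobs = [(1, "t1"), (5, "t2"), (5, "t3"), (3, "t4")]
    print(schedule(cluster, jobs))
```

In this example the lightly loaded node fills its task window first, after which the remaining tasks fall to the second node.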

In addition, this embodiment also comprises exception handling steps, specifically:

When a computing node fails, the tasks assigned to that node are promptly reclaimed and reassigned to other healthy nodes to continue execution.

A task processing timeout threshold is set; when a task has not completed by the timeout, it is either reassigned to another node for execution or left to finish on its current node.
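The two exception handling rules above can be sketched with a small bookkeeping class in Python. The name `TaskTracker`, its methods and the use of explicit timestamps are illustrative assumptions; how failures are detected, and whether a timed-out task is moved or left in place, is left to the caller, as in the text.

```python
import time


class TaskTracker:
    """Track which node runs which task, for failure and timeout handling."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.assigned = {}  # task -> (node, start time in seconds)
        self.pending = []   # reclaimed tasks waiting to be reassigned

    def assign(self, task, node, now=None):
        start = time.monotonic() if now is None else now
        self.assigned[task] = (node, start)

    def reclaim_failed_node(self, node):
        """Move every task of a failed node back to the pending list."""
        lost = [task for task, (owner, _) in self.assigned.items() if owner == node]
        for task in lost:
            del self.assigned[task]
            self.pending.append(task)
        return lost

    def timed_out(self, now=None):
        """Return tasks that have been running longer than the threshold."""
        now = time.monotonic() if now is None else now
        return [task for task, (_, start) in self.assigned.items()
                if now - start > self.timeout_s]


if __name__ == "__main__":
    tracker = TaskTracker(timeout_s=10)
    tracker.assign("a", "n1", now=0)
    tracker.assign("b", "n2", now=5)
    print(tracker.timed_out(now=12))          # tasks past the timeout
    print(tracker.reclaim_failed_node("n1"))  # tasks to hand to other nodes
```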

The step numbers in this embodiment are for convenience of description only and do not strictly represent the order of the steps.

The invention also provides a distributed forensics dynamic load-balancing scheduling device, comprising:

Initialization module, for defining the computing nodes as a computing node set P and initializing each computing node P_i.

Specifically, the initialization module places all computing nodes taking part in the forensic work in a computing node set P = {P₁, P₂, P₃, …, P_n} and initializes each computing node P_i; a node can receive and execute tasks only after it has been initialized successfully.

Task threshold calculation module, for determining the task window threshold T_m according to the resource situation of computing node P_i.

Specifically, when computing node P_i is initialized, the task window threshold T_m is determined from resources such as the node's number of CPU cores and its memory.

Load calculation and sorting module, for calculating the load W_i of each current computing node P_i and sorting the nodes by load.

Specifically, the load calculation and sorting module calculates the current load W_i from resources of computing node P_i such as CPU, memory and network, and W_i lies in the interval [0, 100]. The method is as follows: at time t, let the CPU load of computing node P_i be Cpu(P_i), its network load be Net(P_i) and its memory load be Mem(P_i); the load value is then W_i = R₁ * Cpu(P_i) + R₂ * Net(P_i) + R₃ * Mem(P_i), where the values of R₁, R₂ and R₃ are set according to the type of task, for example 0.4, 0.5 and 0.1.

Further, the load calculation and sorting module comprises an interval-based dynamic feedback module, for dividing the full load range into intervals according to a configuration file, for example the three intervals (0, 33], (33, 66] and (66, 100]. When the W_i value of a computing node moves from one interval to another, for example from the first interval (0, 33] into the second interval (33, 66], the W_i value is fed back to the load calculation and sorting module so that the loads can be re-sorted.

Idle rate calculation and sorting module, for calculating the idle rate R_idle of each current computing node P_i and sorting the nodes by idle rate.

Specifically: let the current number of tasks on computing node P_i be T_n; the idle rate of each computing node is then calculated according to the formula R_{idle} = \frac{T_m - T_n}{T_m} \times 100\%. The R_idle values obtained for all computing nodes are sorted from high to low, and the computing node P_i with the highest idle rate is taken.

Task priority sorting module, for placing all tasks in a task set T and sorting the task set T by task priority. Specifically, the task priority sorting module also comprises:

Task division module, for dividing tasks of all types by granularity. The division should avoid large tasks that take too long, and should equally avoid splitting the work so finely that a large number of small tasks results.

Priority configuration module, for assigning a corresponding priority to every task obtained from the division.

Optimal computing node acquisition module, for taking from the computing node set P the computing node P_i with the lowest load and the highest idle rate; if the result is empty, the load calculation and sorting module, the idle rate calculation and sorting module, the task priority sorting module and the optimal computing node acquisition module are run again until a node is obtained.

Highest-priority task acquisition module, for taking from the task set T the task T_i with the highest priority; if the result is empty, the idle rate calculation and sorting module, the task priority sorting module, the optimal computing node acquisition module and the highest-priority task acquisition module are run again until a task is obtained.

Task assignment module, for assigning the task T_i obtained by the highest-priority task acquisition module to the computing node P_i obtained by the optimal computing node acquisition module, and at the same time updating the task set T and the current task count of computing node P_i.

In addition, in this embodiment the device also comprises an exception handling module, comprising:

Fault handling module, for promptly reclaiming the tasks assigned to a computing node when that node fails and reassigning them to other healthy nodes to continue execution.

Timeout handling module, for setting a task processing timeout threshold; when a task has not completed by the timeout, it is either reassigned to another node for execution or left to finish on its current node.

The distributed forensics dynamic load-balancing scheduling device of this embodiment can be housed in a server that communicates with the computing nodes over a network. In this embodiment, networking is provided by a high-performance, cross-platform TCP communication module built on Boost.Asio. On the Windows platform it uses the IOCP network model, a technique suited to high-load servers that offers both good scalability and high execution efficiency; it is the foundation on which the scheduling service performs task scheduling and data exchange.

Fig. 2 compares the performance of forensic equipment with different numbers of computing nodes against traditional stand-alone forensic equipment, using test data images of the same size. As the figure shows, the traditional stand-alone equipment performs worst, and the more computing nodes the equipment has, the better it performs.

The modules and algorithm steps described in the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. Whether these functions are performed in hardware or in software depends on the particular application and the design constraints of the technical solution. Skilled practitioners may implement the described functions in different ways for each particular application, but such implementations should not be considered to go beyond the scope of the invention.

Although the invention has been specifically shown and described in connection with preferred embodiments, those skilled in the art should understand that various changes in form and detail may be made to the invention without departing from the spirit and scope of the invention as defined by the appended claims, and that all such changes fall within the scope of protection of the invention.

Claims

1. A distributed forensics dynamic load-balancing scheduling method, characterized in that it comprises the following steps:

A3, calculate the load W_i of each current computing node P_i, and sort the nodes by load;

A4, calculate the idle rate R_idle of each current computing node P_i, and sort the nodes by idle rate;

A9, repeat steps A2 to A8 until all tasks have been assigned.

2. The distributed forensics dynamic load-balancing scheduling method according to claim 1, characterized in that: in step A3, the load W_i is calculated from at least one of the CPU, memory and network conditions of the computing node.

3. The distributed forensics dynamic load-balancing scheduling method according to claim 2, characterized in that: in step A3, the load W_i is calculated as a weighted sum of the CPU, memory and network loads of the computing node.

4. The distributed forensics dynamic load-balancing scheduling method according to claim 1, 2 or 3, characterized in that: in step A3, an interval-based dynamic feedback mechanism is used to feed back the calculated load W_i, and the fed-back loads W_i are sorted.

5. The distributed forensics dynamic load-balancing scheduling method according to claim 1, characterized in that: in step A4, the idle rate R_idle is calculated as follows: let the current number of tasks on computing node P_i be T_n; the idle rate R_idle of each computing node is then calculated according to the formula R_{idle} = \frac{T_m - T_n}{T_m} \times 100\%.

6. The distributed forensics dynamic load-balancing scheduling method according to claim 1, characterized in that: in step A5, tasks with the same priority are ordered using the first-come, first-served (FCFS) policy.

7. The distributed forensics dynamic load-balancing scheduling method according to claim 1, characterized in that: it also comprises a task division step, specifically: tasks are split by granularity.

8. The distributed forensics dynamic load-balancing scheduling method according to claim 1, characterized in that: it also comprises exception handling steps, specifically:

9. A distributed forensics dynamic load-balancing scheduling device, characterized in that it comprises:

10. The distributed forensics dynamic load-balancing scheduling device according to claim 9, characterized in that: the load calculation and sorting module calculates the load W_i from at least one of the CPU, memory and network conditions of the computing node.

11. The distributed forensics dynamic load-balancing scheduling device according to claim 9, characterized in that: it also comprises a task division module, for splitting tasks by granularity.

12. The distributed forensics dynamic load-balancing scheduling device according to claim 9, characterized in that: it also comprises an exception handling module, which specifically comprises: