Movatterモバイル変換


[0]ホーム

URL:


CN108228393A - A kind of implementation method of expansible big data High Availabitity - Google Patents

A kind of implementation method of expansible big data High Availabitity
Download PDF

Info

Publication number
CN108228393A
CN108228393ACN201711339686.7ACN201711339686ACN108228393ACN 108228393 ACN108228393 ACN 108228393ACN 201711339686 ACN201711339686 ACN 201711339686ACN 108228393 ACN108228393 ACN 108228393A
Authority
CN
China
Prior art keywords
node
controller
service
service processes
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711339686.7A
Other languages
Chinese (zh)
Inventor
万宝月
杨进展
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Aerospace Heng Jia Data Technology Co Ltd
Original Assignee
Zhejiang Aerospace Heng Jia Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Aerospace Heng Jia Data Technology Co LtdfiledCriticalZhejiang Aerospace Heng Jia Data Technology Co Ltd
Priority to CN201711339686.7ApriorityCriticalpatent/CN108228393A/en
Publication of CN108228393ApublicationCriticalpatent/CN108228393A/en
Pendinglegal-statusCriticalCurrent

Links

Classifications

Landscapes

Abstract

The present invention relates to a kind of implementation methods of expansible big data High Availabitity, include the following steps, the first data directory structure are built in Zookeeper, and two controllers, respectively master controller and preparation controller are configured for each node;When initial, setting master controller is activation process;When master controller delays machine, then the transient node that master controller creates in the first data directory structure is deleted by time-out;After preparation controller receives the notice that transient node is deleted by time-out, transient node is re-created in the first data directory structure, and take over as activation process;While preparation controller takes over as activation process, the process of pull-up master controller again.The present invention utilizes the characteristic of Zookeeper, and node has carried out distributed treatment, and the information of each node is shared and monitored by Zookeeper, accomplishes each process service High Availabitity.

Description

A kind of implementation method of expansible big data High Availabitity
Technical field
The present invention relates to big data fields, and in particular to a kind of implementation method of expansible big data High Availabitity.
Background technology
A kind of implementation method of Nginx server load balancings involved in the prior art, by Nginx servers using tactileHairdo mechanism dynamically updates the resource information of back-end server cluster, then according to the resource information of back-end server cluster andDifferent types of service, load value of the computational back-end server under various businesses type select the industry belonging to Http requestsThe back-end server of load value minimum handles Http requests in service type.Meanwhile Nginx server load balancing systems canReasonably user's request is allocated, makes state of the load in relative equilibrium of back-end server cluster, is closed so as to playReason utilizes the purpose of server resource.Fig. 1 is the network topological diagram of load-balancing method in the prior art, and user is connect by terminalEnter Internet and access required service, load balancing relies on Nginx servers and shunts service of the access request to server clusterDevice.Fig. 2 is the flow chart of load-balancing method in the prior art, and idiographic flow is:S1, Nginx server pass through trigger-typeThe resource information of new mechanism back-end server cluster, resource information include cpu busy percentage, memory usage, the magnetic of serverDisk I/O utilization rates and network bandwidth utilization factor;It is equal using load after S2, Nginx server receive the Http requests of clientThe algorithms selection back-end server processing Http that weighs is asked, and server process result is returned to client;S3, load balancing are calculatedMethod is according to different types of service and the resource information of back-end server cluster, and computational back-end server is under various businesses typeLoad value, select the Http request belonging to type of service in load value minimum back-end server handle the Http ask.
The method and system of above-mentioned existing Nginx server load balancings has the drawback that:
1. the load balancing of Http requests can only be solved, and do not have scalability:Above-mentioned load-balancing method is to utilizeAfter Nginx servers receive Http request progress URL parsings, load distribution is carried out according to dispatching algorithm, load balancing is solved and asksTopic.Do not refer to how the problem of load balancing in the case where non-Http is asked solves, such as RMI, JMS, RPC in method.SideBe only the load balance process flow for indicating Http requests in method, for non-Http ask no expansible interface orService, therefore the program does not also have autgmentability.
2. without High Availabitity function and elastic telescopic mechanism, and it is not distributed type assemblies:Above-mentioned load-balancing method solvesWhen a large number of users is accessed by Http requests in load balancing to back-end services cluster group, this is built upon NginxUnder the premise of server works normally.If Nginx servers are delayed because of a variety of causes after machine, Http requests cannot be on time normalBe sent to back-end server cluster, lead to user's access exception.
Due to relying on the load balancing of Nginx servers, need to determine back-end server number of clusters in advance, be takenDevice resource information of being engaged in is collected, and load distribution is carried out according to algorithm.Therefore back-end server cluster does not have dynamic expansion function, ifA back-end server is further added by, modification Nginx servers is needed to be configured and restart, cause user that cannot access institute for a period of timeIt needs to service.In the period of user's visit capacity is less, back-end server cluster must be kept all in operating status, waste of resource, noIt can accomplish that dynamic reduces the equally loaded again of back-end server quantity.
For rear end cluster and Nginx servers not using distributed thought, lead to the load balancing of rear end cluster notIt can freely stretch, Nginx servers do not have High Availabitity function.
Invention content
The technical problems to be solved by the invention are to provide a kind of implementation method of expansible big data High Availabitity, solveUser develops the problem of node High Availabitity under different business scene and various communications protocols.
The technical solution that the present invention solves above-mentioned technical problem is as follows:A kind of realization side of expansible big data High AvailabitityMethod includes the following steps,
Build the first data directory structure in Zookeeper, and two controllers be configured for each node, respectively based onController and preparation controller;
When initial, setting master controller is activation process, and the master that master controller stores in the first data directory structureUnder the control for controlling information, and according to the service of node serve information administration corresponding node stored in the first data directory structureProcess;
When master controller delays machine, then the transient node that master controller creates in the first data directory structure is deleted by time-outIt removes;
After preparation controller receives the notice that transient node is deleted by time-out, re-create and face in the first data directory structureShi Jiedian, and activation process is taken over as, and the control of standby control information that preparation controller stores in the first data directory structureUnder system, and according to the service processes of node serve information take over corresponding node stored in the first data directory structure;
While preparation controller takes over as activation process, the process of pull-up master controller again.
The beneficial effects of the invention are as follows:A kind of implementation method of expansible big data High Availabitity of the present invention utilizesThe characteristic of Zookeeper, node have carried out distributed treatment, and the information of each node is shared and monitored by Zookeeper;Due to saving the details of each node in Zookeeper, so master controller and preparation controller at runtime can be mutualMutually monitor the situation of other side, the master controller for being in state of activation is delayed suddenly after machine, and spare preparation controller can be taken in timeTask simultaneously attempts pull-up and has delayed the master controller of machine, and ensure that process services normal operation, accomplishes that each process service height canWith.
Based on the above technical solution, the present invention can also be improved as follows.
Further, the first data directory structure includes root, second-level directory and three-level catalogue, wherein, rootProject name is named as, second-level directory is the host name of each node, and being created in three-level catalogue has node control message file and sectionPoint service information file, node control message file include active information file, main control message file and standby control information textPart;
The node serve information is stored in node serve message file, and master controller or preparation controller create interimNode is stored in active information file, and the main control information is stored in main control message file, the standby control informationIt is stored in standby control message file.
Further, master controller and preparation controller check in real time according to the transient node recorded in the active information fileThe state of activation of itself;
When master controller or preparation controller, which check, lays oneself open to state of activation, then master controller or preparation controller continueIt checks the state for oneself possessing service processes, and the service processes stopped being handled according to preset algorithm.
Further, before master controller delays machine, the information on services for the service processes being currently running is stored in by master controllerOn the node serve message file;
After master controller delays machine, when preparation controller takes over as activation process, preparation controller is according to the node serveThe information on services recorded on message file whether there is to detect the service processes that master controller is delayed before machine.
Further, when preparation controller detects that the service processes that master controller is delayed before machine exist and in health statusWhen, then the service processes that preparation controller delays to master controller before machine take over;
Before preparation controller delays machine, the information on services for the service processes being currently running is stored in the node by preparation controllerIn service information file.
Further, the service of controller administration corresponding node being active, the load balancing including control node,The step of load balancing of controller control node being active, includes,
The second data directory structure is built in Zookeeper, and node is started to the configuration information storage of service processesIn the second data directory structure;
The controller being active according to the multiple mutually independent service types of the node serve information monitoring, andEach service processes are individually dispatched respectively according to the number of nodes under each service type and configuration information.
Further, the second data directory structure includes root, second-level directory, three-level catalogue and level Four catalogue,In, root is named as project name, and second-level directory is named as load balancing master catalogue, and three-level catalogue is the service under each nodeClassification, creating in level Four catalogue has configuration information file and nodal information file, and nodal information file designation is the master of each nodeMachine title;The configuration information is stored in the configuration information file.
Further, the controller being active is distinguished according to the number of nodes under each service type and configuration informationThe process individually dispatched includes automatic load balancing, and the detailed process of automatic load balancing is,
When the controller being active starts, calculating needs service processes quantity to be started on transient node,And the service processes quantity oneself administered is adjusted according to the variation of service processes sum, while the service managed in needsTransient node is created in nodal information file under category list;
When the service processes failure of the controller on a node, then under the corresponding service type catalogue for needing to manageNodal information file in transient node be automatically left out, the controller being active on other nodes is according to monitoringSituation is to needing service processes quantity to be started to recalculate, and be adjusted correspondingly.
Further, the calculating process of service processes to be started is needed on node specifically, according to all number of nodes andThe need service processes quantity to be started being written in configuration information file, calculating needs service processes to be started on present nodeQuantity.
Further, the controller being active is distinguished according to the number of nodes under each service type and configuration informationThe process individually dispatched further includes sluggard's load balancing, and the principle of sluggard's load balancing is, when new node adds in cluster,Original service processes on present node will not actively be stopped;When configuration information file changes or has a service processes appearanceDuring mistake, load and start new service processes less than the controller corresponding to the node of preset configuration quantity.
Advantageous effect using above-mentioned further scheme is:Since the information of each node carries out in ZookeeperStorage, so controller can dynamically be increased and decreased according to load-balancing algorithm according to the quantity of service processes, when increasing newly orWhen losing node, carry out load balance can also be recalculated.The present invention carries out load balancing on the basis of the service of bottom process,Since certain service access request quantity can be configured in each independent process service, when access request quantity increasesWhen, it is only necessary to calculating needs process quantity of service to be started, shields access request and is initiated with which kind of communication protocol, it is only necessary to be pressedIt is encapsulated in process service according to the communication protocol of demand, load-balancing algorithm carries out load balancing according to service processes quantity.InstituteThe problem of load balancing of various communications protocols is adapted to the present invention, user can carry out load balancing with ready-made process service,It can also independently be extended under the load balancing frame of the present invention according to special scenes demand.And the present invention is according to processService time to live is divided into two kinds of mechanism and carries out load balancing, and sluggard load balancing side is used for the process service of longtime runningFormula need to only go wrong in process and carry out recalculating distribution, and automatic load balancing is used for the process service of short-term operationMode can carry out load balancing distribution in real time.
Description of the drawings
Fig. 1 is the network topological diagram of Nginx server load balancings method in the prior art;
Fig. 2 is the flow chart of Nginx server load balancings method in the prior art;
Fig. 3 is with the reality of the first data knot catalogue structure in a kind of implementation method of expansible big data High Availabitity of the present inventionThe structure chart of existing High Availabitity;
Fig. 4 is realizes with the second data directory structure in a kind of implementation method of expansible big data High Availabitity of the present inventionThe structure chart of load balancing;
Fig. 5 is loaded for Flume in a kind of specific embodiment of the implementation method of expansible big data High Availabitity of the present inventionThe structure chart of equilibrium application.
Specific embodiment
The principle and features of the present invention will be described below with reference to the accompanying drawings, and the given examples are served only to explain the present invention, andIt is non-to be used to limit the scope of the present invention.
A kind of implementation method combination Zookeeper distributed type assemblies management of expansible big data High Availabitity of the present invention is specialProperty, design it is a kind of there is distributed, expansible node High Availabitity and load balancing scheme, solve user not of the same trade or businessThe problem of exploitation node High Availabitity and load balancing are needed using multiple technologies stack under business scene and various communications protocols is born simultaneouslyIt carries equalization scheme and supports that automatic and sluggard's two ways, user can more preferably be selected according to different business scene, it can alsoUsing load balancing interface exploitation is extended according to special scenes demand.
Zookeeper is distributed, open source code a distributed application program coordination service, is GoogleMono- realization increased income of Chubby is the significant components of Hadoop and Hbase;It is one and provides consistency for Distributed ApplicationThe software of service, the function of providing include:Configuring maintenance, domain name service, distributed synchronization, group service etc..
Two kinds of technologies of dependence Zookeeper are needed in the present invention:Transient node and monitor.
There are two types of nodes in Zookeeper, respectively transient node and persistent node.Ephemeral Node areThe transient node of Zookeeper, the life cycle of such node, which depends on, creates their session, once conversation end, temporarilyNode will be automatically left out;Persistent node, English name Persistent Node refer to after node creates, just deposit alwaysThis node is actively being removed until there is delete operation, will not disappeared because of the client session failure for creating the node.
Zookeeper clients can set monitor (Watch) on node, when node state changes, thanSuch as the increase, deletion, change of data, it will the operation corresponding to triggering monitor.When monitor is triggered, ZookeeperIt will be sent to client and only send a notice, because monitor can only be triggered once.
The High Availabitity (HA) of node, i.e.,:High Available, high availability cluster are to ensure having for business continuanceImitate solution, it is general there are two or more than two nodes, and be divided into active node (Active) and standby node(Standby);Usually the active node that is known as the business that is carrying out, and a backup as active node is then referred to as standbyUse node;When active node goes wrong, when causing the business being currently running (task) to be not normally functioning, standby node is at this timeIt will detect, and the active node that immediately continues performs business;So as to fulfill business do not interrupt or short interruption.
As shown in figure 3, in order to realize the High Availabitity of node serve, need to establish specific catalogue knot in ZookeeperStructure (i.e. the first data directory structure), that is, the NameSpace of Zookeeper, root can be named as project name(ZKNameSpace), second-level directory is the host name (Nost of each node:Including Nost_A, Nost_B, Nost_C etc., HostFor the host of erection schedule service, also referred to as node), three-level catalogue is divided into two classes:One kind is node control information, including activationInformation (Active) and active and standby control information (Controller_A, Controller_B), another kind of is node serve information(Agents), classification storage can be carried out according to service type below node serve information.
As shown in figure 3, under each node (Host, below by taking Nost_A as an example), all there are two controller (Controller_A, Controller_B, Controller_A is state of activation herein, and Controller_B is spare), each controllerThere is the function of monitoring another controller, and delay in another controller that (machine of delaying, referring to operating system can not be from a serious system for machineRecover or system hardware level goes wrong in system mistake, so that system is for a long time without response, and restarting meter of having toThe phenomenon that calculation machine.Can also refer to some process occur serious error needs restart) when take over its subordinate node serve process (toolBody, master controller is under the control of main control information according to the service of node serve information realization corresponding node, preparation controllerAccording to the service of node serve information realization corresponding node under the control of standby control information), and restart the control for machine of delayingDevice.
The present invention monitors the state of a process of two controllers, and a controller wherein by ZookeeperProcess when the error occurs, realize automatically switch.When initial, the master controller Controller_A of each node is (hereinafter referred to asCA it is) activation (Active) process, when CA delays machine, (transient node stores the transient node created in ZookeeperCatalogue is /ZKNamespace/Host_A/Active) it will be deleted by time-out, while preparation controller Controller_B is (followingAbbreviation CB) notice that transient node will be received is deleted;Then, CB will re-create transient node in Zookeeper, and connectPipe becomes the function of activation process.It, will pull-up CA processes again while CB takes over as activation process.
In order to normally take over existing node serve process, it is initially in the master controller of state of activationBy the information on services corresponding to the service processes being currently running, (information on services is specifically as follows Controller_A for meeting before the machine of delayingPort and PID) it is stored on Zookeeper (catalogue of storage is /ZKNamespace/Host/Agents/Topic/ID).OftenWhen secondary preparation controller Controller_B is switched to state of activation, it can attempt using the process service letter recorded on ZookeeperCease whether the service processes delayed before machine to detect master controller Controller_A also exist.If master controllerThe service processes that Controller_A delays before machine exist and in health status, then Controller_B pairs of preparation controllerIt takes over, and the function being managed to existing service process is realized with this.And the controller being active can be examinedLook into the state for oneself possessing service processes, and according to preset algorithm, the service processes stopped being restarted orOther processing operations.
Each controller can check the current state of oneself in real time, if laying oneself open to state of activation, then will continue toThe operating status for oneself possessing service processes is checked, if the service processes oneself possessed mistake occur or exception is moved backGo out, then the predetermined operation according to preset algorithm is to there is mistake or the service processes that exit extremely are handled or againIt opens.
The service processes of controller administration corresponding node being active include the load balancing of control node.
As shown in figure 4, in order to realize node load balancing function, need to establish specific catalogue knot in ZookeeperStructure (i.e. the second data directory structure), that is, the NameSpace of Zookeeper, root can be named as project name(ZKNameSpace), second-level directory is load balancing master catalogue (Balancer), and three-level catalogue is the service class under each nodeNot (Topic:Including Topic_A, Topic_B, Topic_C etc., Topic is process service type, according to different methods of service orClassifying content, same service type can have multiple processes under its command and provide service), level Four catalogue is configuration information(Configuration, the configuration information of recording controller) and nodal information (Nost), wherein nodal information are each nodeHostname (including Nost_A, Nost_B, Nost_C etc.).
The controller Controller being active on each node can using Zookeeper come carry out node itBetween service processes distribution, it is therefore desirable to the configuration information storage of service processes will be started on Zookeeper.For eachThe controller Controller being active can monitor multiple service types, and each service type is mutual indepedent,The controller Controller being active can be right respectively according to the number of nodes under each service type and configuration informationEach service processes are individually dispatched.Each corresponding configuration information of service type deposit in Zookeeper on (catalogue is/ZKNamespace/Balancer/Topic/Configuration)。
The service type managed can be needed when the controller Controller being active starts in ZookeeperCorresponding transient node is created under catalogue, and (transient node creaties directory as/ZKNamespace/Balancer/Topic/Hosts/Host_A), need to start a certain number of service processes on each node, the controller being activeController can be calculated when starting needs service processes quantity to be started on transient node.All controls being activeDevice can monitor the quantity of node, and calculate the service processes that initiate by its own is needed in corresponding node, also can according to service intoThe variation of journey sum is adjusted the service processes quantity oneself administered.Likewise, when the clothes of the controller on a nodeWhen business process fails, (specific catalogue is /ZKNamespace/Balancer/Topic/Hosts/ under corresponding Hosts cataloguesHost_A transient node) can disappear automatically, other nodes are according to monitoring situation to service processes quantity to be started is needed to carry outIt recalculates, is timely adjusted correspondingly.Here the method for adjustment service processes quantity follows the principle that last in, first out, i.e.,Newest service processes are preferentially stopped.
Each node needs service processes algorithm to be started as follows:According to the quantity of all transient nodes withThe need service processes quantity to be started being written in Configuration catalogues, can calculate needs to start on present nodeNumber of processes.Current service processes Distribution Algorithm process includes, and lists host name all under Hosts catalogues, and makeIt is ranked up with lexcographical order, using serial number of the host name of present node in entire sequence as initial Index, then usedFollowing code carries out the calculating of number of processes.
Int number=0;(it is integer variable to define service processes quantity)
For (int i=index;i<total;I+=hostNum)
number++;(using serial number of the host name of present node in entire sequence as initial Index, when currentWhen serial number of the host name of node in entire sequence is less than total sequence quantity, according to total node number amount, present node serial numberAnd process sum to be started is needed, which to carry out present node, needs service processes quantity to be started to calculate)
}
return number;(return node needs service processes quantity to be started)
In order to adapt to different business scenarios, it is proposed that two class load-balancing methods, first, above-mentioned automatic load balancing:Load balancing can be carried out automatically at once when newly-increased or while losing node, compare the service processes for adapting to short-term operation, becauseFor for the process of longtime running, according to above-mentioned automatic equalization mode, whenever a newly-increased node or one is lostDuring node, just have process and be moved to end, and restart on new node, and the cost for starting stopping Long Running Tasks is veryHigh, the process newly started is with original state of a process it is difficult to ensure that consistent.Therefore, occur number of nodes variation when at once intoThe balanced strategy of row, the task for longtime running is not a kind of suitable mode;In order to avoid disadvantages mentioned above, the present invention also carriesSluggard's load balancing has been supplied, sluggard's load balancing can't actively delete original service processes when new node adds in cluster,And when configuration information file changes or has service processes when the error occurs, it just has and loads relatively low controller to startNew service processes, in this way can be to avoid unknown problem caused by the original service processes possibility of stopping.
The computational methods that number of nodes to be started is needed to be described with automatic load balancing on each node of sluggard's load balancingIt is identical.Difference is:
1) the service processes quantity being currently running can be recorded on Zookeeper that (physical record is in node by each nodeIn message file), other nodes can be understood and there remains how many a service processes at present and can start;
2) each node will not actively stop service processes, but can update when occurring service processes failure on nodeData in Zookeeper are determined to start the node of service processes by preset algorithm;
3) each node can (Leader be elected, and is generally opened in cluster by the Leader Selection on ZookeeperDynamic or Leader delays and performs after machine) mode, try to become Leader (collection group decision and leader node, a Zookeeper clusterOnly there are one leader, similar Master/Slave patterns after client submits request, are first sent to Leader, LeaderAs recipient, it is broadcast to each Server), and only current Leader can start service processes;
4) each node load can add in election queue less than the quantity of configuration, and when reaching destination service number of processesElection can be exited.
A kind of key point of the implementation method of expansible big data High Availabitity is invented to be:(1) distributed type assemblies design,When part of nodes delays machine, ensure that service is normal and provide;(2) load-balanced server High Availabitity is carried out using Zookeeper to setMeter ensures the load-balancing mechanism normal operation of cluster;(3) using automatic load balancing and sluggard's load balancing both of which weighing apparatus sideMethod is suitble to different application scenarios;(4) Controller of load balancing can independently be expanded according to special scenes demandExhibition;(5) load balancing accomplishes that freely (elastic telescopic refers to according to business demand and strategy, its elasticity of adjust automatically elastic telescopicThe management service of computing resource reaches the service ability of optimization combination of resources, increases computing capability when portfolio rises, work as industryBusiness amount decline when reduce computing capability, the stability and high availability of operation system are ensured with this, at the same save computing resource intoThis), it dynamically calculates and carries out load distribution.
It illustrates how to realize by the present invention for solving Flume load balancing below.
Flume is the High Availabitity that Cloudera is provided, highly reliable, distributed massive logs acquisition, polymerizationWith the system of transmission, Flume supports to customize Various types of data sender in log system, for collecting data;Meanwhile FlumeIt provides and simple process is carried out to data, and write the ability of various data receivings (customizable).
Flume is usually used in acquiring daily record data, and the Agent of oneself is placed in the server for needing gathered data, ifThe daily record data of acquisition is more dispersed, type is more, it is necessary to which a reliable load-balancing mechanism solves load assignment problem.The structure of Flume load balancing application is as shown in figure 5, there are three node, storage Flume for deployment in ZookeeperThe nodal information and configuration information of Controller, process service (in particular to Flume Agent) difference portion of gathered dataThree Flume Agent have given tacit consent to during beginning on three hosts on each host in administration.
Firstly the need of the Essential Environment for erecting operation, including JDK, Zookeeper, Flume and will need externally to carryFlume Agent configurations for service finish.
Then Flume Controller required informations are configured, using the monitoring method realized or pass through monitoringInterface realizes the monitoring method needed for oneself, and monitoring content includes Flume Controller processes and delays machine, Flume AgentHealth status.
Finally start all process services, itself can be registered in Zookeeper by Flume Controller, and establishment is facedWhen nodal directory preserve information.
High Availabitity is realized:
When the Active processes of Flume Controller break down, the corresponding transient node in ZookeeperIt can remove automatically, Flume Controller standby nodes can monitor Active process transient nodes and be eliminated, and automatically switchFor Active states, transient node, and the new Flume Controller processes of pull-up one again are registered in ZookeeperAs backup.
Flume Controller can be on the service processes synchronizing information to Zookeeper possessed, as spare FlumeWhen Controller starts, service processes information can be obtained, and attempts to take over.If process service is not present, FlumeController will start new process service, when starting new process service every time, by the information of process service (hereNo. PID of fingering journey and port numbers) it is stored on Zookeeper so that subsequent processes service recovery uses.
Load balancing is realized:
Flume Controller can periodically call load-balancing method, attempt to recalculate number of processes.ForCan obtain needs the node for sharing process, can list the file under all Hosts here, and to the title of these nodesIt is ranked up, the location of acquisition node itself, it is to be started according to total node number amount, present node serial number and need laterService processes sum, which carries out present node, needs number of processes to be started to calculate.
When newly increasing a node every time, catalogue can be established on Zookeeper and preserves information, represent that there are one newNode shares process service, and load-balancing method can stop process service, and equilibrium assignment again.When losing a node every timeWhen, the transient node on Zookeeper can be removed, and at this moment the process service lost can be recalculated distribution by load-balancing methodTo on remaining node.
Sluggard's load-balancing method can also be used, unlike before, sluggard's load-balancing method can't be attemptedService processes are stopped, but are just retried when the error occurs checking service processes.But the if number of processAmount can enter the election queue of launching process less than desired startup number, Balancer, attempt to start new process.
The present invention utilizes the characteristic of Zookeeper, and node has carried out distributed treatment, and the information of each node passes throughZookeeper is shared and is monitored.Due to saving the details of each node in Zookeeper, so active and standbyController can monitor mutually the situation of other side at runtime, be in after the machine of delaying suddenly of Active states, spareIt timely taking over tasks and pull-up can be attempted has delayed the Controller of machine, and ensure that process services normal operation, accomplish eachProcess services High Availabitity.Since the information of each node is stored in Zookeeper, so Controller can be byAccording to load-balancing algorithm, dynamically increased and decreased according to the quantity of service processes, it, can also be again when increasing newly or losing nodeIt calculates and carries out load balance.
The present invention carries out load balancing on the basis of the service of bottom process, since each independent process service can be configuredCertain service access request quantity, therefore when access request quantity increase, it is only necessary to calculating needs process service to be startedQuantity is shielded access request and is initiated with which kind of communication protocol, it is only necessary to which communication protocol as desired is encapsulated in process serviceIn, load-balancing algorithm carries out load balancing according to service processes quantity.So the present invention adapts to the load of various communications protocolsEqualization problem, user can carry out load balancing with ready-made process service, can also be according to special scenes demand in the present inventionLoad balancing frame under independently extended.And the present invention is divided into two kinds of mechanism according to process service time to live and is bornEquilibrium is carried, sluggard's load balancing mode is used for the process service of longtime running, only need to go wrong progress again in processDistribution is calculated, automatic load balancing mode is used for the process service of short-term operation, load balancing point can be carried out in real timeMatch.
The foregoing is merely presently preferred embodiments of the present invention, is not intended to limit the invention, it is all the present invention spirit andWithin principle, any modification, equivalent replacement, improvement and so on should all be included in the protection scope of the present invention.

Claims (10)

CN201711339686.7A2017-12-142017-12-14A kind of implementation method of expansible big data High AvailabitityPendingCN108228393A (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
CN201711339686.7ACN108228393A (en)2017-12-142017-12-14A kind of implementation method of expansible big data High Availabitity

Applications Claiming Priority (1)

Application NumberPriority DateFiling DateTitle
CN201711339686.7ACN108228393A (en)2017-12-142017-12-14A kind of implementation method of expansible big data High Availabitity

Publications (1)

Publication NumberPublication Date
CN108228393Atrue CN108228393A (en)2018-06-29

Family

ID=62649567

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN201711339686.7APendingCN108228393A (en)2017-12-142017-12-14A kind of implementation method of expansible big data High Availabitity

Country Status (1)

CountryLink
CN (1)CN108228393A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN109104321A (en)*2018-09-032018-12-28福建星瑞格软件有限公司The method for promoting Web_HDFS availability under Hadoop dual computer group
CN109189854A (en)*2018-08-142019-01-11新华三技术有限公司成都分公司The method and node device of sustained traffic are provided
CN110532060A (en)*2019-08-102019-12-03佳都新太科技股份有限公司A kind of hybrid network environmental data collecting method and system
WO2020147330A1 (en)*2019-01-182020-07-23苏宁云计算有限公司Data stream processing method and system
CN111552672A (en)*2020-02-192020-08-18中国船舶工业系统工程研究院ZooKeeper-based distributed service state consistency maintenance method and device
CN112948128A (en)*2021-03-302021-06-11华云数据控股集团有限公司Target terminal selection method, system and computer readable medium
CN114520809A (en)*2022-03-042022-05-20浪潮云信息技术股份公司Method and device for realizing load balancing of back-end request

Citations (4)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN1719831A (en)*2005-07-152006-01-11清华大学 A Highly Available Distributed Border Gateway Protocol System Based on Cluster Router Structure
US20130058354A1 (en)*2010-07-062013-03-07Martin CasadoManaged switching elements used as extenders
CN105553732A (en)*2015-12-232016-05-04中国科学院信息工程研究所Distributed network simulation method and system
CN105786666A (en)*2016-02-052016-07-20浪潮(北京)电子信息产业有限公司Failure processing method and system for multi-controller storage system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN1719831A (en)*2005-07-152006-01-11清华大学 A Highly Available Distributed Border Gateway Protocol System Based on Cluster Router Structure
US20130058354A1 (en)*2010-07-062013-03-07Martin CasadoManaged switching elements used as extenders
CN105553732A (en)*2015-12-232016-05-04中国科学院信息工程研究所Distributed network simulation method and system
CN105786666A (en)*2016-02-052016-07-20浪潮(北京)电子信息产业有限公司Failure processing method and system for multi-controller storage system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
田心宁: "基于Zookeeper的SDN多控制器架构的研究与实现", 《中国优秀硕士学位论文全文数据库信息科技辑》*

Cited By (10)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN109189854A (en)*2018-08-142019-01-11新华三技术有限公司成都分公司The method and node device of sustained traffic are provided
CN109189854B (en)*2018-08-142021-06-08新华三技术有限公司成都分公司Method and node equipment for providing continuous service
CN109104321A (en)*2018-09-032018-12-28福建星瑞格软件有限公司The method for promoting Web_HDFS availability under Hadoop dual computer group
CN109104321B (en)*2018-09-032021-05-18福建星瑞格软件有限公司Method for improving Web _ HDFS availability under Hadoop dual-computer cluster
WO2020147330A1 (en)*2019-01-182020-07-23苏宁云计算有限公司Data stream processing method and system
CN110532060A (en)*2019-08-102019-12-03佳都新太科技股份有限公司A kind of hybrid network environmental data collecting method and system
CN111552672A (en)*2020-02-192020-08-18中国船舶工业系统工程研究院ZooKeeper-based distributed service state consistency maintenance method and device
CN111552672B (en)*2020-02-192023-09-15中国船舶工业系统工程研究院Distributed service state consistency maintenance method and device based on ZooKeeper
CN112948128A (en)*2021-03-302021-06-11华云数据控股集团有限公司Target terminal selection method, system and computer readable medium
CN114520809A (en)*2022-03-042022-05-20浪潮云信息技术股份公司Method and device for realizing load balancing of back-end request

Similar Documents

PublicationPublication DateTitle
CN108228393A (en)A kind of implementation method of expansible big data High Availabitity
JP6600373B2 (en) System and method for active-passive routing and control of traffic in a traffic director environment
CN108712464A (en)A kind of implementation method towards cluster micro services High Availabitity
US9262229B2 (en)System and method for supporting service level quorum in a data grid cluster
CN102739775B (en)The monitoring of internet of things data acquisition server cluster and management method
CN104935672B (en)Load balancing service high availability implementation method and equipment
CN112463366B (en) Cloud-native-oriented microservice automatic expansion and contraction and automatic circuit breaker method and system
CN102571772B (en)Hot spot balancing method for metadata server
EP3039844B1 (en)System and method for supporting partition level journaling for synchronizing data in a distributed data grid
CN103067293A (en)Method and system for multiplex and connection management of a load balancer
CN105630589A (en)Distributed process scheduling system and process scheduling and execution method
WO2021120633A1 (en)Load balancing method and related device
JP6615761B2 (en) System and method for supporting asynchronous calls in a distributed data grid
US10652100B2 (en)Computer system and method for dynamically adapting a software-defined network
CN114900449B (en)Resource information management method, system and device
WO2024021469A1 (en)System operation and maintenance management method and apparatus, and electronic device
CN110113406B (en)Distributed computing service cluster system
Chang et al.Write-aware replica placement for cloud computing
JP7678892B2 (en) Geographically Distributed Hybrid Cloud Clusters
CN119473720A (en) Data backup method, device, electronic device and storage medium
CN119149165A (en)High availability computing clusters
CN104468674B (en)Data migration method and device
CN112100283B (en)Linux platform based time-sharing multiplexing method for android virtual machine
CN119155209A (en)Scheduling system, method and electronic equipment of Redis instance crossing K8s cluster
CN118642844A (en) Dynamic and smooth expansion of POD resources based on K8S and rescheduling method under heavy load

Legal Events

DateCodeTitleDescription
PB01Publication
PB01Publication
SE01Entry into force of request for substantive examination
SE01Entry into force of request for substantive examination
RJ01Rejection of invention patent application after publication
RJ01Rejection of invention patent application after publication

Application publication date:20180629


[8]ページ先頭

©2009-2025 Movatter.jp