Disclosure of Invention
In view of this, an object of the embodiments of the present invention is to provide a method, a system, a computer device, and a computer-readable storage medium for process cooperation in a cluster. The operating system cluster is treated as a single whole, and no master host needs to be elected or assigned, which reduces the risk of split-brain. By setting a threshold, the process with the highest resource consumption on a server that has reached the threshold is migrated to another host for execution; when no other host can run the process, the resource access of the process is limited instead, which avoids server downtime or even a cascading avalanche and improves server availability. Hosts interact by broadcast, which reduces the interaction frequency.
Based on the above object, an aspect of the embodiments of the present invention provides a method for process cooperation in a cluster, including the following steps: configuring all hosts in the same network segment, establishing communication between each host and the other hosts, and monitoring the resource configuration of each host; in response to the resource configuration of a first host exceeding a threshold, calculating the resource consumption of each process in the first host and determining the process with the largest resource consumption; initiating a migration broadcast to the cluster and determining whether a second host in the cluster responds to the broadcast; and in response to a second host in the cluster responding to the broadcast, sending the process with the largest resource consumption to the second host.
In some embodiments, the method further comprises: in response to no second host in the cluster responding to the broadcast, limiting resource usage by a process in the first host having a greatest resource consumption.
In some embodiments, sending the process with the largest resource consumption to the second host includes: in response to a plurality of second hosts responding to the broadcast, acquiring the current resource utilization of each second host, and sending the process with the largest resource consumption to the second host with the lowest current resource utilization.
In some embodiments, sending the process with the largest resource consumption to the second host includes: suspending the process with the largest resource consumption, compressing the memory state of the process, and sending the compressed process memory image and the process executable file to the second host.
In some embodiments, monitoring the resource configuration of each host includes: acquiring the CPU utilization, the memory utilization, and the file system utilization of the host, performing a weighted calculation using a preset CPU utilization weight, a preset memory utilization weight, and a preset file system utilization weight to obtain the resource configuration of the host, and comparing the resource configuration with the threshold.
In some embodiments, the method further comprises: setting a standby host in the cluster, configuring the hosts currently in use into a host list, and setting the standby host to a dynamic increasing mode.
In some embodiments, the method further comprises: in response to no second host in the cluster responding to the broadcast, sending the process with the largest resource consumption to the standby host for execution.
In another aspect of the embodiments of the present invention, a system for process cooperation in a cluster is provided, including: the configuration module is configured to configure all the hosts to the same network segment, establish communication between each host and other hosts, and monitor the resource configuration condition of each host; the computing module is configured to respond to the condition that the resource configuration of a first host exceeds a threshold value, compute the resource consumption of each process in the first host and determine the process with the maximum resource consumption; the broadcasting module is configured to initiate a migration broadcast to a cluster and judge whether a second host exists in the cluster to respond to the broadcast; and a sending module configured to send, in response to a second host existing in the cluster responding to the broadcast, a process with the largest resource consumption to the second host.
In another aspect of the embodiments of the present invention, there is also provided a computer device, including: at least one processor; and a memory storing computer instructions executable on the processor, the instructions when executed by the processor implementing the steps of the method as above.
In a further aspect of the embodiments of the present invention, a computer-readable storage medium is also provided, storing a computer program that, when executed by a processor, implements the steps of the above method.
The invention has the following beneficial technical effects: the operating system cluster is treated as a single whole, and no master host needs to be elected or assigned, which reduces the risk of split-brain; by setting a resource configuration threshold, the process with the highest resource consumption on a server that has reached the threshold is migrated to another host for execution, and when no other host can run the process, the resource access of the process is limited, which avoids server downtime or even a cascading avalanche and improves server availability; hosts interact by broadcast, which reduces the interaction frequency.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that the expressions "first" and "second" in the embodiments of the present invention are used to distinguish two different entities or two different parameters that share the same name. "First" and "second" are used merely for convenience of description and should not be construed as limiting the embodiments of the present invention; this is not repeated in the following embodiments.
In a first aspect of the embodiments of the present invention, an embodiment of a method for process cooperation in a cluster is provided. Fig. 1 is a schematic diagram illustrating an embodiment of a method for process cooperation in a cluster according to the present invention. As shown in fig. 1, the embodiment of the present invention includes the following steps:
S1, configuring all hosts in the same network segment, establishing communication between each host and the other hosts, and monitoring the resource configuration of each host;
S2, in response to the resource configuration of a first host exceeding a threshold, calculating the resource consumption of each process in the first host, and determining the process with the largest resource consumption;
S3, initiating a migration broadcast to the cluster, and determining whether a second host in the cluster responds to the broadcast;
S4, in response to a second host in the cluster responding to the broadcast, sending the process with the largest resource consumption to the second host; and
S5, configuring the cluster working mode as one of three working modes: dynamic detection, fixed host list, or main/standby host list.
All the hosts are configured in the same network segment, the communication between each host and other hosts is established, and the resource configuration condition of each host is monitored.
All hosts are formed into a cluster by building a local area network. The hosts are configured into the same network segment; for example, after the cluster is configured as the 192.168.11.0/24 network segment, every host configured into that segment is a cluster host, and the hosts can ping each other. The cluster is configured by modifying the configuration /etc/rpcha. The configuration contents include the cluster network segment address, the cluster working mode, and the cluster resource configuration threshold. The cluster network segment address is given as network segment address + subnet mask; the cluster working mode can be set to one of three working modes: dynamic detection, fixed host list, or main/standby host list; the resource configuration threshold of the cluster is the lowest resource configuration value at which the operating system performs process migration.
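As a minimal sketch of the three configuration items above, the following Python fragment models them in memory and checks whether an address belongs to the cluster segment. The class name `ClusterConfig`, the mode labels, and the function `is_cluster_host` are illustrative assumptions; the source does not specify the syntax of the configuration itself.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterConfig:
    """Hypothetical in-memory form of the three configuration items."""
    segment: ipaddress.IPv4Network  # network segment address + subnet mask
    mode: str                       # assumed labels: "dynamic", "fixed", "main_standby"
    threshold: int                  # percentage, e.g. 80 means 80%


def is_cluster_host(config: ClusterConfig, address: str) -> bool:
    """A host is a cluster host if its address lies in the configured segment."""
    return ipaddress.ip_address(address) in config.segment


config = ClusterConfig(ipaddress.ip_network("192.168.11.0/24"), "dynamic", 80)
print(is_cluster_host(config, "192.168.11.37"))  # True
print(is_cluster_host(config, "192.168.12.5"))   # False
```

Defining membership purely by network segment matches the description that every host configured into the segment is a cluster host, with no separate membership list in this mode.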
The servers are configured in the same network segment, and the cluster is defined by that network segment, so that the hosts in the network segment can communicate with each other. No master or slave machines are designated: all machines have equal status, and every host in the network segment is a member of the cluster.
In the context of the present application, the network IP is consistent to the outside: after configuration is completed, an external-facing IP address may be configured for the cluster. This IP address points to the cluster, and the cluster internally determines which host responds. For example, in a cluster of hosts A, B, and C, an IP address pointing to host A may be redirected to another host immediately after host A goes down. The cluster therefore appears consistent from the outside, and all access through the IP address proceeds normally. Users do not perceive that a host switch has occurred internally, which improves user satisfaction. Further, the IP address may be organized in the form of a floating IP.
There are three cluster operating modes, and each operating mode determines the interaction mode of the servers in the cluster. The three working modes are respectively as follows:
Dynamic detection: dynamic detection is the dynamic discovery of hosts within the cluster network. Because only hosts configured in the network segment are regarded as nodes in the cluster, dynamic detection is implemented mainly by having each new host initiate a broadcast into the local area network when it joins the cluster. The broadcast content is the address and the name of the new host. After receiving the broadcast message sent by the new host, the other existing hosts in the cluster automatically record the basic information of the new host node. For example, suppose the current cluster has 10 hosts. To add a host node, the server only needs to be configured into the corresponding network and then added to the cluster through the cluster initialization command (rpcha-init). The command initiates a local area network broadcast to inform the other hosts in the cluster, each host records the IP and name of the newly added node, and the cluster then has 11 hosts.
Fixed host list: the fixed host list is the opposite of dynamic detection. Host nodes cannot be configured dynamically; only specified nodes can be fixedly configured as host list nodes. If a host node needs to be changed, for example brought online or taken offline, the configuration file must be modified uniformly. This mode is relatively rigid in operation and cannot dynamically add or remove hosts, but it offers finer-grained control. It is suitable for host clusters that are well defined and stable.
Main/standby host list: this mode combines the two configurations above and is a mechanism in which main hosts and standby hosts operate together. The main hosts use the basic configuration mode, and the standby hosts use the dynamic increasing mode. For example, a cluster uses 5 servers as main hosts, which is sufficient under ordinary conditions. The standby hosts are in the same network segment but are not configured in the host list, so they are added dynamically. When a new host needs to be added to the cluster, the server is first configured into the network segment and then initialized with the initialization command, in the same way as in the dynamic detection scheme.
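The difference between the three working modes can be sketched as how a host reacts to a join broadcast from a new node. The Python below is an illustration only: the class `HostRegistry`, the mode labels, and the method name are assumptions, and the network layer is replaced by a direct method call.

```python
from dataclasses import dataclass, field


@dataclass
class HostRegistry:
    """Per-host record of known cluster nodes (hypothetical structure)."""
    mode: str                                    # "dynamic", "fixed", or "main_standby"
    hosts: dict = field(default_factory=dict)    # name -> address, the host list
    standby: dict = field(default_factory=dict)  # name -> address, dynamically added standbys

    def on_join_broadcast(self, name: str, address: str) -> bool:
        """Handle a new host's broadcast; return True if the host was recorded."""
        if self.mode == "dynamic":
            # Dynamic detection: every announcing host becomes a cluster node.
            self.hosts[name] = address
            return True
        if self.mode == "main_standby" and name not in self.hosts:
            # Main/standby: the host list stays fixed, newcomers become standbys.
            self.standby[name] = address
            return True
        # Fixed host list: changes require editing the configuration instead.
        return False


# Example from the text: a cluster of 10 hosts grows to 11 under dynamic detection.
registry = HostRegistry("dynamic", {f"host{i}": f"192.168.11.{i}" for i in range(1, 11)})
registry.on_join_broadcast("host11", "192.168.11.11")
print(len(registry.hosts))  # 11
```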
In some embodiments, the monitoring the resource configuration of each host includes: acquiring the CPU utilization rate, the memory utilization rate and the file system utilization rate of the host, carrying out weighted calculation according to the preset CPU utilization rate weight, the preset memory utilization rate weight and the preset file system utilization rate weight to obtain the resource configuration of the host, and comparing the resource configuration with the threshold.
When the resource configuration of a host reaches the threshold, the host cannot continue to take on new processes. The threshold is configured as a percentage; for example, 80% is configured as 80. The resource configuration value that is compared with the threshold is obtained by weighted calculation. For example, with a CPU utilization weight of 5, a memory utilization weight of 3, and a file system utilization weight of 2, and with a CPU utilization of 100%, a memory utilization of 60%, and a file system utilization of 90%, the weighted value is (5 × 100 + 3 × 60 + 2 × 90) / 10 = 86. With a threshold of 80, for instance, process migration is required at this point. The same algorithm is then applied to each process to calculate its weighted consumption, and the process with the highest weighted consumption is selected for migration.
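The weighted calculation can be checked with a short Python sketch. The weights and host utilization figures are the ones from the example above; the function name, the threshold of 80, and the per-process figures are illustrative assumptions.

```python
def weighted_score(usage: dict, weights: dict) -> float:
    """Weighted average of utilization percentages, normalized by the weight sum."""
    return sum(weights[k] * usage[k] for k in weights) / sum(weights.values())


WEIGHTS = {"cpu": 5, "memory": 3, "filesystem": 2}
THRESHOLD = 80  # percentage; assumed value, matching the 80% example

# Host-level figures from the text: (5*100 + 3*60 + 2*90) / 10 = 86
host_usage = {"cpu": 100, "memory": 60, "filesystem": 90}
score = weighted_score(host_usage, WEIGHTS)
needs_migration = score >= THRESHOLD  # the host has reached the threshold

# Hypothetical per-process utilization, scored with the same weights.
processes = {
    "demo":  {"cpu": 70, "memory": 20, "filesystem": 10},
    "db":    {"cpu": 20, "memory": 35, "filesystem": 70},
    "cache": {"cpu": 10, "memory": 5,  "filesystem": 10},
}
heaviest = max(processes, key=lambda name: weighted_score(processes[name], WEIGHTS))

print(score, needs_migration, heaviest)  # 86.0 True demo
```

Dividing by the sum of the weights keeps the result on the same 0 to 100 scale as the configured threshold, so the two can be compared directly.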
In response to the resource configuration of the first host exceeding the threshold, the resource consumption of each process in the first host is calculated, and the process with the largest resource consumption is determined.
A migration broadcast is initiated to the cluster, and it is determined whether a second host in the cluster responds to the broadcast. After the process to be migrated has been selected, a broadcast is initiated into the cluster; a server that receives the broadcast responds if it is able to take on a new process.
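One way to read the response rule is that a receiving host answers the migration broadcast only while its own weighted resource configuration is below the threshold. The sketch below simulates this in memory; the function names and host scores are hypothetical, and no real network broadcast is performed.

```python
def should_respond(own_score: float, threshold: float) -> bool:
    """A receiver answers only if it can still take on new processes."""
    return own_score < threshold


def collect_responses(scores_by_host: dict, threshold: float) -> dict:
    """Simulate the broadcast: return the hosts that would answer, with their scores."""
    return {
        host: score
        for host, score in scores_by_host.items()
        if should_respond(score, threshold)
    }


# Hypothetical weighted scores of the other hosts in the cluster.
cluster_scores = {"host-b": 91.0, "host-c": 42.5, "host-d": 67.0}
responders = collect_responses(cluster_scores, threshold=80)
print(responders)  # {'host-c': 42.5, 'host-d': 67.0}
```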
In response to a second host in the cluster responding to the broadcast, the process with the largest resource consumption is sent to the second host.
In some embodiments, sending the process with the largest resource consumption to the second host includes: in response to a plurality of second hosts responding to the broadcast, acquiring the current resource utilization of each second host, and sending the process with the largest resource consumption to the second host with the lowest current resource utilization.
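Selecting among several responding hosts reduces to taking the minimum over their reported utilization. A minimal Python sketch, with a hypothetical function name and sample data:

```python
from typing import Optional


def choose_target(responders: dict) -> Optional[str]:
    """Return the responder with the lowest current utilization, or None if nobody answered."""
    if not responders:
        return None
    return min(responders, key=responders.get)


# Hypothetical utilization reported by the hosts that answered the broadcast.
print(choose_target({"host-c": 42.5, "host-d": 67.0}))  # host-c
print(choose_target({}))                                # None
```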
In some embodiments, sending the process with the largest resource consumption to the second host includes: suspending the process with the largest resource consumption, compressing the memory state of the process, and sending the compressed process memory image and the process executable file to the second host. After receiving the response, the host that initiated the broadcast suspends the process, compresses its memory state, and sends the compressed process memory image and the process executable file to the new host by file copy over the cluster local area network; the new host then continues running the process.
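The packaging step can be illustrated with the standard `zlib` module. This sketch covers only compression and restoration of an image that has already been captured; how a live process is suspended and its memory captured is not specified in the source, so a byte string stands in for the image, and the `MigrationPackage` structure is an assumption.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationPackage:
    """Hypothetical bundle sent to the second host."""
    process_name: str
    compressed_image: bytes  # compressed process memory image
    executable: bytes        # the process executable file, sent alongside


def pack(process_name: str, memory_image: bytes, executable: bytes) -> MigrationPackage:
    """Compress the memory image before transfer over the cluster LAN."""
    return MigrationPackage(process_name, zlib.compress(memory_image), executable)


def unpack_image(package: MigrationPackage) -> bytes:
    """Restore the original memory image on the receiving host."""
    return zlib.decompress(package.compressed_image)


# Stand-in data: a mostly empty 4 KiB page compresses very well.
image = b"\x00" * 4000 + b"state" + b"\x00" * 91
package = pack("demo", image, b"#!/bin/sh\n")
print(len(image), len(package.compressed_image))
```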
In some embodiments, the method further comprises: in response to no second host in the cluster responding to the broadcast, limiting the resource usage of the process with the largest resource consumption in the first host. If no host is capable of running the process, the resource limitation strategy (cgroup) is used to limit the resource usage of the current process. CPU utilization, memory utilization, and file system utilization are limited, and a warning is given to the user that the cluster is currently running in a low-speed state and is waiting for human intervention, either to modify the cluster resources or to reconfigure the processes uniformly.
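As an illustration of what such a limit might look like, the sketch below computes values for the cgroup v2 control files `cpu.max` and `memory.max`. The source only says "cgroup", so the choice of cgroup v2 is an assumption, as are the function name and the sample limits. The code only builds the values and does not write to the cgroup file system; file system limits are left out.

```python
def cgroup_v2_limits(cpu_percent: int, memory_bytes: int, period_us: int = 100_000) -> dict:
    """Build cgroup v2 control-file contents for a CPU and memory cap.

    cpu.max takes "<quota> <period>" in microseconds; memory.max takes bytes.
    cpu_percent is relative to a single CPU.
    """
    if not 0 < cpu_percent <= 100:
        raise ValueError("cpu_percent must be in (0, 100]")
    quota_us = period_us * cpu_percent // 100
    return {
        "cpu.max": f"{quota_us} {period_us}",
        "memory.max": str(memory_bytes),
    }


# Hypothetical cap: half of one CPU and 512 MiB of memory.
limits = cgroup_v2_limits(cpu_percent=50, memory_bytes=512 * 1024 * 1024)
print(limits)  # {'cpu.max': '50000 100000', 'memory.max': '536870912'}
```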
In some embodiments, the method further comprises: setting a standby host in the cluster, configuring the hosts currently in use into a host list, and setting the standby host to a dynamic increasing mode.
In some embodiments, the method further comprises: in response to no second host in the cluster responding to the broadcast, sending the process with the largest resource consumption to the standby host for execution.
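The embodiments above describe what happens when a second host responds and two alternatives for when none does. One possible way to combine them is sketched below; the ordering (responder first, then standby host, then local limiting) is an assumption of this sketch, as are the function name and the action labels.

```python
from typing import Optional, Tuple


def decide(responders: dict, standby_hosts: list) -> Tuple[str, Optional[str]]:
    """Pick an action for the process with the largest resource consumption.

    responders maps responding second hosts to their current utilization;
    standby_hosts lists dynamically added standby hosts, if any.
    """
    if responders:
        # A second host responded: migrate to the least utilized one.
        return ("migrate", min(responders, key=responders.get))
    if standby_hosts:
        # Nobody responded, but a standby host is available.
        return ("migrate_to_standby", standby_hosts[0])
    # No host can run the process: limit its resource usage locally and warn.
    return ("limit_locally", None)


print(decide({"host-c": 42.5, "host-d": 67.0}, ["spare-1"]))  # ('migrate', 'host-c')
print(decide({}, ["spare-1"]))                                # ('migrate_to_standby', 'spare-1')
print(decide({}, []))                                         # ('limit_locally', None)
```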
The invention treats the whole operating system cluster as a single entity and does not need to elect a master host or have a master host distribute tasks. By default, each operating system executes its tasks on its own physical device. Only when a given condition is reached does a server select a process that the local machine can no longer continue to execute, suspend it, save its current state, and hand that state over to another server with available resources to continue the task. To ensure that the process can keep running, if no other host can run it, a sandbox mechanism is started: the local machine continues to run the process, but the resource access of the process is limited through the operating system's resource limitation scheme (cgroup), and an early warning is sent to the server administrator that the server is in a limited state and hosts need to be added. This prevents a server avalanche, because limiting resource access ensures that the server at least keeps running without crashing. The invention uses broadcast for interaction and does not rely on a heartbeat mechanism. A heartbeat mechanism requires persistent connections and frequent interaction, whereas broadcast is infrequent: under normal conditions no communication is needed, and the operating systems respond to one another only when interaction is required, which reduces the interaction frequency.
It should be particularly noted that the steps in the embodiments of the method for process cooperation in a cluster described above may be interchanged, replaced, added, or deleted. Methods for process cooperation in a cluster obtained by such reasonable permutations and combinations therefore also fall within the scope of the present invention, and the scope of the present invention should not be limited to the described embodiments.
Based on the above object, a second aspect of the embodiments of the present invention provides a system for process cooperation in a cluster. As shown in fig. 2, the system 200 includes the following modules: a configuration module configured to configure all hosts in the same network segment, establish communication between each host and the other hosts, and monitor the resource configuration of each host; a computing module configured to, in response to the resource configuration of a first host exceeding a threshold, calculate the resource consumption of each process in the first host and determine the process with the largest resource consumption; a broadcasting module configured to initiate a migration broadcast to the cluster and determine whether a second host in the cluster responds to the broadcast; a sending module configured to, in response to a second host in the cluster responding to the broadcast, send the process with the largest resource consumption to the second host; and a cluster working mode configuration module configured to set the cluster working mode to one of three working modes, namely dynamic detection, fixed host list, and main/standby host list.
In some embodiments, the system further comprises a restriction module configured to: in response to no second host in the cluster responding to the broadcast, limiting resource usage by a process in the first host having a greatest resource consumption.
In some embodiments, the sending module is configured to: in response to a plurality of second hosts responding to the broadcast, acquire the current resource utilization of each second host, and send the process with the largest resource consumption to the second host with the lowest current resource utilization.
In some embodiments, the sending module is configured to: suspend the process with the largest resource consumption, compress the memory state of the process, and send the compressed process memory image and the process executable file to the second host.
In some embodiments, the configuration module is configured to: acquire the CPU utilization, the memory utilization, and the file system utilization of the host, perform a weighted calculation using a preset CPU utilization weight, a preset memory utilization weight, and a preset file system utilization weight to obtain the resource configuration of the host, and compare the resource configuration with the threshold.
In some embodiments, the system further comprises a backup module configured to: set a standby host in the cluster, configure the hosts currently in use into a host list, and set the standby host to a dynamic increasing mode.
In some embodiments, the system further comprises a second sending module configured to: in response to no second host in the cluster responding to the broadcast, send the process with the largest resource consumption to the standby host for execution.
The embodiment of the invention can be implemented by an operating system command tool, a cluster synchronization device, an image compression and backup device, and a configuration device. The operating system command tool is responsible for interacting with the user, who calls it to configure the cluster. The cluster synchronization device is responsible for communication within the cluster, mainly sending and receiving broadcasts and exchanging messages between hosts. The image compression and backup device is responsible for compressing the process to be migrated and, after synchronization is completed through the cluster synchronization device, backing up the image and starting a new process on another server in the cluster. The configuration device is responsible for reading and parsing the configuration, storing the parsed result, and interacting through the synchronization device to check whether the configurations of the other servers in the cluster are consistent.
A virtual subnet may be configured so that the servers within a cluster are in the same network segment. After configuration is completed, any server in the cluster is selected, and initialization is performed using the operating system command tool rpc_tools --init. The command tool first reads the configuration via the configuration device. After the reading is finished, a broadcast is initiated to the cluster through the cluster synchronization device. Each machine that receives the broadcast responds to the initialization operation and initializes its local configuration through its local configuration device. After its initialization is completed, each server initiates a broadcast through the synchronization device. The configuration is then complete.
After configuration is complete, processes may be started on the server cluster. The user selects a server, logs in, and starts the process using a command line tool, for example to start a process named demo. By default the process is started on the local machine. If the server reaches the configured threshold, the process with the largest resource consumption is selected, a broadcast is initiated, and a host that can take over the process is sought. After receiving the broadcast, the other servers reply through the synchronization device whether they can host the process. The server that initiated the hosting broadcast selects a hosting host. After the selection, the current process and the server state are compressed by the image compression and backup device and transmitted to the new host, and the local process is killed. If no host can be found, the resource usage of the process is limited by means of resource limitation, and alarm information is sent.
In view of the above object, a third aspect of the embodiments of the present invention provides a computer device, including: at least one processor; and a memory storing computer instructions executable on the processor, the instructions being executed by the processor to perform the following steps: S1, configuring all hosts in the same network segment, establishing communication between each host and the other hosts, and monitoring the resource configuration of each host; S2, in response to the resource configuration of a first host exceeding a threshold, calculating the resource consumption of each process in the first host, and determining the process with the largest resource consumption; S3, initiating a migration broadcast to the cluster, and determining whether a second host in the cluster responds to the broadcast; and S4, in response to a second host in the cluster responding to the broadcast, sending the process with the largest resource consumption to the second host.
In some embodiments, the steps further comprise: in response to no second host in the cluster responding to the broadcast, limiting resource usage by a process in the first host having a greatest resource consumption.
In some embodiments, sending the process with the largest resource consumption to the second host includes: in response to a plurality of second hosts responding to the broadcast, acquiring the current resource utilization of each second host, and sending the process with the largest resource consumption to the second host with the lowest current resource utilization.
In some embodiments, sending the process with the largest resource consumption to the second host includes: suspending the process with the largest resource consumption, compressing the memory state of the process, and sending the compressed process memory image and the process executable file to the second host.
In some embodiments, monitoring the resource configuration of each host includes: acquiring the CPU utilization, the memory utilization, and the file system utilization of the host, performing a weighted calculation using a preset CPU utilization weight, a preset memory utilization weight, and a preset file system utilization weight to obtain the resource configuration of the host, and comparing the resource configuration with the threshold.
In some embodiments, the steps further comprise: setting a standby host in the cluster, configuring the hosts currently in use into a host list, and setting the standby host to a dynamic increasing mode.
In some embodiments, the steps further comprise: in response to no second host in the cluster responding to the broadcast, sending the process with the largest resource consumption to the standby host for execution.
Fig. 3 is a schematic hardware structural diagram of an embodiment of a computer device for process cooperation in a cluster according to the present invention.
Taking the device shown in fig. 3 as an example, the device includes a processor 301 and a memory 302.
The processor 301 and the memory 302 may be connected by a bus or other means, such as the bus connection in fig. 3.
The memory 302 is used as a non-volatile computer readable storage medium for storing non-volatile software programs, non-volatile computer executable programs, and modules, such as program instructions/modules corresponding to the method for process cooperation in a cluster in the embodiment of the present application. The processor 301 executes various functional applications of the server and data processing by running nonvolatile software programs, instructions, and modules stored in the memory 302, that is, implements the method of process cooperation in a cluster of the above-described method embodiments.
The memory 302 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the method of process cooperation in the cluster, and the like. Further, the memory 302 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 302 optionally includes memory located remotely from the processor 301, which may be connected to a local module via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Computer instructions 303 corresponding to one or more methods for process cooperation in a cluster are stored in the memory 302 and, when executed by the processor 301, perform the method for process cooperation in a cluster in any of the above-described method embodiments.
Any embodiment of the computer device executing the method for process cooperation in a cluster can achieve the same or similar effects as any corresponding embodiment of the method.
The invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, performs the method as above.
FIG. 4 is a schematic diagram of an embodiment of a computer storage medium for process cooperation in a cluster according to the present invention. Taking the computer storage medium shown in fig. 4 as an example, the computer-readable storage medium 401 stores a computer program 402 which, when executed by a processor, performs the method as described above.
Finally, it should be noted that, as one of ordinary skill in the art can appreciate, all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing relevant hardware. The program of the method for process cooperation in a cluster can be stored in a computer-readable storage medium, and when executed, the program can include the processes of the embodiments of the methods described above. The storage medium of the program may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like. The embodiments of the computer program may achieve the same or similar effects as any of the above-described method embodiments.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The numbers of the embodiments disclosed in the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is meant to be exemplary only and is not intended to imply that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples. Within the idea of an embodiment of the invention, technical features in the above embodiment or in different embodiments may also be combined, and there are many other variations of the different aspects of the embodiments of the invention as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.