CN106301895A - Disaster recovery method and device for acquiring cluster monitoring data - Google Patents


Info

Publication number
CN106301895A
Authority
CN
China
Prior art keywords
cluster
monitoring data
monitoring
node
disaster recovery
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610627210.2A
Other languages
Chinese (zh)
Inventor
周龙飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-08-03
Filing date
2016-08-03
Publication date
2017-01-04
Application filed by Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN201610627210.2A
Publication of CN106301895A
Legal status: Pending

Abstract

Translated from Chinese

The invention discloses a disaster recovery method and device for acquiring cluster monitoring data. Each cluster node collects its local monitoring information and obtains the monitoring information of all other nodes in the cluster; the monitoring data of the cluster is obtained through a designated cluster master node; when the cluster master node is detected to be down, another node is selected as the current cluster master node, and the monitoring data is obtained through it. The invention departs from earlier approaches in which management software obtained monitoring information inefficiently, without a disaster recovery scheme, or with an overly long disaster recovery wait. Taking the particular nature of a cluster environment into account, it turns the collection of monitoring information and the disaster recovery scheme into a 1+N model, which eliminates the disaster recovery waiting time for the management software to obtain data. While keeping the monitoring information safe, it saves collection time and bandwidth and allows the data source to be switched seamlessly.

Description

Translated from Chinese
Disaster recovery method and device for acquiring cluster monitoring data

Technical field

The invention relates to the technical field of data disaster recovery, and in particular to a disaster recovery method and device for acquiring cluster monitoring data.

Background

Cluster monitoring information is generally collected by a separate data collection server, which gathers the monitoring information of each cluster node, with additional backup equipment added for protection. Alternatively, every cluster node collects the monitoring information of all nodes by broadcast, a separate data collection server then reads the monitoring information from a single node, and finally the management software obtains the data from the data collection server.

Having every node collect the monitoring information of all nodes is costly in time. Storing the monitoring information in a database and protecting that database through disaster recovery is costly in disaster recovery investment. A one-to-one arrangement, in which the management software obtains monitoring data from a single source, also increases the time users wait for the data source to switch during disaster recovery.

Summary of the invention

The purpose of the present invention is to provide a disaster recovery method and device for acquiring cluster monitoring data that reduce system bandwidth consumption and the cost of data disaster recovery while ensuring the reliability of monitoring information disaster recovery.

To solve the above technical problem, the present invention provides a disaster recovery method for acquiring cluster monitoring data, including:

each cluster node collecting its local monitoring information and obtaining the monitoring information of all other nodes in the cluster;

obtaining the monitoring data of the cluster through a designated cluster master node;

when the cluster master node is detected to be down, selecting another node as the current cluster master node and obtaining the monitoring data through the current cluster master node.

Optionally, obtaining the monitoring data of the cluster through the designated cluster master node includes:

obtaining the monitoring data via telnet according to the configured cluster master node IP.

Optionally, selecting another node as the current cluster master node when the cluster master node is detected to be down includes:

when the cluster master node is detected to be down, obtaining the monitoring data via telnet according to the specified IP of the current cluster master node.

Optionally, obtaining the monitoring information of all other nodes in the cluster includes:

obtaining the monitoring information of all other nodes in the cluster through the preconfigured broadcast paths of all cluster nodes.

Optionally, after obtaining the monitoring information of all other nodes in the cluster, the method further includes:

saving all of the obtained monitoring information to the database of the local node.

The present invention also provides a disaster recovery device for acquiring cluster monitoring data, including:

a collection module, used for each cluster node to collect its local monitoring information and obtain the monitoring information of all other nodes in the cluster;

an acquisition module, used to obtain the monitoring data of the cluster through a designated cluster master node;

a disaster recovery module, used to select another node as the current cluster master node when the cluster master node is detected to be down, and to obtain the monitoring data through the current cluster master node.

Optionally, the acquisition module is specifically used for:

obtaining the monitoring data via telnet according to the configured cluster master node IP.

Optionally, the disaster recovery module is specifically used for:

when the cluster master node is detected to be down, obtaining the monitoring data via telnet according to the specified IP of the current cluster master node.

Optionally, the collection module is specifically used for:

obtaining the monitoring information of all other nodes in the cluster through the preconfigured broadcast paths of all cluster nodes.

Optionally, the device further includes:

a storage module, used to save all of the obtained monitoring information to the database of the local node after the monitoring information of all other nodes in the cluster has been obtained.

In the disaster recovery method and device for acquiring cluster monitoring data provided by the present invention, each cluster node collects its local monitoring information and obtains the monitoring information of all other nodes in the cluster; the monitoring data of the cluster is obtained through a designated cluster master node; when the cluster master node is detected to be down, another node is selected as the current cluster master node, and the monitoring data is obtained through it. The invention departs from earlier approaches in which management software obtained monitoring information inefficiently, without a disaster recovery scheme, or with an overly long disaster recovery wait. Taking the particular nature of a cluster environment into account, it turns the collection of monitoring information and the disaster recovery scheme into a 1+N model, which eliminates the disaster recovery waiting time for the management software to obtain data. While keeping the monitoring information safe, it saves collection time and bandwidth and allows the data source to be switched seamlessly.

Description of the drawings

To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow chart of one specific embodiment of the disaster recovery method for acquiring cluster monitoring data provided by the present invention;

Fig. 2 is a schematic diagram of the initial state of the cluster in another embodiment of the disaster recovery method for acquiring cluster monitoring data provided by the present invention;

Fig. 3 is a schematic diagram of the process of broadcasting node monitoring information and saving monitoring data in that embodiment;

Fig. 4 is a schematic diagram of the process by which the management software obtains the cluster monitoring data in that embodiment;

Fig. 5 is a schematic diagram of the disaster recovery process in that embodiment;

Fig. 6 is a structural block diagram of a disaster recovery device for acquiring cluster monitoring data provided by an embodiment of the present invention.

Detailed description

To help those skilled in the art better understand the solution of the present invention, the invention is described in further detail below with reference to the drawings and specific embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.

A flow chart of one specific embodiment of the disaster recovery method for acquiring cluster monitoring data provided by the present invention is shown in Fig. 1. The method includes:

Step S101: each cluster node collects its local monitoring information and obtains the monitoring information of all other nodes in the cluster;

Step S102: the monitoring data of the cluster is obtained through a designated cluster master node;

Step S103: when the cluster master node is detected to be down, another node is selected as the current cluster master node, and the monitoring data is obtained through the current cluster master node.

In the disaster recovery method for acquiring cluster monitoring data provided by the present invention, each cluster node collects its local monitoring information and obtains the monitoring information of all other nodes in the cluster; the monitoring data of the cluster is obtained through a designated cluster master node; when the cluster master node is detected to be down, another node is selected as the current cluster master node, and the monitoring data is obtained through it. The invention departs from earlier approaches in which management software obtained monitoring information inefficiently, without a disaster recovery scheme, or with an overly long disaster recovery wait. Taking the particular nature of a cluster environment into account, it turns the collection of monitoring information and the disaster recovery scheme into a 1+N model, which eliminates the disaster recovery waiting time for the management software to obtain data. While keeping the monitoring information safe, it saves collection time and bandwidth and allows the data source to be switched seamlessly.
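Steps S101 to S103 can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: the `Node` class, the placeholder record returned by `collect_local`, and the rule of promoting the first live node in list order are all hypothetical, since the source specifies neither a record format nor a selection policy.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One cluster node holding a local copy of every node's monitoring record."""
    name: str
    alive: bool = True
    store: dict = field(default_factory=dict)

    def collect_local(self) -> dict:
        # Placeholder record (assumption); a real node would read CPU, memory, etc.
        return {"node": self.name, "status": "ok"}


def broadcast_round(nodes: list[Node]) -> None:
    """S101: each live node collects its own record and sends it to every live node."""
    for sender in nodes:
        if not sender.alive:
            continue
        record = sender.collect_local()
        for receiver in nodes:
            if receiver.alive:
                receiver.store[sender.name] = record


def read_cluster_data(nodes: list[Node], master: Node) -> tuple[Node, dict]:
    """S102/S103: read from the master; if it is down, promote another live node."""
    if not master.alive:
        # Assumed policy: the first live node in list order becomes the new master.
        # Raises StopIteration if no node is alive.
        master = next(n for n in nodes if n.alive)
    return master, dict(master.store)


nodes = [Node("node1"), Node("node2"), Node("node3")]
broadcast_round(nodes)
master, data = read_cluster_data(nodes, nodes[0])
nodes[0].alive = False  # simulate the master going down
new_master, data_after = read_cluster_data(nodes, nodes[0])
```

Because every live node already holds the full set of records after a broadcast round, the data read from the promoted node matches what the old master held, which is the property the 1+N model relies on.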

On the basis of the above embodiment, in the disaster recovery method for acquiring cluster monitoring data provided by the present invention, the process of obtaining the monitoring data of the cluster through the designated cluster master node may specifically include:

obtaining the monitoring data via telnet according to the configured cluster master node IP.

Further, the process of selecting another node as the current cluster master node when the cluster master node is detected to be down may specifically be:

when the cluster master node is detected to be down, obtaining the monitoring data via telnet according to the specified IP of the current cluster master node.

On the basis of any of the above embodiments, the process by which each node in this application obtains the monitoring information of all other nodes in the cluster is:

obtaining the monitoring information of all other nodes in the cluster through the preconfigured broadcast paths of all cluster nodes.

As a specific implementation, after the monitoring information of all other nodes in the cluster has been obtained, the method may further include:

saving all of the obtained monitoring information to the database of the local node.

Specifically, the disaster recovery method for acquiring cluster monitoring data provided by the present invention can be implemented by a cluster health monitoring and IP configuration module, a monitoring broadcast configuration module, and a monitoring data agent module.

The hardware environment of this application is a cluster environment. Once the cluster environment has been set up, the monitoring broadcast configuration module configures the broadcast paths of all cluster nodes, and the monitoring information database on every node of the cluster is initialized to the same state. Each cluster node collects only its own local monitoring information and then broadcasts it to all other nodes; the monitoring data agent module on each node then saves all of the monitoring information to that node's database.
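The collect, broadcast, and save flow described above might look roughly like the following. The source names neither a database engine nor a schema, so everything here is an assumption made for illustration: SQLite via Python's `sqlite3` module, the `monitoring` table, the `cpu_percent` metric, and the in-process `broadcast` function that stands in for the configured broadcast paths.

```python
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS monitoring (
    node TEXT PRIMARY KEY,   -- name of the node the record describes
    cpu_percent REAL,        -- hypothetical metric
    collected_at REAL        -- Unix timestamp set by the sender
)
"""


class MonitoringAgent:
    """Stand-in for the monitoring data agent module on one node."""

    def __init__(self, node: str, db_path: str = ":memory:"):
        self.node = node
        self.db = sqlite3.connect(db_path)
        self.db.execute(SCHEMA)  # every node starts from the same empty schema

    def collect_local(self, cpu_percent: float) -> dict:
        # Each node gathers only its own metrics.
        return {"node": self.node, "cpu_percent": cpu_percent,
                "collected_at": time.time()}

    def save(self, record: dict) -> None:
        # Keep the latest record per node; an older one is overwritten.
        self.db.execute(
            "INSERT OR REPLACE INTO monitoring "
            "VALUES (:node, :cpu_percent, :collected_at)",
            record,
        )
        self.db.commit()

    def known_nodes(self) -> list:
        rows = self.db.execute("SELECT node FROM monitoring ORDER BY node")
        return [row[0] for row in rows]


def broadcast(record: dict, peers: list) -> None:
    # In-process stand-in for the broadcast paths; the sender is among the peers.
    for peer in peers:
        peer.save(record)


agents = [MonitoringAgent(f"node{i}") for i in (1, 2, 3)]
for i, agent in enumerate(agents):
    broadcast(agent.collect_local(cpu_percent=10.0 * (i + 1)), agents)
```

After one round, `known_nodes()` on any agent lists all three nodes, so any node's database can serve the full cluster view.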

The cluster health monitoring and IP configuration module designates one cluster node as the cluster master node and sets the master node IP from which the management software obtains monitoring data. The management software obtains the monitoring data from the database on the master node via telnet or a similar method. When the master node goes down, the cluster health monitoring and IP configuration module selects another node, promotes it to master node, and sets the node IP from which the management software obtains monitoring data, thereby switching the node that the management software reads from.
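A minimal sketch of the designation and re-selection logic follows. The source does not say how an outage is detected or which node is promoted, so the injected `is_alive` callable, the "first healthy node in configured order" rule, and the example IP addresses are all assumptions.

```python
from typing import Callable, Optional


class HealthMonitor:
    """Stand-in for the cluster health monitoring and IP configuration module."""

    def __init__(self, node_ips: list, is_alive: Callable[[str], bool]):
        self.node_ips = list(node_ips)
        self.is_alive = is_alive  # injected liveness check (hypothetical)
        self.master_ip: Optional[str] = self.node_ips[0]

    def check(self) -> Optional[str]:
        """Return the IP the management software should read from, or None."""
        if self.master_ip is not None and self.is_alive(self.master_ip):
            return self.master_ip  # master healthy: nothing changes
        # Master is down: promote the first healthy node in configured order.
        self.master_ip = next(
            (ip for ip in self.node_ips if self.is_alive(ip)), None
        )
        return self.master_ip


down = set()
monitor = HealthMonitor(["10.0.0.1", "10.0.0.2", "10.0.0.3"],
                        lambda ip: ip not in down)
first = monitor.check()
down.add("10.0.0.1")                   # simulate the master going down
second = monitor.check()
down.update({"10.0.0.2", "10.0.0.3"})  # simulate a total outage
third = monitor.check()
```

Injecting the liveness check keeps the selection logic testable without a network; a real deployment would supply its own probe.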

The implementation of another embodiment of the disaster recovery method for acquiring cluster monitoring data provided by the present invention is described in further detail below with reference to the drawings. Refer to Fig. 2 to Fig. 5: Fig. 2 is a schematic diagram of the initial state of the cluster, Fig. 3 is a schematic diagram of the process of broadcasting node monitoring information and saving monitoring data, Fig. 4 is a schematic diagram of the process by which the management software obtains the cluster monitoring data, and Fig. 5 is a schematic diagram of the disaster recovery process.

As shown in Fig. 2, cluster node 1 is the master node, and each node collects its own local monitoring information.

As shown in Fig. 3, every cluster node broadcasts its local monitoring information to all other nodes, and the monitoring data agent module saves all monitoring information of the entire cluster to the local node's database.

As shown in Fig. 4, the management software obtains the monitoring data via telnet or a similar method, according to the master node IP specified by the cluster health monitoring and IP configuration module.

As shown in Fig. 5, when the master node goes down, the cluster health monitoring and IP configuration module selects one of the other cluster nodes and promotes it to master node. The management software then obtains the monitoring data via telnet according to the new master node IP specified by the cluster health monitoring and IP configuration module, achieving a seamless switch.
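The switch shown in Fig. 5 can be sketched from the management software's side. The assumptions here are explicit: a plain TCP connection attempt stands in for opening a telnet session, the function names `is_reachable` and `pick_endpoint` are hypothetical, and the ordered candidate list stands in for the master IP designated by the cluster health monitoring and IP configuration module.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def pick_endpoint(candidates: list, timeout: float = 1.0):
    """Return the first reachable (host, port) pair, or None if all are down."""
    for host, port in candidates:
        if is_reachable(host, port, timeout):
            return (host, port)
    return None
```

For example, `pick_endpoint([("10.0.0.1", 23), ("10.0.0.2", 23)])` would return the second pair once the first host stops accepting connections on the telnet port; the addresses are illustrative only. A short timeout keeps the switch quick, at the cost of treating a slow node as down.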

The disaster recovery device for acquiring cluster monitoring data provided by an embodiment of the present invention is introduced below. The device described below and the method described above may be read with reference to each other.

Fig. 6 is a structural block diagram of the disaster recovery device for acquiring cluster monitoring data provided by an embodiment of the present invention. Referring to Fig. 6, the device may include:

a collection module 100, used for each cluster node to collect its local monitoring information and obtain the monitoring information of all other nodes in the cluster;

an acquisition module 200, used to obtain the monitoring data of the cluster through a designated cluster master node;

a disaster recovery module 300, used to select another node as the current cluster master node when the cluster master node is detected to be down, and to obtain the monitoring data through the current cluster master node.

On the basis of the above embodiment, in the disaster recovery device for acquiring cluster monitoring data provided by the present invention, the acquisition module 200 is specifically used for:

obtaining the monitoring data via telnet according to the configured cluster master node IP.

Further, the disaster recovery module 300 may be specifically used for:

when the cluster master node is detected to be down, obtaining the monitoring data via telnet according to the specified IP of the current cluster master node.

On the basis of any of the above embodiments, in the disaster recovery device for acquiring cluster monitoring data provided by the present invention, the collection module 100 may be specifically used for:

obtaining the monitoring information of all other nodes in the cluster through the preconfigured broadcast paths of all cluster nodes.

In addition, the disaster recovery device for acquiring cluster monitoring data provided by this application may further include:

a storage module, used to save all of the obtained monitoring information to the database of the local node after the monitoring information of all other nodes in the cluster has been obtained.

In the disaster recovery device for acquiring cluster monitoring data provided by the present invention, each cluster node collects its local monitoring information and obtains the monitoring information of all other nodes in the cluster; the monitoring data of the cluster is obtained through a designated cluster master node; when the cluster master node is detected to be down, another node is selected as the current cluster master node, and the monitoring data is obtained through it. The invention departs from earlier approaches in which management software obtained monitoring information inefficiently, without a disaster recovery scheme, or with an overly long disaster recovery wait. Taking the particular nature of a cluster environment into account, it turns the collection of monitoring information and the disaster recovery scheme into a 1+N model, which eliminates the disaster recovery waiting time for the management software to obtain data. While keeping the monitoring information safe, it saves collection time and bandwidth and allows the data source to be switched seamlessly.

The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the parts that are the same or similar can be cross-referenced between embodiments. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for related details, refer to the description of the method.

Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms according to function. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled persons may implement the described functions in different ways for each specific application, but such implementations should not be regarded as going beyond the scope of the present invention.

The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

The disaster recovery method and device for acquiring cluster monitoring data provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. It should be noted that a person of ordinary skill in the art may make improvements and modifications to the invention without departing from its principle, and such improvements and modifications also fall within the scope of protection of the claims of the present invention.

Claims (10)

CN201610627210.2A | 2016-08-03 | 2016-08-03 | Disaster recovery method and device for acquiring cluster monitoring data | Pending | CN106301895A (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN201610627210.2A | CN106301895A (en) | 2016-08-03 | 2016-08-03 | Disaster recovery method and device for acquiring cluster monitoring data

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN201610627210.2A | CN106301895A (en) | 2016-08-03 | 2016-08-03 | Disaster recovery method and device for acquiring cluster monitoring data

Publications (1)

Publication Number | Publication Date
CN106301895A (en) | 2017-01-04

Family

ID=57664960

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201610627210.2A | Pending | CN106301895A (en) | 2016-08-03 | 2016-08-03 | Disaster recovery method and device for acquiring cluster monitoring data

Country Status (1)

Country | Link
CN (1) | CN106301895A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111221700A (en) * | 2019-10-31 | 2020-06-02 | 北京浪潮数据技术有限公司 | Cluster node state monitoring method, device, equipment and readable storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN101309167A (en) * | 2008-06-27 | 2008-11-19 | 华中科技大学 | Disaster recovery system and method based on cluster backup
CN101667034A (en) * | 2009-09-21 | 2010-03-10 | 北京航空航天大学 | Scalable monitoring system supporting hybrid clusters
CN103024060A (en) * | 2012-12-20 | 2013-04-03 | 中国科学院深圳先进技术研究院 | Open type cloud computing monitoring system for large scale cluster and method thereof
CN104539689A (en) * | 2014-12-23 | 2015-04-22 | 西安电子科技大学 | Resource monitoring method under cloud platform
CN105007193A (en) * | 2015-08-19 | 2015-10-28 | 浪潮(北京)电子信息产业有限公司 | Multi-layer information processing method, system thereof and cluster management node
US20150347523A1 (en) * | 2012-05-15 | 2015-12-03 | Splunk Inc. | Managing data searches using generation identifiers
WO2016063114A1 (en) * | 2014-10-23 | 2016-04-28 | Telefonaktiebolaget L M Ericsson (Publ) | System and method for disaster recovery of cloud applications


Similar Documents

Publication | Title
US9148381B2 (en) | Cloud computing enhanced gateway for communication networks
CN106412142B (en) | Resource equipment address obtaining method and device
CN112100545A (en) | Visualization method, apparatus, device and readable storage medium of network assets
KR101903533B1 (en) | Service quality index calculation method and calculation apparatus, and communications system
WO2016127884A1 (en) | Message pushing method and device
US20180165431A1 (en) | Method, apparatus and system for device replacement detection and device recommendation
CN102056212B (en) | Method for detecting internet speed and network side equipment
CN105187548A (en) | Cluster monitoring information collection method and system
CN108023763A (en) | Method and device for creating a network topology diagram
CN107078925A (en) | Heartbeat cycle setting method and terminal
CN111901174B (en) | Service state notification method, related device and storage medium
CN101841541B (en) | A method and system for monitoring a cluster based on a multicast network
CN105721235A (en) | Method and apparatus for detecting connectivity
CN103686259A (en) | Method and system for providing environmental information through smart TV
CN111010362B (en) | Monitoring method and device for abnormal host
CN105657001B (en) | Method and device for analyzing communication big data
CN106301895A (en) | Disaster recovery method and device for acquiring cluster monitoring data
CN110932878A (en) | Distributed network management method, device and system
CN105100241A (en) | Method of identifying service types and apparatus thereof
CN104579842A (en) | Processing method for acquiring cluster monitoring computing node state based on socket communication
US9900234B2 (en) | Direct link quality monitoring method, communications device, and system
CN115633093A (en) | Resource acquisition method and device, computer equipment and computer readable storage medium
CN109361781B (en) | Message forwarding method, device, server, system and storage medium
CN105656877A (en) | Hotlinking detection method and device
CN114915434A (en) | Network agent detection method, device, storage medium and computer equipment

Legal Events

Code | Title
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
RJ01 | Rejection of invention patent application after publication
RJ01 | Rejection of invention patent application after publication

Application publication date: 20170104

