CN114218020A - Disaster-tolerant switching method and device - Google Patents

Disaster-tolerant switching method and device

Info

Publication number
CN114218020A
Authority
CN
China
Prior art keywords
data
preset
fault
recorded data
main
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111582686.6A
Other languages
Chinese (zh)
Inventor
张岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd
Priority to CN202111582686.6A
Publication of CN114218020A
Legal status: Pending

Abstract

Translated from Chinese

Disclosed are a disaster recovery switching method and device. The method includes: after monitoring that the current channel in the link between a request end and a main response end fails, determining whether an available channel exists in the main link; if so, switching the current channel to the available channel so that the request end can call the main response end; if not, switching the link between the request end and the main response end to a standby link so that the request end can call the standby response end.

Figure 202111582686

Description

Disaster recovery switching method and device
Technical Field
The embodiments of this specification relate to the technical field of service disaster tolerance, and in particular to a disaster recovery switching method and device.
Background
A microservice architecture decomposes functionality into discrete services, thereby reducing system coupling and providing more flexible service support. In a microservice architecture, services are often chained together through calls to provide complex service support. Inevitably, service-to-service calls are not completely reliable. For example, service A calls service B, but for some reason the A-B link fails, so that service A cannot call service B.
Currently, to deal with this problem, backup services B1, B2, ... having the same functions as service B are additionally established. When the A-B link fails and service A cannot call service B, service A directly calls the backup services B1, B2, ..., so as to implement service disaster recovery and ensure the availability of the whole service system.
For ease of understanding, the party issuing the service call request (service A) is referred to as the request end, and the party responding to the service call request is referred to as the response end, where the response end may be divided into a main response end (service B) and standby response ends (backup services B1, B2, ...).
In the existing disaster recovery switching mode, when the link between the request end and the main response end fails, the link is switched directly to a standby link, and the request end directly calls the standby response end. However, a failure of the link between the request end and the main response end may be caused by network fluctuation or network flash, or by instability of the whole service system. If the link is switched directly to the standby link in such cases, the switching cost is high: the switching process takes a certain amount of time, during which the service provided by the whole service system is unavailable.
Disclosure of Invention
In view of the above technical problems, the embodiments of this specification provide a disaster recovery switching method and device. The technical scheme is as follows:
A disaster recovery switching method is applied to a request end that is connected to a main response end and a standby response end respectively, wherein the link between the request end and the main response end comprises a plurality of channels, and the method comprises the following steps:
after monitoring that the current channel in the link between the request end and the main response end fails, determining whether an available channel exists in the main link;
if so, switching the current channel to the available channel so that the request end can call the main response end;
if not, switching the link between the request end and the main response end to the standby link so that the request end can call the standby response end.
A disaster recovery method is applied to a request end that is connected to a main response end and a standby response end respectively, wherein the link between the request end and the main response end comprises a plurality of channels, and the method comprises the following steps:
after monitoring that the failed channel in the main link between the request end and the main response end has recovered from its fault, restoring the current channel in the main link between the request end and the main response end to the failed channel.
According to the technical scheme provided by the embodiments of this specification, after the current channel in the main link between the request end and the main response end fails, it is determined whether the main link has an available channel. If so, the current channel is switched to the available channel so that the request end can continue to call the main response end; otherwise, the link between the request end and the main response end is switched to the standby link so that the request end can call the standby response end. In this way, faults in the link between the request end and the main response end that are caused by factors such as network fluctuation, network flash, or instability of the whole service system no longer trigger an immediate switch to the standby end, which reduces the number of switches between the main end and the standby end and reduces the switching cost to a certain extent.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of embodiments of the invention.
In addition, any one of the embodiments in the present specification is not required to achieve all of the effects described above.
Drawings
In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the embodiments of the present specification, and other drawings can be obtained by those skilled in the art according to the drawings.
FIG. 1 is a schematic diagram of the connection between a request end and a main response end according to an embodiment of this specification;
FIG. 2 is a schematic flowchart of a disaster recovery switching method according to an embodiment of this specification;
FIG. 3 is a schematic flowchart of a disaster recovery method according to an embodiment of this specification;
FIG. 4 is a schematic structural diagram of a disaster recovery switching device according to an embodiment of this specification;
FIG. 5 is a schematic structural diagram of a disaster recovery device according to an embodiment of this specification;
FIG. 6 is a schematic structural diagram of a device for configuring the apparatus according to an embodiment of this specification.
Detailed Description
A microservice architecture is intended to decouple a solution by breaking functionality down into individual discrete services, thereby reducing system coupling and providing more flexible service support. In a microservice architecture, services are often chained together through calls to provide complex service support. Inevitably, under a traditional microservice architecture, calls between services tend to be unreliable, as described in the background.
In view of the problems in the prior art described in the background, the embodiments of this specification provide a technical solution in which, after the current channel in the link between the request end and the main response end fails, a state migration process pushes a preset state machine to perform the corresponding state migration according to a preset migration policy. After monitoring the migration, a failover process determines, according to the state of the state machine after migration, whether an available channel exists. If so, the failover process switches the current channel to the available channel so that the request end can continue to call the main response end; otherwise, the failover process switches the link between the request end and the main response end to the standby link so that the request end can call the standby response end. In this way, faults in the link between the request end and the main response end that are caused by factors such as network fluctuation, network flash, or instability of the whole service system no longer trigger an immediate switch to the standby end, which reduces the number of switches between the main end and the standby end and reduces the switching cost to a certain extent.
The specific technical scheme provided by the embodiment of the specification is as follows:
A state migration process and a failover process are configured. After monitoring that the current channel in the link between the request end and the main response end fails, the state migration process pushes a preset state machine to perform the corresponding state migration according to a preset migration policy. After monitoring that the preset state machine has performed the corresponding state migration, the failover process determines, according to the state of the state machine after migration, whether an available channel exists. If so, the failover process switches the current channel to the available channel so that the request end can call the main response end; if not, the failover process switches the link between the request end and the main response end to the standby link so that the request end can call the standby response end.
Correspondingly, the embodiments of this specification further provide a disaster recovery method, whose specific technical solution is as follows:
After monitoring that the failed channel in the link between the request end and the main response end has recovered from its fault, the state migration process pushes the preset state machine to perform the corresponding state migration according to a preset migration policy; after monitoring that the preset state machine has performed the corresponding state migration, the failover process restores the current channel in the link between the request end and the main response end to the failed channel.
In the technical solution provided in the embodiments of this specification, the A-B link is the link mentioned in the background. In the embodiments of this specification, however, the link can be reached through different channels. As shown in FIG. 1, for example, the A-B link may be reached using RabbitMQ or RocketMQ, both of which are message queues; RabbitMQ may be regarded as channel 1 and RocketMQ as channel 2.
In order to make those skilled in the art better understand the technical solutions in the embodiments of the present specification, the technical solutions in the embodiments of the present specification will be described in detail below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of protection.
FIG. 2 shows an implementation flowchart of the disaster recovery switching method provided in the embodiments of this specification. The method includes the following steps:
S201, after monitoring that the current channel in the link between the request end and the main response end has a fault, the state migration process pushes a preset state machine to perform the corresponding state migration according to a preset migration policy;
After monitoring that the current channel in the link between the request end and the main response end fails, the state migration process determines the preset migration policy according to the current state of the preset state machine, and then pushes the preset state machine to perform the corresponding state migration according to the determined policy. The migration policy therefore depends on the current state of the state machine. For example, if the state machine is currently in the core state, the migration policy may be core-backup-panel; if it is currently in the mix state, the migration policy may be mix-core-panel or mix-backup-panel. The corresponding migration policy can thus be determined from the current state of the state machine, where the current state is the state before the migration.
After the corresponding migration policy is determined, the state migration process pushes the preset state machine to perform the corresponding state migration according to that policy. State migration refers to migrating the state machine from its current state to the next state when a failure occurs. For example, if the current state is the mix state, the next state may be a backup state, specifically any one of backup1, backup2, .... When migrating to a backup state, several strategies are available for choosing among backup1, backup2, ..., for example a random strategy, in which one of backup1, backup2, ... is randomly selected as the state to migrate to, or a sort strategy. The embodiments of this specification do not limit this.
As another example, if the current state is a backup state, the migration may take place within the backup states: with backup1 and backup2 ordered by the sort strategy, the state machine may migrate internally from backup1 to backup2.
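The state migration in S201 can be illustrated with a minimal sketch. It is not the patented implementation: the class name `FailoverStateMachine`, its methods, and the wrap-around behaviour of the sort strategy are assumptions made for illustration, and the mix state is left out for brevity.

```python
import random
from typing import List, Optional


class FailoverStateMachine:
    """Hypothetical state machine: 'core' maps to the main channel,
    'backup1', 'backup2', ... map to standby channels."""

    def __init__(self, backups: List[str], strategy: str = "sort",
                 seed: Optional[int] = None) -> None:
        self.backups = list(backups)
        self.strategy = strategy  # "sort" or "random"
        self.state = "core"
        self._rng = random.Random(seed)

    def on_failure(self) -> str:
        """Migrate to the next state after the current state's channel fails."""
        candidates = [b for b in self.backups if b != self.state]
        if not candidates:
            raise RuntimeError("no backup state to migrate to")
        if self.strategy == "random":
            # Random strategy: pick any backup state other than the current one.
            self.state = self._rng.choice(candidates)
        else:
            # Sort strategy: walk the backups in sorted order.
            # Wrapping around at the end is an assumption of this sketch.
            ordered = sorted(self.backups)
            if self.state in ordered:
                idx = ordered.index(self.state)
                self.state = ordered[(idx + 1) % len(ordered)]
            else:
                self.state = ordered[0]
        return self.state

    def on_recovery(self) -> str:
        """Migrate back to the core state once the failed channel recovers."""
        self.state = "core"
        return self.state
```

With the sort strategy, a machine created with `["backup1", "backup2"]` moves from core to backup1 on the first failure and from backup1 to backup2 on the next, which mirrors the internal backup-to-backup migration described above.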
S202, after monitoring that the preset state machine carries out corresponding state migration, the fault switching process determines whether an available channel exists according to the state of the state machine after migration;
The failover process can monitor in real time whether the preset state machine has undergone a state migration. After monitoring that the preset state machine has performed the corresponding state migration, it determines whether an available channel exists according to the state of the state machine after migration. The specific steps are as follows:
After monitoring that the preset state machine has performed the corresponding state migration, the failover process determines the next channel according to the state of the state machine after migration. Each state of the state machine can be understood as corresponding to one channel, and the next channel is any channel other than the current channel. For example, if the state of the state machine after migration is a backup state, specifically backup1, the determined next channel is standby channel 1.
The failover process sends the data transmitted by the current channel, or preset virtual data, to the next channel in order to test it. The test checks whether the data transmitted over the next channel can be processed successfully: if the data is processed successfully, the test is passed; otherwise, the test is not passed. The data here may be a call request or content in another form, which is not limited in this specification.
If the next channel passes the test, the failover process determines that an available channel exists, and the available channel is the next channel.
If the next channel fails the test, the failover process sends the data transmitted by the current channel, or the preset virtual data, to the remaining channels one after another and tests them in sequence. As soon as one channel passes the test, testing stops; otherwise, testing continues until either an available channel is found or all remaining channels are determined to be unavailable. If all remaining channels fail the test, the failover process determines that no available channel exists; otherwise, it determines that the available channel is the remaining channel that passed the test, and the preset state machine is migrated to the corresponding state at the same time.
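The channel test in S202 can be sketched as follows. This is only an illustration under stated assumptions: the function `find_available_channel`, the `Probe` callable type, and the `b"ping"` payload are hypothetical names, and a real probe would send the data over the actual message queue rather than call a local function.

```python
from typing import Callable, Dict, Optional

# A probe sends data over a channel and returns True if it was processed successfully.
Probe = Callable[[bytes], bool]


def find_available_channel(current: str, preferred_next: str,
                           channels: Dict[str, Probe],
                           payload: bytes = b"ping") -> Optional[str]:
    """Test the next channel first, then the remaining channels in order.

    Returns the name of the first channel that passes the test,
    or None if every channel other than the current one fails.
    """
    remaining = [name for name in channels
                 if name not in (current, preferred_next)]
    for name in [preferred_next] + remaining:
        try:
            passed = channels[name](payload)
        except Exception:
            # A probe that raises is treated as a failed test (assumption).
            passed = False
        if passed:
            return name  # stop testing at the first channel that passes
    return None
```

A return value of `None` corresponds to the case in which no available channel exists and the link itself has to be switched.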
S203, if an available channel exists, the failover process switches the current channel to the available channel so that the request end can call the main response end;
For the result determined in S202, if an available channel exists, the failover process switches the current channel to the available channel so that the request end can call the main response end. For example, if the current channel in the link between the request end and the main response end is the main channel and the available channel is standby channel 1, the failover process switches the main channel to standby channel 1.
In addition, before switching the current channel to the available channel, the failover process judges whether the available channel has reached its current limit (rate limit). If not, the failover process switches the current channel to the available channel; otherwise, the failover process determines the available channel again. If several available channels exist, the most idle one is selected; if no other available channel exists, the current-limiting strategy is applied to the originally determined available channel.
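The current-limiting check can be sketched as below. The source does not define how load or idleness is measured, so the `ChannelLoad` fields, the utilization metric, and the function `choose_channel` are all assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ChannelLoad:
    name: str
    in_flight: int  # requests currently being handled (assumed load metric)
    limit: int      # current-limiting threshold, assumed to be positive

    @property
    def limited(self) -> bool:
        return self.in_flight >= self.limit

    @property
    def utilization(self) -> float:
        return self.in_flight / self.limit


def choose_channel(first_choice: ChannelLoad,
                   others: List[ChannelLoad]) -> Tuple[ChannelLoad, bool]:
    """Return (channel to switch to, whether current limiting must be applied)."""
    if not first_choice.limited:
        return first_choice, False
    # First choice is at its limit: look for other available channels with headroom.
    free = [c for c in others if not c.limited]
    if free:
        # Pick the most idle channel, i.e. the one with the lowest utilization.
        return min(free, key=lambda c: c.utilization), False
    # No alternative: keep the original choice and apply current limiting to it.
    return first_choice, True
```

Returning a flag rather than throttling inside the function keeps the selection logic separate from whatever current-limiting strategy the caller applies.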
S204, if no available channel exists, the failover process switches the link between the request end and the main response end to the standby link so that the request end can call the standby response end.
For the result determined in S202, if no available channel exists, all channels in the link between the request end and the main response end are unavailable, and the failover process switches the link between the request end and the main response end to the standby link so that the request end can call the standby response end.
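Taken together, S203 and S204 amount to a two-level decision: try another channel on the main link first, and switch links only when none is left. A minimal sketch, in which the function `fail_over` and the `"main"`/`"standby"` labels are hypothetical:

```python
from typing import Callable, Iterable, Optional, Tuple


def fail_over(current: str, main_channels: Iterable[str],
              is_available: Callable[[str], bool]) -> Tuple[str, Optional[str]]:
    """Decide where calls go after the current channel fails.

    Returns ("main", channel) when another channel on the main link is usable,
    so the request end keeps calling the main response end, or
    ("standby", None) when the whole link must be switched to the standby link.
    """
    for channel in main_channels:
        if channel != current and is_available(channel):
            return "main", channel
    return "standby", None
```

For example, `fail_over("ch1", ["ch1", "ch2"], lambda c: c == "ch2")` keeps traffic on the main link via `ch2`, whereas an `is_available` that always returns `False` yields a switch to the standby link.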
On the basis of the above scheme, fault detection may further be performed before the disaster recovery switching. A fault sensing process is configured;
Within a preset time period, the fault sensing process records the data transmitted through the current channel in the link between the request end and the main response end. The preset time period may consist of several consecutive periods, for example three consecutive periods of 1000 ms each. Within each period (1000 ms), the fault sensing process records the data transmitted through the current channel in the link between the request end and the main response end, and it does so for three consecutive periods.
The fault sensing process counts the data that failed processing among the recorded data. Continuing the example, it counts the failed data among the recorded data in each period, for three consecutive periods.
The fault sensing process calculates the proportion of the counted failed data in the recorded data. Continuing the example, it calculates this proportion, for example 30%, in each period, for three consecutive periods.
Before judging the proportion, the fault sensing process judges whether the recorded data meets a preset requirement, namely whether the amount of recorded data reaches a certain number. If so, the fault sensing process judges whether the proportion of the counted failed data in the recorded data exceeds a preset threshold. Continuing the example, it makes this judgment in each period against a preset threshold of, for example, 30%.
If the threshold is exceeded, the fault sensing process sends the state migration process a notification to push the preset state machine to perform the corresponding state migration. Continuing the example, if the proportion of failed data in the recorded data exceeds the preset threshold in every period, the current channel is sensed to have failed, and the fault sensing process sends the notification to the state migration process. After thus learning that the current channel in the link between the request end and the main response end has failed, the state migration process pushes the preset state machine to perform the corresponding state migration according to the preset migration policy.
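The detection rule can be sketched as a per-period failure ratio checked over consecutive periods. The 30% threshold and the three periods come from the example above; the function names and the `min_samples` value of 10 are assumptions, since the source only says the recorded data must reach "a certain number".

```python
from typing import Sequence


def period_exceeds(outcomes: Sequence[bool], threshold: float = 0.30,
                   min_samples: int = 10) -> bool:
    """outcomes holds one entry per recorded item: True = processed, False = failed."""
    if len(outcomes) < min_samples:
        # Too little data to judge: the preset requirement is not met.
        return False
    failed = sum(1 for ok in outcomes if not ok)
    return failed / len(outcomes) > threshold


def channel_failed(periods: Sequence[Sequence[bool]], threshold: float = 0.30,
                   min_samples: int = 10, required_periods: int = 3) -> bool:
    """The channel is considered failed when each of the last
    `required_periods` periods exceeds the failure-ratio threshold."""
    if len(periods) < required_periods:
        return False
    recent = periods[-required_periods:]
    return all(period_exceeds(p, threshold, min_samples) for p in recent)
```

Requiring every period to exceed the threshold is what filters out a short network flash: one bad period among three does not trigger a state migration.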
Further, the fault detection step may be subdivided into the following steps:
Within a preset time period, the fault sensing process records the data transmitted through the current channel in the link between the request end and the main response end. This is similar to the above and is not repeated here.
The fault sensing process classifies the data that failed processing among the recorded data and counts the failed data in each class. For example, in each period the fault sensing process classifies the failed data among the recorded data and counts the failed data in each class, for three consecutive periods. The failed data can be divided into four classes: system faults (peer system exception, network flash, link timeout, link failure, and the like), service faults (failure of a parameter check at the peer or locally, for example a certain parameter is missing), timeout faults (peer system response timeout), and custom faults (defined by the user). The failed data in each class of fault can be counted in each period.
The fault sensing process calculates the proportion, in the recorded data, of the failed data contained in one or more of the classes. For example, it calculates this proportion in each period, for three consecutive periods.
The fault sensing process judges whether the calculated proportion, in the recorded data, of the failed data contained in one or more of the classes exceeds the preset threshold corresponding to that class. For example, it makes this judgment in each period, for three consecutive periods. In each period, the fault sensing process may judge only one class (for example, system faults) against the preset threshold corresponding to that class, or it may judge several classes, each against its own preset threshold. The preset thresholds of the classes may be the same or different; for example, the preset threshold for system faults may be 30%, and the preset threshold for service faults may be 30% or another value.
If so, the fault sensing process sends the state migration process a notification to push the preset state machine to perform the corresponding state migration. After thus learning that the current channel in the link between the request end and the main response end has failed, the state migration process pushes the preset state machine to perform the corresponding state migration according to the preset migration policy. This is similar to the above and is not described in detail again.
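The classified variant can be sketched for a single period as follows. The four categories follow the text above; the enum `FailureKind`, the function `exceeded_categories`, and the representation of a successful item as `None` are assumptions made for illustration.

```python
from collections import Counter
from enum import Enum
from typing import Dict, Iterable, Optional, Set


class FailureKind(Enum):
    SYSTEM = "system"    # peer system exception, network flash, link timeout, link failure
    SERVICE = "service"  # parameter check failed at the peer or locally
    TIMEOUT = "timeout"  # peer system response timeout
    CUSTOM = "custom"    # defined by the user


def exceeded_categories(results: Iterable[Optional[FailureKind]],
                        thresholds: Dict[FailureKind, float]) -> Set[FailureKind]:
    """results holds one entry per recorded item in a period:
    None for success, otherwise the failure class.

    Returns the classes whose share of all recorded data exceeds their threshold.
    Classes without an entry in `thresholds` are not judged.
    """
    items = list(results)
    total = len(items)
    if total == 0:
        return set()
    counts = Counter(r for r in items if r is not None)
    return {kind for kind, limit in thresholds.items()
            if counts[kind] / total > limit}
```

Passing only `FailureKind.SYSTEM` in `thresholds` reproduces the case in which a single class is judged; passing several entries judges each class against its own threshold.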
As can be seen from the above description of the technical solution provided in this specification, after the current channel in the link between the request end and the main response end fails, the state migration process pushes the preset state machine to perform the corresponding state migration according to the preset migration policy. After monitoring that the state of the preset state machine has migrated, the failover process determines whether an available channel exists according to the state of the state machine after migration. If so, the failover process switches the current channel to the available channel so that the request end can continue to call the main response end; otherwise, the failover process switches the link between the request end and the main response end to the standby link so that the request end can call the standby response end. In this way, faults in the link between the request end and the main response end that are caused by factors such as network fluctuation, network flash, or instability of the whole service system no longer trigger an immediate switch to the standby end, which reduces the number of switches between the main end and the standby end and reduces the switching cost to a certain extent.
Corresponding to the above embodiment of the disaster recovery switching method, an embodiment of the present specification further provides a disaster recovery method, as shown in fig. 3, which may include the following steps:
S301, after monitoring that the failed channel in the link between the request end and the main response end has recovered from its fault, the state migration process pushes the preset state machine to perform the corresponding state migration according to a preset migration policy;
After monitoring that the failed channel in the link between the request end and the main response end has recovered, the state migration process pushes the preset state machine to perform the corresponding state migration according to a preset migration policy. The preset migration policy can be determined according to the current state of the preset state machine, so the migration policy depends on the current state of the state machine. The states of the state machine are similar to those in S201 and are not described again here. In S201 the state machine migrates from the core state to a backup state; in this step it migrates from the backup state back to the core state.
S302, after monitoring that the preset state machine has performed the corresponding state migration, the failover process restores the current channel in the link between the request end and the main response end to the failed channel.
For the result of S301, after monitoring that the preset state machine has performed the corresponding state migration, the failover process performs disaster recovery and restores the current channel in the link between the request end and the main response end to the failed channel, which has by then returned to a normal state.
On the basis of the disaster recovery method, the method may further include a fault detection step, which may specifically include the following steps. A fault sensing process is configured;
the fault sensing process sends the data transmitted by the current channel, or preset virtual data, to the failed channel;
within a preset time period, the fault sensing process records the data transmitted through the failed channel;
the fault sensing process counts the data that failed processing among the recorded data;
the fault sensing process calculates the proportion of the counted failed data in the recorded data;
the fault sensing process judges whether the proportion of the counted failed data in the recorded data exceeds a preset threshold;
if not, the fault sensing process sends the state migration process a notification to push the preset state machine to perform the corresponding state migration. If the proportion of the counted failed data in the recorded data does not exceed the preset threshold (for example, 10%), the failed channel has returned to a normal state; otherwise, the failed channel is still in a failed state.
After receiving the state migration notification sent by the fault sensing process, the state migration process pushes the preset state machine to perform the corresponding state migration according to the preset migration policy.
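The recovery check can be sketched in the same style. The 10% threshold is the example value from the text; the helper names `probe_failed_channel` and `channel_recovered`, and the choice to treat an empty record as "not recovered", are assumptions of this sketch.

```python
from typing import Callable, Iterable, List, Sequence


def probe_failed_channel(send: Callable[[bytes], bool],
                         payloads: Iterable[bytes]) -> List[bool]:
    """Send each payload over the failed channel and record whether it was processed."""
    outcomes: List[bool] = []
    for payload in payloads:
        try:
            outcomes.append(bool(send(payload)))
        except Exception:
            # An exception while sending counts as a processing failure (assumption).
            outcomes.append(False)
    return outcomes


def channel_recovered(outcomes: Sequence[bool], threshold: float = 0.10) -> bool:
    """The failed channel counts as recovered when the failure ratio
    does not exceed the threshold."""
    if not outcomes:
        return False  # nothing recorded yet, so recovery cannot be confirmed
    failed = sum(1 for ok in outcomes if not ok)
    return failed / len(outcomes) <= threshold
```

When `channel_recovered` returns `True`, the fault sensing process would notify the state migration process, which migrates the state machine from the backup state back to the core state.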
Further, the fault detection step may be subdivided into the following steps:
the fault perception process sends data transmitted by a current channel or preset virtual data to the fault channel;
in a preset time period, a fault sensing process records data transmitted through the fault channel;
the fault perception process classifies the data which fails in processing in the recorded data, and counts the data which fails in processing in each class;
the fault perception process calculates the proportion of the data which is contained in one or more of the classified categories and fails in processing in the recorded data;
the fault perception process judges whether the calculated ratio of the data which is contained in one or more classes and fails in processing in the recorded data exceeds a preset threshold value corresponding to the classified class or not;
if not, the fault perception process sends a notification for pushing a preset state machine to perform corresponding state transition to the state transition process;
after monitoring a state transition notification sent by the fault perception process, the state transition process pushes a preset state machine to perform corresponding state transition according to a preset transition strategy.
The fault detection step in the disaster recovery method is similar to the fault detection step in the disaster recovery switching method and is not described again here. Note that, in the disaster recovery method, the check of whether the recorded data meets a preset requirement may optionally be skipped during fault detection.
As the above description of the disaster recovery method shows, the method can automatically sense whether the failed channel has returned to normal and, once it has, can automatically perform disaster recovery.
With respect to the foregoing method embodiment, an embodiment of this specification further provides a disaster recovery switching device, as shown in fig. 4, which may include: a configuration module 410, a state transition module 420, a determination module 430, a first switching module 440, and a second switching module 450.
the configuration module 410 is configured to configure a state migration process and a failover process;
the state migration module 420 is configured to, after it is monitored that a current channel in a link between the request end and the main response end has failed, push, in the state migration process, a preset state machine to perform corresponding state migration according to a preset migration policy;
the determination module 430 is configured to determine, after the failover process monitors that the preset state machine has performed the corresponding state migration, whether an available channel exists according to the state of the state machine after migration;
the first switching module 440 is configured to, if an available channel exists, switch the current channel to the available channel through the failover process, so that the request end can invoke the main response end;
the second switching module 450 is configured to, if no available channel exists, switch the link between the request end and the main response end to the standby link through the failover process, so that the request end can invoke the standby response end.
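The decision made by the two switching modules can be sketched as a single function. This is an illustrative sketch only: the `Target` enum, the `channel_health` mapping, and the channel names are assumptions, not names from the specification.

```python
from enum import Enum


class Target(Enum):
    MAIN = "main response end"
    STANDBY = "standby response end"


def fail_over(channel_health, current):
    """Pick where the request end should send traffic after `current` fails.

    `channel_health` maps each channel of the main link to True if usable.
    Returns (Target.MAIN, channel) if another channel on the main link is
    available, otherwise (Target.STANDBY, None) to use the standby link.
    """
    for name, healthy in channel_health.items():
        if name != current and healthy:
            return Target.MAIN, name
    return Target.STANDBY, None
```

Trying the remaining channels of the main link first is what keeps the request end on the main response end and avoids an unnecessary main/standby switchover.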
According to a specific embodiment provided in the present specification, the configuration module 410 is further configured to configure a fault-aware process;
the device further comprises: a fault detection module 460;
the fault detection module includes:
a data recording unit 461, configured to record, in a preset time period, data transmitted through a current channel in a link between a request end and a main response end by a fault-aware process;
a data statistics unit 462, configured to statistically process failed data in the recorded data by the fault-aware process;
a proportion calculation unit 463, configured to calculate, by the fault sensing process, the proportion of the counted data with processing failure in the recorded data;
a judging unit 464, configured to judge, by the failure sensing process, whether the counted percentage of the data with the processing failure in the recorded data exceeds a preset threshold;
a notification sending unit 465, configured to send, if yes, a notification that pushes a preset state machine to perform corresponding state migration to the state migration process by the fault sensing process;
the state transition module 420 is specifically configured to:
and after monitoring the state migration notification sent by the fault perception process, the state migration process pushes a preset state machine to perform corresponding state migration according to a preset migration strategy.
According to an embodiment provided in the present specification, the judging unit 464 is specifically configured to:
the fault sensing process judges whether the recorded data meet preset requirements or not;
if so, the fault sensing process judges whether the counted occupation ratio of the data failed in processing in the recorded data exceeds a preset threshold value.
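A sketch of this two-stage check follows. The specification does not say here what the "preset requirement" is; the minimum sample count used below is purely an assumption for illustration, as are the function name and the default values.

```python
def should_notify_failure(total, failed, min_samples=100, threshold=0.10):
    """Decide whether the fault sensing process should report a channel failure.

    Stage 1 (assumed requirement): enough data must have been recorded.
    Stage 2: the share of failed data must exceed the preset threshold.
    """
    if total < min_samples:
        return False  # too little data to judge; do not trigger a switch
    return failed / total > threshold
```

A guard of this kind would keep a handful of failures in a nearly idle window from being read as a channel fault.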
According to an embodiment provided in the present specification, the data statistics unit 462 is specifically configured to:
the fault perception process classifies the data which fails in processing in the recorded data, and counts the data which fails in processing in each class;
the proportion calculation unit 463 is specifically configured to:
the fault perception process calculates the proportion of the data which is contained in one or more of the classified categories and fails in processing in the recorded data;
the judging unit 464 is specifically configured to:
and the fault perception process judges whether the calculated occupation ratio of the data which is contained in one or more classes and fails in processing in the recorded data exceeds a preset threshold corresponding to the classified class or not.
According to a specific implementation manner provided in this specification, the state transition module 420 is specifically configured to:
after monitoring that a current channel in a link between a request end and a main response end fails, a state migration process determines a preset migration strategy according to the current state of a preset state machine;
and the state migration process pushes a preset state machine to perform corresponding state migration according to the determined migration strategy.
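One way to picture a migration policy that depends on the current state is a lookup table keyed by (state, event). The sketch below is an assumption-laden illustration: the state names, event names, and the `subscribe` callback mechanism (standing in for the failover process monitoring the state machine) are all hypothetical.

```python
class ChannelStateMachine:
    # Hypothetical policy: the next state depends on the current state and event.
    POLICY = {
        ("CH1_ACTIVE", "fault"): "CH2_ACTIVE",
        ("CH2_ACTIVE", "fault"): "STANDBY_ACTIVE",
        ("CH2_ACTIVE", "recovered"): "CH1_ACTIVE",
        ("STANDBY_ACTIVE", "recovered"): "CH1_ACTIVE",
    }

    def __init__(self, state="CH1_ACTIVE"):
        self.state = state
        self._listeners = []

    def subscribe(self, callback):
        """Register a listener called as callback(old_state, new_state)."""
        self._listeners.append(callback)

    def push(self, event):
        """Apply the policy for the current state; return True if the state changed."""
        new_state = self.POLICY.get((self.state, event))
        if new_state is None:
            return False  # no transition defined for this state and event
        old_state, self.state = self.state, new_state
        for callback in self._listeners:
            callback(old_state, new_state)
        return True
```

In this sketch the failover logic would live in a subscribed callback, so switching happens only in reaction to a state change rather than directly on every detected fault.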
As described above, in the technical solution provided in this specification, after a current channel in the link between the request end and the main response end fails, the state migration process pushes a preset state machine to perform the corresponding state migration according to a preset migration policy. After monitoring that the state of the preset state machine has migrated, the failover process determines, according to the state of the state machine after migration, whether an available channel exists. If so, the failover process switches the current channel to the available channel, so that the request end can continue to call the main response end; otherwise, the failover process switches the link between the request end and the main response end to the standby link, so that the request end can call the standby response end. This rules out faults on the link between the request end and the main response end that are caused by factors such as network fluctuation, transient network interruptions, or instability of the overall service system, reduces the number of switchovers between the main and standby ends, and lowers the switching cost to a certain extent.
With respect to the above disaster recovery switching device, an embodiment of the present specification further provides a disaster recovery device, as shown in fig. 5, which may include: a configuration module 510, a state transition module 520, and a channel switching module 530.
the configuration module 510 is configured to configure a state transition process and a failover process;
the state transition module 520 is configured to, after it is monitored that a failed channel in a link between a request end and a main response end has recovered, push, in the state transition process, a preset state machine to perform corresponding state transition according to a preset transition policy;
the channel switching module 530 is configured to, after the failover process monitors that the preset state machine has performed the corresponding state transition, restore a current channel in the link between the request end and the main response end to the failed channel.
According to a specific embodiment provided in the present specification, the configuration module 510 is further configured to configure a fault-aware process;
the device further comprises: a fault detection module 540;
the fault detection module includes:
a data sending unit 541, configured to send, by a fault sensing process, data transmitted by a current channel or preset virtual data to a fault channel;
a data recording unit 542, configured to record, in a preset time period, data transmitted through the fault channel by a fault-aware process;
the data counting unit 543 is used for counting the failed data in the recorded data by the fault sensing process;
a proportion calculation unit 544, configured to calculate a proportion of the counted data with failed processing in the recorded data by the failure sensing process;
the judging unit 545 is configured to judge, by the fault sensing process, whether the counted percentage of the data with the processing failure in the recorded data exceeds a preset threshold;
a notification sending unit 546, configured to, if the threshold is not exceeded, send, by the fault sensing process, a notification to the state transition process for pushing a preset state machine to perform corresponding state transition;
the state transition module 520 is specifically configured to:
after monitoring a state transition notification sent by the fault perception process, the state transition process pushes a preset state machine to perform corresponding state transition according to a preset transition strategy.
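How the units of the fault detection module cooperate during recovery can be sketched as one probing routine. This is a hedged illustration: `send` and `notify` are hypothetical callables standing in for the failed channel and for the notification to the state transition process, and the 10% default is only an example value.

```python
def probe_and_notify(send, payloads, notify, threshold=0.10):
    """Probe a failed channel and report recovery if failures stay within the threshold.

    send(payload) -> truthy on success (hypothetical); exceptions count as failures.
    notify(event) is called with "recovered" when the channel looks healthy.
    """
    failures = 0
    for payload in payloads:
        try:
            ok = send(payload)
        except Exception:
            ok = False
        if not ok:
            failures += 1
    # With no probes sent there is no evidence of recovery.
    recovered = bool(payloads) and failures / len(payloads) <= threshold
    if recovered:
        notify("recovered")
    return recovered
```

The payloads may be copies of live data from the current channel or preset dummy data, as the text describes; the sketch treats both the same way.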
According to an embodiment provided in the present specification, the data statistics unit 543 is specifically configured to:
the fault perception process classifies the data which fails in processing in the recorded data, and counts the data which fails in processing in each class;
the proportion calculation unit 544 is specifically configured to:
the fault perception process calculates the proportion of the data which is contained in one or more of the classified categories and fails in processing in the recorded data;
the judging unit 545 is specifically configured to:
and the fault perception process judges whether the calculated occupation ratio of the data which is contained in one or more classes and fails in processing in the recorded data exceeds a preset threshold corresponding to the classified class or not.
As the above description shows, the device can automatically sense whether the failed channel has returned to normal and, once it has, can automatically perform disaster recovery.
The implementation process of the functions and actions of each module in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
Embodiments of the present specification further provide a computer device, as shown in fig. 6. The computer device may include: a processor 610, a memory 620, an input/output interface 630, a communication interface 640, and a bus 650, wherein the processor 610, the memory 620, the input/output interface 630, and the communication interface 640 are communicatively coupled to each other within the device via the bus 650.
The processor 610 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present specification.
The memory 620 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 620 may store an operating system and other application programs. When the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program code is stored in the memory 620 and called by the processor 610 for execution.
The input/output interface 630 is used for connecting an input/output module to realize information input and output. The input/output module may be configured as a component in the device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 640 is used for connecting a communication module (not shown in the figure) to realize communication between the device and other devices. The communication module may communicate in a wired manner (such as USB or a network cable) or in a wireless manner (such as a mobile network, WIFI, or Bluetooth).
The bus 650 includes a pathway that transfers information between the components of the device, such as the processor 610, the memory 620, the input/output interface 630, and the communication interface 640.
It should be noted that although only the processor 610, the memory 620, the input/output interface 630, the communication interface 640, and the bus 650 are shown for the above device, in a specific implementation the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above device may also include only the components necessary to implement the embodiments of the present specification, and not necessarily all of the components shown in the figure.
Embodiments of the present specification further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the disaster recovery switching method described above. The method at least comprises the following steps:
a disaster recovery switching method is applied to a request end which is respectively connected with a main response end and a standby response end, wherein a link between the request end and the main response end comprises a plurality of channels, and the method comprises the following steps: configuring a state migration process and a fault switching process;
after monitoring that a current channel in a link between a request end and a main response end fails, a state migration process pushes a preset state machine to perform corresponding state migration according to a preset migration strategy;
after monitoring that the preset state machine carries out corresponding state migration, the fault switching process determines whether an available channel exists according to the state of the state machine after migration;
if an available channel exists, the fault switching process switches the current channel to the available channel so that the request end can call the main response end;
if no available channel exists, the fault switching process switches the link between the request end and the main response end to the backup link, so that the request end can call the backup response end.
Embodiments of the present specification further provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program implements the disaster recovery method described above. The method at least comprises the following steps:
a disaster recovery method is applied to a request end which is respectively connected with a main response end and a standby response end, wherein a link between the request end and the main response end comprises a plurality of channels, and the method comprises the following steps: configuring a state migration process and a fault switching process;
after monitoring that the fault channel in the link between the request end and the main response end recovers, the state migration process pushes a preset state machine to perform corresponding state migration according to a preset migration strategy;
and after monitoring that the preset state machine carries out corresponding state transition, the fault switching process restores the current channel in the link between the request end and the main response end to the fault channel.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
From the above description of the embodiments, it is clear to those skilled in the art that the embodiments of the present disclosure can be implemented by software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions of the embodiments of the present specification may be embodied, in essence or in part, in the form of a software product, which may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments, or in some parts of the embodiments, of the present specification.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
The embodiments in the present specification are described in a progressive manner. For the parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the other embodiments. In particular, because the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly, and reference may be made to the corresponding descriptions of the method embodiment for the relevant points. The above-described apparatus embodiments are merely illustrative. The modules described as separate components may or may not be physically separate, and the functions of the modules may be implemented in one or more pieces of software and/or hardware when implementing the embodiments of the present disclosure. Some or all of the modules may also be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing is only a specific implementation of the embodiments of the present disclosure. It should be noted that those skilled in the art can make a number of improvements and refinements without departing from the principle of the embodiments of the present disclosure, and these improvements and refinements should also be regarded as falling within the protection scope of the embodiments of the present disclosure.

Claims (20)

1. A disaster tolerance switching method, applied to a requester, wherein the requester is connected to a main responder through a main link and to a backup responder through a backup link, and the main link between the requester and the main responder comprises a plurality of channels; the method comprises:
after monitoring the failure of the current channel in the main link between the requester and the main responder, determining whether there is an available channel in the main link;
if there is, switching the current channel to the available channel, so that the requester can call the main responder;
if there is not, switching the main link between the requester and the main responder to the backup link, so that the requester can call the backup responder.

2. The method according to claim 1, wherein, after monitoring the failure of the current channel in the main link between the requester and the main responder, determining whether there is an available channel in the main link comprises:
after monitoring the failure of the current channel in the main link between the requester and the main responder, pushing a preset state machine to perform corresponding state transition according to a preset migration strategy;
after monitoring that the preset state machine has performed the corresponding state transition, determining whether there is an available channel in the main link according to the state of the state machine after the transition.

3. The method of claim 2, wherein a state migration process and a failover process are preconfigured;
the state migration process is used to push the preset state machine to perform corresponding state transition according to the preset migration strategy after monitoring the failure of the current channel in the main link between the requester and the main responder;
the failover process is used to determine, after monitoring that the preset state machine has performed the corresponding state transition, whether there is an available channel in the main link according to the state of the state machine after the transition;
if there is, the failover process switches the current channel to the available channel, so that the requester can call the main responder;
if there is not, the failover process switches the main link between the requester and the main responder to the backup link, so that the requester can call the backup responder.

4. The method of claim 2, further comprising:
within a preset time period, recording the data transmitted through the current channel in the main link between the requester and the main responder;
counting the data that failed to be processed in the recorded data;
calculating the proportion of the counted failed data in the recorded data;
determining whether the proportion of the counted failed data in the recorded data exceeds a preset threshold;
if so, pushing the preset state machine to perform corresponding state transition according to the preset migration strategy.

5. The method according to claim 4, wherein determining whether the proportion of the counted failed data in the recorded data exceeds a preset threshold comprises:
determining whether the recorded data meets a preset requirement;
if so, determining whether the proportion of the counted failed data in the recorded data exceeds the preset threshold.

6. The method according to claim 4, wherein counting the data that failed to be processed in the recorded data comprises:
classifying the data that failed to be processed in the recorded data, and counting the failed data contained in each category;
calculating the proportion of the counted failed data in the recorded data comprises:
calculating the proportion, in the recorded data, of the failed data contained in one or more of the categories;
determining whether the proportion of the counted failed data in the recorded data exceeds a preset threshold comprises:
determining whether the calculated proportion, in the recorded data, of the failed data contained in the one or more categories exceeds a preset threshold corresponding to the categories.

7. The method according to claim 2, wherein, after monitoring the failure of the current channel in the main link between the requester and the main responder, pushing the preset state machine to perform corresponding state transition according to the preset migration strategy comprises:
after monitoring the failure of the current channel in the main link between the requester and the main responder, determining the preset migration strategy according to the current state of the preset state machine;
pushing the preset state machine to perform corresponding state transition according to the determined migration strategy.

8. The method of claim 3, wherein a fault-aware process is preconfigured;
the fault-aware process records, within a preset time period, the data transmitted through the current channel in the main link between the requester and the main responder;
the fault-aware process counts the data that failed to be processed in the recorded data;
the fault-aware process calculates the proportion of the counted failed data in the recorded data;
the fault-aware process determines whether the proportion of the counted failed data in the recorded data exceeds a preset threshold;
if so, the fault-aware process sends the state migration process a notification to push the preset state machine to perform corresponding state transition;
the state migration process pushing, after monitoring the failure of the current channel in the main link between the requester and the main responder, the preset state machine to perform corresponding state transition according to the preset migration strategy comprises:
after monitoring the state transition notification sent by the fault-aware process, the state migration process pushes the preset state machine to perform corresponding state transition according to the preset migration strategy.

9. The method according to claim 8, wherein the fault-aware process determining whether the proportion of the counted failed data in the recorded data exceeds a preset threshold comprises:
the fault-aware process determines whether the recorded data meets a preset requirement;
if so, the fault-aware process determines whether the proportion of the counted failed data in the recorded data exceeds the preset threshold.

10. The method according to claim 8, wherein the fault-aware process counting the data that failed to be processed in the recorded data comprises:
the fault-aware process classifies the data that failed to be processed in the recorded data, and counts the failed data contained in each category;
the fault-aware process calculating the proportion of the counted failed data in the recorded data comprises:
the fault-aware process calculates the proportion, in the recorded data, of the failed data contained in one or more of the categories;
the fault-aware process determining whether the proportion of the counted failed data in the recorded data exceeds a preset threshold comprises:
the fault-aware process determines whether the calculated proportion, in the recorded data, of the failed data contained in the one or more categories exceeds a preset threshold corresponding to the categories.

11. The method of claim 3, wherein the state migration process pushing, after monitoring the failure of the current channel in the main link between the requester and the main responder, the preset state machine to perform corresponding state transition according to the preset migration strategy comprises:
after monitoring the failure of the current channel in the main link between the requester and the main responder, the state migration process determines the preset migration strategy according to the current state of the preset state machine;
the state migration process pushes the preset state machine to perform corresponding state transition according to the determined migration strategy.

12. A disaster tolerance recovery method, applied to a requester, wherein the requester is connected to a main responder through a main link and to a backup responder through a backup link, and the main link between the requester and the main responder comprises a plurality of channels; the method comprises:
after monitoring the fault recovery of the faulty channel in the main link between the requester and the main responder, restoring the current channel in the main link between the requester and the main responder to the faulty channel.

13. The method according to claim 12, wherein, after monitoring the fault recovery of the faulty channel in the main link between the requester and the main responder, restoring the current channel in the main link between the requester and the main responder to the faulty channel comprises:
after monitoring the fault recovery of the faulty channel in the main link between the requester and the main responder, pushing a preset state machine to perform corresponding state reversion according to a preset migration strategy;
after monitoring that the preset state machine has performed the corresponding state reversion, restoring the current channel in the main link between the requester and the main responder to the faulty channel.

14. The method of claim 13, wherein a state migration process and a failover process are preconfigured;
the state migration process is used to push the preset state machine to perform corresponding state reversion according to the preset migration strategy after monitoring the fault recovery of the faulty channel in the main link between the requester and the main responder;
the failover process is used to restore, after monitoring that the preset state machine has performed the corresponding state reversion, the current channel in the main link between the requester and the main responder to the faulty channel.

15. The method of claim 13, further comprising:
sending the data transmitted by the current channel, or preset virtual data, to the faulty channel;
within a preset time period, recording the data transmitted through the faulty channel;
counting the data that failed to be processed in the recorded data;
calculating the proportion of the counted failed data in the recorded data;
determining whether the proportion of the counted failed data in the recorded data exceeds a preset threshold;
if not, pushing the preset state machine to perform corresponding state reversion according to the preset migration strategy.

16. The method according to claim 14, wherein a fault-aware process is preconfigured; the method further comprises:
the fault-aware process sends the data transmitted by the current channel, or preset virtual data, to the faulty channel;
within a preset time period, the fault-aware process records the data transmitted through the faulty channel;
the fault-aware process counts the data that failed to be processed in the recorded data;
the fault-aware process calculates the proportion of the counted failed data in the recorded data;
the fault-aware process determines whether the proportion of the counted failed data in the recorded data exceeds a preset threshold;
if not, the fault-aware process sends the state migration process a notification to push the preset state machine to perform corresponding state reversion;
After monitoring the failure recovery of the faulty channel in the
main link between the requester and the main responder, the state transition process pushes the preset state machine to perform corresponding state reversion according to the preset migration strategy, including:所述状态迁移进程在监听到故障感知进程发送的状态回迁通知之后,按照预设的迁移策略推动预设的状态机进行相应的状态回迁。The state transition process pushes a preset state machine to perform a corresponding state transition according to a preset transition strategy after monitoring the state transition notification sent by the fault sensing process.17.根据权利要求15所述的方法,所述在所记录的数据中统计处理失败的数据,包括:17. The method according to claim 15, wherein the statistics processing failure data in the recorded data comprises:在所记录的数据中对处理失败的数据进行分类,统计每类中包含的处理失败的数据;Classify the data that fails to be processed in the recorded data, and count the data that fails to be processed in each category;所述计算所统计的处理失败的数据在所记录的数据中的占比,包括:The proportion of the data that fails to be processed according to the calculation in the recorded data, including:计算所分的类别中其中一类或几类包含的处理失败的数据在所记录的数据中的占比;Calculate the proportion of data that fails to be processed in one or more of the classified categories in the recorded data;所述判断所统计的处理失败的数据在所记录的数据中的占比是否超过预设的阈值,包括:Described judging whether the proportion of the counted data that fails to be processed in the recorded data exceeds a preset threshold, including:判断所计算的其中一类或几类包含的处理失败的数据在所记录的数据中的占比,是否超过与所分的类别对应的预设阈值。It is judged whether the proportion of the data that has failed to be processed contained in the calculated one or several categories in the recorded data exceeds a preset threshold corresponding to the classified category.18.根据权利要求16所述的方法,所述故障感知进程在所记录的数据中统计处理失败的数据,包括:18. 
The method according to claim 16, wherein the failure-aware process statistically processes the failed data in the recorded data, comprising:所述故障感知进程在所记录的数据中对处理失败的数据进行分类,统计每类中包含的处理失败的数据;The fault awareness process classifies the data that fails to be processed in the recorded data, and counts the data that fails to be processed contained in each category;所述故障感知进程计算所统计的处理失败的数据在所记录的数据中的占比,包括:The fault-aware process calculates the proportion of the counted processing failure data in the recorded data, including:所述故障感知进程计算所分的类别中其中一类或几类包含的处理失败的数据在所记录的数据中的占比;The fault perception process calculates the proportion of the data that fails to be processed contained in one or more of the categories in the recorded data;所述故障感知进程判断所统计的处理失败的数据在所记录的数据中的占比是否超过预设的阈值,包括:The fault perception process determines whether the proportion of the counted data that fails to be processed in the recorded data exceeds a preset threshold, including:所述故障感知进程判断所计算的其中一类或几类包含的处理失败的数据在所记录的数据中的占比,是否超过与所分的类别对应的预设阈值。The fault perception process judges whether the ratio of the calculated data containing processing failures in one or several categories in the recorded data exceeds a preset threshold value corresponding to the classified category.19.一种计算机设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,其中,所述处理器执行所述程序时实现如权利要求1至11任一项所述的方法。19. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the program as claimed in any one of claims 1 to 11 when the processor executes the program. method described.20.一种计算机设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,其中,所述处理器执行所述程序时实现如权利要求12至17任一项所述的方法。20. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the program as claimed in any one of claims 12 to 17 when the processor executes the program. method described.
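The recovery flow recited in claims 13, 15 and 17 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The names `ProbeResult`, `failure_ratios`, `channel_recovered`, `LinkState` and `RecoveryStateMachine`, the event name `"fault_recovered"`, and the threshold values are all hypothetical. The sketch also assumes that the overall threshold of claim 15 and the per-category thresholds of claim 17 are checked together, which the claims do not require.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class LinkState(Enum):
    """Hypothetical states: which channel of the primary link is in use."""
    ORIGINAL_CHANNEL = "original_channel"    # the channel that had failed
    ALTERNATE_CHANNEL = "alternate_channel"  # the channel used after failover


@dataclass(frozen=True)
class ProbeResult:
    """One piece of data recorded on the faulty channel during the window."""
    ok: bool
    category: str = ""  # failure category (assumed label), empty when ok


def failure_ratios(results):
    """Return (overall failure ratio, {category: ratio}) over the recorded data."""
    total = len(results)
    if total == 0:
        return None, {}
    failed = [r for r in results if not r.ok]
    per_category = Counter(r.category for r in failed)
    return len(failed) / total, {c: n / total for c, n in per_category.items()}


def channel_recovered(results, overall_threshold, category_thresholds=None):
    """True when no failure ratio exceeds its preset threshold."""
    overall, per_category = failure_ratios(results)
    if overall is None:
        return False  # nothing recorded, so no evidence of recovery (assumption)
    if overall > overall_threshold:
        return False
    for category, limit in (category_thresholds or {}).items():
        if per_category.get(category, 0.0) > limit:
            return False
    return True


class RecoveryStateMachine:
    """State machine driven by a migration strategy keyed on the current state."""

    # Hypothetical migration strategy: (current state, event) -> next state.
    STRATEGY = {
        (LinkState.ALTERNATE_CHANNEL, "fault_recovered"): LinkState.ORIGINAL_CHANNEL,
    }

    def __init__(self, state):
        self.state = state
        self.listeners = []  # callbacks standing in for the failover process

    def fire(self, event):
        """Apply the strategy; return True if a state reversion happened."""
        next_state = self.STRATEGY.get((self.state, event))
        if next_state is None:
            return False
        self.state = next_state
        for callback in self.listeners:
            callback(next_state)
        return True


if __name__ == "__main__":
    current = {"channel": "channel-2"}  # hypothetical channel names
    machine = RecoveryStateMachine(LinkState.ALTERNATE_CHANNEL)
    # The listener restores the current channel once the state has reverted.
    machine.listeners.append(lambda state: current.update(channel="channel-1"))

    recorded = [ProbeResult(True)] * 98 + [ProbeResult(False, "timeout")] * 2
    if channel_recovered(recorded, 0.05, {"timeout": 0.03}):
        machine.fire("fault_recovered")
    print(machine.state, current["channel"])
```

Keeping the threshold check separate from the state machine mirrors the split in claims 14 and 16 between a fault awareness process, a state migration process and a failover process: the check only decides whether to raise the event, and the channel switch happens only as a reaction to the state change.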
CN202111582686.6A | 2018-08-01 | 2018-08-01 | Disaster-tolerant switching method and device | Pending | CN114218020A (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
CN202111582686.6A | CN114218020A (en) | 2018-08-01 | 2018-08-01 | Disaster-tolerant switching method and device

Applications Claiming Priority (2)

Application Number | Publication | Priority Date | Filing Date | Title
CN201810866695.XA | CN109101371B (en) | 2018-08-01 | 2018-08-01 | Disaster recovery switching method and device
CN202111582686.6A | CN114218020A (en) | 2018-08-01 | 2018-08-01 | Disaster-tolerant switching method and device

Related Parent Applications (1)

Application Number | Relation | Publication | Priority Date | Filing Date | Title
CN201810866695.XA | Division | CN109101371B (en) | 2018-08-01 | 2018-08-01 | Disaster recovery switching method and device

Publications (1)

Publication Number | Publication Date
CN114218020A (en) | 2022-03-22

Family

ID=64848384

Family Applications (2)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN202111582686.6A | Pending | CN114218020A (en) | 2018-08-01 | 2018-08-01 | Disaster-tolerant switching method and device
CN201810866695.XA | Active | CN109101371B (en) | 2018-08-01 | 2018-08-01 | Disaster recovery switching method and device

Family Applications After (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201810866695.XA | Active | CN109101371B (en) | 2018-08-01 | 2018-08-01 | Disaster recovery switching method and device

Country Status (1)

Country | Link
CN (2) | CN114218020A (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111427728B (en) | 2019-12-31 | 2022-07-01 | 杭州海康威视数字技术股份有限公司 | State management method, main/standby switching method and electronic equipment
CN111488248A (en) * | 2020-04-14 | 2020-08-04 | 深信服科技股份有限公司 | Control method, device and equipment for hosting private cloud system and storage medium
CN111698157B (en) * | 2020-07-23 | 2022-05-24 | 迈普通信技术股份有限公司 | Link management method, board card and switch
CN112260895A (en) * | 2020-10-16 | 2021-01-22 | 深圳卡路里科技有限公司 | Data transmission method and device and processor

Citations (12)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20030185233A1 (en) * | 2002-03-29 | 2003-10-02 | Fujitsu Limited | Method, apparatus, and medium for migration across link technologies
CN101159747A (en) * | 2007-11-12 | 2008-04-09 | 中兴通讯股份有限公司 | TCP concurrency multilink based communication system and method therefor
CN101237314A (en) * | 2008-01-16 | 2008-08-06 | 杭州华三通信技术有限公司 | A method and access device for guaranteeing duplication service transmission
US20090207726A1 (en) * | 2008-02-14 | 2009-08-20 | Graeme Thomson | System and method for network recovery from multiple link failures
CN102355414A (en) * | 2011-09-21 | 2012-02-15 | 中兴通讯股份有限公司 | Processing method and device of APS (automatic protection switching) state machine
US20140078894A1 (en) * | 2012-09-17 | 2014-03-20 | Electronics And Telecommunications Research Institute | Lane fault recovery apparatus and method
CN104038362A (en) * | 2013-03-08 | 2014-09-10 | 中国移动通信集团广东有限公司 | Network system
CN105930245A (en) * | 2015-12-16 | 2016-09-07 | 中国银联股份有限公司 | Method and system for monitoring service state of transaction terminal
CN106096960A (en) * | 2016-06-07 | 2016-11-09 | 上海携程商务有限公司 | The method and apparatus of the outside payment system of monitoring and method of payment and system
CN108182139A (en) * | 2018-01-31 | 2018-06-19 | 中国银行股份有限公司 | Method for early warning, device and system
CN108243021A (en) * | 2016-12-23 | 2018-07-03 | 国网重庆市电力公司綦南供电分公司 | A kind of disaster recovery emergency device and system
CN108255678A (en) * | 2018-01-24 | 2018-07-06 | 郑州云海信息技术有限公司 | Monitoring nodes method, apparatus and storage medium based on Rack whole machine cabinets

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN102650961B (en) * | 2012-03-31 | 2014-01-01 | 华为技术有限公司 | Method and system for monitoring data replication of disaster recovery system and a disaster recovery system
CN103812675A (en) * | 2012-11-08 | 2014-05-21 | 中兴通讯股份有限公司 | Method and system for realizing allopatric disaster recovery switching of service delivery platform
CN103001810B (en) * | 2012-12-26 | 2015-12-23 | 盛科网络(苏州)有限公司 | Network path protection changing method and system
CN103914354A (en) * | 2012-12-31 | 2014-07-09 | 北京新媒传信科技有限公司 | Method and system for database fault recovery
CN103259678B (en) * | 2013-04-28 | 2016-06-08 | 华为技术有限公司 | Main/standby switching method, device, equipment and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨靖: "分组传送网原理与技术", 31 August 2015, 北京邮电大学出版社, pages: 163 - 166*

Also Published As

Publication number | Publication date
CN109101371A (en) | 2018-12-28
CN109101371B (en) | 2021-11-16

Similar Documents

Publication | Publication Date | Title
CN109101371B (en) | | Disaster recovery switching method and device
US10078563B2 (en) | | Preventing split-brain scenario in a high-availability cluster
CN108737132B (en) | | Alarm information processing method and device
CN110022260B (en) | | Cross-environment receipt message processing method and system
US11438249B2 (en) | | Cluster management method, apparatus and system
US20160342459A1 (en) | | Dynamic escalation of service conditions
US20170351560A1 (en) | | Software failure impact and selection system
CN110413457A (en) | | Cloud service disaster recovery method and device
WO2023226380A1 (en) | | Disk processing method and system, and electronic device
CN114189429A (en) | | System, method, device and medium for monitoring server cluster faults
US10599505B1 (en) | | Event handling system with escalation suppression
CN102185717A (en) | | Service processing equipment, method and system
CN116743550B (en) | | Processing method of fault storage nodes of distributed storage cluster
CN112463514A (en) | | Monitoring method and device for distributed cache cluster
CN110321261B (en) | | Monitoring system and monitoring method
CN116501529A (en) | | Fault processing method and device, storage medium and electronic equipment
CN114064343B (en) | | Abnormal handling method and device for block chain
TWM641985U (en) | | Device for determining sending order to send notification messages based on sending parameters
CN108717384B (en) | | Data backup method and device
CN114090293A (en) | | Service providing method and electronic equipment
CN114221878A (en) | | Fault node detection method, system, electronic equipment and storage medium
CN114090346A (en) | | Data processing method and device
CN115858222B (en) | | Virtual machine fault processing method, system and electronic equipment
WO2020107208A1 (en) | | Failure notification method and device, and apparatus
CN114971649B (en) | | Information processing method, device, equipment and storage medium based on block chain

Legal Events

Date | Code | Title | Description
| PB01 | Publication |
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| SE01 | Entry into force of request for substantive examination |
