Split-brain (computing)

From Wikipedia, the free encyclopedia

Concept in computing

In computing, split-brain is a state of data or availability inconsistency that arises when two separate data sets with overlapping scope are maintained, either because of how servers are arranged in a network design or because of a failure condition in which servers stop communicating and synchronizing their data with each other. The latter case is also commonly referred to as a network partition.^{[citation needed]} The name is based on an analogy with the medical split-brain syndrome.

State

Although the term split-brain typically refers to an error state, split-brain DNS (or split-horizon DNS) is sometimes used to describe a deliberate configuration in which the internal and external Domain Name System (DNS) services for a corporate network do not communicate, so that separate DNS name spaces are administered for external computers and for internal ones. This requires double administration, and if the computer names overlap between the domains, there is a risk that the same fully qualified domain name (FQDN) may occur ambiguously in both name spaces, referring to different computer IP addresses.^[1]
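The ambiguity can be illustrated with a minimal Python sketch. The zone contents, the internal address range, and the resolve function are all hypothetical, and the two separately administered name spaces are modelled simply as two dictionaries; real DNS servers are configured quite differently.

```python
import ipaddress

# Hypothetical zone data: the same FQDN exists in both name spaces
# but refers to a different IP address in each.
INTERNAL_ZONE = {"portal.example.com": "10.0.0.5"}
EXTERNAL_ZONE = {"portal.example.com": "203.0.113.10"}

# Assumed address range of the corporate network.
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")


def resolve(fqdn, client_ip):
    """Answer from the internal name space for internal clients,
    and from the external name space for everyone else."""
    is_internal = ipaddress.ip_address(client_ip) in INTERNAL_NET
    zone = INTERNAL_ZONE if is_internal else EXTERNAL_ZONE
    return zone.get(fqdn)
```

Under these assumptions, resolve("portal.example.com", "10.1.2.3") returns "10.0.0.5", while the same name queried from "198.51.100.7" returns "203.0.113.10".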

High-availability clusters (HA clusters) usually use a private heartbeat network connection to monitor the health and status of each node in the cluster. Split-brain may occur, for example, when all of the private links go down simultaneously while the cluster nodes are still running, each believing it is the only one running. Each node may then serve clients from its own data set and apply its own "idiosyncratic" updates, without any coordination with the other nodes. This may lead to data corruption or other data inconsistencies that might require operator intervention and cleanup.
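The failure scenario can be reproduced in a toy Python model. The Node class and its naive takeover rule are assumptions made for illustration and do not correspond to any particular cluster product.

```python
class Node:
    """Toy cluster node with a naive failover rule (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.active = False
        self.data = {}

    def on_heartbeat_check(self, peer_heartbeat_seen):
        # Naive rule: a missing heartbeat is taken to mean the peer is dead,
        # so this node promotes itself.
        if not peer_heartbeat_seen:
            self.active = True


a, b = Node("a"), Node("b")
a.active = True  # "a" starts as the only active node

# All private links fail at once: neither node sees the other's heartbeat,
# although both are still running.
a.on_heartbeat_check(peer_heartbeat_seen=False)
b.on_heartbeat_check(peer_heartbeat_seen=False)

# Both nodes now accept client writes independently.
a.data["balance"] = 100
b.data["balance"] = 250

split_brain = a.active and b.active and a.data != b.data
```

After the link failure both nodes are active and hold conflicting values for the same key, so split_brain evaluates to True.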

Approaches for dealing with split-brain

Davidson et al.,^[2] after surveying several approaches to handle the problem, classify them as either optimistic or pessimistic.

The optimistic approaches simply let the partitioned nodes work as usual; this provides a greater level of availability at the cost of correctness. Once the partition has ended, automatic or manual reconciliation may be required to bring the cluster back to a consistent state. One current implementation of this approach is Hazelcast, which performs automatic reconciliation of its key-value store.^[3]
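As a rough illustration of automatic reconciliation, the following Python sketch merges two diverged key-value stores with a last-write-wins rule. This rule is only one possible merge policy, chosen here as an assumption; it is not presented as the policy Hazelcast uses, and the timestamped-tuple layout is hypothetical.

```python
def merge_last_write_wins(side_a, side_b):
    """Merge two stores that diverged during a partition.

    Each store maps key -> (timestamp, value). For every key the entry
    with the newer timestamp is kept; on a tie, side_a's entry is kept.
    """
    merged = dict(side_a)
    for key, (timestamp, value) in side_b.items():
        if key not in merged or timestamp > merged[key][0]:
            merged[key] = (timestamp, value)
    return merged


# Writes accepted independently by each side of the partition.
side_a = {"x": (1, "from-a"), "y": (5, "from-a")}
side_b = {"x": (3, "from-b"), "z": (2, "from-b")}

merged = merge_last_write_wins(side_a, side_b)
```

Here merged keeps the newer write to "x" from side_b and silently discards the older one from side_a, which shows how availability during the partition is paid for with lost updates.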

The pessimistic approaches sacrifice availability in exchange for consistency. Once a network partition has been detected, access to the sub-partitions is limited in order to guarantee consistency. A typical approach, as described by Coulouris et al.,^[4] is to use quorum consensus. This allows the sub-partition holding a majority of the votes to remain available, while the remaining sub-partitions fall back to an auto-fencing mode. One current implementation of this approach is the one used by MongoDB replica sets.^[5] Another is Galera replication for MariaDB and MySQL.^[6]
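The majority rule can be sketched in a few lines of Python. The function names and the one-vote-per-node assumption are illustrative only and are not taken from MongoDB or Galera.

```python
def has_quorum(reachable_votes, total_votes):
    """Return True only for a strict majority of all votes."""
    return 2 * reachable_votes > total_votes


def partition_mode(reachable_votes, total_votes):
    """Decide how a sub-partition behaves once a partition is detected."""
    return "available" if has_quorum(reachable_votes, total_votes) else "fenced"


# A five-node cluster (one vote per node, by assumption) splits 3 / 2.
majority_side = partition_mode(3, 5)
minority_side = partition_mode(2, 5)

# A four-node cluster splits 2 / 2: neither side has a strict majority.
even_split = partition_mode(2, 4)
```

With these inputs majority_side is "available", while minority_side and even_split are both "fenced". Because a strict majority is required, at most one sub-partition can ever remain available.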

Modern commercial general-purpose HA clusters typically use a combination of heartbeat network connections between cluster hosts and quorum witness storage. The challenge with two-node clusters is that adding a witness device adds cost and complexity (even if implemented in the cloud), but without it, if the heartbeat fails, cluster members cannot determine which should be active. In such clusters (without quorum), if a member fails, there is at least a 50% probability that the two-node HA cluster will fail completely until human intervention is provided, even if the members normally assign primary and secondary statuses to the hosts. This behavior prevents multiple members from becoming active independently and either conflicting directly or corrupting data.
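The role of the witness in a two-node cluster can be sketched as follows. The Witness class and the may_stay_active function are hypothetical; the sketch assumes the witness grants its single vote to the first member that claims it, which a real witness would have to enforce atomically on shared storage.

```python
class Witness:
    """Hypothetical tiebreaker that grants its one vote to the first claimant."""

    def __init__(self):
        self.holder = None

    def claim(self, node):
        if self.holder is None:
            self.holder = node
        return self.holder == node


def may_stay_active(node, peer_reachable, witness=None):
    """Apply a strict-majority rule to a two-node cluster.

    Votes: one per member, plus one for the witness if one is configured.
    """
    total_votes = 2 if witness is None else 3
    votes = 1 + int(peer_reachable)
    if witness is not None and witness.claim(node):
        votes += 1
    return 2 * votes > total_votes


# Heartbeat lost, no witness: each member holds 1 of 2 votes, so neither
# may stay active.
no_witness = may_stay_active("a", peer_reachable=False)

# Heartbeat lost, with a witness: only the member that wins the witness vote
# holds 2 of 3 votes.
w = Witness()
a_active = may_stay_active("a", peer_reachable=False, witness=w)
b_active = may_stay_active("b", peer_reachable=False, witness=w)
```

In this run no_witness is False, a_active is True and b_active is False: without the witness the cluster stops entirely, while with it exactly one member continues.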

References

^[1] Windows Server 2008 Active Directory, Configuring (2nd Edition). ISBN 978-0-7356-5193-7.
^[2] Davidson, Susan; Garcia-Molina, Hector; Skeen, Dale (1985). "Consistency In A Partitioned Network: A Survey". ACM Computing Surveys. 17 (3): 341–370. doi:10.1145/5505.5508. hdl:1813/6456. S2CID 8424228.
^[3] "Hazelcast Documentation". Retrieved 16 February 2015.
^[4] Coulouris, George; Dollimore, Jean; Kindberg, Tim (2001). Distributed systems: concepts and design (3rd ed.). Harlow [u.a.]: Addison-Wesley. ISBN 0201-61918-0.
^[5] "MongoDB Replication Fundamentals". Retrieved 12 December 2012.
^[6] "Weighted Quorum in Galera Cluster". Retrieved 17 December 2015.
