US20030158921A1 - Method for detecting the quick restart of liveness daemons in a distributed multinode data processing system - Google Patents


Publication number
US20030158921A1
Authority
US
United States
Prior art keywords
node
group
message
nodes
adapter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US10/078,076
Other versions
US7203748B2 (en
Inventor
John Hare
Felipe Knop
Tseng-Hui Lin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2002-02-15
Filing date
2002-02-15
Publication date
2003-08-21
Application filed by International Business Machines Corp
Priority to US10/078,076 (US7203748B2)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION: assignment of assignors interest (see document for details). Assignors: HARE, JOHN R.; KNOP, FELIPE; LIN, TSENG-HUI
Publication of US20030158921A1
Application granted
Publication of US7203748B2
Adjusted expiration
Expired - Fee Related (current legal status)


Abstract

In distributed multinode data processing systems, mechanisms are employed to ensure that the nodes are properly informed about the liveness of the other nodes in node groups in the network. In particular, the present invention employs group membership indicia as part of a mechanism for detecting that a node and/or its adapter has failed and has recently been restarted. Once this situation is detected, the group membership inconsistencies that it can engender are avoided.

Claims (5)

The invention claimed is:
1. A method for detecting the quick restart of liveness daemons in a distributed, multinode data processing system in which nodes communicate liveness indicia in the form of heartbeat signals via adapters coupled to each node, said method comprising the steps of:
sending, from a first node to other nodes that are not in the sender's membership group, a first message which includes at least indicia of occurrence of a quick restart; and
determining, from said indicia of occurrence of said quick restart and from locally stored group membership information, the existence of a quick restart at said first node, and responding thereto by sending a second message which indicates that said first node is to be expelled from the group.
2. The method of claim 1 in which said second message is sent by the node that is the downstream neighbor, in terms of heartbeat passing signals, of the node that sent the first message.
3. The method of claim 1 in which said quick restart indicia are selected from the group consisting of: (1) an indication that sender and receiver are not in the same adapter membership group; (2) an indication that the sender's address is part of the current adapter membership group according to said receiver; and (3) an indication of difference in instantiation number for the sender's adapter.
4. A multinode data processing system comprising:
a plurality of data processing nodes connected in a network capable of transmitting messages between nodes;
storage means within said nodes containing program code for sending, from a first node to other nodes that are not in the sender's membership group a first message which includes at least indicia of occurrence of a quick restart and for determining, from said indicia of occurrence of said quick restart and from group membership information in storage at at least one recipient node, the existence of said quick restart at said first node, and responding thereto by sending a second message which indicates that said first node is to be expelled from the group.
5. A machine readable medium containing program code for use in a multinode data processing system for sending, from a first node to other nodes that are not in the sender's membership group a first message which includes at least indicia of occurrence of a quick restart and for determining, from said indicia of occurrence of said quick restart and from group membership information in storage at at least one recipient node, the existence of a quick restart at said first node, and responding thereto by sending a second message which indicates that said first node is to be expelled from the group.
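
The receiver-side check described in claims 1 to 3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names FirstMessage, ExpelMessage, LocalView and all field names are hypothetical, the heartbeat ring is assumed to be a list ordered in the heartbeat direction (so the downstream neighbor is simply the next entry), and the sketch requires all three indicia of claim 3 together, although the claim lists them as alternatives.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class FirstMessage:
    """Hypothetical first message, sent to nodes outside the sender's group."""
    sender_addr: str
    sender_group_id: int   # group the sender currently believes it is in
    sender_instance: int   # instantiation number of the sender's adapter


@dataclass(frozen=True)
class ExpelMessage:
    """Hypothetical second message: the target is to be expelled."""
    target_addr: str
    sent_by: str


@dataclass
class LocalView:
    """Group membership information stored locally at the receiving node."""
    my_addr: str
    group_id: int
    ring: list[str]                                  # assumed heartbeat order
    instances: dict[str, int] = field(default_factory=dict)

    def downstream_neighbor(self, addr: str) -> str:
        # Assumption: heartbeats flow from ring[i] to ring[i + 1], wrapping.
        i = self.ring.index(addr)
        return self.ring[(i + 1) % len(self.ring)]


def quick_restart_detected(view: LocalView, msg: FirstMessage) -> bool:
    # Indicium 1: sender and receiver are not in the same membership group.
    not_same_group = msg.sender_group_id != view.group_id
    # Indicium 2: the receiver still lists the sender as a current member.
    still_listed = msg.sender_addr in view.ring
    # Indicium 3: the sender's adapter instantiation number has changed.
    known = view.instances.get(msg.sender_addr)
    instance_changed = known is not None and known != msg.sender_instance
    # Assumption: this sketch demands all three at once.
    return not_same_group and still_listed and instance_changed


def handle_first_message(view: LocalView,
                         msg: FirstMessage) -> Optional[ExpelMessage]:
    if not quick_restart_detected(view, msg):
        return None
    # Claim 2: only the restarted node's downstream neighbor responds.
    if view.downstream_neighbor(msg.sender_addr) != view.my_addr:
        return None
    return ExpelMessage(target_addr=msg.sender_addr, sent_by=view.my_addr)


if __name__ == "__main__":
    view_b = LocalView("B", group_id=7, ring=["A", "B", "C"],
                       instances={"A": 1, "B": 1, "C": 1})
    restarted = FirstMessage("A", sender_group_id=9, sender_instance=2)
    print(handle_first_message(view_b, restarted))
```

Restricting the response to one node keeps the group from issuing duplicate expel messages; the other members that receive the first message detect the same condition but stay silent.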
US10/078,076 | 2002-02-15 | 2002-02-15 | Method for detecting the quick restart of liveness daemons in a distributed multinode data processing system | Expired - Fee Related | US7203748B2 (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
US10/078,076 (US7203748B2 (en)) | 2002-02-15 | 2002-02-15 | Method for detecting the quick restart of liveness daemons in a distributed multinode data processing system

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
US10/078,076 (US7203748B2 (en)) | 2002-02-15 | 2002-02-15 | Method for detecting the quick restart of liveness daemons in a distributed multinode data processing system

Publications (2)

Publication Number | Publication Date
US20030158921A1 (en) | 2003-08-21
US7203748B2 (en) | 2007-04-10

Family

ID=27732764

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
US10/078,076 (Expired - Fee Related, US7203748B2 (en)) | Method for detecting the quick restart of liveness daemons in a distributed multinode data processing system | 2002-02-15 | 2002-02-15

Country Status (1)

Country | Link
US (1) | US7203748B2 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US7389293B2 (en) * | 2000-12-20 | 2008-06-17 | Oracle International Corporation | Remastering for asymmetric clusters in high-load scenarios
US7137040B2 (en) * | 2003-02-12 | 2006-11-14 | International Business Machines Corporation | Scalable method of continuous monitoring the remotely accessible resources against the node failures for very large clusters
US7475134B2 (en) * | 2003-10-14 | 2009-01-06 | International Business Machines Corporation | Remote activity monitoring
US7493400B2 (en) * | 2005-05-18 | 2009-02-17 | Oracle International Corporation | Creating and dissolving affinity relationships in a cluster
US8037169B2 (en) * | 2005-05-18 | 2011-10-11 | Oracle International Corporation | Determining affinity in a cluster
US7814065B2 (en) * | 2005-08-16 | 2010-10-12 | Oracle International Corporation | Affinity-based recovery/failover in a cluster environment
US20100238813A1 (en) | 2006-06-29 | 2010-09-23 | Nortel Networks Limited | Q-in-Q Ethernet rings
US7840662B1 (en) * | 2008-03-28 | 2010-11-23 | EMC(Benelux) B.V., S.A.R.L. | Dynamically managing a network cluster
US9762662B2 (en) | 2011-05-12 | 2017-09-12 | Microsoft Technology Licensing, Llc | Mass re-formation of groups in a peer-to-peer network
US8903893B2 (en) * | 2011-11-15 | 2014-12-02 | International Business Machines Corporation | Diagnostic heartbeating in a distributed data processing environment
US9244796B2 (en) | 2011-11-15 | 2016-01-26 | International Business Machines Corporation | Diagnostic heartbeat throttling
US8769089B2 (en) * | 2011-11-15 | 2014-07-01 | International Business Machines Corporation | Distributed application using diagnostic heartbeating
US8756453B2 (en) | 2011-11-15 | 2014-06-17 | International Business Machines Corporation | Communication system with diagnostic capabilities
US8874974B2 (en) | 2011-11-15 | 2014-10-28 | International Business Machines Corporation | Synchronizing a distributed communication system using diagnostic heartbeating
US9900229B2 (en) * | 2016-01-29 | 2018-02-20 | Microsoft Technology Licensing, Llc | Network-connectivity detection

Patent Citations (21)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US5357630A (en) * | 1991-10-21 | 1994-10-18 | Motorola, Inc. | Name resolution method for a distributed data base management system
US6446134B1 (en) * | 1995-04-19 | 2002-09-03 | Fuji Xerox Co., Ltd | Network management system
US5764875A (en) * | 1996-04-30 | 1998-06-09 | International Business Machines Corporation | Communications program product involving groups of processors of a distributed computing environment
US6014669A (en) * | 1997-10-01 | 2000-01-11 | Sun Microsystems, Inc. | Highly-available distributed cluster configuration database
US6061723A (en) * | 1997-10-08 | 2000-05-09 | Hewlett-Packard Company | Network management event correlation in environments containing inoperative network elements
US5999712A (en) * | 1997-10-21 | 1999-12-07 | Sun Microsystems, Inc. | Determining cluster membership in a distributed computer system
US6163855A (en) * | 1998-04-17 | 2000-12-19 | Microsoft Corporation | Method and system for replicated and consistent modifications in a server cluster
US6308282B1 (en) * | 1998-11-10 | 2001-10-23 | Honeywell International Inc. | Apparatus and methods for providing fault tolerance of networks and network interface cards
US6532494B1 (en) * | 1999-05-28 | 2003-03-11 | Oracle International Corporation | Closed-loop node membership monitor for network clusters
US7069320B1 (en) * | 1999-10-04 | 2006-06-27 | International Business Machines Corporation | Reconfiguring a network by utilizing a predetermined length quiescent state
US20020049845A1 (en) * | 2000-03-16 | 2002-04-25 | Padmanabhan Sreenivasan | Maintaining membership in high availability systems
US6854069B2 (en) * | 2000-05-02 | 2005-02-08 | Sun Microsystems Inc. | Method and system for achieving high availability in a networked computer system
US6885644B1 (en) * | 2000-05-30 | 2005-04-26 | International Business Machines Corporation | Topology propagation in a distributed computing environment with no topology message traffic in steady state
US6857082B1 (en) * | 2000-11-21 | 2005-02-15 | Unisys Corporation | Method for providing a transition from one server to another server clustered together
US6965936B1 (en) * | 2000-12-06 | 2005-11-15 | Novell, Inc. | Method for detecting and resolving a partition condition in a cluster
US6785678B2 (en) * | 2000-12-21 | 2004-08-31 | Emc Corporation | Method of improving the availability of a computer clustering system through the use of a network medium link state function
US20020169861A1 (en) * | 2001-05-08 | 2002-11-14 | International Business Machines Corporation | Method for determination of remote adapter and/or node liveness
US20050128960A1 (en) * | 2001-05-08 | 2005-06-16 | International Business Machines Corporation | Method for determination of remote adapter and/or node liveness
US20030158936A1 (en) * | 2002-02-15 | 2003-08-21 | International Business Machines Corporation | Method for controlling group membership in a distributed multinode data processing system to assure mutually symmetric liveness status indications
US7043550B2 (en) * | 2002-02-15 | 2006-05-09 | International Business Machines Corporation | Method for controlling group membership in a distributed multinode data processing system to assure mutually symmetric liveness status indications
US7058957B1 (en) * | 2002-07-12 | 2006-06-06 | 3Pardata, Inc. | Cluster event notification system

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20050273657A1 (en) * | 2004-04-01 | 2005-12-08 | Hiroshi Ichiki | Information processing apparatus and method, and recording medium and program for controlling the same
CN100358300C (en) * | 2004-08-16 | 2007-12-26 | Ut斯达康通讯有限公司 | Network element restart detecting method
US20070255819A1 (en) * | 2006-05-01 | 2007-11-01 | Hua Binh K | Methods and Arrangements to Detect a Failure in a Communication Network
US20080225733A1 (en) * | 2006-05-01 | 2008-09-18 | Hua Binh K | Methods and Arrangements to Detect A Failure In A Communication Network
US7743129B2 (en) | 2006-05-01 | 2010-06-22 | International Business Machines Corporation | Methods and arrangements to detect a failure in a communication network
US7765290B2 (en) | 2006-05-01 | 2010-07-27 | International Business Machines Corporation | Methods and arrangements to detect a failure in a communication network
US20080270823A1 (en) * | 2007-04-27 | 2008-10-30 | International Business Machines Corporation | Fast node failure detection via disk based last gasp mechanism
US7937610B2 (en) * | 2007-04-27 | 2011-05-03 | International Business Machines Corporation | Fast node failure detection via disk based last gasp mechanism
US20140310410A1 (en) * | 2010-12-03 | 2014-10-16 | International Business Machines Corporation | Inter-node communication scheme for node status sharing
US9553789B2 (en) * | 2010-12-03 | 2017-01-24 | International Business Machines Corporation | Inter-node communication scheme for sharing node operating status
US11184392B2 (en) | 2016-06-30 | 2021-11-23 | Sophos Limited | Detecting lateral movement by malicious applications
US11184391B2 (en) | 2016-06-30 | 2021-11-23 | Sophos Limited | Server-client authentication with integrated status update
US11616811B2 (en) | 2016-06-30 | 2023-03-28 | Sophos Limited | Tracking usage of corporate credentials
US12309200B2 (en) | 2016-06-30 | 2025-05-20 | Sophos Limited | Detecting phishing attacks
WO2018004600A1 (en) * | 2016-06-30 | 2018-01-04 | Sophos Limited | Proactive network security using a health heartbeat
US10986124B2 (en) | 2016-06-30 | 2021-04-20 | Sophos Limited | Baiting endpoints for improved detection of authentication attacks
US12273382B2 (en) | 2016-06-30 | 2025-04-08 | Sophos Limited | Multi-factor authentication
GB2566657A (en) * | 2016-06-30 | 2019-03-20 | Sophos Ltd | Proactive network security using a health heartbeat
US11722521B2 (en) | 2016-06-30 | 2023-08-08 | Sophos Limited | Application firewall
US11258821B2 (en) | 2016-06-30 | 2022-02-22 | Sophos Limited | Application firewall
GB2566657B (en) * | 2016-06-30 | 2022-03-02 | Sophos Ltd | Proactive network security using a health heartbeat
US12244641B2 (en) | 2016-06-30 | 2025-03-04 | Sophos Limited | Application firewall
US11736522B2 (en) | 2016-06-30 | 2023-08-22 | Sophos Limited | Server-client authentication with integrated status update
US11316858B2 (en) | 2017-10-16 | 2022-04-26 | Juniper Networks, Inc. | Fast heartbeat liveness between packet processing engines using media access control security (MACsec) communication
US10637865B2 (en) * | 2017-10-16 | 2020-04-28 | Juniper Networks, Inc. | Fast heartbeat liveness between packet processing engines using media access control security (MACSEC) communication
US10972431B2 (en) | 2018-04-04 | 2021-04-06 | Sophos Limited | Device management based on groups of network adapters
US11616758B2 (en) | 2018-04-04 | 2023-03-28 | Sophos Limited | Network device for securing endpoints in a heterogeneous enterprise network
US11271950B2 (en) | 2018-04-04 | 2022-03-08 | Sophos Limited | Securing endpoints in a heterogenous enterprise network
US11140195B2 (en) | 2018-04-04 | 2021-10-05 | Sophos Limited | Secure endpoint in a heterogenous enterprise network
CN111901422A (en) * | 2020-07-28 | 2020-11-06 | 浪潮电子信息产业股份有限公司 | Method, system and device for managing nodes in cluster

Also Published As

Publication number | Publication date
US7203748B2 (en) | 2007-04-10

Similar Documents

Publication | Title
US7043550B2 (en) | Method for controlling group membership in a distributed multinode data processing system to assure mutually symmetric liveness status indications
US7203748B2 (en) | Method for detecting the quick restart of liveness daemons in a distributed multinode data processing system
US7120693B2 (en) | Method using two different programs to determine state of a network node to eliminate message response delays in system processing
US7370223B2 (en) | System and method for managing clusters containing multiple nodes
EP3543870B1 (en) | Exactly-once transaction semantics for fault tolerant fpga based transaction systems
US6574197B1 (en) | Network monitoring device
Pagani et al. | Providing reliable and fault tolerant broadcast delivery in mobile ad-hoc networks
EP1524600B1 (en) | A method for providing a reliable distributed failure notification
US6721907B2 (en) | System and method for monitoring the state and operability of components in distributed computing systems
US7069320B1 (en) | Reconfiguring a network by utilizing a predetermined length quiescent state
US7675869B1 (en) | Apparatus and method for master election and topology discovery in an Ethernet network
US20090147698A1 (en) | Network automatic discovery method and system
US20030005350A1 (en) | Failover management system
Dunagan et al. | Fuse: Lightweight guaranteed distributed failure notification
US20100082822A1 (en) | Technique for realizing high reliability in inter-application communication
CN117857658B (en) | A flow communication and dynamic switching method and device based on three-stack fusion
CN118487924A (en) | Event log management method, device, equipment and readable storage medium
US6792558B2 (en) | Backup system for operation system in communications system
JP2001202305A (en) | Reliability improving method for communication in nms system, and nms system
CN112953744A (en) | Network fault monitoring method, system, computer equipment and readable storage medium
US11956287B2 (en) | Method and system for automated switchover timers tuning on network systems or next generation emergency systems
KR100377864B1 (en) | System and method of communication for multiple server system
KR100233245B1 (en) | Redundancy Control Method in High Speed Wireless Calling System
JPH05292125A (en) | Bypass route changeover system and switch back system

Legal Events

Code | Title | Description
AS | Assignment | Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: HARE, JOHN R.; KNOP, FELIPE; LIN, TSENG-HUI; REEL/FRAME: 012627/0612; Effective date: 20020214
FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
REMI | Maintenance fee reminder mailed |
LAPS | Lapse for failure to pay maintenance fees |
STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362
FP | Lapsed due to failure to pay maintenance fee | Effective date: 20110410

