CN114358340B - Device online maintenance method, device, equipment and computer-readable storage medium

Device online maintenance method, device, equipment and computer-readable storage medium

Info

Publication number
CN114358340B
Authority
CN
China
Prior art keywords
equipment
maintenance
parallel system
type
faulty
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111683411.1A
Other languages
Chinese (zh)
Other versions
CN114358340A (en
Inventor
周冲
吴鸿达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inovance Control Technology Co Ltd
Original Assignee
Suzhou Inovance Control Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inovance Control Technology Co Ltd
Priority to CN202111683411.1A
Publication of CN114358340A
Application granted
Publication of CN114358340B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a device online maintenance method, apparatus, and equipment, and a computer-readable storage medium. When a device in a parallel operation system fails, the device type of the faulty device is first determined, and the fault-tolerance adjustment corresponding to that device type is then applied so that the system keeps running normally after the fault. Once repair or replacement is complete, the device to be put into service is reconnected to the system only after it is confirmed to have passed a preset overhaul process that includes a buffer detection process and/or an operation detection process. Normal operation of the system is therefore not disturbed at any point between the occurrence of the fault and the reconnection of the device after troubleshooting; that is, the system can remain powered on and in normal production throughout.

Description

Equipment on-line maintenance method, device, equipment and computer readable storage medium
Technical Field
The present invention relates to the technical field of electrical equipment control, and in particular to a device online maintenance method, a device online maintenance apparatus, and a computer-readable storage medium.
Background
In the field of electrical equipment control, electrical devices are often operated in parallel. In such applications, if one of the parallel devices fails and stops, and has to be repaired or replaced, the entire parallel operation system must be powered down completely until a worker has repaired or replaced the faulty device; only then is parallel operation of the whole system restored. Because normal production is impossible during troubleshooting, maintenance costs are high.
Disclosure of Invention
The main object of the present invention is to provide a device online maintenance method, apparatus, and equipment, and a computer-readable storage medium, so as to solve the technical problem that existing approaches to troubleshooting parallel operation systems incur high maintenance costs.
To achieve the above object, the present invention provides a device online maintenance method applied to a parallel operation system, the method including:
when a device fault is detected in the parallel operation system, determining the device type of the faulty device;
according to the device type, performing the corresponding fault-tolerance adjustment on the parallel operation system to maintain operation of the parallel operation system, and removing the faulty device from the parallel operation system;
and, if the repaired device to be put into service is determined to have passed a preset overhaul process, connecting the device to be put into service to the parallel operation system, wherein the overhaul process includes a buffer detection process and/or an operation detection process.
Optionally, the device type includes a master type and a slave type, and the step of performing the corresponding fault-tolerance adjustment on the parallel operation system according to the device type to maintain operation of the parallel operation system, and removing the faulty device from the parallel operation system, includes:
if the device type of the faulty device is the master type, changing the device type of the faulty device to the slave type, and selecting a new master from the slaves of the parallel operation system other than the faulty device;
and maintaining operation of the parallel operation system based on the new master, and removing the faulty device from the parallel operation system.
Optionally, the step of selecting a new master from the slaves of the parallel operation system other than the faulty device includes:
acquiring state information and number information of the slaves, and selecting the new master according to the state information and the number information.
Optionally, the step of connecting the device to be put into service to the parallel operation system, once maintenance of the faulty device is complete and the repaired device to be put into service is determined to have passed the preset overhaul process, includes:
upon receiving a device-addition access instruction, determining the device to be put into service, namely the repaired or replacement device;
and executing the overhaul process on the device to be put into service, and connecting it to the parallel operation system after the overhaul is passed, wherein the DC power switch of the device to be put into service is not switched on during the overhaul.
Optionally, the step of executing the overhaul process on the device to be put into service and connecting it to the parallel operation system after the overhaul is passed includes:
judging whether the buffer function of the device to be put into service is normal;
if so, judging whether the device to be put into service can operate normally;
if so, judging that the device to be put into service has passed the overhaul;
and switching on the DC power switch of the device to be put into service so as to connect it to the parallel operation system.
Optionally, the step of judging whether the buffer function of the device to be put into service is normal includes:
determining that both the AC power switch and the DC power switch of the device to be put into service are switched off;
switching on the AC power switch, and judging whether the actual DC-side voltage value of the device to be put into service is consistent with a standard DC-side voltage value;
if so, judging that the buffer function of the device to be put into service is normal;
if not, judging that the buffer function of the device to be put into service is abnormal.
Optionally, the device type includes a master type and a slave type, and the step of performing the corresponding fault-tolerance adjustment on the parallel operation system according to the device type to maintain operation of the parallel operation system, and removing the faulty device from the parallel operation system, includes:
if the device type of the faulty device is the slave type, notifying the master in the parallel operation system to redistribute tasks among the slaves other than the faulty device so as to maintain operation of the parallel operation system, and removing the faulty device from the parallel operation system.
In addition, to achieve the above object, the present invention also provides a device online maintenance apparatus provided in a parallel operation system, the apparatus including:
a faulty device determining module, configured to determine the device type of the faulty device when a device fault is detected in the parallel operation system;
a system fault-tolerance adjustment module, configured to perform the corresponding fault-tolerance adjustment on the parallel operation system according to the device type so as to maintain operation of the parallel operation system, and to remove the faulty device from the parallel operation system;
and a device re-access module, configured to, once maintenance of the faulty device is complete, connect the device to be put into service to the parallel operation system if the repaired device to be put into service is determined to have passed a preset overhaul process, wherein the overhaul process includes a buffer detection process and/or an operation detection process.
In addition, in order to achieve the purpose, the invention also provides equipment parallel operation maintenance equipment, which comprises a memory, a processor and equipment on-line maintenance programs which are stored in the memory and can run on the processor, wherein the steps of the equipment on-line maintenance method are realized when the equipment on-line maintenance programs are executed by the processor.
In addition, in order to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a device online maintenance program which, when executed by a processor, implements the steps of the device online maintenance method as described above.
Furthermore, to achieve the above object, the present invention provides a computer program product comprising a computer program which, when being executed by a processor, implements the steps of the device online maintenance method as described above.
In the present invention, when a device in the parallel operation system fails, the device type of the faulty device is first determined, and the fault-tolerance adjustment corresponding to that device type is then applied so that the system keeps running normally after the fault. Once repair or replacement is complete, the device to be put into service is reconnected to the system only after it is confirmed to have passed a preset overhaul process that includes a buffer detection process and/or an operation detection process. Normal operation of the system is therefore not disturbed at any point between the occurrence of the fault and the reconnection of the device after troubleshooting; that is, the system can remain in normal production throughout, which solves the technical problem that existing approaches to troubleshooting parallel operation systems incur high maintenance costs.
Drawings
FIG. 1 is a schematic diagram of a device architecture of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a first embodiment of the device online maintenance method of the present invention;
FIG. 3 is a flow chart of an embodiment of the device online maintenance method of the present invention;
FIG. 4 is a schematic diagram of the functional modules of the device online maintenance apparatus of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In the field of electrical equipment control, electrical devices are often operated in parallel. In such applications, if one of the parallel devices fails and stops, and has to be repaired or replaced, the entire parallel operation system must be powered down completely until a worker has repaired or replaced the faulty device; only then is parallel operation of the whole system restored. Because normal production is impossible during troubleshooting, maintenance costs are high.
To solve this problem, the present invention provides a device online maintenance method. When a device in the parallel operation system fails, the device type of the faulty device is first determined, and the fault-tolerance adjustment corresponding to that device type is then applied so that the system keeps running normally after the fault. The faulty device is removed from the system promptly so that it can be repaired or replaced without affecting normal operation of the system. Once repair or replacement is complete, the device to be put into service is reconnected to the system only after it is confirmed to have passed a preset overhaul process that includes a buffer detection process and/or an operation detection process. Normal operation of the system is therefore not disturbed at any point between the occurrence of the fault and the reconnection of the device after troubleshooting; that is, the system can remain in normal production throughout, which solves the technical problem that existing approaches to troubleshooting parallel operation systems incur high maintenance costs.
Referring to fig. 1, fig. 1 is a schematic device structure of a hardware running environment according to an embodiment of the present invention.
As shown in fig. 1, the device online maintenance equipment may include a processor 1001 (such as a CPU), a user interface 1003, a network interface 1004, a memory 1005, and a communication bus 1002. The communication bus 1002 enables communication among these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory, such as a disk memory. The memory 1005 may optionally also be a storage device separate from the processor 1001.
It will be appreciated by those skilled in the art that the device structure shown in fig. 1 does not limit the device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in fig. 1, the memory 1005 (provided in a parallel operation system), as a kind of computer storage medium, may include an operating system, a network communication module, a user interface module, and a device online maintenance program.
In the device shown in fig. 1, the network interface 1004 is mainly used for connecting to a background server and performing data communication with the background server, the user interface 1003 is mainly used for connecting to a client (user side) and performing data communication with the client, and the processor 1001 may be used for calling a device online maintenance program stored in the memory 1005 and performing the following operations:
when a device fault is detected in the parallel operation system, determining the device type of the faulty device;
according to the device type, performing the corresponding fault-tolerance adjustment on the parallel operation system to maintain operation of the parallel operation system, and removing the faulty device from the parallel operation system;
and, if the repaired device to be put into service is determined to have passed a preset overhaul process, connecting the device to be put into service to the parallel operation system, wherein the overhaul process includes a buffer detection process and/or an operation detection process.
Further, the device type includes a master type and a slave type, and the step of performing the corresponding fault-tolerance adjustment on the parallel operation system according to the device type to maintain operation of the parallel operation system, and removing the faulty device from the parallel operation system, includes:
if the device type of the faulty device is the master type, changing the device type of the faulty device to the slave type, and selecting a new master from the slaves of the parallel operation system other than the faulty device;
and maintaining operation of the parallel operation system based on the new master, and removing the faulty device from the parallel operation system.
Further, the step of selecting a new master from the slaves of the parallel operation system other than the faulty device includes:
acquiring state information and number information of the slaves, and selecting the new master according to the state information and the number information.
Further, the step of connecting the device to be put into service to the parallel operation system, once maintenance of the faulty device is complete and the repaired device to be put into service is determined to have passed the preset overhaul process, includes:
upon receiving a device-addition access instruction, determining the device to be put into service, namely the repaired or replacement device;
and executing the overhaul process on the device to be put into service, and connecting it to the parallel operation system after the overhaul is passed, wherein the DC power switch of the device to be put into service is not switched on during the overhaul.
Further, the step of executing the overhaul process on the device to be put into service and connecting it to the parallel operation system after the overhaul is passed includes:
judging whether the buffer function of the device to be put into service is normal;
if so, judging whether the device to be put into service can operate normally;
if so, judging that the device to be put into service has passed the overhaul;
and switching on the DC power switch of the device to be put into service so as to connect it to the parallel operation system.
Further, the step of judging whether the buffer function of the device to be put into service is normal includes:
determining that both the AC power switch and the DC power switch of the device to be put into service are switched off;
switching on the AC power switch, and judging whether the actual DC-side voltage value of the device to be put into service is consistent with a standard DC-side voltage value;
if so, judging that the buffer function of the device to be put into service is normal;
if not, judging that the buffer function of the device to be put into service is abnormal.
Further, the device type includes a master type and a slave type, and the step of performing the corresponding fault-tolerance adjustment on the parallel operation system according to the device type to maintain operation of the parallel operation system, and removing the faulty device from the parallel operation system, includes:
if the device type of the faulty device is the slave type, notifying the master in the parallel operation system to redistribute tasks among the slaves other than the faulty device so as to maintain operation of the parallel operation system, and removing the faulty device from the parallel operation system.
Based on the above hardware structure, embodiments of the device online maintenance method of the present invention are proposed.
Referring to fig. 2, fig. 2 is a schematic flow chart of a first embodiment of the device online maintenance method of the present invention. The device online maintenance method is applied to a parallel operation system and includes the following steps:
Step S10, when a device fault is detected in the parallel operation system, determining the device type of the faulty device;
In this embodiment, the parallel operation system refers to a system formed by a plurality of electrical devices (including, but not limited to, frequency converters, rectifiers, and energy storage converters) operating in parallel. A parallel operation system generally includes one master and a plurality of slaves; the master sends control instructions to each slave, and each slave operates according to the control instructions sent by the master. It should be noted that, in practice, one or more slaves may fail, the master may fail, or the master and slaves may fail together, so the device type of a faulty device may be the master type and/or the slave type.
Specifically, each device in the parallel operation system monitors its own running state and also sends its own state information to the other devices in the system (for example, as heartbeat messages). When a device in the system stops running because of a fault, the system can therefore promptly identify the faulty device and determine its device type.
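The heartbeat-based detection described above can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: the names Device, DeviceType, and find_faulty_devices, and the 1-second timeout, are hypothetical assumptions introduced for the example.

```python
from dataclasses import dataclass
from enum import Enum


class DeviceType(Enum):
    MASTER = "master"
    SLAVE = "slave"


@dataclass
class Device:
    number: int              # unique device number
    device_type: DeviceType
    last_heartbeat: float    # time the last heartbeat was received, in seconds


def find_faulty_devices(devices, now, timeout=1.0):
    """Return (number, type) for every device whose heartbeat is overdue.

    The timeout value is an assumption; the source does not specify one.
    """
    return [
        (d.number, d.device_type)
        for d in devices
        if now - d.last_heartbeat > timeout
    ]


devices = [
    Device(1, DeviceType.MASTER, last_heartbeat=9.8),
    Device(2, DeviceType.SLAVE, last_heartbeat=8.2),
    Device(3, DeviceType.SLAVE, last_heartbeat=9.9),
]
faulty = find_faulty_devices(devices, now=10.0)
```

Here only device 2 has been silent for longer than the timeout, so it is reported as a faulty device of the slave type.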
Step S20, according to the device type, performing the corresponding fault-tolerance adjustment on the parallel operation system to maintain operation of the parallel operation system, and removing the faulty device from the parallel operation system;
and step S30, once maintenance of the faulty device is complete, if the repaired device to be put into service is determined to have passed a preset overhaul process, connecting the device to be put into service to the parallel operation system, wherein the overhaul process includes a buffer detection process and/or an operation detection process.
In this embodiment, the fault-tolerance adjustment policy for the master type differs from that for the slave type. When only slave-type devices fail, or only the master-type device fails, the system applies the policy corresponding to the slave type or the master type, respectively; when both the master and slaves fail, the system can apply both policies at the same time. The overhaul process includes at least a buffer detection process and/or an operation detection process, and other overhaul steps can be added as required.
Specifically, for the master type, the fault-tolerance adjustment policy may be to select one of the slaves in the system as the new master, which takes over from the failed master and continues sending instructions, while (or after) the failed master is removed from the system. For the slave type, the failed slave may be removed from the system directly, since its removal does not affect the system; alternatively, the master may be notified so that it reassigns the tasks to be executed to the other normally operating slaves while (or after) the failed slave is removed. The tasks originally executed by the failed slave are thus handed to other slaves in time, which avoids delaying task execution.
After the faulty device is removed, it can be repaired or replaced. After repair or replacement, the repaired device or the replacement device serves as the device to be put into service and is prepared for connection to the original parallel operation system. It should be noted that, for the sake of system stability and safety, the device to be put into service generally has to be overhauled before it is connected, to confirm that all of its functions perform normally.
This embodiment provides a device online maintenance method. When a device in the parallel operation system fails, the device type of the faulty device is first determined, and the fault-tolerance adjustment corresponding to that device type is then applied so that the system keeps running normally after the fault. The faulty device is removed from the system promptly so that it can be repaired or replaced without affecting normal operation of the system. Once repair or replacement is complete, the device to be put into service is reconnected to the system only after it is confirmed to have passed a preset overhaul process that includes a buffer detection process and/or an operation detection process. Normal operation of the system is therefore not disturbed at any point between the occurrence of the fault and the reconnection of the device after troubleshooting; that is, the system can remain in normal production throughout, which solves the technical problem that existing approaches to troubleshooting parallel operation systems incur high maintenance costs.
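The choice of policy by device type can be pictured as a small dispatch routine. The sketch below is an illustration under stated assumptions: the function plan_adjustment and the action names are hypothetical, and handling a failed master before failed slaves is a design choice made for the example, not something the source prescribes.

```python
def plan_adjustment(faulty):
    """Map faulty devices to an ordered list of (action, device_number).

    `faulty` maps a device number to "master" or "slave". Masters are
    handled first (an assumption) so that a new master exists before any
    task redistribution is requested.
    """
    order = sorted(faulty.items(), key=lambda item: (item[1] != "master", item[0]))
    actions = []
    for number, device_type in order:
        if device_type == "master":
            actions.append(("elect_new_master", number))
        elif device_type == "slave":
            actions.append(("redistribute_tasks", number))
        else:
            raise ValueError(f"unknown device type: {device_type!r}")
        # Either way, the faulty device is then removed from the system.
        actions.append(("remove_from_system", number))
    return actions


plan = plan_adjustment({3: "slave", 1: "master"})
```

When the master and a slave fail together, both policies appear in the resulting plan, matching the case in which the two adjustment policies are applied at the same time.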
Further, based on the first embodiment shown in fig. 2, a second embodiment of the device online maintenance method of the present invention is proposed. In this embodiment, the device type includes a master type and a slave type, and step S20 includes:
Step S21, if the device type of the faulty device is the master type, changing the device type of the faulty device to the slave type, and selecting a new master from the slaves of the parallel operation system other than the faulty device;
and step S22, maintaining operation of the parallel operation system based on the new master, and removing the faulty device from the parallel operation system.
In this embodiment, when the system detects that the master has failed and cannot operate normally, the failed master is first changed to a slave, a new master is selected from the slaves in the system, and the failed master, now demoted to a slave, is then removed from the system.
The selection policy for the new master can be set flexibly according to actual conditions. For example, any one of the normally operating slaves in the system may be chosen as the new master, or the normally operating slave that best meets a given parameter criterion may be chosen as the new master.
In this embodiment, when the master fails, a new master is quickly elected from among the slaves currently running, which ensures that the other slaves continue to receive correct control instructions and that the system runs stably.
Further, the step of selecting a new master from the slaves of the parallel operation system other than the faulty device in step S21 includes:
Step S211, acquiring state information and number information of the slaves, and selecting the new master according to the state information and the number information.
In this embodiment, the state information reflects the operating state of a device and can be classified as a normal operating state or an abnormal state. The number information is the device number that uniquely corresponds to each device.
Specifically, to select a new master, the system obtains the state information of each slave, filters out the slaves whose operating state is normal, and then selects the slave with the largest (or smallest) device number among them as the new master.
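The selection rule above (filter by state, then pick by device number) can be sketched as follows. The function name select_new_master, the tuple layout, and the state strings are hypothetical choices for this example.

```python
def select_new_master(slaves, prefer_largest=False):
    """Pick the new master from (number, state) pairs.

    Only slaves whose state is "normal" are candidates. Among them the
    smallest device number wins by default; pass prefer_largest=True to
    pick the largest instead. Returns None if no slave is healthy.
    """
    healthy = [number for number, state in slaves if state == "normal"]
    if not healthy:
        return None
    return max(healthy) if prefer_largest else min(healthy)


slaves = [(2, "abnormal"), (3, "normal"), (4, "normal")]
new_master = select_new_master(slaves)
```

With these inputs, slave 2 is excluded because its state is abnormal, so slave 3 is chosen under the smallest-number rule and slave 4 under the largest-number rule. Because every device applies the same deterministic rule to the same information, all healthy slaves would arrive at the same choice.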
Further, step S30 includes:
Step S31, upon receiving a device-addition access instruction, determining the device to be put into service, namely the repaired or replacement device;
and step S32, executing the overhaul process on the device to be put into service, and connecting it to the parallel operation system after the overhaul is passed, wherein the DC power switch of the device to be put into service is not switched on during the overhaul.
In this embodiment, the device-addition access instruction may be issued to the system by maintenance personnel once they have finished repairing the faulty device. After receiving the instruction, the system identifies the device to be put into service that is to be connected and overhauls it according to the preset overhaul process, in which each function of the device can be checked. The device to be put into service can be connected to the system only after every function is confirmed to be normal; the connection operation includes switching on the DC power switch of the device to be put into service.
Until the overhaul is finished, the DC power switch of the device to be put into service remains switched off, so that the running state of that device cannot affect the system and the system keeps running stably.
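One way to picture this reconnection flow is as a small state machine in which the DC power switch is switched on only after the overhaul passes. The sketch below is illustrative only; the State and PendingDevice names, and the callable used to stand in for the overhaul, are assumptions made for the example.

```python
from enum import Enum, auto


class State(Enum):
    REMOVED = auto()     # taken out of the system for repair or replacement
    OVERHAUL = auto()    # access instruction received, checks running
    CONNECTED = auto()   # overhaul passed, DC switch on
    REJECTED = auto()    # overhaul failed, DC switch still off


class PendingDevice:
    def __init__(self, number):
        self.number = number
        self.state = State.REMOVED
        self.dc_switch_on = False  # stays off until the overhaul passes

    def on_access_instruction(self, overhaul):
        """Run `overhaul(device) -> bool`, then connect only on success."""
        self.state = State.OVERHAUL
        passed = overhaul(self)
        if passed:
            self.dc_switch_on = True  # the only place the DC switch is switched on
            self.state = State.CONNECTED
        else:
            self.state = State.REJECTED
        return self.state


def always_pass(device):
    # Stand-in for the real checks; the DC switch is still off at this point.
    assert device.dc_switch_on is False
    return True


device = PendingDevice(number=5)
result = device.on_access_instruction(always_pass)
```

Keeping the switch-on in a single place makes it easy to see that a device which fails the overhaul never reaches the shared DC side.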
Further, based on the above second embodiment, a third embodiment of the device online maintenance method of the present invention is proposed. In this embodiment, step S32 includes:
step S321, judging whether the buffer function of the device to be put into service is normal;
step S322, if so, judging whether the device to be put into service can operate normally;
step S323, if so, judging that the device to be put into service has passed the overhaul;
and step S324, switching on the DC power switch of the device to be put into service so as to connect it to the parallel operation system.
In this embodiment, the buffer function is verified mainly by changing the on/off state of the AC power switch of the device to be put into service. If the buffer function is found to be abnormal, the device to be put into service has a functional fault and cannot be connected to the system; further investigation is needed, and the system may output an abnormality prompt on a display interface to prompt maintenance personnel to troubleshoot. If the buffer function is normal, the system goes on to judge whether the device to be put into service can operate normally. If it can, the device is judged to have passed the overhaul and can be connected to the system. If it cannot (which can be judged from operating error information), the device has not passed the overhaul process and cannot be connected to the system.
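The order of the two checks in steps S321 to S323 can be sketched as below. This is a hedged illustration: run_overhaul and the two callables are hypothetical stand-ins for the actual buffer and operation checks, and the message strings are invented for the example.

```python
def run_overhaul(buffer_ok, runs_normally):
    """Return (passed, message) for the two-stage overhaul.

    `buffer_ok` and `runs_normally` are zero-argument callables returning
    bool. The operation check is attempted only if the buffer check passes.
    """
    if not buffer_ok():
        return (False, "buffer function abnormal")
    if not runs_normally():
        return (False, "operation abnormal")
    return (True, "overhaul passed")


outcome = run_overhaul(buffer_ok=lambda: True, runs_normally=lambda: True)
```

Only when the result is a pass would step S324 (switching on the DC power switch) follow; in either failure case the message could be shown on a display interface to prompt maintenance personnel.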
Further, step S321 includes:
Step S3211, determining that both the AC power switch and the DC power switch of the equipment to be put into operation are open;
Step S3212, closing the AC power switch, and judging whether the actual DC-side voltage value of the equipment to be put into operation is consistent with a standard DC-side voltage value;
Step S3213, if so, determining that the buffer function of the equipment to be put into operation is normal;
Step S3214, if not, determining that the buffer function of the equipment to be put into operation is abnormal.
In this embodiment, the system first ensures that the AC power switch and the DC power switch of the equipment to be put into operation are both open (that is, the equipment is not energized). It then closes the AC power switch (energizing the equipment) and judges whether the actual DC-side voltage value of the equipment is consistent with the standard DC-side voltage value, which tests the buffer function. If the values are consistent, the system determines that the buffer function is normal and may output a "function normal" prompt on the display interface. If they are not, the system determines that the buffer function is abnormal and may likewise output a "function abnormal" prompt.
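A minimal sketch of steps S3211 to S3214 is given below. The source only says the measured value must be "consistent with" the standard value, so the relative tolerance `tol`, the simulated `Device` class, and the 540 V figure are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Simulated equipment to be put into operation (hypothetical model)."""
    ac_closed: bool = False
    dc_closed: bool = False
    dc_side_voltage: float = 0.0     # measured DC-side voltage, in volts
    precharge_result: float = 540.0  # assumed voltage reached after pre-charge

    def close_ac(self) -> None:
        self.ac_closed = True
        # Simulation only: closing the AC switch charges the DC side.
        self.dc_side_voltage = self.precharge_result


def buffer_function_normal(dev: Device, standard_v: float,
                           tol: float = 0.05) -> bool:
    # S3211: both switches must be open before the test starts.
    if dev.ac_closed or dev.dc_closed:
        raise RuntimeError("AC and DC switches must both be open first")
    # S3212: close the AC switch, then compare the DC-side voltage.
    dev.close_ac()
    # S3213 / S3214: "consistent" is modelled as within a relative tolerance.
    return abs(dev.dc_side_voltage - standard_v) <= tol * standard_v
```

For example, `buffer_function_normal(Device(), 540.0)` reports a normal buffer function, while a device whose DC side only reaches 300 V is reported as abnormal.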
Further, the device type includes a master type and a slave type, and step S20 includes:
Step A, if the device type of the faulty device is the slave type, notifying the master in the parallel operation system to redistribute tasks to the slaves other than the faulty device so as to maintain operation of the parallel operation system, and removing the faulty device from the parallel operation system.
In this embodiment, if a slave in the system fails, the system notifies the master of the failure. Once notified, the master can redistribute the tasks to be executed next among the other slaves: the tasks originally assigned to the failed slave are transferred to slaves that are operating normally, so that all tasks continue to make progress. The failed slave is then removed from the system and maintained separately.
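The redistribution described above can be illustrated with a short sketch. The source does not say how the master chooses target slaves, so the round-robin policy, the function name `redistribute`, and the task and slave identifiers are assumptions.

```python
from itertools import cycle


def redistribute(assignments: dict[str, list[str]],
                 failed: str) -> dict[str, list[str]]:
    """Move the failed slave's tasks onto the remaining slaves.

    Round-robin over slave names in sorted order is an illustrative
    policy only; the source does not specify one.
    """
    remaining = {s: list(t) for s, t in assignments.items() if s != failed}
    if not remaining:
        raise ValueError("no healthy slave left to take over the tasks")
    targets = cycle(sorted(remaining))
    for task in assignments.get(failed, []):
        remaining[next(targets)].append(task)
    # The failed slave no longer appears: it has been removed from the system.
    return remaining
```

With `{"s1": ["a"], "s2": ["b"], "s3": ["c", "d"]}` and `failed="s3"`, task `c` goes to `s1` and task `d` goes to `s2`, and `s3` is absent from the result.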
A specific example is shown in Fig. 3.
When the system enters the overhaul operation mode for the equipment to be put into operation, it first opens the DC power switch and the AC power switch of that equipment. After confirming that both are open, it closes the AC power switch. After confirming that the AC power switch is closed, it judges whether the DC-side voltage value of the equipment is consistent with the standard value, in order to determine whether pre-charging has completed (that is, it tests the buffer function). If the DC-side voltage value is inconsistent with the standard value, the human-machine interface (HMI) reports that the overhaul failed and the AC power switch is opened. If it is consistent, pre-charging is judged complete and the system goes on to judge whether the equipment can operate normally. If it cannot, the HMI reports that the overhaul failed; if it can, the HMI reports that the overhaul succeeded. After a successful overhaul, the system first opens the AC power switch of the equipment to exit the overhaul operation mode. Once the switch is confirmed open, the system enters the normal operation mode, in which the DC power switch and the AC power switch of the equipment are closed to connect it formally to the system, after which the equipment operates normally.
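The Fig. 3 sequence can be traced with a small sketch that returns the ordered actions for each outcome. The function name and the action strings are hypothetical labels chosen for illustration; only the ordering is taken from the description above.

```python
def maintenance_sequence(precharge_ok: bool, run_ok: bool) -> list[str]:
    """Return the ordered actions of the overhaul mode for one device."""
    steps = ["open DC switch", "open AC switch", "close AC switch"]
    if not precharge_ok:
        # DC-side voltage inconsistent with the standard value.
        return steps + ["HMI: overhaul failed", "open AC switch"]
    if not run_ok:
        # Pre-charge completed but the device cannot operate normally.
        return steps + ["HMI: overhaul failed"]
    return steps + [
        "HMI: overhaul succeeded",
        "open AC switch",     # exit the overhaul operation mode
        "enter normal mode",
        "close DC switch",    # formal connection to the system
        "close AC switch",
    ]
```

In every branch the DC switch is closed only in the final, successful path, after the normal operation mode has been entered.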
As shown in Fig. 4, the present invention further provides a device online maintenance apparatus, which is provided in a parallel operation system and includes:
a faulty device determining module 10, configured to determine the device type of a faulty device when a device fault is detected in the parallel operation system;
a system fault tolerance adjustment module 20, configured to perform a corresponding fault tolerance adjustment on the parallel operation system according to the device type so as to maintain operation of the parallel operation system, and to remove the faulty device from the parallel operation system;
a device re-access module 30, configured to, once maintenance of the faulty device is completed and the maintained equipment to be put into operation is determined to have passed a preset overhaul flow, connect that equipment to the parallel operation system, where the overhaul flow includes a buffer detection flow and/or an operation detection flow.
Optionally, the device type includes a master type and a slave type, and the system fault tolerance adjustment module 20 includes:
a host fault tolerance unit, configured to, if the device type of the faulty device is the master type, change the device type of the faulty device to the slave type and select a new master from the slaves of the parallel operation system other than the faulty device;
a first device removing unit, configured to maintain operation of the parallel operation system based on the new master and to remove the faulty device from the parallel operation system.
Optionally, the host reselection unit is further configured to:
acquire state information and number information of the slaves, and select the new master according to the state information and the number information.
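One way to read "according to the state information and the number information" is sketched below. The source does not state the selection rule, so choosing the normally operating slave with the smallest number is purely an assumption, as are the `Slave` fields and the function name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Slave:
    number: int    # number information (for example, a station number)
    healthy: bool  # state information, reduced here to a single flag


def select_new_master(slaves: list[Slave]) -> Optional[Slave]:
    """Pick the healthy slave with the smallest number, or None if none."""
    candidates = [s for s in slaves if s.healthy]
    return min(candidates, key=lambda s: s.number, default=None)
```

A deterministic rule of this kind has the advantage that every unit evaluating the same state and number information arrives at the same new master.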
Optionally, the device re-access module 30 includes:
a new instruction receiving unit, configured to determine the repaired or replaced equipment to be put into operation once a device addition access instruction is received;
an equipment overhaul unit, configured to execute the overhaul flow on the equipment to be put into operation and to connect the equipment to the parallel operation system after the overhaul is passed, where the DC power switch of the equipment to be put into operation is not closed during the overhaul.
Optionally, the equipment overhaul unit is further configured to:
judge whether the buffer function of the equipment to be put into operation is normal;
if so, judge whether the equipment to be put into operation can operate normally;
if so, determine that the equipment to be put into operation has passed the overhaul;
close the DC power switch of the equipment to be put into operation so as to connect the equipment to the parallel operation system.
Optionally, the step of judging whether the buffer function of the equipment to be put into operation is normal includes:
determining that both the AC power switch and the DC power switch of the equipment to be put into operation are open;
closing the AC power switch, and judging whether the actual DC-side voltage value of the equipment to be put into operation is consistent with a standard DC-side voltage value;
if so, determining that the buffer function of the equipment to be put into operation is normal;
if not, determining that the buffer function of the equipment to be put into operation is abnormal.
Optionally, the device type includes a master type and a slave type, and the system fault tolerance adjustment module 20 includes:
a slave fault tolerance unit, configured to, if the device type of the faulty device is the slave type, notify the master in the parallel operation system to redistribute tasks to the slaves other than the faulty device so as to maintain operation of the parallel operation system, and to remove the faulty device from the parallel operation system.
The invention also provides a parallel-operation overhaul apparatus for equipment.
The parallel-operation overhaul apparatus includes a processor, a memory, and a device online maintenance program that is stored in the memory and can run on the processor; when the device online maintenance program is executed by the processor, the steps of the device online maintenance method described above are implemented.
For the method implemented when the device online maintenance program is executed, refer to the embodiments of the device online maintenance method of the present invention; details are not repeated here.
The invention also provides a computer readable storage medium.
The computer readable storage medium of the present invention stores a device online maintenance program which, when executed by a processor, implements the steps of the device online maintenance method as described above.
The method implemented when the device online maintenance program is executed may refer to various embodiments of the device online maintenance method of the present invention, which will not be described herein.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of a method for online maintenance of a device as described above.
The method implemented when the computer program is executed may refer to various embodiments of the device online maintenance method of the present invention, which are not described herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (8)

Translated from Chinese
1. A method for online equipment maintenance, characterized in that the method for online equipment maintenance is applied to a parallel system, and the method comprises:
when a device fault is detected in the parallel system, determining a device type of the faulty device;
according to the device type, performing corresponding fault tolerance adjustment on the parallel system to maintain the operation of the parallel system, and removing the faulty device from the parallel system;
until the maintenance of the faulty device is completed, if it is determined that the maintained equipment to be put into use passes a preset maintenance process, connecting the equipment to be put into use to the parallel system, wherein the maintenance process includes a buffer detection process and/or an operation detection process;
wherein the step of connecting the equipment to be put into use to the parallel system, until the maintenance of the faulty device is completed and if it is determined that the maintained equipment to be put into use passes the preset maintenance process, comprises:
until a device addition access instruction is received, determining the equipment to be put into use after repair or replacement;
performing the maintenance process on the equipment to be put into use, and connecting the equipment to be put into use to the parallel system after the maintenance is passed, wherein the DC power switch of the equipment to be put into use is not closed during the maintenance process; judging whether the buffer function of the equipment to be put into use is normal; if so, judging whether the equipment to be put into use can operate normally; if so, determining that the equipment to be put into use has passed the maintenance; and closing the DC power switch of the equipment to be put into use to connect the equipment to be put into use to the parallel system.

2. The device online maintenance method according to claim 1, characterized in that the device type includes a master type and a slave type, and the step of performing corresponding fault tolerance adjustment on the parallel system according to the device type to maintain the operation of the parallel system and removing the faulty device from the parallel system comprises:
if the device type of the faulty device is the master type, changing the device type of the faulty device to the slave type, and selecting a new master from a number of slaves in the parallel system other than the faulty device;
maintaining the operation of the parallel system based on the new master, and removing the faulty device from the parallel system.

3. The device online maintenance method according to claim 2, characterized in that the step of selecting a new master from a number of slaves in the parallel system other than the faulty device comprises:
obtaining status information and number information of the slaves, and selecting the new master according to the status information and the number information.

4. The device online maintenance method according to claim 1, characterized in that the step of judging whether the buffer function of the equipment to be put into use is normal comprises:
determining that the AC power switch and the DC power switch of the equipment to be put into use are both open;
closing the AC power switch, and judging whether the actual DC-side voltage value of the equipment to be put into use is consistent with a standard DC-side voltage value;
if so, determining that the buffer function of the equipment to be put into use is normal;
if not, determining that the buffer function of the equipment to be put into use is abnormal.

5. The device online maintenance method according to claim 4, characterized in that the device type includes a master type and a slave type, and the step of performing corresponding fault tolerance adjustment on the parallel system according to the device type to maintain the operation of the parallel system and removing the faulty device from the parallel system comprises:
if the device type of the faulty device is the slave type, notifying the master in the parallel system to reallocate tasks to the slaves other than the faulty device, so as to maintain the operation of the parallel system, and removing the faulty device from the parallel system.

6. An equipment online maintenance device, characterized in that the device is arranged in a parallel system, and the device comprises:
a faulty device determining module, used to determine the device type of the faulty device when a device fault is detected in the parallel system;
a system fault tolerance adjustment module, used to perform corresponding fault tolerance adjustment on the parallel system according to the device type to maintain the operation of the parallel system, and to remove the faulty device from the parallel system;
a device reconnection module, used to, until the maintenance of the faulty device is completed, connect the equipment to be put into use to the parallel system if it is determined that the maintained equipment to be put into use passes a preset maintenance process, wherein the maintenance process includes a buffer detection process and/or an operation detection process;
the device reconnection module being further used to: until a device addition access instruction is received, determine the equipment to be put into use after repair or replacement; perform the maintenance process on the equipment to be put into use, and connect the equipment to be put into use to the parallel system after the maintenance is passed, wherein the DC power switch of the equipment to be put into use is not closed during the maintenance process; judge whether the buffer function of the equipment to be put into use is normal; if so, judge whether the equipment to be put into use can operate normally; if so, determine that the equipment to be put into use has passed the maintenance; and close the DC power switch of the equipment to be put into use to connect the equipment to be put into use to the parallel system.

7. A device for parallel maintenance, characterized in that the device for parallel maintenance comprises: a memory, a processor, and a device online maintenance program stored in the memory and executable on the processor, wherein the device online maintenance program, when executed by the processor, implements the steps of the device online maintenance method according to any one of claims 1 to 5.

8. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a computer program, and when the computer program is executed by a processor, the steps of the device online maintenance method according to any one of claims 1 to 5 are implemented.
CN202111683411.1A | 2021-12-31 | 2021-12-31 | Device online maintenance method, device, equipment and computer-readable storage medium | Active | CN114358340B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111683411.1A, CN114358340B (en) | 2021-12-31 | 2021-12-31 | Device online maintenance method, device, equipment and computer-readable storage medium

Publications (2)

Publication Number | Publication Date
CN114358340A (en) | 2022-04-15
CN114358340B (en) | 2025-07-15

Family

ID=81104892

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status
CN202111683411.1A, CN114358340B (en) | Device online maintenance method, device, equipment and computer-readable storage medium | 2021-12-31 | 2021-12-31 | Active

Country Status (1)

Country | Link
CN (1) | CN114358340B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
EP0433979A2 (en) * | 1989-12-22 | 1991-06-26 | Tandem Computers Incorporated | Fault-tolerant computer system with /config filesystem
US5295258A (en) * | 1989-12-22 | 1994-03-15 | Tandem Computers Incorporated | Fault-tolerant computer system with online recovery and reintegration of redundant components
TW200419884A (en) * | 2003-03-17 | 2004-10-01 | Phoenixtec Ind Co Ltd | Control method of UPS module connected in parallel and system thereof
CN206313520U (en) * | 2016-12-09 | 2017-07-07 | 中铁二十二局集团电气化工程有限公司 | Uninterruptible power system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US4967347A (en) * | 1986-04-03 | 1990-10-30 | Bh-F (Triplex) Inc. | Multiple-redundant fault detection system and related method for its use
US5317752A (en) * | 1989-12-22 | 1994-05-31 | Tandem Computers Incorporated | Fault-tolerant computer system with auto-restart after power-fall
US8051327B2 (en) * | 2008-10-28 | 2011-11-01 | Microsoft Corporation | Connection between machines and power source
CN102624075B (en) * | 2012-04-10 | 2014-01-22 | 河北实华科技有限公司 | Multimachine parallel connection method and scheme of modular UPS (uninterrupted power supply) system
CN105790980B (en) * | 2014-12-22 | 2020-01-31 | 中兴通讯股份有限公司 | A kind of fault repair method and device

Also Published As

Publication number | Publication date
CN114358340A (en) | 2022-04-15

Similar Documents

Publication | Title
US12172658B2 (en) | Autonomous driving control system and control method and device
US7831860B2 (en) | System and method for testing redundancy and hot-swapping capability of a redundant power supply
CN111385107B (en) | Main/standby switching processing method and device for server
CN107819605A (en) | Method and apparatus for the switching server in server cluster
CN102394791A (en) | Downtime recovery method and system
CN103312767A (en) | Cluster system
CN103595572B (en) | A kind of method of cloud computing cluster interior joint selfreparing
CN112477919A (en) | Dynamic redundancy backup method and system suitable for train control system platform
CN101442688A (en) | Method and system for upgrading intelligent network platform, controller and intelligent network platform equipment
CN101145973A (en) | Software upgrade method and device
CN105812161A (en) | Controller fault backup method and system
EP4554173A1 (en) | Network fault recovery method and apparatus, device, and storage medium
CN107491344B (en) | Method and device for realizing high availability of virtual machine
CN117872709A (en) | Equipment redundancy method, device, equipment and computer readable storage medium
CN114358340B (en) | Device online maintenance method, device, equipment and computer-readable storage medium
US20170149693A1 (en) | Computer-readable recording medium, switch controlling apparatus, and method of controlling a switch
CN119645473A (en) | Cross-platform automatic deployment method and system for upgrade patch of operating system
CN119668916A (en) | Cluster system fault handling method, system, device, equipment, medium and program
CN112104497A (en) | Terminal management method, device, system, server, terminal and storage medium
CN113468806A (en) | Fault detection method and device for energy storage charging pile and computer readable storage medium
CN117914706A (en) | Network configuration method and system
CN104754562A (en) | Method and device for repairing data replication abnormity
CN115664925B (en) | Method, device and equipment for handling faults of emergency nodes
CN106255960B (en) | Redundant systems and communication units
CN105306256B (en) | A kind of two-node cluster hot backup implementation method based on VxWorks equipment

Legal Events

Date | Code | Title | Description
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | GR01 | Patent grant |
