Technical Field
The present invention relates to the field of network communication, and in particular to a device and method for detecting the state of service network ports and providing fault tolerance for them.
Background
With the continuous development of Internet technology and of radio and television technology, network-based services are growing rapidly, and network services will become one of the core applications of the future Internet.
A network service system serves users by providing various content in response to their requests. Such a system demands highly reliable equipment: the equipment must run stably for long periods and must recover immediately when an abnormal condition occurs. This requires a series of detection and prevention mechanisms.
Typically, if the transmission link of a node that is currently providing service fails because of congestion, if the node's service capacity drops suddenly, or if the received data is incomplete, the user experience suffers badly. To keep the service received by user nodes continuous, fault-tolerance mechanisms must ensure that the service capability of the network is either unaffected or restored as quickly as possible.
Network port fault tolerance is a fault-tolerance mechanism commonly used in the industry. The usual approach is to reserve a standby network port in advance. When a port is detected to be abnormal, its services are migrated to the standby port, and when the abnormal port recovers, the services are migrated back. As a result, the standby port is idle most of the time and the port resources are not fully used.
Summary of the Invention
The object of the present invention is to overcome the defect that prior-art network port fault-tolerance methods require a dedicated standby port and therefore cannot make full use of port resources, and thereby to provide a device and method that can promptly migrate the data served by an abnormal port to a port with spare bandwidth.
To achieve the above object, the present invention provides a device for service network port state detection and fault tolerance. The device manages the network ports on a server and comprises a timed packet-sending module, a timed detection module, and a resource management module, wherein:
the timed packet-sending module periodically sends ICMP ping packets to a designated gateway or to each network port of the server, so that the inbound traffic of each port grows at regular intervals; the timed detection module periodically queries the traffic register of each port on the server for that port's inbound traffic, and reports to the resource management module as abnormal any port whose inbound traffic growth falls short of the required amount; the resource management module looks up all load information of the abnormal port and transfers those loads to other ports whose spare bandwidth can satisfy the load demand.
In the above technical solution, the timed detection module compares the growth of a port's inbound traffic with the specific amount of data that the port itself sent; if the growth is smaller than that amount, the port's inbound traffic growth falls short of the required amount.
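The comparison described above can be sketched as a small predicate. This is a minimal illustration and not the patented implementation: the function names, the 32-bit counter width, and the wraparound handling are assumptions that do not appear in the source.

```python
COUNTER_BITS = 32  # assumed width of the traffic register; the source does not state it


def inbound_growth(previous: int, current: int, bits: int = COUNTER_BITS) -> int:
    """Bytes received between two register reads, tolerating one counter wraparound."""
    return (current - previous) % (1 << bits)


def port_is_normal(previous: int, current: int, probe_bytes: int) -> bool:
    """A port is normal when its inbound growth is not less than the probe data sent."""
    return inbound_growth(previous, current) >= probe_bytes
```

For example, `port_is_normal(1000, 1100, 294)` evaluates to false because only 100 bytes arrived while 294 bytes of probe traffic were expected.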
In the above technical solution, the period of the timed detection module is longer than the period of the timed packet-sending module.
In the above technical solution, the resource management module records the state information and the related service load information of each network port. The state information of a port includes the port number, whether the port is operating normally, the maximum outbound bandwidth of the port, and the outbound bandwidth already in use. The related service load information of a port comprises a record for each load on that port, and each load record includes the bandwidth used by the load, the destination MAC, the destination IP, the source IP, and the source MAC.
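One possible in-memory layout for these records is sketched below. The class names, field names, the `add_load` helper, and the bandwidth unit are hypothetical; the source only lists which items are recorded.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Load:
    bandwidth: int  # bandwidth used by the load (unit assumed, e.g. Mbit/s)
    dst_mac: str
    dst_ip: str
    src_ip: str
    src_mac: str


@dataclass
class PortRecord:
    port_no: int
    is_normal: bool
    max_out_bandwidth: int
    used_out_bandwidth: int = 0
    loads: List[Load] = field(default_factory=list)

    @property
    def spare_bandwidth(self) -> int:
        """Outbound bandwidth still available on this port."""
        return self.max_out_bandwidth - self.used_out_bandwidth

    def add_load(self, load: Load) -> None:
        """Attach a load and keep the used-bandwidth figure in step with it."""
        self.loads.append(load)
        self.used_out_bandwidth += load.bandwidth
```

Keeping the used outbound bandwidth as a stored field, rather than recomputing it from the loads, mirrors the source, which describes updating that figure when loads move between ports.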
In the above technical solution, the resource management module transfers the loads of the abnormal port to other ports whose spare bandwidth can satisfy the load demand as follows: it selects a fault-tolerant port whose spare bandwidth can satisfy the load demand and migrates the load data of the abnormal port to that fault-tolerant port for output; the source IP and source MAC in the original load data packets are changed to the IP and MAC of the fault-tolerant port; and the used outbound bandwidth of both the abnormal port and the fault-tolerant port, as well as the source IP and source MAC in the load information, are updated. If no port has spare bandwidth, the user is promptly notified that the service cannot continue, so that the user can restart the service.
The present invention further provides a method implemented on the basis of the above device for service network port state detection and fault tolerance, comprising:
Step 1): periodically sending ICMP ping packets to a designated gateway or to each network port;
Step 2): each network port periodically checking its own inbound traffic and determining whether the growth of the inbound traffic is not less than the specific amount of data that the port itself sent; if the condition is met, the port is operating normally; if it is not met, the port is in an abnormal state;
Step 3): recording the state information and the related service load information of each network port and, when a port is detected to be abnormal, migrating the load data served by that port to other ports with spare bandwidth for output; if no port has spare bandwidth, promptly notifying the user that the service cannot continue, so that the user can restart the service.
The advantages of the present invention are:
1. The device and method of the present invention can promptly detect an abnormal network port and migrate the services of that port to other ports on the server, thereby keeping user services uninterrupted;
2. The device and method of the present invention can make full use of the server's network port resources and increase the number of concurrent services.
Brief Description of the Drawings
Fig. 1 is a functional block diagram of the service network port state detection and fault-tolerance device of the present invention.
Detailed Description
The present invention is further described below with reference to the accompanying drawing.
The service network port state detection and fault-tolerance device of the present invention resides on a server and performs state detection and fault-tolerance handling for each network port on that server. As shown in Fig. 1, the device comprises a timed packet-sending module, a timed detection module, and a resource management module. The timed packet-sending module periodically sends ICMP ping packets to a designated gateway or to a given network port, so that the inbound traffic of that port grows at regular intervals. The timed detection module periodically queries the traffic register of each port on the server for that port's inbound traffic; if the inbound traffic growth of a port falls short of the required amount, it reports that port to the resource management module as abnormal. The resource management module looks up all load information of the abnormal port and, for each load, checks whether the remaining ports have spare bandwidth. If they do, it migrates the load data of the abnormal port to another port with spare bandwidth for output, according to a load-balancing strategy, and updates the corresponding port state information and related service load information. If they do not, it promptly notifies the user that the service cannot continue.
Each module of the device of the present invention is described further below.
The timed packet-sending module uses ICMP (Internet Control Message Protocol). ICMP is a sub-protocol of the TCP/IP protocol suite and carries control messages between IP hosts and routers. Control messages describe the state of the network itself, such as whether the network is reachable, whether a host is reachable, and whether a route is available. Although these messages do not carry user data, they play an important role in the delivery of user data. A "ping" packet checks whether the network is connected and is a useful aid in analyzing and diagnosing network faults. Because ICMP "ping" packets have a fixed format, using them is a general-purpose approach: the inbound traffic of a port can be increased either by pinging the port itself or by pinging a designated address or gateway and receiving the "ping" replies.
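To make the fixed format concrete, the following sketch builds an ICMP echo request (type 8, code 0) with the standard Internet checksum of RFC 1071. The function names, the identifier and sequence values, and the payload are illustrative assumptions. Actually transmitting the packet would require a raw socket and elevated privileges, which is omitted here.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int,
                       payload: bytes = b"port-probe") -> bytes:
    """Return an ICMP echo request: type, code, checksum, identifier, sequence, payload."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload
```

A receiver can validate such a packet by running the same checksum over the whole message: a correctly formed packet yields zero.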
The timed detection module periodically queries the traffic register of each port to determine whether the growth of that port's inbound traffic is not less than the specific amount of data that the port itself sent (that is, the traffic of the data stream containing the "ping" packets used specifically for port state detection). If it is, the outward link of the port is connected and the port is operating normally. If it is not, the port is in an abnormal state and cannot provide service; the module then reports this to the resource management module, which migrates the load data on that port to other ports for output so that the service continues. Note that the period of the timed detection module must be longer than the period of the timed packet-sending module; preferably, two to three "ping" packets are sent within one detection period.
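The sketch below shows one way the expected probe volume and the per-port inbound counters might be obtained. It rests on assumptions the source does not make: the "traffic register" is approximated by the received-bytes column of the Linux `/proc/net/dev` table, the helper names are hypothetical, and the 98-byte reply size is only an example figure.

```python
from typing import Dict


def parse_rx_bytes(proc_net_dev: str) -> Dict[str, int]:
    """Map interface name to received bytes from /proc/net/dev-style text."""
    counters: Dict[str, int] = {}
    for line in proc_net_dev.splitlines()[2:]:  # the first two lines are headers
        name, _, stats = line.partition(":")
        if stats.strip():
            counters[name.strip()] = int(stats.split()[0])
    return counters


def expected_probe_bytes(detect_period_s: float, send_period_s: float,
                         bytes_per_reply: int) -> int:
    """Bytes of ping replies that should arrive within one detection period."""
    if detect_period_s <= send_period_s:
        raise ValueError("detection period must exceed the sending period")
    return int(detect_period_s // send_period_s) * bytes_per_reply
```

With a 3-second detection period, a 1-second sending period, and an assumed 98 bytes counted per reply, `expected_probe_bytes(3.0, 1.0, 98)` gives 294, which corresponds to the three "ping" packets per detection period that the text recommends.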
The resource management module records the state information and the related service load information of each network port. The port state information includes the port number, whether the port is operating normally, the maximum outbound bandwidth of the port, the outbound bandwidth already in use, and so on. The related service load information of a port comprises a record for each load on that port, and each load record includes the bandwidth used by the load, the destination MAC, the destination IP, the source IP, the source MAC, and so on. When the timed detection module reports a port as abnormal, that port is treated as the faulty port. The resource management module looks up all load information of the faulty port and, for each load, checks whether the remaining bandwidth of the other ports can satisfy the bandwidth of that load. If it can, the module selects a fault-tolerant port according to a load-balancing strategy and migrates that load's data from the faulty port to the fault-tolerant port for output; the source IP and source MAC in the original load data packets are changed to the IP and MAC of the fault-tolerant port, and the used outbound bandwidth of the faulty port and of the fault-tolerant port, together with the source IP and source MAC in the load information, are updated. If no port has spare bandwidth, the user is promptly notified that the service cannot continue, so that the user can restart the service.
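The per-load migration described above can be sketched as follows. The source does not specify the load-balancing strategy, so choosing the candidate with the most spare bandwidth is an assumption, as are all class, field, and function names. Loads that cannot be placed are returned so that the caller can notify the affected users.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Load:
    bandwidth: int
    dst_ip: str
    src_ip: str
    src_mac: str


@dataclass
class Port:
    name: str
    ip: str
    mac: str
    max_out: int
    used_out: int = 0
    normal: bool = True
    loads: List[Load] = field(default_factory=list)

    @property
    def spare(self) -> int:
        return self.max_out - self.used_out


def pick_target(ports: List[Port], failed: Port, load: Load) -> Optional[Port]:
    """Assumed strategy: the normal port with the most spare bandwidth that fits the load."""
    fits = [p for p in ports
            if p is not failed and p.normal and p.spare >= load.bandwidth]
    return max(fits, key=lambda p: p.spare, default=None)


def fail_over(ports: List[Port], failed: Port) -> List[Load]:
    """Move each load off the faulty port; return the loads that found no target."""
    failed.normal = False
    unplaced: List[Load] = []
    for load in list(failed.loads):
        target = pick_target(ports, failed, load)
        if target is None:
            unplaced.append(load)  # caller should tell the user to restart the service
            continue
        failed.loads.remove(load)
        failed.used_out -= load.bandwidth
        load.src_ip, load.src_mac = target.ip, target.mac  # rewrite the source fields
        target.loads.append(load)
        target.used_out += load.bandwidth
    return unplaced
```

Because the target is chosen per load, the loads of one faulty port may be spread across several healthy ports, which is what lets the scheme work without a dedicated standby port.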
Based on the device of the present invention, the service network port state detection and fault-tolerance method implemented by the present invention comprises the following steps:
Step 1): each network port periodically sends data following a specific rule to itself, which includes pinging a specific address.
Step 2): each network port periodically checks its own inbound traffic and determines whether the growth of the inbound traffic is not less than the specific amount of data that the port itself sent. If the condition is met, the port is operating normally; if it is not met, the port is in an abnormal state.
Step 3): the state information and the related service load information of each network port are recorded. When a port is detected to be abnormal, the load data served by that port is migrated to other available ports for output, keeping user services uninterrupted.
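The interplay of the sending period and the detection period in the steps above can be illustrated with a small tick-based simulation. It is a sketch under stated assumptions: time is counted in abstract ticks, a reply is counted on a port only while its link is up, and the function name, period values, and reply size are hypothetical.

```python
from typing import Callable, Dict, List


def run_simulation(ports: List[str], link_up: Callable[[str, int], bool],
                   ticks: int, send_period: int = 1, detect_period: int = 3,
                   reply_bytes: int = 98) -> Dict[int, List[str]]:
    """Return {tick: [ports reported abnormal at that detection round]}."""
    rx = {p: 0 for p in ports}  # simulated inbound traffic counters
    last_seen = dict(rx)        # counter values at the previous detection round
    expected = (detect_period // send_period) * reply_bytes
    reports: Dict[int, List[str]] = {}
    for tick in range(1, ticks + 1):
        if tick % send_period == 0:  # step 1: probe round
            for p in ports:
                if link_up(p, tick):
                    rx[p] += reply_bytes
        if tick % detect_period == 0:  # step 2: detection round
            abnormal = [p for p in ports if rx[p] - last_seen[p] < expected]
            last_seen = dict(rx)
            if abnormal:
                reports[tick] = abnormal  # step 3 would migrate these ports' loads
    return reports
```

If the link of one port goes down after the first detection round, that port is reported at the next round, since none of the expected replies reach its counter.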
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions of the technical solution of the present invention that do not depart from its spirit and scope shall all fall within the scope of the claims of the present invention.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410245842.3A, CN105281929B (en) | 2014-06-04 | 2014-06-04 | A kind of service network interface state-detection and fault-tolerant devices and methods therefor |
| PCT/CN2014/093489, WO2015184759A1 (en) | 2014-06-04 | 2014-12-10 | Apparatus and method for state detection and fault tolerance of service network port |
| Publication Number | Publication Date |
|---|---|
| CN105281929A | 2016-01-27 |
| CN105281929B | 2018-10-02 |
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201410245842.3A (Expired - Fee Related), CN105281929B (en) | A kind of service network interface state-detection and fault-tolerant devices and methods therefor | 2014-06-04 | 2014-06-04 |
| Country | Link |
|---|---|
| CN (1) | CN105281929B (en) |
| WO (1) | WO2015184759A1 (en) |
| Publication number | Publication date |
|---|---|
| CN105281929B (en) | 2018-10-02 |
| WO2015184759A1 (en) | 2015-12-10 |
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20181002; Termination date: 20200604 |