
Technical Field
The present invention relates to a message bus monitoring method and device, and in particular to a method and device for monitoring the message bus of a power system.
Background
As China's power grid continues to expand, with hybrid ultra-high-voltage (UHV) AC/DC operation and large-scale centralized integration of new energy sources, the State Grid has gradually formed a pattern in which the UHV-centered backbone transmission network and local transmission and distribution networks develop in parallel. The rapid growth of the grid and the formation of large-scale interconnected grids have shifted grid characteristics from a regional mode to a global mode. Grid operation and dispatch control have become data-intensive, communication-intensive, and computation-intensive, and the safe operation of grid dispatch and control faces severe challenges. With the rapid development of the UHV AC/DC hybrid grid and the large-scale influx of new energy, operating pressure on the grid is continuously passed on to dispatch operations. The dispatch automation master station system is the core support system for grid operation control and dispatch production management, and the center where all kinds of grid data are aggregated and processed. Its latent problems are becoming increasingly prominent and will become a major hidden danger to the safe operation of the large grid.
After its release in 2009, the smart grid dispatching and control system was widely deployed in 32 dispatch control centers at provincial level and above and in more than one hundred prefecture-level dispatch centers. The system is developed on domestic servers and on basic software such as domestic general-purpose operating systems and databases. Within it, the message bus handles message management, effectively supports applications such as data acquisition, data processing, incident replay, and public services, and delivers messages on demand and quickly. The bus has accumulated extensive operating experience in dispatch automation systems over the long term and has withstood various incidents and special conditions, but the weakness of bus monitoring has gradually become apparent: bus data transmission lacks effective supervision and visual presentation, no reliable monitoring data is available to operation and maintenance staff for on-duty inspection, and problems in message transmission cannot be reflected promptly and intuitively. How to detect secondary-system anomalies promptly, locate their causes, and ensure stable data transmission in the dispatch automation master station system is a problem that urgently needs to be solved to keep the grid operating safely and stably.
At present, in the field of computer networks, there are implementations of message bus monitoring for embedded systems built directly on hardware devices. For example, Chinese patent application No. 200710067178.8 discloses a message-bus-based alarm management device for an embedded monitoring system. The device effectively monitors external alarm sources connected to an embedded digital video system, stores the data, and enables the embedded digital video monitoring system to handle different alarm information in a linked manner. As another example, Chinese patent application No. 201610446984.5 discloses a monitoring system and method for a message bus, in which the message bus is implemented by hardware devices and provides normalization processing and alarm management functions. However, both of these monitoring techniques target embedded-system message buses built directly on hardware devices. Such a bus is fixed in a memory chip or microcontroller, its software and hardware are tightly coupled, it serves specific low-level applications, and it does not support modification or secondary development. By contrast, a message bus in the middleware layer of a general-purpose operating system has high requirements for generality, portability, and iterability, and there is currently no effective means of monitoring it. As a result, the ability to respond to sudden faults and anomalies in this area is clearly insufficient. Detailed and complete logging is also lacking, which makes it difficult to examine and trace historical operating states and hinders fault analysis and location.
Summary of the Invention
Purpose of the invention: In view of the above problems, a method and device for monitoring a power system message bus are proposed, so as to monitor the message bus in the middleware layer of a general-purpose operating system and to build a panoramic real-time monitoring system covering message registration status, message subscription information, message queue backlog information, and node message transmission rate and data traffic, thereby strengthening message bus monitoring and improving process monitoring.
Technical solution: In one aspect, the present invention relates to a method for monitoring a power system message bus. The method comprises: each node collecting and aggregating message monitoring information from the message bus; displaying the aggregated message monitoring information on each node; and each node persisting the message monitoring information to disk.
Further, each node collecting and aggregating message monitoring information from the message bus specifically comprises: initializing the shared memory and locks, obtaining the local host name and node ID of the current node, and registering with process management; then determining whether the current node is set as a server node or as a client node in the configuration file; if it is a server node, creating a server thread, a message synchronization server thread, and a message synchronization client thread; if it is a client node, creating a message synchronization client thread.
Further, the server thread calls the service registration interface to register a service and publishes the service using the request/response model, then processes service requests in a callback function. The callback function determines whether the node to be queried by the service request is the current node; if so, it calls the local query function, and otherwise it queries the buffer. During a query, the corresponding interface is selected by operation code and search item, and the query results are assembled into a response message and returned.
Further, the message synchronization server thread calls an initialization function to initialize the server, then registers the events of interest on file descriptors in an event table and listens. When I/O events on the file descriptors are detected, the interface returns the number of I/O events, and the corresponding file descriptors and event types are returned through output parameters. The thread iterates over all I/O events and uses the file descriptor in each event to determine the nature of the connection. If the I/O event was received on the listening socket, a new client is requesting a connection, so the connection is accepted and the client is registered in the event table. If the I/O event was not received on the listening socket, an already connected client is sending data, so the monitoring information synchronized from the other node is received. Receiving this monitoring information takes three receives: the first receives the synchronization message header, checks its validity, and updates the node's running status information; the second receives the number of monitoring records and obtains the length of the synchronization message; the third receives the content of the synchronization message and parses it. Parsing mainly places the traffic, subscription, and queue portions of the synchronization message into the corresponding in-memory containers, which hold the traffic information, channel subscription information, and message queue information of all nodes.
Further, the message synchronization client thread creates a message synchronization client request thread for each server node, in which the client is initialized and connected to the server. It then reads this node's traffic information, subscription information, and message queue information from shared memory into a cache and copies the node's monitoring information into the body of the synchronization message. Finally, it sends the three synchronization messages. If any one of them fails, the client reconnects to the server so that the sending and receiving order stays consistent. The client pushes monitoring information to the server periodically, with the period read from the configuration file.
Further, displaying the message monitoring information on each node comprises: displaying, on each node, the running status monitoring information, message traffic statistics, and node monitoring detail information of the current node. The running status monitoring information includes: whether each node has sent or received messages at the second, minute, and hour levels; whether the message queues of any application on each node have a backlog; and whether there are abnormal nodes. An abnormal node is a node that has had no message interaction with other nodes within the minute-level time granularity. The message traffic statistics include: the sent message traffic, received message traffic, and total sent and received message traffic of each node's message bus, counted at the second, minute, and hour levels. The node monitoring detail information includes: the information obtained by querying the details of sent or received message traffic using one or more of domain name, state name, channel name, and time granularity as search conditions.
Further, each node persisting the message monitoring information to disk comprises: writing important monitoring information to the file system as logs for later backtracking and tracing. The log files record key data from program execution, comprehensively recording operating information such as message bus startup, exit, and faults, and key information such as message send failures, retransmissions after packet loss, and duplicate-message filtering.
In another aspect, the present invention relates to a power system message bus monitoring device. The device comprises: a component for collecting and aggregating message monitoring information from the message bus; a component for displaying the aggregated message monitoring information; and a component for persisting the message monitoring information to disk.
Beneficial effects: compared with the prior art:
1. The present invention sets up different threads for different nodes and can monitor the message bus in the middleware layer of a general-purpose operating system, so that sudden faults and anomalies are handled better, the ability to examine and trace historical operating states is improved, and fault analysis and location are facilitated.
2. Existing power grid dispatching and control systems lack statistics, integration, and analysis of monitoring information, provide no corresponding interface, cannot offer vivid and intuitive monitoring of messages and services, and lack detailed and complete logging. To address these problems, the present invention builds a panoramic real-time message monitoring system that monitors the running status of the message bus across the whole system, including message registration information, message subscription information, message backlog information, and message traffic information. On this basis, statistics are analyzed along multiple dimensions such as "domain", "state", and "channel" to assess the running status of system messages accurately. Some important monitoring information is written to the file system as logs, which makes it possible to examine and trace historical operating states and facilitates fault analysis and accurate location.
3. The message monitoring technique significantly strengthens bus monitoring in the power grid dispatching and control system, improves process monitoring, provides comprehensive and intuitive status information on data transmission and interaction in the system, and helps keep the grid operating safely and stably.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the layered design of message bus monitoring in the bus monitoring method of the present invention.
Detailed Description
The present invention is further described below with reference to the accompanying drawing and specific embodiments.
The overall design of the present invention targets message bus monitoring for the dispatch automation master station system. First, based on the characteristics of the message bus transmission process, key monitoring indicators such as message transmission traffic and backlog are studied, and a centralized, structured method of collecting indicator information is adopted to realize multi-dimensional monitoring of the message bus. Second, an efficient, computable log format for disk persistence is designed, which stores monitoring data such as message transmission and message subscription at very low resource cost. Finally, data structures and statistical algorithms for the key monitoring indicators are proposed and, combined with timely analysis of the message bus, used to capture message backlog and transmission anomalies. This achieves fine-grained monitoring of the whole message transmission process and provides strong support for the stable operation of the dispatch automation master station system.
Key Technical Points of Message Bus Monitoring
(1) Functional description
The message monitoring module mainly provides the following functions:
a) Message monitoring information collection. The message monitoring information of the local node is obtained by reading shared memory; the message monitoring information of other nodes is synchronized over TCP to the monitoring service center node (the monitoring service center is specified in the configuration file).
b) Message monitoring service. The message monitoring service includes message traffic monitoring, classified message statistics (by time period and by node), message backlog monitoring, and subscription information monitoring.
c) Message monitoring display. Display is available through a graphical interface and through the terminal command line.
(2) Layered design
The design of the message bus monitoring module is shown in Fig. 1. It uses a layered design in which each layer solves a specific problem and lower layers provide data and services to the layers above.
a) Data collection layer: covers the message monitoring information of the local node and of other nodes, including message sending, receiving, subscription, and backlog. The local node's message monitoring information is obtained by accessing shared memory; that of other nodes must be obtained over the network.
b) Data buffer layer: supplies data to the data processing layer, including system-wide data and per-node data. To make interface access efficient, the data buffer layer itself performs classification and calculation, so the interface can read the latest computed results directly instead of computing them at access time.
c) Data processing layer: returns data through interfaces that are called in the service bus callback function, and provides six categories of service: message sending, receiving, subscription, backlog, statistics, and node running status (message statistics implement functions such as aggregation and sorting; node running status is used to detect and diagnose whether a node's message bus is functioning normally).
d) Data display layer: the message bus monitoring interface and client tools, for use by users, operation and maintenance staff, and developers.
e) Log storage layer: mainly records information such as message bus startup, exit, and faults, and key information such as message send failures, retransmissions after packet loss, and duplicate-message filtering; it also deletes expired log files automatically.
(3) Specific implementation
1) Collection and aggregation of monitoring information
First, the shared memory and locks are initialized, the local host name and node ID are obtained, and the process is registered with process management. Then it is determined whether the current node is set as a server node in the configuration file. If it is, a server thread, a message synchronization server thread, and a message synchronization client thread are created; otherwise only a message synchronization client thread is created.
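By way of illustration only, the role decision described above can be sketched as follows. The `MonitorConfig` class, the thread names, and the host names are hypothetical; the specification does not define the configuration file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitorConfig:
    """Hypothetical view of the configuration file: which hosts are server nodes."""
    server_nodes: frozenset


def threads_for_node(hostname: str, config: MonitorConfig) -> list:
    """Return the names of the threads this node should create for its role."""
    if hostname in config.server_nodes:
        # A server node answers queries, receives pushes, and pushes its own data.
        return ["server", "sync_server", "sync_client"]
    # Any other node only pushes its monitoring data to the server nodes.
    return ["sync_client"]
```

In this sketch a server node also runs the synchronization client, which matches the three threads listed above for server nodes.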
The server thread calls the service registration interface to register a service and publishes the service using the request/response model, then processes service requests in a callback function. The callback function determines whether the node to be queried by the service request is the current node; if so, it calls the local query function, and otherwise it queries the buffer. During a query, the corresponding interface is selected by operation code and search item, and the query results are assembled into a response message and returned.
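A minimal sketch of such a callback is given below, under stated assumptions: the operation codes, the dictionary-shaped request, and the JSON response are all hypothetical, since the specification does not fix a message format. The local query function stands in for a read of the local node's shared memory.

```python
import json

# Hypothetical operation codes; the specification does not enumerate them.
OP_FLOW, OP_SUBSCRIPTION, OP_QUEUE = 1, 2, 3


def make_handler(local_node, local_query, buffer):
    """Build a request callback.

    local_query(opcode, key) reads the local node's data.
    buffer maps node -> opcode -> key -> value for data synchronized from other nodes.
    """
    def on_request(request: dict) -> str:
        node, opcode, key = request["node"], request["opcode"], request.get("key")
        if node == local_node:
            result = local_query(opcode, key)                        # local node: query directly
        else:
            result = buffer.get(node, {}).get(opcode, {}).get(key)   # other node: query the buffer
        # Assemble the response message.
        return json.dumps({"node": node, "opcode": opcode, "result": result})
    return on_request
```

A query for a node that has not yet synchronized any data simply yields a `null` result in this sketch.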
The message synchronization client thread first creates a message synchronization client request thread for each server node, in which the client is initialized and connected to the server. It then reads this node's traffic information, subscription information, and message queue information from shared memory into a cache and copies the node's monitoring information into the body of the synchronization message. Finally, it sends the three synchronization messages. If any one of them fails, the client reconnects to the server so that the sending and receiving order stays consistent. The client pushes monitoring information to the server periodically, with the period read from the configuration file.
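The three-message push and the reconnect-on-failure rule can be sketched as follows. The wire layout (`MAGIC`, `HEADER`, `META`, a JSON body) is an assumption made for illustration; the specification states only that a header, a count and length, and a body are sent in that order.

```python
import json
import struct

MAGIC = 0x4D42                 # hypothetical marker used to validate the header
HEADER = struct.Struct("!HI")  # magic, node id
META = struct.Struct("!II")    # record count, body length in bytes


def build_sync_frames(node_id: int, records: list) -> tuple:
    """Encode one push as three frames: header, count/length, body."""
    body = json.dumps(records).encode("utf-8")
    return HEADER.pack(MAGIC, node_id), META.pack(len(records), len(body)), body


class SyncClient:
    """Keeps one connection to a server node and pushes frames in order."""

    def __init__(self, connect):
        self._connect = connect  # callable returning a socket-like object
        self._conn = None

    def push(self, frames, max_attempts: int = 2) -> bool:
        for _ in range(max_attempts):
            try:
                if self._conn is None:
                    self._conn = self._connect()
                for frame in frames:
                    self._conn.sendall(frame)
                return True
            except OSError:
                # Drop the connection so the next attempt restarts from the header.
                if self._conn is not None:
                    self._conn.close()
                self._conn = None
        return False
```

A periodic caller would invoke `push` once per configured period. Restarting from the header on a fresh connection is what keeps the sender and receiver in step after a partial failure.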
The message synchronization server thread first calls an initialization function to initialize the server, then registers the events of interest on file descriptors in an event table and listens. When I/O events on the file descriptors are detected, the interface returns the number of I/O events, and the corresponding file descriptors and event types are returned through output parameters. The thread iterates over all I/O events and uses the file descriptor in each event to determine the nature of the connection. If the I/O event was received on the listening socket, a new client is requesting a connection, so the connection is accepted and the client is registered in the event table. If the I/O event was not received on the listening socket, an already connected client is sending data, so the monitoring information synchronized from the other node is received. This takes three receives: the first receives the synchronization message header, checks its validity, and updates the node's running status information; the second receives the number of monitoring records and obtains the length of the synchronization message; the third receives the content of the synchronization message and parses it. Parsing mainly places the traffic, subscription, and queue portions of the synchronization message into the corresponding in-memory containers, which hold the traffic information, channel subscription information, and message queue information of all nodes.
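The event loop and the three receives can be sketched with Python's standard `selectors` module, which wraps epoll-style interfaces. The frame layout (`MAGIC`, `HEADER`, `META`, JSON body with a `kind` field) is a hypothetical assumption, not the format used by the invention.

```python
import json
import selectors
import socket
import struct

MAGIC = 0x4D42                 # hypothetical header marker
HEADER = struct.Struct("!HI")  # magic, node id
META = struct.Struct("!II")    # record count, body length in bytes


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise ConnectionError if the peer closes early."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed during sync")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def receive_sync(sock: socket.socket, store: dict) -> int:
    """Perform the three receives and file records by kind; return the node id."""
    magic, node_id = HEADER.unpack(recv_exact(sock, HEADER.size))  # 1st: header
    if magic != MAGIC:
        raise ValueError("invalid sync header")
    count, length = META.unpack(recv_exact(sock, META.size))       # 2nd: count and length
    records = json.loads(recv_exact(sock, length))                 # 3rd: body
    if len(records) != count:
        raise ValueError("record count mismatch")
    for rec in records:
        # rec["kind"] is assumed to be "flow", "subscription", or "queue".
        store.setdefault(rec["kind"], {}).setdefault(node_id, []).append(rec)
    return node_id


def poll_once(sel, listener, store: dict, timeout=None) -> None:
    """Handle one batch of I/O events from the event table."""
    for key, _events in sel.select(timeout):
        if key.fileobj is listener:
            conn, _addr = listener.accept()       # new client: accept and register it
            sel.register(conn, selectors.EVENT_READ)
        else:
            try:
                receive_sync(key.fileobj, store)  # connected client: read its push
            except (ConnectionError, ValueError):
                sel.unregister(key.fileobj)
                key.fileobj.close()
```

For brevity this sketch performs blocking reads once a socket becomes readable; a production loop would more likely buffer partial frames per connection.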
2) Display of monitoring information
a) Running status overview interface
From a macro perspective, this interface monitors whether each node has sent or received messages at the second, minute, and hour levels, and whether the message queues of any application on each node have a backlog. The interface provides a "show abnormal nodes only" option: a node that has had no message interaction with other nodes within the minute-level time granularity is treated as a possibly abnormal node. A drop-down menu allows a node to be selected; if no specific node is selected, all nodes in the system are shown by default.
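One way to derive an overview row per node is sketched below. The `NodeStatus` structure and the input shapes are hypothetical; only the rule that a node with no minute-level message interaction is a suspected abnormal node comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    node: str
    active: dict              # granularity -> whether any message was sent or received
    has_backlog: bool         # any application queue holds pending messages
    suspected_abnormal: bool  # no message interaction at minute granularity


def overview(node: str, counts: dict, queue_depths: dict) -> NodeStatus:
    """Build one overview row.

    counts maps granularity ("second", "minute", "hour") to messages sent plus received.
    queue_depths maps application name to the number of pending messages.
    """
    active = {g: counts.get(g, 0) > 0 for g in ("second", "minute", "hour")}
    backlog = any(depth > 0 for depth in queue_depths.values())
    return NodeStatus(node, active, backlog, not active["minute"])
```

A "show abnormal nodes only" view would then keep the rows whose `suspected_abnormal` flag is set.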
b) Message traffic statistics interface
This interface counts the sent message traffic, received message traffic, and total sent and received message traffic of each node's message bus at the second, minute, and hour levels.
The interface supports queries at different time granularities and node selection through a drop-down menu; if no specific node is selected, all nodes in the system are shown by default.
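As an illustrative sketch, the three time granularities can be treated as trailing windows over timestamped events. The event tuple layout and the reading of "traffic" as a size per event are assumptions; the specification does not state the unit.

```python
# Trailing window length in seconds for each granularity.
WINDOWS = {"second": 1, "minute": 60, "hour": 3600}


def traffic_summary(events, now: float) -> dict:
    """Sum sent, received, and total traffic per granularity.

    events is an iterable of (timestamp_seconds, direction, size),
    where direction is "send" or "recv".
    """
    summary = {g: {"send": 0, "recv": 0, "total": 0} for g in WINDOWS}
    for ts, direction, size in events:
        age = now - ts
        for granularity, span in WINDOWS.items():
            if 0 <= age < span:
                summary[granularity][direction] += size
                summary[granularity]["total"] += size
    return summary
```

Rescanning all events on every call is the simplest form; the data buffer layer described earlier suggests the real system keeps precomputed counters instead.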
c) Node monitoring details interface
This interface allows one or more of domain name, state name, channel name, and time granularity to be specified as search conditions for a detailed query of sent or received message traffic. It also supports querying the process subscription information and process message queue backlog information of any node in the system.
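A query with optional search conditions can be sketched as below. The `FlowRecord` fields mirror the four search conditions named above, but the record layout itself and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    domain: str
    state: str
    channel: str
    granularity: str
    direction: str  # "send" or "recv"
    amount: int


def query(records, direction, domain=None, state=None, channel=None, granularity=None):
    """Return the records for one direction that match every condition that was given."""
    wanted = {"domain": domain, "state": state, "channel": channel, "granularity": granularity}
    given = {name: value for name, value in wanted.items() if value is not None}
    return [
        r for r in records
        if r.direction == direction
        and all(getattr(r, name) == value for name, value in given.items())
    ]
```

Leaving a condition unset means it does not restrict the result, which corresponds to specifying "one or more" of the conditions.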
3) Persisting monitoring information to disk
Important monitoring information is written to the file system as logs for later backtracking and tracing. The log files record key data from program execution: they comprehensively record operating information such as message bus startup, exit, and faults, as well as key information such as message send failures, retransmissions after packet loss, and duplicate-message filtering. The log files can be used to trace the program's running state and analyze the causes of runtime errors, which simplifies maintenance and troubleshooting. The time information in the log files also shows how long the current program has been running, which serves as reference data for system maintenance or other work.
Message logs are created automatically and expired logs are deleted on a schedule, so that log files do not grow too large or take up too much disk space. Log files are kept for 30 days and are deleted automatically after that. A dedicated thread periodically checks the status of the log files, deletes expired ones, and creates new ones.
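The retention check performed by the dedicated thread might look like the following sketch. The `*.log` naming pattern and the use of file modification time to decide expiry are assumptions; only the 30-day retention period is taken from the text.

```python
import time
from pathlib import Path

RETENTION_SECONDS = 30 * 24 * 3600  # 30 days, as stated above


def purge_expired(log_dir: Path, now=None) -> list:
    """Delete *.log files whose modification time is older than the retention period.

    Returns the names of the deleted files, sorted by path.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(log_dir.glob("*.log")):
        if path.is_file() and now - path.stat().st_mtime > RETENTION_SECONDS:
            path.unlink()
            removed.append(path.name)
    return removed
```

A housekeeping thread could call `purge_expired` at a fixed interval and open a new log file afterwards.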
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the above embodiments, the interaction between the control node and the edge computing nodes, the content of the collected feedback information, and the online scheduling method in the present invention are applicable to all such systems. Those of ordinary skill in the art should understand that the specific embodiments of the present invention may still be modified or equivalently replaced, and that any modification or equivalent replacement that does not depart from the spirit and scope of the present invention shall fall within the scope of protection of the claims of the present invention.
| Application Number | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN202110053151.3A | CN112865311B (en) | 2021-01-15 | 2021-01-15 | Method and device for monitoring message bus of power system |
| Publication Number | Publication Date |
|---|---|
| CN112865311A (en) | 2021-05-28 |
| CN112865311B (en) | 2022-11-01 |
| Application Number | Status | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| CN202110053151.3A | Active | CN112865311B (en) | 2021-01-15 | 2021-01-15 | Method and device for monitoring message bus of power system |
| Country | Link |
|---|---|
| CN (1) | CN112865311B (en) |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114020453A (en)* | 2021-10-26 | 2022-02-08 | 浙江鸿泉电子科技有限公司 | Service bus message processing method, service bus system, electronic device, and medium |
| CN114422333B (en)* | 2021-12-27 | 2023-11-10 | 广西壮族自治区公众信息产业有限公司 | Message consumption method and system based on message middleware back pressure |
| CN115102278B (en)* | 2022-06-16 | 2024-01-23 | 国网信息通信产业集团有限公司 | Distributed photovoltaic power quality configuration monitoring system and method |
| CN115348160B (en)* | 2022-07-15 | 2024-09-27 | 深圳手回科技集团有限公司 | Retrospective data storage method and device and computer equipment |
| CN115840680B (en)* | 2022-12-23 | 2025-09-19 | 中国电子科技集团公司第二十九研究所 | Message bus-based multi-cooperative task monitoring system |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102238105A (en)* | 2010-04-28 | 2011-11-09 | 捷讯研究有限公司 | System and method for distributing messages to an electronic device based on communications between devices |
| CN102360310A (en)* | 2011-09-28 | 2012-02-22 | 中国电子科技集团公司第二十八研究所 | Multitask process monitoring method and system in distributed system environment |
| CN110515938A (en)* | 2019-05-09 | 2019-11-29 | 北京科东电力控制系统有限责任公司 | Data aggregation storage method, device and storage medium based on KAFKA message bus |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20040205048A1 (en)* | 2003-03-28 | 2004-10-14 | Pizzo Michael J. | Systems and methods for requesting and receiving database change notifications |
| Title |
|---|
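| "Research and Application of Multi-Source Data Acquisition and Monitoring Technology for Electric Power Big Data" (电力大数据多元数据采集监视技术研究与应用); Sun Chao (孙超) et al.; Computer Technology and Development (《计算机技术与发展》); 2020-07-10 (No. 07); full text* |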
| Publication | Title | Publication Date |
|---|---|---|
| CN112865311B (en) | Method and device for monitoring message bus of power system | |
| CN107302466B (en) | Big data analysis platform and method for dynamic loop monitoring system | |
| CN110457190B (en) | Full link monitoring method, device and system based on block chain | |
| CN114500250B (en) | System linkage comprehensive operation and maintenance system and method in cloud mode | |
| CN111464336B (en) | A method and system for high concurrent data processing based on power communication room | |
| CN102521781B (en) | Safe region-crossing equipment uniform monitoring method based on independent monitoring services, and monitoring system for the same | |
| CN111077870A (en) | An intelligent system and method for real-time acquisition and monitoring of OPC data based on flow computing | |
| CN103295155B (en) | Security core service system method for supervising | |
| CN103337012B (en) | Towards the multi-threaded intelligent comprehensive alert analysis method of grid equipment monitoring | |
| CN103532744A (en) | Information-communication integrated supporting platform of intelligent power grid | |
| WO2018064843A1 (en) | System and method for managing infrastructure of data center | |
| CN112688819A (en) | Comprehensive management system for network operation and maintenance | |
| CN108777637A (en) | A kind of data center's total management system and method for supporting server isomery | |
| CN111585352A (en) | A power monitoring system based on PMS2.0 system, equipment and readable storage medium | |
| CN112749060A (en) | Power system service bus monitoring method | |
| CN112052134A (en) | Method and device for monitoring service data | |
| CN104125085A (en) | EBS (Enterprise Service Bus) data management and control method and device | |
| CN106789398A (en) | A kind of method of media big data hadoop cluster monitoring | |
| CN111984495A (en) | A big data monitoring method, device and storage medium | |
| CN117112656A (en) | An integrated information intelligent management system and method for scientific and technological volunteer service management | |
| CN117729576A (en) | Alarm monitoring methods, devices, equipment and storage media | |
| CN202150114U (en) | Oracle monitoring system | |
| CN114138720A (en) | Log processing method, log processing device, electronic device and storage medium | |
| CN109800133A (en) | A kind of method, one-stop monitoring alarm platform and the system of unified monitoring alarm | |
| CN116594840A (en) | Log fault acquisition and analysis method, system, equipment and medium based on ELK | |
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |