CN120455497B


Info

Publication number: CN120455497B
Application number: CN202510947360.0A
Authority: CN
Inventors: 齐立; 宋佳旭; 田佳卉; 魏志华; 刘茂林
Original assignee: Shuntong Information Technology Dalian Co ltd
Current assignee: Shuntong Information Technology Dalian Co ltd
Priority date: 2025-07-10
Filing date: 2025-07-10
Publication date: 2025-09-02
Anticipated expiration: 2045-07-10
Also published as: CN120455497A

Abstract

Translated from Chinese

The present invention discloses a method for interconnecting industrial equipment based on an industrial control platform, in the technical field of industrial equipment interconnection. The method includes the following steps: for a retransmitted data frame with duplicate content, calculating the behavioral entanglement between that frame and the original frame it retransmits; if the entanglement exceeds a preset threshold, extracting the sequence offset, the time delay, and the field fluctuation amplitude to generate an indicator vector describing the timeliness of the data identifier; and inputting the indicator vector into a pre-trained graph embedding model to generate a first parameter and a second parameter, where the first parameter describes the consistency of the data frame with the historical evolution path and the second parameter characterizes the timeliness of the data frame's identifier. The invention addresses the problem that retransmitted data from industrial equipment cannot be accurately identified and, based on a causal graph and a graph embedding model, judges the timeliness and validity of status data.

Description


Industrial equipment interconnection method based on an industrial control platform

Technical Field

The present invention relates to the technical field of industrial equipment interconnection, and in particular to an industrial equipment interconnection method based on an industrial control platform.

Background Art

In modern industrial production, manufacturing equipment has become more diverse and intelligent, and a single site often runs equipment from different manufacturers with different communication protocols and interface standards. Interconnecting this equipment is therefore a key part of industrial automation and digitalization, since it is what allows centralized management and efficient coordination. Industrial equipment interconnection means bidirectional, compatible exchange of information such as collected data, control commands, and status feedback, so that different types of equipment can operate together in one system with shared data, unified monitoring, and centralized control. The industrial control platform acts as the hub for information aggregation and command dispatch. It provides computing, communication management, and business logic processing capabilities, and can handle large-scale device access and high-frequency data flow. An interconnection system built on such a platform provides unified access to and integrated management of many kinds of devices, and the platform's remote services support real-time monitoring, intelligent analysis, and coordinated control. This gives industrial enterprises centralized, intelligent support for production operations and makes production management more digital and efficient.

Existing interconnection technologies based on industrial control platforms usually integrate functional units such as multi-protocol communication modules, data acquisition modules, and equipment management modules into the platform to provide unified access to and centralized management of industrial equipment. At the device access layer, the platform connects devices of different brands and communication interfaces either by supporting mainstream industrial protocols such as Modbus, OPC UA, PROFINET, and EtherCAT, or by deploying protocol adapters and data gateways. In the data parsing and mapping stage, the platform converts the collected raw data to a uniform format and maps its semantics so that data structures from different devices can be processed together. At the data management and service scheduling layer, the platform uses preset data models and control logic to monitor device status in real time, store historical data, and analyze abnormal alarms, and it exposes the results to upper-level applications or operations staff through a unified interface. In the remote control and coordinated execution stage, the platform issues control commands to the connected devices to implement cross-device linkage and automated control strategies. Finally, at the security layer, existing systems generally rely on access control, identity authentication, communication encryption, and data isolation to protect the security and integrity of industrial data in transit and during processing. The result is an interconnection system centered on the industrial control platform that spans device access, data parsing, control scheduling, and security protection.

The prior art has the following deficiencies:

When an industrial control platform interconnects industrial equipment and the communication network experiences momentary packet loss or delay, some devices with a data retransmission mechanism automatically retransmit the status packet from the previous cycle. The content of such a packet is identical to the original data, but its timestamp or sequence number differs slightly. Because the device's sending behavior is opaque, the platform cannot directly tell whether the data is retransmitted or real-time. As long as the field content and format are valid, the platform treats the packet as the latest status and adopts it. Existing data processing does not identify or verify the timeliness of the data source or the behavioral context of the data, such as retransmission. When a retransmission occurs and the content is duplicated, the platform therefore cannot use the timeliness of the data identifier to decide whether the status fields should count as the device's current valid operating status, and it misjudges the device's true status. The platform then makes incorrect control decisions on the basis of a wrong device status, for example triggering exception handling repeatedly, interrupting the current task by mistake, or delaying critical commands. The consequences include a disrupted production rhythm, distorted scheduling logic, and loss of synchronization in the device control chain.

The information disclosed in this Background section is provided only to aid understanding of the background of the present disclosure, and it may therefore include information that does not constitute prior art already known to a person of ordinary skill in the art.

Summary of the Invention

The purpose of the present invention is to provide an industrial equipment interconnection method based on an industrial control platform that solves the problems described in the Background Art.

To achieve this purpose, the present invention provides the following technical solution: an industrial equipment interconnection method based on an industrial control platform, comprising the following steps:

constructing a causal graph between industrial equipment status fields and control instructions, which records the causal relationships between field combinations and task evolution paths;

receiving a data frame from industrial equipment and determining whether its field content matches a historical frame while its timestamp or sequence number differs, and if so, marking it as a retransmitted data frame with duplicate content;

for a retransmitted data frame with duplicate content, calculating the behavioral entanglement between it and the original frame it retransmits, and if the entanglement exceeds a preset threshold, extracting the sequence offset, the time delay, and the field fluctuation amplitude to generate an indicator vector describing the timeliness of the data identifier;

inputting the indicator vector into a pre-trained graph embedding model to generate a first parameter and a second parameter, where the first parameter describes the consistency of the data frame with the historical evolution path and the second parameter characterizes the timeliness of the data frame's identifier;

determining, from a joint judgment of the first parameter and the second parameter, whether the status fields of the data frame should be taken as the valid operating status of the current device;

updating the device status snapshot according to the result, and feeding the result back to the causal graph path to dynamically adjust the path confidence weights and the status identification strategy.
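The final feedback step can be pictured with a short Python sketch. The text does not specify how the path confidence weight is adjusted, so the exponential moving average used here, the default starting confidence of 0.5, and the names `apply_judgement`, `snapshot`, and `path_confidence` are all assumptions made for illustration only.

```python
def apply_judgement(snapshot, path_confidence, device_id, path, fields,
                    is_valid, rate=0.1):
    """Update the status snapshot and the confidence of one causal-graph path.

    Assumption: confidence follows an exponential moving average of the
    validity outcomes; the patent text does not prescribe an update rule.
    """
    if is_valid:
        # Only a frame judged valid replaces the device's current status.
        snapshot[device_id] = fields
    old = path_confidence.get(path, 0.5)  # hypothetical neutral starting value
    outcome = 1.0 if is_valid else 0.0
    path_confidence[path] = (1.0 - rate) * old + rate * outcome
    return path_confidence[path]


snapshot, confidence = {}, {}
path = ("field_group_a", "START_COOLING")  # hypothetical path key
apply_judgement(snapshot, confidence, "dev-1", path, (50, 3), is_valid=True)
apply_judgement(snapshot, confidence, "dev-1", path, (49, 3), is_valid=False)
```

In this sketch a frame judged invalid leaves the snapshot untouched and only lowers the confidence of the path it was matched against.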

Preferably, constructing the causal graph between industrial equipment status fields and control instructions comprises:

collecting historical operating data of the industrial equipment during task execution, extracting the field group contained in each status data frame and its corresponding control instruction, and building a set of triples, each consisting of a field group, a control instruction, and their order of occurrence;

representing the set of triples as directed edges in a graph, with the field group as the source node and the control instruction as the target node, and assigning each edge a time weight and a frequency weight along the task evolution timeline, to form a directed causal graph that can be used to reason about status evolution paths.

Preferably, receiving a data frame from industrial equipment, determining whether its field content matches a historical frame while its timestamp or sequence number differs, and if so marking it as a retransmitted data frame with duplicate content, comprises:

receiving a data frame from the industrial equipment, extracting multiple key status fields from it, arranging them according to the field collection order and the equipment's data structure template, and building a field sequence feature vector that represents the complete device status corresponding to the current data frame;

hash-encoding the field sequence feature vector, using an encoding that combines weighted field values with field position indices, to generate a field content signature that uniquely identifies a combination of status fields within a single device;

searching the historical data buffer for historical frames with the same field content signature, and selecting the one closest in time to the current data frame as the comparison frame;

checking whether the timestamps or sequence numbers of the current data frame and the comparison frame are non-increasing, reversed, or discontinuous, and if so, marking the current data frame as a retransmission candidate;

calculating the time offset between the reception time of the current data frame and the recording time of the comparison frame, and if the offset is less than the preset task cycle tolerance threshold and the field content of the current data frame is identical to that of the comparison frame, determining that the current data frame is a retransmitted data frame with duplicate content.
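The detection steps above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the `Frame` layout, the use of SHA-256 over position-tagged field values in place of the weighted encoding, the use of the receive time as the comparison frame's recording time, and the parameters `cycle_tolerance` and `expected_seq_step` are all assumptions.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    device_id: str
    seq: int             # sequence number assigned by the device
    timestamp: float     # device-side timestamp, in seconds
    received_at: float   # platform receive/record time, in seconds
    fields: tuple        # key status fields, in template order


def content_signature(frame):
    """Hash every field value together with its position index."""
    h = hashlib.sha256(frame.device_id.encode())
    for index, value in enumerate(frame.fields):
        h.update(f"{index}:{value!r};".encode())
    return h.hexdigest()


def is_duplicate_resend(current, history, cycle_tolerance=1.0,
                        expected_seq_step=1):
    sig = content_signature(current)
    matches = [f for f in history if content_signature(f) == sig]
    if not matches:
        return False
    # Comparison frame: the matching frame closest in time to the current one.
    ref = min(matches, key=lambda f: abs(current.received_at - f.received_at))
    non_increasing = (current.timestamp <= ref.timestamp
                      or current.seq <= ref.seq)
    discontinuous = current.seq - ref.seq > expected_seq_step
    if not (non_increasing or discontinuous):
        return False  # ordinary next-cycle frame that happens to repeat values
    offset = abs(current.received_at - ref.received_at)
    return offset < cycle_tolerance and current.fields == ref.fields


history = [Frame("dev-1", 10, 100.0, 100.1, (50, 3, 0))]
resend = Frame("dev-1", 10, 100.0, 100.4, (50, 3, 0))
print(is_duplicate_resend(resend, history))
```

The final equality check on `fields` guards against hash collisions, which matches the requirement that the field content be completely identical.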

Preferably, for a retransmitted data frame with duplicate content, calculating the behavioral entanglement between it and the original frame it retransmits and, if the entanglement exceeds a preset threshold, extracting the sequence offset, the time delay, and the field fluctuation amplitude to generate an indicator vector describing the timeliness of the data identifier, comprises the following steps:

collecting a preset number of field sequence vectors within the time windows before and after the retransmitted data frame and before and after the original frame, arranging them chronologically into field value time series, and assembling the field sequence vectors at each moment into a vector list in sampling order, to form vector sequence trajectories that represent the field evolution trend, where each trajectory reflects how the same field combination changes over time in the state space before and after the retransmission;

calculating the behavioral entanglement between the two vector sequence trajectories by combining cosine similarity and dynamic time warping, where cosine similarity measures the consistency of local field changes and dynamic time warping compares how well the overall temporal structures match;

if the behavioral entanglement exceeds the preset threshold, taking the difference between the timestamp of the retransmitted data frame and the timestamp of the original frame as the time delay, the difference between their sequence numbers as the sequence offset, and the mean difference of corresponding field values as the field fluctuation amplitude;

normalizing the time delay, the sequence offset, and the field fluctuation amplitude separately, and combining them in a preset order into an indicator vector on a common scale that characterizes the identifier timeliness of the retransmitted data frame.
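As a rough illustration of the last two steps, the sketch below builds the three-component indicator vector. The normalization scheme (division by an assumed maximum, clamped to [0, 1]) and the parameters `max_delay`, `max_seq_offset`, and `field_ranges` are assumptions, since the text only says the components are normalized. The text also does not say whether the field difference is taken between the two frames themselves or across their windows; this sketch takes it between whichever two field vectors are supplied.

```python
def clamp01(x):
    return max(0.0, min(1.0, x))


def indicator_vector(resend, original, max_delay, max_seq_offset, field_ranges):
    """Return [time delay, sequence offset, field fluctuation], each in [0, 1].

    `resend` and `original` are dicts with "timestamp", "seq" and "fields".
    """
    delay = abs(resend["timestamp"] - original["timestamp"])
    seq_offset = abs(resend["seq"] - original["seq"])
    # Mean absolute difference of corresponding fields, scaled per field.
    diffs = [abs(a - b) / r if r else 0.0
             for a, b, r in zip(resend["fields"], original["fields"],
                                field_ranges)]
    fluctuation = sum(diffs) / len(diffs) if diffs else 0.0
    return [clamp01(delay / max_delay),
            clamp01(seq_offset / max_seq_offset),
            clamp01(fluctuation)]


vec = indicator_vector(
    {"timestamp": 100.5, "seq": 12, "fields": (50.0, 3.0)},
    {"timestamp": 100.0, "seq": 10, "fields": (48.0, 3.0)},
    max_delay=2.0, max_seq_offset=4, field_ranges=(100.0, 10.0),
)
```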

Preferably, the behavioral entanglement between the two vector sequence trajectories is calculated from cosine similarity and dynamic time warping as follows:

arranging the field sequence vectors extracted within the preset time windows before and after the retransmitted data frame and before and after the original frame into two vector sequence trajectories in sampling order, and normalizing the field vectors in each trajectory by each field's maximum and minimum so that field values fall in a common interval;

applying dynamic time warping to the two normalized trajectories, calculating the Euclidean distance between each pair of field vectors across the two trajectories, building a one-to-one matching path from the minimum cumulative cost, and recording the sampling time position of each matched vector in its own trajectory;

calculating the cosine similarity of every matched pair of field vectors on the matching path to form a sequence of similarity values in path order, and taking the arithmetic mean of that sequence as the first value, which measures how consistent the two trajectories are in the direction of local field changes;

averaging the differences in sampling time position over the matched pairs on the path to obtain the offset of the two trajectories in overall temporal structure as the second value, then normalizing the first and second values separately and combining them linearly with preset weights to obtain the behavioral entanglement, which characterizes how consistently the two trajectories change overall.
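The calculation can be sketched in Python under several assumptions. The trajectories are taken to be already min-max normalized. The mean cosine similarity is mapped from [-1, 1] to [0, 1], and the mean position offset is scaled by the longest trajectory and inverted so that a larger result means stronger agreement; the text leaves both normalizations and the weighting ratio open, so `weight=0.5` is a placeholder. Note also that a standard dynamic time warping path may match one sample to several samples, so it is not strictly one-to-one.

```python
import math


def dtw_path(a, b):
    """Classic DTW with Euclidean local cost; returns the matched index pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack from the end along the cheapest predecessors.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path


def cosine(u, v):
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0 or nv == 0:
        return 1.0 if nu == nv else 0.0  # convention chosen for zero vectors
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def behavioral_entanglement(track_a, track_b, weight=0.5):
    path = dtw_path(track_a, track_b)
    mean_cos = sum(cosine(track_a[i], track_b[j]) for i, j in path) / len(path)
    mean_shift = sum(abs(i - j) for i, j in path) / len(path)
    first = (mean_cos + 1.0) / 2.0                      # local direction agreement
    max_shift = max(len(track_a), len(track_b)) - 1
    second = 1.0 - (mean_shift / max_shift if max_shift else 0.0)
    return weight * first + (1.0 - weight) * second


same = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)]
print(behavioral_entanglement(same, same))
```

Two identical trajectories score close to 1 under these conventions, and the score falls as the matched vectors point in different directions or the warping path drifts away from the diagonal.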

Preferably, inputting the indicator vector into the pre-trained graph embedding model to generate the first parameter and the second parameter comprises:

taking the indicator vector as the feature vector of the current data frame, locating in the causal graph between field combinations and control instructions the graph node of the field combination that matches the indicator vector, collecting that node's adjacent nodes and edge weights, and building a structural adjacency matrix and a node feature matrix;

inputting the structural adjacency matrix together with the node feature matrix containing the indicator vector into the pre-trained graph embedding model, which fuses the contextual structure of the field combination with the dynamic features of the indicators and outputs an embedding vector representing the temporal semantics of the graph node;

extracting from the embedding vector the dimensions that measure the consistency between the current data frame and the historical field evolution trajectory, and calculating their mean to obtain the first numerical parameter, which characterizes field evolution consistency;

extracting from the embedding vector the dimensions that reflect the field disturbance amplitude and time offset of the current data frame relative to the historical control path, and calculating their weighted average to obtain the second numerical parameter, which characterizes status delay and disturbance amplitude.
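The last two steps reduce an embedding vector to two scalars. A minimal Python sketch follows. Which dimensions feed each parameter and the weights of the weighted average are not given in the text, so `consistency_dims`, `timeliness_dims`, and `timeliness_weights` are hypothetical inputs, and the embedding itself is simply assumed to come from the trained model.

```python
def split_parameters(embedding, consistency_dims, timeliness_dims,
                     timeliness_weights):
    """Derive (first parameter, second parameter) from one embedding vector."""
    # First parameter: plain mean of the consistency-related dimensions.
    first = sum(embedding[i] for i in consistency_dims) / len(consistency_dims)
    # Second parameter: weighted average of the delay/disturbance dimensions.
    total_weight = sum(timeliness_weights)
    second = sum(embedding[i] * w
                 for i, w in zip(timeliness_dims, timeliness_weights)) / total_weight
    return first, second


first, second = split_parameters(
    embedding=[0.8, 0.6, 0.2, 0.4],
    consistency_dims=(0, 1),
    timeliness_dims=(2, 3),
    timeliness_weights=(1.0, 3.0),
)
```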

Preferably, the pre-trained graph embedding model is obtained as follows:

building training samples from the graph formed by the causal paths between field combinations and control instructions and from its historical status data, using the historical indicator vector of each field combination node as input and that node's status judgment result in historical tasks as the training target, and training the graph embedding model to generate embedding vectors from the graph structure and indicator features;

after training, the model takes the structural adjacency matrix and the indicator vector as input and outputs an embedding vector whose numerical dimensions are used to generate the first parameter and the second parameter, which reflect, respectively, the consistency of the data frame with the historical status path and the timeliness of the status it identifies.

Preferably, determining from a joint judgment of the first parameter and the second parameter whether the status fields of the data frame should be taken as the valid operating status of the current device comprises:

building a two-dimensional parameter decision space with the first parameter on the horizontal axis and the second parameter on the vertical axis, plotting the status data frames known to be valid or invalid from historical task execution as labeled scatter points, and presetting a valid-status decision boundary from the resulting distribution;

receiving the first parameter and the second parameter of the current data frame and determining the coordinate point of the data frame in the two-dimensional parameter decision space;

checking whether the coordinate point lies within the preset valid-status decision boundary: if it lies within the boundary, the status fields of the data frame are judged to be the valid operating status of the current device, and if it lies outside, the status is judged invalid;

the preset valid-status decision boundary is constructed by fixed-threshold fitting based on the distribution of the historical data, and the thresholds come from statistical analysis of historical status data.
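One simple reading of the fixed-threshold boundary is an axis-aligned region in the two-dimensional parameter space. The Python sketch below assumes that a valid frame has a high first parameter (consistency) and a low second parameter (delay and disturbance), and it takes the thresholds directly from the extremes of the historically valid points. Both choices, and the function names, are assumptions; the text only states that the thresholds come from statistical analysis of historical data.

```python
def fit_thresholds(labelled_points):
    """labelled_points: iterable of (first_param, second_param, is_valid)."""
    valid = [(p1, p2) for p1, p2, ok in labelled_points if ok]
    min_first = min(p1 for p1, _ in valid)    # lowest consistency seen as valid
    max_second = max(p2 for _, p2 in valid)   # highest delay seen as valid
    return min_first, max_second


def is_valid_state(first_param, second_param, thresholds):
    min_first, max_second = thresholds
    return first_param >= min_first and second_param <= max_second


history = [(0.8, 0.1, True), (0.9, 0.2, True), (0.3, 0.7, False)]
thresholds = fit_thresholds(history)
print(is_valid_state(0.85, 0.15, thresholds))
```

A production boundary would more plausibly use quantiles or a margin rather than raw extremes, so that a single outlier in the historical data does not widen the valid region.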

The technical effects and advantages of the above technical solution are as follows:

1. The invention builds a causal graph between industrial equipment status fields and control instructions and combines it with historical task evolution paths, producing a structured model of the causal links between status field combinations and control logic. This overcomes the limitation of conventional data processing, which cannot identify the behavioral context of retransmitted data. In particular, the status tracking mechanism built from field content signatures and timing information lets the platform actively identify retransmitted frames whose content is consistent but whose time lags. This addresses the false triggering of device control caused by the platform mistaking retransmitted data for real-time status, and it improves the accuracy of information exchange between devices and the stability of control decisions.

2. The invention introduces a new metric, behavioral entanglement. By combining vector trajectories, dynamic time warping, and cosine similarity, it models the differences between retransmitted data and original data in field evolution trend and temporal structure, and thereby quantifies the timeliness of status data. The indicator vector generated on this basis serves as input to the embedding model and characterizes the data frame along two dimensions, field consistency and identifier freshness. The trained graph embedding model then maps the indicator vector to semantic parameters, so that when facing complex data streams the platform can judge the reliability of a status from both the historical evolution context and the indicator features, which makes data validity identification more intelligent.

3. The invention feeds the judgment results back to the causal graph path in real time and adjusts the path confidence weights and status identification strategy accordingly, forming a dynamic graph control mechanism that evolves on its own. The mechanism continuously refines the platform's strategy for judging device status and, over successive tasks, gradually learns the behavior patterns and communication regularities of different devices, so that a deployed platform adapts to situations such as burst retransmissions, communication disturbances, and repeated data transmission. The invention is therefore practical to engineer and robust, and it can improve the security, real-time performance, and data credibility of industrial control systems in environments with multi-source heterogeneous devices.

Brief Description of the Drawings

To illustrate the embodiments of the present application or the technical solutions of the prior art more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are only some of the embodiments recorded in the present invention, and a person of ordinary skill in the art can derive other drawings from them.

FIG. 1 is a flow chart of the industrial equipment interconnection method based on an industrial control platform according to the present invention.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. The example embodiments can, however, be implemented in many forms and should not be construed as limited to the examples set forth here. They are provided so that this disclosure is thorough and complete and fully conveys the concepts of the example embodiments to those skilled in the art.

The present invention provides an industrial equipment interconnection method based on an industrial control platform, as shown in FIG. 1, which comprises the following steps:

In this embodiment, the causal graph between industrial equipment status fields and control instructions is constructed as follows:

Field groups and control instructions can be extracted by parsing the raw data frames in equipment operation logs and communication records. First, the continuous sequence of data frames for each task cycle is obtained through the platform's task scheduling records and data acquisition service interface. Each frame is split into fields, key status fields such as temperature, pressure, rotational speed, and alarm code are extracted, and the fields are aggregated into field groups by time window. At the same time, the control instructions issued for each field group are retrieved from the control layer records on the same timeline, and the trigger point of each instruction is confirmed from the status difference before and after the field change. Next, field groups and control instructions are ordered by timestamp to build the set of (field group, control instruction, order) triples. This process can be implemented in software with a rule engine, a field difference mapping function, and a task timeline mapper, and the result is stored as a graph data structure that serves as the input for building the nodes and edges of the causal graph.
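The pairing of field groups with control instructions can be sketched in Python as follows. The sketch assumes each field group and each instruction already carries a timestamp, and it attributes every instruction to the most recent field group at or before it. That attribution rule, the record layouts, and the name `extract_triples` are illustrative assumptions; the embodiment additionally confirms trigger points from status differences, which is not modeled here.

```python
from bisect import bisect_right


def extract_triples(field_groups, commands):
    """Build (field_group, command, order) triples from two timelines.

    field_groups: list of (timestamp, field_group)
    commands:     list of (timestamp, command)
    """
    field_groups = sorted(field_groups, key=lambda r: r[0])
    times = [t for t, _ in field_groups]
    triples = []
    ordered = sorted(commands, key=lambda r: r[0])
    for order, (t_cmd, command) in enumerate(ordered):
        # Index of the latest field group observed at or before the command.
        k = bisect_right(times, t_cmd) - 1
        if k >= 0:
            triples.append((field_groups[k][1], command, order))
    return triples


triples = extract_triples(
    field_groups=[(0.0, ("temp_high", "pressure_ok")),
                  (5.0, ("temp_ok", "pressure_ok"))],
    commands=[(1.0, "START_COOLING"), (6.0, "STOP_COOLING")],
)
```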

The correspondence between status field groups and control instructions is structured as triples and expressed as a causal graph because, in complex industrial scenarios, device behavior is not triggered by a change in a single field. A control operation is triggered when several fields reach a particular combined state. Conventional static thresholds alone cannot accurately reflect this causal chain between status and control. Extracting the evolution path between field groups and control instructions along the time dimension of task execution models the context of control behavior precisely and gives later anomaly identification and status judgment a contextual basis. In complex scenarios such as data retransmission and status rollback in particular, the causal graph can be used to judge whether the current status is a reasonable next step on the path, which makes judgments of status validity more accurate. Triple extraction is therefore both a key data preprocessing step and a prerequisite for the subsequent graph embedding modeling and behavioral partial order analysis.

可以通过建立图结构建模模块，将前序步骤中提取的三元组集合映射为图数据结构中的节点与有向边。在具体实现过程中，首先以每个字段组作为起始节点，将其唯一化编码，例如通过哈希算法生成字段组合标识；控制指令作为终止节点，也进行结构化唯一标识。接着，以“字段组 → 控制指令”的方向创建有向边，用以表达状态引发行为的方向性因果逻辑。随后，结合任务执行的历史时间轴，为每一对起始节点和终止节点之间的边赋予两个属性：一是时间权重，通过统计该三元组在任务周期内出现的平均时间间隔获得，用于表达状态-指令之间的演化速度；二是频次权重，通过计算该字段组在任务中多次触发同一控制指令的比例，用于衡量其因果关系强度。整个建图过程可通过图数据库（如Neo4j）或图计算框架（如NetworkX）完成建模、存储与查询，实现设备状态行为在演化过程中的可视化、结构化与推理可用性。By establishing a graph structure modeling module, the triples extracted in the previous step can be mapped into nodes and directed edges in a graph data structure. In the specific implementation, each field group is first used as a starting node and uniquely encoded, for example, by generating a field combination identifier using a hash algorithm. The control instruction, serving as the ending node, is also structured and uniquely identified. Next, directed edges are created in the direction of "field group → control instruction" to express the directional causal logic of state-induced behavior. Subsequently, based on the historical timeline of task execution, two attributes are assigned to each pair of edges between the starting and ending nodes: a time weight, obtained by counting the average time interval between the appearance of the triple within the task cycle, which expresses the evolutionary speed between state and instruction; and a frequency weight, which measures the strength of the causal relationship by calculating the proportion of times the field group triggers the same control instruction within a task. The entire graph construction process can be modeled, stored, and queried using a graph database (such as Neo4j) or a graph computing framework (such as NetworkX), enabling visualization, structuring, and reasoning about the evolution of device state and behavior.
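As a non-limiting illustration, the mapping from triples to weighted directed edges can be sketched as follows. The sketch uses plain dictionaries rather than Neo4j or NetworkX so that it has no dependencies. The names `node_id` and `build_causal_graph`, the triple layout `(field_group, instruction, interval_seconds)`, and the truncation of the hash to 12 characters are assumptions made for this example, not details taken from the embodiment.

```python
import hashlib
from collections import defaultdict


def node_id(field_group):
    """Stable identifier for a field group (dict of field name -> value)."""
    canonical = "|".join(f"{k}={field_group[k]}" for k in sorted(field_group))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def build_causal_graph(triples):
    """Map (field_group, instruction, interval_seconds) triples to weighted edges.

    Returns {(group_id, instruction): {"time_weight": ..., "freq_weight": ...}}.
    """
    intervals = defaultdict(list)    # edge -> observed state-to-instruction intervals
    group_totals = defaultdict(int)  # group_id -> triggers of any instruction
    for field_group, instruction, interval in triples:
        gid = node_id(field_group)
        intervals[(gid, instruction)].append(interval)
        group_totals[gid] += 1

    graph = {}
    for (gid, instruction), seen in intervals.items():
        graph[(gid, instruction)] = {
            # Time weight: average interval between state and instruction.
            "time_weight": sum(seen) / len(seen),
            # Frequency weight: share of this group's triggers that hit this instruction.
            "freq_weight": len(seen) / group_totals[gid],
        }
    return graph
```

Each dictionary key stands for one directed edge "field group → control instruction"; the same structure could be loaded into a graph database or graph library if path queries are needed.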

将三元组集合映射为有向因果图谱并赋权的做法，能够系统性捕捉工业设备中“状态组合如何演化为控制决策”的本质逻辑。相比传统的数据表或规则列表，图结构具有更强的结构表达能力与路径计算能力，特别适用于描述因果关系网络。在工业控制场景中，设备行为往往具备序列性、重叠性与多源驱动特征，单纯通过字段值判断容易丢失上下文信息。而通过将状态字段组合与控制指令建模为图中节点，并以方向性边表示其演化逻辑，不仅能清晰表达“某类状态字段在何种组合下倾向于引发何种控制行为”，还可利用路径分析算法对后续输入状态数据进行匹配与预测，尤其在异常场景如数据补发、行为回滚等情况下，有向图谱可判断当前状态是否处于演化路径上的合理节点，从而为后续判断“该状态是否应被采纳”为有效状态提供因果基础。这种图谱结构的引入，使设备状态判断不再依赖单一数据帧，而是基于历史经验的逻辑推理，极大提升了平台在复杂数据动态中的自适应判断能力。Mapping triple sets into directed causal graphs and assigning weights to them systematically captures the underlying logic of how state combinations evolve into control decisions in industrial equipment. Compared to traditional data tables or rule lists, graph structures offer stronger structural representation and path computation capabilities, making them particularly suitable for describing causal networks. In industrial control scenarios, device behavior is often sequential, overlapping, and driven by multiple sources. Simply judging by field values can easily lose context. By modeling state field combinations and control instructions as nodes in a graph and representing their evolutionary logic with directional edges, this not only clearly expresses the control behavior that certain state field combinations tend to trigger, but also leverages path analysis algorithms to match and predict subsequent state input data. In particular, in exceptional scenarios such as data retransmission and behavior rollback, the directed graph can determine whether the current state is at a reasonable point in the evolutionary path, providing a causal basis for subsequent decisions about whether the state should be adopted as a valid state. This graph structure eliminates the reliance on a single data frame for device state judgment, relying instead on logical reasoning based on historical experience. This significantly enhances the platform's adaptive judgment capabilities in complex data dynamics.

本实施例中，接收来自工业设备的数据帧，判断数据帧是否与历史帧字段内容一致且时间戳或顺序编号不同，若满足则标记为内容重复的补发数据帧，具体为：In this embodiment, a data frame is received from an industrial device, and it is determined whether the data frame is consistent with the historical frame field content and the timestamp or sequence number is different. If the conditions are met, it is marked as a retransmitted data frame with duplicate content. Specifically:

在工业控制平台中接收来自工业设备的数据帧后，可以通过预定义的字段解析模板，对每一帧数据进行字段级别的结构化拆解，提取其中具有代表性的多个关键状态字段，这些字段通常包括设备运行过程中对控制逻辑产生直接影响的指标变量，例如运行温度、主轴转速、系统压力、进料电流、位移位置、报警标志位、运行状态码、控制模式编码等。这些字段被称为关键状态字段，是因为它们不仅反映了设备当前的运行工况，还直接决定了控制平台是否会触发响应动作。为了使每帧数据在后续处理流程中具有统一结构和可比性，需要按照设备厂商协议或统一通信标准中约定的字段采集顺序进行排列，并结合设备数据结构模板对字段进行归位补全与类型校验，确保字段语义完整一致。在构建字段序列特征向量时，可将每个字段的当前值按既定顺序组合为多维向量，必要时对数值型字段进行归一化处理，对枚举型字段进行离散编码，使其在后续处理流程中具有标准的数值表达形式。通过上述方式构建出的字段序列特征向量，不仅保留了设备状态的全貌信息，还具备一致性、判别性与机器可处理性，便于后续进行数据一致性判断、行为识别与图结构映射等操作。因此，该步骤是将原始数据帧转化为可用于语义比对和时序推理的核心中间表示，是实现智能判断与状态理解的前提基础。After receiving data frames from industrial equipment, the industrial control platform can use predefined field parsing templates to perform field-level structural analysis on each frame, extracting multiple representative key status fields. These fields typically include indicator variables that directly impact the control logic during equipment operation, such as operating temperature, spindle speed, system pressure, feed current, displacement position, alarm flags, operating status codes, and control mode codes. These fields are referred to as key status fields because they not only reflect the current operating conditions of the equipment but also directly determine whether the control platform triggers a response action. To ensure uniform structure and comparability in subsequent processing, each frame of data must be arranged according to the field collection sequence specified in the equipment manufacturer's agreement or unified communication standard. Fields must be aligned, completed, and type-checked using the equipment data structure template to ensure semantic integrity and consistency. When constructing the field sequence feature vector, the current values of each field are combined into a multidimensional vector in a predetermined order. Numeric fields are normalized as necessary, and enumerated fields are discretized to ensure a standard numerical representation in subsequent processing. 
The field sequence feature vector constructed in this way not only retains the full picture of the device status but also possesses consistency, discriminability, and machine processability, facilitating subsequent operations such as data consistency judgment, behavior recognition, and graph structure mapping. Therefore, this step converts the raw data frame into a core intermediate representation that can be used for semantic comparison and temporal reasoning, and is the prerequisite for achieving intelligent judgment and status understanding.
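The construction of the field sequence feature vector can be illustrated with a short sketch. The field template `FIELD_ORDER`, the value ranges in `RANGES`, and the enumeration codes in `MODE_CODES` are hypothetical placeholders; in practice they would come from the device protocol or data structure template.

```python
# Hypothetical field template: order, numeric ranges, and enum codes are assumptions.
FIELD_ORDER = ["temperature", "spindle_speed", "pressure", "mode"]
RANGES = {
    "temperature": (0.0, 200.0),
    "spindle_speed": (0.0, 6000.0),
    "pressure": (0.0, 10.0),
}
MODE_CODES = {"manual": 0, "auto": 1, "maintenance": 2}


def to_feature_vector(frame):
    """Arrange fields in template order, normalize numerics, encode enums."""
    vector = []
    for name in FIELD_ORDER:
        if name not in frame:
            raise KeyError(f"missing field: {name}")
        value = frame[name]
        if name in RANGES:
            low, high = RANGES[name]
            vector.append((float(value) - low) / (high - low))
        else:
            vector.append(float(MODE_CODES[value]))
    return vector
```

Because the order is fixed by the template, two frames with the same field values always yield the same vector regardless of the order in which the fields arrived.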

在构建字段序列特征向量后，可通过哈希编码处理方式将其转化为一组可比对的数字指纹，以便快速识别字段内容是否一致。该处理方式的核心在于将每个关键状态字段的值与其在序列中的位置索引进行绑定，使得不仅字段本身的数值特征被记录，其在整个状态结构中的语义位置也被编码。具体而言，首先对字段序列中的每个字段赋予一个唯一的位置索引，例如第一个字段为0，第二个为1，依此类推；随后，将每个字段值乘以其对应的索引权重系数，该权重可为线性增长系数或根据字段重要性预设的因子，从而形成一组“加权字段值”；然后，将所有加权字段值依序连接成一个长整型或定长字符串，作为哈希输入项。最后，利用结构稳定性强且低碰撞率的哈希算法（如SHA-256或MurmurHash3）对该加权字符串进行摘要运算，生成字段内容签名值。该签名值在同一工业设备维度内具有唯一性，即只要状态字段组合的值和顺序保持一致，即使时间戳不同，也会生成相同的签名值，因此可作为判断数据内容是否重复的快速依据。其中，“哈希编码处理”是一种将任意长度输入映射为定长数字输出的摘要方式，保证结构不变性；“加权字段值”引入了字段内容与其在整体结构中语义地位的融合考虑，增强了特征表达能力；“字段位置索引”保证了字段排列顺序的唯一性，不同排列即便字段值相同也能生成不同签名，避免误判。这一机制可通过软件在数据接收后即时执行，具备高效、判别强、可嵌入后续图结构建模的特性，是实现设备状态内容快速比对和判别的核心支撑手段。After constructing the field sequence feature vector, it can be converted into a set of comparable digital fingerprints through a hash encoding process, allowing for quick identification of field content consistency. The core of this process is to bind the value of each key state field to its position index in the sequence. This not only records the numerical characteristics of the field itself, but also encodes its semantic position within the entire state structure. Specifically, each field in the field sequence is first assigned a unique position index—for example, the first field is 0, the second is 1, and so on. Each field value is then multiplied by its corresponding index weight coefficient, which can be a linearly increasing coefficient or a preset factor based on the field's importance, to form a set of "weighted field values." All weighted field values are then concatenated sequentially into a long integer or fixed-length string, which serves as the hash input. Finally, a digest operation is performed on this weighted string using a hash algorithm with strong structural stability and low collision rate (such as SHA-256 or MurmurHash3) to generate the field content signature value. 
The signature value is unique within the scope of a single industrial device: as long as the values and order of the status field combination remain the same, the same signature value is generated even if the timestamp differs. It can therefore serve as a quick basis for determining whether data content is repeated. Here, "hash encoding" is a digest method that maps input of arbitrary length to a fixed-length output, so that structurally identical inputs always yield the same digest; the "weighted field value" combines a field's content with its semantic position in the overall structure, strengthening the feature representation; and the "field position index" makes the signature sensitive to field order, so that different arrangements produce different signatures even when the field values are the same, avoiding misjudgment. This mechanism can be executed by software immediately after data is received. It is efficient, highly discriminative, and can be embedded in subsequent graph structure modeling, and it is the core means of support for rapid comparison and discrimination of equipment status content.
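A minimal sketch of the signature computation, using SHA-256 from the Python standard library, is given below. Two choices in it are assumptions of this example rather than requirements of the embodiment: the weight is taken as `index + 1` so that the first field (index 0) is not multiplied by zero and lost, and the weighted values are joined with a `|` delimiter at fixed precision so that the concatenation is unambiguous.

```python
import hashlib


def content_signature(values):
    """SHA-256 digest over position-weighted field values (timestamp excluded)."""
    # Assumption: weight = index + 1, so the first field still contributes.
    weighted = [f"{(index + 1) * value:.6f}" for index, value in enumerate(values)]
    payload = "|".join(weighted).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Frames with the same values in the same order produce the same digest, while swapping two unequal values changes it.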

在工业控制平台中，为了识别某一接收到的数据帧是否为补发数据帧，需要对其字段内容是否与历史数据帧重复进行快速比对。实现这一操作的关键步骤之一是：在历史数据缓冲区中检索字段内容签名值相同的历史数据帧，并从中选取与当前数据帧时间间隔最小的一条作为对比帧。该过程可通过构建基于哈希索引的数据结构实现。具体方式是：在平台的数据缓冲区中维护一个以“字段内容签名值”为键、以对应历史数据帧列表为值的哈希表结构。每当新的数据帧到达并完成字段序列向量哈希编码后，即可以该签名值为检索条件，在哈希表中快速查找是否存在相同签名值的历史记录。若存在，则将该签名值下的所有历史数据帧按时间戳升序排列，再计算每条历史数据帧与当前数据帧接收时间的绝对时间间隔，从中选取时间间隔最小的那一帧作为对比帧。该对比帧是最接近当前数据行为上下文的历史“等价状态”，用于进一步分析当前数据是否属于正常状态流或补发行为。之所以这样做，是因为设备数据存在高频采样、异步上报甚至通信网络不稳定等特性，可能导致多个状态帧在短时间内重复出现，只有精确找出时间最近的“等内容历史帧”，才能基于时间逻辑判断是否存在补发行为，避免将正常更新误判为冗余重发，从而为后续的时序偏移判断与有效性识别提供可靠参照点。该方法不仅具备高效的检索性能和良好的扩展性，而且高度适配工业数据流在设备级别中的特征差异和实时性要求，完全可通过平台软件逻辑高效实现。In an industrial control platform, to identify whether a received data frame is a reissued data frame, its field content must be quickly compared with historical data frames to verify duplicates. A key step in achieving this is to search the historical data buffer for historical data frames with the same field content signature value and select the frame with the smallest time interval with the current data frame as the comparison frame. This process can be implemented by constructing a hash-indexed data structure. Specifically, a hash table structure is maintained in the platform's data buffer, with the field content signature value as the key and the corresponding historical data frame list as the value. Whenever a new data frame arrives and its field sequence vector is hash-encoded, the signature value is used as a search criterion to quickly search the hash table for historical records with the same signature value. If so, all historical data frames with the signature value are sorted in ascending timestamp order. The absolute time interval between each historical data frame and the current data frame's reception time is calculated, and the frame with the smallest time interval is selected as the comparison frame. 
This comparison frame is the historical "equivalent state" that best matches the current data behavior context and is used to further analyze whether the current data represents a normal state flow or a reissued behavior. This approach is based on the high-frequency sampling, asynchronous reporting, and even unstable communication networks of device data, which can cause multiple status frames to recur in a short period of time. Only by accurately identifying the most recent "equal content history frame" can we determine whether reissues have occurred based on temporal logic, avoiding misjudging normal updates as redundant retransmissions. This provides a reliable reference point for subsequent timing offset judgment and validity identification. This method not only offers efficient retrieval performance and good scalability, but is also highly adaptable to the characteristic differences and real-time requirements of industrial data streams at the device level, and can be efficiently implemented entirely through platform software logic.
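The hash-indexed buffer and the selection of the nearest comparison frame can be sketched as follows. The class name `FrameBuffer` and its method names are hypothetical; eviction of old entries from the buffer is omitted for brevity.

```python
from collections import defaultdict


class FrameBuffer:
    """History buffer keyed by field content signature."""

    def __init__(self):
        self._by_signature = defaultdict(list)

    def add(self, signature, timestamp, frame):
        self._by_signature[signature].append((timestamp, frame))

    def nearest(self, signature, received_at):
        """Return the (timestamp, frame) entry closest in time, or None."""
        candidates = self._by_signature.get(signature)
        if not candidates:
            return None
        return min(candidates, key=lambda entry: abs(received_at - entry[0]))
```

Taking the minimum over absolute time differences gives the same result as sorting by timestamp first, so the sort can be skipped when only the nearest frame is needed.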

为判断当前数据帧是否为补发数据的候选对象，可以通过比较当前数据帧与其对应的对比帧在时间戳或顺序编号上的逻辑关系来实现，重点识别是否存在非递增、反转或编号间断等异常情况。具体实现方式是：首先，提取当前数据帧和对比帧各自的时间戳值和顺序编号值；随后，建立判断逻辑：若当前数据帧的时间戳小于对比帧的时间戳，说明该数据帧在时序上“倒退”，存在明显非递增行为；若两者时间戳相同但顺序编号小于或大于预期连续编号（例如，当前编号应为对比帧编号加一，但出现了跳跃或回退），则判定为顺序编号间断或反转。为避免误判，也可以设置一个合理的编号窗口容忍区间，确保正常波动不被误识别为异常行为。此外，平台可结合设备在历史周期内的行为模型，判断其是否存在补发逻辑特征，例如是否定期重发最近状态、是否在通信异常后批量重传等模式，从而提升判断鲁棒性。这样做的根本原因在于：在工业设备的数据通信过程中，由于网络波动、节点缓存重发或边缘设备延迟处理等因素，常常会出现相同内容的数据被多次发送的现象。若平台仅基于字段内容一致性来认定设备状态，则无法分辨实时数据与延迟补发，进而可能导致状态快照错误、控制逻辑误触发等严重后果。因此，通过时间戳和顺序编号的逻辑判断，可以以最小开销、最高实时性捕捉出补发行为的候选数据，为后续是否将其采纳为有效状态提供必要的数据基础。这一判断过程完全可通过软件流程嵌入数据接收模块中实时执行，具备良好的工程实现性与通用性。To determine whether the current data frame is a candidate for retransmission, the system compares the logical relationship between the timestamp and sequence number of the current data frame and its corresponding comparison frame, focusing on identifying any anomalies such as non-incremental, reversed, or numbering gaps. Specifically, the system extracts the timestamp and sequence number values of the current and comparison frames. Next, it establishes a judgment logic: If the timestamp of the current data frame is less than that of the comparison frame, it indicates a chronological regression and obvious non-incremental behavior. If the timestamps of the two frames are the same but the sequence number is less than or greater than the expected consecutive number (for example, the current number should be the comparison frame number plus one, but there is a jump or rollback), it is determined to be a sequence numbering gap or reversal. To avoid misjudgment, a reasonable numbering window tolerance can be set to ensure that normal fluctuations are not mistaken for abnormal behavior. 
Furthermore, the platform can combine historical device behavior models to determine whether the frame exhibits retransmission logic characteristics, such as whether it periodically retransmits its most recent status or whether it performs batch retransmissions after communication anomalies, thereby improving judgment robustness. The fundamental reason for this is that during the data communication process of industrial equipment, data with the same content is often sent multiple times due to factors such as network fluctuations, node cache retransmissions, or delayed processing of edge devices. If the platform determines the status of the device based solely on the consistency of field content, it will not be able to distinguish between real-time data and delayed retransmissions, which may lead to serious consequences such as status snapshot errors and false triggering of control logic. Therefore, through logical judgment of timestamps and sequence numbers, candidate data for retransmission behavior can be captured with minimal overhead and maximum real-time performance, providing the necessary data basis for whether it will be adopted as a valid state in the future. This judgment process can be completely embedded in the data receiving module through software processes and executed in real time, with good engineering feasibility and versatility.
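The timestamp and sequence number checks described above can be sketched as a single predicate. The dictionary keys `ts` and `seq` and the parameter `seq_tolerance` are assumptions of this example; the device behavior model mentioned above is not modeled here.

```python
def is_retransmission_candidate(current, reference, seq_tolerance=0):
    """Flag a frame whose timestamp or sequence number breaks the expected order.

    current / reference: dicts with 'ts' (timestamp) and 'seq' (sequence number).
    """
    if current["ts"] < reference["ts"]:
        return True  # timestamp moved backwards: non-increasing order
    if current["ts"] == reference["ts"]:
        expected = reference["seq"] + 1
        # A jump or rollback beyond the tolerance window counts as a gap or reversal.
        return abs(current["seq"] - expected) > seq_tolerance
    return False
```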

为实现对补发数据帧的最终确认，可通过计算当前数据帧的接收时间与其对应对比帧的记录时间之间的时间偏移量，结合字段内容一致性判断，来确定该数据帧是否为内容重复的补发数据帧。具体实现方式是：首先，提取当前数据帧被工业控制平台接收的时间戳，与对比帧在历史数据缓冲区中记录的时间戳进行相减，计算两者之间的绝对时间偏移量。该时间偏移值反映了两条内容相同的数据帧在被平台接收到的时间差异，若该差值非常小（通常低于一个完整任务周期的标准采样间隔），则说明可能为短时补发；此时再进一步比对两条数据帧的字段内容是否完全一致，若字段值逐项完全一致（可借助字段序列特征向量或签名值比对），即可基本确认当前数据帧为一条冗余补发数据。To achieve final confirmation of a reissued data frame, the time offset between the reception time of the current data frame and the recording time of its corresponding comparison frame can be calculated, combined with field content consistency to determine whether the data frame is a reissued data frame with duplicate content. Specifically, the timestamp of the current data frame's reception by the industrial control platform is extracted and subtracted from the timestamp of the comparison frame recorded in the historical data buffer to calculate the absolute time offset between the two. This time offset value reflects the difference in time between the two identical data frames being received by the platform. If this difference is very small (typically less than the standard sampling interval of a complete task cycle), it may indicate a short reissue. The field contents of the two data frames are then compared for complete consistency. If the field values are completely consistent (this can be achieved by comparing field sequence feature vectors or signature values), the current data frame is essentially confirmed to be a redundant reissued data frame.

其中，“预设的任务周期容忍阈值”是指设备正常运行过程中每类任务或状态刷新过程允许的数据重复时间范围上限。该阈值可根据设备采样周期、控制执行周期或状态刷新周期等参数设定，例如：若某设备的控制循环为100ms，则阈值可设定为200ms以内；此设计既考虑到小范围网络抖动，又避免误将正当采样帧误判为补发帧。平台可通过学习设备历史运行数据自动生成该阈值，或由工程人员基于任务逻辑手动配置。如此设定的好处是，确保平台在判断数据有效性时引入时间语义判断机制，而非仅依赖字段内容一致性，从而避免将设备正常上报的数据误当作最新状态，尤其是在存在通信延迟、缓存溢出、重发机制等干扰因素时。整体流程可通过软件逻辑模块实现嵌入，实时计算与阈值比对过程运算轻量、响应迅速，适配工业控制平台对高频数据稳定性与准确性的双重需求。The "preset task cycle tolerance threshold" refers to the upper limit of the data duplication time allowed for each type of task or status refresh process during normal device operation. This threshold can be set based on parameters such as the device sampling cycle, control execution cycle, or status refresh cycle. For example, if a device has a 100ms control cycle, the threshold can be set to within 200ms. This design accounts for small-scale network jitter and prevents incorrectly misclassifying legitimate sampling frames as retransmission frames. The platform can automatically generate this threshold by learning from historical device operating data, or engineers can manually configure it based on task logic. This approach ensures that the platform incorporates temporal semantics when determining data validity, rather than relying solely on field content consistency. This prevents mistaking legitimately reported data from devices for the latest status, especially in the presence of interference factors such as communication delays, buffer overflows, and retransmission mechanisms. The entire process can be embedded in a software logic module, making the real-time calculation and threshold comparison process computationally lightweight and responsive, meeting the dual requirements of high-frequency data stability and accuracy for industrial control platforms.
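The final confirmation against the task cycle tolerance threshold can be illustrated as follows. Times are expressed in integer milliseconds to avoid floating-point comparison issues, and the default of 200 ms follows the example of a 100 ms control cycle given above; the function and parameter names are hypothetical.

```python
def is_duplicate_retransmission(received_ms, reference_ms,
                                sig_current, sig_reference,
                                tolerance_ms=200):
    """Confirm a content-duplicate retransmission.

    The frame is confirmed only if the absolute time offset is within the
    tolerance threshold and the two content signatures are identical.
    """
    offset = abs(received_ms - reference_ms)
    return offset <= tolerance_ms and sig_current == sig_reference
```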

本实施例中，对内容重复的补发数据帧，计算其与被补发数据帧之间的行为纠缠度，若纠缠度超过预设阈值，则提取顺序偏移量、时间延迟量和字段波动幅度，生成用于描述数据标识时效程度的指标向量，具体包括以下步骤：In this embodiment, for a retransmitted data frame with duplicate content, the behavioral entanglement between that frame and the original frame it duplicates is calculated. If the entanglement exceeds a preset threshold, the sequence offset, time delay, and field fluctuation amplitude are extracted to generate an indicator vector describing the timeliness of the data identification. The specific steps include:

该过程可通过软件系统中的数据处理逻辑模块实现。首先，平台持续维护一个基于时间窗口滑动的历史数据缓存区，用于存储所有设备上报的数据帧。系统识别出当前内容重复的补发数据帧及其被补发数据帧后，即可以这两帧为中心，向其前后时间轴分别提取预设数量的连续数据帧；对于每一帧，从中提取预定义的关键字段集合（如温度、转速、负载、电流等）形成字段序列向量。随后，系统将按时间先后顺序对这些字段序列向量进行排序排列，形成字段值时间序列。最终，所有字段序列向量按采样顺序组合为一个二维向量列表，该向量列表即为一条向量序列轨迹，反映该设备在补发行为发生前后的状态变化过程。该轨迹构建过程可通过标准化数组操作、时间戳排序函数及字段组解析模块完成，确保在系统内高效执行并可扩展适配不同类型设备的字段模板。This process can be implemented through the data processing logic module in the software system. First, the platform continuously maintains a historical data buffer based on a sliding time window to store the data frames reported by all devices. After identifying a reissued data frame with duplicate content and the original frame it duplicates, the system takes each of these two frames as a center and extracts a preset number of consecutive data frames before and after it on the time axis. For each frame, a predefined set of key fields (such as temperature, speed, load, and current) is extracted to form a field sequence vector. The system then sorts these field sequence vectors in chronological order to form a field value time series. Finally, all field sequence vectors are combined in sampling order into a two-dimensional vector list. This list is a vector sequence trajectory that reflects the device's state changes before and after the reissue. The trajectory can be built with standard array operations, timestamp sorting functions, and a field group parsing module, which keeps execution efficient and allows the field templates to be extended to different device types.

构建字段演化趋势的向量序列轨迹的根本目的是为了提供一种比单点字段内容比对更具上下文感知能力的状态变化表达方式。在数据补发行为发生的场景中，仅通过字段内容一致性无法判断数据是否具有当前代表性，而通过提取数据帧前后一段连续时间的字段变化模式，可分析该字段组合在补发前后是否持续演化或处于停滞阶段。例如，若被补发数据帧所在时间段的字段波动活跃，而补发数据帧前后字段值保持静止，则极可能是历史状态的回送；反之若字段轨迹连贯、变化一致，则可能是实时重传。这种基于时序背景构建的轨迹为后续纠缠度计算提供必要基础，使系统能够进行动态状态趋势比较，从而提升对补发帧的判定准确性与稳定性，避免错误采纳旧数据引发状态判断失真。该策略贴合工业控制中高频高精度状态识别的要求，是对传统静态字段比对方法的重要改进。The fundamental purpose of constructing vector sequence trajectories of field evolution trends is to provide a more context-aware representation of state changes than single-point field content comparison. In scenarios where data reissue occurs, field content consistency alone cannot determine whether the data is currently representative. However, by extracting the field change patterns over a continuous period before and after a data frame, it is possible to analyze whether the field combination has been continuously evolving or stagnant before and after the reissue. For example, if the field fluctuates significantly during the time period of the reissued data frame, while the field values remain static before and after the reissued data frame, it is likely a historical state retransmission. Conversely, if the field trajectory is coherent and the changes are consistent, it is likely a real-time retransmission. This trajectory constructed based on the temporal context provides the necessary foundation for subsequent entanglement calculations, enabling the system to perform dynamic state trend comparisons, thereby improving the accuracy and stability of reissued frame judgments and avoiding distorted state judgments caused by the incorrect adoption of old data. This strategy meets the requirements of high-frequency and high-precision state recognition in industrial control and is a significant improvement over traditional static field comparison methods.

该过程可通过工业控制平台中嵌入的数据处理与阈值判断模块以软件方式实现。系统首先基于已构建的字段序列轨迹，完成行为纠缠度的计算，纠缠度为一个定量评分，表示补发数据帧与被补发数据帧在演化趋势上的重合程度。平台设置一个“预设阈值”作为行为纠缠度的判断基准，该阈值可通过对大量设备历史运行数据进行无监督聚类分析得到：选取补发行为明确的数据样本，对其轨迹纠缠度进行统计，提取其经验分布的分位数（如95%）作为初始阈值；再结合实际系统运行经验动态调整。若某帧数据纠缠度高于该阈值，则平台会认为该补发行为具有高度可疑性，需要进一步对其时间和内容差异进行分析。此时，平台提取当前补发数据帧与对应被补发数据帧的时间戳与顺序编号，计算时间延迟量与顺序偏移量，并进一步提取两帧字段向量的字段对应项，计算其值差均值作为字段波动幅度。这些步骤在程序内部可通过时间比较函数、位置索引差值运算以及字段级向量差分与平均操作高效实现。This process can be implemented in software using the data processing and threshold judgment module embedded in the industrial control platform. The system first calculates the behavioral entanglement based on the constructed field sequence trajectories. Entanglement is a quantitative score that indicates how closely the evolutionary trend around the reissued data frame overlaps with the trend around the original frame it duplicates. The platform sets a "preset threshold" as the benchmark for behavioral entanglement. This threshold can be obtained through unsupervised cluster analysis of a large amount of historical device operation data: data samples with clear reissue behavior are selected, their trajectory entanglement is tabulated, and a quantile of the empirical distribution (e.g., the 95th percentile) is taken as the initial threshold, which is then adjusted dynamically based on actual operating experience. If the entanglement of a frame exceeds this threshold, the platform treats the reissue as highly suspicious and analyzes its timing and content differences further. At this point, the platform extracts the timestamps and sequence numbers of the current reissued data frame and the corresponding original frame, calculates the time delay and sequence offset, then takes the corresponding field items in the field vectors of the two frames and calculates the mean of their value differences as the field fluctuation amplitude. 
These steps can be efficiently implemented within the program through time comparison functions, position index difference operations, and field-level vector difference and average operations.
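The derivation of the initial threshold from historical entanglement scores can be sketched with a nearest-rank percentile. The clustering step and the later dynamic adjustment are not shown; `quantile_threshold` and `needs_timeliness_analysis` are hypothetical helper names, and the nearest-rank definition is one of several common percentile conventions, chosen here for simplicity.

```python
def quantile_threshold(scores, percent=95):
    """Nearest-rank percentile of historical entanglement scores."""
    if not scores:
        raise ValueError("at least one historical score is required")
    ordered = sorted(scores)
    # Integer form of ceil(percent / 100 * n), avoiding floating-point rounding.
    rank = max(1, (percent * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def needs_timeliness_analysis(score, threshold):
    """True when the entanglement exceeds the preset threshold."""
    return score > threshold
```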

具体计算时，时间延迟量可通过当前补发数据帧的接收时间减去被补发数据帧的记录时间得到，例如若被补发帧时间为10:00:00，补发帧接收时间为10:00:03，则延迟量为3 秒；顺序偏移量为其顺序编号的差值，例如编号为105与110，则偏移量为5；字段波动幅度则计算各对应字段值之差的绝对值平均值，例如字段包括温度、电流与压力，原值为[100, 5.1, 0.8]，补发帧为[100.5, 5.0, 0.85]，则字段差为[0.5, 0.1, 0.05]，均值为约0.217。通过上述三个指标，系统即可从时间、顺序与字段变化三个维度刻画该帧数据的“标识时效程度”，进而构建后续判断模型所需的特征输入。这种方式兼顾了设备行为的时效连续性与字段一致性，适合在工业数据复杂动态环境中精准识别历史重发帧。Specifically, the time delay is obtained by subtracting the recording time of the original frame from the reception time of the current resent frame. For example, if the original frame was recorded at 10:00:00 and the resent frame is received at 10:00:03, the delay is 3 seconds. The sequence offset is the difference between the two sequence numbers. For example, if the numbers are 105 and 110, the offset is 5. The field fluctuation amplitude is the mean of the absolute differences between corresponding field values. For example, if the fields are temperature, current, and pressure, the original values are [100, 5.1, 0.8], and the resent frame values are [100.5, 5.0, 0.85], then the field differences are [0.5, 0.1, 0.05] and their mean is approximately 0.217. With these three metrics, the system can characterize the "identification timeliness" of the frame along the three dimensions of time, sequence, and field variation, and thereby construct the feature inputs required by the subsequent judgment model. This approach takes into account both the temporal continuity and the field consistency of device behavior, and is suitable for accurately identifying historical retransmitted frames in complex and dynamic industrial data environments.
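The three metrics can be reproduced with a few lines of code. The frame layout (keys `recorded_at`, `received_at`, `seq`, `fields`) is an assumption of this sketch, and times are given as seconds since midnight.

```python
def timeliness_metrics(original, resent):
    """Return (time delay, sequence offset, field fluctuation amplitude)."""
    delay = resent["received_at"] - original["recorded_at"]
    seq_offset = abs(resent["seq"] - original["seq"])
    diffs = [abs(a - b) for a, b in zip(original["fields"], resent["fields"])]
    fluctuation = sum(diffs) / len(diffs)  # mean absolute field difference
    return delay, seq_offset, fluctuation


# Values from the example above: 10:00:00 vs 10:00:03, numbers 105 vs 110.
original = {"recorded_at": 36000, "seq": 105, "fields": [100, 5.1, 0.8]}
resent = {"received_at": 36003, "seq": 110, "fields": [100.5, 5.0, 0.85]}
delay, seq_offset, fluctuation = timeliness_metrics(original, resent)
```

For these inputs the delay is 3 seconds, the offset is 5, and the fluctuation is 0.65 / 3, which rounds to 0.217.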

在软件实现中，该过程可通过工业控制平台内置的特征工程模块完成。平台首先对时间延迟量、顺序偏移量与字段波动幅度分别进行归一化处理，以消除不同物理含义和数值范围带来的量纲干扰。归一化可采用基于滑动窗口的 Min-Max 线性变换，即将当前值映射到基于近期一段历史数据中统计得到的最小值与最大值之间的 [0,1] 区间；这样处理能动态适应工业过程中的状态波动。随后，平台按照预设的指标顺序（如时间延迟量为第1维、顺序偏移量为第2维、字段波动幅度为第3维）将三个已归一化的值依次组合成三维向量，即构成具有统一量纲的指标向量。此预设顺序是由平台在模型设计阶段确定并在整个系统中保持一致，以确保所有设备、所有任务下生成的向量具有稳定的维度语义与处理顺序。这一步骤的根本目的是为后续状态识别模型（如图嵌入神经网络、时效分类器等）提供格式一致、数值可比、含义明确的输入向量，提升系统对补发数据帧时效判别的准确性与鲁棒性，并为设备状态智能评估奠定特征基础。这种向量化和归一处理方式不仅能提高模型的泛化能力，也能支持模型在多场景、跨设备间的可复用性。In software implementation, this process can be accomplished through the feature engineering module built into the industrial control platform. The platform first normalizes the time delay, sequence offset, and field fluctuation amplitude to eliminate dimensional interference caused by different physical meanings and numerical ranges. Normalization can be performed using a sliding window-based min-max linear transformation, mapping the current value to the [0, 1] interval between the minimum and maximum values statistically determined from a recent period of historical data. This process dynamically adapts to state fluctuations in the industrial process. The platform then combines the three normalized values into a three-dimensional vector according to a preset indicator order (e.g., time delay as the first dimension, sequence offset as the second dimension, and field fluctuation amplitude as the third dimension). This creates a uniformly dimensional indicator vector. This preset order is determined by the platform during the model design phase and remains consistent throughout the system to ensure that the vectors generated for all devices and tasks have stable dimensional semantics and processing order. The fundamental purpose of this step is to provide subsequent state recognition models (such as embedded neural networks and time-sensitive classifiers) with input vectors that are consistent in format, comparable in value, and clearly defined. 
This improves the accuracy and robustness of the system's time-sensitive identification of re-sent data frames and lays the foundation for intelligent device state assessment. This vectorization and normalization approach not only improves the model's generalization capabilities but also supports its reusability across multiple scenarios and devices.
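A sliding-window Min-Max normalizer and the assembly of the three-dimensional indicator vector can be sketched as follows. The class name `SlidingMinMax`, the window size, and the convention of returning 0.0 when the window holds a single distinct value are assumptions made for this illustration.

```python
from collections import deque


class SlidingMinMax:
    """Min-Max scaling against the most recent `size` observations."""

    def __init__(self, size):
        self._window = deque(maxlen=size)

    def normalize(self, value):
        self._window.append(value)
        low, high = min(self._window), max(self._window)
        if high == low:
            return 0.0  # assumption: degenerate window maps to 0
        return (value - low) / (high - low)


def indicator_vector(delay, seq_offset, fluctuation, scalers):
    """Fixed order: [time delay, sequence offset, field fluctuation amplitude]."""
    raw = (delay, seq_offset, fluctuation)
    return [scaler.normalize(value) for scaler, value in zip(scalers, raw)]
```

One scaler is kept per dimension so that each metric is normalized against its own recent history.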

本实施例中，基于余弦相似度与动态时间规整算法联合计算两段向量序列轨迹之间的行为纠缠度，具体的计算方式如下：In this embodiment, the behavioral entanglement between two vector sequence trajectories is calculated based on the cosine similarity and dynamic time warping algorithm. The specific calculation method is as follows:

这一处理步骤可以通过软件方式在数据预处理阶段实现，主要目的是将不同时刻采集到的字段序列向量统一映射至一个标准化的数值空间，以便后续进行轨迹比对和行为纠缠度计算。具体实现方式为：在接收到内容重复的补发数据帧与其对应的被补发数据帧后，首先向前与向后分别采集预设时间窗口内的多个历史数据帧，从中提取相应的字段序列向量，例如温度、压力、负载等关键字段。接着，对采集到的每条向量轨迹，按采样时间顺序进行排序，形成两个有序的字段向量序列。为了消除不同字段在量纲和取值范围上的影响，需要对每个字段在向量中的取值进行归一化处理。具体做法是：以每个字段在该时间窗口内出现的最大值和最小值为上下边界，采用标准归一化公式将其线性映射到[0,1]区间。例如，若某字段在窗口内最小值为40，最大值为140，则原值60将归一化为(60-40)/(140-40)=0.2。通过此处理，每个字段值都被转化为无量纲数值，统一到标准尺度上，从而避免后续比较过程中某些字段因数值大而产生主导效应。这种方式不仅增强了轨迹间的可比性，也有助于后续算法（如余弦相似度与动态时间规整）更准确地识别字段变化趋势与结构差异。以两个轨迹分别表示“补发数据帧前后5秒内的状态序列”与“被补发数据帧前后5秒内的状态序列”为例，经过上述归一化处理后，即可得到两个标准化的字段值轨迹，用于进行行为相似度评估。This processing step can be implemented in software during the data preprocessing phase. Its primary purpose is to uniformly map the field sequence vectors collected at different times into a standardized numerical space to facilitate subsequent trajectory comparison and behavioral entanglement calculation. Specifically, after receiving a retransmitted data frame with duplicate content and its corresponding retransmitted data frame, multiple historical data frames within a preset time window are first collected both forward and backward, from which corresponding field sequence vectors are extracted, such as key fields such as temperature, pressure, and load. Next, each collected vector trajectory is sorted by sampling time, forming two ordered field vector sequences. To eliminate the influence of different fields in terms of dimension and value range, the values of each field in the vector are normalized. Specifically, the maximum and minimum values of each field within the time window are used as upper and lower bounds, and a standard normalization formula is used to linearly map them to the interval [0, 1]. For example, if the minimum value of a field within the window is 40 and the maximum value is 140, the original value of 60 will be normalized to (60-40)/(140-40)=0.2. 
Through this process, each field value is converted to a dimensionless value on a common scale, which prevents fields with large magnitudes from dominating subsequent comparisons. This approach not only enhances the comparability between trajectories but also helps subsequent algorithms (such as cosine similarity and dynamic time warping) identify field change trends and structural differences more accurately. For example, take two trajectories, one representing the state sequence within 5 seconds before and after the retransmitted data frame and the other representing the state sequence within 5 seconds before and after the original frame it duplicates. After the above normalization, two standardized field value trajectories are obtained for behavioral similarity assessment.
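The per-field normalization of a trajectory can be sketched as follows. The function name `normalize_trajectory` is hypothetical, and mapping a field that stays constant within the window to 0.0 is an assumption of this example, since the Min-Max formula is undefined in that case.

```python
def normalize_trajectory(trajectory):
    """Min-Max normalize each field (column) of a list of field vectors."""
    columns = list(zip(*trajectory))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ]
        for row in trajectory
    ]
```

With a window minimum of 40 and maximum of 140, the value 60 maps to 0.2, matching the example above.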

实现这一处理步骤，可以通过软件中集成的时间序列分析算法实现，常用工具如Python的dtaidistance、tslearn或MATLAB的DTW模块等。其实现过程为：首先获取两个已归一化的字段向量轨迹（例如分别代表补发数据帧和被补发数据帧在前后时段内的状态演化序列），将它们作为输入时间序列进行动态时间规整（DTW）匹配。软件在运行过程中，会对两个轨迹中的每一个字段向量，逐一计算与另一条轨迹中所有向量的欧氏距离，形成二维代价矩阵。接下来，通过在该代价矩阵上寻找一条代价最小的路径（累计距离最小）作为匹配路径，从而将两条不同长度或不完全对齐的向量轨迹建立一一对应关系。在该过程中，软件会同时记录每一对成功配对的字段向量的采样时间位置，以支持后续的相似度计算与结构分析。这种方式可有效解决设备数据在时间轴上存在速率波动、采样偏移等情况所导致的轨迹不同步问题，使得比较聚焦于字段变化趋势本身，而非时间错位的干扰。This processing step can be implemented using time series analysis algorithms integrated into the software, with common tools including Python's dtaidistance and tslearn or MATLAB's DTW module. The implementation first obtains two normalized field vector trajectories (e.g., one representing the state evolution sequence around the retransmitted data frame and the other the sequence around the original frame it duplicates) and performs dynamic time warping (DTW) matching on them as input time series. During execution, the software calculates the Euclidean distance between each field vector in one trajectory and every vector in the other trajectory, forming a two-dimensional cost matrix. It then searches this cost matrix for the path with the lowest cost (minimum cumulative distance) and uses it as the matching path, thereby establishing a pairwise correspondence between two vector trajectories that differ in length or are not fully aligned. During this process, the software also records the sampling time position of each matched pair of field vectors to support subsequent similarity calculation and structural analysis. This approach effectively addresses trajectory desynchronization caused by rate fluctuations and sampling offsets along the time axis, so that the comparison focuses on the trend of the field changes rather than on the interference of temporal misalignment.

Dynamic Time Warping (DTW) is a widely used algorithm for time series alignment. Its purpose is to stretch sequences nonlinearly along the time dimension to achieve an optimal pairing. It allows one point in one time series to be aligned with multiple points in the other, which makes it suitable for uneven sampling or time delays. Euclidean distance measures the difference between two field vectors and is defined as the square root of the sum of the squared differences between the two vectors across all field dimensions. For example, if the two field vectors are [0.2, 0.3, 0.6] and [0.3, 0.4, 0.5], the Euclidean distance is √((0.2-0.3)²+(0.3-0.4)²+(0.6-0.5)²)=√(0.01+0.01+0.01)=√0.03≈0.173. The DTW algorithm searches the entire cost matrix for the "minimum cumulative cost path", that is, the path from the starting point to the end point whose Euclidean distances sum to the smallest total. This path is taken as the optimal matching path between the two trajectories, giving the most reasonable overall pairing and the lowest global error, and it serves as the basis for the subsequent similarity and behavior alignment calculations.
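The cost-matrix search described above can be sketched in plain Python as follows. This is a minimal illustration of textbook DTW, not the patent's implementation; the function names `euclidean` and `dtw_path` are hypothetical, and a production system would more likely call a library such as dtaidistance or tslearn.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length field vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def dtw_path(traj_a, traj_b):
    """Return (cumulative cost, matching path) for two trajectories.

    Each trajectory is a list of field vectors; the path is a list of
    (index_in_a, index_in_b) pairs in time order.
    """
    n, m = len(traj_a), len(traj_b)
    inf = float("inf")
    # acc[i][j]: minimal cumulative cost of aligning traj_a[:i] with traj_b[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = euclidean(traj_a[i - 1], traj_b[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    # Backtrack from the end to recover the matched index pairs.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(acc[i - 1][j - 1], i - 1, j - 1),
                 (acc[i - 1][j], i - 1, j),
                 (acc[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return acc[n][m], path

# Two single-field trajectories of different lengths (illustrative values).
cost, path = dtw_path([[0.0], [1.0], [2.0]], [[0.0], [1.0], [1.0], [2.0]])
```

In this example the second point of the shorter trajectory is matched to two points of the longer one, which is the one-to-many alignment that lets DTW absorb sampling-rate differences.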

This step can be carried out with vector similarity tools integrated into the data analysis platform, such as the NumPy or Scikit-learn libraries in Python. First, all matched field vector pairs are extracted from the matching path produced by dynamic time warping (DTW). The cosine similarity function is then applied to each pair. Cosine similarity measures the directional consistency of two vectors and is defined as their dot product divided by the product of their norms, with a value range of [-1, 1]. The software performs this calculation for each matched pair in turn and arranges the results, in path order, into a sequence of similarity values. Finally, the arithmetic mean of all similarity values in the sequence is taken, giving a value that reflects how consistent the two field sequence trajectories are in their local direction of field evolution. This is the "first value" of this step. Cosine similarity is used to judge the local direction of field change because, even when field values differ slightly, the cosine similarity stays high as long as the change trend is consistent. This makes the measure robust to sudden jitter and differences in sampling precision, which helps improve the accuracy of behavioral pattern recognition.

For example, suppose two field vectors are A = [0.6, 0.8, 0.1] and B = [0.5, 0.75, 0.2]. First, calculate their dot product: 0.6 × 0.5 + 0.8 × 0.75 + 0.1 × 0.2 = 0.3 + 0.6 + 0.02 = 0.92. Then calculate the norms of the two vectors: ||A|| = √(0.36 + 0.64 + 0.01) = √1.01 ≈ 1.005, and ||B|| = √(0.25 + 0.5625 + 0.04) = √0.8525 ≈ 0.923. The cosine similarity is therefore 0.92 ÷ (1.005 × 0.923) ≈ 0.92 ÷ 0.928 ≈ 0.991, indicating that the two vectors are very consistent in direction. Processing all matched vector pairs in this way and taking the mean of all cosine similarities (for example, 0.976 over multiple pairs) yields the consistency index for the local direction of field change. The closer this value is to 1, the more consistent the field change trends of the two trajectories.
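The worked example above, and the averaging over the matching path, can be checked with a short sketch. The helper names `cosine_similarity` and `first_value` are hypothetical, and treating a zero vector as having similarity 0 is an assumption that the source does not address.

```python
import math

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector has no direction to compare
    return dot / (norm_u * norm_v)

def first_value(traj_a, traj_b, path):
    """Mean cosine similarity over the matched (i, j) pairs of a DTW path."""
    sims = [cosine_similarity(traj_a[i], traj_b[j]) for i, j in path]
    return sum(sims) / len(sims)

# The vectors A and B from the worked example.
sim_ab = cosine_similarity([0.6, 0.8, 0.1], [0.5, 0.75, 0.2])
```

Here `sim_ab` evaluates to roughly 0.991, matching the hand calculation.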

This process can be implemented with a time series analysis module in the software, for example using Python's Pandas and NumPy libraries. First, for the matching path generated by the dynamic time warping algorithm, the sampling time index (that is, the temporal position) of each vector in every matched pair is extracted from its original trajectory. The difference between the sampling time positions of each pair is then calculated, and all differences are averaged to obtain the average offset between the two trajectories in their overall temporal structure. This is the second value. It reflects how consistent the reissued data and the original data are in temporal structure, that is, whether they overlap closely in the time dimension. Next, the first value (the mean cosine similarity) and the second value are each normalized so that they share a common scale. Min-max normalization can be used, which maps each value to the interval [0, 1] for subsequent combined analysis. This step considers two dimensions together, "consistency of local field change trends" and "degree of overall temporal structure matching", to produce a more comprehensive quantitative indicator of behavioral consistency, which helps determine accurately whether the state carried by the data frame is real and valid.
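A minimal sketch of the second value and its normalization follows. Two points are assumptions rather than statements from the source: the per-pair difference is taken as an absolute value, and the min-max bounds `lo` and `hi` are supplied by the caller (for instance from historical statistics). The function names are hypothetical.

```python
def mean_time_offset(times_a, times_b, path):
    """Average absolute sampling-time difference over matched (i, j) pairs."""
    offsets = [abs(times_a[i] - times_b[j]) for i, j in path]
    return sum(offsets) / len(offsets)

def min_max(value, lo, hi):
    """Map value into [0, 1] using reference bounds lo < hi, clipping outliers."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))

# Sampling times (seconds) of two trajectories and an illustrative DTW path.
times_a = [0.0, 1.0, 2.0]
times_b = [0.0, 1.0, 1.5, 2.5]
path = [(0, 0), (1, 1), (1, 2), (2, 3)]
second_value = mean_time_offset(times_a, times_b, path)
second_norm = min_max(second_value, lo=0.0, hi=1.0)
```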

After normalization, the system linearly combines the first and second values according to a preset weighting ratio to generate the final behavioral entanglement. The linear combination is typically expressed as: behavioral entanglement = α × first value + β × second value, where α and β are the weighting coefficients of the normalized first and second values, respectively, and satisfy α + β = 1. The weighting ratio can be tuned empirically based on the reliability requirements of the industrial scenario, the characteristics of the data, and statistical analysis of historical misjudgment rates. For example, where state change trends matter more, α can be set to 0.7 and β to 0.3; if the device is more sensitive to time delays, the weights can be reversed. The weights can be chosen through offline training or a cross-validation-based model evaluation strategy and continuously refined through a feedback mechanism, so that the behavioral entanglement accurately reflects the overall degree of matching between the vector trajectories. This prevents any single dimension from dominating the judgment and makes the identification of reissued data more robust and precise.
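The weighted combination can be sketched as below. The function name is hypothetical, and the sketch combines the two normalized values exactly as given; the source does not state whether the time offset is inverted before combination, so that choice is left to the caller.

```python
def behavioral_entanglement(first_norm, second_norm, alpha=0.7):
    """Return alpha * first_norm + beta * second_norm with beta = 1 - alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    beta = 1.0 - alpha  # enforces alpha + beta = 1
    return alpha * first_norm + beta * second_norm

# Trend-focused weighting from the text: alpha = 0.7, beta = 0.3.
score = behavioral_entanglement(0.9, 0.5, alpha=0.7)
```

Deriving β from α keeps the constraint α + β = 1 from drifting when the weights are retuned.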

In this embodiment, the generated indicator vector is input into a pre-trained graph embedding algorithm model to generate the first parameter and the second parameter, specifically as follows:

To map the generated indicator vector effectively to the corresponding field combination node in the causal graph, the causal graph must first be stored as a graph structure that supports retrieval and indexing. Each field combination is modeled as a graph node, and each edge represents a control causal relationship between a field combination and a control instruction. Edge weights can express instruction frequency or path confidence. The system traverses all field combination nodes in the graph, encodes their static features (such as field type, order, and Boolean flags), and performs similarity matching against the indicator vector (using, for example, Euclidean distance or cosine similarity) to identify the closest graph node. Once the target node is identified, the system extracts its set of adjacent nodes and their edge information and constructs a structural adjacency matrix (storing node connectivity as an adjacency list or sparse matrix) and a node feature matrix (recording the features of the current node and its adjacent nodes, such as indicator values, task frequencies, and upstream and downstream state patterns). Together these provide the complete input for the subsequent graph embedding model. This process aligns the indicator vector structurally with the semantic context of the graph, which supports the accuracy of subsequent semantic computations.

Here, the "structural adjacency matrix" represents the topology among the nodes of the graph. It is a two-dimensional matrix in which each element indicates whether an edge connects two nodes (for example, 1 indicates an edge and 0 indicates none), and it can be extended to store edge weights that represent the strength of the control logic. The "node feature matrix" stores the feature information of each node in the graph. Each row corresponds to a node and each column to a feature dimension, such as a stability label for the device state, the historical execution frequency, or the set of typical response instructions. The purpose of constructing these two matrices is to convert the original graph data structure into numerical input that a neural network model can process directly. The graph embedding model can then capture both the internal attributes of the field combination nodes (through the feature matrix) and the topological relationships in the graph (through the adjacency matrix), and produce a high-dimensional vector representation that reflects the behavioral background and similarity of the state within the control chain. This representation underpins the subsequent state recognition and judgment mechanism.
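The two matrices described above can be assembled as in the following sketch. The node names, edge weights, and feature columns are invented for illustration, and `build_matrices` is a hypothetical helper; a dense list-of-lists is used for readability, whereas the text also allows adjacency lists or sparse matrices.

```python
def build_matrices(nodes, edges, features):
    """Build a weighted adjacency matrix and a node feature matrix.

    nodes:    ordered list of node identifiers
    edges:    list of (source, target, weight) tuples
    features: dict mapping node identifier -> list of feature values
    """
    index = {name: k for k, name in enumerate(nodes)}
    n = len(nodes)
    adjacency = [[0.0] * n for _ in range(n)]
    for src, dst, weight in edges:
        adjacency[index[src]][index[dst]] = weight  # 0.0 means no edge
    # One row per node, one column per feature dimension.
    feature_matrix = [list(features[name]) for name in nodes]
    return adjacency, feature_matrix

# Illustrative graph: three field-combination nodes linked by control edges.
nodes = ["s0", "s1", "s2"]
edges = [("s0", "s1", 1.0), ("s1", "s2", 0.5)]
features = {"s0": [0.1, 0.9], "s1": [0.4, 0.6], "s2": [0.7, 0.2]}
adjacency, feature_matrix = build_matrices(nodes, edges, features)
```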

This step is achieved by inputting the structural adjacency matrix together with the node feature matrix, which contains the indicator vector, into the pre-trained graph embedding algorithm model to obtain an embedded semantic representation of the current field combination node. Specifically, a mainstream graph neural network model such as GCN (graph convolutional network), GAT (graph attention network), or GraphSAGE (graph sample and aggregate) is used, since these models can process structural adjacency relationships and node attribute features at the same time. The model receives the adjacency matrix, from which it derives the connection structure between the current field combination node and its adjacent nodes and establishes the semantic context of the field state combination within the device's operation path. It also receives the node feature matrix and applies nonlinear transformations and feature fusion to the state indicator information carried by the node itself (that is, the previously generated indicator vector). Because the model has been trained on a large number of historical task graphs, it can capture the co-occurrence logic and dynamic evolution patterns between field combinations and control paths. When given this input at inference time, it therefore outputs a fixed-dimensional embedding vector that represents the current data frame in terms of both graph structure and dynamic semantics. For example, a field combination such as "feed = 1, speed = 1500, temperature = 120" may be converted by graph embedding into a 128-dimensional vector in which some dimensions express its fit with the "heating instruction" path while others represent its historical stability and sensitivity to disturbances, supporting subsequent parameter generation and state judgment. This approach matters because it models the static structural position and the dynamic operational behavior of the data frame in the same semantic space, which improves the contextual understanding available when discriminating temporal data.
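To make the forward pass concrete, the following sketch applies a single graph-convolution layer in the commonly used GCN form, ReLU(D^-1/2 (A + I) D^-1/2 X W). It is an illustration under several assumptions: the graph is treated as undirected (symmetric adjacency), only one layer is shown, and the weight matrix is random and stands in for trained parameters. The source does not specify the architecture beyond naming GCN, GAT, and GraphSAGE as options.

```python
import numpy as np

def gcn_layer(adjacency, features, weights):
    """One graph-convolution step with symmetric normalization and ReLU."""
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    degree = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(degree))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt        # normalized adjacency
    return np.maximum(a_norm @ features @ weights, 0.0)

# Three nodes with 3-dimensional indicator vectors (illustrative values).
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
features = np.array([[0.2, 0.5, 0.1],
                     [0.3, 0.4, 0.2],
                     [0.6, 0.1, 0.3]])
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 128))  # placeholder for trained parameters
embeddings = gcn_layer(adjacency, features, weights)  # one 128-dim row per node
```

Each output row mixes a node's own features with those of its neighbors, which is how structural context enters the embedding.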

The first numerical parameter, which characterizes field evolution consistency, is extracted and calculated mainly on the basis of semantic separation among the dimensions of the graph embedding vector. During training of the graph embedding model, supervised or self-supervised learning is used to guide a subset of embedding dimensions to learn feature representations of "field evolution path similarity". For example, these dimensions can be trained with consistency labels for the evolution paths between historical data frames, so that they focus on semantics such as trajectory direction, field trend, and task stage in the embedding space. After training, at inference time, the components corresponding to these trained dimensions are extracted from the embedding vector of the current data frame. They are typically several contiguous or non-contiguous dimensions (such as the 5th, 7th, 9th, and 12th), and their arithmetic mean is the first numerical parameter. For example, if the extracted field evolution components are [0.78, 0.82, 0.75, 0.81], their mean is 0.79, which represents the average consistency of the current data frame's field change trend with the historical path. A higher value indicates that the evolution trajectory fits historical patterns more closely and is more credible. Physically, this parameter provides a quantitative indicator of whether the reissued data frame continues the evolution logic of a known task path, and it assists in deciding whether the frame can be trusted as the current valid state of the device.
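The extraction and averaging step can be sketched as follows. The function name and the 16-dimensional toy embedding are hypothetical, and the indices are treated as zero-based positions, which is an assumption since the text speaks of the "5th, 7th, 9th, and 12th" dimensions without fixing an indexing convention.

```python
def first_parameter(embedding, consistency_dims):
    """Mean of the embedding components designated as consistency dimensions."""
    values = [embedding[d] for d in consistency_dims]
    return sum(values) / len(values)

# Toy 16-dimensional embedding with the example components placed at
# (zero-based) positions 5, 7, 9, and 12.
embedding = [0.0] * 16
for position, value in zip([5, 7, 9, 12], [0.78, 0.82, 0.75, 0.81]):
    embedding[position] = value

param_1 = first_parameter(embedding, [5, 7, 9, 12])
```

With these inputs `param_1` reproduces the mean of 0.79 from the example.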

The dimensional components used to measure the consistency between the current data frame and historical field evolution trajectories can be identified through semantic calibration of dimensions during model training. Specifically, when the training data for the graph embedding model is constructed, a field combination evolution label or trajectory consistency score is built for each data frame, and a multi-head graph embedding output mechanism is set up in which several dimensions of the embedding vector are predefined as "trajectory consistency-specific dimensions". A multi-task loss function makes these dimensions focus on learning trajectory trends rather than other semantics (such as temporal perturbations). After training, these dimensions have a clear semantic assignment in the output vector and act as dedicated channels for consistency judgment. At inference time, the corresponding dimensions are extracted according to this definition. For example, if dimensions 10 to 15 of the model are the trajectory consistency dimensions, that sub-vector is extracted directly and averaged. The advantages of this approach are that extraction is stable and standardized, no secondary clustering or classification is needed afterwards, and the dimension semantics remain interpretable and general, which supports reuse across scenarios and consistent model inference.

To generate the second numerical parameter, which characterizes state delay and disturbance amplitude, several dimensional components related to field disturbance amplitude and time offset are extracted from the embedding vector output by the graph embedding model, and a weighted average of these components is computed. This is implemented as follows. During model training, "time delay" and "field disturbance amplitude" labels for the reissued data frames are introduced, and a set of dedicated dimensions is trained through a regression task so that they reflect the degree of temporal drift and the field stability of the data frames. At inference time, these preset components are extracted from the embedding vector and multiplied by the corresponding disturbance or delay weights (set, for example, according to how strongly field fluctuations affect the system's control logic). The weighted sum is then divided by the total weight to obtain the second numerical parameter. For example, if the extracted values are [0.45, 0.62, 0.57] and the corresponding weights are [0.3, 0.4, 0.3], the weighted average is 0.45 × 0.3 + 0.62 × 0.4 + 0.57 × 0.3 = 0.135 + 0.248 + 0.171 = 0.554, which is the second numerical parameter. Physically, this parameter measures whether the current data frame shows significantly delayed transmission or abnormal field fluctuation, and thus indicates whether the frame reliably reflects the current real state. The higher the value, the greater the disturbance, the more severe the lag, and the lower the credibility.
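The weighted average can be verified with a few lines of code. The function name `second_parameter` is hypothetical; the values and weights are those of the example in the text.

```python
def second_parameter(values, weights):
    """Weighted average: sum(value * weight) divided by the total weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

param_2 = second_parameter([0.45, 0.62, 0.57], [0.3, 0.4, 0.3])
```

Dividing by the total weight keeps the result on the same scale as the inputs even when the weights do not sum to 1.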

The dimensional components that reflect the field disturbance amplitude and time offset of the current data frame relative to historical control paths can be obtained by applying a task-layered supervision strategy while training the graph embedding model. Specifically, each data frame in the training data is given two types of labels: one for field disturbance strength (for example, the standard deviation of the fluctuation of field values) and one for time offset (for example, the ratio of the reissue delay to the standard task period). Some dimensions of the model are then designated to regress the field disturbance amplitude and others to fit the degree of time offset, and these dimensions are jointly optimized with different loss functions to achieve semantically separated training. At inference time, these dimensions have fixed index positions in the embedding vector and can be extracted directly, for example dimensions 3 and 5 for field disturbance and dimensions 6 and 7 for time offset. After extraction, a weight is assigned to each dimension according to its predictive power during training, and a weighted average is computed to produce a combined parameter for state assessment. This ensures that the extracted components are clearly defined, semantically consistent, and reusable, and it avoids later manual intervention or semantic ambiguity.

In this embodiment, the pre-trained graph embedding algorithm model is specifically as follows:

Training samples are constructed from the graph structure formed by the causal paths between field combinations and control instructions, together with its historical state data. The historical indicator vector of each field combination node is used as input and the state determination result of that node in historical tasks as the training target, so that after training the graph embedding algorithm model can generate embedding vectors from the graph structure and indicator features. This process can be implemented in software as follows. First, a graph structure is built from the sequential relationship between device status fields and the control instructions issued by the platform in historical industrial data, where nodes represent field combinations and edges represent the causal ordering in the control logic. Next, for each field combination node, the labels recording whether it was judged a "valid state" or an "invalid state" in multiple tasks are collected as training targets. At the same time, the indicator vector associated with the field combination before each judgment (including indicators such as time delay, sequence offset, and field disturbance) is extracted as training input. When the graph embedding model is built, the indicator vectors of the field combination nodes are placed in the node feature matrix, and the adjacency relationships of the graph guide the model in learning the semantic dependencies between nodes. Forward propagation is performed with a graph neural network algorithm (such as GCN or GAT), the model output is compared with the true state labels, and the model parameters are optimized through error backpropagation until the model stably outputs node embedding vectors with discriminative power.

This is done because traditional state judgment algorithms often rely on static rules or thresholds and lack a deep understanding of the context in which device behavior changes. Building a graph of field combinations and instructions preserves the causal chain between state and behavior, and the graph embedding model can integrate a node's own features (its indicator vector) with information from its graph neighbors, so that the model can mine temporal patterns from the graph and detect subtle state differences. As a result of this training, when the model is presented with a new reissued data frame it outputs a set of vector components carrying semantic features, and the parameters extracted from them can be used to distinguish whether the state is real and valid, which prevents the platform from making erroneous control decisions.

In this embodiment, whether the status field corresponding to the data frame should be taken as the valid operating state of the current device is determined by jointly evaluating the first parameter and the second parameter, specifically as follows:

The judgment process can be implemented in software in the following main steps. First, based on historical device state data, the distribution of the first and second parameters is compiled for all data frames confirmed as valid or invalid operating states. These labeled historical samples are plotted as a scatter plot in a two-dimensional coordinate system, and an interpretable decision boundary construction method, such as density clustering or a support vector machine (SVM), is used to generate a boundary curve separating the valid state region from the invalid state region. Second, each time the system receives a data frame to be judged, it computes the frame's coordinates in the two-dimensional decision space from the first and second parameters already generated for it, and then calls the built-in boundary judgment algorithm to determine whether the point lies within the valid state region. If it does, the current device state snapshot is updated with the field contents of the data frame. If it does not, the data frame is discarded as invalid input and its identification information is recorded for later self-verification updates of the model. The main purpose of introducing a decision space is to turn the judgment of continuous numerical parameters into a spatial decision process that can be visualized. This improves the system's retention of historical experience and the stability of its judgment rules, avoids the misjudgments that simple threshold tests can cause, and gives an intuitive basis for later model adjustment and manual intervention.

The preset valid state determination boundary is constructed by fixed-threshold fitting based on the distribution characteristics of historical data. The thresholds are derived from statistical analysis of historical state data, which keeps the judgment criteria consistent and stable.

The preset valid state determination boundary can be implemented in software through statistical analysis of historical state data, as follows. First, the system collects data frame samples that were labeled "valid state" or "invalid state" in historical tasks, either manually or by a rule system, extracts the first and second parameter values of these samples, and visualizes their scatter distribution on a two-dimensional plane. Next, the value ranges of the first and second parameters of all valid state samples are compiled, and the mean and standard deviation of each are calculated. In each parameter dimension, a threshold interval is set whose deviation from the mean does not exceed a chosen multiple (for example, 1.5 standard deviations), and these intervals are used to construct a rule-based boundary. Alternatively, equal-frequency binning or quantile analysis can be used to extract the main dense region covering 80% to 90% of the distribution. The intervals are then combined into a rectangular or elliptical region that serves as the valid state determination boundary, and the boundary parameters are written into the system model configuration so that the boundary is applied as a rule. This method bases the boundary on the actual data distribution rather than on manual judgment, which improves the stability and generalization of state judgment. Because it relies on fixed-threshold fitting rather than model training, it is also highly interpretable and easy to debug, and is suitable for deployment and iterative optimization in industrial environments.
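The rectangular variant of this boundary can be sketched as below. The helper names are hypothetical, the sample points are invented, and the use of the population standard deviation is an assumption because the text does not say which estimator is intended.

```python
import statistics

def fit_rectangle(valid_points, k=1.5):
    """Per-axis interval [mean - k*std, mean + k*std] from valid samples.

    valid_points: list of (first_parameter, second_parameter) tuples
    """
    bounds = []
    for axis in range(2):
        values = [p[axis] for p in valid_points]
        mean = statistics.mean(values)
        std = statistics.pstdev(values)  # assumption: population std
        bounds.append((mean - k * std, mean + k * std))
    return bounds

def is_valid(point, bounds):
    """True if the point lies inside the rectangle on both axes."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(point, bounds))

# Invented historical samples that were labeled as valid states.
valid_samples = [(0.80, 0.20), (0.90, 0.10), (0.85, 0.15), (0.75, 0.25)]
bounds = fit_rectangle(valid_samples, k=1.5)
inside = is_valid((0.82, 0.18), bounds)
outside = is_valid((0.30, 0.90), bounds)
```

The resulting `bounds` list is small enough to store directly in a configuration file, which fits the fixed-threshold, rule-based application the text describes.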

This step can be implemented in software as follows. After the system completes the valid state judgment for a data frame, it immediately writes the result, together with the frame's field combination, into the device state snapshot table. The snapshot structure uses the field combination as the primary key and carries the latest judgment label (valid or invalid), a timestamp, and identifying information such as the device ID. The system then locates the node for this field combination in the causal graph, traces back along its path to the field combination node and control instruction edge of the previous hop, and updates the confidence weight on that path. The update rule is: if the current judgment is a valid state, the confidence score of the path edge is raised (for example, by incrementing its cumulative hit count and recomputing its relative frequency among all paths); if the judgment is an invalid state, the confidence score of the path edge is lowered and the context of the failed judgment is recorded.
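A minimal in-memory sketch of this update is given below. All names (`EdgeStats`, `record_judgment`, the dictionary layout) are hypothetical, and the confidence score is simplified to hits divided by total judgments on a single edge, whereas the text describes a relative frequency across all paths.

```python
from dataclasses import dataclass

@dataclass
class EdgeStats:
    """Running counts of valid and invalid judgments for one path edge."""
    hits: int = 0
    misses: int = 0

    @property
    def confidence(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

def record_judgment(snapshots, edge_stats, field_combo, incoming_edge,
                    is_valid, timestamp, device_id):
    """Write the snapshot row and update the confidence of the incoming edge."""
    # Snapshot table keyed by field combination (must be hashable).
    snapshots[field_combo] = {
        "valid": is_valid,
        "timestamp": timestamp,
        "device_id": device_id,
    }
    stats = edge_stats.setdefault(incoming_edge, EdgeStats())
    if is_valid:
        stats.hits += 1
    else:
        stats.misses += 1
    return stats.confidence

snapshots, edge_stats = {}, {}
combo = (("feed", 1), ("speed", 1500), ("temperature", 120))
edge = ("previous_state", "heating_instruction")
c1 = record_judgment(snapshots, edge_stats, combo, edge, True, 1000.0, "dev-1")
c2 = record_judgment(snapshots, edge_stats, combo, edge, False, 1001.0, "dev-1")
```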

This mechanism is designed to let path credibility in the graph evolve dynamically, so that the platform gradually learns from actual control experience which paths are reasonable, improving the accuracy of subsequent state recognition and the precision of decisions. In addition, continuously feeding new state judgment results back into the graph can trigger adaptive adjustment of the state recognition strategy. For example, when a path has been judged invalid over a long period, the system can automatically reduce its weight, or tighten its screening conditions, for participation in subsequent embedding inference, which prevents outdated or abnormal paths from misleading real-time control. This feedback mechanism improves the real-time performance and robustness of the graph model and strengthens the ability of the whole control platform to evolve intelligently.

All of the above formulas are evaluated on dimensionless numerical values. Each formula was obtained by collecting a large amount of data and running software simulations to find the form closest to the real situation. The preset parameters in the formulas are set by those skilled in the art according to actual conditions.

The above embodiments may be implemented in whole or in part in software, hardware, firmware, or any combination thereof. When implemented in software, the above embodiments may be implemented in whole or in part as a computer program product. A computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means (e.g., infrared, radio, or microwave). A computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that contains one or more sets of available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media. A semiconductor medium may be a solid-state drive.

It should be understood that, in the various embodiments of the present application, the sequence numbers of the above processes do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and the numbering should not limit the implementation of the embodiments of the present application in any way.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present application.

In the several embodiments provided in the present application, it should be understood that the disclosed systems and methods may be implemented in other ways. For example, the embodiments described above are merely illustrative: the division into units is only a division by logical function, and an actual implementation may divide them differently; multiple units or components may be combined or integrated into another system, and some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.

Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of these units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, may each exist physically on their own, or may be integrated two or more to a unit.

The above are only specific embodiments of the present application, and the scope of protection of the present application is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. The scope of protection of the present application shall therefore be determined by the scope of protection of the claims.

Claims

1. An industrial equipment interconnection and intercommunication method based on an industrial control platform, characterized in that it specifically comprises the following steps:

for a reissued data frame with duplicate content, calculating the behavioral entanglement between it and the original data frame that it reissues, and, if the entanglement exceeds a preset threshold, extracting the sequence offset, the time delay, and the field fluctuation amplitude to generate an indicator vector describing the timeliness of the data identification; the behavioral entanglement being obtained from the normalized field sequence vectors of the reissued data frame with duplicate content and of the original data frame, by linearly combining, according to a preset weighting ratio, the result of a dynamic time warping algorithm and the result of a cosine similarity calculation;
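The weighted combination recited above can be illustrated with a short sketch. Several details are assumptions that this passage does not fix: the DTW distance is mapped to a similarity as 1 / (1 + d), the cosine similarity is taken between the per-field mean vectors of the two sequences so that sequences of different lengths can be compared, and `alpha` stands in for the preset weighting ratio. All function names are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(s, t):
    """Classic dynamic time warping over two sequences of field vectors."""
    n, m = len(s), len(t)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(s[i - 1], t[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean_vector(seq):
    # Per-field mean over the whole sequence (an assumed summary).
    return [sum(col) / len(seq) for col in zip(*seq)]


def behavioral_entanglement(s, t, alpha=0.5):
    """Linear combination of a DTW-based similarity and a cosine similarity.

    s, t: sequences of normalized field vectors; alpha: assumed weighting ratio.
    """
    dtw_sim = 1.0 / (1.0 + dtw_distance(s, t))   # assumed distance-to-similarity map
    cos_sim = cosine_similarity(mean_vector(s), mean_vector(t))
    return alpha * dtw_sim + (1.0 - alpha) * cos_sim
```

With these choices, two identical trajectories score approximately 1, and the score drops as the trajectories diverge in shape or direction; the result would then be compared against the preset threshold.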

2. The industrial equipment interconnection and intercommunication method based on an industrial control platform according to claim 1, characterized in that a causal graph between industrial equipment status fields and control instructions is constructed, specifically:

collecting historical operating data of the industrial equipment during task execution, extracting the field group contained in each status data frame and its corresponding control instruction, and constructing a set of triples, each consisting of a field group, a control instruction, and their order of occurrence;
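As a rough illustration of the triple construction, the sketch below assumes each historical frame is a dictionary with hypothetical keys `timestamp`, `fields`, and `instruction`, and that the order of occurrence is the frame's rank by timestamp. None of these names come from the claim.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    field_group: tuple   # sorted (field name, value) pairs, hashable
    instruction: str     # control instruction associated with the frame
    order: int           # position in the order of occurrence


def build_triples(frames):
    """Turn historical status frames into (field group, instruction, order) triples."""
    ordered = sorted(frames, key=lambda f: f["timestamp"])
    return [
        Triple(tuple(sorted(f["fields"].items())), f["instruction"], i)
        for i, f in enumerate(ordered)
    ]
```

Freezing each field group into a sorted tuple makes it usable as a graph node key later on.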

3. The industrial equipment interconnection and intercommunication method based on an industrial control platform according to claim 2, characterized in that a data frame is received from the industrial equipment, it is determined whether the data frame has the same field content as a historical frame but a different timestamp or sequence number, and, if so, the data frame is marked as a reissued data frame with duplicate content, specifically:
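The marking condition in this claim (same field content as a historical frame, different timestamp or sequence number) can be sketched as a lookup keyed by field content. The frame layout (`device_id`, `fields`, `timestamp`, `seq_no`) and the class name `ReissueDetector` are assumptions made for illustration.

```python
class ReissueDetector:
    """Hypothetical detector for reissued frames with duplicate content."""

    def __init__(self):
        # (device_id, frozen field content) -> (timestamp, seq_no) of first frame seen
        self.history = {}

    def check(self, frame):
        """Return the (timestamp, seq_no) of the original frame if `frame`
        is a reissued duplicate, otherwise None."""
        content = tuple(sorted(frame["fields"].items()))
        key = (frame["device_id"], content)
        ident = (frame["timestamp"], frame["seq_no"])
        original = self.history.get(key)
        if original is None:
            self.history[key] = ident   # first time this content is seen
            return None
        if original != ident:
            return original             # same content, different timestamp or sequence number
        return None                     # identical frame, not a reissue
```

A frame flagged here is only a candidate: a device can legitimately report the same state twice, which is why the method goes on to compute the behavioral entanglement before treating the frame as stale.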

4. The industrial equipment interconnection and intercommunication method based on an industrial control platform according to claim 3, characterized in that, for a reissued data frame with duplicate content, the behavioral entanglement between it and the original data frame that it reissues is calculated, and, if the entanglement exceeds a preset threshold, the sequence offset, the time delay, and the field fluctuation amplitude are extracted to generate an indicator vector describing the timeliness of the data identification, which specifically comprises the following steps:

5. The industrial equipment interconnection and intercommunication method based on an industrial control platform according to claim 4, characterized in that the behavioral entanglement between the two vector sequence trajectories is calculated jointly from the cosine similarity and the dynamic time warping algorithm, the specific calculation being as follows:

6. The industrial equipment interconnection and intercommunication method based on an industrial control platform according to claim 5, characterized in that the generated indicator vector is input into a pre-trained graph embedding algorithm model to generate the first parameter and the second parameter, specifically:

7. The industrial equipment interconnection and intercommunication method based on an industrial control platform according to claim 6, characterized in that the pre-trained graph embedding algorithm model is specifically:

8. The industrial equipment interconnection and intercommunication method based on an industrial control platform according to claim 7, characterized in that, based on a joint judgment of the first parameter and the second parameter, it is determined whether the status field corresponding to the data frame should be taken as the valid operating state of the current device, specifically: