Disclosure of Invention
In view of the above, the present invention aims to provide a multi-factor fusion method for identifying outlier data of the Internet of things, which fuses the influence factors associated with the main factor for comprehensive analysis, and dynamically sets a threshold by adopting a sliding window technique based on the distribution of anomaly scores. The method remarkably improves the rationality, accuracy and real-time performance of outlier data identification, and provides solid technical support for outlier data identification in the Internet of things.
In order to achieve the above purpose, the present invention provides the following technical solutions:
a multi-factor fusion method for identifying outlier data of the Internet of things comprises the following steps:
S1, data fusion and preprocessing, namely determining identified target factors, fusing a plurality of influence factors related to the identified target factors, acquiring corresponding historical data and preprocessing;
S2, data correlation analysis, namely performing correlation analysis on specific indexes in the multi-factor data preprocessed in the step S1 to select index data with obvious influence on target factors so as to form a final effective index combination;
S3, data prediction, namely constructing a prediction model of LSTM combined with a characteristic attention mechanism, predicting the data screened by the correlation analysis, outputting predicted values for a period of time in the future, and obtaining actual predicted values through inverse normalization;
And S4, identifying outlier data, namely, on the basis of the prediction model, adopting an outlier data identification method fusing multi-factor comprehensive analysis: modeling with an isolated forest algorithm, taking the prediction residual and the influence factors as input features, calculating the anomaly score of each data point, dynamically setting a threshold value based on the distribution of the anomaly scores by adopting a sliding window technique, and judging data points exceeding the threshold value as outlier data.
Further, the step S1 specifically includes the following steps:
S11, data fusion, namely, in the identification of the outlier data of the Internet of things, firstly determining the target factor to be identified, and integrating a plurality of influence factors related to the target factor;
S12, processing missing values and abnormal points, namely processing data by adopting a linear interpolation method, and estimating the value at a certain moment by using a linear interpolation function when a peak outlier or missing value appears at the moment;
And S13, normalization processing, namely normalizing the data by adopting an extremum normalization method to eliminate the influence of dimension, and performing inverse normalization processing on the result output by the prediction model to obtain an actual predicted value.
Further, in step S2, for the nonlinear relations among different indexes, the correlation between each pair of indexes is analyzed by adopting the Spearman rank correlation coefficient. Specifically, the Spearman rank correlation coefficient of each pair of index data to be analyzed is calculated through rank conversion, reflecting the strength of the monotone relation between the indexes; influence factor indexes highly correlated with the main factor index are selected according to the result of the correlation analysis, and indexes weakly correlated with the main factor are eliminated, so that the selection of input features for the subsequent model is optimized.
Further, the step S3 specifically includes the following steps:
S31, dividing the preprocessed data into a training set, a test set and a verification set;
S32, constructing a prediction model of LSTM combined with a characteristic attention mechanism, defining an input layer, a hidden layer and an output layer, carrying out window construction on the data set, and then carrying out multivariate time sequence prediction with the data as input of the prediction model; a time sequence Y = (y_1, …, y_{T−1}, y_T) ∈ R^T is set as the prediction target, and the index data of each factor over the historical T moments are set as a time sequence matrix X = (x_1, x_2, …, x_N)^T ∈ R^{T×N} of related characteristic variables, wherein N represents the dimension of the parameters and comprises each index parameter in each factor, and x_t^n represents the value of the nth variable at time t;
The important variables are weighted in the encoding stage by the characteristic attention mechanism to obtain the importance weight c_N of each hidden state to the predicted output, which represents the importance of the current input feature to the output:

c_N = f_attention(x)

In the encoding stage, the context vector updated by the characteristic attention mechanism is fused with the previous history information to produce the output; using these weight coefficients, the input variables at each time are updated to obtain the matrix X̃ = (c_1 x_1, c_2 x_2, …, c_N x_N)^T ∈ R^{T×N};
the weight updating is carried out on the input vector and each hidden layer state by the characteristic attention mechanism, so that the time sequence coding hidden layer state at each moment contains the association relation between the predicted target parameter and the other characteristic parameters, thereby obtaining the predicted value ŷ_{t+1} of the historical data at the next moment;
S33, configuring network parameters and training the prediction model, with convergence of the loss function L(θ) as the termination condition, the formula being as follows:

L(θ) = (1/m) · Σ_{i=1}^{m} √( (1/T) · Σ_{t=1}^{T} ( y_t^i − ŷ_t^i )² )

where θ represents the set of network parameters, m is the number of target parameters, and y_t^i and ŷ_t^i respectively represent the actual value and the predicted value of the target parameter i at time t;
the root mean square errors between the predicted and actual values of each target parameter are summed and averaged, measuring the overall prediction accuracy of the model;
S34, predicting real-time data with the trained prediction model: the data at the first n moments before the moment t+1 to be predicted are used as the input sequence, and the index parameters at moment t+1 are predicted, namely ŷ_{t+1} = F(x_{t−n+1}, x_{t−n+2}, …, x_t), where F denotes the trained prediction model.
Further, the step S4 includes the steps of:
S41, comparing the predicted value output by the prediction model with the actual value, and calculating the prediction residual;
S42, constructing an outlier data identification model of an isolated forest, taking the multidimensional prediction residual and the influence factors as input features, and taking the change of the influence factors into account when calculating the anomaly score of a sample point; the updated anomaly score is calculated as

s(x) = 2^( −E[h_t(x)] / c(n) )

where s(x) represents the anomaly score of data point x, h_t(x) represents the path length of data point x in the t-th tree, E[h_t(x)] is the average path length over all trees, and c(n) is a constant that normalizes the path length, used for adjusting for differences between input features;
Counting the anomaly scores over a period of time by adopting a sliding window technique, calculating the mean value and standard deviation in the window, and dynamically adjusting the threshold value;
At time t, the length of the sliding window is W and the anomaly scores in the window are {s_{t−W+1}, s_{t−W+2}, …, s_t}; after the mean μ_t and the standard deviation σ_t in the sliding window are calculated, the dynamic threshold formula is as follows:
U_t = μ_t + k·σ_t
wherein U_t is the threshold value set in the current window, and k is a constant controlling the sensitivity of outlier detection;
Under real-time data updating, the mean value and standard deviation of the window are rolled forward: the anomaly score s_{t+1} at time t+1 is introduced and the earliest datum s_{t−W+1} of the window is removed, calculated as follows:

μ_{t+1} = μ_t + ( s_{t+1} − s_{t−W+1} ) / W
σ²_{t+1} = σ²_t + ( s²_{t+1} − s²_{t−W+1} ) / W + μ²_t − μ²_{t+1}

Through this recursive mode, the mean value and the standard deviation are dynamically adjusted each time the window is updated;
S43, setting the hyperparameters of the isolated forest model, including the number of isolated trees, the sampling amount of each tree and the length of the sliding window, training the model by gradually adjusting the hyperparameters, and comparing the performance differences under different settings;
S44, identifying outlier data in the prediction residual with the trained isolated forest model: the calculated anomaly score is compared with the threshold value, and if the anomaly score is greater than the threshold value, the data point is judged to be outlier data.
The invention has the beneficial effects that:
1. Through linear interpolation and extremum normalization technology, missing values and abnormal values can be effectively processed, inconsistency and dimension differences in data are eliminated, and dimension consistency among different indexes is ensured. The processing step obviously improves the quality and reliability of the data, and provides a more accurate basis for subsequent analysis and modeling.
2. Through analysis of the correlation among the indexes of the influence factors with the Spearman rank correlation coefficient, indexes strongly correlated with the main factor index can be effectively identified, and indexes irrelevant or weakly related to the target factor are removed. This process helps reduce the interference of irrelevant features, optimizes the selection of input features, and improves the performance and efficiency of subsequent models.
3. By combining the feature attention mechanism, the LSTM model can adaptively adjust the weight according to the change of each influence factor, so that the model can dynamically pay attention to key features in an input sequence at each moment, and the accuracy and reliability of prediction are improved. And the precision of the prediction result is effectively improved through the feature selection and dynamic adjustment of the optimization model.
4. An outlier data identification model is constructed with the isolated forest algorithm, on the basis of the prediction model and fusing multi-factor comprehensive analysis. By taking the multidimensional prediction residual and the influence factors as input features, scale consistency among different features is ensured through standardization, and the relations among the input features are then implicitly considered. These relations are reflected in the construction of the decision trees and affect the final anomaly score calculation. Finally, based on the anomaly score distribution, a sliding window technique is adopted to dynamically set the threshold value, and data points exceeding the threshold value are judged to be outlier data. The accuracy, effectiveness and reliability of outlier data identification are thereby effectively improved.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes the embodiments of the present invention with reference to specific examples. The invention may also be practiced or carried out in other, different embodiments, and the details of the present description may be modified or varied in various respects without departing from the spirit and scope of the present invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the present invention, and the following embodiments and the features in the embodiments may be combined with each other without conflict.
Only the components related to the present invention are shown in the drawings, which are not drawn according to the number, shape and size of the components in actual implementation; in actual implementation, the form, number and proportion of the components may be changed arbitrarily, and the layout of the components may be more complicated.
In the following description, numerous details are set forth in order to provide a more thorough explanation of the embodiments of the present invention. It will be apparent, however, to one skilled in the art that the embodiments of the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form rather than in detail, in order to avoid obscuring the embodiments of the present invention.
As shown in fig. 1, the invention provides a multi-factor fusion internet of things outlier data identification method, which comprises the following specific steps:
The first step, data fusion and preprocessing comprises the following steps:
In step 1, data fusion: in the identification of the outlier data of the Internet of things, the target factor to be identified is first determined, and a plurality of influence factors related to it are fused. The corresponding historical data are then merged to provide a comprehensive data basis for subsequent analysis. For example, in water environment monitoring, the water quality factor can be determined as the target factor to be identified; associated influence factors such as weather affect the water quality index data to a certain extent, and combining these factors allows the water quality index data to be analyzed more comprehensively.
And step 2, processing missing values and abnormal points: in the process of Internet of things data acquisition, data breakpoints and noise are often produced under the influence of factors such as network, weather and overhaul. Therefore, a linear interpolation method is used: given the data x_i and x_j sampled at two times t_i and t_j, when a peak outlier or missing value appears at a time t between them, the value at time t is estimated by the linear interpolation function L(t) = x_i + (x_j − x_i)·(t − t_i)/(t_j − t_i).
And step 3, normalization: after linear interpolation, the data are normalized by the extremum normalization method to eliminate the influence of dimension, according to the formula x' = (x − x_min)/(x_max − x_min + ε), wherein x_min and x_max are the minimum and maximum values in the data, and ε is a small constant, chosen based on the range and precision of the data, used to keep the denominator from being zero. The result output by the prediction model is subjected to inverse normalization, x = x'·(x_max − x_min + ε) + x_min, to obtain the actual predicted value, wherein x' represents the normalized value of a certain index.
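The interpolation and normalization of this preprocessing step can be sketched in Python as follows; the function names and the default ε are illustrative assumptions, not part of the claimed method:

```python
import numpy as np

def linear_interpolate(t, ti, tj, xi, xj):
    """Estimate the value at time t from known samples (ti, xi) and (tj, xj)."""
    return xi + (xj - xi) * (t - ti) / (tj - ti)

def minmax_normalize(x, eps=1e-8):
    """Extremum (min-max) normalization; eps keeps the denominator nonzero."""
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min + eps), x_min, x_max

def minmax_denormalize(x_norm, x_min, x_max, eps=1e-8):
    """Inverse normalization to recover actual predicted values."""
    return x_norm * (x_max - x_min + eps) + x_min
```

The same (x_min, x_max) recorded during training must be reused when de-normalizing model outputs, otherwise the recovered values drift.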
And secondly, data correlation analysis: in order to explore the influence of the different indexes among all influence factors on the main factor index, and in view of the nonlinear relations among different indexes, the correlation between each pair of indexes is analyzed by adopting the Spearman rank correlation coefficient.
Specifically, by performing rank conversion on each pair of index data to be analyzed, their Spearman rank correlation coefficient is calculated, reflecting the strength of the monotone relation between the indexes. If the correlation coefficient is close to 1, a strong positive correlation exists between the two indexes; if it is close to −1, a strong negative correlation exists; and if it is close to 0, no obvious monotone relation exists between them. According to the result of the correlation analysis, influence factor indexes highly correlated with the main factor index are selected, and indexes weakly correlated with the main factor are eliminated, thereby optimizing the selection of input features for the subsequent model. The highly relevant indexes serve as key input features in the subsequent modeling process and provide the basis for updating the input feature weights in the prediction model through the feature attention mechanism, improving the model's attention to key factors.
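The rank conversion and index screening described above can be sketched as follows; the helper names and the 0.5 selection threshold are assumptions for illustration, and ties in the data are not handled:

```python
import numpy as np

def rank(a):
    """Rank-transform a 1-D array (assumes no tied values)."""
    order = a.argsort()
    ranks = np.empty(len(a))
    ranks[order] = np.arange(1, len(a) + 1)
    return ranks

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the rank-transformed data."""
    return np.corrcoef(rank(a), rank(b))[0, 1]

def select_features(target, candidates, threshold=0.5):
    """Keep only candidate indexes whose |rho| with the target meets the threshold."""
    selected = {}
    for name, series in candidates.items():
        rho = spearman(target, series)
        if abs(rho) >= threshold:
            selected[name] = rho
    return selected
```

Because Spearman's coefficient only measures monotone association, a nonlinear but monotone index (e.g. a cubic relation) still scores near 1, which is why it suits the nonlinear relations mentioned above.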
Third, referring to fig. 2, constructing LSTM combined with attention mechanism for prediction, comprising the steps of:
Step 1, dividing the preprocessed data into a training set, a test set and a verification set in the proportion of 8:1:1.
Step 2, constructing a prediction model of LSTM combined with a characteristic attention mechanism, defining an input layer, a hidden layer and an output layer, performing window construction on the data set, and then performing multivariate time sequence prediction with the data as input of the prediction model; the time sequence Y = (y_1, …, y_{T−1}, y_T) ∈ R^T of any index is set as the prediction target, and the index data of each factor over the historical T moments are set as the time sequence matrix X = (x_1, x_2, …, x_N)^T ∈ R^{T×N} of related characteristic variables, wherein N represents the dimension of the parameters and comprises each index parameter in each factor, and x_t^n represents the value of the nth variable at time t.
The important variables are weighted in the encoding stage by the characteristic attention mechanism to obtain the importance weight c_N of each hidden state to the predicted output, which represents the importance of the current input feature to the output:

c_N = f_attention(x)

In the encoding stage, the context vector updated by the characteristic attention mechanism is fused with the previous history information to produce the output; using these weight coefficients, the input variables at each time are updated to obtain the matrix X̃ = (c_1 x_1, c_2 x_2, …, c_N x_N)^T ∈ R^{T×N}.
The weight updating is carried out on the input vector and each hidden layer state by the characteristic attention mechanism, so that the time sequence coding hidden layer state at each moment contains the association relation between the predicted target parameter and the other characteristic parameters, thereby obtaining the predicted value ŷ_{t+1} of the historical data at the next moment.
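Since f_attention is not fully specified in the text, the following numpy sketch shows one minimal way the per-variable importance weights c_1, …, c_N could re-weight the input matrix; the scoring function and its parameters w and b are purely illustrative placeholders for what the network would learn jointly with the LSTM:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def feature_attention(X, w, b):
    """Score each of the N input variables and normalize to importance weights.

    X    : (T, N) history matrix of the related characteristic variables.
    w, b : (N,) scoring parameters (illustrative stand-ins for learned weights).
    Returns the re-weighted matrix (c_1 x_1, ..., c_N x_N) and the weights c.
    """
    scores = X.mean(axis=0) * w + b   # one scalar score per variable
    c = softmax(scores)               # importance weights, sum to 1
    return X * c, c

rng = np.random.default_rng(0)
T, N = 24, 4
X = rng.normal(size=(T, N))
X_tilde, c = feature_attention(X, w=rng.normal(size=N), b=np.zeros(N))
```

The softmax keeps the weights positive and normalized, so down-weighting one variable necessarily re-allocates attention to the others at each update.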
And step 3, configuring network parameters such as the learning rate and the number of training epochs, and training the prediction model, the training terminating when the loss function L(θ) converges:

L(θ) = (1/m) · Σ_{i=1}^{m} √( (1/T) · Σ_{t=1}^{T} ( y_t^i − ŷ_t^i )² )

where θ represents the set of network parameters, m is the number of target parameters, and y_t^i and ŷ_t^i represent the actual value and the predicted value of the target parameter i at time t, respectively.
The overall prediction accuracy of the model can be effectively measured by summing and averaging the root mean square errors of the predicted value and the actual value of each target parameter, and the model parameters can be optimized in the training process, so that the overfitting is reduced.
And step 4, predicting real-time data with the trained prediction model: the data at the first n moments before the moment t+1 to be predicted are used as the input sequence, and the index parameters at moment t+1 are predicted, namely ŷ_{t+1} = F(x_{t−n+1}, x_{t−n+2}, …, x_t), where F denotes the trained prediction model.
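The sliding-window input construction and the averaged-RMSE loss used above can be sketched as follows; make_windows and loss are hypothetical helper names for illustration:

```python
import numpy as np

def make_windows(X, y, n):
    """Slide a length-n window over the series: X[t-n+1..t] is used to predict y[t+1].

    X : (T, N) feature matrix, y : (T,) target series.
    Returns inputs of shape (T-n, n, N) and the aligned targets y[n:].
    """
    inputs = np.stack([X[t - n + 1 : t + 1] for t in range(n - 1, len(X) - 1)])
    targets = y[n:]
    return inputs, targets

def loss(Y_true, Y_pred):
    """Average of per-parameter root-mean-square errors, matching L(theta)."""
    rmse = np.sqrt(((Y_true - Y_pred) ** 2).mean(axis=0))  # RMSE per target parameter
    return rmse.mean()
```

Each training sample thus pairs n consecutive moments of all retained indexes with the next-moment target, which is exactly the shape an LSTM encoder consumes.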
fourth, referring to fig. 3, constructing an isolated forest for anomaly detection, comprising the steps of:
Firstly, the predicted value output by the prediction model is compared with the actual value, and the prediction residual is calculated. Then, the Z-score standardization method is used to standardize the multidimensional prediction residual and the influence factors, so as to ensure a consistent scale among different features and facilitate subsequent model training.
Step 2, constructing the outlier data identification model of an isolated forest, taking the multidimensional prediction residual and the influence factors as input features. The model implicitly considers the relations between the input features during training; these relations are reflected in the decision trees and in the calculation of the anomaly scores. Specifically, the change of the influence factors is taken into account in the calculation of the anomaly score of a sample point, the updated anomaly score being calculated as

s(x) = 2^( −E[h_t(x)] / c(n) )

where s(x) represents the anomaly score of data point x, h_t(x) represents the path length of data point x in the t-th tree, E[h_t(x)] is the average path length over all trees, and c(n) is a constant that normalizes the path length, used for adjusting for differences between input features.
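The anomaly score described above can be computed from the per-tree path lengths; this sketch follows the standard isolated forest score definition, with the normalizing constant c(n) taken as the usual average unsuccessful-search path length of a binary search tree (harmonic-number approximation):

```python
import math

def c(n):
    """Normalizing constant: average path length of an unsuccessful
    binary-search-tree search over n samples (harmonic approximation)."""
    if n <= 1:
        return 0.0
    harmonic = math.log(n - 1) + 0.5772156649  # H(n-1) via Euler-Mascheroni constant
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def anomaly_score(path_lengths, n):
    """s(x) = 2^(-E[h_t(x)] / c(n)); scores near 1 indicate likely outliers."""
    e_h = sum(path_lengths) / len(path_lengths)
    return 2.0 ** (-e_h / c(n))
```

Short average paths mean the point was isolated quickly across the trees and push the score toward 1; paths comparable to c(n) give scores near 0.5, i.e. ordinary points.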
A sliding window technique is adopted to count the anomaly scores over a period of time; the mean value and standard deviation in the window are calculated and the threshold value is dynamically adjusted, so that the threshold is set flexibly as the anomaly distribution changes, improving the adaptability of the model to data fluctuation. At time t, the length of the sliding window is W and the anomaly scores in the window are {s_{t−W+1}, s_{t−W+2}, …, s_t}; after the mean μ_t and the standard deviation σ_t in the sliding window are calculated, the dynamic threshold is given by the following formula, wherein U_t is the threshold set in the current window and k is a constant, usually 2 or 3, controlling the sensitivity of outlier detection:
U_t = μ_t + k·σ_t
In the case of real-time data updating, the mean value and standard deviation of the window can be updated by rolling to reduce the amount of computation: the anomaly score s_{t+1} is introduced at time t+1 and the earliest datum s_{t−W+1} of the window is removed, calculated as follows:

μ_{t+1} = μ_t + ( s_{t+1} − s_{t−W+1} ) / W
σ²_{t+1} = σ²_t + ( s²_{t+1} − s²_{t−W+1} ) / W + μ²_t − μ²_{t+1}
Through this recursive method, the mean value and the standard deviation are dynamically adjusted each time the window is updated, reducing the recalculation cost while ensuring that the threshold changes dynamically over time to adapt to changes in the anomaly score distribution.
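The sliding-window dynamic threshold with O(1) rolling updates can be sketched as follows; the class name and the default k = 3 are illustrative assumptions (the update tracks the window sum and sum of squares, which is equivalent to the recursive mean/variance adjustment above):

```python
import math
from collections import deque

class DynamicThreshold:
    """Sliding-window threshold U_t = mu_t + k * sigma_t over anomaly scores."""

    def __init__(self, window, k=3.0):
        self.scores = deque(maxlen=window)
        self.k = k
        self.sum = 0.0     # rolling window sum
        self.sum_sq = 0.0  # rolling window sum of squares

    def update(self, s):
        """Absorb a new score; evict s_{t-W+1} when the window is full."""
        if len(self.scores) == self.scores.maxlen:
            old = self.scores[0]          # deque drops it on append
            self.sum -= old
            self.sum_sq -= old * old
        self.scores.append(s)
        self.sum += s
        self.sum_sq += s * s

    def threshold(self):
        w = len(self.scores)
        mu = self.sum / w
        var = max(self.sum_sq / w - mu * mu, 0.0)  # clamp tiny negatives
        return mu + self.k * math.sqrt(var)

    def is_outlier(self, s):
        """Judge s against the current window's threshold, then absorb it."""
        flag = len(self.scores) == self.scores.maxlen and s > self.threshold()
        self.update(s)
        return flag
```

Judging a point before absorbing it avoids the new score inflating the very threshold it is compared against; whether to then absorb confirmed outliers into the window is a design choice this sketch leaves simple.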
And step 3, setting the hyperparameters of the isolated forest model, including the number of isolated trees, the sampling amount of each tree, the length of the sliding window, and the like. The model is trained by stepwise adjustment of these hyperparameters, and the performance differences under different settings are compared. When evaluating model performance, indexes such as the ROC curve and the AUC value are adopted to analyze the accuracy and stability of the model, so that the parameter combination with optimal performance is selected.
And 4, identifying outlier data of the prediction residual by using the trained isolated forest model. Comparing the calculated anomaly score with a threshold value, and if the anomaly score is greater than the threshold value, determining the data point as outlier data.
In the foregoing embodiments, references in the specification to "this embodiment" indicate that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least some, but not necessarily all, embodiments. Multiple occurrences of "this embodiment" do not necessarily all refer to the same embodiment.
In the above embodiments, while the present invention has been described in conjunction with specific embodiments thereof, many alternatives, modifications and variations of these embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, the embodiments discussed may be used with other memory structures (e.g., dynamic RAM (DRAM)). The embodiments of the invention are intended to embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims.
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the methods of the present embodiments.
The embodiment also provides an electronic terminal, which comprises a processor and a memory;
The memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory, so that the terminal executes any one of the methods in the present embodiment.
As for the computer-readable storage medium of the present embodiment, those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be implemented by hardware related to a computer program. The aforementioned computer program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the method embodiments described above. The storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disk or optical disk.
The electronic terminal provided in this embodiment includes a processor, a memory, a transceiver, and a communication interface, where the memory and the communication interface are connected to the processor and the transceiver and complete communication with each other, the memory is used to store a computer program, the communication interface is used to perform communication, and the processor and the transceiver are used to run the computer program, so that the electronic terminal performs each step of the above method.
In this embodiment, the memory may include a random access memory (Random Access Memory, abbreviated as RAM), and may further include a non-volatile memory (non-volatile memory), such as at least one disk memory.
The processor may be a general-purpose processor, including a central processing unit (CPU) or a network processor (NP); a digital signal processor (DSP); an application-specific integrated circuit (ASIC); a field-programmable gate array (FPGA) or other programmable logic device; a discrete gate or transistor logic device; or discrete hardware components.
The invention is operational with numerous general purpose or special purpose computing system environments or configurations. Such as a personal computer, a server computer, a hand-held or portable device, a tablet device, a multiprocessor system, a microprocessor-based system, a set top box, a programmable consumer electronics, a network PC, a minicomputer, a mainframe computer, a distributed computing environment that includes any of the above systems or devices, and the like.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present invention, which is intended to be covered by the claims of the present invention.