CN114330972A


Info

Publication number: CN114330972A
Application number: CN202111313556.2A
Authority: CN
Inventors: 王琳; 蓝科; 张国兵
Original assignee: Chengdu Sefon Software Co Ltd
Current assignee: Chengdu Sefon Software Co Ltd
Priority date: 2022-02-24
Filing date: 2022-02-24
Publication date: 2022-04-12

Abstract

The invention discloses a facility and equipment evaluation method and device based on probability distribution. It mainly addresses two problems of existing facility and equipment evaluation methods: the evaluation framework loses the correlation and independence among features, and it lacks extensibility. The method builds an evaluation framework by analyzing the correlations in the historical data of the facility equipment, then scores new data of the facility equipment against that framework and outputs the result. With this scheme, the evaluation framework is established automatically, the correlation and independence among features are retained, scores and weights are calculated in the form of information distributions to prevent information loss, and the evaluation is multi-dimensional.

Description

Translated from Chinese

A method and device for evaluating facilities and equipment based on probability distribution

Technical field

The invention relates to the technical field of computer learning, and in particular to a method and device for evaluating facilities and equipment based on probability distribution.

Background

With the development of artificial intelligence, demand for facilities and equipment is growing across all industries, and dependence on them is high. When facilities and equipment carry out monitoring tasks, their health status greatly affects their normal operation. Therefore, to maintain facilities and equipment better and to learn their operating status in time, it is necessary to establish an evaluation framework for monitoring them.

In recent years, many methods have appeared for evaluating facilities and equipment, for example the analytic hierarchy process (AHP), fuzzy evaluation, geometric evaluation, and the entropy weight method. These methods reflect the condition of facilities and equipment to some extent, but they require experts to subjectively establish a reasonable evaluation framework and a probability distribution with objective effect. This approach suits data of low dimensionality, and the resulting evaluation framework has already lost the correlation and independence among features. Moreover, because the evaluation results of these methods are influenced by human subjective judgment, the evaluation framework is not readily extensible.

Summary of the invention

The purpose of the present invention is to provide a facility and equipment evaluation method and device based on probability distribution, so as to solve the problems that, in existing facility and equipment evaluation methods, the evaluation framework loses the correlation and independence among features and lacks extensibility.

To solve the above problems, the present invention provides the following technical solutions:

A facility and equipment evaluation method based on probability distribution includes the following steps:

S1. Standardize the data set of historical facility and equipment data, then calculate its correlation coefficient matrix corrMatrix;

S2. Derive the corresponding loading matrix loadingMatrix from the correlation coefficient matrix corrMatrix of step S1, and obtain the evaluation framework;

S3. Set the expected interval of each feature according to the evaluation framework of step S2, then convert the new data of the equipment and facilities into probability distributions;

S4. From the probability distributions of step S3, derive the probability distributions of the parent nodes up to the root node, then calculate the score of the root node to form the evaluation result.

The invention analyzes the correlation among features to discover feature-to-feature relationships in the data, uses these relationships to establish an objective and extensible evaluation framework, and evaluates newly generated facility and equipment data by combining the fuzzy evaluation principle with an information fusion method, so that the condition of the facility and equipment is evaluated comprehensively across all dimensions.

Further, the data set of historical facility and equipment data is standardized in step S1 as follows:

S101. Traverse the m features of the historical data set to obtain the maximum value max_i and the minimum value min_i of each feature F_i (i = 1, 2, …, m);

S102. Standardize the historical data to obtain the data set normalData. The standardization method is:

$$x'_{ij} = \frac{x_{ij} - \bar{x}_i}{\mathrm{std}_i}$$

where x′_ij is the standardized value of x_ij, and x_ij is the j-th (j = 1, 2, …, n) value of feature F_i (i = 1, 2, …, m) in the data set;

\bar{x}_i is the mean of feature F_i, calculated as:

$$\bar{x}_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij}$$

std_i is the standard deviation of feature F_i, calculated as (population form shown; this text does not indicate whether n or n − 1 is used in the denominator):

$$\mathrm{std}_i = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(x_{ij} - \bar{x}_i\right)^2}$$
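Steps S101 and S102 can be sketched in a few lines of NumPy. This is a minimal illustration, not the patented implementation: the function name `standardize`, the rows-as-samples layout, the population standard deviation (`ddof=0`), and the guard for constant features are all assumptions made here.

```python
import numpy as np

def standardize(data: np.ndarray):
    """Z-score each feature of an (n samples x m features) array.

    Returns the standardized data plus the per-feature min and max,
    which step S101 keeps for later use in the evaluation framework.
    """
    f_min = data.min(axis=0)           # min_i for every feature
    f_max = data.max(axis=0)           # max_i for every feature
    mean = data.mean(axis=0)
    std = data.std(axis=0)             # assumption: population std (ddof=0)
    std = np.where(std == 0, 1.0, std) # assumption: leave constant features at 0
    return (data - mean) / std, f_min, f_max

# Three samples of two hypothetical features.
history = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
normal_data, f_min, f_max = standardize(history)
```

After the call, every column of `normal_data` has zero mean and unit population standard deviation.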

Further, the correlation coefficient matrix corrMatrix is calculated in step S1 as follows:

(1) Find the correlation coefficient r_sj between feature F_s and feature F_j (s, j = 1, 2, …, m):

$$r_{sj} = \frac{\sum_{k=1}^{n}\left(x_{ks}-\bar{x}_s\right)\left(x_{kj}-\bar{x}_j\right)}{\sqrt{\sum_{k=1}^{n}\left(x_{ks}-\bar{x}_s\right)^2}\,\sqrt{\sum_{k=1}^{n}\left(x_{kj}-\bar{x}_j\right)^2}}$$

where k = 1, 2, …, n and n is the number of samples; x_ks is the value of the k-th sample of the s-th feature F_s; x_kj is the value of the k-th sample of the j-th feature F_j; \bar{x}_s is the mean of the s-th feature F_s; and \bar{x}_j is the mean of the j-th feature F_j;

(2) The correlation coefficient matrix corrMatrix is:

$$\mathrm{corrMatrix} = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1m} \\ r_{21} & r_{22} & \cdots & r_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mm} \end{pmatrix}$$

where n is the number of samples and m is the number of features.
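The correlation coefficient above is the ordinary Pearson coefficient, so corrMatrix can be sketched as follows. The function name `corr_matrix` and the synthetic data are illustrative assumptions; the result is what `numpy.corrcoef(..., rowvar=False)` returns.

```python
import numpy as np

def corr_matrix(data: np.ndarray) -> np.ndarray:
    """Pearson correlation matrix of an (n samples x m features) array."""
    centered = data - data.mean(axis=0)   # x_kj minus the feature mean
    cross = centered.T @ centered         # sums of cross products, m x m
    norm = np.sqrt(np.diag(cross))        # sqrt of each feature's sum of squares
    return cross / np.outer(norm, norm)

# Synthetic data: feature 2 is made strongly correlated with feature 0.
rng = np.random.default_rng(0)
samples = rng.normal(size=(50, 3))
samples[:, 2] = 2.0 * samples[:, 0] + 0.1 * samples[:, 2]
corr = corr_matrix(samples)
```

The matrix is symmetric with ones on the diagonal, and it is unchanged if the data is standardized first, which is why computing it on normalData or on the raw data gives the same result.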

Further, in step S2, singular value decomposition (SVD) is applied to the correlation coefficient matrix corrMatrix to obtain the corresponding loading matrix loadingMatrix.
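The text only states that SVD is applied to corrMatrix. A minimal sketch, assuming the usual factor-analysis convention in which each singular vector is scaled by the square root of its singular value (the function name `loading_matrix` and the sample matrix are hypothetical):

```python
import numpy as np

def loading_matrix(corr: np.ndarray):
    """Loadings of a correlation matrix via SVD.

    For a symmetric positive semi-definite matrix, corr = U diag(s) U^T,
    so scaling column k of U by sqrt(s_k) gives L with L @ L.T == corr.
    """
    u, s, _ = np.linalg.svd(corr)
    return u * np.sqrt(s), s   # broadcasting scales each column of u

corr = np.array([[1.0, 0.8, 0.1],
                 [0.8, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
loading, singular_values = loading_matrix(corr)
```

Under this convention the sum of squared loadings in column k equals the k-th singular value, which is the "variance" that the later factor-counting step compares against 1.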

Further, the evaluation framework is obtained in step S2 as follows:

S201. Calculate the corresponding variance values D from the loading matrix loadingMatrix and count the number N of variances D greater than 1; N is the number of feature classes;

S202. Rotate the loading matrix loadingMatrix with the maximum-variance (varimax) method to obtain the rotated matrix rotationMatrix, and locate the position k (k = 1, 2, …, K) of the maximum value in the row of rotationMatrix that corresponds to feature F_i; F_i is then a feature of class k;

S203. Calculate the weights from the variance values of step S201: assuming class k contains s features, the weight w_kj of feature F_kj (k = 1, 2, …, K; j = 1, 2, …, s) is calculated from the variances D_kj (the formula image is not reproduced in this text),

where D_kj is the variance of the j-th feature in class k;

S204. From the feature maxima and minima of step S101 and the weights of step S203, the evaluation framework is obtained (the framework diagram is not reproduced in this text).

The framework contains the weight w of each node and the maximum value max_ij and minimum value min_ij of each leaf node F_ij (i = 1, 2, …, K; j = 1, 2, …, S_K), where F_KS_K denotes the S_K-th feature of class K, and max_ij and min_ij are the feature maximum and minimum from step S101.
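Steps S201 to S203 can be sketched as below. Several choices are assumptions, because the text leaves them open: the variance of a factor is taken as its column sum of squared loadings, the class is picked by the largest absolute rotated loading (the text says "maximum value"), the weight is each feature's share of the variance within its class, and D_kj is stood in for by the feature's communality. The varimax routine is the commonly used SVD-based iteration; all function names are hypothetical.

```python
import numpy as np

def count_factors(loading):
    """S201: count factors whose variance (column sum of squares) exceeds 1."""
    variance = (loading ** 2).sum(axis=0)
    return int((variance > 1.0).sum()), variance

def varimax(loading, max_iter=100, tol=1e-8):
    """S202: varimax rotation of a (features x factors) loading matrix."""
    p, k = loading.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loading @ rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        grad = loading.T @ (rotated ** 3 - rotated @ col_ss / p)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt                 # nearest orthogonal matrix to grad
        new_objective = s.sum()
        if objective != 0.0 and new_objective / objective < 1.0 + tol:
            break
        objective = new_objective
    return loading @ rotation

def assign_classes(rotated):
    """S202: class of each feature = column of its largest |loading| (assumption)."""
    return np.argmax(np.abs(rotated), axis=1)

def class_weights(labels, d):
    """S203 (assumed form): normalize d within each class so weights sum to 1."""
    weights = np.zeros(len(d), dtype=float)
    for k in np.unique(labels):
        members = labels == k
        weights[members] = d[members] / d[members].sum()
    return weights

# Hypothetical loadings: 4 features, 2 retained factors.
loading = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]])
n_classes, variance = count_factors(loading)
rotated = varimax(loading)
labels = assign_classes(rotated)
d = (rotated ** 2).sum(axis=1)   # stand-in for D_kj (assumption)
weights = class_weights(labels, d)
```

In this example both factors have variance above 1, the first two features fall into one class and the last two into the other, and the weights inside each class sum to 1.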

Further, the expected interval is set in step S3 as follows: set the expected value interval of each feature,

$$E_{ij} = \left[\mathrm{lowerLimit}_{ij},\ \mathrm{upLimit}_{ij}\right]$$

where upLimit_ij is the upper limit of E_ij and lowerLimit_ij is the lower limit of E_ij.

Further, the new data of the equipment and facilities is converted into a probability distribution in step S3 as follows:

Taking feature F_ij as an example, with v_ij denoting the value of feature F_ij, the probability distribution is generated as follows:

(1) Determine whether v_ij lies within the expected value interval E_ij = [lowerLimit_ij, upLimit_ij]. If so, the current feature value v_ij is within the optimal state interval of the feature: the probability of grade 1 is set to 1 and the probability of every other grade to 0, so the probability distribution of the feature is p_ij = (1, 0, 0, …). Otherwise go to (2). Here E_j denotes the expected value interval of the j-th feature F_j;

(2) Determine whether v_ij satisfies v_ij > max_ij or v_ij < min_ij. If so, the current feature value is outside the acceptable range, and the probability distribution of the feature is p_ij = (0, 0, 0, …). Otherwise go to (3). Here max_j and min_j are the maximum and minimum values of the j-th feature F_j, and lowerLimit_j and upLimit_j are the lower and upper limits of its expected value;

(3) Determine whether v_ij satisfies min_ij ≤ v_ij < lowerLimit_ij. If so, the feature is of the larger-the-better kind, and the probability distribution is generated on the basis of the interval [min_ij, lowerLimit_ij]; otherwise go to (4).

The probability distribution is generated as follows:

Divide the interval [min_ij, lowerLimit_ij] equally into N-1 intervals; the range of the M-th (1 ≤ M ≤ N-1) grade interval is [a_M, b_M], where:

a_M = min_ij + (M-1) × distance_ij

b_M = min_ij + M × distance_ij

and distance_ij is the length of each of the N-1 equal intervals, that is, distance_ij = (lowerLimit_ij - min_ij) / (N-1).

Assuming v_ij falls within the M-th grade interval, the probability that v_ij belongs to grade M is p_M (the formula image for p_M is not reproduced in this text).

The probability that v_ij belongs to grade M-1 is:

p_M-1 = 1 - p_M

The probability of every other grade is 0, so:

P_i = {0, …, 0, 1-p_M, p_M, 0, …, 0}

(4) If v_ij satisfies upLimit_ij ≤ v_ij ≤ max_ij, the feature is of the smaller-the-better kind, and the probability distribution is generated as follows:

Divide the interval [upLimit_ij, max_ij] into grade intervals [a_M, b_M] in the same way, where:

a_M = max_ij + (M-1) × distance_ij

b_M = max_ij + M × distance_ij

p_M-1 = 1 - p_M

P_i = {0, …, 0, 1-p_M, p_M, 0, …, 0}.
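The four cases can be sketched as a single function. Because the expression for p_M is not available in this text, the sketch assumes linear interpolation between the two neighbouring grades, and it assumes that grade 1 sits at the edge of the expected interval and grade N at the feature's min or max. Both are illustrative choices, not the patent's confirmed formula; `grade_distribution` and its parameters are hypothetical names.

```python
import numpy as np

def grade_distribution(v, f_min, f_max, lower, upper, n_grades):
    """Probability of each grade (index 0 = grade 1, the best) for value v."""
    p = np.zeros(n_grades)
    if lower <= v <= upper:        # case (1): inside the expected interval
        p[0] = 1.0
        return p
    if v < f_min or v > f_max:     # case (2): outside the acceptable range
        return p                   # all zeros, as in the text
    if v < lower:                  # case (3): larger is better
        frac = (lower - v) / (lower - f_min)
    else:                          # case (4): smaller is better
        frac = (v - upper) / (f_max - upper)
    # Position on a grade axis with N-1 equal intervals (assumed orientation).
    pos = frac * (n_grades - 1)
    lo = int(np.floor(pos))
    hi = min(lo + 1, n_grades - 1)
    p_hi = pos - lo                # assumption: linear membership
    p[lo] += 1.0 - p_hi
    p[hi] += p_hi
    return p

# Hypothetical feature: range [0, 100], expected interval [40, 60], 5 grades.
inside = grade_distribution(50, 0, 100, 40, 60, 5)
outside = grade_distribution(120, 0, 100, 40, 60, 5)
between = grade_distribution(35, 0, 100, 40, 60, 5)
high = grade_distribution(80, 0, 100, 40, 60, 5)
```

With these inputs, a value of 35 lies halfway through the first of the four intervals below the expected range, so its probability is split evenly between grades 1 and 2.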

Further, the probability distributions of the parent nodes are derived up to the root node in step S4 as follows:

Take the i-th class factor_i in the evaluation framework as an example; in the framework it appears as the node factor_i with the leaf nodes F_i1, F_i2, …, F_iS_i beneath it (the diagram is not reproduced in this text).

F_ij denotes a leaf node, that is, a feature. Information fusion merges the actual information of the features layer by layer until the root node is reached. The probability distribution matrix generated from the actual value of each feature is:

$$P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1N} \\ p_{21} & p_{22} & \cdots & p_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ p_{S_i 1} & p_{S_i 2} & \cdots & p_{S_i N} \end{pmatrix}$$

where N is the number of evaluation grades and S_i is the number of features of factor_i;

The weight distribution of the S_i features is W_i = (w_i1, w_i2, …, w_iS_i). The probability distribution of the parent node factor_i is then calculated from W_i and P (the formula image is not reproduced in this text; the weighted combination used in fuzzy comprehensive evaluation is shown):

$$P_i = W_i \cdot P$$

and so on until the fusion reaches the root node.
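One fusion step can be sketched as a weighted sum of the children's grade distributions. This assumes the weighted-average operator of fuzzy comprehensive evaluation, since the exact formula is not available in this text; `fuse` and the sample numbers are hypothetical.

```python
import numpy as np

def fuse(weights, child_distributions):
    """Parent grade distribution = weights (S,) times child matrix (S x N)."""
    w = np.asarray(weights, dtype=float)
    p = np.asarray(child_distributions, dtype=float)
    return w @ p

# Three leaf features of one factor, three grades each.
leaf_p = np.array([[1.0, 0.0, 0.0],
                   [0.5, 0.5, 0.0],
                   [0.0, 0.0, 1.0]])
leaf_w = [0.5, 0.3, 0.2]
factor_p = fuse(leaf_w, leaf_p)
```

When the weights sum to 1 and every child row sums to 1, the fused row also sums to 1, so the same function can be applied again one level up, with the factor distributions as rows, until the root is reached.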

Further, the score that forms the evaluation result is calculated in step S4 as follows:

The probability distribution of every node can be converted into a composite score. Taking node factor_i as an example, with probability distribution P_i = [w_i1 w_i2 … w_iN], the total score of the node is (the formula image is not reproduced in this text; the probability-weighted sum implied by the definitions is shown):

$$\mathrm{score}_i = \sum_{g=1}^{N} w_{ig}\,\mathrm{score}_g$$

where score_g denotes the score corresponding to the g-th grade; and so on until all nodes have been calculated.
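Converting a grade distribution into a score amounts to an expected value over the grade scores. The grade scores themselves are not given in this text, so the sketch assumes they are evenly spaced from `fullmark` (grade 1) down to 0 (grade N); that spacing and the name `node_score` are assumptions.

```python
import numpy as np

def node_score(distribution, fullmark=100.0):
    """Expected score of a node from its grade probability distribution."""
    p = np.asarray(distribution, dtype=float)
    # Assumption: grade 1 scores fullmark, grade N scores 0, linear in between.
    grade_scores = np.linspace(fullmark, 0.0, len(p))
    return float(p @ grade_scores)

score = node_score([0.65, 0.15, 0.2])
```

With three grades scored 100, 50, and 0, the distribution (0.65, 0.15, 0.2) gives 65 + 7.5 + 0 = 72.5.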

A facility and equipment evaluation device based on information distribution includes a memory, used for storing executable instructions, and a processor, used for executing the executable instructions stored in the memory to implement the facility and equipment evaluation method based on probability distribution.

Compared with the prior art, the present invention has the following beneficial effects:

(1) The invention establishes the evaluation framework by analyzing the correlations among features in the historical data of the equipment and facilities. The framework supports real-time calculation of index scores, computes quickly, and needs little storage space.

(2) The invention evaluates the current condition of the object from multiple dimensions and expresses good or bad quickly and intuitively as a higher or lower score. The evaluation is multi-dimensional and reflects the current condition of the evaluated object comprehensively.

(3) The invention creates the evaluation framework automatically instead of manually, as is traditional. It mines the relationships among features from historical data to create the framework, which excludes human subjective influence and retains the correlation and independence within the data, thereby preserving to a great extent the information the data contains.

(4) The invention derives scores and weights in the form of information distributions to prevent information loss. Feature values are converted into probability distributions while the evaluation score is calculated, which retains more of the information contained in the data.

Description of drawings

To illustrate the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below are some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort, wherein:

FIG. 1 is a schematic flow chart of the present invention.

Detailed description

To make the purpose, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to FIG. 1. The described embodiments should not be regarded as limiting the invention; all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

Example 1

As shown in FIG. 1, the probability-distribution-based facility and equipment evaluation method is a feature clustering and fuzzy comprehensive evaluation method based on information distribution, applied to facility and equipment evaluation under big-data conditions. It consists of two parts: generating the evaluation framework by correlation analysis, and fuzzy evaluation based on information distribution. The details are as follows:

1. Generating the evaluation framework by correlation analysis

S1. Traverse the m features of the historical data to obtain the maximum value max_i and the minimum value min_i of each feature F_i (i = 1, 2, …, m);

S2. Standardize the historical data to obtain the data set normalData. The standardization method is:

$$x'_{ij} = \frac{x_{ij} - \bar{x}_i}{\mathrm{std}_i}$$

where x′_ij is the standardized value of x_ij, and x_ij is the j-th (j = 1, 2, …, n) value of feature F_i (i = 1, 2, …, m) in the data set;

\bar{x}_i is the mean of feature F_i, calculated as:

$$\bar{x}_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij}$$

std_i is the standard deviation of feature F_i, calculated as (population form shown; this text does not indicate whether n or n − 1 is used in the denominator):

$$\mathrm{std}_i = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(x_{ij} - \bar{x}_i\right)^2}$$

S3. Calculate the correlation coefficient matrix corrMatrix of the data set normalData as follows:

(1) Find the correlation coefficient r_sj between feature F_s and feature F_j (s, j = 1, 2, …, m):

$$r_{sj} = \frac{\sum_{k=1}^{n}\left(x_{ks}-\bar{x}_s\right)\left(x_{kj}-\bar{x}_j\right)}{\sqrt{\sum_{k=1}^{n}\left(x_{ks}-\bar{x}_s\right)^2}\,\sqrt{\sum_{k=1}^{n}\left(x_{kj}-\bar{x}_j\right)^2}}$$

where \bar{x}_s is the mean of the s-th feature F_s and \bar{x}_j is the mean of the j-th feature F_j;

(2) The correlation coefficient matrix corrMatrix is:

$$\mathrm{corrMatrix} = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1m} \\ r_{21} & r_{22} & \cdots & r_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mm} \end{pmatrix}$$

where n is the number of samples and m is the number of features.

S4. Apply SVD to corrMatrix to calculate the corresponding loading matrix loadingMatrix.

S5. Calculate the corresponding variance values D from the loading matrix loadingMatrix and count the number N of variances in D greater than 1; this value is the number of feature classes.

S6. Rotate the loading matrix loadingMatrix with the maximum-variance (varimax) method to obtain the rotated matrix rotationMatrix, and locate the position k (k = 1, 2, …, K) of the maximum value in the row of rotationMatrix that corresponds to feature F_i; F_i is then a feature of class k.

S7. Calculate the weight of each feature. Assuming class k contains s features, the weight w_kj of feature F_kj (k = 1, 2, …, K; j = 1, 2, …, s) is calculated from the variances D_kj (the formula image is not reproduced in this text),

where D_kj is the variance of the j-th feature in class k.

S8. Form the evaluation framework (the framework diagram is not reproduced in this text).

The framework contains the weight w of each node and the maximum value max_ij and minimum value min_ij of each leaf node F_ij (i = 1, 2, …, K; j = 1, 2, …, S_K), where F_KS_K denotes the S_K-th feature of class K, and max_ij and min_ij are the feature maximum and minimum from step S101;
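The evaluation framework described in S8 is a small tree that stores a weight per node and a min and max per leaf. One possible in-memory shape is sketched below; the class names, field names, feature names, and the way factor weights are supplied are all assumptions, since the text only lists what the framework contains.

```python
from dataclasses import dataclass, field

@dataclass
class Leaf:
    name: str         # feature F_ij
    weight: float     # w_ij within its class
    min_value: float  # min_ij from the historical data
    max_value: float  # max_ij from the historical data

@dataclass
class Factor:
    name: str         # class node factor_i
    weight: float     # weight of the class under the root (assumed given)
    leaves: list = field(default_factory=list)

def build_framework(names, labels, weights, mins, maxs, factor_weights):
    """Group features into class nodes according to their class labels."""
    factors = [Factor(f"factor_{k + 1}", w) for k, w in enumerate(factor_weights)]
    for name, k, w, lo, hi in zip(names, labels, weights, mins, maxs):
        factors[k].leaves.append(Leaf(name, w, lo, hi))
    return factors

# Hypothetical features and values for illustration only.
framework = build_framework(
    names=["temperature", "vibration", "load"],
    labels=[0, 0, 1],
    weights=[0.6, 0.4, 1.0],
    mins=[10.0, 0.0, 5.0],
    maxs=[90.0, 3.0, 80.0],
    factor_weights=[0.7, 0.3],
)
```

Because only weights and bounds are stored, the framework stays small regardless of how much historical data was used to build it.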

2. Fuzzy evaluation based on information distribution

This part builds on the evaluation framework created above, from which the weight w of each node and the maximum value max_ij and minimum value min_ij of each child node F_ij (i = 1, 2, …, K; j = 1, 2, …, S_K) can be obtained; new data is evaluated on this basis. Let N be the number of evaluation grades, with the corresponding grade set G = {grade | grade = 1, 2, 3, …, N}, where N is the lowest-score grade and 1 is the full-score grade, and let fullmark denote the full evaluation score (for example, fullmark = 100 on a percentage scale). The score set corresponding to the grades is then defined by a formula whose image is not reproduced in this text.

The specific steps are as follows:

S1. Set the expected value interval of each feature,

$$E_{ij} = \left[\mathrm{lowerLimit}_{ij},\ \mathrm{upLimit}_{ij}\right]$$

where upLimit_ij is the upper limit of E_ij and lowerLimit_ij is the lower limit of E_ij.

S2. Convert the new data into probability distributions. Taking feature F_ij as an example, with v_ij denoting the value of feature F_ij, the probability distribution is generated as follows:

(1) Determine whether v_ij lies within the expected value interval E_ij. If so, the current feature value v_ij is within the optimal state interval of the feature: the probability of grade 1 is set to 1 and the probability of every other grade to 0, that is, p_ij = (1, 0, 0, …). Otherwise go to (2). Here E_j denotes the expected value interval of the j-th feature F_j;

(2) Determine whether v_ij satisfies v_ij > max_ij or v_ij < min_ij. If so, the current feature value is outside the acceptable range, and the probability distribution of the feature is p_ij = (0, 0, 0, …). Otherwise go to (3);

(3) Determine whether v_ij satisfies min_ij ≤ v_ij < lowerLimit_ij. If so, the feature is of the larger-the-better kind, and the probability distribution is generated on the basis of the interval [min_ij, lowerLimit_ij]; otherwise go to (4).

Divide the interval [min_ij, lowerLimit_ij] equally into N-1 grade intervals [a_M, b_M], where:

a_M = min_ij + (M-1) × distance_ij

b_M = min_ij + M × distance_ij

and distance_ij is the length of each of the N-1 equal intervals;

p_M-1 = 1 - p_M

P_i = {0, …, 0, 1-p_M, p_M, 0, …, 0}

(4) If v_ij satisfies upLimit_ij ≤ v_ij ≤ max_ij, the feature is of the smaller-the-better kind. The probability distribution is generated as follows:

Divide the interval [upLimit_ij, max_ij] into grade intervals [a_M, b_M] in the same way, where:

a_M = max_ij + (M-1) × distance_ij

b_M = max_ij + M × distance_ij

p_M-1 = 1 - p_M

P_i = {0, …, 0, 1-p_M, p_M, 0, …, 0}

S2. Information fusion: derive the probability distribution of the parent nodes, as follows:

Take the i-th class factor_i in the evaluation framework as an example; in the framework it appears as the node factor_i with the leaf nodes F_i1, F_i2, …, F_iS_i beneath it (the diagram is not reproduced in this text).

F_ij denotes a leaf node, that is, a feature. Information fusion merges the actual information of the features layer by layer until the root node is reached. The probability distribution matrix generated from the actual value of each feature is:

$$P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1N} \\ p_{21} & p_{22} & \cdots & p_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ p_{S_i 1} & p_{S_i 2} & \cdots & p_{S_i N} \end{pmatrix}$$

where N is the number of evaluation grades and S_i is the number of features of factor_i. The weight distribution of the S_i features is W_i = (w_i1, w_i2, …, w_iS_i). The probability distribution of the parent node factor_i is then calculated from W_i and P (the formula image is not reproduced in this text; the weighted combination used in fuzzy comprehensive evaluation is shown):

$$P_i = W_i \cdot P$$

and so on until the fusion reaches the root node.

S3. Score calculation, as follows:

The probability distribution of every node can be converted into a composite score. Taking node factor_i as an example, with probability distribution P_i = [w_i1 w_i2 … w_iN], the total score of the node is (the formula image is not reproduced in this text; the probability-weighted sum implied by the definitions is shown):

$$\mathrm{score}_i = \sum_{g=1}^{N} w_{ig}\,\mathrm{score}_g$$

where score_g denotes the score corresponding to the g-th grade; and so on until all nodes have been calculated.
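The layer-by-layer fusion and scoring of this part can be sketched as a recursion over the framework tree. As elsewhere, the weighted-sum fusion and the evenly spaced grade scores are assumptions made for illustration, and the nested-dictionary layout with keys `weight`, `children`, and `distribution` is a hypothetical representation.

```python
import numpy as np

def node_distribution(node):
    """Grade distribution of a node: given for leaves, fused for parents."""
    if "distribution" in node:
        return np.asarray(node["distribution"], dtype=float)
    children = node["children"]
    weights = np.array([child["weight"] for child in children], dtype=float)
    rows = np.vstack([node_distribution(child) for child in children])
    return weights @ rows   # assumed weighted-sum fusion

def node_score(node, fullmark=100.0):
    """Expected score, assuming grade scores spaced evenly from fullmark to 0."""
    p = node_distribution(node)
    return float(p @ np.linspace(fullmark, 0.0, len(p)))

root = {
    "children": [
        {"weight": 0.6, "children": [
            {"weight": 0.5, "distribution": [1.0, 0.0, 0.0]},
            {"weight": 0.5, "distribution": [0.0, 1.0, 0.0]},
        ]},
        {"weight": 0.4, "children": [
            {"weight": 1.0, "distribution": [0.0, 0.0, 1.0]},
        ]},
    ]
}
root_p = node_distribution(root)
root_score = node_score(root)
```

Here the first factor fuses to (0.5, 0.5, 0), the root to (0.3, 0.3, 0.4), and the root score is 30 + 15 + 0 = 45. Calling `node_score` on any intermediate node gives the per-dimension scores that make the evaluation multi-dimensional.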

实施例2Example 2

本发明在创建评估框架时充分分析了历史数据中隐藏的信息，利用相关性将特征进行分类，克服了人为的主观影响，同时很好的保留了数据信息；在计算评估对象分数时，将数据转换为概率分布，以信息分布的形式极大程度上防止信息丢失，同时在信息融合时利用模糊评价的思想进行信息融合，将每个子节点的权重和信息分布进行综合计算，以此来计算父节点的权重和信息分布，具有一定合理性可解释性。在大数据条件下，该发明能以较少的历史数据创建评估框架，并且该评估框架能达到实时计算的效果，在一定程度上大大降低了数据存储和计算所需要空间。因此，该方法具有较好的实用性、扩展性以及可解释性。The invention fully analyzes the information hidden in the historical data when creating the evaluation frame, and uses the correlation to classify the features, overcomes the subjective influence of human beings, and at the same time preserves the data information well; when calculating the evaluation object score, the data Convert it to probability distribution to prevent information loss to a great extent in the form of information distribution. At the same time, use the idea of fuzzy evaluation to perform information fusion during information fusion, and comprehensively calculate the weight and information distribution of each child node to calculate the parent node. The weight and information distribution of nodes have a certain rationality and interpretability. Under the condition of big data, the invention can create an evaluation framework with less historical data, and the evaluation framework can achieve the effect of real-time computing, which greatly reduces the space required for data storage and computing to a certain extent. Therefore, the method has good practicability, scalability and interpretability.

在本申请所提供的几个实施例中，应该理解到，所揭露的装置和方法，也可以通过其它的方式实现。以上所描述的装置实施例仅仅是示意性的，例如，附图中的流程图和框图显示了根据本发明的多个实施例的装置、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上，流程图或框图中的每个方框可以代表一个模块、程序段或代码的一部分，所述模块、程序段或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意，在有些作为替换的实现方式中，方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如，两个连续的方框实际上可以基本并行地执行，它们有时也可以按相反的顺序执行，这依所涉及的功能而定。也要注意的是，框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合，可以用执行规定的功能或动作的专用的基于硬件的系统来实现，或者可以用专用硬件与计算机指令的组合来实现。In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may also be implemented in other manners. The apparatus embodiments described above are merely illustrative, for example, the flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality and possible implementations of apparatuses, methods and computer program products according to various embodiments of the present invention. operate. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more functions for implementing the specified logical function(s) executable instructions. It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It is also noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or actions , or can be implemented in a combination of dedicated hardware and computer instructions.

另外，在本发明各个实施例中的各功能模块可以集成在一起形成一个独立的部分，也可以是各个模块单独存在，也可以两个或两个以上模块集成形成一个独立的部分。In addition, each functional module in each embodiment of the present invention may be integrated to form an independent part, or each module may exist independently, or two or more modules may be integrated to form an independent part.

If the functions are implemented as software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention.

It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any other variants are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element introduced by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or device that comprises the element.

The above are merely preferred embodiments of the present invention and are not intended to limit it; those skilled in the art may make various modifications and changes to the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within its scope of protection. Note that like reference numerals and letters denote like items in the following drawings, so once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.

The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.

Claims

1. A facility equipment evaluation method based on probability distribution, characterized by comprising the following steps:

S1, standardizing the data set of historical data of the facility equipment, and then calculating a correlation coefficient matrix corrMatrix of the data set;

S2, obtaining a corresponding loading matrix loadingMatrix from the correlation coefficient matrix corrMatrix of step S1, and deriving an evaluation framework;

S3, setting an expected interval for each feature according to the evaluation framework of step S2, and then converting new data of the facility equipment into probability distributions;

S4, calculating the probability distributions of the parent nodes up to the root node from the probability distributions of step S3, and then calculating the score of the root node to form an evaluation result.

2. The facility equipment evaluation method based on probability distribution according to claim 1, wherein the data set of historical data of the facility equipment is standardized in step S1 as follows:

S101, traversing the m features of the historical data set of the facility equipment to obtain the maximum value max_i and the minimum value min_i of each feature F_i (i = 1, 2, …, m);

S102, standardizing the historical data of the facility equipment to obtain a data set normalData, the standardization formula being:

[formula not reproduced in the source text]

where x'_ij denotes the standardized value of x_ij, and x_ij denotes the j-th value of feature F_i in the data set (i = 1, 2, …, m; j = 1, 2, …, n);

the mean of feature F_i is calculated as:

[formula not reproduced in the source text]

and std_i, the standard deviation of feature F_i, is calculated as:

[formula not reproduced in the source text]
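
The definitions in claim 2 (a standardized value, a per-feature mean, and a per-feature standard deviation std_i) match a conventional z-score. The sketch below illustrates that reading only. It assumes x'_ij = (x_ij - mean_i) / std_i with the population standard deviation (division by n); the source does not show which divisor is used. The names `standardize`, `history`, and `bounds` are hypothetical.

```python
import math

def standardize(data):
    """Z-score each feature (one row per feature); returns (normal_data, bounds)."""
    normal_data, bounds = [], []
    for values in data:
        n = len(values)
        mean = sum(values) / n
        # Assumption: population standard deviation (divide by n, not n - 1).
        std = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
        # S101: keep min_i and max_i, which the evaluation framework uses later.
        bounds.append((min(values), max(values)))
        # A constant feature has std == 0; map it to zeros instead of dividing by zero.
        normal_data.append([(x - mean) / std if std else 0.0 for x in values])
    return normal_data, bounds

# Hypothetical history: 2 features, 3 samples each.
history = [[10.0, 20.0, 30.0], [1.0, 1.0, 4.0]]
normal_data, bounds = standardize(history)
```

Under these assumptions each standardized feature has mean 0 and population variance 1, while `bounds` keeps the raw extremes that step S204 refers back to.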

3. The facility equipment evaluation method based on probability distribution according to claim 2, wherein the correlation coefficient matrix corrMatrix is calculated in step S1 as follows:

(1) obtaining the correlation coefficient r_ij between feature F_i and feature F_j (i, j = 1, 2, …, m), calculated as:

[formula not reproduced in the source text]

where k = 1, 2, …, n, n denotes the number of samples, x_ks denotes the k-th sample value of the s-th feature F_s, x_kj denotes the k-th sample value of the j-th feature F_j, and the two mean terms in the formula are the mean of the s-th feature F_s and the mean of the j-th feature F_j, respectively;

(2) the correlation coefficient matrix corrMatrix is:

[matrix not reproduced in the source text]

where n denotes the number of samples and m denotes the number of features.
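
The terms defined in claim 3 (sample values of two features and their means, summed over k = 1, …, n) are those of the Pearson correlation coefficient. The sketch below assumes that form; it is one reading of the missing formula, not a reproduction of it. The names `pearson`, `corr_matrix`, and `features` are hypothetical, and every feature is assumed to be non-constant so that the denominator is non-zero.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equally long sample lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Assumes neither feature is constant (var_x and var_y are positive).
    return cov / math.sqrt(var_x * var_y)

def corr_matrix(features):
    """m x m matrix whose (i, j) entry is the correlation of features i and j."""
    m = len(features)
    return [[pearson(features[i], features[j]) for j in range(m)] for i in range(m)]

# Hypothetical data: 3 features (rows), 4 samples each.
features = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
corrMatrix = corr_matrix(features)
```

The resulting matrix is symmetric with ones on the diagonal. Because the Pearson coefficient is invariant to shifting and scaling, it gives the same result on raw and on standardized data.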

4. The facility equipment evaluation method based on probability distribution according to claim 3, wherein in step S2 singular value decomposition (SVD) is applied to the correlation coefficient matrix corrMatrix to obtain the corresponding loading matrix loadingMatrix.
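
The claim does not state how loadingMatrix is formed from the decomposition. One common construction in factor analysis, given here as an assumption rather than as the patent's formula, scales each singular vector by the square root of its singular value. The symbols U, Σ, σ_k, u_ik and ℓ_ik are introduced only for this illustration.

```latex
\mathrm{corrMatrix} = U \Sigma V^{\top}, \qquad
\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_m), \quad
\sigma_1 \ge \dots \ge \sigma_m \ge 0

\mathrm{loadingMatrix} = U \Sigma^{1/2}, \qquad
\ell_{ik} = u_{ik} \sqrt{\sigma_k}

\sum_{i=1}^{m} \ell_{ik}^{2} = \sigma_k
```

Because a correlation matrix is symmetric and positive semi-definite, its singular values coincide with its eigenvalues. Under this construction the sum of squared loadings in column k equals σ_k, which would make counting the variances greater than 1 in step S201 equivalent to counting the eigenvalues greater than 1.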

5. The facility equipment evaluation method based on probability distribution according to claim 4, wherein the evaluation framework is obtained in step S2 as follows:

S201, calculating the corresponding variance values D from the loading matrix loadingMatrix, and counting the number N of variances D greater than 1, N being the number of feature classes;

S202, rotating the loading matrix loadingMatrix by the variance maximization (varimax) method to obtain a rotation matrix rotamatrix, and locating the position k (k = 1, 2, …, K) of the maximum value in the row of rotamatrix corresponding to feature F_i, whereby feature F_i is a k-th class feature;

S203, solving the weight of the parent node from the variance values of step S201: assuming the k-th class of features contains s features, the weight w_kj of feature F_kj (k = 1, 2, …, K; j = 1, 2, …, s) is calculated as:

[formula not reproduced in the source text]

where D_kj denotes the variance of the j-th feature in the k-th class of features;

S204, obtaining the evaluation framework from the feature maximum and minimum values of step S101 and the weights of step S203, as follows:

[framework diagram not reproduced in the source text]

the framework contains the weight w of each node and, for each leaf node F_ij (i = 1, 2, …, K; j = 1, 2, …, S_K), its maximum value max_ij and minimum value min_ij, where [symbol not reproduced in the source text] denotes the S_K-th feature in the K-th class of features, and max_ij and min_ij are respectively the feature maximum and minimum values of step S101.
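
Steps S201 to S203 can be sketched in a few lines once the variances and the rotated matrix are available. The example below does not perform the rotation itself; it takes a hypothetical, already rotated matrix as input. The weight formula is missing from the source, so `class_weights` assumes each feature's weight is its share of the total variance of its class. All function names and sample numbers are hypothetical.

```python
def count_classes(variances):
    """S201: number of variance values greater than 1."""
    return sum(1 for d in variances if d > 1)

def assign_classes(rotated):
    """S202: class of each feature = column index (0-based) of its row maximum."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in rotated]

def class_weights(variances):
    """S203, assumed form: w_kj = D_kj / sum of D_kj over the features of class k."""
    total = sum(variances)
    return [d / total for d in variances]

# Hypothetical inputs: four variance values and a 4 x 2 rotated loading matrix.
n_classes = count_classes([1.9, 1.8, 0.2, 0.1])
classes = assign_classes([[0.97, 0.02], [0.95, 0.05], [0.03, 0.90], [0.01, 0.94]])
weights = class_weights([2.0, 1.0, 1.0])
```

The claim speaks of the maximum value in a row, so the sketch compares raw entries. Loadings can be negative, and comparing absolute values instead would be a design choice that the source does not confirm.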

6. The facility equipment evaluation method based on probability distribution according to claim 5, wherein the expected interval is set in step S3 as follows: an expected value interval E_ij = [lowerLimit_ij, upLimit_ij] is set for each feature, where upLimit_ij denotes the upper limit of E_ij and lowerLimit_ij denotes the lower limit of E_ij.

7. The facility equipment evaluation method based on probability distribution according to claim 6, wherein the new data of the equipment is converted into a probability distribution in step S3 as follows:

taking feature F_ij as an example, with v_ij denoting the value of feature F_ij, the probability distribution is generated as follows:

(1) judging whether v_ij lies within the expected value interval; if so, the current feature value v_ij is in the optimal state interval of the feature, the probability of the feature being at level 1 is set to 1 and the probabilities of the other levels are 0, i.e. the probability distribution of the feature is p_ij = (1, 0, 0, …); otherwise, go to (2); here E_j denotes the expected value interval of the j-th feature F_j;

(2) judging whether v_ij satisfies v_ij > max_ij or v_ij < min_ij; if so, the current feature value is outside the acceptable range, and the probability distribution of the feature is p_ij = (0, 0, 0, …); otherwise, go to (3); here max_j and min_j are the maximum and minimum values of the j-th feature F_j, and lowerLimit_j and upLimit_j are the lower and upper expected values of the j-th feature F_j;

(3) judging whether v_ij satisfies min_ij ≤ v_ij < lowerLimit_ij; if so, the larger the current feature value the better, and the probability distribution is generated on the basis of the interval [min_ij, lowerLimit_ij]; otherwise, go to (4);

the probability distribution is generated as follows:

the interval [min_ij, lowerLimit_ij] is divided equally into N-1 sub-intervals, the M-th (1 ≤ M ≤ N-1) level interval being [a_M, b_M], where:

a_M = min_ij + (M-1) × distance_ij

b_M = min_ij + M × distance_ij

where distance_ij denotes the length of each sub-interval when the interval [min_ij, lowerLimit_ij] is divided equally into N-1 sub-intervals;

supposing v_ij lies in the M-th level interval, the probability of v_ij for the M-th level is:

[formula not reproduced in the source text]

the probability of v_ij for the (M-1)-th level is:

p_{M-1} = 1 - p_M

and the probabilities of v_ij for all other levels are 0, so that:

P_i = {0, …, 0, 1-p_M, p_M, 0, …, 0}

(4) judging whether v_ij satisfies upLimit_ij ≤ v_ij ≤ max_ij; if so, the smaller the current feature value the better; the probability distribution is generated as follows:

the interval [interval not reproduced in the source text] is divided equally into N-1 sub-intervals, the M-th (1 ≤ M ≤ N-1) level interval being [a_M, b_M], where:

a_M = max_ij + (M-1) × distance_ij

b_M = max_ij + M × distance_ij

the probability of v_ij for the (M-1)-th level is:

p_{M-1} = 1 - p_M

and the probabilities of v_ij for all other levels are 0, so that:

P_i = {0, …, 0, 1-p_M, p_M, 0, …, 0}.
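
The branch selection in (1) to (4) and the sub-interval bounds in branch (3) can be illustrated as follows. This is a sketch under stated assumptions, not the patented procedure. The source omits the formula for p_M, so the code assumes linear interpolation inside the sub-interval, p_M = (v - a_M) / distance. The function names, the string labels, and the sample limits are all hypothetical.

```python
def classify(v, min_v, lower, upper, max_v):
    """Return which branch (1)-(4) of the claim applies to the value v."""
    if lower <= v <= upper:
        return "expected"        # branch (1): distribution (1, 0, 0, ...)
    if v > max_v or v < min_v:
        return "out_of_range"    # branch (2): distribution (0, 0, 0, ...)
    if v < lower:
        return "below_expected"  # branch (3): min <= v < lowerLimit
    return "above_expected"      # branch (4): upLimit < v <= max

def level_interval_below(v, min_v, lower, n_levels):
    """Branch (3): locate the M-th sub-interval [a_M, b_M] of [min, lowerLimit]."""
    # Requires n_levels >= 2 and lower > min_v.
    distance = (lower - min_v) / (n_levels - 1)
    m = min(int((v - min_v) / distance) + 1, n_levels - 1)
    a_m = min_v + (m - 1) * distance
    b_m = min_v + m * distance
    # Assumption: linear interpolation; the source does not reproduce p_M.
    p_m = (v - a_m) / distance
    return m, a_m, b_m, p_m

# Hypothetical feature: min 0, expected interval [40, 60], max 100, N = 5 levels.
branch = classify(25, 0, 40, 60, 100)
m, a_m, b_m, p_m = level_interval_below(25, 0, 40, 5)
```

With these numbers the value 25 falls in the third sub-interval [20, 30] and splits its probability evenly between two adjacent levels. The sketch stops short of building the full vector P_i because the source does not state unambiguously how level indices map to vector positions.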

8. The facility equipment evaluation method based on probability distribution according to claim 7, wherein the probability distributions of the parent nodes up to the root node are calculated from the probability distributions of step S3 as follows:

the probability distribution of a parent node is calculated as follows, taking the i-th class factor_i of the evaluation framework as an example, which is represented in the evaluation framework as:

[framework diagram not reproduced in the source text]

where F_ij denotes a leaf node, i.e. a feature; information fusion is then performed, i.e. the actual information of the features is fused layer by layer up to the root node; the probability distribution matrix generated from the actual value of each feature is:

[matrix not reproduced in the source text]

where N denotes the number of evaluation levels and S_i denotes the number of features of factor_i;

the weight distribution of the S_i features is:

[formula not reproduced in the source text]

the probability distribution of the parent node factor_i is then calculated from those of its features F_ij as follows:

[formula not reproduced in the source text]

and so on, until fusion reaches the root node.
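
The fusion formula itself is not reproduced in the source. A natural reading of "weight distribution" combined with "probability distribution matrix" is a weighted sum of the child distributions, that is, the weight row vector multiplied by the S_i × N matrix. The sketch below assumes exactly that; `fuse` and the sample values are hypothetical.

```python
def fuse(weights, distributions):
    """Assumed fusion rule: parent[g] = sum over children j of w_j * P[j][g]."""
    n_levels = len(distributions[0])
    return [
        sum(w * dist[g] for w, dist in zip(weights, distributions))
        for g in range(n_levels)
    ]

# Hypothetical node with three leaf features and N = 4 levels.
weights = [0.5, 0.25, 0.25]
leaf_distributions = [
    [1.0, 0.0, 0.0, 0.0],  # value inside its expected interval
    [0.0, 0.5, 0.5, 0.0],  # value split between two adjacent levels
    [0.0, 0.0, 0.0, 0.0],  # out-of-range value: all-zero distribution
]
parent = fuse(weights, leaf_distributions)
```

Applying `fuse` again to the resulting parent distributions, with the weights of the parents, would carry the fusion up to the root node. Under this rule an out-of-range leaf makes the parent distribution sum to less than 1, so the missing probability mass is visible at every higher layer.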

9. The facility equipment evaluation method based on probability distribution according to claim 1, wherein the evaluation result is formed in step S4 as follows:

the probability distribution of each node can be converted into a comprehensive score; taking node factor_i as an example, with probability distribution P_i = [w_i1 w_i2 … w_iN], the comprehensive score of the node is:

[formula not reproduced in the source text]

where [symbol not reproduced in the source text] denotes the score corresponding to the g-th level; and so on, until the scores of all nodes have been calculated.
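
Given a node distribution and one score per level, the description in claim 9 suggests an expected-value style sum over the levels g = 1, …, N. The sketch below assumes that form. The per-level scores are hypothetical, since the source gives no values, and `node_score` is an illustrative name.

```python
def node_score(distribution, level_scores):
    """Assumed scoring rule: sum over levels g of probability_g * score_g."""
    return sum(p * u for p, u in zip(distribution, level_scores))

# Hypothetical scores for N = 4 levels, level 1 being the best state.
level_scores = [100.0, 75.0, 50.0, 25.0]
distribution = [0.5, 0.25, 0.25, 0.0]
score = node_score(distribution, level_scores)
```

Running the same function over every node of the framework yields a score per class as well as the root score, which is how the multi-dimensional evaluation mentioned in the abstract could be read.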

10. A facility equipment evaluation device based on information distribution, characterized by comprising:

a memory for storing executable instructions; and

a processor for executing the executable instructions stored in the memory to implement the facility equipment evaluation method based on probability distribution according to any one of claims 1 to 9.