CN106874935A


Info

Publication number: CN106874935A
Application number: CN201710028492.9A
Authority: CN
Inventors: 王海伦; 蔡志宏; 叶虹; 王天真
Original assignee: Quzhou University
Current assignee: Quzhou University
Priority date: 2017-01-16
Filing date: 2017-01-16
Publication date: 2017-06-20

Abstract


The invention discloses a support vector machine parameter selection method based on the adaptive fusion of multi-kernel functions. Compared with the prior art, the invention specifically analyzes the characteristics of local kernel functions, global kernel functions, hybrid kernel functions, and multi-kernel functions. All fusion coefficients, kernel function parameters, and regression parameters of the multi-kernel function are combined into a parameter state vector, which converts the model selection problem into a state estimation problem for a nonlinear system. A fifth-degree cubature Kalman filter is then used for parameter estimation, achieving adaptive fusion of the multi-kernel weighting coefficients and selection of the kernel parameters and regression parameters.

Description


Support vector machine parameter selection method based on adaptive fusion of multi-kernel functions

Technical field

The invention relates to the field of data processing algorithms, and in particular to a support vector machine parameter selection method based on the adaptive fusion of multi-kernel functions.

Background

Since the support vector machine (SVM) was first proposed by Corinna Cortes and Vapnik in 1995, it has attracted the attention of many researchers in China and abroad. It is a general learning method built on statistical learning theory and based on the principle of structural risk minimization^[1]. SVM has strong nonlinear processing and generalization capabilities and shows clear advantages in nonlinear, small-sample, and high-dimensional pattern recognition. It has been widely used for classification and regression problems in fields such as image recognition, text classification, face recognition, and intrusion detection^{[2, 3, 4, 5]}.

The core of SVM is the introduction of the kernel function. Each kernel function has its own characteristics and is strongly tailored to particular applications, so kernel performance varies considerably, and there is still no complete theoretical basis for constructing kernel functions or selecting kernel parameters. A support vector machine built on a single kernel function in a single feature space has serious shortcomings when the samples are unevenly distributed. For example, suppose a feature is formed by fusing two features, the first following a multinomial distribution and the second a normal distribution. A single kernel function can describe only one aspect of the data and therefore cannot properly represent features with different distributions^[6]. References [7-9] propose handling classification problems with a hybrid kernel function that is strong at processing both local and global information. This approach compensates for the shortcomings of a single kernel in handling local and global sample information, but it lacks an effective method for optimizing the weighting coefficients of the two base kernel functions. Reference [10] offers another way to estimate the weighting coefficients and kernel parameters based on the hybrid kernel function: the weighting coefficient, the global kernel parameters, the local kernel parameters, and the penalty parameter are combined as the parameters of the hybrid kernel function; the multi-parameter tuning of the hybrid-kernel support vector regression machine is treated as a parameter identification problem for a nonlinear dynamic system; and the parameters are estimated with a fifth-degree cubature Kalman filter, which successfully achieves adaptive adjustment of the weighting coefficients and kernel parameters.

With the development of information technology, the data that must now be processed have become steadily larger and more complex, and the support vector machine, which focuses on the statistical regularities of small samples, is no longer sufficient for complex data characteristics. Difficulties arise in particular when the data scale is very large^[11], the sample features contain heterogeneous information^[12], multi-dimensional data are irregular^[13], or the distribution in the high-dimensional feature space is not flat^[14]. For example, the complexity of the PE process, together with the diversity, randomness, and fuzziness of fault causes, fault phenomena, and fault mechanisms, makes its insulation fault diagnosis difficult. For fault features fused from features with different distributions, a support vector machine built with a hybrid kernel function cannot make full use of the multiple classes of features of the fault samples, and its performance depends mainly on the dominant feature. A hybrid-kernel SVM with only two kernel functions therefore classifies poorly when fusing heterogeneous features. For this reason, Lanckriet et al. proposed the multiple kernel learning framework, which has great advantages for the above problems^[15]. The multi-kernel model is a more flexible class of kernel-based learning model. Recent theory and applications of multiple kernel learning have shown that multi-kernel models achieve better performance than single-kernel or hybrid models, and that replacing a single kernel with multiple kernels can improve the interpretability of the decision function^[16]. The most important question, however, is how to obtain the combined feature space, that is, how to learn the weight coefficients. Many multiple kernel learning theories and methods have recently emerged to address this question, such as the early Boosting-based multi-kernel combination model learning methods^{[17, 18]}, the multiple kernel learning method based on semidefinite programming (SDP)^[19], the learning method based on quadratically constrained quadratic programming (QCQP)^[20], the learning method based on semi-infinite linear programming (SILP)^[21], the learning method based on hyperkernels^[22], the more recent simple multiple kernel learning (Simple MKL) methods^{[23, 24]}, and multiple kernel learning methods based on the group Lasso idea. Researchers have also improved how weight coefficients and kernel functions are combined, for example with non-stationary multiple kernel learning^[25], localized multiple kernel learning^[26], and non-sparse multiple kernel learning^[27]. However, the classification performance of support vector machines whose weighting coefficients and kernel parameters are selected with these methods is still not ideal.

Summary of the invention

The object of the present invention is to solve the above problems by providing a support vector machine parameter selection method based on the adaptive fusion of multi-kernel functions.

The present invention achieves this object through the following technical solution:

The present invention comprises the following steps:

(1) A multi-kernel function formed by the weighted fusion of several kernel functions is itself a new kernel function, so all the weighted fusion coefficients can also be regarded as kernel parameters of the multi-kernel function. The weighted fusion coefficients p_t and p_r, the internal parameters of the local kernel functions, the internal parameters of the global kernel functions, and the penalty parameter C are therefore combined as the parameter vector γ of the support vector machine, and the whole parameter selection problem can be treated as a filtering estimation problem for a nonlinear dynamic system. Let k_t denote all kernel parameters of the t-th local kernel function and k_r all kernel parameters of the r-th global kernel function. The parameter state vector is then γ = [p₁, ..., p_t, ..., p_m, p₁, ..., p_r, ..., p_n, k₁, ..., k_t, ..., k_m, k₁, ..., k_r, ..., k_n, C]^T. First, the following nonlinear parameter system is established:

γ(k) = γ(k-1) + w(k)    (18)

y(k) = h(γ(k)) + v(k)    (19)

where γ(k) is the n-dimensional parameter state vector, y(k) is the observation output, and the process noise w(k) and the observation noise v(k) are both zero-mean Gaussian white noise with covariances Q and R, respectively;

(2) Because the optimal parameters being sought can be regarded as constant, the linear state equation for the parameters in (18) can be established. In addition, for any state vector γ(k), every original data point has a predicted output after training and prediction with LIBSVM^[22], so the nonlinear observation equation in (19) can be established. To run the fifth-degree CKF algorithm, artificial process white noise and observation white noise must be added to the system model;

(3) Suppose the original sample data set of the support vector regression machine is D = {(x_i, y_i) | i ∈ I}, where the index set is I = {1, 2, ..., N} and y_i is the target vector of the data. The k-fold cross-validation method is used to divide the sample data into k groups, namely

D_j = {(x_i, y_i) | i ∈ I_j}    (20)

where j ∈ {1, 2, ..., k}, the index sets I_j of all groups satisfy I₁ ∪ I₂ ∪ … ∪ I_k = I, and the data sets D_j of all groups satisfy D₁ ∪ D₂ ∪ … ∪ D_k = D. In each iteration of support vector regression, any one group D_p is used for prediction and the remaining k-1 groups are used as the training set. Given the initial parameters γ₀, the support vector regression machine is trained with LIBSVM, and the resulting trained model defines the decision function at this point.

(4) Substituting the data group D_p into formula (15) gives the predicted output values of D_p.

Each data group D_i, i ∈ {1, 2, ..., k}, is used in turn as the prediction group, with the remaining groups D₁, ..., D_{i-1}, D_{i+1}, ..., D_k as the training groups of the support vector regression machine. After k-fold cross-validated regression prediction, every data point in the sample set D has exactly one predicted output value. For the parameter vector γ, a prediction output function can therefore be defined from these predicted values.

(5) For the nonlinear system model (18)-(19), the data set D is therefore trained with k sub-LIBSVM models based on the multi-kernel function and k-fold cross-validation, and the predicted output is fed into the fifth-degree cubature Kalman filter for parameter state estimation. The complete model parameter selection algorithm for the support vector regression machine based on adaptive multi-kernel fusion likewise consists of two processes, the time update and the measurement update:

Time update:

Because this update is a predictive update of the state, and the state equation is linear and known, the time update of the model parameter selection algorithm can be carried out with the time-update formulas (5)-(7) of the fifth-degree cubature Kalman filter and the cubature rule (46);

Measurement update:

The parameter state vector γ(k) is substituted into the parameters of the support vector regression machine, LIBSVM is used to train on the data set, and the output is predicted. The prediction output function (17) is then embedded into the measurement-update formulas (8)-(14) of the fifth-degree cubature Kalman filter and the cubature rule (46) to estimate each parameter.
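The k-fold prediction output described in steps (3)-(5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: LIBSVM is replaced by a simple kernel-weighted average regressor (the hypothetical `fit_predict`), the parameter vector γ is reduced to a single Gaussian width `sigma`, the inputs are scalars, and indices are zero-based. All function names are illustrative.

```python
import math
from typing import List, Sequence


def kfold_indices(n: int, k: int) -> List[List[int]]:
    """Split {0, ..., n-1} into k disjoint index sets whose union is the full set."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    return [list(range(j, n, k)) for j in range(k)]


def gaussian(x: float, y: float, sigma: float) -> float:
    """Gaussian kernel exp(-(x - y)^2 / sigma^2) for scalar inputs."""
    return math.exp(-((x - y) ** 2) / sigma ** 2)


def fit_predict(train_x: Sequence[float], train_y: Sequence[float],
                test_x: Sequence[float], sigma: float) -> List[float]:
    """Stand-in for the LIBSVM train/predict step (assumption):
    predicts a kernel-weighted average of the training targets."""
    mean_y = sum(train_y) / len(train_y)
    preds = []
    for x in test_x:
        weights = [gaussian(x, xi, sigma) for xi in train_x]
        total = sum(weights)
        if total == 0.0:
            preds.append(mean_y)  # all weights underflowed: fall back to the mean
        else:
            preds.append(sum(w * y for w, y in zip(weights, train_y)) / total)
    return preds


def prediction_output(xs: Sequence[float], ys: Sequence[float],
                      k: int, sigma: float) -> List[float]:
    """Return one out-of-fold prediction per sample, in the original sample order."""
    n = len(xs)
    out = [0.0] * n
    for fold in kfold_indices(n, k):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        preds = fit_predict([xs[i] for i in train], [ys[i] for i in train],
                            [xs[i] for i in fold], sigma)
        for i, p in zip(fold, preds):
            out[i] = p
    return out
```

Each sample is held out exactly once, so the returned list has one predicted value per data point; this is the property that lets the output serve as the observation y(k) in (19).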

The beneficial effects of the present invention are as follows:

The present invention is a support vector machine parameter selection method based on the adaptive fusion of multi-kernel functions. Compared with the prior art, the invention specifically analyzes the characteristics of local kernel functions, global kernel functions, hybrid kernel functions, and multi-kernel functions. All fusion coefficients, kernel function parameters, and regression parameters of the multi-kernel function are combined into a parameter state vector, which converts the model selection problem into a state estimation problem for a nonlinear system. A fifth-degree cubature Kalman filter is then used for parameter estimation, achieving adaptive fusion of the multi-kernel weighting coefficients and selection of the kernel parameters and regression parameters.

Description of drawings

Fig. 1 is a schematic diagram of the support vector machine structure;

Fig. 2 is a characteristic diagram of the Gaussian kernel function;

Fig. 3 is a characteristic diagram of the polynomial kernel function;

Fig. 4 is a characteristic diagram of the hybrid kernel function;

Fig. 5 is a characteristic diagram of a three-kernel function with two local kernels and one global kernel;

Fig. 6 is a characteristic diagram of a three-kernel function with one local kernel and two global kernels.

Detailed description

The present invention is further described below with reference to the accompanying drawings:

Problem description:

The structure of the support vector regression machine is shown in Fig. 1. The key to the support vector machine is the introduction of the kernel function. The kernel function neatly avoids the curse of dimensionality that arises when low-dimensional vectors are mapped to a high-dimensional space, and it improves the nonlinear processing ability of the learning machine. Each kernel function has its own characteristics, and support vector regression machines built on different kernel functions have different generalization abilities. The kernel function is one of the most important issues in support vector machines, and how to choose a suitable kernel function remains a difficult problem without a theoretical basis.

Let K(x_i, x_j) denote the kernel function, where x_i and x_j are sample data. At present, kernel functions are chosen by trial in experiments; the commonly used kernel functions^[28] are listed below.

(1) Linear kernel function

K(x_i, x_j) = x_i · x_j    (3)

The linear kernel is a special case among kernel functions. Using it amounts to searching for the support vector machine with the best generalization in the original space. Its main features are few parameters and high speed.

(2) Polynomial kernel function

K(x_i, x_j) = ((x_i · x_j) + c)^q    (4)

where c and q are kernel parameters satisfying c ≥ 0 and q ∈ N. This kernel yields a polynomial classifier of degree q, and the case c = 1 is a commonly used polynomial kernel. The polynomial kernel is a global kernel function: it has global characteristics and allows data points that are far apart to influence the kernel value. The larger q is, the higher the dimension of the mapping and the greater the computational cost. When q is too large, the VC dimension of the function set rises, the complexity of the learning machine increases, the generalization ability of the support vector machine decreases, and overfitting tends to occur.

(3) Gaussian kernel function (RBF kernel)

K(x_i, x_j) = exp(-||x_i - x_j||² / σ²)    (5)

where σ > 0 is the kernel parameter. The support vector machine obtained with the Gaussian kernel is a radial basis function learning machine in which the center of each basis function corresponds to a support vector. The RBF kernel is strongly local: with the form in (5), its value falls off faster away from the center as σ decreases, so its extrapolation ability weakens as σ becomes smaller. Compared with kernel functions in general, the Gaussian kernel requires only one parameter to be determined, so building the kernel model is relatively simple. The RBF kernel is therefore the most widely used kernel function at present.

(4) Sigmoid kernel function

where the kernel parameters include λ, which satisfies λ > 0. The support vector machine obtained with the Sigmoid kernel is a multilayer perceptron containing a hidden layer. The theoretical foundation of the support vector machine ensures that it ultimately finds a global optimum rather than a local minimum, and also ensures good generalization to unknown samples without overfitting.

(5) Fourier kernel function

where 0 < q < 1. Support vector machines based on the Fourier kernel are being applied increasingly widely.
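The kernels listed above can be sketched in a few lines of Python. The linear, polynomial, and Gaussian kernels follow formulas (3)-(5). The Sigmoid and Fourier formulas do not survive in the text, so the forms used here are the ones commonly given in the SVM literature and should be read as assumptions; in particular the Sigmoid offset parameter `offset` is a hypothetical name.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dot(x: Vector, y: Vector) -> float:
    """Inner product x · y."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x: Vector, y: Vector) -> float:
    """Formula (3): K = x · y."""
    return dot(x, y)


def polynomial_kernel(x: Vector, y: Vector, c: float = 1.0, q: int = 2) -> float:
    """Formula (4): K = ((x · y) + c)^q with c >= 0 and q a natural number."""
    return (dot(x, y) + c) ** q


def gaussian_kernel(x: Vector, y: Vector, sigma: float = 1.0) -> float:
    """Formula (5): K = exp(-||x - y||^2 / sigma^2) with sigma > 0."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / sigma ** 2)


def sigmoid_kernel(x: Vector, y: Vector, lam: float = 1.0, offset: float = 0.0) -> float:
    """Assumed standard form: K = tanh(lam * (x · y) + offset), lam > 0."""
    return math.tanh(lam * dot(x, y) + offset)


def fourier_kernel(x: Vector, y: Vector, q: float = 0.5) -> float:
    """Assumed standard form, taken as a product over coordinates:
    (1 - q^2) / (2 * (1 - 2q cos(x_d - y_d) + q^2)), with 0 < q < 1."""
    value = 1.0
    for a, b in zip(x, y):
        value *= (1 - q * q) / (2 * (1 - 2 * q * math.cos(a - b) + q * q))
    return value
```

All five functions share the signature `kernel(x, y, ...)`, which makes it straightforward to combine them with weights later on.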

At present, kernel functions fall into two main categories: global kernel functions and local kernel functions. A local kernel function is good at extracting the local properties of the samples. Its value is affected only by data points that are close together, and its interpolation ability is strong, so its learning ability is strong. Among the commonly used kernels, the Gaussian kernel and the Fourier kernel are local kernel functions. A global kernel function is good at extracting the global properties of the samples. Its value is influenced even by data points that are far apart, so its generalization ability is strong^[29]. Compared with a local kernel function, a global kernel function has weaker interpolation ability. Among the commonly used kernels, the linear kernel, the polynomial kernel, and the Sigmoid kernel are global kernel functions.

The limitations of a single kernel function make it hard for a support vector machine to reach high performance. Each kernel function has its own strengths and weaknesses and behaves differently, so the decision functions of support vector regression machines built on these kernels also perform very differently. The characteristics of real samples are complex and variable and cannot be fully described by a local kernel or a global kernel alone. When the sample features contain heterogeneous information, the data are irregular, or the spatial distribution is not flat, even a hybrid kernel formed as a weighted combination of one local and one global kernel cannot describe the real sample features accurately. The multiple kernel learning framework addresses these problems well. The invention therefore analyzes the characteristics of multi-kernel functions and converts the construction of the multi-kernel function and the selection of kernel parameters into a nonlinear filtering problem. The state parameters are solved with the high-performance fifth-degree cubature Kalman filter, which achieves adaptive adjustment of the fusion coefficients and estimation of the kernel parameters and the penalty parameter.

Kernel function characteristic analysis:

Multi-kernel function construction:

Work on kernel functions covers two main aspects: the construction of the kernel function and the selection of the kernel function model. Proper model selection is the key to improving the performance of the support vector regression machine. Model selection means determining, for a given batch of original sample data and before training, a kernel function that suits the characteristics of the data. Determining the kernel function involves two steps: first choose the type of kernel function, then choose its parameters. Current research also concentrates mainly on kernel model selection. Constructing a kernel function is often more important than selecting one, but because samples from different practical applications have different characteristics, constructing a good kernel function remains very difficult.

It is well known that the performance of a support vector machine depends on the choice of kernel function and its parameters. There are many types of kernel function, each with its own characteristics, advantages, and disadvantages. If only a single kernel function is used to solve a practical problem, the classification performance of the support vector machine often cannot be made optimal, which is a real limitation. In practice, beyond the common kernel functions mentioned above, a kernel function often has to be constructed for the problem at hand. Different kernel functions are therefore combined for learning in order to arrive at a suitable kernel. The multi-kernel function K_mk(x_i, x_j) composed of M kernel functions is expressed as

K_mk(x_i, x_j) = Σ_{s=1}^{M} p_s K_s(x_i, x_j)

where K_s(x_i, x_j) denotes the s-th kernel function and p_s denotes the weighting coefficient of the s-th kernel function.
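A weighted sum of M kernels can be sketched as a function that takes kernels and weights and returns a new kernel. This is an illustrative sketch: the helper name `multi_kernel`, the two example kernels, and their default parameters are assumptions made for the example, and the check on the weights only enforces the range 0 ≤ p_s ≤ 1.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def gaussian(x: Vector, y: Vector, sigma: float = 0.3) -> float:
    """Example local kernel: exp(-||x - y||^2 / sigma^2)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / sigma ** 2)


def polynomial(x: Vector, y: Vector, c: float = 1.0, q: int = 2) -> float:
    """Example global kernel: ((x · y) + c)^q."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** q


def multi_kernel(kernels: Sequence[Kernel], weights: Sequence[float]) -> Kernel:
    """Build K_mk(x, y) = sum over s of p_s * K_s(x, y)."""
    if len(kernels) != len(weights):
        raise ValueError("one weight is required per kernel")
    if any(p < 0.0 or p > 1.0 for p in weights):
        raise ValueError("each weight must lie in [0, 1]")

    def k_mk(x: Vector, y: Vector) -> float:
        return sum(p * k(x, y) for p, k in zip(weights, kernels))

    return k_mk
```

For example, `multi_kernel([gaussian, polynomial], [0.3, 0.7])` returns a kernel that blends local and global behavior, while the weights `[1.0, 0.0]` reduce it to the Gaussian kernel alone.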

Theorem 1: Consider the r-th local kernel function and the t-th global kernel function, and express the multi-kernel function K_mk(x_i, x_j) as their weighted fusion,

where m and n are the numbers of local and global kernel functions, and the fusion coefficients of the kernel functions satisfy 0 ≤ p₁, p₂, …, p_r, …, p_m ≤ 1 and 0 ≤ p₁, p₂, …, p_t, …, p_n ≤ 1. Then the multi-kernel function is still a Mercer kernel.

Proof: Because the component functions are local and global kernel functions, they all satisfy the Mercer condition^[21].

Since 0 ≤ p₁, p₂, …, p_r, …, p_m ≤ 1 and 0 ≤ p₁, p₂, …, p_t, …, p_n ≤ 1, each weighted term also satisfies the condition,

and hence so does their sum; that is, the multi-kernel function satisfies the Mercer condition.

Theory shows that any function satisfying the Mercer condition can be chosen as a kernel function, so the multi-kernel function, which satisfies the Mercer condition, can be chosen as a kernel function. The multi-kernel function is a convex combination of several local and global kernel functions. Its introduction compensates for the shortcomings of using a global kernel, a local kernel, or a hybrid kernel alone.
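The argument behind Theorem 1 can be written out as follows. This is a sketch under assumptions: it uses the usual integral form of the Mercer condition, and the symbols K_r^{loc}, K_t^{glo}, and g are illustrative choices rather than the notation of the original equations.

```latex
% Mercer condition, assumed to hold for every base kernel K:
\iint K(x, x')\, g(x)\, g(x')\, dx\, dx' \;\ge\; 0
\quad \text{for all } g \text{ with } \int g(x)^2\, dx < \infty .

% Multi-kernel function as a non-negative weighted sum:
K_{mk}(x, x') \;=\; \sum_{r=1}^{m} p_r\, K_r^{\mathrm{loc}}(x, x')
              \;+\; \sum_{t=1}^{n} p_t\, K_t^{\mathrm{glo}}(x, x'),
\qquad 0 \le p_r,\; p_t \le 1 .

% Linearity of the integral and non-negativity of the weights give:
\iint K_{mk}(x, x')\, g(x)\, g(x')\, dx\, dx'
  \;=\; \sum_{r=1}^{m} p_r \iint K_r^{\mathrm{loc}}\, g\, g
  \;+\; \sum_{t=1}^{n} p_t \iint K_t^{\mathrm{glo}}\, g\, g
  \;\ge\; 0 .
```

Only the non-negativity of the weights is needed for this step; the upper bound of 1 matters for reading the combination as convex.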

Clearly, when the fusion coefficient of one kernel function is 1 and the fusion coefficients of all the others are 0, the multi-kernel function degenerates into a single kernel function. Model selection for a support vector regression machine based on a single kernel only involves choosing the parameters inside that kernel. Model selection for a support vector regression machine with a multi-kernel function must choose the parameters inside the local kernel functions and the parameters inside the global kernel functions, and must also determine the fusion coefficients of all the kernel functions, so that the resulting support vector regression machine performs as well as possible. The support vector regression machine needs concrete values of the weighted fusion coefficients before it can train on the samples. The weighted fusion coefficients p_t and p_r in the multi-kernel function (9) are usually fixed in advance from experience, and because the combined kernel does not necessarily describe the real samples well, the regression prediction performance is reduced.
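The three groups of quantities named above (fusion coefficients, kernel-internal parameters, and the penalty parameter) can be gathered into one flat vector, following the ordering of the parameter state vector γ given in the summary of the invention. The sketch below is illustrative only: the class name `MultiKernelParams` and the helpers `pack` and `unpack` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MultiKernelParams:
    local_weights: List[float]               # fusion coefficients of the m local kernels
    global_weights: List[float]              # fusion coefficients of the n global kernels
    local_kernel_params: List[List[float]]   # internal parameters of each local kernel
    global_kernel_params: List[List[float]]  # internal parameters of each global kernel
    penalty: float                           # penalty parameter C


def pack(params: MultiKernelParams) -> List[float]:
    """Flatten to gamma = [local p, global p, local k, global k, C]."""
    gamma = list(params.local_weights) + list(params.global_weights)
    for k in params.local_kernel_params:
        gamma.extend(k)
    for k in params.global_kernel_params:
        gamma.extend(k)
    gamma.append(params.penalty)
    return gamma


def unpack(gamma: Sequence[float], local_sizes: Sequence[int],
           global_sizes: Sequence[int]) -> MultiKernelParams:
    """Inverse of pack; the size lists give the parameter count of each kernel."""
    m, n = len(local_sizes), len(global_sizes)
    expected = m + n + sum(local_sizes) + sum(global_sizes) + 1
    if len(gamma) != expected:
        raise ValueError(f"expected {expected} entries, got {len(gamma)}")
    pos = 0
    local_weights = list(gamma[pos:pos + m]); pos += m
    global_weights = list(gamma[pos:pos + n]); pos += n
    local_params = []
    for size in local_sizes:
        local_params.append(list(gamma[pos:pos + size])); pos += size
    global_params = []
    for size in global_sizes:
        global_params.append(list(gamma[pos:pos + size])); pos += size
    return MultiKernelParams(local_weights, global_weights,
                             local_params, global_params, gamma[pos])
```

With one Gaussian kernel (one parameter, σ) and one polynomial kernel (two parameters, c and q), the vector has six entries, and an estimator that works on flat vectors can update all of them together.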

Kernel function characteristic analysis:

The kernel function is the central factor that determines the behavior of a support vector machine, and different kernel functions exhibit different characteristics. It is therefore necessary to analyze the characteristics of kernel functions. Fig. 2 shows the characteristic curves of the Gaussian kernel for kernel parameter σ = 0.1, 0.2, 0.3, and 0.4, with the test point at 0.5. The figure shows that a local kernel function affects only the data within a small range around the test point: its value drops quickly toward zero away from the test point, so its extrapolation ability is weak. As σ changes, the range of influence of the kernel changes accordingly: with the form in (5), the larger σ is, the wider the range of influence, and its interpolation ability decreases as σ increases.

Fig. 3 shows the curves of the polynomial kernel for polynomial parameter q = 1, 2, 3, and 4, using the same test point, 0.5, for every degree. The figure shows that the polynomial kernel has a global effect on all the data points to be recognized: a global kernel function affects not only the data in a small range around the test point but also, to some extent, the data far from it.
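The contrast between Fig. 2 and Fig. 3 can be checked numerically with the test point 0.5 and the parameter values named in the text. The sketch below evaluates formulas (4) and (5) for scalar inputs; the helper names, the choice c = 1, and the sample grid are assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence

TEST_POINT = 0.5


def gaussian(x: float, sigma: float) -> float:
    """Formula (5) against the test point: exp(-(x - 0.5)^2 / sigma^2)."""
    return math.exp(-((x - TEST_POINT) ** 2) / sigma ** 2)


def polynomial(x: float, q: int, c: float = 1.0) -> float:
    """Formula (4) against the test point: (0.5 * x + c)^q."""
    return (x * TEST_POINT + c) ** q


def profile(kernel: Callable[[float, float], float], param: float,
            xs: Sequence[float]) -> List[float]:
    """Kernel values along a grid of inputs for one parameter value."""
    return [kernel(x, param) for x in xs]


if __name__ == "__main__":
    grid = [-1.0, 0.0, 0.5, 1.0, 2.0]
    for sigma in (0.1, 0.2, 0.3, 0.4):
        print("gaussian sigma =", sigma, profile(gaussian, sigma, grid))
    for q in (1, 2, 3, 4):
        print("polynomial q =", q, profile(polynomial, q, grid))
```

With formula (5) as written, the Gaussian value at an input far from 0.5 is negligible for every σ in this range and shrinks further as σ decreases, whereas the polynomial value at the same input stays well away from zero.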

Fig. 4 shows the characteristic curves of the hybrid kernel function, in which the local kernel is a Gaussian kernel and the global kernel is a polynomial kernel. Assigning four different sets of values to the weighting coefficients gives four hybrid kernel characteristic curves, with the test point at 0.5. The figure shows that the hybrid kernel combines the characteristics of the local and global kernels. It affects the data in a small range around the test point, still affects the data to the right, far from the test point, and quickly tends to 0 to the left, far from the test point, so it has little influence on the data on the left. The hybrid kernel therefore has a strong effect on data samples near the test point and also some effect on data far from it.

Fig. 5 shows the characteristic curves of a three-kernel function composed of two local kernel functions and one global kernel function. The local kernels are a Gaussian kernel and a Fourier kernel, the global kernel is a polynomial kernel, and the test point is 0.5. Different values of the weighting coefficients of the three kernels give the characteristic curves shown. Because it contains two local kernels, this three-kernel function has a strong effect on the data near the test point: its local characteristics are pronounced and its learning ability is strong, while it also affects data far from the test point.

Fig. 6 shows the characteristic curves of a three-kernel function composed of one local kernel and two global kernels. The local kernel is a Gaussian kernel, the global kernels are a polynomial kernel and a Sigmoid kernel, and the test point is 0.5. The curves are obtained by assigning four different sets of values to the weighting coefficients of the three kernels. The figure shows that this three-kernel function has some influence on data near the test point and a larger influence on data far from it, with pronounced global characteristics and strong generalization ability.

In summary, a local kernel has strong learning ability but weak generalization ability, whereas a global kernel has strong generalization ability but weak learning ability. A mixed kernel has both local and global characteristics and therefore offers a degree of both learning and generalization ability. The two-local, one-global three-kernel function has pronounced local characteristics together with global capability, giving strong fitting ability. The one-local, two-global three-kernel function has pronounced global characteristics together with local capability, giving strong extrapolation ability. Multi-kernel functions obtained by combining different kernels thus have richer characteristic curves and can describe the characteristics of real samples relatively accurately. Different multi-kernel functions behave differently, and their behaviour depends closely on the weighting coefficients of the constituent kernels. How to choose these weighting coefficients for a specific application is a difficult problem.
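The dependence on the weighting coefficients can be illustrated with a generic weighted sum of base kernels. This is a minimal sketch: the kernel parameter values, the two weight vectors and the helper name `multi_kernel` are assumptions chosen for illustration, and the Gaussian, polynomial and Sigmoid kernels are used as in the Fig. 6 setup.

```python
import math

def gaussian(x, z, sigma=0.1):
    """Local kernel; sigma is an assumed width."""
    return math.exp(-((x - z) ** 2) / (2.0 * sigma ** 2))

def polynomial(x, z, degree=2):
    """Global kernel; degree is an assumed value."""
    return (x * z + 1.0) ** degree

def sigmoid(x, z, a=1.0, c=0.0):
    """Global kernel tanh(a*x*z + c); a and c are assumed values."""
    return math.tanh(a * x * z + c)

def multi_kernel(x, z, kernels, weights):
    """Weighted sum of base kernels; weights act as fusion coefficients."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    return sum(w * k(x, z) for k, w in zip(kernels, weights))

kernels = [gaussian, polynomial, sigmoid]
settings = {"local-heavy": [0.8, 0.1, 0.1], "global-heavy": [0.2, 0.4, 0.4]}
for name, weights in settings.items():
    near = multi_kernel(0.5, 0.5, kernels, weights)  # at the test point
    far = multi_kernel(1.0, 0.5, kernels, weights)   # far from the test point
    print(name, round(near, 4), round(far, 4))
```

With the local-heavy weights the value at the test point dominates; with the global-heavy weights the value far from the test point is the larger one, so the same three kernels give qualitatively different characteristics depending only on the weights.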

Parameter selection method for support vector regression based on adaptive fusion of multi-kernel functions

Multi-kernel support vector machine

The ultimate goal of support vector regression (SVR) is to find a regression function f: R^D → R such that

f(x) = w·φ(x) + b

where φ(x) is a function built from the multi-kernel function that maps the data x from the low-dimensional input space to a high-dimensional feature space, w is a weight vector, and b is an offset that shifts the function up or down. Standard SVR uses the ε-insensitive loss function and assumes that all training data can be fitted by a linear function with accuracy ε. The problem is then transformed into minimizing the following objective function^[17]:

min_{w,b,ξ,ξ*} (1/2)‖w‖² + C Σ_{i=1}^{N} (ξ_i + ξ_i*)
s.t. (w·x_i) + b − y_i ≤ ε + ξ_i, y_i − (w·x_i) − b ≤ ε + ξ_i*, ξ_i ≥ 0, ξ_i* ≥ 0, i = 1,...,N (13)

Here ξ_i and ξ_i* are slack variables: they are greater than 0 when the fit has an error and equal to 0 when it does not. The first term of the objective makes the fitted function flatter and thereby improves generalization; the second term reduces the error; the constant C > 0 sets the degree of penalty for samples whose error exceeds ε. The performance of the support vector machine depends on the error penalty parameter C, which penalizes misclassified samples and trades off algorithm complexity against sample misclassification, so an appropriate penalty coefficient C must be chosen for each practical problem.
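The roles of the two terms and of C can be made concrete by evaluating the objective for a fixed linear model. The sketch below relies on the fact that, at the optimum, ξ_i + ξ_i* equals the ε-insensitive loss max(0, |y_i − f(x_i)| − ε); the function names and the toy data are hypothetical.

```python
def eps_insensitive(residual, eps):
    """Loss is zero inside the eps-tube and grows linearly outside it."""
    return max(0.0, abs(residual) - eps)

def svr_objective(w, b, xs, ys, C, eps):
    """0.5*||w||^2 + C * sum of eps-insensitive losses for a linear model."""
    flatness = 0.5 * sum(wi * wi for wi in w)
    loss = 0.0
    for x, y in zip(xs, ys):
        pred = sum(wi * xi for wi, xi in zip(w, x)) + b
        loss += eps_insensitive(y - pred, eps)
    return flatness + C * loss

# Toy data: only the third sample lies outside the eps-tube of f(x) = x.
xs = [[0.0], [1.0], [2.0]]
ys = [0.05, 1.0, 2.5]
for C in (1.0, 10.0):
    print(C, svr_objective([1.0], 0.0, xs, ys, C=C, eps=0.1))
```

Raising C increases the cost of the one sample that falls outside the tube while leaving the flatness term unchanged, which is the trade-off the paragraph above describes.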

To solve the convex quadratic optimization problem (13), Lagrange multipliers α_i and α_i* are introduced, and the primal SVR problem (13) is converted into the following dual form:

min_{α,α*} (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} (α_i* − α_i)(α_j* − α_j)(x_i·x_j) + ε Σ_{i=1}^{N} (α_i* + α_i) − Σ_{i=1}^{N} y_i (α_i* − α_i)
s.t. Σ_{i=1}^{N} (α_i − α_i*) = 0, 0 ≤ α_i ≤ C, 0 ≤ α_i* ≤ C, i = 1,...,N (14)

Solving this dual problem yields the multipliers α_i and α_i*, and with them the solution of the primal problem, from which the decision function is constructed. Replacing the inner product (x_i·x_j) in the objective function (14) with the mixed kernel function K_mk(x_i, x_j) gives the decision function:

f(x) = Σ_{i=1}^{N} (α_i* − α_i) K_mk(x_i, x) + b (15)

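where b is computed as follows: choose a multiplier α_j or α_k* that lies in the open interval (0, C). If α_j is chosen, then

b = y_j − Σ_{i=1}^{N} (α_i* − α_i) K_mk(x_i, x_j) + ε

If α_k* is chosen, then

b = y_k − Σ_{i=1}^{N} (α_i* − α_i) K_mk(x_i, x_k) − ε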
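The offset computation can be checked on a two-sample problem that is small enough to solve by hand. The sketch assumes the convention that α_i is the multiplier of the constraint f(x_i) − y_i ≤ ε + ξ_i and α_i* that of y_i − f(x_i) ≤ ε + ξ_i*, and it uses a linear kernel in place of K_mk. For the samples (0, 0) and (1, 1) with ε = 0.1 and C = 1, the hand-derived optimum is w = 0.8, b = 0.1, with α = (0.8, 0) and α* = (0, 0.8). All names are illustrative.

```python
def linear_kernel(x, z):
    """Stand-in for K_mk on scalar inputs."""
    return x * z

def expansion(x, xs, alpha, alpha_star, kernel):
    """Sum over i of (alpha_star_i - alpha_i) * K(x_i, x)."""
    return sum((a_s - a) * kernel(xi, x)
               for xi, a, a_s in zip(xs, alpha, alpha_star))

def offset(idx, starred, xs, ys, alpha, alpha_star, eps, kernel):
    """Offset b from one multiplier lying strictly inside (0, C)."""
    s = expansion(xs[idx], xs, alpha, alpha_star, kernel)
    return ys[idx] - s - eps if starred else ys[idx] - s + eps

def decision(x, xs, alpha, alpha_star, b, kernel):
    """Decision function: kernel expansion plus offset."""
    return expansion(x, xs, alpha, alpha_star, kernel) + b

xs, ys = [0.0, 1.0], [0.0, 1.0]
alpha, alpha_star = [0.8, 0.0], [0.0, 0.8]  # hand-derived for C = 1
eps = 0.1
b_from_alpha = offset(0, False, xs, ys, alpha, alpha_star, eps, linear_kernel)
b_from_star = offset(1, True, xs, ys, alpha, alpha_star, eps, linear_kernel)
print(b_from_alpha, b_from_star)
print(decision(0.5, xs, alpha, alpha_star, b_from_alpha, linear_kernel))
```

Both choices of multiplier give the same offset up to rounding, and the resulting function passes through the tube boundaries at the two samples, as expected for multipliers strictly between 0 and C.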

Parameter selection method for the support vector machine model based on adaptive fusion of multi-kernel functions

A multi-kernel function formed by the weighted fusion of several kernels is itself a new kernel function, so all of its weighted fusion coefficients can also be regarded as kernel parameters. The fusion coefficients p_t and p_r, the internal parameters of the local kernels, the internal parameters of the global kernels and the penalty parameter C can therefore be combined into a single parameter vector γ of the support vector machine, and the whole parameter selection problem can be treated as a filtering (state estimation) problem for a nonlinear dynamic system. Let k_t denote all kernel parameters of the t-th local kernel and k_r all kernel parameters of the r-th global kernel. The parameter state vector is then γ = [p₁,...,p_t,...,p_m, p₁,...,p_r,...,p_n, k₁,...,k_t,...,k_m, k₁,...,k_r,...,k_n, C]^T. First, the following nonlinear parameter system is established:

γ(k) = γ(k-1) + w(k) (18)

y(k) = h(γ(k)) + v(k) (19)

where γ(k) is the n-dimensional parameter state vector, y(k) is the observation output, and the process noise w(k) and observation noise v(k) are zero-mean Gaussian white noise with covariances Q and R, respectively.
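The layout of the parameter state vector γ (fusion coefficients of the m local kernels, fusion coefficients of the n global kernels, the kernel parameters of each kernel, then C) can be sketched as a pair of pack and unpack helpers. The helper names and the example values are hypothetical; only the ordering follows the definition of γ.

```python
def pack_state(local_w, global_w, local_params, global_params, C):
    """Flatten to [p_local..., p_global..., k_local..., k_global..., C]."""
    gamma = list(local_w) + list(global_w)
    for params in list(local_params) + list(global_params):
        gamma.extend(params)
    gamma.append(C)
    return gamma

def unpack_state(gamma, local_sizes, global_sizes):
    """Inverse of pack_state; *_sizes give the parameter count per kernel."""
    m, n = len(local_sizes), len(global_sizes)
    local_w = gamma[:m]
    global_w = gamma[m:m + n]
    pos = m + n
    local_params, global_params = [], []
    for size in local_sizes:
        local_params.append(gamma[pos:pos + size])
        pos += size
    for size in global_sizes:
        global_params.append(gamma[pos:pos + size])
        pos += size
    return local_w, global_w, local_params, global_params, gamma[pos]

# One Gaussian kernel (width), one polynomial (degree), one Sigmoid (a, c).
gamma = pack_state([0.6], [0.3, 0.1], [[0.1]], [[2.0], [1.0, 0.0]], 10.0)
print(gamma)
print(unpack_state(gamma, local_sizes=[1], global_sizes=[1, 2]))
```

Keeping every tunable quantity in one flat vector is what allows a single filter to estimate the weights, the kernel parameters and C jointly.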

Because the optimal parameters being sought can be regarded as fixed, the linear state equation (18) can be established for the parameters. Furthermore, for any state vector γ(k), every original data sample has a predicted output after training and prediction with LIBSVM^[22], so the nonlinear observation equation (19) can be established. To run the fifth-order CKF algorithm, artificial process white noise and observation white noise must be added to the system model.

Suppose the original sample set of the SVR is D = {(x_i, y_i) | i ∈ I}, where the index set is I = {1,2,...,N} and y_i is the target value of the data. Using k-fold cross-validation, the sample data are divided into k groups, namely

D_j = {(x_i, y_i) | i ∈ I_j} (20)

where j ∈ {1,2,...,k}, the index sets I_j of all groups satisfy I₁∪I₂∪…∪I_k = I, and the data sets D_j of all groups satisfy D₁∪D₂∪…∪D_k = D. In each iteration of support vector regression, one group D_p is used for prediction and the remaining k-1 groups serve as the training set. Given the initial parameter γ₀, the SVR is trained with LIBSVM. Let the training result be the multipliers α_i, α_i* and the offset b; the decision function is then

f(x) = Σ_{i∈I\I_p} (α_i* − α_i) K_mk(x_i, x) + b

where the sum runs over the indices of the training samples.

Substituting the data group D_p into formula (15) gives the predicted output values of D_p:

ŷ_i = f(x_i), i ∈ I_p
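The partition of the index set into k groups, with one group held out for prediction in turn, can be sketched as follows. The interleaved assignment and the function name are assumptions (the source does not state how samples are assigned to groups), and the code uses 0-based indices rather than the 1-based index set I.

```python
def k_fold_indices(n_samples, k):
    """Split {0,...,n_samples-1} into k disjoint, nearly equal index sets."""
    if not 1 <= k <= n_samples:
        raise ValueError("k must satisfy 1 <= k <= n_samples")
    folds = [[] for _ in range(k)]
    for i in range(n_samples):
        folds[i % k].append(i)  # assumed interleaved assignment
    return folds

folds = k_fold_indices(10, 3)
for p, test_idx in enumerate(folds):
    # Training indices are the union of all groups except group p.
    train_idx = [i for q, fold in enumerate(folds) if q != p for i in fold]
    print(p, test_idx, len(train_idx))
```

Because the groups are disjoint and cover every index, each sample is predicted exactly once over the k rounds.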

Each data group D_i, i ∈ {1,2,...,k}, is used in turn as the prediction group, with the remaining groups D₁,...,D_{i-1},D_{i+1},...,D_k as the SVR training groups. After k-fold cross-validated regression prediction, every sample in the data set D has exactly one predicted output value. For a parameter vector γ, the following prediction output function can therefore be defined:

h(γ) = [ŷ₁, ŷ₂, ..., ŷ_N]^T

where ŷ_i is the cross-validated predicted output for sample x_i under the parameters γ.
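The prediction output function can be sketched as a routine that returns one out-of-fold prediction per sample for a given parameter vector. To keep the example self-contained, LIBSVM is replaced by a kernel-weighted average of the training targets, which is a much simpler stand-in and not the patent's regressor. The two-element γ = (rho, sigma), the mixed kernel form and the toy data are likewise assumptions.

```python
import math

def mixed_kernel(x, z, rho, sigma):
    """Assumed mix of a Gaussian kernel and a degree-2 polynomial kernel."""
    local = math.exp(-((x - z) ** 2) / (2.0 * sigma ** 2))
    global_part = (x * z + 1.0) ** 2
    return rho * local + (1.0 - rho) * global_part

def fit_predict(train, test_x, gamma):
    """Stand-in for LIBSVM: kernel-weighted average of training targets."""
    rho, sigma = gamma
    preds = []
    for x in test_x:
        weights = [mixed_kernel(xi, x, rho, sigma) for xi, _ in train]
        total = sum(w * yi for w, (_, yi) in zip(weights, train))
        preds.append(total / sum(weights))
    return preds

def cv_predictions(data, gamma, k):
    """h(gamma): one out-of-fold prediction per sample, in sample order."""
    out = [None] * len(data)
    for p in range(k):
        test_idx = [i for i in range(len(data)) if i % k == p]
        train = [data[i] for i in range(len(data)) if i % k != p]
        preds = fit_predict(train, [data[i][0] for i in test_idx], gamma)
        for i, y_hat in zip(test_idx, preds):
            out[i] = y_hat
    return out

data = [(i / 10.0, math.sin(i / 10.0)) for i in range(20)]
h = cv_predictions(data, gamma=(0.9, 0.2), k=5)
mse = sum((y_hat - y) ** 2 for y_hat, (_, y) in zip(h, data)) / len(data)
print(len(h), mse)
```

The returned list plays the role of the observation h(γ) in (19), and comparing it with the true targets gives the error that the filter tries to reduce.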

For the nonlinear system model (18)-(19), the data set D is therefore trained with k sub-LIBSVM runs based on the multi-kernel function and k-fold cross-validation, and the predicted outputs are fed into the fifth-order cubature Kalman filter for parameter state estimation. The complete SVR model parameter selection algorithm based on adaptive fusion of multi-kernel functions likewise consists of two processes, the time update and the measurement update:

Time update:

This step is the predictive update of the state, and the state equation is linear and known. The time update of the SVR model parameter selection algorithm can therefore be carried out with the time-update formulas (5)-(7) of the fifth-order cubature Kalman filter and the cubature rule (46).
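For the random-walk state equation (18) the time update has a simple closed form, because a cubature rule integrates linear functions and their second moments exactly: the predicted mean equals the previous estimate and the predicted covariance is the previous covariance plus Q. The sketch below writes this shortcut directly instead of propagating cubature points; the function name and the numbers are illustrative.

```python
def time_update(gamma_est, P_est, Q):
    """Predict step for gamma(k) = gamma(k-1) + w(k), with w ~ N(0, Q)."""
    gamma_pred = list(gamma_est)  # identity transition keeps the mean
    P_pred = [[p + q for p, q in zip(p_row, q_row)]
              for p_row, q_row in zip(P_est, Q)]  # covariance grows by Q
    return gamma_pred, P_pred

gamma_est = [0.7, 0.3, 10.0]
P_est = [[0.04, 0.0, 0.0], [0.0, 0.04, 0.0], [0.0, 0.0, 1.0]]
Q = [[0.001, 0.0, 0.0], [0.0, 0.001, 0.0], [0.0, 0.0, 0.01]]
gamma_pred, P_pred = time_update(gamma_est, P_est, Q)
print(gamma_pred)
print(P_pred)
```

The artificial process noise Q keeps the covariance from collapsing, so the filter continues to adjust the parameters in later iterations.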

Measurement update:

The parameter state vector γ(k) is substituted into the SVR parameters, LIBSVM is used to train on the data set, and the output is then predicted. The prediction output function (17) is then embedded into the measurement-update formulas (8)-(14) of the fifth-order cubature Kalman filter and the cubature rule (46) to estimate each parameter. The specific steps of the SVR model parameter selection algorithm based on adaptive fusion of multi-kernel functions are shown in Table 1.

Table 1. Specific steps of the support vector machine model parameter selection algorithm based on adaptive fusion of mixed kernel functions

The SVR model parameter selection algorithm based on adaptive fusion of multi-kernel functions takes all fusion coefficients of the multi-kernel function, together with the kernel parameters and the penalty parameter C, as the parameter state vector. It then uses k-fold cross-validation with LIBSVM to produce predicted outputs for the data set, and finally uses the fifth-order CKF algorithm to compute the optimal parameter state vector iteratively. In effect, the whole algorithm iteratively searches for the state vector γ that minimizes the error variance between the true sample target values y(k) and the predicted outputs of the SVR.
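The structure of the measurement update can be sketched with a cubature filter of lower order. The code below uses the third-degree spherical-radial rule (2n equally weighted points) rather than the fifth-order rule of the patent, and a scalar observation instead of the full vector of predicted outputs; both are simplifications made to keep the example short. The observation function `h` is a hypothetical linear stand-in for the cross-validated SVR output.

```python
import math

def cholesky(A):
    """Lower-triangular L with L * L^T = A, for a positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def cubature_points(mean, P):
    """2n third-degree points: mean +/- sqrt(n) * columns of chol(P)."""
    n = len(mean)
    L = cholesky(P)
    pts = []
    for j in range(n):
        col = [math.sqrt(n) * L[i][j] for i in range(n)]
        pts.append([m + c for m, c in zip(mean, col)])
        pts.append([m - c for m, c in zip(mean, col)])
    return pts

def measurement_update(mean, P, y, h, R):
    """Correct (mean, P) with a scalar observation y = h(state) + noise."""
    n = len(mean)
    pts = cubature_points(mean, P)
    zs = [h(p) for p in pts]
    w = 1.0 / len(pts)
    z_pred = w * sum(zs)
    Pzz = w * sum((z - z_pred) ** 2 for z in zs) + R  # innovation variance
    Pxz = [w * sum((p[i] - mean[i]) * (z - z_pred) for p, z in zip(pts, zs))
           for i in range(n)]                         # cross-covariance
    K = [c / Pzz for c in Pxz]                        # Kalman gain
    new_mean = [m + k * (y - z_pred) for m, k in zip(mean, K)]
    new_P = [[P[i][j] - K[i] * Pzz * K[j] for j in range(n)] for i in range(n)]
    return new_mean, new_P

h = lambda g: g[0] + 2.0 * g[1]  # hypothetical linear observation
new_mean, new_P = measurement_update([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                                     y=5.0, h=h, R=1.0)
print(new_mean)
print(new_P)
```

For a linear observation the result coincides with the ordinary Kalman filter update, which gives a convenient correctness check; in the parameter selection setting `h` would instead train and predict with the SVR for each cubature point.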

Conclusion

The invention proposes a parameter selection method for support vector regression based on adaptive fusion of multi-kernel functions. It analyzes in detail the characteristics of local kernels, global kernels and multi-kernel functions and explains why combining different kernels is necessary. All fusion coefficients, kernel parameters and regression parameters of the multi-kernel function are combined into a parameter state vector; predicted outputs for the original data set are produced with LIBSVM; and a fifth-order cubature Kalman filter adaptively adjusts the fusion coefficients and estimates the kernel and regression parameters. Finally, experiments on the PE process show that the kernel function and parameters constructed with the proposed method give the support vector machine stronger generalization ability and higher fault classification accuracy.

The basic principles, main features and advantages of the invention have been shown and described above. Those skilled in the art should understand that the invention is not limited by the above embodiments; the embodiments and the description only illustrate the principles of the invention. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection of the invention is defined by the appended claims and their equivalents.