CN106503867A

Info

Publication number: CN106503867A
Application number: CN201611003309.1A
Authority: CN
Inventors: 丛玉良; 高超; 丁连根; 刘葳汉; 张利平; 周劲
Original assignee: Jilin University
Current assignee: Jilin University
Priority date: 2016-11-14
Filing date: 2016-11-14
Publication date: 2017-03-15

Abstract

The invention discloses a genetic-algorithm least-squares wind power prediction method. Measured wind speed data are used to build a genetic-algorithm least squares support vector machine (GA-LSSVM) prediction model. The input and output variables used for modeling are determined; the raw data are normalized, covering the data used for genetic-algorithm parameter optimization and the sample data used to train and test the least squares support vector machine (LSSVM) prediction model; the parameters of the genetic algorithm and of the LSSVM prediction model are initialized and the model is trained; optimized LSSVM model parameters are obtained through multi-generation evolution of the genetic algorithm, and the LSSVM prediction model is built; the LSSVM prediction model is then used to make short-term wind speed predictions for the test samples. The invention uses the genetic algorithm to optimize the parameters of the LSSVM model and establishes a GA-LSSVM-based wind speed prediction model that can predict the data accurately.

Description

A Genetic Algorithm Least Squares Wind Power Prediction Method

Technical field

The invention belongs to the technical field of wind power, and in particular relates to a genetic-algorithm least-squares wind power prediction method.

Background

Wind power prediction, or wind farm power prediction (WPP; some Chinese professional journals also call it Wind Energy Prediction), refers to predicting the power generated by the wind turbines of a wind farm.

Most existing wind power forecasting techniques apply corrections to numerical weather prediction; they make insufficient use of historical data and depend heavily on the accuracy of the numerical weather prediction.

Summary of the invention

The purpose of the present invention is to propose a method for predicting ultra-short-term wind power. A genetic algorithm is used to optimize the parameters of the least squares support vector machine (LSSVM) model, and a wind speed prediction model based on GA-LSSVM (genetic-algorithm least squares support vector machine) is established. The model predicts the data accurately and performs well in predicting wind farm wind speed.

The purpose of the present invention is achieved through the following technical solution:

A genetic-algorithm least-squares wind power prediction method, in which measured wind speed data are used to build a GA-LSSVM prediction model, with the following specific steps:

Step 1. Determine the input and output variables used for modeling: one wind speed value is collected every 10 minutes; all data from one day form one group, and six groups form one cycle; the wind speeds of the first 5 consecutive days are used as training samples and the following day as the test sample.

Step 2. Normalize the raw data, covering the data used for genetic-algorithm parameter optimization and the sample data used to train and test the LSSVM prediction model.

Step 3. Initialize the parameters of the genetic algorithm and of the LSSVM prediction model: the collected data are used with binary encoding to generate the first-generation population, that is, the initial LSSVM models; the models are then trained, optimized LSSVM model parameters are obtained through multi-generation evolution of the genetic algorithm, and the LSSVM prediction model is built.

Step 4. Use the LSSVM prediction model obtained in Step 3 to make a short-term wind speed prediction for the test samples.

Step 5. Verify the accuracy of the results with the chosen fitness function. If the accuracy does not meet the requirement, reset the genetic algorithm parameters and return to Step 3 for retraining.

The invention uses a genetic algorithm to optimize the parameters of the LSSVM model and establishes a GA-LSSVM-based wind speed prediction model. Simulation analysis shows that the GA-LSSVM model predicts better than the earlier RBFNN model, and the comparison of accuracy and error shows that the GA-LSSVM model is effective and feasible and can predict the data accurately. It also shows that machine learning algorithms perform well in predicting wind farm wind speed.

Description of drawings

Figure 1 is a flowchart of the method of the present invention.

Figure 2 is a basic schematic diagram of the support vector machine.

Figure 3 shows the transformation of support vector machine vectors by the kernel function.

Figure 4 is a two-dimensional example of the support vector machine.

Figure 5 shows the insensitive loss function and slack variables of the support vector machine.

Figure 6 is a schematic diagram of error handling in the LSSVM model.

Figure 7 is a flowchart of genetic algorithm optimization.

Figure 8 compares the wind farm wind speed prediction results of the GA-LSSVM model with the measured values.

Figure 9 shows the relative error of the wind farm wind speed prediction results of the GA-LSSVM model.

Figure 10 compares the wind farm wind speed prediction results of the RBFNN model with the measured values.

Figure 11 shows the relative error of the wind farm wind speed prediction results of the RBFNN model.

Figure 12 compares the wind farm wind speed prediction results of the GA-LSSVM model for another wind measurement tower.

Figure 13 shows the error of the wind farm wind speed prediction results of the GA-LSSVM model for another wind measurement tower.

Detailed description of the embodiments

The technical solution of the present invention is described in detail below with reference to the drawings:

Background principles

Support Vector Machine (SVM) is a machine learning method that obtains a prediction model by training on sample data. Its basic advantage over other prediction methods is structural risk minimization, and it is well suited to small-sample, high-dimensional data. Because of these advantages, many researchers are studying applications of SVM, and the method has developed rapidly. SVM can be used for prediction; the prediction accuracy depends on the regularization constant C (which represents the degree to which errors are penalized) and the slack variables ξ_t. C and ξ_t change with the input data and directly affect the prediction accuracy, so how to influence the values of C and ξ_t is the direction in which various algorithms seek improvement. To make SVM cheaper to compute, researchers introduced the least squares principle into the support vector machine, producing the least squares support vector machine (Least Squares Support Vector Machine, LSSVM). LSSVM is a least squares formulation of the standard SVM. It has two characteristics. First, the inequality constraints of the SVM are transformed, through theoretical derivation, into equality constraints; the derivation is given below. Second, when searching for support vectors, it considers only the non-zero information. With these two changes, the computation of LSSVM is simplified and its efficiency is greatly improved.

Rooted in statistical learning theory, SVM was initially used only for classification. By solving a convex quadratic optimization problem, SVM can also be applied to the regression of nonlinear functions. Figure 2 is a basic schematic diagram of a support vector machine. In form it closely resembles a feed-forward neural network, with an input layer and an output layer; the difference is that SVM replaces the neurons of the neural network with kernel functions. The two also work differently: a three-layer feed-forward ANN uses two transfer functions, on the input and output sides, whereas SVM has a single kernel function whose role is to map the low-dimensional input vector into a higher-dimensional (sometimes infinite-dimensional) vector space. The mapping of a vector into the high-dimensional space by the kernel function is shown in Figure 3. After the kernel transformation, the SVM can use an optimization algorithm such as quadratic programming to perform regression or classification.

As shown in Figure 4, the vectors in the gray boxes are the support vectors; the figure illustrates, in two-dimensional space, the difference between the vectors linearly separated by the SVM and its support vectors. The support vectors are the vectors that determine the minimum safe distance. As the figure also shows, the support vectors determine the optimal classification margin.

Suppose {(X_t, y_t)} (t = 1, 2, …, n) is a given data set, where X_t = (x_t1, x_t2, …, x_tk) is an input vector with k variables and y_t is the corresponding output at time t. The regression function can be defined as:
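Equation (1.1) is not reproduced in this text. In the standard support vector regression formulation, which matches the symbols W, b and X_t used here, it would read as follows; the name φ for the feature map is an assumption:

$$y = f(X) = \langle W, \varphi(X) \rangle + b \tag{1.1}$$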

where <·,·> denotes the dot product, W is the weight vector, b is the bias, and φ(·) is the mapping function that transforms the input vector X_t into the high-dimensional feature space. The corresponding optimization problem is then:
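Equation (1.2) is likewise not reproduced here. As a sketch, the standard ε-insensitive SVR primal problem, written with the symbols C, ξ_t and ξ_t* that the text defines (φ is an assumed name for the feature map), is:

$$\min_{W,\, b,\, \xi,\, \xi^*} \ \frac{1}{2}\lVert W \rVert^2 + C \sum_{t=1}^{n} \left(\xi_t + \xi_t^*\right) \tag{1.2}$$

$$\text{s.t.}\quad y_t - \langle W, \varphi(X_t) \rangle - b \le \varepsilon + \xi_t,\qquad \langle W, \varphi(X_t) \rangle + b - y_t \le \varepsilon + \xi_t^*,\qquad \xi_t,\ \xi_t^* \ge 0$$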

where C is the regularization constant, which represents the degree to which errors are penalized, and ξ_t and ξ_t* are slack variables that measure the error of the target value above and below the training point, as shown in Figure 5. The ε-insensitive loss function of width 2ε is defined as:
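The definition itself, Equation (1.3), did not survive in this text. The usual form of the ε-insensitive loss, given here as an assumption consistent with the description of a tube of width 2ε, is:

$$\lvert y - f(X) \rvert_{\varepsilon} = \max\{0,\ \lvert y - f(X) \rvert - \varepsilon\} \tag{1.3}$$

Errors smaller than ε cost nothing; larger errors are penalized linearly, which is what the slack variables ξ_t and ξ_t* measure.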

By introducing Lagrange multipliers, the problem in Equation (1.2) becomes:

where α_t, α_t*, η_t and η_t* are the Lagrange multipliers. Taking the partial derivatives of the Lagrangian with respect to W, b, ξ_t and ξ_t* and setting them to zero gives:

Substituting Equation (1.5) into the original Lagrangian (1.4) transforms the optimization problem into:

According to the KKT (Karush-Kuhn-Tucker) conditions, the terms containing the Lagrange multipliers vanish at the optimal solution, which means the following equations hold:

Equation (1.7) shows that samples whose Lagrange multipliers are zero are non-support vectors, while samples with non-zero coefficients are support vectors. Moreover, when the slack variables ξ_t and ξ_t* are zero, the value of b can be obtained:

Finally, the nonlinear function estimate of the SVM can be written as:

Equation (1.9) can be rewritten as:

where K(X, X_t) is called the kernel function. Commonly used kernel functions are given in Equations (1.11) to (1.14).

(1) Linear kernel:

$$K(X, X_t) = \langle X, X_t \rangle \tag{1.11}$$

(2) Polynomial kernel:

$$K(X, X_t) = \left(\langle X, X_t \rangle + p\right)^d,\quad d \in \mathbb{N},\ p > 0 \tag{1.12}$$

(3) Gaussian kernel:
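The Gaussian kernel formula, Equation (1.13), is missing from this text. A common form is shown below as an assumption; σ is the kernel width that the method later optimizes, and some references scale the exponent by σ² rather than 2σ²:

$$K(X, X_t) = \exp\!\left(-\frac{\lVert X - X_t \rVert^2}{2\sigma^2}\right) \tag{1.13}$$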

(4) Sigmoid kernel:

$$K(X, X_t) = \tanh\left(c \langle X, X_t \rangle + d\right),\quad c > 0,\ d > 0 \tag{1.14}$$

The Gaussian kernel is one of the most powerful kernels for nonlinear function estimation. For SVM classification [39, 40], the decision function (1.10) outputs a classification result rather than a regression value. A multi-class classification problem can be viewed as a regression with several predetermined thresholds. Because of the limitations of its optimization algorithm, SVM also has some shortcomings in classification and regression.

LSSVM has the same structure as SVM: an input layer and an output layer carrying one or more input/output values, and a hidden layer containing the kernel that maps low-dimensional input data to a high-dimensional space. The kernel turns an input vector into a pseudo high-dimensional vector in the feature space, which provides both the separability of a high-dimensional space and the computability of a low-dimensional one. After this transformation, however, LSSVM works differently from SVM. Instead of the inequality constraints of the standard SVM, LSSVM is based on equality constraints, which turns the original quadratic programming problem into a set of linear equations, the linear KKT system. LSSVM regression was proposed after 2002; its idea is to treat all training data as support vectors.

From Equation (1.1), the corresponding optimization problem for LSSVM is:
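Equation (1.15) is not reproduced in this text. The standard LSSVM regression problem, written with the error variables e_t and the constant γ that the text defines (φ is an assumed name for the feature map, and the factor 1/2 on γ is the usual convention), is:

$$\min_{W,\, b,\, e}\ \frac{1}{2}\lVert W \rVert^2 + \frac{\gamma}{2} \sum_{t=1}^{n} e_t^2 \quad \text{s.t.}\quad y_t = \langle W, \varphi(X_t) \rangle + b + e_t,\quad t = 1, \dots, n \tag{1.15}$$

The inequality constraints of Equation (1.2) are replaced by equalities, and the slack variables by squared errors.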

where e_t is the error variable at time t and γ is an adjustable constant, similar to C in SVM.

The Lagrangian is:

where α_t are the Lagrange multipliers. The partial derivatives of the Lagrangian with respect to the variables W, b, e_t and α_t are given by Equation (1.17).

Substituting using Equation (1.17) gives the equivalent formulation:

Equation (1.18) can also be written as:
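The matrix form, Equation (1.19), is missing here. For the standard LSSVM formulation with a penalty term (γ/2)·Σe_t², eliminating W and e_t leaves the linear system below. The names Ω (kernel matrix) and I (identity matrix) are assumptions; Y, 1_N and α are the vectors the text defines:

$$\begin{bmatrix} 0 & 1_N^{T} \\ 1_N & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ Y \end{bmatrix},\qquad \Omega_{ij} = K(X_i, X_j) \tag{1.19}$$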

where Y = [y_1, …, y_n]^T, 1_N = [1, …, 1]^T and α = [α_1, α_2, …, α_n]^T.

This gives the following LSSVM prediction model:
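Equation (1.20) is not reproduced in this text. In the standard LSSVM formulation, which matches the symbols α, b and K(X, X_t) used here, the prediction model is:

$$y(X) = \sum_{t=1}^{n} \alpha_t K(X, X_t) + b \tag{1.20}$$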

where α and b are obtained from Equation (1.19) and the kernel function K(X, X_t) is the Gaussian kernel. The error penalty parameter γ is therefore an important parameter affecting the accuracy of LSSVM.
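Training an LSSVM therefore reduces to solving one linear system for b and α. The following is a minimal NumPy sketch under stated assumptions, not the inventors' implementation: it assumes the standard LSSVM system with the kernel matrix plus I/γ, a Gaussian kernel scaled by 2σ², and hypothetical function names.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between rows of A and rows of B.
    The 2*sigma**2 scaling is an assumed convention."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the (assumed) standard LSSVM linear system for bias b and alpha."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                      # first row:    [0, 1_N^T]
    A[1:, 0] = 1.0                      # first column: [0; 1_N]
    A[1:, 1:] = gaussian_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))    # right-hand side [0; Y]
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]    # b, alpha

def lssvm_predict(X_train, b, alpha, sigma, X_new):
    """Evaluate y(X) = sum_t alpha_t * K(X, X_t) + b."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha + b
```

Because every training point receives a coefficient α_t, all training data act as support vectors, as the text notes; the cost is a dense (n+1)×(n+1) solve.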

Genetic Algorithms (GA), also called evolutionary algorithms, are heuristic search methods that imitate the mechanisms of biological inheritance and Darwinian evolution. They bring the evolutionary principle of "natural selection, survival of the fittest" to a population of encoded strings that represent candidate parameters. Individuals are screened according to the chosen fitness function so that those with high fitness are retained, and a new population is formed through the genetic operations of replication, crossover and mutation. The new population inherits the information of the previous generation: individuals with high fitness are more likely to produce offspring, while those with low fitness are gradually eliminated. Fitness screening is repeated on each new population, and the number of high-fitness individuals keeps growing until a preset condition is met and the algorithm terminates. At that point the fittest individual has the highest probability of remaining in the population, which yields the optimal solution. Genetic algorithms are well suited to computer processing and can reach the global optimal solution.
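The loop of selection, crossover and mutation can be sketched in a few lines of Python. This is an illustrative sketch only: tournament selection, single-point crossover, elitism, and the rates `p_cross` and `p_mut` are assumptions, since the text does not specify the operators or their probabilities. The function maximizes `fitness`.

```python
import random

def run_ga(fitness, n_bits=20, pop_size=20, generations=100,
           p_cross=0.8, p_mut=0.02, seed=0):
    """Maximize `fitness` over bit strings; returns (best, best-fitness history)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    history = [fitness(best)]

    def pick():
        # Tournament selection of size 2 (an assumed operator).
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]                      # elitism: keep the best so far
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < p_cross:            # single-point crossover
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            else:                                 # plain replication
                child = p1[:]
            # Bit-flip mutation, applied independently to each bit.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
        history.append(fitness(best))
    return best, history
```

With elitism the best fitness never decreases from one generation to the next. A GA of this kind is a heuristic, so in practice the result is the best solution found within the generation budget.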

Technical solution of the present invention

The present invention provides a genetic-algorithm least-squares wind power prediction method. As shown in Figure 1, the measured wind speeds collected by the system are used to build a genetic-algorithm least squares support vector machine (GA-LSSVM) prediction model. The specific steps are as follows:

Step 1. Determine the input and output variables used for modeling: one wind speed value is collected every 10 minutes; all data from one day form one group, and six groups form one cycle; the wind speeds of the first 5 consecutive days are used as training samples and the following day as the test sample.
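At a 10-minute sampling interval, one day holds 144 readings and one six-day cycle holds 864. A minimal sketch of the 5-day/1-day split follows; the function name is hypothetical, and how lagged inputs are formed from each group is not specified in the text, so it is left out:

```python
import numpy as np

SAMPLES_PER_DAY = 24 * 60 // 10   # 144 readings per day at 10-minute intervals
DAYS_PER_CYCLE = 6

def split_cycle(wind_speed):
    """Split one 6-day cycle into 5 training days and 1 test day."""
    v = np.asarray(wind_speed, dtype=float)
    expected = DAYS_PER_CYCLE * SAMPLES_PER_DAY
    if v.size != expected:
        raise ValueError(f"expected {expected} samples, got {v.size}")
    days = v.reshape(DAYS_PER_CYCLE, SAMPLES_PER_DAY)   # one row per day
    return days[:5].ravel(), days[5]
```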

Step 2. Normalize the raw data to simplify computation. This covers the data used for genetic-algorithm parameter optimization and the sample data used for LSSVM training and testing.
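The text does not state which normalization is used. One common choice, shown here purely as an assumption, is min-max scaling to [0, 1] with the minimum and maximum taken from the training data and reused for the test data:

```python
import numpy as np

def fit_minmax(train):
    """Return (lo, hi) of the training data; min-max scaling is an assumed choice."""
    train = np.asarray(train, dtype=float)
    lo, hi = float(train.min()), float(train.max())
    if hi == lo:
        raise ValueError("training data is constant; cannot normalize")
    return lo, hi

def normalize(x, lo, hi):
    """Map values to [0, 1] using the training range."""
    return (np.asarray(x, dtype=float) - lo) / (hi - lo)

def denormalize(z, lo, hi):
    """Map normalized predictions back to the original units (m/s)."""
    return np.asarray(z, dtype=float) * (hi - lo) + lo
```

Predictions made on normalized inputs would be passed through `denormalize` before the errors are computed in physical units.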

Step 3. Initialize the parameters of the GA and of the LSSVM. The collected data are used with binary encoding to generate the first-generation population, that is, the initial LSSVM models; the models are then trained, optimized LSSVM parameters are obtained through multi-generation GA evolution, and the LSSVM prediction model is built.

Step 4. Use the model obtained in Step 3 to make a short-term wind speed prediction for the test samples.

Step 5. Verify the accuracy of the results with the chosen fitness function. If the accuracy does not meet the requirement, reset the GA parameters and return to Step 3 for retraining. If the accuracy is met, or the preset number of iterations is reached, the optimal parameters are considered found and are substituted into the LSSVM for prediction.

The present invention selects the Gaussian kernel as the kernel function, so two important parameters must be determined: the error penalty parameter γ in the LSSVM model and σ in the kernel function. GA is used to optimize these two parameters as follows.

1. The initial population has 20 individuals, and the GA runs for 100 generations.

2. To optimize the LSSVM parameters with GA, γ and σ are first binary-encoded. Together they form a 20-bit binary code that takes part in the genetic algorithm operations.

3. Determine the fitness function, which decides whether the result obtained is optimal.
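The encoding in item 2 can be illustrated by decoding a 20-bit chromosome back into (γ, σ). The text gives neither the bit allocation nor the search ranges, so the even 10/10 split and the ranges below are assumptions chosen only for illustration:

```python
def decode(chromosome, gamma_range=(0.01, 100.0), sigma_range=(0.01, 10.0)):
    """Decode a 20-bit chromosome into (gamma, sigma).
    Assumed layout: first 10 bits encode gamma, last 10 bits encode sigma."""
    if len(chromosome) != 20:
        raise ValueError("chromosome must have exactly 20 bits")

    def to_real(bits, lo, hi):
        # Interpret the bits as an unsigned integer and scale it to [lo, hi].
        value = int("".join(str(bit) for bit in bits), 2)
        return lo + (hi - lo) * value / (2 ** len(bits) - 1)

    gamma = to_real(chromosome[:10], *gamma_range)
    sigma = to_real(chromosome[10:], *sigma_range)
    return gamma, sigma
```

A fitness function for the GA would decode each individual this way, train an LSSVM with the resulting γ and σ, and score its prediction error on held-out data.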

Evaluation criteria for wind speed prediction

The amount of electricity a wind farm can finally deliver to the grid is closely tied to the quality of its output prediction. The quality of a prediction method and its accuracy are assessed with evaluation criteria. The main error measures are listed below:

1. Mean Absolute Percent Error (MAPE)

2. Root Mean Square Error (RMSE)

3. Mean Absolute Error (MAE)

4. Relative Error

5. Equalization coefficient (EC)

To better express how closely the predicted results match the actual wind farm output, we define the equalization coefficient (hereinafter EC). As its definition shows, EC expresses the similarity between the predicted and actual results and is a simple way to judge prediction accuracy. From the definition, EC is necessarily greater than 0 and less than 1. The larger the EC value, the closer the prediction is to the actual situation. In general, EC > 0.85 is regarded as a good prediction and EC > 0.9 as a satisfactory one.

There are many indicators of the quality of GA-LSSVM predictions. Here the relative error is chosen as the fitness function for judging the quality of the genetic individuals, where y_i and ŷ_i are the actual and predicted values of a sample and n is the number of test samples. The present invention uses several error measures to judge the quality of the algorithm.
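The formulas for these measures are not reproduced in this text. The sketch below uses the standard textbook definitions of relative error, MAPE, RMSE and MAE, which are assumed to match the intended ones; EC is omitted because its definition is not recoverable from the text.

```python
import numpy as np

def relative_error(y_true, y_pred):
    """Per-sample relative error |y_pred - y_true| / |y_true|."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.abs(y_pred - y_true) / np.abs(y_true)

def mape(y_true, y_pred):
    """Mean absolute percent error, in percent."""
    return 100.0 * np.mean(relative_error(y_true, y_pred))

def rmse(y_true, y_pred):
    """Root mean square error."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return np.sqrt(np.mean(diff ** 2))

def mae(y_true, y_pred):
    """Mean absolute error."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return np.mean(np.abs(diff))
```

Relative error and MAPE divide by the measured value, so calm periods with wind speeds at or near zero need separate handling.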

Simulation analysis

Following the derivation above, a GA-LSSVM wind speed prediction model for wind farms was built. To verify its effectiveness, RBFNN was used for a comparative test. The test data are measurements from a wind power company in Jilin Province, recorded in September and October 2015 at one sample every 10 minutes; data from ten widely separated wind measurement points were selected. Data processing is carried out within the algorithm: when the wind speed exceeds 12 m/s, the model treats it as 12 m/s, which is reflected in the experimental output below. The simulation in this section uses the 5 days of data from October 1 to 5 as training samples for the LSSVM model, and the trained model predicts the wind speed on the sixth day. The GA performs cross-comparison optimization of the parameters, with the initial population size set to 20 and the maximum number of iterations set to 100.
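The 12 m/s cap described above amounts to an element-wise clip of the measured series. A minimal sketch, with a hypothetical function name:

```python
import numpy as np

def cap_wind_speed(wind_speed, cap=12.0):
    """Replace any wind speed above `cap` (m/s) with `cap`."""
    return np.minimum(np.asarray(wind_speed, dtype=float), cap)
```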

Figures 8 and 9 show the wind farm wind speed prediction results of the GA-LSSVM model, together with the optimal parameter values found by the genetic algorithm.

Figure 8 compares the wind farm power prediction of the GA-LSSVM model with the measured values; the GA-optimized LSSVM parameters are γ = 0.7162 and σ = 0.0643. Figure 9 shows the relative error at each point, and the histogram gives a direct view of the errors. The errors are small overall: the mean relative error is 10.1%, the maximum relative error is 48%, and the goodness of fit is 94.1%. This is still far from the international level of 5% error, but it is a considerable improvement over the neural network.

For comparison with GA-LSSVM, the RBFNN model likewise used October 1 to 5 as training samples and the sixth day as the test sample. As shown in Figures 10 and 11, its mean relative error is 13.5%, its maximum relative error is 51%, and its goodness of fit is 92.4%. GA-LSSVM is indeed more accurate than RBFNN.

Figures 12 and 13 again show wind farm power prediction results of the GA-LSSVM model, this time using data collected over the same period by another wind measurement tower. The results differ considerably: the GA-optimized LSSVM parameters are γ = 1.291 and σ = 0.12, the mean error reaches 16%, and the goodness of fit is 92%.

A genetic algorithm was used to optimize the parameters of the LSSVM model, and a GA-LSSVM-based wind speed prediction model was established. Simulation analysis shows that the GA-LSSVM model predicts better than the earlier RBFNN model, and the comparison of accuracy and error shows that the GA-LSSVM model proposed by the present invention is effective and feasible and can predict the data accurately. It also shows that machine learning algorithms perform well in predicting wind farm wind speed.

Claims

1. A genetic-algorithm least-squares wind power prediction method, characterized in that a genetic-algorithm least squares support vector machine prediction model is established using collected measured wind speeds, the method comprising the following specific steps:

step one, determining the input and output variables used for modeling: collecting one wind speed value every 10 minutes, wherein all data from one day form one group and six groups of data form one cycle, the wind speeds of the first 5 consecutive days being taken as training samples and the following day as the test sample;

step two, normalizing the raw data, covering the data used for genetic-algorithm parameter optimization and the sample data used to train and test the least squares support vector machine prediction model;

step three, initializing the parameters of the genetic algorithm and of the least squares support vector machine prediction model: performing binary encoding using the collected data to generate an initial population, namely the initial least squares support vector machine models, then training the models, obtaining optimized least squares support vector machine prediction model parameters through multi-generation evolution of the genetic algorithm, and establishing the least squares support vector machine prediction model;

step four, performing short-term wind speed prediction on the test samples using the least squares support vector machine prediction model obtained in step three;

step five, verifying the accuracy of the results with the chosen fitness function, and, if the accuracy does not meet the requirement, resetting the genetic algorithm parameters and returning to step three for retraining.

2. The genetic-algorithm least-squares wind power prediction method as claimed in claim 1, wherein in step three a Gaussian kernel function is selected as the kernel function, and the genetic algorithm is used to optimize the error penalty parameter γ in the least squares support vector machine prediction model and σ in the kernel function:

1) the initial population has 20 individuals and the genetic algorithm runs for 100 generations;

2) γ and σ are binary-encoded and together form a 20-bit binary code that takes part in the genetic algorithm operations;

3) a fitness function is determined, which decides whether the result obtained is optimal.