Disclosure of Invention
One of the purposes of the invention is to provide a method for predicting the quality of a phenolic resin product under the condition of uncertain raw materials, so that the quality of the produced phenolic resin product can be predicted in real time.
To this end, the invention provides a phenolic resin product quality prediction method based on a long short-term memory (LSTM) network, comprising the following steps:
(1) acquiring historical data of the phenolic resin production process and the product quality of the corresponding batches;
(2) performing preprocessing, feature selection and standardization on the historical data;
(3) constructing a phenolic resin product quality prediction model based on the LSTM network;
(4) training the phenolic resin product quality prediction model based on the data obtained in step (2);
(5) acquiring real-time production data of the current batch, processing the data with the preprocessing and standardization of step (2), and predicting the product quality of the current batch with the phenolic resin product quality prediction model;
(6) when the real-time prediction accuracy in step (5) does not meet the requirement, taking the real-time production data obtained in step (5) as historical data, processing it with the preprocessing and standardization of step (2), and returning to step (4) to retrain the model, thereby obtaining a new phenolic resin product quality prediction model.
In the above LSTM-based phenolic resin product quality prediction method:
the historical data are time-series data of the phenolic resin production process together with the product quality data of the corresponding batches.
The long short-term memory (LSTM) network is a recurrent neural network suited to extracting and predicting regular patterns in time-series data.
In the method, an LSTM-based phenolic resin product quality prediction model is constructed and trained on the historical data of the phenolic resin production process and the product quality data of the corresponding batches. By extracting the underlying trends in the historical data, the model captures how uncertain raw materials affect product quality, so that the product quality of the current batch can be effectively predicted.
Further, in the LSTM-based phenolic resin product quality prediction method, the historical data of the phenolic resin production process comprise the alkylation liquid flow, liquid aldehyde flow, solid aldehyde flow, reaction temperature, pressure and reaction kettle weight, and the product quality data of the corresponding batches comprise softening point data obtained by laboratory analysis.
Further, in step (2) of the LSTM-based phenolic resin product quality prediction method, the preprocessing of the historical data consists of resampling the time-series data at equal time intervals, thereby reducing the dimension of the variables.
Further, in step (2) of the LSTM-based phenolic resin product quality prediction method, the feature selection performs correlation analysis on the data using the maximum information coefficient (MIC), which can be calculated by the following formula:
$$\mathrm{MIC}(x;y)=\max_{a\cdot b<B}\frac{I(x;y)}{\log_2\min(a,b)}$$
where $x$ represents any variable in the historical data of the phenolic resin production process, $y$ represents the product quality index of the phenolic resin, $a$ and $b$ are the numbers of grid cells into which the $x$ and $y$ axes are partitioned, $I(x;y)$ is the mutual information of $x$ and $y$ on that grid, and $B$ is the 0.6 power of the total number of data points. The maximum information coefficient is a value between 0 and 1; the larger it is, the stronger the correlation between the variable and the quality index.
Variables whose maximum information coefficient with respect to the product quality index is larger than 0.3 are selected as input variables of the phenolic resin product quality prediction model.
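The threshold rule above can be illustrated with a minimal Python sketch. It assumes the maximum information coefficients have already been computed by some other means; the function name `select_features`, the variable names and the score values are hypothetical and are not taken from the source.

```python
def select_features(mic_scores, threshold=0.3):
    """Return the variables whose MIC with the quality index exceeds the threshold."""
    return [name for name, score in mic_scores.items() if score > threshold]

# Hypothetical MIC values, for illustration only (not measured data).
example_scores = {
    "reaction_temperature": 0.42,
    "reaction_pressure": 0.18,
    "liquid_aldehyde_flow": 0.35,
    "kettle_weight": 0.30,  # exactly 0.3 is not "larger than 0.3", so it is dropped
}
selected = select_features(example_scores)
print(selected)
```

A strict comparison is used because the text specifies a coefficient "larger than 0.3".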
Further, in step (2) of the LSTM-based phenolic resin product quality prediction method, the standardization normalizes the process data and the product quality separately. The process data are normalized by the following formula:
$$x'_{ik}=\frac{x_{ik}-\min_{1\le i\le I}x_{ik}}{\max_{1\le i\le I}x_{ik}-\min_{1\le i\le I}x_{ik}}$$
where $x'_{ik}$ is the normalized value of the $k$-th variable $x_{ik}$ of the $i$-th batch, which lies in $[0,1]$ and is dimensionless; $\min_{1\le i\le I}x_{ik}$ and $\max_{1\le i\le I}x_{ik}$ respectively represent the minimum value and the maximum value of the $k$-th variable; and $I$ represents the total number of batches.
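The min-max normalization of the process variables can be sketched in plain Python as follows. The function name `min_max_normalize`, the sample data and the handling of a constant column are assumptions made for illustration; the source does not specify them.

```python
def min_max_normalize(batches):
    """Scale every column (variable) of a batch-by-variable table to [0, 1]."""
    columns = list(zip(*batches))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = [
        [
            # Assumption: a constant column (max == min) is mapped to 0.0.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in batches
    ]
    return normalized, lows, highs

# Hypothetical data: 3 batches (rows) x 2 variables (columns).
data = [[10.0, 200.0], [20.0, 400.0], [15.0, 300.0]]
scaled, lows, highs = min_max_normalize(data)
print(scaled)
```

Returning `lows` and `highs` allows the same scaling to be reapplied to real-time data in step (5); whether the original implementation stores the extrema this way is not stated in the source.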
The product quality is normalized by the following formula:
$$y'_{i}=\frac{y_{i}-\min_{1\le i\le I}y_{i}}{\max_{1\le i\le I}y_{i}-\min_{1\le i\le I}y_{i}}$$
where $y'_{i}$ is the normalized value of the final quality index $y_{i}$ of the $i$-th batch, which lies in $[0,1]$ and is dimensionless; $\min_{1\le i\le I}y_{i}$ and $\max_{1\le i\le I}y_{i}$ respectively represent the minimum value and the maximum value of the quality index; and $I$ represents the total number of batches.
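When the quality index is normalized in this way, a model output in $[0,1]$ has to be mapped back to physical units before it can be compared with a laboratory value. The sketch below shows the forward and inverse mapping; the inverse step is an inference rather than something the source describes, and the function names and softening point values are hypothetical.

```python
def normalize_quality(values):
    """Min-max normalize the quality index; also return the extrema used."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values], lo, hi

def denormalize_quality(scaled_value, lo, hi):
    """Map a normalized value back to the original unit (inverse of the formula)."""
    return scaled_value * (hi - lo) + lo

# Hypothetical softening points in degrees Celsius.
softening_points = [100.0, 104.0, 102.0]
scaled_quality, q_lo, q_hi = normalize_quality(softening_points)
restored = denormalize_quality(scaled_quality[2], q_lo, q_hi)
print(scaled_quality, restored)
```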
Further, in step (3) of the LSTM-based phenolic resin product quality prediction method, constructing the phenolic resin product quality prediction model comprises establishing an LSTM network composed of an LSTM layer and an output fully connected layer; the model parameters comprise the network weights and biases of each layer.
In the above scheme, taking the network cell at time $t$ ($t\in\{1,2,\dots,m\}$) as an example, the inputs of the cell are the hidden-layer output $h_{t-1}$ at time $t-1$, the cell state $C_{t-1}$ at time $t-1$, and the input data $x_t$ at time $t$. The input data pass through the forget gate, the input gate and the output gate of the LSTM network to yield the hidden-layer output $h_t$ and the cell state $C_t$ at time $t$. The specific operations of the gate structure are as follows:
① The network input $x_t$ at time $t$ and the hidden-layer output $h_{t-1}$ at time $t-1$ are concatenated as the input of the forget gate, and a sigmoid function normalizes the forget-gate output $f_t$ to between 0 and 1, as follows:
$$f_t=\sigma(W_f\cdot[h_{t-1},x_t]+b_f)$$
where $\sigma$ denotes the sigmoid activation function, $W_f$ is the weight of the forget gate, $h_{t-1}$ is the hidden-layer output at the previous time step, $x_t$ is the input at the current time step, and $b_f$ is the bias of the forget gate.
② The concatenated vector of the network input $x_t$ at time $t$ and the hidden-layer output $h_{t-1}$ at time $t-1$ is passed through a sigmoid function to obtain $i_t$, which determines the values of the network cell that need to be updated; the concatenated vector is also passed through a tanh function to obtain the candidate cell state $\tilde{C}_t$, as follows:
$$i_t=\sigma(W_i\cdot[h_{t-1},x_t]+b_i)$$
$$\tilde{C}_t=\tanh(W_c\cdot[h_{t-1},x_t]+b_c)$$
where $W_i$ is the weight of the input gate, $b_i$ is the bias of the input gate, tanh is the activation function, $W_c$ is the weight of the candidate cell state, and $b_c$ is the bias of the candidate cell state.
③ The memory cell state $C_{t-1}$ at time $t-1$ is multiplied element-wise by the forget-gate output $f_t$, which discards redundant information from the previous time step; the candidate cell state $\tilde{C}_t$ is multiplied element-wise by the input-gate output $i_t$, which determines how much of the candidate cell state is used to update the memory cell; the two results are added to complete the cell update, as follows:
$$C_t=f_t\odot C_{t-1}+i_t\odot\tilde{C}_t$$
④ The output $h_t$ of the LSTM network is jointly determined by the cell state $C_t$ at time $t$, the current input $x_t$ and the previous output $h_{t-1}$. The concatenated vector of the input $x_t$ at time $t$ and the output $h_{t-1}$ at time $t-1$ is passed through a sigmoid function to obtain $o_t$, which indicates which information in the memory cell at the current time step is to be output. The cell state $C_t$ at time $t$ is passed through a tanh function and multiplied element-wise by the sigmoid output $o_t$ to give the output-gate result $h_t$, i.e. the hidden-layer output at time $t$, as follows:
$$o_t=\sigma(W_o\cdot[h_{t-1},x_t]+b_o)$$
$$h_t=o_t\odot\tanh(C_t)$$
where $W_o$ is the weight of the output gate and $b_o$ is the bias of the output gate.
The weights and biases are shared across all time steps of the LSTM network; after the data of $m$ time steps have been propagated, the hidden-layer output $h_m$ at time $m$ is obtained. This output is passed through the fully connected layer to give the output value $y_{out}$ of the prediction network, as follows:
$$y_{out}=W_{out}\times h_m+b_{out}$$
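The gate equations and the output layer can be traced in a short, dependency-free Python sketch. It is only an illustration of the equations, not the implementation used in the invention: the zero initial state, the dictionary layout of the parameters and all numeric values are assumptions, and vectors are represented as plain lists.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W·v + b, with the weight matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One time step: forget gate, input gate, cell update, output gate."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["Wf"], z, p["bf"])]
    i_t = [sigmoid(a) for a in affine(p["Wi"], z, p["bi"])]
    c_cand = [math.tanh(a) for a in affine(p["Wc"], z, p["bc"])]
    o_t = [sigmoid(a) for a in affine(p["Wo"], z, p["bo"])]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_cand)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

def predict(sequence, p):
    """Run all m steps with the same parameters, then apply the output layer."""
    hidden = len(p["bf"])
    h, c = [0.0] * hidden, [0.0] * hidden  # assumption: zero initial state
    for x_t in sequence:
        h, c = lstm_step(x_t, h, c, p)
    return sum(w * hj for w, hj in zip(p["Wout"], h)) + p["bout"]

# Toy parameters (hypothetical): hidden size 1, input dimension 1.
params = {
    "Wf": [[0.0, 0.0]], "bf": [0.0],
    "Wi": [[0.0, 0.0]], "bi": [0.0],
    "Wc": [[0.0, 1.0]], "bc": [0.0],
    "Wo": [[0.0, 0.0]], "bo": [0.0],
    "Wout": [2.0], "bout": 0.25,
}
y_out = predict([[1.0]], params)
print(y_out)
```

Because the same `params` dictionary is passed to every call of `lstm_step`, the sketch reflects the weight sharing across time steps described in the text.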
Further, in step (4) of the LSTM-based phenolic resin product quality prediction method, when the phenolic resin product quality prediction model is trained on the historical data of the production process and the product quality of the corresponding batches, the data obtained in step (2) are arranged using a sliding window, and the model is trained using the backpropagation through time (BPTT) algorithm.
In the above scheme, arranging the data obtained in step (2) using the sliding window comprises: taking the data of m consecutive batches starting from the first batch as the 1st sample of the model, the product quality corresponding to this sample being the product quality index of the m-th batch; then taking the data of m consecutive batches starting from the 2nd batch as the 2nd sample of the model, the product quality corresponding to this sample being the product quality index of the (m+1)-th batch; and so on. When the total number of batches is I, the number of input samples of the model is I-m+1. The chronological order of the samples is preserved; the first 80% of the samples are used as the training sample set of the model and the last 20% as the test sample set, for model training and testing.
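The sliding-window arrangement and the order-preserving split can be sketched as follows. The function names, the synthetic data and the choice to round the 80% boundary down are assumptions for illustration; the source does not state how a fractional boundary is handled.

```python
def make_windows(features, quality, m):
    """Pair each run of m consecutive batches with the quality of its last batch."""
    samples = []
    for start in range(len(features) - m + 1):
        window = features[start:start + m]
        target = quality[start + m - 1]
        samples.append((window, target))
    return samples

def chronological_split(samples, train_fraction=0.8):
    """Split without shuffling so that the test samples follow the training samples."""
    cut = int(len(samples) * train_fraction)  # assumption: round down
    return samples[:cut], samples[cut:]

# Synthetic placeholder data: 100 batches, 4 variables per batch.
features = [[float(i)] * 4 for i in range(100)]
quality = [100.0 + 0.1 * i for i in range(100)]

samples = make_windows(features, quality, m=4)
train, test = chronological_split(samples)
print(len(samples), len(train), len(test))
```

With I = 100 and m = 4 the sketch yields I-m+1 = 97 samples, in line with the count formula given in the text.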
Further, in step (5) of the LSTM-based phenolic resin product quality prediction method, the phenolic resin production process data are read in real time and processed according to step (2); the processed data are used as input, and the product quality of the phenolic resin of the current batch is predicted with the phenolic resin product quality prediction model obtained in step (4).
Further, in step (6) of the LSTM-based phenolic resin product quality prediction method, the real-time prediction accuracy not meeting the requirement means that the accuracy of the product quality prediction for the current batch falls outside the factory-specified range, including but not limited to the predicted softening point lying outside the ±2 °C interval around the final laboratory analysis value.
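The retraining criterion of step (6) can be expressed as a simple check. This is a minimal sketch under the assumption that a deviation of exactly 2 °C still counts as acceptable; the function name `needs_retraining` and the example values are hypothetical.

```python
def needs_retraining(predicted, assay, tolerance=2.0):
    """Return True when the prediction lies outside assay ± tolerance (in °C)."""
    return abs(predicted - assay) > tolerance

# Hypothetical softening points in degrees Celsius.
print(needs_retraining(103.5, 102.0))  # deviation 1.5 °C, inside the interval
print(needs_retraining(105.5, 102.0))  # deviation 3.5 °C, outside the interval
```

When the check returns True, the batch data would be added to the historical data and the model retrained as described in step (6).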
Detailed Description
The technical scheme of the invention is further explained by combining the drawings and the embodiment of the specification.
As shown in FIG. 1, an embodiment of the invention is implemented as follows:
Step (1): acquiring historical data of the phenolic resin production process and the product quality of the corresponding batches.
On the distributed control system (DCS) configured for the production process, process variable data of the phenolic resin production process are collected through an OPC interface. The process variables comprise the alkylation liquid flow, liquid aldehyde flow, solid aldehyde flow, reaction temperature, reaction pressure and reaction kettle weight. The sampling interval is set to 1 minute, and the phenolic resin process data are sampled to obtain a process data matrix for each batch. Production data of the latest 100 batches of phenolic resin are collected, and the softening point of the corresponding batch of phenolic resin obtained by laboratory analysis is collected as the quality index.
Step (2): preprocessing, feature selection and standardization of the historical data obtained in step (1).
In the preprocessing of step (2), the 100 batches of normal phenolic resin production data collected in step (1) are resampled at equal time intervals (the interval is set to 10 minutes), yielding 28-dimensional data.
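As a rough illustration of the resampling, the sketch below keeps one reading out of every ten from a series sampled once per minute. The source does not say whether resampling picks individual readings or averages over each interval, so the pick-every-tenth rule is an assumption, and the function name and data are hypothetical.

```python
def resample_equal_interval(series, step=10):
    """Keep every `step`-th reading, e.g. 1-minute data -> 10-minute data."""
    return series[::step]

# Hypothetical series: 60 one-minute readings of a single process variable.
minute_samples = list(range(60))
resampled = resample_equal_interval(minute_samples)
print(resampled)
```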
For the 100 batches of data obtained by the preprocessing of step (2), the maximum information coefficient is calculated according to the formula:
$$\mathrm{MIC}(x;y)=\max_{a\cdot b<B}\frac{I(x;y)}{\log_2\min(a,b)}$$
where $x$ represents any variable in the historical data of the phenolic resin production process, $y$ represents the softening point of the phenolic resin product, $a$ and $b$ are the numbers of grid cells into which the $x$ and $y$ axes are partitioned, $I(x;y)$ is the mutual information of $x$ and $y$ on that grid, and $B$ is the 0.6 power of the total number of data points. The maximum information coefficients of the 28-dimensional data are calculated, and the variables corresponding to the 4 largest values are selected as the independent variables of the model, namely the mass of the alkylation liquid, the mass of the liquid aldehyde, the temperature during liquid aldehyde addition and the temperature after solid aldehyde addition.
The data after the feature selection of step (2) are normalized by the following formula:
$$x'_{ik}=\frac{x_{ik}-\min_{i}x_{ik}}{\max_{i}x_{ik}-\min_{i}x_{ik}}$$
where $x'_{ik}$ is the normalized value of the $k$-th variable $x_{ik}$ of the $i$-th batch, which lies in $[0,1]$ and is dimensionless; $\min_{i}x_{ik}$ and $\max_{i}x_{ik}$ respectively represent the minimum value and the maximum value of the $k$-th variable.
The softening point data, which is an index of the quality of phenolic resin products, is relatively concentrated and therefore is not standardized.
Step (3): constructing the phenolic resin product quality prediction model based on the LSTM network.
The phenolic resin product quality prediction model consists of 2 LSTM layers and 1 fully connected layer; the input data dimension is 4, the sequence length is 4, the number of neurons of the LSTM layers is 50, and the number of neurons of the fully connected layer is 1. The training batch size is set to 30, the number of training epochs is set to 200, the loss function of the model is the root mean square error, and the optimizer is Adam.
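To give a sense of the size of this network, the sketch below counts its trainable parameters from the gate equations of the disclosure (four weight blocks acting on the concatenated vector, one bias per block). It assumes that each of the 2 LSTM layers has 50 units and that the second layer takes the 50-dimensional output of the first as its input; frameworks that keep two bias vectors per gate would report a slightly larger number. The function names are hypothetical.

```python
def lstm_layer_params(input_dim, hidden):
    """Forget, input, candidate and output blocks: weights on [h, x] plus one bias each."""
    return 4 * (hidden * (hidden + input_dim) + hidden)

def dense_params(input_dim, units):
    """Fully connected layer: weight matrix plus bias."""
    return input_dim * units + units

layer1 = lstm_layer_params(input_dim=4, hidden=50)   # first LSTM layer
layer2 = lstm_layer_params(input_dim=50, hidden=50)  # second LSTM layer (assumed 50 units)
head = dense_params(input_dim=50, units=1)           # output layer with 1 neuron
total = layer1 + layer2 + head
print(layer1, layer2, head, total)
```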
Step (4): training the phenolic resin product quality prediction model based on the data obtained in step (2).
The data obtained in step (2) are arranged using the sliding window method. The sliding window is set to 4: starting from the first batch of the phenolic resin production data normalized in step (2), 4 consecutive batches of data are extracted as the first sample of the model, the target value corresponding to this sample being the resin softening point of the 4th batch, and so on. When the total number of batches is 100, the number of samples of the LSTM model is 97. The batch order of the samples is preserved; the first 80% of the data are selected as training data and the last 20% as test data.
The resulting data are used as input and output data, and the network constructed in step (3) is trained with Adam as the optimizer to obtain the LSTM-based prediction model of the final quality index of the phenolic resin.
Step (5): acquiring real-time production data of the current batch, processing the data with the preprocessing and standardization of step (2), and predicting the product quality of the current batch with the phenolic resin product quality prediction model.
The phenolic resin production process data are read in real time, and the standardized values of the process variables are calculated as in step (2). The data of the latest 3 preceding batches are combined with the data of the current batch and used as the model input, and the softening point of the phenolic resin of the current batch is predicted with the model trained in step (4). The prediction results are shown in FIG. 3.
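Assembling the model input from the latest 3 preceding batches and the current batch can be sketched with a fixed-length buffer. The class name `RollingBatchWindow`, its methods and the sample values are hypothetical; the sketch assumes each batch has already been reduced to its standardized feature vector.

```python
from collections import deque

class RollingBatchWindow:
    """Keeps the most recent (length - 1) finished batches for input assembly."""

    def __init__(self, history, length=4):
        if length < 2:
            raise ValueError("length must be at least 2")
        self._keep = length - 1
        self._recent = deque(history[-self._keep:], maxlen=self._keep)

    def build_input(self, current_batch):
        """Return the sequence [3 preceding batches..., current batch]."""
        if len(self._recent) < self._keep:
            raise ValueError("not enough historical batches")
        return list(self._recent) + [current_batch]

    def commit(self, finished_batch):
        """Store a finished batch; the oldest stored batch is dropped automatically."""
        self._recent.append(finished_batch)

# Hypothetical standardized feature vectors (4 variables per batch).
history = [[0.1] * 4, [0.2] * 4, [0.3] * 4, [0.4] * 4]
window = RollingBatchWindow(history)
model_input = window.build_input([0.5] * 4)
print(len(model_input))
```

After a batch finishes, calling `commit` with its feature vector shifts the window so that the next prediction again sees the 3 most recent batches.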
The above examples are intended only to illustrate the method and system of the invention; the invention is not limited to these examples, and any simple modification or equivalent change made to the above examples in accordance with the technical implementation of the invention falls within the scope of the invention.