Steel multi-variety demand prediction method based on intelligent supply chain
Technical Field
The invention relates to the technical field of supply chain production management, in particular to a steel multi-variety demand prediction method based on an intelligent supply chain.
Background
With the development of the industrial internet, the downstream supply chain of the steel industry is being transformed and upgraded, and the way production of its main raw material, steel, is organized must change as well. Steel producers need to move from a rigid production organization mode in which production determines sales to a flexible mode in which production and sales are balanced. This requires exchanging information with the industrial product manufacturing processes of downstream supply chain enterprises and mining the patterns in industrial product production data with effective technical means, so that the downstream demand for multiple varieties of steel can be predicted in advance for a given future period. Doing so improves the production efficiency and capacity utilization of steel producers and allows the steel consumption demand of the downstream supply chain to be met more quickly.
However, current technical means for analyzing, processing and deciding on downstream supply chain demand are relatively backward: the industrial product production data collected from downstream enterprises in the intelligent supply chain is forecast mainly by manual experience or simple mathematical formulas. With demand forecasts of this kind, steel mills find it difficult to schedule production of different steel varieties reasonably and efficiently, so production efficiency and capacity utilization are hard to improve; this is a technical problem in urgent need of a solution. In other fields, demand prediction based on big data techniques tends to focus on non-technical considerations such as enterprise profit and cost. Existing demand prediction methods are also time-consuming, unsystematic, incomplete, inaccurate, and slow to respond and produce output.
In summary, existing methods for predicting demand for multiple steel varieties cannot improve production efficiency or capacity utilization, and cannot support intelligent linkage between industrial product production in the intelligent supply chain and the stocking and scheduling that steel producers carry out to meet that demand.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a steel multi-variety demand prediction method based on an intelligent supply chain. The method exploits the patterns in industrial production data in the intelligent supply chain, together with targeted data processing, to predict the demand for multiple steel varieties over a given future period in a timely, accurate and stable manner.
The aim of the invention can be achieved by the following technical scheme:
A steel multi-variety demand prediction method based on an intelligent supply chain comprises the following steps:
S1, acquiring a time series of demand data for multiple varieties of steel based on industrial production data of an intelligent supply chain system. The industrial production data comprises production BOM data, production data of host production plants and production data of part production plants. The time series is acquired through the following specific steps:
11) Taking the acquired industrial product production data in the intelligent supply chain system as the original data time series;
12) Cleaning the original data time series based on the SARIMA model technical framework: adjusting the date format, filling missing data by a moving average method, setting a threshold number of months and filling with zero any run of consecutively missing months that exceeds the threshold, and, after ranking by data volume within a given period, retaining the top 90% of the data as the available data source;
13) According to the production BOM data, the production data of the host production plants and the production data of the part production plants, establishing a functional relation $W_t = g(P_1, P_2, \ldots, P_n)$ between the production data and the demand data, where $W_t$ is the demand value of different steel varieties in different months and $P_1, P_2, \ldots, P_n$ are the production values of different industrial products, and using this relation to convert the industrial product production data of the intelligent supply chain into the corresponding time series of demand data for multiple steel varieties;
14) Applying moving average processing to the time series of demand data for multiple steel varieties obtained in step 13) to obtain the sample data length used for rolling prediction by the demand prediction model and a preliminarily stable demand data time series.
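The conversion in step 13) can be illustrated with a minimal sketch. The source does not specify the form of g, so the sketch assumes it is linear: each product consumes a fixed mass of each steel variety per unit, as a BOM would record. The product names, steel varieties and quantities are hypothetical.

```python
from collections import defaultdict

# Hypothetical BOM: kg of each steel variety consumed per unit of product.
BOM = {
    "excavator": {"plate": 1200.0, "bar": 300.0},
    "loader": {"plate": 800.0, "pipe": 150.0},
}


def steel_demand(production, bom):
    """Convert monthly product output into monthly steel demand per variety.

    production: {month: {product: units}}
    returns:    {variety: {month: kg}}
    ASSUMPTION: g is linear, i.e. demand = sum(units * kg_per_unit).
    """
    demand = defaultdict(dict)
    for month, outputs in sorted(production.items()):
        totals = defaultdict(float)
        for product, units in outputs.items():
            for variety, kg_per_unit in bom[product].items():
                totals[variety] += units * kg_per_unit
        for variety, kg in totals.items():
            demand[variety][month] = kg
    return dict(demand)


# Hypothetical production figures for two months.
production = {
    "2023-01": {"excavator": 10, "loader": 5},
    "2023-02": {"excavator": 8, "loader": 7},
}
demand = steel_demand(production, BOM)
```

Each entry of `demand` is then one demand data time series, indexed by month, for a single steel variety.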
S2, constructing a SARIMA time series model based on the time series of demand data for multiple steel varieties.
The method comprises the following specific steps:
21) Based on the seasonal pattern, building a SARIMA time series model:
where $W_t$, $w_{t-n}$ and $w_{t-sn}$ are the demand values of the different steel varieties in months $t$, $t-n$ and $t-sn$; $\mu$ is a constant term; $\varepsilon_t$, $\varepsilon_{t-n}$ and $\varepsilon_{t-sn}$ are the errors of the demand values in months $t$, $t-n$ and $t-sn$; $p$ is the number of trend autoregressive terms, $q$ the number of trend moving average terms, $P$ the number of seasonal autoregressive terms and $Q$ the number of seasonal moving average terms; $\alpha_n$ is the trend autoregressive coefficient, $\theta_n$ the trend moving average coefficient, $\phi_n$ the seasonal autoregressive coefficient and $\eta_n$ the seasonal moving average coefficient.
22) Based on the three feature components obtained by model decomposition, calculating the standard deviation between the seasonal feature component and the random feature component, and adjusting the outliers of every demand data time series into a set interval, where the standard deviation $\sigma$ between the seasonal feature component and the random feature component is calculated as follows:
where $C_s$ and $C_r$ are, respectively, the seasonal feature component and the random feature component automatically decomposed from the demand data time series;
The upper and lower limits of the demand data time series values before each prediction are obtained as the trend feature component $C_t$ plus or minus the standard deviation, and the outliers of every demand data time series are adjusted into the programmed interval $[C_t - \sigma, C_t + \sigma]$.
23) Training and optimizing by grid search for model selection to obtain different combinations of the SARIMA$(p, d, q)(P, D, Q)_s$ parameters, where $d$ is the number of trend differencing operations needed to reach a stationary series, $D$ is the number of seasonal differencing operations needed to reach a stationary series, $s$ is the number of time steps in a single season, $p$ is the number of trend autoregressive terms, $q$ the number of trend moving average terms, $P$ the number of seasonal autoregressive terms and $Q$ the number of seasonal moving average terms; calculating the AIC value of each combination and selecting the combination with the smallest AIC value as the optimal parameter combination, which completes the construction of the SARIMA time series model.
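The model equation referred to in step 21) is not reproduced in this text. One additive form that is consistent with the symbol definitions given there is sketched below, with $p$ and $q$ the numbers of trend autoregressive and moving average terms and $P$ and $Q$ their seasonal counterparts. It is an assumption offered for orientation and may differ from the expression in the original filing. The second line gives the standard definition of the AIC used in step 23), where $k$ is the number of estimated parameters and $\hat{L}$ the maximized likelihood.

```latex
W_t = \mu
    + \sum_{n=1}^{p} \alpha_n \, w_{t-n}
    + \sum_{n=1}^{q} \theta_n \, \varepsilon_{t-n}
    + \sum_{n=1}^{P} \phi_n \, w_{t-sn}
    + \sum_{n=1}^{Q} \eta_n \, \varepsilon_{t-sn}
    + \varepsilon_t

\mathrm{AIC} = 2k - 2\ln\hat{L}
```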
S3, inputting the training sample set into the constructed SARIMA time series model to obtain a demand prediction result for multiple steel varieties.
S4, constructing a SARIMAX time series model based on the time series of demand data for multiple steel varieties obtained in step S1.
The method comprises the following specific steps:
41) Taking the time series of demand data for multiple steel varieties obtained in S1 that correspond to the production data of the host production plants and to the production data of the part production plants, and normalizing them to obtain two groups of data mapped into the (0, 1) range;
42) Calculating the correlation coefficient of the two groups of mapped data by correlation analysis, screening out the data whose correlation coefficient is not less than 0.7 as leading data, and taking the pre-normalization demand data for multiple steel varieties corresponding to the production data of the part production plants as the external variable X of the SARIMAX model, as follows:
where the external-variable term is the demand value of multiple steel varieties corresponding to the production data of the part production plants, $\beta_n$ is the coefficient of the external variable, and $r$ is the number of regression terms of the external variable;
43) Based on the demand data time series obtained after the processing of S2, training and optimizing again by grid search for model selection to obtain different combinations of the SARIMAX$(p, d, q)(P, D, Q)_s$ parameters, calculating the AIC value of each combination, and selecting the combination with the smallest AIC value as the optimal parameter combination, which completes the construction of the SARIMAX time series model.
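The grid search with AIC selection described in steps 23) and 43) can be sketched as follows. The source does not name a software library. The default scorer below assumes the `SARIMAX` class of the Python package statsmodels, which takes the trend orders as `order=(p, d, q)`, the seasonal orders as `seasonal_order=(P, D, Q, s)` and an optional external variable as `exog`. The function names and the search range are hypothetical.

```python
import itertools


def sarima_aic(series, order, seasonal_order, exog=None):
    """Fit one candidate model and return its AIC.

    ASSUMPTION: statsmodels is the fitting library; it is imported lazily so
    that the search logic below does not depend on it.
    """
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    model = SARIMAX(series, exog=exog, order=order,
                    seasonal_order=seasonal_order)
    return model.fit(disp=False).aic


def grid_search_min_aic(series, s=12, max_order=1, exog=None, score=sarima_aic):
    """Try every (p, d, q)(P, D, Q)_s with orders in 0..max_order.

    Returns (aic, (p, d, q), (P, D, Q, s)) for the smallest AIC, or None if
    no candidate could be fitted.
    """
    best = None
    orders = range(max_order + 1)
    for p, d, q, P, D, Q in itertools.product(orders, repeat=6):
        try:
            aic = score(series, (p, d, q), (P, D, Q, s), exog)
        except ValueError:
            # Skip combinations that the fitting routine rejects.
            continue
        if best is None or aic < best[0]:
            best = (aic, (p, d, q), (P, D, Q, s))
    return best
```

Passing `exog=None` corresponds to the SARIMA search of step 23); passing the external variable X corresponds to the repeated search of step 43). A call such as `grid_search_min_aic(demand_series, s=12)` would use monthly data with a twelve-month season, where `demand_series` stands for one variety's demand data time series.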
S5, inputting the training sample set into the constructed SARIMAX time series model to obtain the algorithm-corrected demand prediction result for multiple steel varieties.
S6, obtaining a final prediction result from the demand prediction result output by the SARIMA time series model in S3 and the corrected demand prediction result output by the SARIMAX time series model in S5.
Specifically:
The final prediction result is obtained by combining the corrected demand prediction result output by the SARIMAX time series model in S5 with the remainder of the demand prediction result output by the SARIMA time series model in S3, the remainder being the SARIMA output after removing the demand time series for which a SARIMAX output is available.
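One reading of the combination in S6 is that the SARIMAX result takes precedence wherever it exists and the SARIMA result fills in the rest. The sketch below follows that reading. It is an interpretation, since the symbols of the original expression are not reproduced in this text, and the function name and data layout are hypothetical.

```python
def combine_forecasts(sarima, sarimax):
    """Merge two forecast sets keyed by steel variety.

    sarima, sarimax: {variety: [forecast values]}
    ASSUMPTION: a variety with a SARIMAX forecast uses it; every other
    variety falls back to its SARIMA forecast.
    """
    # Keep only the SARIMA series that have no SARIMAX counterpart.
    final = {k: v for k, v in sarima.items() if k not in sarimax}
    final.update(sarimax)
    return final


# Hypothetical forecasts for months T+1 and T+2.
sarima_out = {"plate": [100.0, 110.0], "bar": [40.0, 42.0]}
sarimax_out = {"plate": [104.0, 108.0]}
final = combine_forecasts(sarima_out, sarimax_out)
```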
Further, in step 12), twelve months are treated as one complete period, and the top 90% of the data, ranked by data volume over one complete period, are retained as the available data source.
Further, in step 42), the data screened out are the data in $W_b$ whose correlation coefficient is not less than 0.7 and which lead $W_a$, where $W_a$ and $W_b$ are the two groups of data mapped into the (0, 1) range obtained in step 41).
Compared with the prior art, the intelligent supply chain-based steel multi-variety demand prediction method provided by the invention at least has the following beneficial effects:
1) By exploiting the patterns in industrial production data in the intelligent supply chain together with targeted data processing, the method obtains the sample data length used for rolling prediction by the demand prediction model and a preliminarily stable demand data time series; the more stable demand data time series improves prediction accuracy;
2) By establishing the SARIMA and SARIMAX time series models, the method can predict the demand data of multiple steel varieties, which form a non-stationary time series: the non-stationary series is converted into a stationary one and the influence of seasonal factors is taken into account. The natural patterns existing among the production data of different industrial products are mined to form effective features for model construction, which improves algorithmic efficiency and reduces the complexity of data source searching and correlation factor analysis, so that highly accurate prediction results suitable for engineering application can be obtained;
3) The method can predict the demand for multiple steel varieties over a given future period in a timely, accurate and stable manner, solving the technical problem that production efficiency and capacity utilization are hard to improve because steel mills cannot schedule production of different steel varieties reasonably and efficiently.
Drawings
FIG. 1 is a flow chart of a method for predicting demand of multiple varieties of steel products based on an intelligent supply chain in an embodiment.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
Examples
As shown in fig. 1, the present invention relates to a steel multi-variety demand prediction method based on an intelligent supply chain, which comprises the following steps:
Step one, obtaining a time sequence of demand data of multiple varieties of steel.
Industrial product production data are obtained from the intelligent supply chain system as original data time series, including production BOM (bill of materials) data, production data of host production plants and production data of part production plants. Based on the SARIMA model technical framework, the original data time series undergo preprocessing such as cleaning, screening, conversion and moving averaging to obtain the sample data length used for rolling prediction by the demand prediction model and the required stable demand data time series for multiple steel varieties. Specifically:
Step101, acquiring industrial product production data in the intelligent supply chain system, including original data time series such as production BOM (bill of materials) data, production data of host production plants and production data of part production plants;
Step102, cleaning the original data time series based on the SARIMA model technical framework: adjusting the date format by program, filling missing data by a moving average method, setting the threshold to 6 months and filling with 0 any run of more than 6 consecutive missing months, and, treating 12 months as one complete period in line with the industrial product production data, retaining the top 90% of the data ranked by data volume over one complete period as the available data source;
Step103, establishing the functional relation $W_t = g(P_1, P_2, \ldots, P_n)$ between production data and demand data according to the production BOM (bill of materials) data, the production data of host production plants and the production data of part production plants, and using it to convert the industrial product production data of the intelligent supply chain into the corresponding time series of demand data for multiple steel varieties, where $W_t$ is the demand value of different steel varieties in different months and $P_1, P_2, \ldots, P_n$ are the production values of different industrial products;
Step104, applying a 3-month moving average to the time series of demand data for multiple steel varieties obtained in Step103 to obtain the sample data length used for rolling prediction by the demand prediction model and a preliminarily stable demand data time series.
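The gap filling of Step102 and the smoothing of Step104 can be sketched in plain Python. The source states only that missing data are filled by a moving average method. The sketch assumes this means the mean of up to three preceding values, and that a gap at the very start of the series, which has no history, is filled with 0. The function names are hypothetical, and missing months are represented by `None`.

```python
def fill_missing(series, window=3, max_gap=6):
    """Fill None entries in a monthly series.

    Runs of more than max_gap consecutive missing months become 0.
    Shorter runs get the mean of the preceding `window` values (ASSUMPTION:
    trailing mean, with already-filled values counted as history).
    """
    filled = list(series)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        j = i
        while j < n and filled[j] is None:
            j += 1  # j ends one past the last missing month of this run
        gap = j - i
        for k in range(i, j):
            if gap > max_gap:
                filled[k] = 0.0
            else:
                history = filled[max(0, k - window):k]
                filled[k] = sum(history) / len(history) if history else 0.0
        i = j
    return filled


def moving_average(series, window=3):
    """Trailing moving average; the first window - 1 points are dropped."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


raw = [10.0, 20.0, None, 40.0]          # hypothetical monthly values
cleaned = fill_missing(raw)             # the single gap is filled with 15.0
smoothed = moving_average(cleaned)      # 3-month moving average
```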
Step two, establishing a SARIMA time series model based on the demand data time series obtained in step one.
The SARIMA algorithm model is tuned iteratively, and a trend feature component, a seasonal feature component and a random feature component are automatically decomposed from the demand data time series. From these three components, the standard deviation between the seasonal feature component and the random feature component is calculated by program, and the upper and lower limits of the demand data time series values are obtained as the trend feature component plus or minus the standard deviation. Outliers of every demand data time series are adjusted into this programmed interval so that extreme demand data in individual months do not distort the overall data pattern. Grid search for model selection is then used for training and optimization to obtain different combinations of the SARIMA$(p, d, q)(P, D, Q)_s$ parameters; the AIC (Akaike information criterion) value of each combination is calculated by program, and the combination with the smallest AIC value is selected as the optimal parameter combination, which completes the construction of the SARIMA time series model. The specific steps are as follows:
Step201, because the intelligent supply chain industrial product production data related to steel consumption show an obvious seasonal pattern, namely production volume in summer and winter is greater than in spring and autumn, the demand data for multiple steel varieties vary seasonally, and a SARIMA time series model can be constructed:
where $W_t$, $w_{t-n}$ and $w_{t-sn}$ are the demand values of the different steel varieties in months $t$, $t-n$ and $t-sn$; $\mu$ is a constant term; $\varepsilon_t$, $\varepsilon_{t-n}$ and $\varepsilon_{t-sn}$ are the errors of the demand values in months $t$, $t-n$ and $t-sn$; $p$ is the number of trend autoregressive terms, $q$ the number of trend moving average terms, $P$ the number of seasonal autoregressive terms and $Q$ the number of seasonal moving average terms; $\alpha_n$ is the trend autoregressive coefficient, $\theta_n$ the trend moving average coefficient, $\phi_n$ the seasonal autoregressive coefficient and $\eta_n$ the seasonal moving average coefficient.
The SARIMA algorithm model is tuned automatically, and the trend feature component, the seasonal feature component and the random feature component are automatically decomposed from the demand data time series and denoted $C_t$, $C_s$ and $C_r$ respectively.
Step202, calculating by program the standard deviation between the seasonal feature component and the random feature component from the three feature components obtained by model decomposition, with the following formula:
To prevent extreme demand data in individual months from distorting the overall data pattern, the upper and lower limits of the demand data time series values before each prediction are obtained as the trend feature component plus or minus the standard deviation, and the outliers of every demand data time series are adjusted into the programmed interval $[C_t - \sigma, C_t + \sigma]$.
Step203, training and optimizing by grid search for model selection to obtain different combinations of the SARIMA$(p, d, q)(P, D, Q)_s$ parameters, where $d$ is the number of trend differencing operations needed to reach a stationary series, $D$ is the number of seasonal differencing operations needed to reach a stationary series, and $s$ is the number of time steps in a single season. The AIC (Akaike information criterion) value of each combination is calculated by program, and the combination with the smallest AIC value is selected as the optimal parameter combination, which completes the construction of the SARIMA time series model.
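The outlier adjustment of Step202 amounts to clamping each observation into a band around the trend component. In the sketch below the three components are taken as given inputs. Because the formula for the standard deviation is not reproduced in this text, the sketch assumes it is the population standard deviation of the sum of the seasonal and random components. That choice, and the function names, are assumptions.

```python
from statistics import pstdev


def band_sigma(seasonal, residual):
    """ASSUMPTION: sigma is the population std of C_s + C_r.

    The source's own formula for sigma is not available in this text.
    """
    return pstdev([s + r for s, r in zip(seasonal, residual)])


def clip_to_trend_band(series, trend, sigma):
    """Clamp each observation into [C_t - sigma, C_t + sigma]."""
    return [min(max(w, c - sigma), c + sigma) for w, c in zip(series, trend)]


# Hypothetical decomposition of four monthly observations.
trend = [100.0, 100.0, 100.0, 100.0]
seasonal = [10.0, -10.0, 10.0, -10.0]
residual = [0.0, 0.0, 0.0, 0.0]
observed = [105.0, 80.0, 130.0, 95.0]

sigma = band_sigma(seasonal, residual)
adjusted = clip_to_trend_band(observed, trend, sigma)
```

Values already inside the band pass through unchanged; only the extremes are pulled back to the nearest limit.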
Step three, using the SARIMA time series model obtained in step two and the final stable demand data time series obtained in steps one and two, selecting the data in the period [T-24, T] as the training sample set, inputting the training sample set, and performing rolling prediction of the demand for multiple steel varieties up to month T+2 (30 to 60 days ahead) to obtain the demand prediction result for multiple steel varieties.
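The rolling scheme of step three can be sketched independently of the forecasting model: at each forecast origin T the model sees only the months T-24 through T (25 points) and predicts two months ahead. The forecaster is passed in as a function. The `seasonal_naive` stand-in below is a placeholder for illustration and is not the SARIMA model of the method; both function names are hypothetical.

```python
def rolling_forecast(series, forecast_fn, window=25, horizon=2):
    """Roll the forecast origin T over the series.

    For each T with a full history, train on series[T-24 .. T] and forecast
    `horizon` months ahead. Returns {T: [forecast for T+1, ..., T+horizon]}.
    """
    results = {}
    for t in range(window - 1, len(series)):
        train = series[t - window + 1:t + 1]
        results[t] = forecast_fn(train, horizon)
    return results


def seasonal_naive(train, horizon, s=12):
    """Placeholder forecaster: repeat the value seen one season earlier."""
    return [train[len(train) - s + h] for h in range(horizon)]


# Hypothetical series with an exact 12-month cycle, 30 months long.
series = [m % 12 for m in range(30)]
forecasts = rolling_forecast(series, seasonal_naive)
```

In practice `forecast_fn` would fit the selected SARIMA model on `train` and return its two-step forecast.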
Step four, the production data of host production plants and of part production plants follow an obvious natural pattern: changes in the part production plants' production data always lead changes in the host production plants' production data. Using this pattern, the time series of demand data for multiple steel varieties corresponding to the host production plants' production data and to the part production plants' production data are obtained on the basis of step one and normalized to give two groups of data mapped into the (0, 1) range. The correlation coefficient of the two mapped groups is calculated by correlation analysis, and the data that can serve as the external variable X of the SARIMAX model are screened out. Building on the SARIMA algorithm model of steps two and three, grid search for model selection is used for training and optimization to obtain different combinations of the SARIMAX$(p, d, q)(P, D, Q)_s$ parameters; the AIC (Akaike information criterion) value of each combination is calculated by program, and the combination with the smallest AIC value is selected as the optimal parameter combination, which completes the construction of the SARIMAX time series model. The specific steps are as follows:
Step401, normalizing the time series of demand data for multiple steel varieties corresponding to the host production plants' production data and to the part production plants' production data, obtained in step one, to give two groups of data mapped into the (0, 1) range, denoted $W_a$ and $W_b$ respectively;
Step402, calculating the correlation coefficient of the two mapped groups by correlation analysis, screening out the data in $W_b$ whose correlation coefficient is at least 0.7 and which lead $W_a$, and taking the pre-normalization demand data for multiple steel varieties corresponding to the production data of the part production plants as the external variable X of the SARIMAX model, as follows:
where the external-variable term is the demand value of multiple steel varieties corresponding to the production data of the part production plants, $\beta_n$ is the coefficient of the external variable, and $r$ is the number of regression terms of the external variable;
Step403, based on the demand data time series obtained after step two, and because adding the external variable X can affect the choice of the SARIMAX$(p, d, q)(P, D, Q)_s$ parameter combination, training and optimizing again by grid search for model selection to obtain different combinations of the SARIMAX$(p, d, q)(P, D, Q)_s$ parameters, calculating the AIC (Akaike information criterion) value of each combination by program, and selecting the combination with the smallest AIC value as the optimal parameter combination, which completes the construction of the SARIMAX time series model.
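Step401 and Step402 can be sketched as a min-max normalization followed by a lagged correlation test. The source does not say which correlation coefficient is used or how the lead is measured, so the sketch assumes the Pearson coefficient and a lead expressed as a whole number of months. The function names and the sample data are hypothetical. Min-max scaling maps onto the closed interval [0, 1], end points included.

```python
from statistics import mean


def min_max(values):
    """Scale a series linearly so that its minimum is 0 and its maximum 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant series cannot be normalized")
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def best_lead(w_a, w_b, max_lag=3, threshold=0.7):
    """Find the lag (in months) at which w_b best leads w_a.

    Returns (lag, r) if the best correlation reaches the threshold,
    otherwise None. ASSUMPTION: lead is tested at lags 1..max_lag.
    """
    best = None
    for lag in range(1, max_lag + 1):
        # Pair w_b at month t with w_a at month t + lag.
        r = pearson(w_b[:-lag], w_a[lag:])
        if best is None or r > best[1]:
            best = (lag, r)
    return best if best[1] >= threshold else None


# Hypothetical demand series: the host series repeats the parts series
# one month later.
parts = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 8.0]
host = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0]
w_a, w_b = min_max(host), min_max(parts)
lead = best_lead(w_a, w_b)
```

When `best_lead` returns a result, the pre-normalization parts-plant demand series would be the candidate for the external variable X.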
Step five, using the SARIMAX time series model obtained in step four and the final stable demand data time series obtained in steps one and two, selecting the data in the period [T-24, T] as the training sample set, inputting the training sample set, and predicting the demand for multiple steel varieties up to month T+2 (60 days ahead) to obtain the algorithm-corrected demand prediction result for multiple steel varieties.
Step six, the final prediction result is obtained by combining the corrected demand prediction result output by the SARIMAX model in step five with the remainder of the demand prediction result output by the SARIMA model in step three, the remainder being the SARIMA output after removing the demand time series for which a SARIMAX output is available.
By exploiting the patterns in industrial production data in the intelligent supply chain together with targeted data processing, the method obtains the sample data length used for rolling prediction by the demand prediction model and a preliminarily stable demand data time series; the more stable demand data time series improves prediction accuracy. By establishing the SARIMA and SARIMAX time series models, the method can predict the demand data of multiple steel varieties, which form a non-stationary time series: the non-stationary series is converted into a stationary one and the influence of seasonal factors is taken into account. The natural patterns existing among the production data of different industrial products are mined to form effective features for model construction, which improves algorithmic efficiency and reduces the complexity of data source searching and correlation factor analysis, so that highly accurate prediction results suitable for engineering application can be obtained.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and substitutions may be made without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the invention is subject to the protection scope of the claims.