CN120215281B - A dynamic optimization control method for tin smelting process - Google Patents

A dynamic optimization control method for tin smelting process

Info

Publication number
CN120215281B
Authority
CN
China
Prior art keywords
state
model
tin
smelting process
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202510677737.5A
Other languages
Chinese (zh)
Other versions
CN120215281A (en
Inventor
刘英莉
熊正
杨玲
沈韬
袁海滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan Tin Industry Co ltd
Kunming University of Science and Technology
Original Assignee
Yunnan Tin Industry Co ltd
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan Tin Industry Co ltd, Kunming University of Science and Technology
Priority to CN202510677737.5A
Publication of CN120215281A
Application granted
Publication of CN120215281B
Active
Anticipated expiration

Abstract

The invention relates to a dynamic optimization control method for a tin smelting process and belongs to the interdisciplinary field of metallurgical engineering and artificial intelligence. First, data are collected and preprocessed: abnormal values are removed, and missing values are generated with a cGAN to fill gaps in the data. Next, an LSTM model is used to model the key state variables of the tin smelting process, capturing complex time-series patterns, with metallurgical knowledge embedded as constraints. A data-driven state update model is then added to the reinforcement learning environment. Finally, a deep reinforcement learning algorithm is introduced, with the operating parameters of the lance (spray gun) as the action space and the state of the tin smelting process as the state space. The reward function incorporates the predicted values of key parameters, and the agent learns a dynamic optimization control strategy. The method offers significant advantages in dynamically optimizing tin purity and reducing CO emissions, and provides an innovative solution for intelligent, efficient production in the metallurgical industry.

Description

Dynamic optimization control method for tin smelting process
Technical Field
The invention relates to a dynamic optimization control method for a tin smelting process and belongs to the interdisciplinary field of metallurgical engineering and artificial intelligence.
Background
Tin smelting is an important process in nonferrous metal smelting, characterized by complex physicochemical reactions and multivariable coupling. Conventional tin smelting control methods generally rely on empirical rules and static optimization strategies, and struggle to cope with dynamic changes and complex nonlinear relationships within the furnace. In practice these methods face several challenges: dynamic changes in key indicators (such as tin purity and CO emission concentration) are difficult to predict accurately, control strategies are difficult to optimize in real time, and the result is high energy consumption, high emissions, and fluctuating product quality.
In recent years, with the rapid development of intelligent manufacturing, deep learning and reinforcement learning have offered new approaches to intelligent optimization of the tin smelting process. Recurrent neural networks and their improved variants (such as the long short-term memory network) perform well in time-series modeling, which makes accurate prediction of key parameters in the tin smelting process possible, and deep reinforcement learning provides a theoretical basis for constructing dynamic optimization control strategies. However, machine learning models that rely solely on data tend to ignore physical constraints of the metallurgical process, such as stoichiometry and energy balance, which can make the prediction and optimization results insufficiently reliable.
Therefore, a dynamic optimization control method that integrates data-driven modeling with metallurgical knowledge can fully exploit the data from the tin smelting process while embedding domain knowledge to improve the interpretability and accuracy of the model. By constructing a reinforcement learning environment, taking the operating parameters of the lance as the action space and the key states of the tin smelting process as the state space, and designing a reward function that senses changes in process conditions in real time, the agent can autonomously learn and dynamically optimize the control strategy, thereby improving tin purity, reducing CO emissions, and minimizing energy consumption. This background lays the theoretical and practical foundation for intelligent optimization of the tin smelting process.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a dynamic optimization control method for a tin smelting process that optimizes the operating parameters of the smelting process in real time, so as to solve the problems described above.
The technical scheme of the invention is a dynamic optimization control method for a tin smelting process. First, relevant data are collected from the database of a tin smelting plant, including the state variables and lance operating parameters at each moment of the smelting process. Second, the data are preprocessed: abnormal values recorded by the sensors are removed, and a conditional GAN (cGAN) generates values for key variables that are measured at low frequency, filling gaps in the data. Then, a long short-term memory (LSTM) model is used to model the key state variables of the tin smelting process (such as tin purity and CO concentration) and capture their time-series patterns, with metallurgical knowledge (such as the oxygen-coal stoichiometric ratio and the change in oxygen flow) embedded as constraints to improve the accuracy and reliability of the model's predictions. In constructing the reinforcement learning environment, a data-driven state update model is adopted: the key state variables and the lance operating parameters are taken as input, a deep learning model predicts the state at the next moment, and metallurgical knowledge is incorporated to improve the reliability of state transitions. Finally, a deep reinforcement learning algorithm is introduced, with the lance operating parameters as the action space and the state of the tin smelting process as the state space, and a reward function is designed to reflect the optimization targets of higher tin purity and lower CO emissions. The agent thereby learns an optimal strategy that guides it to generate optimal lance operating recommendations under different environmental conditions.
The method comprises the following specific steps:
Step1, collecting the relevant parameter data of the tin smelting process, including the state variable values and lance operating parameter values at each moment in the smelting furnace, to provide a comprehensive data basis for subsequent analysis and modeling;
Step2, preprocessing the relevant parameter data: deleting abnormal values recorded by the sensors, and using a cGAN to generate values for variables whose measurement frequency is below a preset threshold, so as to fill gaps in the data and improve data integrity;
Step3, modeling the tin purity and CO concentration of the tin smelting process with an improved LSTM model, capturing the time-series patterns of the state variable values and lance operating parameter values, and improving the accuracy and reliability of prediction;
Step4, constructing a reinforcement learning environment with a data-driven state update model: the state variable values and lance operating parameter values of the tin smelting process are taken as input, a deep learning model predicts the state values at the next moment, and metallurgical knowledge constraints are incorporated to improve the reliability of state transitions and the realism of the environment model;
Step5, designing a reward function that, based on the captured time-series patterns, guides the reinforcement learning agent to learn in the direction of higher tin purity and lower CO emissions; the reward function feeds back the effect of the agent's decisions in real time and provides effective guidance for the optimization process;
Step6, introducing a deep reinforcement learning (DRL) algorithm: the lance operating parameters form the action space and the state of the tin smelting process forms the state space; the agent learns by interacting with the state update model and continuously optimizes the control strategy to achieve the dynamic optimization target, and the resulting strategy guides the agent to generate optimal lance operating recommendations under different environmental states.
The Step2 specifically comprises the following steps:
Step2.1, setting thresholds for each variable according to the measuring range of the sensors and the experience of process experts, and directly eliminating abnormal values that exceed the thresholds, to ensure the accuracy and reliability of the data used by subsequent modules;
Step2.2, dividing the data: the complete, high-frequency measured variables are used as the conditional input of the model and the low-frequency measured variables as the target variables to be generated, and missing values are generated with a cGAN model; the high-frequency measured variables are fed as conditions to both the generator and the discriminator to provide additional constraint information, improving data integrity and consistency.
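The threshold-based screening of Step2.1 can be illustrated with a minimal sketch. The variable names and numeric limits below are hypothetical placeholders, not values from the source; real limits would come from sensor ranges and process-expert experience. Out-of-range readings are marked as missing (`None`) so that a later generation step can fill them, which is one possible reading of "eliminating" them.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[float]]

# Hypothetical (lower, upper) limits per variable; assumptions for illustration only.
LIMITS: Dict[str, Tuple[float, float]] = {
    "oxygen_flow": (0.0, 20000.0),
    "furnace_pressure": (-500.0, 500.0),
}


def remove_outliers(records: List[Record],
                    limits: Dict[str, Tuple[float, float]]) -> List[Record]:
    """Return a copy of the records with out-of-range readings set to None."""
    cleaned: List[Record] = []
    for rec in records:
        row: Record = {}
        for name, value in rec.items():
            # Variables without a configured limit are passed through unchanged.
            lo, hi = limits.get(name, (float("-inf"), float("inf")))
            in_range = value is not None and lo <= value <= hi
            row[name] = value if in_range else None
        cleaned.append(row)
    return cleaned


raw = [
    {"oxygen_flow": 8500.0, "furnace_pressure": -20.0},
    {"oxygen_flow": 99999.0, "furnace_pressure": -18.0},  # simulated sensor spike
]
cleaned = remove_outliers(raw, LIMITS)
```

Marking values as missing rather than dropping whole rows keeps the one-minute time grid intact, which matters for the time-series models used later.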
The Step3 specifically comprises the following steps:
Step3.1, extracting, from the existing industrial data of the tin smelting process, core variables that reflect dynamic changes in the furnace, and generating new feature sequences for the smelting process through mathematical transformation; these features both enrich the data dimensions and provide more meaningful input for subsequent modeling;
Step3.2, embedding physical constraints from the metallurgical field in the loss function of the original LSTM model, so that constraint penalty terms guide the LSTM model to learn predictions that conform to actual process laws, improving the scientific soundness and interpretability of the predictions;
Step3.3, using the improved LSTM model to train predictions of the index variables of the tin smelting process, capturing complex temporal dependencies through the gating mechanism and accurately modeling the dynamic changes among variables;
Step3.4, after training, verifying the prediction performance of the model on an independent test data set, evaluating the prediction accuracy for tin purity and CO concentration, and storing the LSTM model parameters to ensure the applicability of the model in industrial scenarios.
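Training a sequence model such as an LSTM usually starts by cutting the time series into fixed-length input windows with a target some steps ahead. The sketch below shows one common way to do this; the window length, the prediction horizon, and the sample values are assumptions for illustration, since the source does not state how sequences are prepared.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float],
                 window: int,
                 horizon: int = 1) -> Tuple[List[List[float]], List[float]]:
    """Slice a series into (input window, target) pairs.

    Each input holds `window` consecutive values; the target is the value
    `horizon` steps after the end of that window.
    """
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    inputs: List[List[float]] = []
    targets: List[float] = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        inputs.append(list(series[start:start + window]))
        targets.append(series[start + window + horizon - 1])
    return inputs, targets


# Hypothetical CO readings at one-minute resolution.
co_ppm = [120.0, 118.0, 121.0, 125.0, 123.0]
X, y = make_windows(co_ppm, window=3)
```

A series shorter than `window + horizon` simply yields no pairs, so short fragments left after outlier removal do not raise errors.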
The Step4 specifically comprises the following steps:
Step4.1, obtaining the state changes over consecutive time steps from the time-series data, including state variables and action variables, and organizing the data into triplets:
$$(s_t,\ a_t,\ s_{t+1})$$
where $s_t$ is the state variable at the current moment, $a_t$ is the action variable at the current moment, and $s_{t+1}$ is the state variable at the next moment; this data format lays the foundation for training the subsequent state update model;
Step4.2, constructing a state update model based on a fully connected neural network that takes the current state $s_t$ and the current action $a_t$ as input and predicts the next state $s_{t+1}$, and training the state update model on the constructed triplet data;
Step4.3, evaluating the performance of the state update model on a validation set, and storing the state update model parameters that meet the preset index requirements, to provide a reliable state update mechanism for subsequent embedding in the reinforcement learning environment;
Step4.4, in the reinforcement learning environment, using the trained state update model parameters to predict the environment state at the next moment from the current environment state and the currently executed action, which guides the agent's action selection; this embedding strengthens the environment's ability to describe complex dynamic changes, gives the agent more realistic state feedback, guides it toward better actions, and achieves efficient control of the tin smelting process.
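The triplet construction of Step4.1 can be sketched as follows. The state and action vectors are made-up numbers and the function name is hypothetical; the only logic taken from the text is pairing each time step's state and action with the state at the following step.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Transition = Tuple[Vector, Vector, Vector]


def build_transitions(states: Sequence[Vector],
                      actions: Sequence[Vector]) -> List[Transition]:
    """Pair (state_t, action_t) with state_{t+1} for every consecutive step."""
    if len(states) != len(actions):
        raise ValueError("states and actions must cover the same time steps")
    # The final time step has no successor, so it yields no triplet.
    return [(states[t], actions[t], states[t + 1])
            for t in range(len(states) - 1)]


# Illustrative values only: [tin purity, CO ppm] states and [oxygen flow] actions.
states = [[0.95, 120.0], [0.955, 117.0], [0.96, 115.0]]
actions = [[8500.0], [8600.0], [8600.0]]
transitions = build_transitions(states, actions)
```

With N time steps this produces N - 1 training samples for the state update model.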
The Step5 specifically comprises the following steps:
Step5.1, setting the optimization targets, namely improving tin purity and reducing CO emissions, together with auxiliary targets, including reducing energy consumption and keeping key process parameters such as furnace pressure stable, so as to achieve multi-objective cooperative optimization and ensure efficient operation of the tin smelting process;
Step5.2, using the stored LSTM model parameters to predict the effect of the current process parameter adjustment on tin purity and CO emissions over a preset future time period, and using the prediction as the basis for calculating the reward function, so that the reward accurately reflects the contribution of the optimization strategy to the long-term target;
Step5.3, designing the reward function R, which is computed from the following quantities: weight coefficients that balance the importance of the multiple targets, the tin purity at the current moment, the carbon monoxide content at the current moment, the tin purity at the next moment, the carbon monoxide content at the next moment, and the energy consumption at the next moment;
To improve real-time performance and robustness, the reward function incorporates the key parameter predictions of the LSTM model, and the long-term optimization target of the tin smelting process is decomposed into several short-term optimization sub-targets; this more accurately reflects the effect of the current process adjustment on future states, effectively guiding the agent's learning and the continuous optimization of the control strategy.
The Step6 specifically comprises the following steps:
Step6.1, regulating the lance operating parameters based on the DDPG algorithm: the operations executed by the lance in the smelting furnace are defined as the action space, and ten core variables in the smelting furnace are defined as the state space, namely total concentrate amount, bath accumulation, oxygen content percentage, furnace bottom middle temperature, furnace bottom outer temperature, furnace bottom inner temperature, furnace temperature rise, furnace pressure, exhaust gas CO analysis, and total energy consumption;
Step6.2, using the Actor network of the DDPG algorithm to predict the optimal action from the current state, which is used to adjust the lance operating parameters, and using the Critic network to evaluate the Q value of the current strategy, which guides the optimization of the Actor;
Step6.3, storing the interaction data between the reinforcement learning agent and the furnace environment in an experience pool, and randomly sampling batches of furnace states and lance operating parameters from it for training; this breaks temporal correlation, lets the agent learn operating experience that is absent from the earlier consecutive furnace-period data, and improves the training stability and generalization ability of the model;
Step6.4, adding noise to strengthen the agent's exploration of the action space and avoid getting trapped in local optima, and simulating plant fault conditions to complete model training;
Step6.5, using the trained model to achieve intelligent dynamic optimization control of the tin smelting process: state variables are monitored in real time, and the model's optimization results are used to provide recommendations to the lance operator and dynamically adjust the operating parameters, thereby improving tin purity, reducing CO emissions, and maximizing process energy efficiency.
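The experience pool of Step6.3 and the exploration noise of Step6.4 can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the pool capacity, the batch size, the Gaussian form of the noise, and the action bounds are all hypothetical. The source says only that noise is added; the original DDPG formulation uses Ornstein-Uhlenbeck noise, and Gaussian noise is a commonly used simpler substitute.

```python
import random
from collections import deque
from typing import Deque, List, Sequence, Tuple

Transition = Tuple[Sequence[float], Sequence[float], float, Sequence[float]]


class ReplayBuffer:
    """Fixed-size experience pool; the oldest transitions are dropped first."""

    def __init__(self, capacity: int) -> None:
        self._items: Deque[Transition] = deque(maxlen=capacity)

    def add(self, state: Sequence[float], action: Sequence[float],
            reward: float, next_state: Sequence[float]) -> None:
        self._items.append((state, action, reward, next_state))

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        # Uniform sampling without replacement breaks temporal correlation.
        k = min(batch_size, len(self._items))
        return rng.sample(list(self._items), k)

    def __len__(self) -> int:
        return len(self._items)


def noisy_action(action: Sequence[float], sigma: float,
                 low: Sequence[float], high: Sequence[float],
                 rng: random.Random) -> List[float]:
    """Add Gaussian exploration noise, then clip to the allowed range."""
    return [min(hi, max(lo, a + rng.gauss(0.0, sigma)))
            for a, lo, hi in zip(action, low, high)]


rng = random.Random(0)
pool = ReplayBuffer(capacity=3)
for step in range(5):
    pool.add([float(step)], [0.0], 0.0, [float(step + 1)])
batch = pool.sample(2, rng)
explored = noisy_action([8500.0], sigma=50.0, low=[8000.0], high=[9000.0], rng=rng)
```

Clipping after adding noise keeps every explored action inside the permitted operating range, which matters when the actions are physical set points.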
Step3.2 specifically comprises the following:
A metallurgical knowledge constraint is added to the loss function of the LSTM model. The constrained loss is:
$$L_{\text{total}} = L_{\text{pred}} + \lambda \, f$$
where $L_{\text{total}}$ is the total loss, $L_{\text{pred}}$ is the prediction loss of the LSTM model, $\lambda$ is a constant, and $f$ is a penalty function that must be built from metallurgical knowledge.
The penalty function $f$ depends on a constant and on the difference between the predicted value and the theoretical value.
The theoretical CO value is calculated from the combustion-coal flow rate, a constant related to the calorific value of the fuel coal and the combustion efficiency, the oxygen flow rate, and the theoretical stoichiometric ratio of the combustion coal.
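A minimal sketch of a physics-penalized loss is given below. It assumes an additive form (prediction loss plus a weighted penalty) and a quadratic penalty on the gap between predicted and theoretical CO values. Both choices, and the names `lam` and `alpha`, are assumptions for illustration, not the exact formulas of the source. The theoretical CO values are passed in as data rather than computed, because the combustion formula itself is not reproduced here.

```python
from typing import Sequence


def physics_penalty(co_pred: Sequence[float],
                    co_theory: Sequence[float],
                    alpha: float) -> float:
    """Assumed quadratic penalty: alpha times the mean squared deviation
    of predicted CO from its theoretical value."""
    if len(co_pred) != len(co_theory) or not co_pred:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - t) ** 2 for p, t in zip(co_pred, co_theory))
    return alpha * squared / len(co_pred)


def total_loss(pred_loss: float,
               co_pred: Sequence[float],
               co_theory: Sequence[float],
               lam: float,
               alpha: float = 1.0) -> float:
    """Assumed additive form: prediction loss plus lam times the penalty."""
    return pred_loss + lam * physics_penalty(co_pred, co_theory, alpha)


# Predictions that match theory add no penalty; a deviation of 1.0 adds lam * alpha.
no_violation = total_loss(0.5, [1.0, 2.0], [1.0, 2.0], lam=0.1)
with_violation = total_loss(0.5, [2.0], [1.0], lam=0.1, alpha=2.0)
```

Larger values of `lam` push the model toward physically consistent predictions at the cost of fitting the data less closely.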
The beneficial effects of the invention are as follows:
(1) Dynamic optimization control capability: compared with existing static optimization methods, the invention achieves dynamic optimization control of the tin smelting process by combining a deep reinforcement learning algorithm with a data-driven model. The agent monitors the state in real time and autonomously optimizes the lance operating strategy to cope with complex process changes;
(2) Unlike conventional data-driven models, the invention introduces metallurgical physical constraints such as the oxygen-coal stoichiometric ratio and combines them with an improved recurrent neural network (RNN), accurately capturing the dynamic patterns of key parameters and improving prediction accuracy and reliability;
(3) The invention adopts a data-driven state update model that accurately reflects the dynamic characteristics of tin smelting, provides a reliable state transition mechanism for reinforcement learning, prevents the optimization strategy from failing through accumulated environment errors, and improves learning efficiency and optimization performance.
Drawings
FIG. 1 is an overall frame diagram of the present invention;
FIG. 2 is a graph comparing predicted and actual values of the total concentrate amount according to an embodiment of the present invention;
FIG. 3 is a graph comparing predicted and actual values of the bath accumulation according to an embodiment of the present invention;
FIG. 4 is a graph comparing predicted and actual values of the oxygen content percentage according to an embodiment of the present invention;
FIG. 5 is a graph comparing predicted and actual values of the furnace bottom middle temperature according to an embodiment of the present invention;
FIG. 6 is a graph comparing predicted and actual values of the furnace bottom outer temperature according to an embodiment of the present invention;
FIG. 7 is a graph comparing predicted and actual values of the furnace bottom inner temperature according to an embodiment of the present invention;
FIG. 8 is a graph comparing predicted and actual values of the furnace temperature rise according to an embodiment of the present invention;
FIG. 9 is a graph comparing predicted and actual values of the furnace pressure according to an embodiment of the present invention;
FIG. 10 is a graph comparing predicted and actual values of the exhaust gas CO content according to an embodiment of the present invention;
FIG. 11 is a graph comparing predicted and actual values of the total energy consumption according to an embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the drawings and detailed description.
Embodiment 1: As shown in FIG. 1, the dynamic optimization control method for the tin smelting process comprises the following specific steps:
Step1, collecting the relevant parameter data of the tin smelting process, including the state variable values and lance operating parameter values at each moment in the smelting furnace, to provide a comprehensive data basis for subsequent analysis and modeling.
Specifically, to verify the real-time dynamic optimization effect of the model in the tin smelting process, this embodiment uses actual operating data from a tin smelting plant and simulates the interaction between the reinforcement learning model and the industrial production environment to evaluate the improvement in tin purity, the reduction in CO emissions, and the optimization of energy consumption. Smelting data for 30 consecutive days were extracted from the data system of the plant's Ausmelt furnace at a time resolution of 1 minute. The data set contains 43,200 records in total, each with 10 state variables and 8 operating parameters, as listed in Tables 1 and 2.
Table 1. Action variables performed by the lance in the tin smelting plant data
Action variable name          Unit
Total material flow           t/h
Fuel coal flow rate           kg/h
Coal-carrying air flow        Nm³/h
Coal-carrying air pressure    kPa
Lance back pressure           kPa
Lance position                mm
Oxygen flow rate              Nm³/h
Air flow rate                 Nm³/h
Table 2. State variables of the Ausmelt furnace in the tin smelting plant data
State variable name               Unit
Total concentrate amount          t
Bath accumulation                 t
Oxygen concentration              %
Furnace bottom middle temperature °C
Furnace bottom outer temperature  °C
Furnace bottom inner temperature  °C
Furnace temperature rise          °C
Furnace pressure                  Pa
Exhaust gas CO content            ppm
Energy consumption                kW
Step2, preprocessing the relevant parameter data: deleting abnormal values recorded by the sensors, and using a cGAN to generate values for variables whose measurement frequency is below a preset threshold, so as to fill gaps in the data and improve data integrity.
Step2.1, setting thresholds for each variable according to the measuring range of the sensors and the experience of process experts, and directly eliminating abnormal values that exceed the thresholds, to ensure the accuracy and reliability of the data used by subsequent modules;
Step2.2, dividing the data: the complete, high-frequency measured variables (such as oxygen flow and furnace pressure) are used as the conditional input of the model and the low-frequency measured variables (such as tin purity) as the target variables to be generated, and missing values are generated with a cGAN model; the high-frequency measured variables are fed as conditions to both the generator and the discriminator to provide additional constraint information, improving data integrity and consistency.
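The data division of Step2.2 can be sketched as a split between rows where the low-frequency target was actually measured (usable for training the generative model) and rows where it is missing (to be generated from the conditions). The field names and values are hypothetical, and the cGAN itself is not shown.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, Optional[float]]


def split_for_generation(rows: Sequence[Row],
                         condition_keys: Sequence[str],
                         target_key: str
                         ) -> Tuple[List[Tuple[List[float], float]], List[List[float]]]:
    """Separate rows into (condition, target) training pairs and
    condition-only rows whose target still has to be generated."""
    train_pairs: List[Tuple[List[float], float]] = []
    to_generate: List[List[float]] = []
    for row in rows:
        condition = [row[key] for key in condition_keys]
        target = row.get(target_key)
        if target is None:
            to_generate.append(condition)
        else:
            train_pairs.append((condition, target))
    return train_pairs, to_generate


# Hypothetical minute-level rows: tin purity is sampled far less often.
rows = [
    {"oxygen_flow": 8500.0, "furnace_pressure": -20.0, "tin_purity": 0.95},
    {"oxygen_flow": 8600.0, "furnace_pressure": -18.0, "tin_purity": None},
]
train_pairs, to_generate = split_for_generation(
    rows, ["oxygen_flow", "furnace_pressure"], "tin_purity")
```

The condition vectors in `to_generate` are what a trained conditional generator would receive, together with random noise, to produce the missing target values.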
Step3, modeling the tin purity and CO concentration of the tin smelting process with an improved LSTM model, capturing the time-series patterns of the state variable values and lance operating parameter values, and improving the accuracy and reliability of prediction.
Step3.1, extracting, from the existing industrial data of the tin smelting process, core variables (such as oxygen flow, fuel coal flow, and furnace pressure) that reflect dynamic changes in the furnace, and generating new feature sequences for the smelting process through mathematical transformation; these features both enrich the data dimensions and provide more meaningful input for subsequent modeling;
The new features comprise the oxygen flow variation, the combustion-coal flow variation, and the oxygen-coal stoichiometric ratio. The two variations help the model capture the time-series dynamics and nonlinear relationships of the combustion reaction in the furnace, and the oxygen-coal stoichiometric ratio reflects the supply-demand relationship between oxygen and fuel. The variations are calculated as follows:
$$\Delta Q_{O_2} = Q_{O_2}^{t} - Q_{O_2}^{t-1}$$
where $\Delta Q_{O_2}$ is the oxygen flow variation, $Q_{O_2}^{t}$ is the oxygen flow at the current time step, and $Q_{O_2}^{t-1}$ is the oxygen flow at the previous time step;
$$\Delta Q_{coal} = Q_{coal}^{t} - Q_{coal}^{t-1}$$
where $\Delta Q_{coal}$ is the combustion-coal flow variation, $Q_{coal}^{t}$ is the combustion-coal flow at the current time step, and $Q_{coal}^{t-1}$ is the combustion-coal flow at the previous time step;
The oxygen-coal stoichiometric ratio is calculated from the oxygen flow rate, the oxygen purity, the oxygen density, the combustion-coal flow rate, and the combustion-coal purity.
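The derived features can be sketched as follows. The flow differences follow directly from the definitions above (current value minus previous value). The ratio, however, is only an assumed form: the text names its five inputs, and the sketch combines them as oxygen mass flow divided by effective coal flow, which is one plausible reading rather than the source's confirmed formula. All numbers are illustrative.

```python
from typing import List, Sequence


def flow_deltas(flow: Sequence[float]) -> List[float]:
    """Change of a flow between consecutive time steps: value[t] - value[t-1]."""
    return [flow[t] - flow[t - 1] for t in range(1, len(flow))]


def oxygen_coal_ratio(o2_flow: float, o2_purity: float, o2_density: float,
                      coal_flow: float, coal_purity: float) -> float:
    """Assumed form: (oxygen flow * purity * density) / (coal flow * purity)."""
    if coal_flow <= 0.0 or coal_purity <= 0.0:
        raise ValueError("coal flow and purity must be positive")
    return (o2_flow * o2_purity * o2_density) / (coal_flow * coal_purity)


# Illustrative values: oxygen in Nm3/h, coal in kg/h, density in kg/Nm3.
oxygen = [8500.0, 8600.0, 8550.0]
d_oxygen = flow_deltas(oxygen)
ratio = oxygen_coal_ratio(o2_flow=1000.0, o2_purity=0.9, o2_density=1.429,
                          coal_flow=500.0, coal_purity=0.8)
```

The first time step has no predecessor, so the difference series is one element shorter than the input series.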
Step3.2, embedding physical constraints from the metallurgical field in the loss function of the original LSTM model, so that constraint penalty terms (such as the deviation of the oxygen-coal stoichiometric ratio) guide the LSTM model to learn predictions that conform to actual process laws, improving the scientific soundness and interpretability of the predictions;
Specifically, a metallurgical knowledge constraint is added to the loss function of the LSTM model. The constrained loss is:
$$L_{\text{total}} = L_{\text{pred}} + \lambda \, f$$
where $L_{\text{total}}$ is the total loss, $L_{\text{pred}}$ is the prediction loss of the LSTM model, $\lambda$ is a constant, and $f$ is a penalty function that must be built from metallurgical knowledge.
The penalty function $f$ depends on a constant and on the difference between the predicted value and the theoretical value.
The theoretical CO value is calculated from the combustion-coal flow rate, a constant related to the calorific value of the fuel coal and the combustion efficiency, the oxygen flow rate, and the theoretical stoichiometric ratio of the combustion coal.
Step3.3, using the improved LSTM model to train predictions of the index variables of the tin smelting process (such as tin purity and CO concentration), capturing complex temporal dependencies through the gating mechanism and accurately modeling the dynamic changes among variables;
Step3.4, after training, verifying the prediction performance of the model on an independent test data set, evaluating the prediction accuracy for tin purity and CO concentration, and storing the LSTM model parameters to ensure the applicability of the model in industrial scenarios.
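The held-out evaluation of Step3.4 can be sketched with a chronological split and two common error metrics. The source does not name its metrics or split ratio, so RMSE, MAE, and the 80/20 split below are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple


def split_chronological(samples: Sequence[float],
                        train_fraction: float) -> Tuple[List[float], List[float]]:
    """Split without shuffling so the test set lies strictly after the training set."""
    cut = int(len(samples) * train_fraction)
    return list(samples[:cut]), list(samples[cut:])


def rmse(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


def mae(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)


train, test = split_chronological([float(i) for i in range(10)], 0.8)
error_rmse = rmse([0.0, 0.0], [3.0, 4.0])
error_mae = mae([0.0, 0.0], [3.0, 4.0])
```

Splitting by time rather than at random avoids leaking future furnace states into training, which would overstate accuracy on time-series data.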
Step4, constructing a reinforcement learning environment with a data-driven state update model: the state variable values and lance operating parameter values of the tin smelting process are taken as input, a deep learning model predicts the state values at the next moment, and metallurgical knowledge constraints are incorporated to improve the reliability of state transitions and the realism of the environment model.
Step4.1, obtaining the state changes over consecutive time steps from the time-series data, including state variables and action variables, and organizing the data into triplets:
$$(s_t,\ a_t,\ s_{t+1})$$
where $s_t$ is the state variable at the current moment, $a_t$ is the action variable at the current moment, and $s_{t+1}$ is the state variable at the next moment; this data format lays the foundation for training the subsequent state update model;
Step4.2, constructing a state update model based on a fully connected neural network that takes the current state $s_t$ and the current action $a_t$ as input and predicts the next state $s_{t+1}$, and training the state update model on the constructed triplet data;
Step4.3, evaluating the performance of the state update model on a validation set, and storing the state update model parameters that meet the preset index requirements, to provide a reliable state update mechanism for subsequent embedding in the reinforcement learning environment. The state update results are shown in FIGS. 2 to 11, which compare the predicted and actual values of the state update model for ten states: total concentrate amount, bath accumulation, oxygen content percentage, furnace bottom middle temperature, furnace bottom outer temperature, furnace bottom inner temperature, furnace temperature rise, furnace pressure, exhaust gas CO content, and total energy consumption. The predicted value is the next-moment state predicted from the state and action at the current moment, and the actual value is the actual state at the next moment;
Step4.4, in the reinforcement learning environment, using the trained state update model parameters to predict the environment state at the next moment from the current environment state and the currently executed action, which guides the agent's action selection; this embedding strengthens the environment's ability to describe complex dynamic changes, gives the agent more realistic state feedback, guides it toward better actions, and achieves efficient control of the tin smelting process.
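How a learned state update model can serve as the reinforcement learning environment is sketched below. The class name `SmeltingEnv` and its interface are hypothetical, and the "model" here is a toy linear stand-in rather than a trained neural network; the point is only the control flow of Step4.4, in which the environment's next state comes from the model's prediction.

```python
from typing import Callable, List, Sequence, Tuple

State = List[float]
Action = Sequence[float]


class SmeltingEnv:
    """Environment whose transitions come from a learned state update model."""

    def __init__(self,
                 transition_model: Callable[[State, Action], State],
                 reward_fn: Callable[[State, Action, State], float],
                 initial_state: State) -> None:
        self._model = transition_model
        self._reward_fn = reward_fn
        self._initial_state = list(initial_state)
        self.state: State = list(initial_state)

    def reset(self) -> State:
        self.state = list(self._initial_state)
        return self.state

    def step(self, action: Action) -> Tuple[State, float]:
        # The model predicts the next state from the current state and action.
        next_state = self._model(self.state, action)
        reward = self._reward_fn(self.state, action, next_state)
        self.state = next_state
        return next_state, reward


# Toy stand-ins for the trained model and the reward (assumptions for illustration).
def toy_model(state: State, action: Action) -> State:
    return [state[0] + 0.1 * action[0]]


def toy_reward(state: State, action: Action, next_state: State) -> float:
    return next_state[0] - state[0]


env = SmeltingEnv(toy_model, toy_reward, [1.0])
next_state, reward = env.step([2.0])
```

Passing the model and reward in as callables keeps the environment independent of any particular deep learning framework.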
Step5, designing a reward function, guiding the reinforcement learning agent to learn in the direction of tin purity improvement and CO emission reduction based on the captured time sequence change rule, and feeding back the decision effect of the agent in real time by the reward function, thereby providing effective guidance for the optimization process.
Step5.1, setting optimization targets, namely improving the tin purity and reducing the CO emission. Simultaneously, auxiliary targets are set, including reducing energy consumption and maintaining stability of key process parameters such as furnace pressure, so that multi-target cooperative optimization is realized, and efficient operation of a tin smelting process is ensured;
Step5.2, predicting the influence of the adjustment of the current process parameters on the tin purity and CO emission in the future 3 minutes by using the stored LSTM model parameters, and using the prediction result as a basis for calculating a reward function so as to accurately reflect the contribution of the optimization strategy to a long-term target;
step5.3, designing a reward function, wherein the formula is as follows:
()
where R is the reward function; the weight coefficients balance the importance of the multiple targets; and the remaining terms are the tin purity content at the current moment, the carbon monoxide content at the current moment, the tin purity content at the next moment, the carbon monoxide content at the next moment, and the energy consumption value at the next moment.
Further, when the LSTM model predictions show that adjustment of the current operating parameters can effectively increase tin purity and decrease CO concentration, the reward function will give positive excitation, and conversely, if the operation results in an increase in CO concentration or a decrease in tin purity, the reward function will give negative feedback.
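The behaviour described for the reward function can be illustrated with a short sketch. The exact formula is not reproduced in this text, so the linear form below, the function name `reward`, and the default weights are assumptions chosen only to match the stated behaviour: positive excitation when purity rises and CO falls, negative feedback otherwise, with an energy penalty.

```python
def reward(purity_now: float, co_now: float,
           purity_next: float, co_next: float, energy_next: float,
           w_purity: float = 1.0, w_co: float = 1.0,
           w_energy: float = 0.1) -> float:
    """Assumed linear reward: purity gain minus CO increase minus energy use."""
    purity_gain = purity_next - purity_now   # positive when purity improves
    co_increase = co_next - co_now           # positive when CO gets worse
    return w_purity * purity_gain - w_co * co_increase - w_energy * energy_next


# Purity rises and CO falls: positive excitation (energy weight set to 0 here).
good = reward(95.0, 2.0, 96.0, 1.5, 1.78, w_energy=0.0)
# Purity falls and CO rises: negative feedback.
bad = reward(96.0, 1.5, 95.0, 2.0, 1.78, w_energy=0.0)
```

The weights play the role of the weight coefficients in the text; their values here are placeholders and would need tuning against plant data.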
Step6, introducing a deep reinforcement learning (DRL) algorithm, in which the operating parameters of the spray gun form the action space and the states of the tin smelting process form the state space. The agent learns by interacting with the state update model and continuously optimizes the control strategy to achieve the dynamic optimization target; the resulting strategy guides the agent to generate the optimal spray gun operating recommendation under different environment states.
Step6.1, regulating the operating parameters of the spray gun based on the DDPG algorithm: the operations executed by the spray gun in the smelting furnace (such as oxygen flow and coal-carrying air pressure) are defined as the action space, and ten core variables in the smelting furnace are defined as the state space, namely total concentration, molten pool accumulation, oxygen content percentage, furnace bottom middle temperature, furnace bottom outer temperature, furnace bottom inner temperature, furnace rise temperature, furnace pressure, exhaust gas CO analysis, and total energy consumption;
Step6.2, using the Actor network of the DDPG algorithm to predict the optimal action from the current state, which is used to adjust the spray gun operating parameters, and using the Critic network to evaluate the Q value of the current strategy, which guides the optimization of the Actor;
Step6.3, storing the interaction data between the reinforcement learning agent and the furnace environment in an experience pool, and randomly sampling batches of furnace states and spray gun operating parameters from it for training. Random sampling breaks the temporal correlation, lets the agent learn operating experience that is absent from the earlier continuous furnace-period data, and improves the training stability and generalization capability of the model;
Step6.4, adding noise to enhance the agent's exploration of the action space and avoid falling into local optima, and to simulate plant fault conditions, thereby completing model training;
Step6.5, using the trained model to realize intelligent dynamic optimization control of the tin smelting process: state variables are monitored in real time and combined with the model's optimization results to give the spray gun operator optimization suggestions and to dynamically adjust the operating parameters, thereby improving tin purity, reducing CO emission, and maximizing process energy efficiency.
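The experience pool of Step6.3 and the exploration noise of Step6.4 can be sketched as follows. The names `ReplayBuffer` and `noisy_action` are hypothetical, and the text does not say which noise process is used; zero-mean Gaussian noise clipped to the actuator bounds is assumed here purely for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size experience pool; the oldest transitions are dropped first."""

    def __init__(self, capacity: int) -> None:
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state) -> None:
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size: int, rng=random):
        # Uniform sampling without replacement breaks temporal correlation.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self) -> int:
        return len(self.buffer)


def noisy_action(action, low, high, sigma, rng=random):
    """Add Gaussian noise (an assumption) to each component, then clip."""
    return [min(max(a + rng.gauss(0.0, sigma), lo), hi)
            for a, lo, hi in zip(action, low, high)]


pool = ReplayBuffer(capacity=3)
for t in range(5):
    pool.push([float(t)], [0.0], 0.0, [float(t + 1)])
batch = pool.sample(2, rng=random.Random(0))
explored = noisy_action([0.5, 0.5], [0.0, 0.0], [1.0, 1.0], sigma=5.0,
                        rng=random.Random(0))
```

Clipping keeps every explored action inside the physical range of the spray gun settings, which matters when the noise scale is large relative to that range.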
Step7, verifying the performance of the reinforcement learning model on the test data set, and evaluating its effect on improving tin purity, reducing CO emission, and optimizing energy consumption. The experimental results are as follows. Current furnace state: total concentration: 5.777605 tons; molten pool accumulation: 5.4335 tons; oxygen concentration: 1.2%; furnace bottom middle temperature: 594.0 temperature units; furnace bottom outer temperature: 518.0 temperature units; furnace bottom inner temperature: 518.0 temperature units; furnace rise temperature: 740.0 temperature units; furnace pressure: 3.490 Pa; exhaust gas CO content: 2.000 ppm (parts per million); energy consumption: 1.780 kW. Actions recommended by the algorithm: total material flow: 86.5 tons/hr; fuel coal flow: 9960 kg/hr; coal-carrying air flow: 0.106 standard cubic meters/hr; coal-carrying air pressure: 32.5 kilopascals; spray gun back pressure: 410 kilopascals; gun position: 7031 millimeters; oxygen flow: 17832 standard cubic meters/hr; air flow: 17832 standard cubic meters/hr. When the agent sets the spray gun operating parameters according to the optimal strategy, the tin purity improves by 1.5% on average and the CO emission concentration decreases by 5%.
While the present invention has been described in detail with reference to the drawings, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.

Claims (3)

Translated from Chinese
1. A dynamic optimization control method for a tin smelting process, characterized in that the method specifically comprises:
Step1: collecting relevant parameter data during the tin smelting process, the relevant parameter data comprising the state variable values and the spray gun operating parameter values at each moment in the smelting furnace;
Step2: preprocessing the relevant parameter data, deleting abnormal values recorded by the sensors, and using a cGAN to generate the variables whose measurement frequency is below a preset threshold;
Step3: modeling the tin purity and CO concentration in the tin smelting process with an improved LSTM model, and capturing the time-series variation patterns of the state variable values and the spray gun operating parameter values;
Step4: constructing a reinforcement learning environment with a data-driven state update model, which takes the state variable values and spray gun operating parameter values of the tin smelting process as input and uses a deep learning model to predict the state values at the next moment;
Step5: designing a reward function that guides the reinforcement learning agent based on the captured time-series variation patterns;
Step6: introducing a deep reinforcement learning algorithm, designing the spray gun operating parameters as the action space and the states of the tin smelting process as the state space, the agent learning through interaction with the state update model and continuously optimizing the control strategy to achieve the dynamic optimization target;
Step3 is specifically:
Step3.1: based on the existing industrial data of the tin smelting process, extracting the core variables that reflect the dynamic changes in the furnace, and generating new feature columns related to the smelting process through mathematical transformation;
Step3.2: embedding physical constraints from the metallurgical field into the loss function of the original LSTM model, a constraint penalty term guiding the LSTM model to learn predictions that conform to the actual process laws;
Step3.3: using the improved LSTM model to perform prediction training on the indicator variables of the tin smelting process;
Step3.4: after training is completed, verifying the prediction performance of the model on an independent test data set, evaluating the prediction accuracy for tin purity and CO concentration, and saving the LSTM model parameters;
Step5 is specifically:
Step5.1: setting the optimization targets;
Step5.2: using the saved LSTM model parameters to predict the influence of the current process parameter adjustment on tin purity and CO emission over a preset future time period, and using the prediction result as the basis for calculating the reward function;
Step5.3: designing the reward function, the formula being: () ; where R is the reward function, the weight coefficients balance the importance of the multiple objectives, and the remaining terms are the tin purity content at the current moment, the carbon monoxide content at the current moment, the tin purity content at the next moment, the carbon monoxide content at the next moment, and the energy consumption value at the next moment;
Step6 is specifically:
Step6.1: regulating the spray gun operating parameters based on the DDPG algorithm, defining the operations executed by the spray gun in the furnace as the action space, and defining ten core variables in the furnace as the state space: total concentration, molten pool accumulation, oxygen content percentage, furnace bottom middle temperature, furnace bottom outer temperature, furnace bottom inner temperature, furnace rise temperature, furnace pressure, exhaust gas CO analysis, and total energy consumption;
Step6.2: using the Actor network of the DDPG algorithm to predict the optimal action from the current state, which is used to adjust the spray gun operating parameters, and using the Critic network to evaluate the Q value of the current strategy, which guides the optimization of the Actor;
Step6.3: storing the interaction data between the reinforcement learning agent and the furnace environment in an experience pool, and randomly sampling a number of furnace states and spray gun operating parameters from it for training, so that the agent learns operating experience that is absent from the earlier continuous furnace-period data;
Step6.4: adding noise to simulate plant fault conditions, and completing model training;
Step6.5: using the trained model to realize intelligent dynamic optimization control of the tin smelting process;
Step3.2 is specifically:
adding a metallurgical knowledge constraint to the loss function of the LSTM model, the constraint formula relating the total loss value, the prediction loss value of the LSTM model, a constant, and a function that incorporates metallurgical knowledge; this function is defined in terms of a constant and the difference between the predicted value and the theoretical value; the theoretical value of CO is calculated from the combustion coal flow, a constant related to the calorific value of the fuel coal and the combustion efficiency, the oxygen flow, and the theoretical stoichiometric ratio for burning coal.
2. The dynamic optimization control method for a tin smelting process according to claim 1, characterized in that Step2 is specifically:
Step2.1: setting the threshold values of each variable according to the sensor measurement ranges and the experience of process experts, and directly removing abnormal values that exceed the thresholds;
Step2.2: partitioning the data, taking the complete high-frequency measured variables as the conditional input of the model and the low-frequency measured variables as the target variables to be generated, and using the cGAN model to generate the missing values, wherein the high-frequency measured variables are fed as conditions to both the generator and the discriminator to provide additional constraint information.
3. The dynamic optimization control method for a tin smelting process according to claim 1, characterized in that Step4 is specifically:
Step4.1: obtaining the state changes over consecutive time steps from the time-series data, including state variables and action variables, and organizing the data into triples consisting of the state variable at the current moment, the action variable at the current moment, and the state variable at the next moment;
Step4.2: constructing a state update model based on a fully connected neural network, which takes the current state and the current action as input and predicts the state at the next moment, and training the state update model with the constructed triple data;
Step4.3: evaluating the performance of the state update model on a validation set, and saving the state update model parameters that meet the preset index requirements;
Step4.4: in the reinforcement learning environment, using the trained state update model parameters to predict the environment state at the next moment from the current environment and the currently executed action, thereby guiding the agent's action selection.
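The triple construction in Step4.1 of claim 3 can be illustrated with a minimal sketch. The function name `build_triples` and the sample values are hypothetical; the sketch assumes that states and actions are recorded at the same time steps.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Triple = Tuple[Vector, Vector, Vector]


def build_triples(states: Sequence[Vector],
                  actions: Sequence[Vector]) -> List[Triple]:
    """Pair each (state, action) at step t with the state at step t + 1."""
    if len(states) != len(actions):
        raise ValueError("states and actions must cover the same time steps")
    # The final time step has no successor state, so it yields no triple.
    return [(states[t], actions[t], states[t + 1])
            for t in range(len(states) - 1)]


states = [[1.0], [2.0], [3.0]]
actions = [[10.0], [20.0], [30.0]]
triples = build_triples(states, actions)
```

A series of N time steps yields N - 1 training samples for the state update model.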
CN202510677737.5A | 2025-05-26 | 2025-05-26 | A dynamic optimization control method for tin smelting process | Active | CN120215281B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202510677737.5A CN120215281B (en) | 2025-05-26 | 2025-05-26 | A dynamic optimization control method for tin smelting process

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202510677737.5A CN120215281B (en) | 2025-05-26 | 2025-05-26 | A dynamic optimization control method for tin smelting process

Publications (2)

Publication Number | Publication Date
CN120215281A (en) | 2025-06-27
CN120215281B (en) | 2025-08-26

Family

ID=96115630

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202510677737.5A (Active) CN120215281B (en) | A dynamic optimization control method for tin smelting process | 2025-05-26 | 2025-05-26

Country Status (1)

Country | Link
CN (1) | CN120215281B (en)

Also Published As

Publication number | Publication date
CN120215281A (en) | 2025-06-27

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
