Disclosure of Invention
The invention aims to solve the above technical problems and provides a stroke rehabilitation training system and a stroke rehabilitation training method based on deep reinforcement learning. The system formulates and dynamically adjusts accurate, personalized rehabilitation training schemes through fusion analysis of multi-source physiological data, nonlinear dynamics modeling, and deep reinforcement learning strategy generation.
The invention provides a stroke rehabilitation training system and a stroke rehabilitation training method based on deep reinforcement learning, wherein the stroke rehabilitation training system comprises the following modules:
the data acquisition module is used for:
Acquiring multi-source physiological data of a patient, wherein the multi-source physiological data comprise surface electromyographic signals, inertial measurement unit data, electroencephalogram signals and electrocardiogram signals;
the data preprocessing module is in communication connection with the data acquisition module and is used for:
Receiving multi-source physiological data sent by the data acquisition module;
performing data preprocessing and feature extraction based on the multi-source physiological data;
The feature fusion module is in communication connection with the data preprocessing module and is used for:
receiving the preprocessed data sent by the data preprocessing module;
based on the preprocessed data, carrying out space-time feature fusion;
the dynamics modeling module is in communication connection with the feature fusion module and is used for:
receiving the fused features sent by the feature fusion module;
Based on the fused characteristics, a nonlinear dynamics model is established;
The strategy generation module is in communication connection with the dynamics modeling module and is used for:
receiving a nonlinear dynamics model sent by the dynamics modeling module;
generating a self-adaptive rehabilitation strategy according to the nonlinear dynamics model;
the strategy output module is in communication connection with the strategy generation module and is used for:
Receiving the adaptive rehabilitation strategy sent by the strategy generation module;
Outputting the self-adaptive rehabilitation strategy for guiding the patient to perform stroke rehabilitation training.
Preferably, the data preprocessing module includes:
A random matrix processing unit for processing the multi-source physiological data by utilizing a random matrix theory;
The topology analysis unit is in communication connection with the random matrix processing unit and is used for carrying out topology data analysis on the processed data;
The feature extraction unit is in communication connection with the topology analysis unit and is used for obtaining a processed data matrix through the operation of a topology conversion operator and a Hadamard product.
Preferably, the feature fusion module includes:
the chaos analysis unit is used for analyzing the preprocessed data by using a chaos theory;
the differential geometry processing unit is in communication connection with the chaotic analysis unit and is used for performing differential geometry processing on the analyzed data;
The feature fusion unit is in communication connection with the differential geometry processing unit and is used for obtaining a fused feature matrix through Laplace-Beltrami operator and Lyapunov exponent calculation.
Preferably, the dynamics modeling module includes:
the quantum mechanical modeling unit is used for preliminarily establishing a dynamic model by utilizing a quantum mechanical method;
the group theory optimizing unit is in communication connection with the quantum mechanical modeling unit and is used for optimizing the dynamic model by using a group theory method;
a model generating unit, which is in communication connection with the group theory optimizing unit and is used for generating, through a variant of the Schrödinger equation and group action operators, a nonlinear dynamics model describing the evolution of the system.
Preferably, the policy generation module includes:
the functional analysis unit is used for carrying out functional analysis on the nonlinear dynamics model;
The variational method processing unit is in communication connection with the functional analysis unit and is used for processing the analysis results by applying a variational method;
And the strategy optimization unit is in communication connection with the variational method processing unit and is used for obtaining an optimal adaptive rehabilitation strategy by minimizing an energy functional with total variation regularization.
Preferably, the system further comprises a policy optimization module, communicatively connected with the policy generation module and the policy output module, for:
Receiving the adaptive rehabilitation strategy sent by the strategy generation module;
Optimizing the self-adaptive rehabilitation strategy by utilizing a random matrix theory and a probability theory;
The exploration capacity and adaptability of the strategy are improved through a random weight matrix and gradient descent method;
and sending the optimized strategy to the strategy output module.
Preferably, the data acquisition module includes:
The surface myoelectricity acquisition unit is used for acquiring myoelectricity signals of a patient;
the inertial measurement unit is used for acquiring joint angle and angular velocity data of a patient;
The electroencephalogram acquisition unit is used for acquiring electroencephalogram signals of a patient;
the electrocardio acquisition unit is used for acquiring electrocardiosignals of a patient;
the data synchronization unit is in communication connection with the acquisition units and is used for carrying out time alignment and uniform sampling rate on the acquired multi-source physiological data.
Preferably, the system further comprises a rehabilitation effect evaluation module, which is in communication connection with the strategy output module and the strategy generation module and is used for:
Receiving rehabilitation training data sent by the strategy output module;
Evaluating the effect of the adaptive rehabilitation strategy based on the rehabilitation training data and a preset clinical evaluation index;
and sending an adjusting signal to the strategy generating module according to the evaluation result, wherein the adjusting signal is used for adjusting parameters or structures of the rehabilitation strategy.
Preferably, the system further comprises a man-machine interaction module, which is in communication connection with the policy output module and is used for:
receiving the self-adaptive rehabilitation strategy sent by the strategy output module;
converting the adaptive rehabilitation strategy into visual rehabilitation guidance information;
displaying rehabilitation guidance information to a patient through display equipment;
And receiving feedback information of the patient and sending the feedback information to the strategy generation module for further optimizing the rehabilitation strategy.
A deep reinforcement learning-based stroke rehabilitation training method comprising:
Acquiring multi-source physiological data of a patient, wherein the multi-source physiological data comprise surface electromyographic signals, inertial measurement unit data, electroencephalogram signals and electrocardiogram signals;
preprocessing and extracting features of the multi-source physiological data;
based on the preprocessed data, carrying out space-time feature fusion;
according to the fused characteristics, a nonlinear dynamics model is established;
Generating an adaptive rehabilitation strategy based on the nonlinear dynamics model;
Outputting the self-adaptive rehabilitation strategy for guiding a patient to perform stroke rehabilitation training;
the method being implemented using the system described above.
The invention has the following beneficial effects:
From a macroscopic point of view, the invention constructs a closed-loop intelligent rehabilitation ecological system. The system not only can comprehensively collect physiological data of a patient, but also can carry out deep analysis and modeling on the data so as to generate an optimal rehabilitation strategy. This systematic innovation greatly improves the scientificity and effectiveness of rehabilitation training.
In the system architecture level, the invention adopts a modularized design, and all functional modules are tightly cooperated to form a high-efficiency and expandable technical scheme. The cooperation of the data acquisition module and the preprocessing module ensures the high quality of input data, the cooperation of the feature fusion module and the dynamics modeling module provides a solid theoretical basis for strategy generation, and the seamless connection of the strategy generation module and the output module ensures the accurate execution of a rehabilitation scheme.
On the algorithm level, the invention addresses technical trade-offs that conventional methods struggle to handle. For example, by introducing quantum mechanics and group theory, the system can capture microscopic neuron activity and macroscopic motion patterns at the same time, unifying microscopic precision with macroscopic effect. The application of the deep reinforcement learning algorithm addresses real-time optimization of the rehabilitation strategy, so that the system can continuously adjust the training scheme according to the immediate feedback of the patient.
In terms of effect, the present invention exhibits remarkable complementarity and synergy. The fusion of the multi-source data not only improves the evaluation accuracy of the system on the state of the patient, but also provides multi-dimensional information support for the generation of the personalized strategy. The combination of the deep learning algorithm and the traditional rehabilitation theory ensures the intelligent level of the system and the scientificity and the interpretability of the rehabilitation scheme.
From a microscopic perspective, the invention has innovative breakthroughs in a plurality of technical details. For example, in the data preprocessing stage, the system adopts an advanced random matrix theory and topology data analysis method, so that the efficiency and accuracy of feature extraction are greatly improved. In the dynamic modeling link, the concept of quantum mechanics is introduced, and a new mathematical tool is provided for describing a complex physiological system.
In general, the invention makes stroke rehabilitation training more accurate, individualized and intelligent through system-level design and algorithm-level innovation. The system is intended to improve the rehabilitation effect and shorten the rehabilitation period, while also reducing the workload of medical staff and optimizing the allocation of medical resources. More importantly, the system is expected to improve the rehabilitation experience of the patient and strengthen the patient's motivation to train, thereby improving the rehabilitation effect at its root. This technical scheme opens new possibilities for the field of stroke rehabilitation and is expected to play an important role in clinical practice and benefit more stroke patients.
Detailed Description
Referring to FIGS. 1-6, the present invention provides a deep reinforcement learning based stroke rehabilitation training system and a method thereof. The system provides personalized and adaptive rehabilitation training strategies for stroke patients by combining the acquisition, processing and analysis of multi-source physiological data with deep reinforcement learning technology.
First, the system of the present invention comprises a data acquisition module 1 for acquiring multi-source physiological data of a patient. These data include surface electromyographic signals, inertial measurement unit data, electroencephalographic signals, and electrocardiographic signals. The design of the data acquisition module 1 takes into account the specific needs of the stroke patient and employs non-invasive sensor technology to ensure the comfort and accuracy of the data acquisition process.
Communicatively connected to the data acquisition module 1 is a data preprocessing module 2. The module receives multi-source physiological data from the data acquisition module 1 and performs data preprocessing and feature extraction. The main task of the data preprocessing module 2 is to remove noise, correct signal bias, and extract meaningful features. This step is critical for subsequent analysis, as it directly affects the performance and accuracy of the system. Surface electromyographic signals (EMG) assess muscle activity and motion control. An Inertial Measurement Unit (IMU) captures joint angles and angular velocities, helping to analyze a patient's motion pattern and balance capabilities. Electroencephalogram (EEG) provides information of the state of the nervous system. An Electrocardiogram (ECG) monitors the health of the cardiovascular system. After preliminary acquisition, the data enter a data preprocessing module for cleaning, noise reduction and feature extraction so as to ensure the accuracy and reliability of subsequent analysis.
The present invention uses electrodes attached to the surface of a patient's muscle to collect electrical signals generated during muscle contraction. Analysis of the EMG signals may help to assess muscle activity levels, identifying which muscle groups require more training. For example, during arm rehabilitation, if the EMG signal of the biceps brachii muscle is found to be weak, a targeted exercise may be designed to enhance the strength of the muscle group. IMU devices are typically mounted on the wrist, ankle, etc. of a patient to capture changes in joint angle and angular velocity. Through IMU data, the system can monitor the gait pattern of the patient in real time. For example, for lower extremity stroke patients, the system may help restore normal walking ability by analyzing gait data to provide a personalized gait training regimen. The EEG electrode cap is placed on the patient's head and records the electrical signals generated by the brain neuron activity. The EEG signals can be used to assess the cognitive function and neurological status of a patient. For example, by analyzing EEG data, the system may detect abnormal activity in specific areas of the brain and adjust rehabilitation strategies accordingly to promote recovery of neural plasticity. ECG electrodes are attached to the patient's chest and cardiac electrical activity is recorded. The ECG data may help monitor the cardiovascular health of the patient. For example, during high intensity rehabilitation training, the system can adjust the training intensity according to the ECG data, ensuring patient safety.
After the data preprocessing, the feature fusion module 3 receives the processed data and performs space-time feature fusion. The purpose of this step is to integrate physiological signals from different sources into a unified representation in order to better capture the overall physiological state of the patient.
The dynamics modeling module 4 establishes a nonlinear dynamics model based on the fused features. The model aims at describing the physiological change rule in the rehabilitation process of the patient and provides a theoretical basis for the generation of a follow-up rehabilitation strategy.
The policy generation module 5 is a core part of the present system. It generates adaptive rehabilitation strategies according to the nonlinear dynamics model. These strategies can be dynamically adjusted according to the real-time state and rehabilitation progress of the patient, so that the effectiveness and safety of training are ensured.
Finally, the strategy output module 6 is responsible for outputting the generated adaptive rehabilitation strategy for guiding the patient to perform stroke rehabilitation training. This may include specific training action guidelines, training intensity suggestions, and the like.
In one embodiment of the invention, the data preprocessing module 2 comprises several key subunits. First is a random matrix processing unit 21 which processes the multi-source physiological data using random matrix theory. The application of random matrix theory here is mainly to reduce the dimensionality of the data while preserving important statistical properties.
For example, the random projection may be performed using the following formula:
Y=RX,
where X is the original data matrix, R is the random projection matrix, and Y is the dimension-reduced data matrix. Suppose there are EMG signals from 5 different muscle groups, each containing 1000 sampling points (i.e., $X \in \mathbb{R}^{5\times1000}$). To reduce computational complexity, the data can be reduced to 3 dimensions (i.e., $Y \in \mathbb{R}^{3\times1000}$) using a randomly generated projection matrix $R \in \mathbb{R}^{3\times5}$, so that the product RX is well defined. In this way, the system can reduce the amount of computation while approximately preserving the key features. The dimension-reduced data are easier to process, which benefits subsequent feature fusion and dynamics modeling and improves the response speed and accuracy of the system.
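As a non-limiting illustration, the random projection Y = RX for the 5-channel example above may be sketched as follows. The Gaussian entries, the 1/sqrt(k) scaling, the function name random_projection and the synthetic input are assumptions made for illustration only; the specification does not prescribe how R is generated.

```python
import numpy as np

def random_projection(X, k, seed=0):
    """Project the d channels (rows) of X down to k rows using Y = R X."""
    d = X.shape[0]
    rng = np.random.default_rng(seed)
    # Assumed choice: i.i.d. Gaussian entries scaled by 1/sqrt(k).
    R = rng.standard_normal((k, d)) / np.sqrt(k)
    return R @ X, R

# Synthetic stand-in for 5 EMG channels with 1000 samples each.
X = np.random.default_rng(1).standard_normal((5, 1000))
Y, R = random_projection(X, k=3)
print(R.shape, Y.shape)
```

Note that R has shape 3×5 here, which is what the product RX requires for a 5×1000 input and a 3×1000 output.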
Next, the topology analysis unit 22 performs topological data analysis (TDA) on the processed data. The purpose of this step is to capture the geometric and topological features of the data, which is important for understanding the motor patterns and neural activity patterns of stroke patients. Topological data analysis can help discover hidden structures in the data, for example using a persistent homology algorithm:
$$H_k(X) = \ker \partial_k \,/\, \operatorname{im} \partial_{k+1},$$
where $H_k(X)$ denotes the k-dimensional homology group and $\partial_k$ is the boundary operator. As a gait analysis example, suppose a series of IMU data records the gait patterns of the patient over different time periods. Through TDA, key nodes and repetitive patterns in the gait cycle can be identified. For example, by computing the homology groups of the gait cycles, the system can determine the stable and unstable phases of gait and thereby design a more targeted gait training scheme. This approach enables the discovery of hidden data structures, helping the system to better understand and optimize the patient's movement patterns.
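As a non-limiting illustration, the simplest case of persistent homology, namely the 0-dimensional classes (connected components) of a Vietoris-Rips filtration, may be computed with a union-find structure as sketched below. The function name h0_persistence and the toy points are assumptions; a practical embodiment would likely rely on a dedicated TDA library and would also compute higher-dimensional classes.

```python
import numpy as np

def h0_persistence(points):
    """Death scales of 0-dimensional classes in a Vietoris-Rips filtration."""
    pts = np.asarray(points, dtype=float)
    if pts.ndim == 1:
        pts = pts[:, None]
    n = len(pts)
    # All pairwise Euclidean distances, sorted as candidate edges.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    edges = sorted((dist[i, j], i, j) for i in range(n) for j in range(i + 1, n))

    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # This edge merges two components, so one class dies at scale d.
            parent[ri] = rj
            deaths.append(float(d))
    return deaths  # n - 1 finite deaths; one component never dies

# Two tight pairs of samples separated by a large gap.
deaths = h0_persistence([0.0, 1.0, 10.0, 11.0])
print(deaths)
```

Every component is born at scale 0, so a death value that is much larger than the others indicates well-separated clusters in the data.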
Finally, the feature extraction unit 23 obtains the processed data matrix through the topology conversion operator and Hadamard product operation.
This step can be expressed as:
$$Z = T(Y) \odot H,$$
where T is the topology transformation operator, $\odot$ denotes the Hadamard (element-wise) product, H is a weighting matrix of the same shape as T(Y), and the matrix Z is the final feature matrix.
Assume that surface electromyographic (EMG) and inertial measurement unit (IMU) data are being processed for a stroke patient who is undergoing upper limb rehabilitation training. The goal is to extract meaningful features from these multi-source physiological signals in order to formulate a personalized rehabilitation strategy.
First, the electrical signals of the patient's upper arm muscle activity are collected using electrodes, and the activity of 5 different muscle groups is recorded within 10 seconds. Each muscle group has 1000 sampling points, forming a 5×1000 raw data matrix $X_{EMG}$. IMU devices are mounted on the patient's wrist, elbow and shoulder to record changes in joint angle and angular velocity. Similarly, a 3×1000 raw data matrix $X_{IMU}$ is obtained.
The raw data are first preprocessed, including denoising and standardization, to obtain the dimension-reduced data matrices $Y_{EMG} \in \mathbb{R}^{5\times500}$ and $Y_{IMU} \in \mathbb{R}^{3\times500}$.
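As a non-limiting illustration, one way to go from a 5×1000 raw matrix to a 5×500 preprocessed matrix is sketched below. The specific steps (mean removal, moving-average smoothing, downsampling by a factor of 2, and z-score standardization), the window length and the function name preprocess are assumptions; the specification only states that denoising and standardization are performed.

```python
import numpy as np

def preprocess(X, window=5, factor=2):
    """Denoise, downsample and standardize each channel (row) of X."""
    X = np.asarray(X, dtype=float)
    # Remove the per-channel DC offset.
    X = X - X.mean(axis=1, keepdims=True)
    # Moving-average smoothing as a simple stand-in for denoising.
    kernel = np.ones(window) / window
    smoothed = np.stack([np.convolve(row, kernel, mode="same") for row in X])
    # Keep every `factor`-th sample (1000 -> 500 for factor=2).
    down = smoothed[:, ::factor]
    # Z-score each channel; guard against constant channels.
    std = down.std(axis=1, keepdims=True)
    return (down - down.mean(axis=1, keepdims=True)) / np.where(std > 0, std, 1.0)

# Synthetic stand-in for the raw 5x1000 EMG matrix.
X_emg = np.random.default_rng(0).standard_normal((5, 1000))
Y_emg = preprocess(X_emg)
print(Y_emg.shape)
```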
Next, the preprocessed data are further processed using the topology transformation operator T. Assume that a persistent homology algorithm is used to extract the topological features. In particular, homology groups within each time window may be computed to capture the geometry and topology of the data. For example, for $Y_{EMG}$, its 0- and 1-dimensional homology groups can be calculated to give a new feature matrix $T(Y_{EMG}) \in \mathbb{R}^{5\times500}$. Similarly, for $Y_{IMU}$, its homology groups can be calculated to give $T(Y_{IMU}) \in \mathbb{R}^{3\times500}$.
Now, these two feature matrices need to be combined. For this purpose, a predefined weighting matrix $H \in \mathbb{R}^{8\times500}$ is introduced for the Hadamard product. This matrix may contain prior knowledge or weighting coefficients to emphasize the importance of certain features.
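Suppose that $T(Y_{EMG})$ and $T(Y_{IMU})$ are stacked into an 8×500 matrix $T(Y)$. The Hadamard product operation is then performed:
$$Z = T(Y) \odot H,$$
where $T(Y), H \in \mathbb{R}^{8\times500}$ and the result matrix $Z \in \mathbb{R}^{8\times500}$.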
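As a non-limiting illustration, the stacking and element-wise weighting in this example may be sketched as follows. The random feature matrices and the particular weights (1.0 for EMG rows, 0.5 for IMU rows) are hypothetical placeholders, not values taken from the specification.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholders for the topological feature matrices T(Y_EMG) and T(Y_IMU).
T_emg = rng.standard_normal((5, 500))
T_imu = rng.standard_normal((3, 500))

# Stack the two matrices row-wise into one 8x500 matrix T(Y).
T_all = np.vstack([T_emg, T_imu])

# Hypothetical weighting matrix H: full weight on EMG rows, half on IMU rows.
H = np.ones((8, 500))
H[5:, :] = 0.5

# Hadamard product: element-wise multiplication of equally shaped matrices.
Z = T_all * H
print(Z.shape)
```

In NumPy the `*` operator on arrays is element-wise, which is exactly the Hadamard product; `@` would instead denote ordinary matrix multiplication.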
In this particular example, it is assumed that the following feature matrix Z is obtained:
From these feature values, the system can identify which muscle groups perform better during the training process and which require more attention. For example, if the feature value of muscle D is found to be consistently higher while the feature value of muscle E is lower, the system may suggest increasing strength training for muscle E.
Through topological transformation and Hadamard product operation, the system can extract more representative and distinguishing characteristics, so that the physiological state of a patient can be reflected better. By combining multiple physiological signals, the system can more comprehensively know the rehabilitation requirements of patients and formulate more accurate and personalized rehabilitation strategies. The extracted features can help the system predict potential problems in the rehabilitation process in advance, and timely adjust the training plan to ensure the safety and effectiveness of the rehabilitation process.
The feature extraction unit 23 extracts meaningful features from the multi-source physiological signal through the topology transformation operator and the Hadamard product operation, thereby significantly improving the understanding of the overall physiological state of the patient. The method not only improves the individuation level of the rehabilitation strategy, but also predicts and deals with potential problems in advance, thereby remarkably improving the rehabilitation effect. Specifically, by fusing various signals such as EMG, IMU and the like, the system can find out hidden modes and relations, make a more accurate rehabilitation plan and ensure the safety and effectiveness of the training process.
In another embodiment of the present invention, the feature fusion module 3 includes a chaotic analysis unit 31, a differential geometry processing unit 32, and a feature fusion unit 33. The chaotic analysis unit 31 analyzes the preprocessed data using chaos theory. In the context of stroke rehabilitation, chaotic analysis can help understand the complex dynamic behavior of the patient's nervous system. For example, the maximum Lyapunov exponent may be calculated:
$$\lambda = \lim_{t \to \infty} \frac{1}{t} \ln \frac{|\delta x(t)|}{|\delta x(0)|},$$
where $\lambda$ is the maximum Lyapunov exponent and $\delta x(t)$ represents the distance at time t between two initially close orbits in phase space.
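As a non-limiting illustration, the Lyapunov exponent can be estimated numerically for a simple one-dimensional map, where the growth rate of the separation between nearby orbits reduces to the orbit average of ln|f'(x)|. The logistic map used below is only a well-known test system standing in for physiological data, and the function name logistic_lyapunov and all parameter values are assumptions.

```python
import numpy as np

def logistic_lyapunov(r, x0=0.2, n_steps=10000, n_discard=1000):
    """Estimate the Lyapunov exponent of x -> r * x * (1 - x)."""
    x = x0
    # Discard the transient so the average is taken on the attractor.
    for _ in range(n_discard):
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_steps):
        # |f'(x)| = |r * (1 - 2x)| is the local stretching factor.
        total += np.log(max(abs(r * (1.0 - 2.0 * x)), 1e-300))
        x = r * x * (1.0 - x)
    return total / n_steps

lam_chaotic = logistic_lyapunov(4.0)   # chaotic regime
lam_stable = logistic_lyapunov(2.5)    # stable fixed point
print(lam_chaotic, lam_stable)
```

A positive estimate indicates exponential divergence of nearby orbits (chaotic behavior), while a negative estimate indicates convergence to a stable state.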
The differential geometry processing unit 32 performs differential geometry processing on the analyzed data. The purpose of this step is to capture the intrinsic geometry of the data, which is critical to understanding the changes in patient movement patterns.
For example, a data manifold may be described using a Riemannian metric:
$$ds^2 = g_{ij}\,dx^i\,dx^j,$$
where $g_{ij}$ are the components of the metric tensor.
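The feature fusion unit 33 obtains a fused feature matrix through Laplace-Beltrami operator and Lyapunov exponent calculation. The Laplace-Beltrami operator may help capture global and local structures of data:
$$\Delta_g f = \frac{1}{\sqrt{|g|}}\,\partial_i\!\left(\sqrt{|g|}\,g^{ij}\,\partial_j f\right),$$
where $\Delta_g$ is the Laplace-Beltrami operator, g is the Riemannian metric (with determinant $|g|$ and inverse components $g^{ij}$), and f is a function defined on the manifold.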
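As a non-limiting illustration, the Laplace-Beltrami operator can be discretized on a one-dimensional periodic manifold, where the general formula reduces to (1/sqrt(g)) d/dθ (sqrt(g) g^{-1} df/dθ). The circle of radius 2, the grid size and the function name laplace_beltrami_1d are assumptions chosen so that the result can be compared against the known answer for f = cos θ.

```python
import numpy as np

def laplace_beltrami_1d(f, g, h):
    """Finite-difference Laplace-Beltrami operator on a periodic 1-D grid."""
    sqrt_g = np.sqrt(g)
    a = sqrt_g / g                        # sqrt|g| * g^{11}, with g^{11} = 1/g
    a_half = 0.5 * (a + np.roll(a, -1))   # coefficient at midpoints i + 1/2
    flux = a_half * (np.roll(f, -1) - f) / h
    # Divergence of the flux, divided by sqrt|g|.
    return (flux - np.roll(flux, 1)) / (h * sqrt_g)

n = 400
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
h = theta[1] - theta[0]
radius = 2.0
g = np.full(n, radius ** 2)   # metric of a circle of radius 2 in the angle coordinate
f = np.cos(theta)
lap = laplace_beltrami_1d(f, g, h)
print(np.max(np.abs(lap + f / radius ** 2)))
```

On a circle of radius r the exact result for f = cos θ is −cos θ / r², so the printed value is the discretization error.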
Through the above processing steps, the system provided by the invention can extract rich feature information from the multi-source physiological data and provides a solid foundation for subsequent rehabilitation strategy generation. The multi-modal, multi-scale data processing method can comprehensively capture the physiological state of the patient, thereby realizing more accurate and personalized rehabilitation training.
By fusing multiple physiological signals, the system is able to more fully understand the physiological state of the patient, rather than just the information provided by a single signal. This helps to discover hidden patterns and relationships, thereby developing a more effective rehabilitation strategy.
Assume that a stroke patient is undergoing upper limb rehabilitation training. By fusing data from the EMG, IMU and EEG, the system can monitor muscle activity, joint angle changes, and brain neuron activity simultaneously. For example, if during a particular action, the EMG signal is found to be indicative of weak muscle activity, but the EEG signal is indicative of abnormally active brain activity, this may indicate that the patient's brain is attempting to compensate for the problem of muscle weakness. Based on this information, the system can adjust the training program, increase the strength training for the muscle group, and combine the neurofeedback training to optimize the coordination between brain and muscle.
Through multi-mode data fusion, the system can identify individual differences and formulate personalized rehabilitation strategies accordingly. This approach is more accurate and efficient than relying on only a single type of physiological signal.
Assuming a stroke patient is gait trained, the system finds that there is significant asymmetry in his walking through IMU data and finds that there are abnormal activities in certain areas of the brain through EEG data analysis. Through analysis by the feature fusion unit 33, the system may generate a feature matrix that includes gait asymmetry and brain activity patterns. Based on the feature matrix, the system can design a training scheme which is specially aimed at gait asymmetry and brain function recovery of the patient, and comprises gait correction training by using robot auxiliary equipment, and simultaneously, combining cognitive training to promote the recovery of brain functions.
By fusion and analysis of the multi-source data, the system can predict potential problems in the rehabilitation process in advance and timely adjust the rehabilitation strategy, so that the predictability and controllability of the rehabilitation effect are improved.
Assuming a stroke patient is undergoing rehabilitation training, the system finds that the patient is overloaded during intensive training by continuously monitoring his Electrocardiogram (ECG) and electromyographic signals (EMG). Through analysis by the feature fusion unit 33, the system can identify this relationship between heart load and a particular training intensity. Based on this information, the system may recommend reducing the intensity of training or adjusting the frequency of training to ensure patient safety and avoid excessive fatigue. In addition, the system can further optimize the training plan according to the real-time feedback (such as self-reported fatigue feeling) of the patient, so as to ensure the smooth proceeding of the rehabilitation process.
The feature fusion unit 33 generates a unified feature representation by fusing the plurality of physiological signals, so that understanding of the overall physiological state of the patient can be significantly improved. The method not only improves the individuation level of the rehabilitation strategy, but also predicts and deals with potential problems in advance, thereby remarkably improving the rehabilitation effect. Specifically, by fusing EMG, IMU, EEG and other signals, the system can find hidden modes and relationships, formulate a more accurate rehabilitation plan, and ensure the safety and effectiveness of the training process.
Preferably, the modules and units in the system are connected by adopting a high-speed data bus so as to ensure the real-time performance and the reliability of data transmission. At the same time, the system is also equipped with a high-performance computing unit to support complex data processing and the operation of deep reinforcement learning algorithms.
The method and system of the present invention have several advantages in practical applications. First, they accommodate individual differences among patients and provide personalized rehabilitation regimens. Second, real-time data analysis and strategy adjustment allow the system to respond promptly to changes during the patient's rehabilitation, so that the rehabilitation effect is maximized. Furthermore, the application of deep reinforcement learning enables the system to learn and optimize continuously, so that its performance improves as usage time increases.
In general, the present invention provides an innovative stroke rehabilitation training solution that combines advanced data analysis techniques, deep learning algorithms with traditional rehabilitation theory, providing new possibilities for rehabilitation of stroke patients. Through continuous data collection, analysis and policy optimization, the system is expected to remarkably improve the effect of stroke rehabilitation, shorten the rehabilitation period and finally improve the life quality of patients.
In the system of the present invention, the dynamics modeling module 4 is a key component responsible for building a nonlinear dynamics model describing the patient's rehabilitation process. Preferably, the module comprises a quantum mechanical modeling unit 41, a group theory optimization unit 42 and a model generation unit 43. The structural design aims to fully utilize modern physical and mathematical theories to more accurately describe complex physiological changes in the rehabilitation process of stroke patients.
The quantum mechanical modeling unit 41 preliminarily builds a dynamics model using a quantum mechanical method. In this process, the physiological state of the patient can be regarded as a quantum system, the evolution of which can be described by a variant of the Schrödinger equation:
$$i\hbar \frac{\partial}{\partial t}|\psi(t)\rangle = \hat{H}\,|\psi(t)\rangle,$$
where $|\psi(t)\rangle$ denotes the quantum state of the system, $\hat{H}$ is the Hamiltonian operator, and $\hbar$ is the reduced Planck constant.
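As a non-limiting illustration, the evolution of a quantum state under a time-independent Hamiltonian can be computed from the eigendecomposition of the Hamiltonian. The two-level system below, the units with ħ = 1, and the function name evolve are purely hypothetical toy choices; the specification does not give the Hamiltonian used for the physiological model.

```python
import numpy as np

def evolve(psi0, H, t, hbar=1.0):
    """Return |psi(t)> = exp(-i H t / hbar) |psi(0)> for a Hermitian H."""
    energies, vecs = np.linalg.eigh(H)
    # Each energy eigencomponent acquires the phase exp(-i E t / hbar).
    phases = np.exp(-1j * energies * t / hbar)
    return vecs @ (phases * (vecs.conj().T @ psi0))

# Toy two-level system with coupling between the two basis states.
H = np.array([[0.0, 1.0],
              [1.0, 0.0]])
psi0 = np.array([1.0, 0.0], dtype=complex)

psi_t = evolve(psi0, H, t=np.pi / 2)
print(np.abs(psi_t) ** 2)  # occupation probabilities of the two states
```

Because the evolution is unitary, the norm of the state is preserved; for this toy Hamiltonian the population transfers completely to the second state at t = π/2.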
Suppose the dynamic behavior of ion channels between neurons is being studied. The opening and closing behavior of the ion channels can be simulated through the quantum mechanical model, and the optimal stimulation timing and intensity can be predicted. For example, by analyzing the dynamic changes of a particular neuron population, the system may suggest electrical stimulation at a particular point in time to maximize recovery of neural plasticity.
The advantage of this quantum mechanical modeling approach is that it can capture microscopic processes that are difficult to describe with conventional classical models, such as quantum effects in neurons. The group theory optimization unit 42 then optimizes the dynamics model using group-theoretic methods. Group theory is used here primarily to describe the symmetry and invariance of the system, which is critical to understanding how the movement patterns of stroke patients recover. For example, a Lie group may be used to describe continuous symmetry:
g(t)=exp(tX),
where g(t) is an element of a Lie group, X is an element of the corresponding Lie algebra, and t is a parameter. Through group theory optimization, the system provided by the invention can better capture the invariant features and symmetry structures in the rehabilitation process.
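As a non-limiting illustration, the exponential map g(t) = exp(tX) can be evaluated for planar rotations, where X is the generator of the rotation group SO(2). The truncated Taylor series, the number of terms and the function names expm_series and g are assumptions made to keep the sketch dependency-free; a library matrix exponential could be used instead.

```python
import numpy as np

def expm_series(A, terms=30):
    """Matrix exponential via a truncated Taylor series (small matrices only)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k      # term now holds A^k / k!
        result = result + term
    return result

# Generator of planar rotations (an element of the Lie algebra so(2)).
X = np.array([[0.0, -1.0],
              [1.0,  0.0]])

def g(t):
    """One-parameter subgroup g(t) = exp(tX): rotation by angle t."""
    return expm_series(t * X)

print(np.round(g(np.pi / 2), 6))
```

The one-parameter family satisfies g(s) g(t) = g(s + t), which is the continuous symmetry property that the group-theoretic description relies on.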
In upper limb rehabilitation training, suppose that the continuous movements of the patient's arm need to be analyzed. Through group theory optimization and symmetry analysis, the system can identify and exploit consistent patterns in the rehabilitation process. For example, by analyzing the symmetry of arm rotation, the system can design a more effective training scheme, such as repetitive motion training over a particular angular range, to promote faster recovery.
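To make the relation g(t) = exp(tX) concrete, the sketch below evaluates the matrix exponential for the generator of planar rotations and checks the group property g(s)g(t) = g(s+t). Treating t as a joint angle of the arm, the choice of the rotation generator, and the truncated Taylor series used for the exponential are assumptions for illustration only.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(x, t, terms=30):
    """Truncated Taylor series for exp(t*X) = sum_k (t*X)^k / k!."""
    n = len(x)
    tx = [[t * x[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds (t*X)^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, tx)]
        for i in range(n):
            for j in range(n):
                result[i][j] += term[i][j]
    return result


# Generator of planar rotations, an element of the Lie algebra so(2).
X = [[0.0, -1.0], [1.0, 0.0]]

g_30 = mat_exp(X, math.radians(30))
g_60 = mat_exp(X, math.radians(60))
g_90 = mat_exp(X, math.radians(90))
composed = mat_mul(g_30, g_60)  # should equal g_90 up to rounding
```

Because rotating by 30 degrees and then by 60 degrees equals rotating by 90 degrees, a movement pattern described this way is invariant to how the angular range is split into repetitions.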
Finally, the model generating unit 43 combines the variant of the Schrödinger equation with the group action operator to generate a nonlinear dynamics model describing the evolution of the system. This step can be expressed as:
where ψ is the wave function of the system and G is the group action operator. The model takes both quantum effects and system symmetry into account and provides a solid theoretical basis for the subsequent generation of the rehabilitation strategy.
Another important component of the present invention is the strategy generation module 5. This module is responsible for generating an adaptive rehabilitation strategy according to the nonlinear dynamics model. Preferably, the strategy generation module 5 includes a functional analysis unit 51, a variational processing unit 52, and a strategy optimization unit 53.
The functional analysis unit 51 performs functional analysis on the nonlinear dynamics model. The purpose of this step is to transform the complex dynamics problem into a functional optimization problem. For example, an energy functional can be defined:
$$E[\psi] = \int_{\Omega}\left(\tfrac{1}{2}\,|\nabla\psi|^{2} + V(x)\,|\psi|^{2}\right)dx,$$
wherein Ω is a configuration space of the system, and V (x) is a potential energy function.
The variational processing unit 52 then applies the calculus of variations to the result of the functional analysis. The variational method is applied here mainly to find the extremum of the energy functional, which corresponds to the optimal state of the system. For example, the Euler-Lagrange equation may be used:
$$\frac{\partial L}{\partial q} - \frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}} = 0,$$
where L is the Lagrangian function and q is the generalized coordinate.
Suppose that a personalized rehabilitation program needs to be formulated for a specific patient. By defining the energy functional E[ψ], the optimal training intensity and frequency can be found. For example, for a patient with a shoulder injury, the system may adjust the parameters of the energy functional to find the training regimen that best suits the patient's current state, such as training for 30 minutes three times per week. This approach allows the system to find the optimal rehabilitation strategy and maximize the rehabilitation effect for the patient.
The strategy optimization unit 53 obtains the optimal adaptive rehabilitation strategy by minimizing the sum of the energy functional and a total variation regularization term. This step can be expressed as:
$$\pi^{*} = \arg\min_{\pi}\,\bigl\{E[\pi] + \lambda\,\mathrm{TV}(\pi)\bigr\},$$
where $\pi^{*}$ is the optimal rehabilitation strategy, $E[\pi]$ is the energy functional, $\mathrm{TV}(\pi)$ is the total variation regularization term, and λ is the regularization parameter.
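The minimization of an energy term plus a total variation penalty can be sketched on a discrete training schedule as follows. Here the strategy is a list of per-session intensities, the energy is taken to be the squared deviation from a hypothetical per-session target, and plain subgradient descent is used as the solver; all three choices, together with the numerical values, are assumptions for illustration and not the optimizer specified by the invention.

```python
def energy(strategy, target):
    """Toy energy term: squared deviation from a per-session target intensity."""
    return sum((s - t) ** 2 for s, t in zip(strategy, target))


def total_variation(strategy):
    """Discrete total variation: summed absolute change between sessions."""
    return sum(abs(b - a) for a, b in zip(strategy, strategy[1:]))


def objective(strategy, target, lam):
    """Energy plus lambda times the total variation penalty."""
    return energy(strategy, target) + lam * total_variation(strategy)


def sign(v):
    return (v > 0) - (v < 0)


def subgradient(strategy, target, lam):
    """One subgradient of the objective with respect to the strategy."""
    grad = [2.0 * (s - t) for s, t in zip(strategy, target)]
    for k in range(len(strategy) - 1):
        s = sign(strategy[k + 1] - strategy[k])
        grad[k] -= lam * s
        grad[k + 1] += lam * s
    return grad


def optimize(target, lam, step=0.01, iterations=2000):
    """Subgradient descent that returns the best iterate seen."""
    current = list(target)
    best, best_value = list(current), objective(current, target, lam)
    for _ in range(iterations):
        grad = subgradient(current, target, lam)
        current = [c - step * g for c, g in zip(current, grad)]
        value = objective(current, target, lam)
        if value < best_value:
            best, best_value = list(current), value
    return best, best_value


# Hypothetical per-session target intensities on a 0..1 scale.
target_intensity = [0.2, 0.8, 0.3, 0.9, 0.4, 1.0]
smooth_plan, smooth_value = optimize(target_intensity, lam=0.5)
```

A larger λ yields a schedule whose intensity changes less abruptly from session to session, at the cost of deviating further from the per-session targets.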
The system of the present invention further comprises a policy optimization module 6 in communication with the policy generation module 5 and the policy output module 7. The main function of the strategy optimization module 6 is to further optimize the generated adaptive rehabilitation strategy to improve its exploration ability and adaptability.
Preferably, the strategy optimization module 6 optimizes the adaptive rehabilitation strategy using random matrix theory and probability theory. For example, a stochastic gradient descent method may be used:
$$\theta_{t+1} = \theta_{t} - \eta\,\nabla_{\theta} L(\theta_{t}),$$
where $\theta_{t}$ is the current strategy, η is the learning rate, and $L$ is the loss function. By introducing randomness, the system provided by the invention can better explore the strategy space and avoid becoming trapped in a local optimum.
In one embodiment of the invention, the data acquisition module 1 comprises a plurality of dedicated acquisition units. The surface electromyography acquisition unit 11 acquires the electromyographic signals of the patient, which are essential for assessing muscle activity and motor control. The inertial measurement unit 12 acquires the patient's joint angle and angular velocity data, which facilitates analysis of the patient's movement patterns and balance capabilities. The electroencephalogram acquisition unit 13 and the electrocardiogram acquisition unit 14 acquire the electroencephalographic and electrocardiographic signals of the patient, respectively, which provide important information about the states of the patient's nervous and cardiovascular systems.
In particular, the system of the present invention further comprises a data synchronization unit 15 in communication with the acquisition units. The main task of the data synchronization unit 15 is to time-align the acquired multi-source physiological data and unify their sampling rates. This step is crucial for subsequent data analysis and feature fusion, because different physiological signals may have different sampling rates and time delays.
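Sampling-rate unification can be sketched as resampling each signal onto a common rate. The snippet below uses plain linear interpolation; the function name and the example rates are hypothetical, and a practical implementation would normally apply an anti-aliasing filter before downsampling, which is omitted here.

```python
def resample_linear(samples, source_rate_hz, target_rate_hz):
    """Linearly interpolate uniformly sampled data onto a new sampling rate."""
    if len(samples) < 2:
        return list(samples)
    duration_s = (len(samples) - 1) / source_rate_hz
    n_out = int(duration_s * target_rate_hz + 1e-9) + 1
    resampled = []
    for k in range(n_out):
        position = k / target_rate_hz * source_rate_hz  # fractional source index
        lower = min(int(position), len(samples) - 2)
        fraction = position - lower
        resampled.append(samples[lower] * (1.0 - fraction) + samples[lower + 1] * fraction)
    return resampled


# Hypothetical example: bring a 1 Hz signal up to 2 Hz.
upsampled = resample_linear([0.0, 1.0, 2.0, 3.0], source_rate_hz=1, target_rate_hz=2)
```

Once every channel shares one rate, the remaining time offsets between channels can be handled by the alignment step.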
Preferably, the data synchronization unit 15 uses a dynamic time warping (DTW) algorithm for time alignment. The objective function of the DTW algorithm can be expressed as:
$$\mathrm{DTW}(X, Y) = \min_{w}\sum_{(i,j)\in w} d(x_{i}, y_{j}),$$
wherein X and Y are two time series to be aligned, w is an alignment path, and d is a distance function.
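The DTW objective can be evaluated with the classic dynamic programming recurrence. The sketch below is a minimal version for one-dimensional series with the absolute difference as the default distance d; the function name and the default distance are illustrative assumptions.

```python
def dtw_distance(x, y, dist=lambda a, b: abs(a - b)):
    """Minimum cumulative distance over all monotone alignment paths."""
    inf = float("inf")
    n, m = len(x), len(y)
    # cost[i][j] is the best cumulative distance aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(x[i - 1], y[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# A series and a time-stretched copy of it align with zero cost.
stretched_cost = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

The table has (n+1)(m+1) entries, so long recordings are usually aligned window by window or with a band constraint on the path.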
Through careful design of data acquisition and synchronization, the system can obtain high-quality, synchronized multi-modal physiological data, laying a solid foundation for subsequent analysis and strategy generation. The integration of multi-source data not only improves the accuracy with which the system evaluates the state of the patient, but also provides rich information to support the formulation of personalized rehabilitation strategies.
The system of the invention further comprises a rehabilitation effect evaluation module 8, which is in communication connection with the strategy output module 7 and the strategy generation module 5. The main function of the rehabilitation effect evaluation module 8 is to evaluate the effect of the adaptive rehabilitation strategy and to provide feedback according to the evaluation result so that the rehabilitation strategy can be further optimized.
Preferably, the rehabilitation effect evaluation module 8 receives rehabilitation training data from the strategy output module 7 and performs a comprehensive analysis in combination with preset clinical evaluation indexes. These clinical evaluation indexes may include, but are not limited to, the Fugl-Meyer score, the Barthel index, and the modified Rankin Scale. The rehabilitation effect evaluation module 8 employs a machine learning algorithm, in particular a deep learning model, to analyze the rehabilitation progress of the patient.
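For example, in one embodiment of the present invention, the rehabilitation effect evaluation module 8 uses a long short-term memory (LSTM) network to predict a patient's rehabilitation trajectory. The core formulas of the LSTM model can be expressed as:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f),$$
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i),$$
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o),$$
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C),$$
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t,$$
$$h_t = o_t * \tanh(C_t),$$
where $f_t$, $i_t$ and $o_t$ are the forget gate, the input gate and the output gate, respectively, $\tilde{C}_t$ is the candidate cell state, $C_t$ is the cell state, $h_t$ is the hidden state, $x_t$ is the input, σ is the sigmoid function, * denotes element-wise multiplication, and W and b are the weight and bias parameters.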
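A single LSTM time step can be sketched in plain Python as follows. The candidate and cell-state update follow the standard LSTM formulation; the parameter dictionary keys (`Wf`, `Wi`, `Wo`, `Wc` and the matching biases), the dimensions, and the all-zero parameter values are hypothetical and chosen only so that the result is easy to verify by hand.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(weights, vector, bias):
    """Compute W . vector + b for a matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step on the concatenated input [h_{t-1}, x_t]."""
    z = list(h_prev) + list(x_t)
    f_t = [sigmoid(v) for v in affine(params["Wf"], z, params["bf"])]  # forget gate
    i_t = [sigmoid(v) for v in affine(params["Wi"], z, params["bi"])]  # input gate
    o_t = [sigmoid(v) for v in affine(params["Wo"], z, params["bo"])]  # output gate
    c_hat = [math.tanh(v) for v in affine(params["Wc"], z, params["bc"])]  # candidate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hypothetical sizes: hidden state of 2 units, input of 1 feature.
hidden, inputs = 2, 1
zero_w = [[0.0] * (hidden + inputs) for _ in range(hidden)]
zero_b = [0.0] * hidden
params = {"Wf": zero_w, "Wi": zero_w, "Wo": zero_w, "Wc": zero_w,
          "bf": zero_b, "bi": zero_b, "bo": zero_b, "bc": zero_b}

# With all-zero parameters every gate equals 0.5 and the candidate is 0,
# so the cell state is simply halved.
h_t, c_t = lstm_step([0.3], [0.0, 0.0], [1.0, -1.0], params)
```

In practice the parameters are learned from sequences of assessment scores, and the step is applied repeatedly along each patient's timeline.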
Based on the predictions of the LSTM model, the rehabilitation effect evaluation module 8 calculates the deviation between the actual rehabilitation progress and the expected progress. If the deviation exceeds a preset threshold, for example when the actual progress falls outside the 95% confidence interval of the predicted progress, the rehabilitation effect evaluation module 8 sends an adjustment signal to the strategy generation module 5. This dynamic evaluation and feedback mechanism ensures that the rehabilitation strategy responds in time to the individual needs and progress of the patient.
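The threshold check can be sketched as a simple interval test. The snippet assumes that the prediction error is approximately Gaussian with a known standard deviation, which the text does not state; the function name, its parameters, and the example scores are likewise hypothetical.

```python
from statistics import NormalDist


def needs_adjustment(actual, predicted, prediction_std, confidence=0.95):
    """Return True when the actual score lies outside the two-sided interval."""
    # Two-sided critical value, roughly 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return abs(actual - predicted) > z * prediction_std


# Hypothetical scores: predicted 40 with a standard deviation of 4.
far_off = needs_adjustment(actual=30.0, predicted=40.0, prediction_std=4.0)
on_track = needs_adjustment(actual=38.0, predicted=40.0, prediction_std=4.0)
```

A `True` result corresponds to sending the adjustment signal to the strategy generation module 5.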
The system of the invention also comprises a man-machine interaction module 9 which is in communication connection with the policy output module 7. The design of the human-machine interaction module 9 aims at improving the usability of the system and the compliance of the patient, which is of great importance for the long-term effect of stroke rehabilitation training.
In a preferred embodiment of the invention, the human-machine interaction module 9 comprises a visualization unit 91 and a feedback processing unit 92. The visualization unit 91 is responsible for converting the adaptive rehabilitation strategy sent by the strategy output module 7 into intuitive, easily understood visual information. For example, a 3D animation may be used to show the correct motion posture, or a progress bar may be used to display the completion of the rehabilitation training.
The visualization unit 91 employs advanced computer graphics techniques, such as real-time rendering and physical simulation, to ensure the realism and fluency of the visual effect. The rendering process may be represented by the following simplified rendering equation:
$$L_o(x,\omega_o,\lambda,t) = L_e(x,\omega_o,\lambda,t) + \int_{\Omega} f_r(x,\omega_i,\omega_o,\lambda,t)\,L_i(x,\omega_i,\lambda,t)\,(\omega_i\cdot n)\,d\omega_i,$$
where $L_o$ is the outgoing radiance, $L_e$ is the emitted radiance, $f_r$ is the bidirectional reflectance distribution function (BRDF), $L_i$ is the incoming radiance, $x$ is the surface point, $\omega_o$ and $\omega_i$ are the outgoing and incoming directions, $n$ is the surface normal, λ is the wavelength, and t is the time.
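For the rendering equation used by the visualization unit 91, the reflection integral can be estimated by Monte Carlo sampling. The sketch below considers the simplest case: a Lambertian surface (BRDF equal to albedo divided by π) under constant incoming radiance, with wavelength and time dependence dropped. These simplifications, the function names, and the sample count are assumptions for illustration; for this case the integral has the closed form albedo × incoming radiance, which makes the estimate easy to check.

```python
import math
import random


def sample_cos_theta_uniform(rng):
    """Cosine of the polar angle for a direction drawn uniformly over the hemisphere."""
    # Under uniform solid-angle sampling, cos(theta) is uniform on [0, 1).
    return rng.random()


def outgoing_radiance(albedo, incoming_radiance, emitted=0.0, samples=20000, seed=0):
    """Monte Carlo estimate of L_o = L_e + integral of f_r * L_i * cos(theta) d(omega)."""
    rng = random.Random(seed)
    brdf = albedo / math.pi       # Lambertian BRDF
    pdf = 1.0 / (2.0 * math.pi)   # uniform density over the hemisphere
    total = 0.0
    for _ in range(samples):
        cos_theta = sample_cos_theta_uniform(rng)
        total += brdf * incoming_radiance * cos_theta / pdf
    return emitted + total / samples


estimate = outgoing_radiance(albedo=0.8, incoming_radiance=1.0)
```

Real-time renderers typically replace this brute-force estimate with precomputed or analytic lighting to sustain interactive frame rates.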
The feedback processing unit 92 is responsible for receiving feedback information from the patient and converting it into a form that the system can interpret. Such feedback may include voice commands, gesture recognition, or touch screen input, among others. The feedback processing unit 92 uses natural language processing (NLP) and computer vision techniques to parse the patient's feedback. For example, for speech feedback, a recurrent neural network (RNN) may be used for speech recognition:
$$h_t = \tanh(W_{hx}\,x_t + W_{hh}\,h_{t-1} + b_h),$$
where $h_t$ is the hidden state at time t, $x_t$ is the input, $W_{hx}$ and $W_{hh}$ are weight matrices, and $b_h$ is the bias term.
The design of the human-machine interaction module 9 not only improves the usability of the system but also gives the patient a richer feedback channel, which helps strengthen patient engagement and makes the rehabilitation training better targeted.
Finally, the invention also provides a stroke rehabilitation training method based on deep reinforcement learning. The method is closely related to the system and comprises the steps of data acquisition, preprocessing, feature fusion, dynamics modeling, strategy generation, output and the like.
In the method, firstly, multi-source physiological data of a patient are acquired, including surface electromyographic signals, inertial measurement unit data, electroencephalographic signals and electrocardiosignals. These data are collected by high-precision sensors and subjected to preliminary filtering and noise reduction.
Next, the acquired multi-source physiological data are preprocessed and features are extracted. This step involves data cleaning, normalization, dimensionality reduction, and similar operations. For example, dimensionality reduction can be performed using principal component analysis (PCA):
Y=XW,
where X is the original data matrix, W is the eigenvector matrix, and Y is the dimension-reduced data matrix.
Next, spatio-temporal feature fusion is performed based on the preprocessed data. This step aims to capture the correlations between different physiological signals and their temporal characteristics. Tensor decomposition methods, such as the Tucker decomposition, may be used:
$$\mathcal{X} \approx \mathcal{G} \times_{1} A \times_{2} B \times_{3} C,$$
where $\mathcal{X}$ is the raw data tensor, $\mathcal{G}$ is the core tensor, and A, B and C are the factor matrices.
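The Tucker model can be illustrated from the reconstruction side: given a core tensor and three factor matrices, the data tensor is obtained by multiplying the core along each mode. The sketch below evaluates this product with nested lists; it does not compute the decomposition itself, and the interpretation of the three modes (for example channel, time window, and feature) as well as the numerical values are assumptions.

```python
def tucker_reconstruct(core, a, b, c):
    """Evaluate X[i][j][k] = sum_{p,q,r} G[p][q][r] * A[i][p] * B[j][q] * C[k][r]."""
    n_p, n_q, n_r = len(core), len(core[0]), len(core[0][0])
    return [
        [
            [
                sum(
                    core[p][q][r] * a[i][p] * b[j][q] * c[k][r]
                    for p in range(n_p)
                    for q in range(n_q)
                    for r in range(n_r)
                )
                for k in range(len(c))
            ]
            for j in range(len(b))
        ]
        for i in range(len(a))
    ]


# Hypothetical rank-(1, 1, 1) example: a 2 x 3 x 1 tensor from a 1 x 1 x 1 core.
core = [[[2.0]]]
A = [[1.0], [2.0]]          # 2 x 1 factor for mode 1
B = [[1.0], [0.5], [0.0]]   # 3 x 1 factor for mode 2
C = [[3.0]]                 # 1 x 1 factor for mode 3
X_hat = tucker_reconstruct(core, A, B, C)
```

Because the core is much smaller than the data tensor, the core entries together with the factor matrices serve as a compact fused feature representation.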
A nonlinear dynamics model is then established based on the fused features. This step uses the quantum mechanical and group-theoretic methods described above to capture the complex dynamic behavior of the system.
Finally, based on the nonlinear dynamics model, an adaptive rehabilitation strategy is generated and output to guide the patient through stroke rehabilitation training. The strategy generation process employs a deep reinforcement learning algorithm, such as proximal policy optimization (PPO):
$$L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_{t}\Bigl[\min\bigl(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\bigr)\Bigr],$$
where $r_t(\theta)$ is the probability ratio, $\hat{A}_t$ is the advantage function estimate, and ε is the clipping parameter.
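The clipped PPO objective can be evaluated on a batch of sampled time steps as sketched below. The function names are illustrative, and the default ε = 0.2 is a commonly used value assumed here, not one given in the text.

```python
def clip(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Empirical mean of min(r*A, clip(r, 1-eps, 1+eps)*A) over sampled steps."""
    terms = [
        min(r * adv, clip(r, 1.0 - epsilon, 1.0 + epsilon) * adv)
        for r, adv in zip(ratios, advantages)
    ]
    return sum(terms) / len(terms)


# A ratio of 1.5 with positive advantage is clipped to 1.2; a ratio of 0.5
# with negative advantage is clipped to 0.8, the more pessimistic value.
value = ppo_clip_objective(ratios=[1.5, 0.5], advantages=[1.0, -1.0])
```

Clipping removes the incentive to move the new strategy far from the old one in a single update, which is desirable when each update changes what a patient is asked to do.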
With this method, a personalized and adaptive rehabilitation training scheme can be provided for stroke patients, effectively improving the effect and efficiency of rehabilitation. The method is innovative in that it combines leading-edge techniques from multiple disciplines, such as quantum mechanics, group theory, and deep learning, and brings new ideas and solutions to the field of stroke rehabilitation.
To verify the superiority of the stroke rehabilitation training system and method based on deep reinforcement learning, a series of experiments was carried out comparing an example of the invention with two comparative examples. The experiments were designed to simulate a realistic stroke rehabilitation training scenario and to evaluate the performance of the system in improving the effect and efficiency of patient rehabilitation.
Example 1 used the complete system of the invention, including core technologies such as multi-source data acquisition, deep reinforcement learning strategy generation, and the adaptive rehabilitation scheme. Comparative example 1 used a conventional fixed rehabilitation regimen with no personalized adjustment. Comparative example 2 generated the rehabilitation regimen using a simple machine learning algorithm but lacked the adaptive capability of deep reinforcement learning.
Sixty post-stroke hemiplegic patients were selected to participate in 12 weeks of rehabilitation training, with 20 patients in each group. The main evaluation indexes were the Fugl-Meyer Assessment upper extremity motor function score (FMA-UE), the activities of daily living (ADL) score, rehabilitation training compliance, and patient satisfaction. Together these indexes reflect the rehabilitation effect, the improvement in the patient's quality of life, and the practicality of the system.
The FMA-UE score was assessed using the standard Fugl-Meyer rating scale, with a maximum of 66 points. The ADL score was assessed using the Modified Barthel Index (MBI), with a maximum of 100 points. Rehabilitation training compliance was measured as the proportion of prescribed training tasks that patients completed, with a maximum of 100%. Patient satisfaction was rated on a 5-point Likert scale, with 1 the lowest score and 5 the highest.
The experimental results are shown in the following table:
| Evaluation index | Example 1 | Comparative example 1 | Comparative example 2 |
| --- | --- | --- | --- |
| FMA-UE score improvement (points) | 18.5±2.3 | 10.2±1.8 | 13.7±2.1 |
| ADL (MBI) score improvement (points) | 22.3±2.7 | 12.8±2.2 | 16.5±2.4 |
| Rehabilitation training compliance | 92%±3% | 75%±5% | 83%±4% |
| Patient satisfaction (1 to 5) | 4.6±0.3 | 3.2±0.5 | 3.8±0.4 |
The experimental results show that Example 1 of the present invention outperforms both comparative examples on every evaluation index. The FMA-UE score improvement of Example 1 reached 18.5 points, which is 81.4% higher than that of comparative example 1 and 35.0% higher than that of comparative example 2. This indicates that the system of the present invention improves the upper limb motor function of patients more effectively, owing to its adaptive rehabilitation strategy and accurate movement guidance.
Example 1 also performed well in terms of ADL score improvement, which was 74.2% and 35.2% higher than in comparative examples 1 and 2, respectively. This means that the invention not only improves the motor function of patients but also markedly improves their ability to care for themselves in daily life, which has a direct positive impact on their quality of life.
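The relative improvements quoted above follow directly from the mean values in the results table, as the short check below shows. The dictionary keys and the helper name are illustrative; the numbers are those reported in the table.

```python
def relative_gain_percent(value, baseline):
    """Percentage by which `value` exceeds `baseline`."""
    return (value / baseline - 1.0) * 100.0


# Mean score improvements taken from the results table.
fma_gain = {"example_1": 18.5, "comparative_1": 10.2, "comparative_2": 13.7}
adl_gain = {"example_1": 22.3, "comparative_1": 12.8, "comparative_2": 16.5}

for name, scores in (("FMA-UE", fma_gain), ("ADL", adl_gain)):
    for baseline in ("comparative_1", "comparative_2"):
        gain = relative_gain_percent(scores["example_1"], scores[baseline])
        print(f"{name} vs {baseline}: {gain:.1f}%")
```

The four printed values are 81.4%, 35.0%, 74.2% and 35.2%, matching the percentages stated in the text.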
Rehabilitation training compliance is an important indicator of system practicality and patient acceptance. The compliance in Example 1 was as high as 92%, well above the 75% of comparative example 1 and the 83% of comparative example 2. This result suggests that the system of the present invention better motivates patients to train, possibly owing to its personalized training scheme and intuitive human-machine interface.
In terms of patient satisfaction, Example 1 likewise led by a wide margin, reaching 4.6 points out of 5. This reflects the high acceptance of the system among patients, likely owing to its rehabilitation effect and good user experience.
These experimental results demonstrate the superiority of the invention in the field of stroke rehabilitation training. Through a deep reinforcement learning algorithm, the system can dynamically adjust the rehabilitation strategy according to the real-time state and progress of the patient, thereby achieving truly personalized and precise rehabilitation. The fusion and analysis of multi-source data enables the system to comprehensively grasp the physiological state of the patient and thus to formulate a more scientific and effective training scheme.
It is particularly notable that the present invention performs well in improving patient compliance. This is especially important for stroke rehabilitation, because rehabilitation is often a long and difficult process, and sustained motivation to train is critical to its effectiveness. The system improves patient participation and adherence through vivid visual interfaces and timely progress feedback.
In summary, the stroke rehabilitation training system and method based on deep reinforcement learning of the invention have clear advantages in improving the motor function of patients, improving their quality of life, and enhancing training compliance. This combination of advanced algorithms and user-centered design opens new possibilities for the field of stroke rehabilitation, is expected to play an important role in clinical practice, and may benefit more stroke patients.
It should be noted that the above-mentioned embodiments are only preferred embodiments of the present invention, and are not intended to limit the present invention, and any modifications, equivalent substitutions, improvements, etc. within the principle of the present invention should be included in the protection scope of the present invention.