Disclosure of Invention
By providing an operation and maintenance management system and an operation and maintenance management method for a smart city, the embodiment of the application solves the problem of low operation and maintenance management efficiency in the prior art and improves operation and maintenance management efficiency.
The embodiment of the application provides an operation and maintenance management system for a smart city, which comprises a data acquisition and encryption module, a fault management module, a fault prediction model construction module and a resource scheduling module; the data acquisition and encryption module is used for collecting operation and maintenance data of the smart city in a preset time period in real time, preprocessing the operation and maintenance data to obtain first operation and maintenance data, and storing the first operation and maintenance data into an operation and maintenance database; the fault management module is used for performing fault detection on the first operation and maintenance data according to a preset fault early warning rule and feeding back the fault detection result to operation and maintenance personnel in chart form, wherein the fault detection judges whether a fault alarm needs to be triggered according to a fault early warning index, and the fault early warning index measures the degree of matching between the first operation and maintenance data and the preset fault early warning rule; the fault prediction model construction module is used for generating a fault label according to the first operation and maintenance data after fault detection, and for constructing a fault prediction model and performing feature matrix training in combination with the real-time operation state of the operation and maintenance equipment in a preset environment; the resource scheduling module is used for adjusting the resource allocation and task scheduling of the operation and maintenance equipment according to the real-time load condition of the operation and maintenance equipment during operation, in combination with the fault prediction model and a resource adjustment dimension, wherein the resource adjustment dimension measures the priority of the resource allocation and task scheduling of the operation and maintenance equipment.
Further, the specific obtaining step of the fault early warning index includes: collecting equipment performance data and network transmission data of operation and maintenance equipment in a preset time period in real time, wherein the equipment performance data comprise equipment environment data and memory disk utilization rate, and the network transmission data comprise bandwidth delay rate and data packet loss rate; acquiring a memory disk coefficient, a bandwidth delay coefficient and a data packet loss coefficient according to the memory disk reference use rate, the bandwidth reference delay rate, the data reference packet loss rate and the corresponding reference deviation; obtaining a fault early warning index by combining equipment environment influence factors set by preset performance indexes of equipment, wherein the fault early warning index is calculated by the following formula:
[formula not reproduced in the source text]
wherein m is the number of the preset time period, m = 1, 2, ..., M, and M is the total number of preset time periods; the quantities entering the formula are the fault early warning index of the first operation and maintenance data in the mth preset time period, the natural constant e, the memory disk coefficient of the first operation and maintenance data in the mth preset time period, the bandwidth delay coefficient of the first operation and maintenance data in the mth preset time period, the data packet loss coefficient of the first operation and maintenance data in the mth preset time period, and the device environmental impact factor.
Further, the specific steps of feature matrix training include: taking the first operation and maintenance data with fault labels as the rows and columns of the feature matrix to determine the position of the fault to be predicted, and setting initial parameters of feature matrix training according to a model complexity coefficient of the fault prediction model in a preset time period, wherein the model complexity coefficient measures the complexity of the fault prediction model; inputting real-time running state data of the operation and maintenance equipment into the fault prediction model according to the initial parameters of feature matrix training, and adjusting the parameters of the fault prediction model in real time during training until the preset training indexes of the model are met; and mapping the parameters adjusted in real time and the position of the fault to be predicted to the corresponding fault labels so as to update the feature matrix periodically.
Further, the specific obtaining steps of the resource adjustment dimension are as follows: setting a target range of load balancing based on performance indexes and task demands of the operation and maintenance equipment, and acquiring a load balancing rate by combining operation load conditions of the operation and maintenance equipment; acquiring resource utilization rate of the operation and maintenance equipment in real time by using resource utilization data in the operation process, and obtaining a resource adjustment dimension by combining the average utilization rate of the resources of the operation and maintenance equipment in a preset time period and a load reference balance rate, wherein the resource adjustment dimension is calculated by the following formula:
[formula not reproduced in the source text]
wherein m is the number of the preset time period, m = 1, 2, ..., M, and M is the total number of preset time periods; g is the number of the operation and maintenance equipment, g = 1, 2, ..., G, and G is the total number of operation and maintenance devices; the quantities entering the formula are the resource adjustment dimension of the gth operation and maintenance equipment in the mth preset time period, the natural constant e, the load balancing rate of the gth operation and maintenance equipment in the mth preset time period, the load reference balance rate, the load balancing rate reference deviation, the resource utilization rate of the gth operation and maintenance equipment in the mth preset time period, the average utilization rate of resources, and the resource utilization reference deviation.
The embodiment of the application provides an operation and maintenance management method for a smart city, which comprises the following steps: S1, collecting operation and maintenance data of a smart city in a preset time period in real time, preprocessing the operation and maintenance data to obtain first operation and maintenance data, and storing the first operation and maintenance data into an operation and maintenance database; S2, performing fault detection on the first operation and maintenance data according to a preset fault early warning rule, and feeding back the fault detection result to operation and maintenance personnel in chart form, wherein the fault detection judges whether a fault alarm needs to be triggered according to a fault early warning index, and the fault early warning index measures the degree of matching between the first operation and maintenance data and the preset fault early warning rule; S3, generating a fault label according to the first operation and maintenance data after fault detection, and constructing a fault prediction model and performing feature matrix training in combination with the real-time operation state of the operation and maintenance equipment in a preset environment; and S4, adjusting the resource allocation and task scheduling of the operation and maintenance equipment according to the real-time load condition of the operation and maintenance equipment during operation, in combination with the fault prediction model and a resource adjustment dimension, wherein the resource adjustment dimension measures the priority of the resource allocation and task scheduling of the operation and maintenance equipment.
One or more technical solutions provided in the embodiments of the present application at least have the following technical effects or advantages:
1. Acquiring first operation and maintenance data by collecting operation and maintenance data of a smart city in a preset time period in real time and preprocessing, then carrying out fault detection on the first operation and maintenance data according to a preset fault early warning rule, constructing a fault prediction model and carrying out feature matrix training by combining the real-time operation state of operation and maintenance equipment in a preset environment, and finally adjusting the resource allocation and task scheduling of the operation and maintenance equipment according to the real-time load condition of the operation and maintenance equipment in the operation process by combining the fault prediction model and the resource adjustment dimension, thereby realizing the improvement of the operation and maintenance equipment operation efficiency, further realizing the improvement of the operation and maintenance management efficiency, and effectively solving the problem of low operation and maintenance management efficiency in the prior art.
2. The method comprises the steps of collecting real-time running state data of operation and maintenance equipment in a preset environment in real time, carrying out normalization processing, extracting fault prediction feature data from the normalized real-time running state data, adding corresponding fault labels for the fault prediction feature data by combining first operation and maintenance data after fault detection, constructing a fault prediction framework by combining the fault prediction feature data with the fault labels based on a fault detection algorithm, and finally inputting the fault prediction feature data into the constructed fault prediction framework to construct a fault prediction model, so that the accurate construction of the fault prediction framework is realized, and the accuracy and reliability of the fault prediction model are improved.
3. Task state information of the operation and maintenance equipment during normal operation is collected in real time, and the fault trend of the operation and maintenance equipment is predicted with the trained fault prediction model; overloaded resources are identified according to the operation load condition and ranked by priority in combination with the resource adjustment dimension; finally, decisions on resource adjustment and task scheduling are determined according to the identified overloaded resources, and low-priority tasks are dynamically delayed in combination with the actual operation requirements of the operation and maintenance equipment, so that the efficiency and accuracy of operation and maintenance resource allocation are improved and more accurate training of the fault prediction model is realized.
Detailed Description
The embodiment of the application solves the problem of low operation and maintenance management efficiency in the prior art by providing an operation and maintenance management system and a method thereof for a smart city, collects operation and maintenance data of the smart city in real time within a preset time period through a data acquisition and encryption module, preprocesses the operation and maintenance data to obtain first operation and maintenance data, stores the first operation and maintenance data into an operation and maintenance database, then carries out fault detection on the first operation and maintenance data through a fault management module according to a preset fault early warning rule, feeds back a fault detection result to operation and maintenance personnel in a chart form, then generates a fault label according to the first operation and maintenance data after the fault detection through a fault prediction model construction module, simultaneously builds a fault prediction model and carries out feature matrix training in combination with the real-time operation state of operation and maintenance equipment in a preset environment, and finally adjusts resource allocation and task scheduling of the operation and maintenance equipment through a resource scheduling module according to the real-time load condition of the operation and maintenance equipment in the operation process.
The technical scheme in the embodiment of the application aims to solve the problem of low operation and maintenance management efficiency, and the general thought is as follows:
Operation and maintenance data of the smart city in a preset time period is collected in real time and preprocessed to obtain first operation and maintenance data, which is stored into an operation and maintenance database; fault detection is then performed on the first operation and maintenance data according to a preset fault early warning rule to generate a fault label, while a fault prediction model is constructed and feature matrix training is performed according to the real-time operation state of the operation and maintenance equipment in a preset environment; finally, the resource allocation and task scheduling of the operation and maintenance equipment are adjusted according to the real-time load condition of the operation and maintenance equipment during operation, in combination with the fault prediction model and the resource adjustment dimension, so that the operation and maintenance management efficiency is improved.
In order to better understand the above technical solutions, the following detailed description will refer to the accompanying drawings and specific embodiments.
Fig. 1 is a schematic structural diagram of an operation and maintenance management system for a smart city according to an embodiment of the present application. The system comprises a data acquisition and encryption module, a fault management module, a fault prediction model construction module and a resource scheduling module; the data acquisition and encryption module is used for collecting operation and maintenance data of the smart city in a preset time period in real time and preprocessing it to obtain first operation and maintenance data, and for storing the first operation and maintenance data in the operation and maintenance database, wherein the operation and maintenance data reflects the real-time operation state of the operation and maintenance equipment in the preset time period, the preprocessing comprises analog-to-digital conversion and data encryption, the operation and maintenance database is a data storage center with a query mechanism, data backup and recovery testing, and the first operation and maintenance data comprises operation and maintenance equipment performance data and network transmission data; the fault management module is used for performing fault detection on the first operation and maintenance data according to a preset fault early warning rule and feeding back the fault detection result to operation and maintenance personnel in chart form, wherein the fault detection judges whether a fault alarm needs to be triggered according to a fault early warning index, and the fault early warning index measures the degree of matching between the first operation and maintenance data and the preset fault early warning rule; the fault prediction model construction module is used for generating a fault label according to the first operation and maintenance data after fault detection, and for constructing a fault prediction model and performing feature matrix training in combination with the real-time operation state of the operation and maintenance equipment in a preset environment, wherein the fault label marks the fault type and fault position used by the fault prediction model in the feature matrix training process, the fault prediction model is used for visually monitoring the real-time operation state of the operation and maintenance equipment in a preset time period and identifying abnormal operation states, and the feature matrix training is used for predicting the mapping relation between the abnormal operation state and the first operation and maintenance data and presenting the mapping relation in coordinate form; the resource scheduling module is used for adjusting the resource allocation and task scheduling of the operation and maintenance equipment according to the real-time load condition of the operation and maintenance equipment during operation, in combination with the fault prediction model and the resource adjustment dimension, wherein the resource adjustment dimension measures the priority of the resource allocation and task scheduling of the operation and maintenance equipment.
In this embodiment, in actual application, after the first operation and maintenance data is collected, a statistical method is further required to detect an abnormal mode in the first operation and maintenance data, and the performance and the network condition of the operation and maintenance device are fed back to the operation and maintenance manager; the feature matrix is a representative feature set extracted from historical data of the operation and maintenance equipment, the feature matrix can describe the operation state and possible fault modes of the operation and maintenance equipment, the model can learn the relation between the features and faults and conduct fault prediction according to the relation through feature matrix training, so that operation and maintenance personnel can discover potential faults in advance, and meanwhile corresponding measures are taken for prevention and repair; based on the real-time load condition, the fault prediction result and the resource adjustment dimension, the resource scheduling module dynamically adjusts the resource allocation and task scheduling of the operation and maintenance equipment, for example, if the load of one operation and maintenance equipment is too high and the fault is predicted to occur within a preset time period, the module can transfer part of tasks to other equipment or allocate more resources for the equipment so as to improve the processing capacity of the equipment, thereby improving the operation and maintenance management efficiency.
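The task-transfer example described above can be sketched as follows. This is a minimal illustration, not the scheduling method of the application: the `Device` class, the `rebalance` function, the 0.8 load threshold and the "smallest task first, least-loaded target" policy are all assumptions introduced here, and the fault prediction result is reduced to a boolean flag.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical operation and maintenance device with a fixed capacity."""
    name: str
    capacity: float
    tasks: dict = field(default_factory=dict)  # task name -> load units
    fault_predicted: bool = False              # stand-in for the prediction model output

    @property
    def load_ratio(self) -> float:
        return sum(self.tasks.values()) / self.capacity


def rebalance(devices, high_load=0.8):
    """Move tasks off devices that are overloaded AND predicted to fail.

    Returns a list of (task, source, target) moves. The threshold and the
    selection policy are illustrative assumptions.
    """
    moves = []
    for src in devices:
        if not (src.fault_predicted and src.load_ratio > high_load):
            continue
        # Try the smallest tasks first; iterate over a sorted copy so that
        # removing entries from src.tasks inside the loop is safe.
        for task, load in sorted(src.tasks.items(), key=lambda kv: kv[1]):
            if src.load_ratio <= high_load:
                break
            candidates = [
                d for d in devices
                if d is not src
                and not d.fault_predicted
                and (sum(d.tasks.values()) + load) / d.capacity <= high_load
            ]
            if not candidates:
                continue  # no healthy device can absorb this task
            dst = min(candidates, key=lambda d: d.load_ratio)
            dst.tasks[task] = src.tasks.pop(task)
            moves.append((task, src.name, dst.name))
    return moves
```

In a fuller design the priority used to pick which task moves first would come from the resource adjustment dimension rather than from task size.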
Further, fig. 2 is a schematic structural diagram of the data acquisition and encryption module in the operation and maintenance management system for a smart city according to an embodiment of the present application, where the data acquisition and encryption module includes a data acquisition unit, a data conversion unit, a data encryption unit and a data storage unit; the data acquisition unit is used for obtaining physical data of the operation and maintenance equipment in the smart city in a preset time period through wireless sensors, wherein the physical data comprises voltage data, current data, temperature data and load data; the data conversion unit is used for converting the analog signal data corresponding to the physical data into digital signal data through analog-to-digital conversion; the data encryption unit is used for encrypting the operation and maintenance data with a symmetric encryption algorithm in combination with the digital signal data, so as to prevent the operation and maintenance data from being tampered with during transmission; the data storage unit is used for storing the operation and maintenance data after data conversion and data encryption into the operation and maintenance database and performing interactive retrieval, wherein interactive retrieval means that the operation and maintenance data required by a user is retrieved from the operation and maintenance database through preset query conditions and interface operations.
In this embodiment, the data acquisition unit generally acquires physical data of the operation and maintenance device in the smart city in real time through the wireless sensor network, where the physical data may include temperature, humidity, pressure, vibration and electric quantity, and directly reflects the operation condition and environmental condition of the device; the analog signals in the data conversion unit are continuously changed signals, the digital signals are discrete signals with limited values, and the converted digital signals are easier to process and analyze by a computer; the data storage unit generally adopts high-performance storage equipment and redundancy design, supports mass storage and efficient expansion of the first operation and maintenance data, and meets the increasing data demand in the smart city; through real-time acquisition, accurate conversion, safe encryption and efficient storage of operation and maintenance data, powerful data support is provided for city managers, and the improvement of operation and maintenance management safety and stability is realized.
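The conversion and tamper-protection steps can be illustrated with a small sketch. It is written under several assumptions: the 5.0 V reference, the 10-bit resolution and the function names are hypothetical, and the symmetric encryption algorithm of the embodiment is not reproduced here. Because the Python standard library does not ship a symmetric cipher, an HMAC-SHA256 tag is used as a stand-in that only detects tampering with a shared key; it does not hide the data.

```python
import hashlib
import hmac


def quantize(voltage, v_ref=5.0, bits=10):
    """Map an analog value in [0, v_ref] to an integer code in [0, 2**bits - 1].

    Values outside the range are clamped, as a real converter would saturate.
    """
    levels = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * levels)


def sign(payload: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the payload with a shared secret key."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(payload: bytes, tag: bytes, key: bytes) -> bool:
    """Check the tag in constant time; False means the payload was altered."""
    return hmac.compare_digest(sign(payload, key), tag)


# Usage: digitize one reading, attach a tag, and verify it on receipt.
key = b"shared-secret-for-illustration-only"
payload = quantize(3.3).to_bytes(2, "big")
tag = sign(payload, key)
received_ok = verify(payload, tag, key)
```

A deployment that follows the embodiment would additionally encrypt the payload with a symmetric cipher from a vetted cryptographic library.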
Further, the specific process of fault detection for the first operation and maintenance data is as follows: S11, performing fault matching between the first operation and maintenance data and the preset fault early warning rule to identify key operation and maintenance data, wherein the key operation and maintenance data describes the usage of the operation and maintenance equipment and the loss of data packets during network transmission; S12, detecting the key operation and maintenance data according to the fault early warning index and judging whether the fault early warning index meets (does not exceed) a preset fault threshold; if so, monitoring the operation state of the operation and maintenance equipment in the preset time period in real time; if not, executing S13; and S13, triggering a fault alarm in the fault management module according to the detection result of the key operation and maintenance data, displaying alarm information on a monitoring interface, and generating a fault report according to the alarm information, wherein the fault report visualizes the fault occurrence rate and fault type distribution of the operation and maintenance equipment in the preset time period.
In this embodiment, the fault early warning rule (that is, the preset fault early warning rule) must be defined before fault matching is performed. Such rules are generally formulated from historical fault data and equipment specifications and are used to identify potential fault risks; in practical application they generally include threshold judgment, pattern recognition and association analysis. The collected data is converted into the format required by the rules, and the cleaned and preprocessed first operation and maintenance data is then matched against the preset fault early warning rule. This process is generally implemented in software: the first operation and maintenance data is scanned to find the data items that satisfy a rule, and subsequent processing is chosen according to the matching result; for example, a high-risk fault early warning may require immediate fault checking and repair, while a low-risk early warning may only require further monitoring and processing. The preset fault threshold is usually set according to the historical operation data and the fault prediction requirements of the operation and maintenance equipment. If the fault early warning index is lower than the fault threshold, the operation and maintenance equipment is in a normal state and normal operation and monitoring can continue; if the fault early warning index exceeds the fault threshold, the operation and maintenance equipment has a potential fault risk, and an alarm must be triggered, alarm information sent to the operation and maintenance management personnel, and the related fault information and a suggested processing scheme provided, so that the fault can be eliminated and normal operation of the system restored. In this way the accuracy and reliability of fault detection are improved.
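The threshold decision and the fault report can be sketched as follows. This is an illustrative reading of the flow, assuming that an alarm is raised when the index exceeds the threshold; the function names, the event tuple layout and the sample values are hypothetical.

```python
from collections import Counter


def check_device(index, threshold):
    """Return 'alarm' when the early warning index exceeds the threshold,
    otherwise 'monitor' (continue real-time monitoring)."""
    return "alarm" if index > threshold else "monitor"


def fault_report(events, threshold):
    """Summarize one preset time period.

    events: list of (device, fault_type, index) tuples (hypothetical layout).
    Returns the share of events that raised an alarm and the distribution
    of fault types among those alarms.
    """
    alarms = [
        (device, fault_type)
        for device, fault_type, index in events
        if check_device(index, threshold) == "alarm"
    ]
    rate = len(alarms) / len(events) if events else 0.0
    return {
        "fault_rate": rate,
        "type_distribution": Counter(fault_type for _, fault_type in alarms),
    }


# Usage with made-up values for one time period.
events = [
    ("sw1", "packet_loss", 0.90),
    ("srv1", "disk", 0.40),
    ("sw2", "packet_loss", 0.85),
    ("srv2", "disk", 0.95),
]
report = fault_report(events, threshold=0.8)
```

The two fields of the returned dictionary correspond to the fault occurrence rate and the fault type distribution that the fault report is meant to visualize.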
Further, the first operation and maintenance data can be processed by a statistical method to obtain an average value and a standard deviation, and the degree of deviation between the current first operation and maintenance data and the average value can be evaluated with a standardized score to obtain the fault early warning index, from which it is judged whether a fault risk exists. Besides this method, the fault early warning index can also be obtained through the following steps and formula: collecting equipment performance data and network transmission data of the operation and maintenance equipment in a preset time period in real time, wherein the equipment performance data comprises equipment environment data and memory disk utilization rate, and the network transmission data comprises bandwidth delay rate and data packet loss rate; acquiring a memory disk coefficient, a bandwidth delay coefficient and a data packet loss coefficient according to a memory disk reference utilization rate, a bandwidth reference delay rate, a data reference packet loss rate and the corresponding reference deviations, wherein the memory disk coefficient measures the utilization efficiency of the memory disk in the operation and maintenance equipment in the preset time period, the bandwidth delay coefficient measures the delay of the first operation and maintenance data during transmission, and the data packet loss coefficient measures the packet loss of the first operation and maintenance data during transmission; and obtaining the fault early warning index in combination with an equipment environment influence factor set according to the preset performance indexes of the equipment, wherein the equipment environment influence factor evaluates the degree of influence of environmental conditions on the operation and maintenance equipment; the fault early warning index is calculated by the following formula:
[formula not reproduced in the source text]
wherein m is the number of the preset time period, m = 1, 2, ..., M, and M is the total number of preset time periods; the quantities entering the formula are the fault early warning index of the first operation and maintenance data in the mth preset time period, the natural constant e, the memory disk coefficient of the first operation and maintenance data in the mth preset time period, the bandwidth delay coefficient of the first operation and maintenance data in the mth preset time period, the data packet loss coefficient of the first operation and maintenance data in the mth preset time period, and the device environmental impact factor.
In this embodiment, the memory disk usage rate can generally be obtained from a monitoring device or hardware monitoring software of the operation and maintenance equipment, and the bandwidth delay rate and the data packet loss rate can generally be obtained through a network monitoring tool or the management interface of a switch, or measured with a ping command or a dedicated network performance testing tool. In practical application, the memory disk reference usage rate, the bandwidth reference delay rate, the data reference packet loss rate and the corresponding reference deviations are generally obtained by weighted averaging of historical operation data of the operation and maintenance equipment, and the memory disk coefficient, the bandwidth delay coefficient and the data packet loss coefficient are generally calculated from these reference values and reference deviations by a linear transformation chosen according to actual operation and maintenance management requirements. To acquire these data periodically and calculate the corresponding coefficients, an automation script can be written to run against the operation and maintenance equipment at regular intervals and store the data produced during operation at a designated location. It should be noted that when […] = 1 and […], the fault early warning index is at its maximum value and the fault occurrence rate of the operation and maintenance equipment during operation is the highest, and when […] = 0 and […], the fault early warning index is at its minimum value and the operation and maintenance equipment is in a normal running state; the specific calculation method also depends on changes in the environment of the operation and maintenance equipment. In this way the fault early warning index is acquired more accurately, the operation and maintenance management efficiency is further improved, and the problem of low operation and maintenance management efficiency in the prior art is effectively solved.
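The statistical alternative mentioned in this embodiment (average value, standard deviation and a standardized score) can be sketched as follows. It is not the formula of the application, which combines the three coefficients with the environment influence factor; the function name, the use of the population standard deviation and the sample values are assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def standardized_score(history, current):
    """Return how many standard deviations `current` lies from the mean of
    `history` (a z-score), using the population standard deviation.

    If the history has no spread, any different value is treated as an
    infinite deviation and an identical value as no deviation.
    """
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0 if current == mu else math.inf
    return (current - mu) / sigma


# Usage with made-up memory disk usage percentages from earlier periods.
history = [10, 12, 14]
score_normal = standardized_score(history, 12)
score_spike = standardized_score(history, 20)
```

A reading whose absolute score exceeds a chosen cutoff would then be treated as a fault risk; the cutoff itself is a tuning decision that the source leaves to the preset fault threshold.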
Further, the specific flow for constructing the fault prediction model is as follows: collecting real-time running state data of the operation and maintenance equipment in a preset environment in real time and carrying out normalization processing, wherein the real-time running state data is used for reflecting the current working state and environmental condition of the operation and maintenance equipment, and the normalization processing is used for removing noise data and redundant information data in the real-time running state data; extracting fault prediction feature data from the normalized real-time running state data, and simultaneously adding a corresponding fault label for the fault prediction feature data by combining the first operation and maintenance data after fault detection, wherein the fault prediction feature data is used for reflecting the difference of operation and maintenance equipment in a preset fault mode; and constructing a fault prediction framework based on a fault detection algorithm and combining fault prediction characteristic data with fault labels, and simultaneously inputting the fault prediction characteristic data into the constructed fault prediction framework to construct a fault prediction model.
In this embodiment, normalization converts data of different scales and units to a common scale so that they can be compared and analyzed together, which helps improve the quality of the data; a suitable normalization method is selected according to the characteristics and requirements of the real-time running state data, and min-max normalization (mapping the data into the interval [0, 1]) is generally used; fault detection is performed on the normalized real-time running state data by using a fault detection algorithm (such as statistical detection or machine learning detection), and when a fault is detected, its occurrence time, type and position are recorded; fault prediction characteristic data are acquired based on historical fault data and a thorough understanding of the operation and maintenance equipment, and a corresponding fault label is added to the extracted fault prediction characteristic data in combination with the first operation and maintenance data after fault detection; the specific steps for constructing the fault prediction framework are as follows: a fault prediction architecture is designed, which generally comprises a data preprocessing layer, a feature extraction layer and a prediction output layer; at the data preprocessing layer, the input fault prediction characteristic data are cleaned, converted and scaled as necessary; at the feature extraction layer, the data are further processed according to the selected features to extract feature vectors for training the model; at the prediction output layer, faults are predicted by a machine learning algorithm, common choices being logistic regression, decision trees and random forests; through this construction flow, a fault prediction framework based on a fault detection algorithm and on fault prediction characteristic data with fault labels is obtained, so that the fault prediction model predicts more accurately.
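As an illustration only, the min-max normalization and fault labeling described above may be sketched as follows. The function names, the sample readings and the fixed threshold detector are assumptions made for this sketch and are not part of the embodiment; any statistical or machine learning detector could take the place of the threshold rule.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Map values linearly onto [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def label_samples(normalized: Sequence[float], threshold: float) -> List[int]:
    """Hypothetical detector: label 1 (fault) when a value exceeds the threshold."""
    return [1 if v > threshold else 0 for v in normalized]


# Illustrative readings only, not data from the embodiment.
cpu_temperature = [40.0, 55.0, 70.0, 100.0]
normalized = min_max_normalize(cpu_temperature)
labels = label_samples(normalized, threshold=0.9)
```

With these readings only the last sample is labeled as a fault; the threshold of 0.9 is an arbitrary choice for the sketch.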
Further, as shown in fig. 3, which is a feature matrix training flowchart of the fault prediction model provided by an embodiment of the present application, the specific steps for performing feature matrix training on the fault prediction model are as follows: taking the first operation and maintenance data with the fault labels as the rows and columns of the feature matrix to determine the position of the fault to be predicted, and setting initial parameters of feature matrix training according to the model complexity coefficient of the fault prediction model in a preset time period, wherein the model complexity coefficient is used for measuring the complexity of the fault prediction model, and the initial parameters are used for visualizing the number of iterations of feature matrix training; inputting real-time running state data of the operation and maintenance equipment into the fault prediction model according to the initial parameters of feature matrix training, and adjusting the parameters of the fault prediction model in real time during training until the preset training indexes of the model are met; and mapping the parameters adjusted in real time and the position of the fault to be predicted into the corresponding fault labels so as to update the feature matrix periodically.
In this embodiment, it should be noted that, when the fault prediction model is trained, the first operation and maintenance data with the fault labels are usually not used directly as the rows and columns of the feature matrix to determine the position of the fault to be predicted; in practice, each row of the feature matrix usually represents a sample (such as a different time point, a different device or a different operating condition), and each column represents a feature (that is, an attribute or variable describing the sample); when new sample data (that is, a new feature vector) are input into the trained fault prediction model, the model outputs a prediction result, which may be a probability value (indicating the probability that a fault occurs) or a specific fault type or level; for example, if there is one sample for each device or each time point and the fault prediction model outputs a fault probability for each sample, the device or time point most likely to fail can be determined by comparing the probability values; the specific steps of updating the feature matrix are as follows: new samples are determined, where each new time point, equipment state or parameter adjustment can be regarded as a new sample; a feature vector containing the relevant features extracted from the integrated data is then created for each new sample; finally, the new feature vectors are added to the feature matrix as new rows, and the corresponding fault labels are updated or added as a column or as a separate label set, thereby improving the efficiency and accuracy of feature matrix training.
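The row and column layout and the update step described above may be sketched as follows. This is a minimal sketch under stated assumptions: the helper names, the two example features and the device names are hypothetical, and the fault labels are kept as a separate label set, which is one of the two options mentioned above.

```python
from typing import Dict, List, Sequence, Tuple


def append_samples(
    matrix: List[List[float]],
    labels: List[int],
    new_rows: Sequence[Sequence[float]],
    new_labels: Sequence[int],
) -> Tuple[List[List[float]], List[int]]:
    """Return a new matrix and label list with the new samples appended as rows."""
    if len(new_rows) != len(new_labels):
        raise ValueError("each new sample needs exactly one fault label")
    width = len(matrix[0]) if matrix else None
    for row in new_rows:
        if width is None:
            width = len(row)
        if len(row) != width:
            raise ValueError("every row must have the same number of features")
    return matrix + [list(row) for row in new_rows], labels + list(new_labels)


def most_likely_faulty(probabilities: Dict[str, float]) -> str:
    """Pick the device (or time point) with the highest predicted fault probability."""
    return max(probabilities, key=probabilities.get)


# Rows are samples; columns are two assumed features (load, temperature).
matrix = [[0.61, 0.20], [0.95, 0.80]]
labels = [0, 1]
matrix, labels = append_samples(matrix, labels, [[0.40, 0.10]], [0])

# Hypothetical per-device outputs of a trained model.
fault_probability = {"gateway-a": 0.12, "camera-b": 0.81, "sensor-c": 0.33}
suspect = most_likely_faulty(fault_probability)
```

Keeping the labels in a separate list rather than as an extra column avoids feeding the label to the model as a feature by mistake.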
Further, the model complexity coefficient is obtained by the following method: collecting model parameters of the operation and maintenance equipment, and comparing the variation of the model parameters within a preset time period to obtain the model parameter update amplitude; obtaining the feature matrix training cycle number according to the convergence speed of the fault prediction model during feature matrix training, wherein the model parameter update amplitude is used for measuring the sensitivity of the fault prediction model to the first operation and maintenance data with the fault labels, and the feature matrix training cycle number is used for measuring the degree to which the fault prediction model fits the first operation and maintenance data with the fault labels; and obtaining the model complexity coefficient of the fault prediction model by combining the average feature matrix training cycle number in the preset time period with the corresponding reference values.
In this embodiment, the variation of a parameter can be calculated by comparing its values at two time points (such as between two training iterations or between two actual operation and maintenance periods); this variation reflects the ability of the model to adapt to new data or a new environment, where the model parameter denotes the parameter value of the model during feature matrix training, and the magnitude of the variation is regarded as the model parameter update amplitude; a larger update amplitude may mean that the model is actively adapting to changes in the data, but it may also lead to instability or overfitting; in practical application, the feature matrix training cycle number is determined from the convergence speed of the fault prediction model, where the convergence speed refers to the time or number of iterations that the fault prediction model needs to reach stable performance (for example, the loss function value no longer decreases significantly) during feature matrix training, and is obtained by observing how the performance of the fault prediction model changes during feature matrix training; generally, a more complex model may need more iterations to converge; in addition, cross-validation and regularization may be used in practical application to control the complexity of the model.
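The two quantities described above may be computed, for example, as sketched below. The embodiment does not fix how the variation is measured or when the loss counts as stable, so the Euclidean norm, the tolerance value and the function names are assumptions of this sketch.

```python
import math
from typing import Sequence


def update_amplitude(before: Sequence[float], after: Sequence[float]) -> float:
    """Euclidean norm of the parameter change between two time points (assumed measure)."""
    if len(before) != len(after):
        raise ValueError("parameter vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(after, before)))


def cycles_to_converge(losses: Sequence[float], tolerance: float = 1e-3) -> int:
    """Number of training cycles completed before the loss stops decreasing significantly.

    losses[i] is the loss after cycle i + 1. If the loss never flattens,
    the total number of recorded cycles is returned.
    """
    for i in range(1, len(losses)):
        if losses[i - 1] - losses[i] < tolerance:
            return i
    return len(losses)


# Illustrative values only.
amplitude = update_amplitude([0.0, 0.0], [3.0, 4.0])
cycles = cycles_to_converge([1.0, 0.5, 0.3, 0.2995, 0.2994])
```

In this example the loss improves by less than the tolerance between the third and fourth cycles, so the model is treated as having converged after three cycles.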
Specifically, the complexity degree of the fault prediction model is evaluated by analyzing model complexity measurement indexes of the fault prediction model in the feature matrix training process and combining the existing interpretability of the model, so that model complexity coefficients of the fault prediction model in a preset time period are obtained; in addition to obtaining the model complexity coefficients by the above-described interpretability evaluation method, the calculation can be performed by the following formula:
[Formula for the model complexity coefficient; the expression is not recoverable from the source text.]
wherein m is the index of the preset time period and M is the total number of preset time periods; e is the natural constant; and the remaining symbols of the formula denote, respectively: the model complexity coefficient of the fault prediction model in the m-th preset time period; the model parameter update amplitude of the fault prediction model in the m-th preset time period; the model parameter reference update amplitude; the model parameter update amplitude reference deviation; the feature matrix training cycle number of the fault prediction model in the m-th preset time period; the average feature matrix training cycle number; and the feature matrix training cycle number reference deviation.
Specifically, the model parameter reference update amplitude and the model parameter update amplitude reference deviation are generally obtained by carrying out fitting and averaging on historical operation data and model parameters of the operation and maintenance equipment, wherein the fitting represents that the historical operation data of the operation and maintenance equipment is input into an existing model and integrated with the model parameters in the model, and the feature matrix training cycle number reference deviation is generally obtained by carrying out weighting and averaging on the training cycle number of the existing fault prediction model; the statistical table of the variation of the model complexity coefficients is shown in table 1:
Table 1 statistical table of variation of model complexity coefficients
It should be understood that, as can be seen from table 1, the closer the model complexity coefficient is to 1.5, the more complex the fault prediction model is; when … = 0 and … = 0, the fault prediction model is simpler, that is, it predicts the fault trend more easily; when … and …, the model complexity coefficient increases as the model parameter update amplitude increases and decreases as the feature matrix training cycle number increases; the specific model complexity coefficient is calculated and analyzed in combination with the actual requirements of the operation and maintenance equipment, so that the model complexity coefficient is acquired more accurately, the operation and maintenance management efficiency is further improved, and the problem of low operation and maintenance management efficiency in the prior art is effectively solved.
Further, the specific steps of adjusting the resource allocation and task scheduling of the operation and maintenance equipment are as follows: collecting task state information of the operation and maintenance equipment during normal operation in real time, and predicting the fault trend of the operation and maintenance equipment by using the trained fault prediction model, wherein the task state information comprises the operation load condition, the execution time and the remaining workload; identifying overload resources according to the operation load condition, and ranking the overload resources by priority in combination with the resource adjustment dimension, wherein an overload resource is a resource whose utilization exceeds the maximum capacity of the actual operation and maintenance equipment; and determining the resource adjustment and task scheduling decision according to the identified overload resources, and dynamically delaying low-priority tasks in combination with the actual running requirements of the operation and maintenance equipment, wherein the low-priority task is used for reflecting the response speed of resource adjustment and task scheduling to the operation and maintenance equipment under instantaneous load, and the dynamic delay is used for preventing instantaneous load from impacting the operation and maintenance equipment.
In this embodiment, if the operation and maintenance equipment provides an API interface, the task state can be obtained directly through API calls, or a dedicated monitoring tool (such as Prometheus) can be used; the task state information collected in real time is then input into the fault prediction model, which outputs the probability or risk level of a fault occurring within a future period of time; in practical application, an alarm threshold is set according to the output of the fault prediction model, and the resources in an overload state are determined according to the set threshold, for example, a CPU usage rate exceeding 90% may be regarded as overload; the overload resources are then prioritized according to the importance and the degree of overload of the resources, for example, CPU overload of a critical service system has a higher priority than disk overload of a non-critical system; the dynamic delay of low-priority tasks can be implemented by a task scheduling system or an automation script according to the actual running requirements of the operation and maintenance equipment, and such a system or script can automatically place low-priority tasks into a waiting queue or pause their execution according to preset rules or algorithms; when a low-priority task is delayed, its influence on the whole system must be considered so that no other problems or chain reactions are caused; if the resource condition improves, the delayed low-priority tasks can be released gradually to resume execution, thereby improving the normal running efficiency of the operation and maintenance equipment.
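The overload identification, priority ranking and dynamic delay described above may be sketched as follows. The 90% threshold comes from the example in this embodiment; the data classes, the ranking key (critical systems first, then degree of overload) and the priority cut-off are assumptions of the sketch rather than the scheduling rule of the embodiment.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

OVERLOAD_THRESHOLD = 0.90  # the 90% usage example given in this embodiment


@dataclass
class Resource:
    name: str
    utilization: float  # fraction of total capacity, 0.0 to 1.0
    critical: bool      # True if it belongs to a critical service system


@dataclass
class Task:
    name: str
    priority: int  # larger value means more important (assumed convention)


def rank_overloaded(resources: Sequence[Resource]) -> List[Resource]:
    """Keep overloaded resources; critical ones first, then by degree of overload."""
    overloaded = [r for r in resources if r.utilization > OVERLOAD_THRESHOLD]
    return sorted(overloaded, key=lambda r: (r.critical, r.utilization), reverse=True)


def split_tasks(
    tasks: Sequence[Task], overloaded: Sequence[Resource], min_priority: int
) -> Tuple[List[Task], List[Task]]:
    """Return (tasks to run now, tasks placed in the waiting queue)."""
    if not overloaded:
        return list(tasks), []
    run_now = [t for t in tasks if t.priority >= min_priority]
    delayed = [t for t in tasks if t.priority < min_priority]
    return run_now, delayed


# Hypothetical resources and tasks.
resources = [
    Resource("billing-cpu", 0.93, critical=True),
    Resource("archive-disk", 0.97, critical=False),
    Resource("cache-memory", 0.50, critical=False),
]
tasks = [Task("alert-dispatch", 9), Task("log-compaction", 2)]

ranked = rank_overloaded(resources)
run_now, delayed = split_tasks(tasks, ranked, min_priority=5)
```

Calling `split_tasks` again with an empty overload list returns every task for execution, which corresponds to gradually releasing the delayed tasks once the resource condition improves.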
Further, the resource adjustment dimension of the operation and maintenance equipment may be obtained by inputting the resource usage data and operation load data of the operation and maintenance equipment in a preset time period into a machine learning algorithm model for fitting, and it may also be obtained and calculated by the following method and formula; the specific acquisition steps of the resource adjustment dimension are as follows: setting a load balancing target range based on the performance indexes and task demands of the operation and maintenance equipment, and acquiring the load balancing rate in combination with the operation load condition of the operation and maintenance equipment, wherein the load balancing rate is used for measuring the degree to which the operation load condition deviates from the load balancing target range; acquiring the resource utilization rate of the operation and maintenance equipment in real time, wherein the resource utilization rate comprises the disk space utilization rate and the network bandwidth occupation rate, and represents the ratio of the actual resource usage to the total capacity of the operation and maintenance equipment; and combining the average resource utilization rate of the operation and maintenance equipment in the preset time period and the load reference balance rate to obtain the resource adjustment dimension, wherein the resource adjustment dimension is calculated by the following formula:
[Formula for the resource adjustment dimension; the expression is not recoverable from the source text.]
wherein m is the index of the preset time period and M is the total number of preset time periods; g is the index of the operation and maintenance equipment and G is the total number of operation and maintenance equipment; e is the natural constant; and the remaining symbols of the formula denote, respectively: the resource adjustment dimension of the g-th operation and maintenance equipment in the m-th preset time period; the load balancing rate of the g-th operation and maintenance equipment in the m-th preset time period; the load reference balance rate; the load balancing rate reference deviation; the resource utilization rate of the g-th operation and maintenance equipment in the m-th preset time period; the average resource utilization rate; and the resource utilization rate reference deviation.
In this embodiment, a load balancing target range needs to be set before the load balancing rate is obtained: first, the performance indexes of the operation and maintenance equipment, such as processing capability and response time, are analyzed, and the resource demands of different tasks, such as CPU, memory and network bandwidth, are determined; a load balancing target range is then set according to the performance indexes and task demands, which may be a specific numerical interval or a dynamic range based on the actual load; the load balancing rate is then obtained by monitoring the operation load of the operation and maintenance equipment in real time and comparing it with the load balancing target range; similarly, in practical application, the actual resource usage of the operation and maintenance equipment is obtained from the collected resource usage data, and the resource utilization rate is obtained by dividing the actual usage by the total capacity of the operation and maintenance equipment, such as the total disk capacity or the total network bandwidth; the average resource utilization rate is obtained from the resource utilization rates, and the resource utilization rate reference deviation is obtained by weighting the actual resource utilization rates and taking the standard deviation; the load reference balance rate and the load balancing rate reference deviation are obtained by weighted averaging of the historical operation load data of the operation and maintenance equipment.
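The quantities described above may be computed, for example, as sketched below. This sketch assumes equal weights for all samples (the weighting scheme is not specified in this embodiment), uses the population standard deviation as the reference deviation, and measures the deviation from the target range as the distance to the nearest bound; all names and numbers are illustrative.

```python
from statistics import mean, pstdev


def utilization(used: float, capacity: float) -> float:
    """Resource utilization rate: actual usage divided by total capacity."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return used / capacity


def deviation_from_range(load: float, low: float, high: float) -> float:
    """0.0 inside the target range, otherwise the distance to the nearest bound (assumed measure)."""
    if load < low:
        return low - load
    if load > high:
        return load - high
    return 0.0


# Illustrative disk usage samples (GB) against an assumed 1000 GB total capacity.
disk_used_gb = [200.0, 400.0, 600.0]
rates = [utilization(u, 1000.0) for u in disk_used_gb]

average_rate = mean(rates)           # average resource utilization rate
reference_deviation = pstdev(rates)  # unweighted standard deviation (assumption)

# Assumed load balancing target range of 30% to 70% of capacity.
load_deviation = deviation_from_range(0.85, low=0.30, high=0.70)
```

A weighted variant would replace `mean` and `pstdev` with weighted equivalents once the weights for the historical samples are defined.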
In particular, in practical application, … and …, wherein … denotes the load balancing rate coefficient of the g-th operation and maintenance equipment in the m-th preset time period, and … denotes the resource utilization coefficient; fig. 4 is a three-dimensional analysis schematic diagram of the resource adjustment dimension provided by the embodiment of the present application; as shown in fig. 4, the resource utilization coefficient has a significant effect on the resource adjustment dimension: when the resource utilization coefficient varies gradually between 0 and 1 (where the load balancing rate coefficient has a smaller effect on the resource adjustment dimension), the corresponding resource adjustment dimension is more accurate and the variation of the image is more obvious; when the load balancing rate coefficient and the resource utilization coefficient increase or decrease simultaneously, the obtained resource adjustment dimension is less accurate; it should be noted that if the resource utilization coefficient is too high and the load balancing rate coefficient is low, the resource capacity needs to be increased or the task allocation needs to be optimized; when … = 0 and … = 0, the resource adjustment dimension of the operation and maintenance equipment is equal to 2, in which case no resource scheduling or task reassignment is needed and the operation and maintenance equipment runs at its highest efficiency; the operation stability of the operation and maintenance equipment is thereby improved, the operation and maintenance management efficiency is further improved, and the problem of low operation and maintenance management efficiency in the prior art is effectively solved.
As shown in fig. 5, which is a flowchart of an operation and maintenance management method for a smart city according to an embodiment of the present application, the method includes the following steps: S1, collecting operation and maintenance data of the smart city in a preset time period in real time, preprocessing the data to obtain first operation and maintenance data, and storing the first operation and maintenance data into an operation and maintenance database, wherein the operation and maintenance data are used for reflecting the real-time operation state of the operation and maintenance equipment in the preset time period, the preprocessing comprises analog-to-digital conversion and data encryption, the operation and maintenance database represents a data storage center containing a query mechanism, data backup and recovery testing, and the first operation and maintenance data comprise operation and maintenance equipment performance data and network transmission data; S2, performing fault detection on the first operation and maintenance data according to a preset fault early warning rule, and feeding back the fault detection result to operation and maintenance personnel in chart form, wherein the fault detection is used for judging whether a fault alarm needs to be triggered according to a fault early warning index, and the fault early warning index is used for measuring the degree of matching between the first operation and maintenance data and the preset fault early warning rule; S3, generating a fault label according to the first operation and maintenance data after fault detection, and constructing a fault prediction model in combination with the real-time operation state of the operation and maintenance equipment in a preset environment and performing feature matrix training, wherein the fault label is used for marking the type and position of the fault during feature matrix training of the fault prediction model, the fault prediction model is used for visually monitoring the real-time operation state of the operation and maintenance equipment in a preset time period and identifying abnormal operation states, and the feature matrix training is used for predicting the mapping relation between the abnormal operation state and the first operation and maintenance data and presenting the mapping relation in coordinate form; and S4, adjusting the resource allocation and task scheduling of the operation and maintenance equipment according to the real-time load condition of the operation and maintenance equipment during operation, in combination with the fault prediction model and the resource adjustment dimension, wherein the resource adjustment dimension is used for measuring the priority of the resource allocation and task scheduling of the operation and maintenance equipment.
In this embodiment, the specific steps of feature matrix training are as follows: the feature matrix is trained by using historical operation and maintenance data of the operation and maintenance equipment and known abnormal state labels, and the mapping relation between the feature matrix and the abnormal operation state is learned by a deep learning algorithm; the preprocessed first operation and maintenance data are input into the trained fault prediction model, which calculates the probability or score of an abnormal state from the input data; if the probability or score exceeds a preset threshold, the equipment is considered to be in an abnormal operation state; the prediction result is displayed to operation and maintenance personnel in chart form so that they can quickly learn the real-time operation state of the equipment, where the chart may include a real-time data curve, abnormal state marks and early warning grades, and the coordinate form may be used to show the correlation between different features and the position of the abnormal state in the feature space; by training on the mapping relation between the feature matrix and the abnormal state, the fault prediction model learns these features and patterns, so that abnormal operation states are identified accurately.
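The threshold decision and the early warning grade mentioned above may be sketched as follows. The embodiment does not define the grades, so the two bands used here (split at the midpoint between the threshold and 1), the function names and the device scores are assumptions made purely for illustration.

```python
from typing import Dict, List


def warning_grade(probability: float, threshold: float) -> str:
    """Return "normal", "low" or "high"; the banding is a hypothetical choice."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability <= threshold:
        return "normal"
    midpoint = threshold + (1.0 - threshold) / 2.0
    return "high" if probability >= midpoint else "low"


def abnormal_devices(scores: Dict[str, float], threshold: float) -> List[str]:
    """Names of devices whose predicted probability exceeds the threshold, sorted by name."""
    return sorted(name for name, p in scores.items() if p > threshold)


# Hypothetical outputs of a trained fault prediction model.
scores = {"pump-1": 0.2, "pump-2": 0.7, "pump-3": 0.9}
flagged = abnormal_devices(scores, threshold=0.6)
grades = {name: warning_grade(p, threshold=0.6) for name, p in scores.items()}
```

The resulting grades could then be drawn as abnormal state marks on the real-time data curve presented to operation and maintenance personnel.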
The technical scheme provided by the embodiment of the application has at least the following technical effects or advantages: compared with the prior disclosure "Smart city optimization management method and system based on multi-source data fusion" (bulletin number: …), in the embodiment of the application, real-time running state data of the operation and maintenance equipment in a preset environment are collected in real time and normalized; fault prediction characteristic data are extracted from the normalized real-time running state data, and corresponding fault labels are added to the fault prediction characteristic data in combination with the first operation and maintenance data after fault detection; a fault prediction framework is then constructed based on a fault detection algorithm in combination with the fault prediction characteristic data with fault labels; finally, the fault prediction characteristic data are input into the constructed fault prediction framework to construct a fault prediction model, so that the fault prediction framework is constructed accurately and the accuracy and reliability of the fault prediction model are improved; compared with the prior disclosure "Log data storage method based on the intelligent security management platform" (bulletin number: …), in the embodiment of the application, task state information of the operation and maintenance equipment during normal operation is collected in real time, and the trained fault prediction model is used to predict the fault trend of the operation and maintenance equipment; overload resources are identified according to the operation load condition and ranked by priority in combination with the resource adjustment dimension; finally, the resource adjustment and task scheduling decision is determined according to the identified overload resources, and low-priority tasks are dynamically delayed in combination with the actual running requirements of the operation and maintenance equipment, so that the efficiency and accuracy of operation and maintenance resource allocation are improved and the fault prediction model is trained more accurately.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatuses (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.