Forestry pest control risk assessment system based on data analysis
Technical Field
The invention belongs to the technical field of forestry pest control, and particularly relates to a forestry pest control risk assessment system based on data analysis.
Background
Forestry is an important industry for protecting the ecological environment and promoting economic development. Many harmful organisms occur in forests; they can propagate, spread, and damage both forest ecology and economic development, so controlling them is very important. At present, traditional forestry pest control methods usually rely on experience and manual observation; they are inefficient, costly, and imprecise, and struggle to meet the control requirements of modern forestry.
However, even current pest control risk assessment systems generally require a great deal of time, manpower, and material resources to collect the pest number distribution, which greatly limits the timeliness and accuracy of control. Developing a method that reduces the time needed to collect the pest number distribution is therefore significant for improving the efficiency and effectiveness of forestry pest control.
To this end, the invention provides a forestry pest control risk assessment system based on data analysis.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, the invention provides a forestry pest control risk assessment system based on data analysis, which reduces the time needed to collect the pest quantity distribution and thereby improves how promptly forestry personnel can respond to forestry risks.
To achieve the above objective, an embodiment according to a first aspect of the present invention provides a forestry pest control risk assessment system based on data analysis, including a forestry data collection module, a forestry clustering module, a historical data collection module, a model training module, a to-be-evaluated data collection module, and a risk assessment module; the modules are connected to one another electrically and/or over a wireless network;
the forestry data collection module is mainly used for collecting forestry basic data of each forest in advance;
the forestry basic data comprise position data, climate data, vegetation data and soil data of each piece of forest, and the position data, the climate data, the vegetation data and the soil data all comprise a plurality of basic attributes;
The basic attributes included in the position data are longitude, latitude and altitude of each forest;
The climate data comprise the basic attributes of annual average temperature and average rainfall of each forest;
The vegetation data comprises basic attributes of vegetation coverage rate of each forest, vegetation types and distribution proportion of various types of vegetation;
The soil data comprises basic attributes of the soil type, the soil texture, the soil pH value and the soil nutrition level of each forest;
the forestry data collection module sends the collected forestry basic data to the forestry clustering module;
the forestry clustering module is mainly used for classifying forestry basic data;
the forestry clustering module classifies the forestry basic data in the following modes:
combining all basic attributes of forestry basic data of each forest into a form of feature vectors, taking the feature vectors of all forests as input of a clustering algorithm, and training the clustering algorithm to obtain K clusters and a clustering center of each cluster; wherein K is the number of clusters; each cluster represents a forestry category;
the forestry clustering module sends the clustering center and the forestry category corresponding to each cluster to the historical data collection module and the risk assessment module;
The historical data collection module is mainly used for collecting historical pest data of each forestry category in advance;
the mode of the historical data collection module for collecting the historical pest data of each forestry category is as follows:
collecting a forest set of each forestry category;
The forest set of each forestry category is collected as follows: all basic attributes of the forestry basic data of each forest are combined into a feature vector, and the distance between the feature vector and each cluster center is calculated to find the cluster center closest to the feature vector; the forestry category to which each forest belongs is the forestry category corresponding to the nearest cluster center; each forestry category corresponds to one forest set, and every forest in that forest set belongs to that forestry category;
The historical pest data of each forestry category comprises, for each forest in the forest set corresponding to that category, the following items recorded historically at every preset data collection time period: the types of pests, the number distribution of each pest, the climate data, the types of disease infecting the vegetation, and the severity of each disease;
The historical data collection module sends historical pest data of each forest to the model training module;
The model training module is mainly used for training, for each forestry category and each disease, a Bayesian network that predicts the quantity distribution of each pest based on the climate data and the severity of that disease;
The model training module trains the Bayesian network for predicting the quantity distribution of each pest in the following way:
Marking the number of the forestry category as i, and marking the number of the disease as j; the pest number is marked as k;
for the ith forestry category and the jth disease:
establishing a first-level network structure and a second-level network structure;
The first influence nodes of the first-level network structure are the climate data and the severity of the j-th disease, and the first prediction nodes of the first-level network structure are the quantity distributions of the pests;
The second-level network structure captures the mutual influence between the numbers of the pests: in the second-level network structure, the number distribution of each pest is a second influence node, and the mutual influence between second influence nodes is represented by directed edges; these edges are expressed as an influence probability matrix in which each row and each column corresponds to one pest, and each element represents the influence probability of one pest on another pest;
when the second-level network structure is constructed, harmful organisms with the quantity distribution lower than a preset quantity distribution threshold value can be screened out, so that the complexity of the second-level network structure is reduced;
The model training module reads the quantity distribution of each pest, climate data and severity of the j-th disease collected every data collection time period from a forest set corresponding to the i-th forestry category from the historical pest data as training data; inputting training data into the constructed Bayesian network, and training the Bayesian network; marking the Bayesian network of the jth disease of the ith forestry category after training as Bij;
the model training module sends the trained Bayesian network Bij to the risk assessment module;
the to-be-evaluated data collection module is mainly used for collecting to-be-evaluated data in a forest to be evaluated;
The data to be evaluated comprises forestry basic data of a forest to be evaluated, disease types of vegetation infection in the forest and severity of each disease;
The to-be-evaluated data collection module sends to-be-evaluated data in the forest to be evaluated to the risk evaluation module;
wherein the risk assessment module is mainly used for assessing the quantity distribution among various pests based on climate data and the severity of each disease;
The risk assessment module assesses the number distribution among the various pests in the following manner:
All basic attributes of the forestry basic data of the forest to be evaluated are combined into a feature vector, the distance between the feature vector and each cluster center is calculated to find the cluster center closest to the feature vector, and the corresponding forestry category i is obtained from that closest cluster center;
based on the forestry category i and each disease j infecting the vegetation in the forest to be evaluated, the corresponding Bayesian network Bij is obtained; the climate data of the forest to be evaluated and the severity of the j-th disease are used as input to the Bayesian network Bij to obtain an influence probability matrix of the pests; the number distribution of one of the pests is then collected, and the number distributions of the other pests are obtained from it based on the influence probability matrix.
Compared with the prior art, the invention has the beneficial effects that:
By collecting the forestry basic data and historical pest data of each forest in advance, the invention divides the forests into forestry categories based on the forestry basic data. For each forestry category and each disease, it then uses the historical pest data of that category to train a Bayesian network that predicts the quantity distribution of each pest from the climate data and the severity of the disease. When the forestry basic data of a forest to be evaluated, the types of disease infecting its vegetation, and the severity of each disease are collected, the trained Bayesian network is used to evaluate the influence probability matrix of the pests in that forest. Consequently, only the quantity distribution of one pest needs to be collected in the forest to obtain the quantity distributions of the other pests, which reduces the collection time for pest quantity distributions and improves how promptly forestry personnel can respond to forestry risks.
Drawings
Fig. 1 is a block diagram of a forestry pest control risk assessment system of embodiment 1 of the present invention.
Detailed Description
The technical solutions of the present invention will be clearly and completely described in connection with the embodiments, and it is obvious that the described embodiments are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
As shown in Fig. 1, a forestry pest control risk assessment system based on data analysis comprises a forestry data collection module, a forestry clustering module, a historical data collection module, a model training module, a to-be-evaluated data collection module, and a risk assessment module; the modules are connected to one another electrically and/or over a wireless network;
The forestry data collection module is mainly used for collecting forestry basic data of each forest in advance;
In a preferred embodiment, the forestry basic data comprises position data, climate data, vegetation data and soil data of each piece of forest, and the position data, the climate data, the vegetation data and the soil data all comprise a plurality of basic attributes;
the basic attributes included in the position data are longitude, latitude and altitude of each forest;
the climate data comprise the basic attributes of annual average temperature and average rainfall of each forest;
The vegetation data comprises basic attributes including vegetation coverage rate of each forest, vegetation types and distribution proportion of various types of vegetation; it should be noted that the vegetation types may be conifer, broadleaf, shrub, etc.;
The soil data comprises basic attributes of the soil type, the soil texture, the soil pH value and the soil nutrition grade of each forest; the soil types can be gray soil, red soil, yellow soil and the like; the soil nutrition level can be specifically evaluated according to the content of organic matters, nitrogen, phosphorus, potassium and other nutrient elements contained in the soil;
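As a non-limiting illustration, the basic attributes listed above could be flattened into a numeric feature vector as sketched below. The field names, the vocabularies `VEGETATION_TYPES` and `SOIL_TYPES`, the one-hot encoding of the soil type, and the min-max rescaling are all assumptions made for this sketch and are not specified by the embodiment; soil texture is omitted for brevity.

```python
from dataclasses import dataclass

# Hypothetical vocabularies; the embodiment only names these as examples.
VEGETATION_TYPES = ["conifer", "broadleaf", "shrub"]
SOIL_TYPES = ["gray", "red", "yellow"]


@dataclass
class ForestRecord:
    longitude: float
    latitude: float
    altitude: float
    mean_temperature: float
    mean_rainfall: float
    vegetation_coverage: float
    vegetation_share: dict  # vegetation type -> distribution proportion
    soil_type: str
    soil_ph: float
    soil_nutrient_grade: int


def to_feature_vector(rec):
    """Flatten one forest's basic attributes into a list of floats."""
    vec = [rec.longitude, rec.latitude, rec.altitude,
           rec.mean_temperature, rec.mean_rainfall, rec.vegetation_coverage]
    # Distribution proportion of each vegetation type (0.0 if absent).
    vec += [rec.vegetation_share.get(v, 0.0) for v in VEGETATION_TYPES]
    # One-hot encoding of the categorical soil type (an assumed encoding).
    vec += [1.0 if rec.soil_type == s else 0.0 for s in SOIL_TYPES]
    vec += [rec.soil_ph, float(rec.soil_nutrient_grade)]
    return vec


def min_max_scale(vectors):
    """Rescale each column to [0, 1] so no attribute dominates distances."""
    cols = list(zip(*vectors))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [[(x - low) / (high - low) if high > low else 0.0
             for x, low, high in zip(v, lows, highs)] for v in vectors]
```

Rescaling is included because attributes such as altitude and soil pH have very different ranges, and a distance-based clustering step would otherwise be dominated by the largest one.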
The forestry data collection module sends the collected forestry basic data to the forestry clustering module;
The forestry clustering module is mainly used for classifying forestry basic data;
In a preferred embodiment, the forestry clustering module classifies the forestry base data in the following manner:
Combining all basic attributes of forestry basic data of each forest into a form of feature vectors, taking the feature vectors of all forests as input of a clustering algorithm, and training the clustering algorithm to obtain K clusters and a clustering center of each cluster; wherein K is the number of clusters; each cluster represents a forestry category; preferably, the clustering algorithm can be K-Means or DBSCAN algorithm;
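By way of example only, the clustering step could be realized with a plain K-Means procedure such as the sketch below. The function names, the fixed iteration limit, and the random initialization from the input vectors are assumptions of this sketch; the embodiment does not prescribe them, and DBSCAN or a library implementation could be used instead.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, iterations=50, seed=0):
    """Return (centers, labels) for K clusters of the given feature vectors."""
    rng = random.Random(seed)
    # Initialize the centers with k distinct input vectors.
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each forest goes to its nearest center.
        labels = [min(range(k), key=lambda c: squared_distance(v, centers[c]))
                  for v in vectors]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers, labels
```

Each returned center then stands for one forestry category, and the label of a forest is the index of its category.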
The forestry clustering module sends the clustering center and the forestry category corresponding to each cluster to the historical data collection module and the risk assessment module;
the historical data collection module is mainly used for collecting historical pest data of each forestry category in advance;
In a preferred embodiment, the historical data collection module collects historical pest data for each forestry category in the manner of:
collecting a forest set of each forestry category;
The forest set of each forestry category is collected as follows: all basic attributes of the forestry basic data of each forest are combined into a feature vector, and the distance between the feature vector and each cluster center is calculated to find the cluster center closest to the feature vector; the forestry category to which each forest belongs is the forestry category corresponding to the nearest cluster center; each forestry category corresponds to one forest set, and every forest in that forest set belongs to that forestry category;
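The nearest-center assignment described above can be sketched as follows. The use of Euclidean distance, the forest identifiers, and the function names are assumptions for illustration; the embodiment only requires some distance between the feature vector and each cluster center.

```python
import math
from collections import defaultdict


def nearest_center(vector, centers):
    """Index of the cluster center closest to the feature vector."""
    return min(range(len(centers)), key=lambda i: math.dist(vector, centers[i]))


def build_forest_sets(forest_vectors, centers):
    """Group forest identifiers by forestry category (cluster index).

    forest_vectors maps a forest identifier to its feature vector.
    """
    forest_sets = defaultdict(list)
    for forest_id, vector in forest_vectors.items():
        forest_sets[nearest_center(vector, centers)].append(forest_id)
    return dict(forest_sets)
```

For instance, with centers `[[0, 0], [10, 10]]`, a forest whose vector is `[1, 1]` falls into category 0 and one whose vector is `[9, 9]` falls into category 1.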
In a preferred embodiment, the historical pest data of each forestry category comprises, for each forest in the forest set corresponding to that category, the following items recorded historically at every preset data collection time period: the types of pests, the number distribution of each pest, the climate data, the types of disease infecting the vegetation, and the severity of each disease;
Here, the number distribution of the pests is the number of each pest per unit area in the forest; the pest species are selected based on common forestry knowledge; the vegetation disease infection types are the plant diseases diagnosed by forestry researchers on the vegetation in the forest during each data collection time period; the severity of each disease is a rating assigned by forestry personnel, based on general knowledge of plant disease, according to how severely the vegetation is infected by that disease;
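As a small illustration of the number distribution defined above (number of each pest per unit area), the following sketch converts raw counts into per-area densities. The choice of hectares as the unit area, the pest names in the usage note, and the function name are assumptions; the embodiment does not fix a unit.

```python
def number_distribution(counts, area_hectares):
    """Convert raw pest counts for one forest into counts per hectare."""
    if area_hectares <= 0:
        raise ValueError("area must be positive")
    return {pest: n / area_hectares for pest, n in counts.items()}
```

For example, `number_distribution({"bark beetle": 120, "pine moth": 30}, 60.0)` yields 2.0 and 0.5 individuals per hectare respectively.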
the historical data collection module sends historical pest data of each forest to the model training module;
the model training module is mainly used for training, for each forestry category and each disease, a Bayesian network that predicts the quantity distribution of each pest based on the climate data and the severity of that disease;
In a preferred embodiment, the model training module trains the Bayesian network to predict the number distribution of each pest in the following manner:
Marking the number of the forestry category as i, and marking the number of the disease as j; the pest number is marked as k;
for the ith forestry category and the jth disease:
establishing a first-level network structure and a second-level network structure;
The first influence nodes of the first-level network structure are the climate data and the severity of the j-th disease, and the first prediction nodes of the first-level network structure are the quantity distributions of the pests;
The second-level network structure captures the mutual influence between the numbers of the pests: in the second-level network structure, the number distribution of each pest is a second influence node, and the mutual influence between second influence nodes is represented by directed edges; these edges are expressed as an influence probability matrix in which each row and each column corresponds to one pest, and each element represents the influence probability of one pest on another pest;
In a preferred embodiment, in constructing the second-level network structure, all harmful organisms with the quantity distribution lower than a preset quantity distribution threshold value can be screened out, so that the complexity of the second-level network structure is reduced;
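The screening step can be sketched as below, under the assumption that each pest is summarized by a single mean number distribution value over the historical data. The function names and the zero-initialized matrix layout are illustrative only.

```python
def screen_pests(mean_density, threshold):
    """Keep only pests whose mean number distribution reaches the threshold."""
    return sorted(p for p, d in mean_density.items() if d >= threshold)


def empty_influence_matrix(pests):
    """Square matrix with one row and one column per retained pest."""
    n = len(pests)
    return [[0.0] * n for _ in range(n)]
```

Because the matrix has one row and one column per retained pest, dropping rare pests shrinks it quadratically, which is the complexity reduction referred to above.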
The model training module reads the quantity distribution of each pest, climate data and severity of the jth disease, which are collected every data collection time period, from a forest set corresponding to the ith forestry category from the historical pest data as training data; inputting training data into the constructed Bayesian network, and training the Bayesian network; marking the Bayesian network of the jth disease of the ith forestry category after training as Bij;
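The embodiment does not fix a particular training procedure. One simple possibility, given purely as an assumption-laden sketch, is to discretize the data and estimate the influence probabilities by counting with Laplace smoothing, a common way to fit conditional probability tables of a discrete Bayesian network. Here each observation is assumed to carry a discretized climate value (`climate_bin`), a disease severity rating, and the pest densities; element `[a][b]` is interpreted as the probability that pest `b` is at high density given that pest `a` is. These names, the discretization, and this interpretation are all hypothetical.

```python
from collections import defaultdict


def train_influence_tables(observations, pests, high_threshold, alpha=1.0):
    """Estimate one influence probability matrix per (climate_bin, severity).

    observations: list of dicts with keys "climate_bin", "severity" and
    "density" (pest name -> number per unit area).
    Returns {context: matrix}, where matrix[a][b] approximates
    P(pest b is high | pest a is high, context). Contexts in which no pest
    was ever high are absent from the result.
    """
    n = len(pests)
    high_counts = defaultdict(lambda: [0] * n)
    pair_counts = defaultdict(lambda: [[0] * n for _ in range(n)])
    for obs in observations:
        context = (obs["climate_bin"], obs["severity"])
        high = [obs["density"].get(p, 0.0) >= high_threshold for p in pests]
        for a in range(n):
            if not high[a]:
                continue
            high_counts[context][a] += 1
            for b in range(n):
                if high[b]:
                    pair_counts[context][a][b] += 1
    tables = {}
    for context, pairs in pair_counts.items():
        # Laplace smoothing over the two outcomes (high / not high).
        tables[context] = [
            [(pairs[a][b] + alpha) / (high_counts[context][a] + 2 * alpha)
             for b in range(n)]
            for a in range(n)
        ]
    return tables
```

Running this once on the training data of the i-th forestry category and j-th disease would give the tables playing the role of Bij in this sketch; the diagonal entries carry no useful information and can be ignored.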
The model training module sends the trained Bayesian network Bij to the risk assessment module;
the to-be-evaluated data collection module is mainly used for collecting to-be-evaluated data in a forest to be evaluated;
in a preferred embodiment, the to-be-evaluated data includes the forestry basic data of the forest to be evaluated, the types of disease infecting the vegetation in the forest, and the severity of each disease;
The to-be-evaluated data collection module sends the to-be-evaluated data of the forest to be evaluated to the risk assessment module;
the risk assessment module is mainly used for assessing the quantity distribution among various harmful organisms based on climate data and the severity of each disease;
In a preferred embodiment, the risk assessment module assesses the number distribution among the various pests in the following manner:
All basic attributes of the forestry basic data of the forest to be evaluated are combined into a feature vector, the distance between the feature vector and each cluster center is calculated to find the cluster center closest to the feature vector, and the corresponding forestry category i is obtained from that closest cluster center;
based on the forestry category i and each disease j infecting the vegetation in the forest to be evaluated, the corresponding Bayesian network Bij is obtained; the climate data of the forest to be evaluated and the severity of the j-th disease are used as input to the Bayesian network Bij to obtain an influence probability matrix of the pests; the number distribution of one of the pests is then collected, and the number distributions of the other pests are obtained from it based on the influence probability matrix.
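The assessment step could look roughly like the sketch below. It assumes, hypothetically, that the trained networks are stored in a dictionary keyed by (category i, disease j), that each network is a table mapping a discretized (climate, severity) context to an influence probability matrix, and that row `a` of the matrix gives, for every other pest, the probability of it being at high density when the surveyed pest `a` is. None of these representations is prescribed by the embodiment.

```python
def assess(networks, category, disease, climate_bin, severity,
           pests, observed_pest):
    """Read the estimates for the other pests from the observed pest's row.

    networks: {(category, disease): {(climate_bin, severity): matrix}}
    pests: pest names in the row/column order of the matrix.
    """
    tables = networks[(category, disease)]    # select Bij
    matrix = tables[(climate_bin, severity)]  # condition on the inputs
    a = pests.index(observed_pest)
    return {p: matrix[a][b] for b, p in enumerate(pests) if b != a}


# Hypothetical usage with a single trained network B(0, 1).
example_networks = {
    (0, 1): {("warm", 2): [[0.8, 0.6, 0.1],
                           [0.7, 0.9, 0.2],
                           [0.1, 0.3, 0.8]]},
}
estimates = assess(example_networks, 0, 1, "warm", 2,
                   ["A", "B", "C"], observed_pest="A")
```

In this sketch only pest A has to be surveyed in the field; the values for B and C are read directly from row A of the matrix returned for the current climate and disease severity.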
The above embodiments are only for illustrating the technical method of the present invention and not for limiting the same, and it should be understood by those skilled in the art that the technical method of the present invention may be modified or substituted without departing from the spirit and scope of the technical method of the present invention.