Disclosure of Invention
The invention provides a data storage method for a monitoring system that aims to solve the low compression and storage efficiency of existing crop growth monitoring data, and adopts the following technical scheme:
One embodiment of the present invention provides a data storage method for a monitoring system, the method comprising the following steps:
acquiring height information of each position in a crop growing area, and acquiring multiple kinds of band data;
clustering the positions according to their height information to obtain a plurality of first categories and the category height of each first category;
arranging the category heights of the first categories in ascending order to obtain a category height sequence; obtaining an entropy value sequence of each band according to the band data, the first categories and the category heights; obtaining a first similarity of each band according to its entropy value sequence and the category height sequence; and obtaining reserved bands according to the first similarity;
performing factor analysis on each reserved band in each first category to obtain the height characteristic duty ratio of each reserved band, and obtaining the height correlation degree of each reserved band according to its height characteristic duty ratio and its first similarity;
obtaining a second similarity of each position pair formed by any two positions according to the band data of each position in the different reserved bands; classifying the positions according to the second similarity to obtain a plurality of second categories; superposing the regions of the first categories on the regions of the second categories, and obtaining clustering centers according to the ratio of each superposed intersection to the corresponding union; taking the ratio of the number of elements in the intersection to the number of elements in the first category as a first weight for the height difference between positions, and the ratio of the number of elements in the intersection to the number of elements in the second category as a second weight for the second similarity between positions; taking the weighted sum of the height difference and the second similarity as the clustering distance; and clustering according to the clustering centers and the clustering distance to obtain a plurality of third categories;
and compressing and storing the acquired monitoring data of each third category separately.
Optionally, obtaining the plurality of first categories and the category heights comprises:
clustering the positions according to the height information, wherein the clustering distance is the difference in height information between positions; the clustering result comprises a plurality of categories, each of which is a height category and is recorded as a first category; and the mean of all height information in each first category is calculated as its average height and recorded as the category height.
Optionally, obtaining the entropy value sequence of each band comprises:
each first category corresponds to one region in each kind of band data, and the information entropy of the band data at all positions in that region is taken as the entropy value of that band data in that first category;
obtaining the entropy values of each kind of band data in the different first categories, and arranging them in ascending order of the category heights of the corresponding first categories to obtain the entropy value sequence of each band.
Optionally, obtaining the first similarity of each band according to the entropy value sequence and the category height sequence comprises:
$$S_i = \cos(E_i, H)$$
wherein $S_i$ represents the first similarity of the $i$-th band, $E_i$ represents the entropy value sequence of the $i$-th band, $H$ represents the category height sequence, and $\cos(\cdot,\cdot)$ represents the cosine similarity of two sequences.
Optionally, obtaining the height characteristic duty ratio of each reserved band comprises:
$$P_j = \frac{1}{K}\sum_{k=1}^{K} p_{j,k}$$
wherein $P_j$ represents the height characteristic duty ratio of the $j$-th reserved band, $K$ represents the number of first categories, and $p_{j,k}$ represents the height-related feature duty ratio of the $j$-th reserved band in the $k$-th first category, which is calculated as:
$$p_{j,k} = \frac{e_c}{e_c + e_{j,k}}$$
wherein $e_c$ is the entropy value of the common factor vector obtained by the factor analysis, and $e_{j,k}$ is the entropy value of the special factor vector, obtained by the factor analysis, that corresponds to the reserved vector formed by the $j$-th reserved band in the $k$-th first category.
Optionally, obtaining the height correlation degree of each reserved band comprises:
$$R_j = \frac{P_j \cdot S_j}{\max(P \cdot S)}$$
wherein $R_j$ represents the height correlation degree of the $j$-th reserved band, $P_j$ represents the height characteristic duty ratio of the $j$-th reserved band, $S_j$ represents the first similarity of the $j$-th reserved band, $\max(P \cdot S)$ represents the maximum, over all reserved bands, of the product of the height characteristic duty ratio and the first similarity, $P$ represents the height characteristic duty ratio of each reserved band, and $S$ represents the first similarity of each reserved band.
Optionally, obtaining the second similarity of the position pair formed by any two positions comprises:
$$D_m = \frac{1}{N}\sum_{j=1}^{N} R_j \cdot b_{m,j}$$
wherein $D_m$ represents the second similarity of the $m$-th position pair, $N$ represents the total number of reserved bands, $b_{m,j}$ represents the band similarity of the $m$-th position pair in the $j$-th reserved band, and $R_j$ represents the height correlation degree of the $j$-th reserved band;
the band similarity is obtained as the ratio between the band data of the two positions in the $j$-th reserved band.
Optionally, classifying the positions according to the second similarity to obtain a plurality of second categories comprises:
arranging the second similarities of all position pairs in descending order to obtain a second similarity sequence, counting the number of position pairs corresponding to each distinct second similarity, selecting the second similarity whose number of position pairs is closest to the number of first categories, and taking each of the corresponding position pairs as an initial second category;
for each position not yet assigned to any second category, calculating the mean of the second similarities between that position and the positions in each second category, and taking the second category with the maximum mean as the second category to which the position belongs.
Compared with the prior art, the invention has at least the following beneficial effects. Equal crop height is a strong indicator of, but does not by itself guarantee, the same growth stage. By selecting the band data that are highly similar to the category heights and combining them with the height characteristic duty ratio of each band, the correlation between each kind of band data and the height information is calculated with higher precision. The first categories obtained by height clustering are re-clustered according to this correlation, so that crop categories divided by growth stage are obtained with higher precision, and the data of the category corresponding to each growth stage are compressed and stored separately. When new data are generated, they only need to be classified and stored according to their similarity to the existing categories. Taking crop monitoring data as an example, the monitoring data of different aspects of the crops are clustered according to the correlation among their features, and the data of each resulting category are then compressed and stored.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the invention.
Referring to FIG. 1, a flowchart of a data storage method of a monitoring system according to an embodiment of the invention is shown. The method includes the following steps:
Step S001, acquiring the height information of each position in the crop growing area and multiple kinds of band data.
The purpose of this embodiment is to extract the correlations in crop monitoring data through data mining, analyze the growth stage of the crops on that basis, and then compress and store together the monitoring data of each growth stage, within which the correlation is stronger. To this end, lidar point cloud data of the crop growing area are acquired with a high-precision lidar to obtain the height information of each position in the crop growing area.
It should be further noted that crop growth is affected by different substances, such as nitrogen and chlorophyll; different substances correspond to different band data, and differences in these substances also cause differences in crop height. Band data corresponding to the different substances therefore need to be acquired, giving multiple kinds of band data.
Step S002, clustering the height information of different positions to obtain a plurality of first categories.
K-means clustering is performed on the positions according to the height information. In this embodiment the value of $K$ is set to 9, which may be adjusted according to the implementation scenario, and the clustering distance is the difference in height information between positions. The clustering result comprises a plurality of categories, which appear as a plurality of first-category regions in the crop growing area: height information is similar within a category and differs more between categories. Each category is a height category and is recorded as a first category; the mean of all height information in each first category is calculated as its average height and recorded as the category height.
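The height clustering of step S002 can be illustrated with a minimal sketch. It assumes heights are plain floating-point numbers, uses a small K for readability, and spreads the initial centers over the sorted heights; the function name `kmeans_1d` and the initialisation rule are illustrative assumptions, not part of the described method.

```python
def kmeans_1d(heights, k, iterations=50):
    """Cluster scalar heights; returns (labels, category_heights)."""
    ordered = sorted(heights)
    # Assumed initialisation: spread the k centers over the sorted heights.
    centers = [ordered[(2 * c + 1) * len(ordered) // (2 * k)] for c in range(k)]
    labels = [0] * len(heights)
    for _ in range(iterations):
        # Assignment step: the clustering distance is the absolute height difference.
        labels = [min(range(k), key=lambda c: abs(h - centers[c])) for h in heights]
        # Update step: the category height is the mean height of the members.
        new_centers = []
        for c in range(k):
            members = [h for h, lab in zip(heights, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


heights = [0.10, 0.12, 0.11, 0.50, 0.52, 0.49, 1.00, 1.05, 0.98]
labels, category_heights = kmeans_1d(heights, k=3)
```

Each entry of `labels` is the first category of the corresponding position, and `category_heights` holds the mean height of each first category.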
Step S003, obtaining reserved bands according to the first similarity between the entropy values of each kind of band data in the different first categories and the category heights.
For the same crop, plants of equal height most likely belong to the same growth stage, but not necessarily: equal height is a strong indicator of a shared growth stage but does not guarantee it. Crops with equal height information therefore need further analysis to find what they have in common, that is, the band data whose similarity to the height information is large. Such band data are found by calculating the similarity between each kind of band data and the category heights across the different first categories.
The obtained category heights are arranged in ascending order to obtain the category height sequence, and the corresponding first categories are ordered in the same way; each first category contains all kinds of band data. Each first category corresponds to one region in each kind of band data, and the information entropy of the band data at all positions in that region is taken as the entropy value of that band data in that first category. The entropy values of each kind of band data in the different first categories are calculated and arranged in the order of the first categories to obtain an entropy value sequence, and the cosine similarity between the entropy value sequence and the category height sequence is calculated to obtain the first similarity of each kind of band data.
Specifically, taking the $i$-th kind of band data as an example, its first similarity $S_i$ is calculated as:
$$S_i = \cos(E_i, H)$$
wherein $E_i$ represents the entropy value sequence of the $i$-th kind of band data, $H$ represents the category height sequence, and $\cos(\cdot,\cdot)$ represents the cosine similarity of two sequences. The greater the first similarity of the $i$-th kind of band data, the greater the correlation between the information in that band data and the crop height information, and the more that band data should be reserved as band data that cross-validates the height information; the smaller the first similarity, the smaller that correlation, and the less necessary it is to keep that band data for cross-validation with the height information.
Further, the first similarity of each kind of band data is obtained by the above method, and a first preset threshold $T_1$ is given for judging the correlation between band data and height information; this embodiment performs the calculation with a fixed value of $T_1$. Band data whose first similarity is greater than the first preset threshold are taken as reserved bands. It should be noted that the reserved bands are independent of the first categories: every first category includes the reserved bands, and the reserved bands are the same in all first categories.
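The selection of reserved bands in step S003 can be sketched as follows. The sketch assumes that band values are already quantised so that Shannon entropy can be computed from value counts, and it uses a hypothetical threshold of 0.8 because the embodiment's threshold value is not given here; all function and variable names are illustrative.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Information entropy (base 2) of a list of discrete values."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def first_similarity(band_values, labels, category_heights):
    """Cosine similarity between a band's entropy sequence and the height sequence."""
    # Order the first categories by ascending category height.
    order = sorted(range(len(category_heights)), key=lambda c: category_heights[c])
    height_seq = [category_heights[c] for c in order]
    entropy_seq = [
        shannon_entropy([v for v, lab in zip(band_values, labels) if lab == c])
        for c in order
    ]
    return cosine_similarity(entropy_seq, height_seq)


labels = [0, 0, 1, 1, 1, 1]          # first category of each position
category_heights = [1.0, 2.0]        # category height of each first category
bands = {
    "band_a": [5, 5, 1, 2, 3, 4],    # entropy grows with height
    "band_b": [1, 2, 7, 7, 7, 7],    # entropy shrinks with height
}
THRESHOLD = 0.8  # hypothetical value of the first preset threshold
similarities = {name: first_similarity(vals, labels, category_heights)
                for name, vals in bands.items()}
reserved = [name for name, s in similarities.items() if s > THRESHOLD]
```

Here only `band_a` is kept, because its entropy sequence rises with the category heights while that of `band_b` falls.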
Step S004, performing factor analysis on each reserved band to obtain the height correlation degree of each reserved band.
It should be noted that the reserved bands are obtained from a calculation over all first categories; the degree of correlation between each reserved band and the height information is then obtained by extracting the common features of that reserved band across the first categories. Each first category contains not only height-related information but also other information, so the common features of the reserved band must be extracted: because the reserved bands are those strongly correlated with height, the extracted common features reflect the height-related features in the band data. The duty ratio of the height-related features in each reserved band is calculated and combined with the first similarity of the reserved band to obtain the degree of correlation between each reserved band and the height information.
Specifically, the height characteristic duty ratio of each reserved band is obtained by factor analysis, a method that extracts the common features and the specific features of a plurality of vectors. The band data of each reserved band at all positions in the region of each first category are converted into a vector by concatenating the rows end to end; this vector is called a reserved vector and is used as input to the factor analysis. The factor analysis outputs a common factor vector and special factor vectors in one-to-one correspondence with the reserved vectors, where the common factor vector represents the features common to the input vectors, namely the height-related features. For each input reserved vector, the duty ratio of the common factor vector, namely the height-related feature duty ratio, is calculated; for a given reserved band, the mean of the height-related feature duty ratios of its reserved vectors over all first categories is then the height characteristic duty ratio of that reserved band.
Specifically, taking the $j$-th reserved band as an example, its height-related feature duty ratio $p_{j,k}$ in the $k$-th first category is calculated as:
$$p_{j,k} = \frac{e_c}{e_c + e_{j,k}}$$
wherein $e_c$ is the entropy value of the common factor vector obtained by the factor analysis, and $e_{j,k}$ is the entropy value of the special factor vector, obtained by the factor analysis, that corresponds to the reserved vector formed by the $j$-th reserved band in the $k$-th first category. The entropy value of the common factor vector is calculated by rounding each element of the vector to form an integer sequence and then calculating the entropy value of that integer sequence; the entropy value of a special factor vector is calculated in the same way.
Further, from the height-related feature duty ratios of the $j$-th reserved band in all first categories, the height characteristic duty ratio $P_j$ of that reserved band is calculated as:
$$P_j = \frac{1}{K}\sum_{k=1}^{K} p_{j,k}$$
wherein $K$ represents the number of first categories, and $p_{j,k}$ represents the height-related feature duty ratio of the $j$-th reserved band in the $k$-th first category.
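The entropy-based ratios can be sketched as below. The factor analysis itself is not implemented: the common factor vector and the special factor vectors are taken as given inputs. The sketch also assumes that the per-category ratio is the common-factor entropy divided by the sum of the common-factor and special-factor entropies; that form and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def rounded_entropy(vector):
    """Round each element to an integer, then take the entropy of the integers."""
    ints = [round(x) for x in vector]
    counts = Counter(ints)
    total = len(ints)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def height_related_ratio(common_vector, special_vector):
    """Assumed form: share of the common-factor entropy in the total entropy."""
    e_common = rounded_entropy(common_vector)
    e_special = rounded_entropy(special_vector)
    total = e_common + e_special
    return e_common / total if total else 0.0


def height_characteristic_ratio(common_vector, special_vectors):
    """Mean of the per-category ratios of one reserved band."""
    ratios = [height_related_ratio(common_vector, s) for s in special_vectors]
    return sum(ratios) / len(ratios)


common = [0.2, 1.1, 1.9, 3.0]        # rounds to 0, 1, 2, 3 (entropy 2 bits)
specials = [
    [0.1, 0.2, 1.1, 0.9],            # first category 1: rounds to 0, 0, 1, 1
    [0.0, 0.1, 0.2, 0.3],            # first category 2: rounds to 0, 0, 0, 0
]
ratio_band = height_characteristic_ratio(common, specials)
```

With these inputs the per-category ratios are 2/3 and 1, so the band's height characteristic duty ratio is their mean, 5/6.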
Further, the height correlation degree of each reserved band is obtained from its height characteristic duty ratio and its first similarity. Taking the $j$-th reserved band as an example, its height correlation degree $R_j$ is calculated as:
$$R_j = \frac{P_j \cdot S_j}{\max(P \cdot S)}$$
wherein $P_j$ represents the height characteristic duty ratio of the $j$-th reserved band; $S_j$ represents the first similarity of the $j$-th reserved band, which is available from the first similarity already obtained for each band; $P$ represents the height characteristic duty ratio of each reserved band; $S$ represents the first similarity of each reserved band; and $\max(P \cdot S)$ represents the maximum, over all reserved bands, of the product of the height characteristic duty ratio and the first similarity, which is used for normalization. The larger the height characteristic duty ratio $P_j$ and the first similarity of the $j$-th reserved band, the larger the common feature shared between the band data of that reserved band and the crop height information. Factor analysis yields the share of height-related features in each band, so that only the height features of the reserved band are kept and other features are ignored; combining this with the first similarity better verifies the correlation between the band data and the crop height information. The height correlation degree of every reserved band is obtained in this way, and it reflects the share of features that the reserved band has in common with the height information in the first categories.
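The normalization described above can be illustrated with a short sketch: each reserved band's product of height characteristic duty ratio and first similarity is divided by the largest such product. The function name and the sample numbers are illustrative assumptions.

```python
def height_correlation_degrees(height_ratios, first_similarities):
    """Normalise each band's (duty ratio x first similarity) by the largest product."""
    products = [p * s for p, s in zip(height_ratios, first_similarities)]
    largest = max(products)
    if largest == 0:
        return [0.0 for _ in products]
    return [x / largest for x in products]


height_ratios = [0.5, 0.8, 0.4]       # one value per reserved band
first_similarities = [0.9, 0.9, 0.5]  # one value per reserved band
degrees = height_correlation_degrees(height_ratios, first_similarities)
```

The band with the largest product receives a degree of 1, and every other band is expressed relative to it.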
Step S005, obtaining the second similarity between positions according to the band data of each position in the different reserved bands, obtaining a plurality of second categories according to the second similarity, and re-clustering using the first categories and the second categories to obtain third categories.
It should be noted that the first categories come from clustering by crop height. Now that the share of features common to the reserved bands and the height information has been obtained, second categories classified by band data are derived from these common features, and re-clustering is then performed according to how the first-category regions and second-category regions intersect when superposed, which improves the accuracy with which the growth monitoring data are divided among the categories.
Specifically, after the height correlation degree of the reserved bands is obtained, the band data of each position in the different reserved bands are acquired, and the similarity of the band data of different positions in each reserved band is analyzed to obtain the second similarity. It should be noted that one position can form different position pairs with any of the other positions. Taking the $m$-th position pair as an example, its second similarity $D_m$ is calculated as:
$$D_m = \frac{1}{N}\sum_{j=1}^{N} R_j \cdot b_{m,j}$$
wherein $N$ represents the total number of reserved bands; $b_{m,j}$ represents the band similarity of the $m$-th position pair in the $j$-th reserved band, obtained as the ratio of the band data of the two positions in the $j$-th reserved band, with the smaller band data as the numerator and the larger as the denominator; and $R_j$ is the height correlation degree of the $j$-th reserved band.
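A minimal sketch of the second similarity of one position pair follows. It assumes non-negative band data and assumes that the weighted band similarities are averaged over the reserved bands; the averaging and the function names are illustrative assumptions.

```python
def band_similarity(a, b):
    """Smaller band value divided by the larger one (1.0 when both are zero)."""
    smaller, larger = min(a, b), max(a, b)
    return smaller / larger if larger else 1.0


def second_similarity(position_a, position_b, correlation_degrees):
    """Average of band similarities weighted by each band's height correlation degree.

    position_a / position_b: band data of the two positions, one value per reserved band.
    correlation_degrees: height correlation degree of each reserved band.
    """
    terms = [r * band_similarity(a, b)
             for a, b, r in zip(position_a, position_b, correlation_degrees)]
    return sum(terms) / len(terms)


similarity = second_similarity([10, 20], [20, 20], [1.0, 0.5])
```

In this example the first band contributes 1.0 × 0.5 and the second 0.5 × 1.0, so the pair's second similarity is 0.5.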
Further, each obtained second similarity is rounded to one decimal place, and the second similarities of all position pairs are arranged in descending order to obtain a second similarity sequence. The number of position pairs corresponding to each distinct second similarity is counted, and the second similarity whose number of position pairs is closest to the number $K$ of first categories is selected; each of the corresponding position pairs is taken as an initial second category, giving $n$ second categories in total, where $n$ is the number of selected position pairs. For each position not yet assigned to any second category, the mean of the second similarities between that position and all positions in each second category is calculated, and the second category with the maximum mean is taken as the second category to which the position belongs.
It should be noted that the $n$ position pairs, with $n$ closest to $K$, are selected as the initial second categories for the following reason. Both the first categories obtained by clustering crop heights and the second categories obtained from the reserved bands can represent growth categories, but each contains errors. Cross-validating the two kinds of data yields the positions that most reliably belong to the same growth category, and these positions then serve as initial clustering centers for re-clustering, which produces third categories with higher accuracy and smaller error. Choosing the number of initial second categories closest to $K$ keeps the amount of calculation low.
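The construction of the second categories can be sketched as follows. The sketch assumes that the second similarities of all position pairs are available in a dictionary keyed by `(a, b)` with `a < b`, breaks ties between equally close counts in favour of the larger similarity, and compares unassigned positions against the seed pairs only; these choices and all names are illustrative assumptions.

```python
from collections import defaultdict


def build_second_categories(num_positions, pair_similarity, num_first_categories):
    """Seed second categories from position pairs, then assign remaining positions."""
    # Group the position pairs by second similarity rounded to one decimal place.
    groups = defaultdict(list)
    for pair, sim in pair_similarity.items():
        groups[round(sim, 1)].append(pair)
    # Pick the similarity value whose pair count is closest to the number of
    # first categories (assumed tie-break: the larger similarity wins).
    chosen = min(sorted(groups, reverse=True),
                 key=lambda s: abs(len(groups[s]) - num_first_categories))
    seeds = [frozenset(pair) for pair in groups[chosen]]
    categories = [set(seed) for seed in seeds]

    def sim(a, b):
        return pair_similarity[(min(a, b), max(a, b))]

    assigned = set().union(*seeds)
    for p in range(num_positions):
        if p in assigned:
            continue
        # Mean second similarity between p and the positions of each seed category.
        means = [sum(sim(p, q) for q in seed) / len(seed) for seed in seeds]
        categories[means.index(max(means))].add(p)
    return categories


pair_similarity = {
    (0, 1): 0.92, (2, 3): 0.88, (0, 2): 0.31, (0, 3): 0.22, (1, 2): 0.41,
    (1, 3): 0.12, (0, 4): 0.71, (1, 4): 0.62, (2, 4): 0.52, (3, 4): 0.03,
}
second_categories = build_second_categories(5, pair_similarity, num_first_categories=2)
```

Two pairs share the rounded similarity 0.9, which matches the two first categories, so pairs (0, 1) and (2, 3) become the seeds and position 4 joins the seed it resembles most on average.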
Further, at this point all positions in the different band data have been clustered into first categories and classified into second categories. The regions of the first categories in each reserved band are superposed on the regions of the second categories in each reserved band to obtain the intersections of first and second categories. Taking any one intersection as an example, whether the center of its region is used as a re-clustering center point is determined as follows:
$$r = \frac{|A \cap B|}{|A \cup B|}$$
wherein $r$ is the ratio of the intersection to the corresponding union, $A \cap B$ represents the intersection, $A$ is the first category part, $B$ is the second category part, and $A \cup B$ is the union. Intersections whose $r$ is greater than a second preset threshold $T_2$ are retained; this embodiment performs the calculation with a fixed value of $T_2$, and the center point of each retained intersection region is used as a clustering center point for the re-clustering.
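The selection of re-clustering centers can be sketched as below. Regions are represented as sets of `(row, column)` positions, the center of an intersection is taken as the mean of its coordinates, and the threshold of 0.5 is a hypothetical value because the embodiment's second preset threshold is not given here; all names are illustrative.

```python
def intersection_over_union(first_region, second_region):
    """Ratio of the number of shared positions to the number of positions in the union."""
    union = first_region | second_region
    return len(first_region & second_region) / len(union) if union else 0.0


def reclustering_centers(first_categories, second_categories, threshold):
    """Centroids of the intersections whose intersection/union ratio exceeds the threshold."""
    centers = []
    for a in first_categories:
        for b in second_categories:
            common = a & b
            if common and intersection_over_union(a, b) > threshold:
                row = sum(r for r, _ in common) / len(common)
                col = sum(c for _, c in common) / len(common)
                centers.append((row, col))
    return centers


first_categories = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(0, 2), (1, 2)}]
second_categories = [{(0, 0), (0, 1), (1, 0)}, {(1, 1), (0, 2), (1, 2)}]
centers = reclustering_centers(first_categories, second_categories, threshold=0.5)
```

Only the two overlaps with ratios 3/4 and 2/3 pass the hypothetical threshold, so two center points are produced.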
It should be further noted that the clustering distance used with these clustering centers refers to both the height difference and the second similarity of two positions, which are therefore given different weights. Specifically, still taking the $m$-th position pair as an example, its clustering distance $d_m$ is calculated as:
$$d_m = \frac{|A \cap B|}{|A|} \cdot \Delta h_m + \frac{|A \cap B|}{|B|} \cdot D_m$$
wherein $A$ and $B$ are the first category part and the second category part whose superposition produces the intersection $A \cap B$; $\frac{|A \cap B|}{|A|}$ is the share of the intersection in the corresponding first category, measured as a ratio of element counts, and serves as the first weight for the height difference; $\Delta h_m$ is the height difference between the two positions of the $m$-th position pair, obtained from the first-category regions corresponding to the different reserved bands; $\frac{|A \cap B|}{|B|}$ is the share of the intersection in the corresponding second category, also measured as a ratio of element counts, and serves as the second weight for the second similarity; and $D_m$ is the second similarity of the $m$-th position pair. When a first category and a second category are superposed to produce an intersection, the larger the share of the intersection in the first category and in the second category, the higher the initial precision and the reliability of the third category represented by the corresponding category. The clustering distance between any two positions is obtained in this way.
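The weighted clustering distance can be illustrated with a short sketch. Regions are sets of positions, the weights are the shares of the intersection in the first and second category, and the function name and sample values are illustrative assumptions.

```python
def clustering_distance(first_region, second_region, height_a, height_b, second_sim):
    """Weighted sum of the height difference and the second similarity of a position pair."""
    common = len(first_region & second_region)
    weight_height = common / len(first_region)       # share of intersection in first category
    weight_similarity = common / len(second_region)  # share of intersection in second category
    return weight_height * abs(height_a - height_b) + weight_similarity * second_sim


first_region = {(0, 0), (0, 1), (1, 0), (1, 1)}
second_region = {(0, 0), (0, 1), (1, 0)}
distance = clustering_distance(first_region, second_region,
                               height_a=1.0, height_b=0.6, second_sim=0.5)
```

Here the intersection covers 3 of the 4 first-category positions and all 3 second-category positions, so the weights are 0.75 and 1.0 and the distance is 0.75 × 0.4 + 1.0 × 0.5.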
Further, according to the clustering center points obtained from the intersection regions and the clustering distance between positions, the positions are re-clustered by the K-means method, wherein the value of $K$ remains equal to the number of first categories, that is, $K = 9$ in this embodiment; the resulting clusters are recorded as a plurality of third categories.
Step S006, compressing and storing the acquired monitoring data of each third category separately.
The acquired data in each third category are compressed and stored separately. Each third category contains the growth monitoring data of the same crop in the same growth stage, so the data within one third category are highly similar, which benefits compression. In addition, because the data are stored by third category, newly generated data only need to be classified and stored according to their similarity to the existing categories. Each third category can be compressed with Huffman coding or run-length coding.
This completes the compressed storage of the monitoring data. The correlation between height information and band data of crops at different positions is analyzed through data mining, crops with similar growth states are obtained by clustering, and the monitoring data of crops with similar growth states are compressed and stored together, which improves the compression efficiency of the monitoring data.
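Per-category compression can be sketched with run-length coding, one of the two coding methods named above. The sketch assumes that the monitoring values are discrete and that each value carries the label of its third category; the function names and the sample data are illustrative.

```python
def run_length_encode(values):
    """Encode a sequence as (value, run length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(run) for run in runs]


def run_length_decode(runs):
    return [v for v, n in runs for _ in range(n)]


def compress_by_category(values, labels):
    """Group values by third category, then run-length encode each group separately."""
    grouped = {}
    for v, lab in zip(values, labels):
        grouped.setdefault(lab, []).append(v)
    return {lab: run_length_encode(vals) for lab, vals in grouped.items()}


values = [7, 3, 7, 3, 7, 3]          # monitoring values in acquisition order
labels = [0, 1, 0, 1, 0, 1]          # third category of each value
compressed = compress_by_category(values, labels)
baseline = run_length_encode(values)  # encoding without grouping, for comparison
```

Grouping first turns six runs of length one into one run per category, which illustrates why storing similar data together helps the coder.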
It should be noted that, in this embodiment, the band data are obtained by hyperspectral imaging of the crop growing area. Each channel of the resulting hyperspectral image is one kind of band data, and the value of each pixel in a channel is the band data of the corresponding position in that band. Each pixel of the hyperspectral image corresponds to one position in the growing area, so each position has one piece of height information and multiple kinds of band data.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.