Extraction of Plant Physiological Status from Hyperspectral Signatures Using Machine Learning Methods
Abstract
1. Introduction
2. Data and Methods
2.1. Laboratory Experiment
2.2. Field Measurements
2.2.1. UFZ
2.2.2. GFZ
2.3. Simulation of Spectral Signatures Using PROSAIL
Dataset | Research Question | Varied Model Parameters |
---|---|---|
Simulation 1 | Basic biophysical and structural parameters | Cab, Cw, Cm, LAI |
Simulation 2 | Additional soil influence | Cab, Cw, Cm, LAI + psoil + different soils |
Simulation 3 | Total variability | Cab, Cw, Cm, LAI + psoil + different soils + Car, Cbrown, N, angl, hspot |
Simulation 4 | Total variability + Measurement uncertainties + Measurement bias | Cab, Cw, Cm, LAI + psoil + different soils + Car, Cbrown, N, angl, hspot + Additional noise (3%, Gaussian) + Additional bias (2%) |
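
The perturbation step of Simulation 4 could be sketched as follows. This is a minimal illustration, not the authors' code: the table only states "3%, Gaussian" noise and "2%" bias, so the sketch assumes both are relative to each band's reflectance, that the noise is independent per band, and that the bias is positive. The function name `perturb_spectrum` and its arguments are hypothetical.

```python
import random


def perturb_spectrum(reflectance, noise_sd=0.03, bias=0.02, rng=None):
    """Apply relative Gaussian noise and a constant relative bias to each band.

    Assumption: both terms scale with the band's reflectance; the source
    table only gives the magnitudes (3% noise, 2% bias).
    """
    rng = rng or random.Random()
    perturbed = []
    for r in reflectance:
        noise = rng.gauss(0.0, noise_sd)  # independent relative noise per band
        value = r * (1.0 + bias + noise)
        perturbed.append(min(max(value, 0.0), 1.0))  # keep reflectance in [0, 1]
    return perturbed


if __name__ == "__main__":
    # Hypothetical three-band spectrum, seeded for repeatability.
    print(perturb_spectrum([0.05, 0.40, 0.45], rng=random.Random(42)))
```

Setting `noise_sd=0.0` isolates the bias term, which is a quick way to check how a systematic offset alone shifts the simulated spectra.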
2.4. Random Forest Methodology
Parameter | Mean | SD | Min | Max | Distribution | Reference |
---|---|---|---|---|---|---|
Cab | Data | Data | 0 | 100 | Gaussian Mixture | Data |
Car | 10 | 2 | 0 | 100 | Truncated Gaussian | - |
Cbrown | 0 | 0.2 | 0 | 1.5 | Truncated Gaussian | [22] |
Cw | 0.015 | 0.003 | 0.006 | 0.03 | Truncated Gaussian | Data |
Cm | 0.005 | 0.001 | 0.002 | 0.01 | Truncated Gaussian | [17,24] |
N | 1.3 | 0.1 | 1 | 1.5 | Truncated Gaussian | [17,23] |
LAI | Data | Data | 0 | 15 | Truncated Gaussian | Data |
angl | 45 | - | 20 | 70 | Uniform | [17] |
psoil | Data | Data | 0 | 1 | Truncated Gaussian | Data |
skyl | 0.01 | - | - | - | Fixed | - |
hspot | 0.1 | 0.3 | 0.001 | 1 | Truncated Gaussian | [22] |
tts | 45 | - | - | - | Fixed | Laboratory conditions |
tto | 0 | - | - | - | Fixed | Laboratory conditions |
psi | 0 | - | - | - | Fixed | Laboratory conditions |
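
Drawing one PROSAIL input set from the distributions in the table above might look like the sketch below. It assumes simple rejection sampling for the truncated Gaussians, which the source does not specify, and it only covers rows with literal numbers: Cab, LAI and psoil are listed as "Data" and would need the measured statistics. All function and variable names are hypothetical.

```python
import random


def sample_truncated_gaussian(mean, sd, low, high, rng, max_tries=10000):
    """Draw one value from N(mean, sd) restricted to [low, high] by rejection."""
    for _ in range(max_tries):
        x = rng.gauss(mean, sd)
        if low <= x <= high:
            return x
    raise RuntimeError("no sample inside bounds; check mean/sd against limits")


# (mean, sd, min, max) copied from the parameter table.
TRUNCATED_GAUSSIAN_PARAMS = {
    "Car": (10, 2, 0, 100),
    "Cbrown": (0, 0.2, 0, 1.5),
    "Cw": (0.015, 0.003, 0.006, 0.03),
    "Cm": (0.005, 0.001, 0.002, 0.01),
    "N": (1.3, 0.1, 1, 1.5),
    "hspot": (0.1, 0.3, 0.001, 1),
}
FIXED_PARAMS = {"skyl": 0.01, "tts": 45, "tto": 0, "psi": 0}


def draw_parameter_set(rng):
    """Return one dictionary of model inputs (without Cab, LAI and psoil)."""
    params = {
        name: sample_truncated_gaussian(*spec, rng=rng)
        for name, spec in TRUNCATED_GAUSSIAN_PARAMS.items()
    }
    params["angl"] = rng.uniform(20, 70)  # uniform between the table's min and max
    params.update(FIXED_PARAMS)
    return params


if __name__ == "__main__":
    print(draw_parameter_set(random.Random(0)))
```

Rejection sampling is easy to verify against the table bounds; for tightly truncated distributions an inverse-CDF sampler would waste fewer draws.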
3. Results
3.1. In Situ Measurements
3.1.1. Laboratory
3.1.2. Field
3.2. PROSAIL Simulations
3.3. Random Forest Prediction
Dataset | R² Cab | MAE Cab | R² LAI | MAE LAI | R² BBCH | MAE BBCH |
---|---|---|---|---|---|---|
Simulation 1 | 0.99 | 2.09 | 0.98 | 0.56 | - | - |
Simulation 2 | 0.99 | 2.37 | 0.97 | 0.63 | - | - |
Simulation 3 | 0.98 | 3.21 | 0.88 | 1.10 | - | - |
Simulation 4 | 0.98 | 3.23 | 0.89 | 0.88 | - | - |
Laboratory data | 0.94 | 4.66 | 0.80 | 0.91 | 0.91 | 8.01 |
Field data UFZ | 0.89 | 6.94 | 0.89 | 0.65 | 0.80 | 10.72 |
Field data GFZ | - | - | - | - | 0.85/0.88 | 9.87/12.07 |
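
For readers who want to reproduce the two scores reported in this table, a minimal sketch follows. It assumes R² is the coefficient of determination (1 minus the ratio of residual to total sum of squares); the source does not state whether this or the squared correlation coefficient was used. The observed and predicted values in the example are made up for illustration.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical LAI values, not taken from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.5, 2.0, 2.5, 4.0]
    print(round(mean_absolute_error(observed, predicted), 3))
    print(round(r_squared(observed, predicted), 3))
```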
3.4. Important Wavelengths
Dataset | Cab | LAI | BBCH |
---|---|---|---|
Simulation 1 | 700–720 | 750, 1140, 1270, 1870 | - |
Simulation 2 | 700–720 | 1140, 1270, 1870 | - |
Simulation 3 | 680–715 | 930, 960, 1700, 1870 | - |
Simulation 4 | 680–715 | 950, 1450, 1700, 1870 | - |
Laboratory data | 680–710, 2300 | 715, 810–910, 1100–1300 | 700, 1450, 2300 |
Field data UFZ | 680–810, 1150, 1430, 2300 | 450–500, 750–950, 1200 | 700, 780, 1300, 2300 |
Field data GFZ | - | - | 400, 700, 780, 1300, 1450, 2000 |
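
The table reports some important wavelengths as ranges (for example 700–720) and others as single bands. One way to condense a list of individually important bands into such ranges is sketched below. The 10-nm gap threshold and the function name `group_adjacent` are assumptions for illustration; the source does not describe how its ranges were formed.

```python
def group_adjacent(wavelengths, max_gap=10):
    """Merge wavelengths (nm) whose spacing is <= max_gap into (start, end) ranges."""
    ranges = []
    for wl in sorted(wavelengths):
        if ranges and wl - ranges[-1][1] <= max_gap:
            ranges[-1][1] = wl  # extend the current range
        else:
            ranges.append([wl, wl])  # start a new range
    return [tuple(r) for r in ranges]


if __name__ == "__main__":
    # Hypothetical list of bands flagged as important.
    print(group_adjacent([700, 705, 710, 720, 1140, 1270]))
```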
4. Discussion
- (1) RF limitations related to hyperspectral data: The RF algorithm modeled vegetation status from many highly correlated predictor variables, so variable importance often included groups of adjacent wavelengths. Furthermore, wavelengths that are not physically related to the investigated response, but that may contain unique information (like the 2300-nm spectral region), can be selected as important prediction variables. Nevertheless, most predictors were selected from biophysically meaningful spectral areas. Given the feature reduction preceding RF, future studies might have to include a set of prominent machine learning/kernel methods. Wavelengths consistently selected by all applied methods could then be assumed to be robust prediction variables. Even though PCA is often used for feature reduction of hyperspectral data [43], other methods are already available that sometimes outperform PCA, such as maximum noise fraction transformation (MNF, [44]) or non-parametric weighted feature extraction (NWFE, [45,46]). Still, the Hughes phenomenon is likely to affect many pre-processing and data mining methods [47,48].
- (2) Limitations of inverse modeling: The retrieval of biophysical variables from remote sensing data is always affected by the problem that several biophysical and structural vegetation characteristics influence the same spectral regions. As the inversion of an RTM is generally an under-determined problem, knowledge of the distribution of model parameters is helpful [15], which requires determining as many of the parameters required by the RTM as possible.
- (3) Limitations of PROSAIL: Although PROSAIL is an extensively used and tested RTM [15], a simple one-dimensional model like this has limitations, notably the canopy description by means of LAI and an LIDF. A possible improvement might be the use of a more sophisticated, but also more complicated, three-dimensional model, in which vegetation structure can be better represented [6]. However, this would require the estimation of many more vegetation structure parameters. Summer barley canopy structure changed substantially between the growth, maturation and senescence phases, and describing these changes with parameter mean values is difficult. The ASD spectra collected over the whole measurement period could not be fully represented in the model parameterization. In particular, the appearance of ears/awns specific to summer barley cannot be described within PROSAIL. Investigating different growth stages separately could improve prediction accuracy, but would limit transferability to field or airborne data acquisitions.
- (4) Limitations of in situ measurements: The selection of unexpected wavelengths by RF for LAI might originate from difficulties in providing a representative sample size of in situ observations, despite intensive sampling efforts. Furthermore, Cab, LAI and BBCH in situ data are affected by measurement uncertainties. For example, indices based on hyperspectral reflectance measurements have been found to derive Cab more robustly than SPAD measurements [49,50]. Finally, the phenological phase was observed to have an important influence on the appearance of summer barley spectra, which is difficult to capture by PROSAIL model parameterization. As already mentioned, the effect of ears specific to summer barley on ASD measurements could also be detected.
- (1) RF was shown to be a robust prediction method for simulated, laboratory and field data. A performance decrease between simulated and measured hyperspectral signatures has to be expected given measuring inaccuracies (both in situ and hyperspectral), for example, slight inherent deviations from nadir position when performing ASD measurements, even with strict measurement protocols. Temporal sampling in the laboratory, even on identical barley plots, was very dense compared to field measurements. In this respect, slight decreases in prediction performance are negligible, indicating that environmental conditions, e.g., changing illumination, seem to have only minor influences. Prediction differences between the two independent field datasets of GFZ and UFZ can most likely be attributed to the different coverage of BBCH stages and different acquisition procedures, but the results are still comparably close.
- (2) The inclusion of multi-temporal hyperspectral data with different growth stages (and physiological statuses) posed no limitations on RF prediction performance. Most studies evaluated machine learning techniques on hyperspectral data acquisitions with very homogeneous growth stages. Here, we could show similar prediction accuracies with all phenological growth stages included. Likewise, Gnyp et al. [51] used crop ASD measurements at different scales and different BBCH stages for successfully predicting biomass.
- (3) Variables for predictive RF models were in most cases selected from biophysically meaningful spectral regions. Wavelengths selected for LAI prediction vary more because changing LAI affects a considerable part of VNIR and SWIR, even extending into VIS at low LAI values.
- (4) No extensive parameter tuning is required when applying RF to hyperspectral data. Changing “mtry” or “ntree” had little effect on prediction quality and robustness. The proposed settings of “mtry = 1000” and “ntree = 700” should also be applicable to other studies dealing with hyperspectral data. This is in contrast to other machine learning or kernel methods, such as support vector regression or kernel ridge regression, where extensive parameter tuning, with the risk of overfitting, is standard procedure. Chan and Paelinckx [52] also found RF to be a fast, stable and robust method when deriving land use classes.
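
The suggestion made under the RF limitations above, that wavelengths selected by all applied methods could be treated as robust predictors, can be sketched as a simple set intersection. This is an illustration under stated assumptions rather than a procedure from the study: wavelengths are first binned (10 nm here, an arbitrary choice) so that neighbouring, highly correlated bands count as the same feature, and the method names and band lists in the example are hypothetical.

```python
def consensus_wavelengths(selections, bin_width=10):
    """Return the lower edges (nm) of bins selected by every method.

    selections: dict mapping a method name to an iterable of wavelengths in nm.
    """
    binned = [
        {int(wl // bin_width) * bin_width for wl in wavelengths}
        for wavelengths in selections.values()
    ]
    if not binned:
        return []
    return sorted(set.intersection(*binned))


if __name__ == "__main__":
    # Hypothetical selections from three regression methods.
    selections = {
        "random_forest": [702, 715, 1450, 2300],
        "support_vector_regression": [705, 1455, 1700],
        "kernel_ridge_regression": [708, 1451, 2300],
    }
    print(consensus_wavelengths(selections))
```

In this made-up example the 2300-nm bin is dropped because one method did not select it, which mirrors the idea that a band picked by only some methods is a weaker candidate.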
5. Conclusions
Acknowledgments
Author Contributions
Conflicts of Interest
References
- Baret, F.; Houles, V.; Guerif, M. Quantification of plant stress using remote sensing observations and crop models: The case of nitrogen management. J. Exp. Bot. 2007, 58, 869–880.
- Migliavacca, M.; Meroni, M.; Busetto, L.; Colombo, R.; Zenone, T.; Matteucci, G.; Manca, G.; Seufert, G. Modeling gross primary production of agro-forestry ecosystems by assimilation of satellite-derived information in a process-based model. Sensors 2009, 9, 922–942.
- Boegh, E.; Thorsen, M.; Butts, M.B.; Hansen, S.; Christiansen, J.S.; Abrahamsen, P.; Hasager, C.B.; Jensen, N.O.; van der Keur, P.; Refsgaard, J.C.; et al. Incorporating remote sensing data in physically based distributed agro-hydrological modeling. J. Hydrol. 2004, 287, 279–299.
- Raupach, M.R.; Rayner, P.J.; Barrett, D.J.; DeFries, R.S.; Heimann, M.; Ojima, D.S.; Quegan, S.; Schmullius, C.C. Model-data synthesis in terrestrial carbon observation: Methods, data requirements and data uncertainty specifications. Glob. Chang. Biol. 2005, 11, 378–397.
- Govender, M.; Dye, P.J.; Weiersbye, I.M.; Witkowski, E.T.F.; Ahmed, F. Review of commonly used remote sensing and ground-based technologies to measure plant water stress. Water SA 2009, 35, 741–752.
- Dorigo, W.A.; Zurita-Milla, R.; de Wit, A.J.W.; Brazile, J.; Singh, R.; Schaepman, M.E. A review on reflective remote sensing and data assimilation techniques for enhanced agroecosystem modeling. Int. J. Appl. Earth Observ. Geoinf. 2007, 9, 165–193.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Ismail, R.; Mutanga, O. A comparison of regression tree ensembles: Predicting Sirex noctilio induced water stress in Pinus patula forests of KwaZulu-Natal, South Africa. Int. J. Appl. Earth Observ. Geoinf. 2010, 12, S45–S51.
- Prasad, A.M.; Iverson, L.R.; Liaw, A. Newer classification and regression tree techniques: Bagging and random forests for ecological prediction. Ecosystems 2006, 9, 181–199.
- Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T. Random forests for classification in ecology. Ecology 2007, 88, 2783–2792.
- Lawrence, R.L.; Wood, S.D.; Sheley, R.L. Mapping invasive plants using hyperspectral imagery and Breiman Cutler classifications (RandomForest). Remote Sens. Environ. 2006, 100, 356–362.
- Chan, J.C.W.; Paelinckx, D. Evaluation of Random Forest and Adaboost tree-based ensemble classification and spectral band selection for ecotope mapping using airborne hyperspectral imagery. Remote Sens. Environ. 2008, 112, 2999–3011.
- Jacquemoud, S.; Baret, F. Prospect—A model of leaf optical-properties spectra. Remote Sens. Environ. 1990, 34, 75–91.
- Verhoef, W. Light-scattering by leaf layers with application to canopy reflectance modeling—The sail model. Remote Sens. Environ. 1984, 16, 125–141.
- Jacquemoud, S.; Verhoef, W.; Baret, F.; Bacour, C.; Zarco-Tejada, P.J.; Asner, G.P.; Francois, C.; Ustin, S.L. PROSPECT plus SAIL models: A review of use for vegetation characterization. Remote Sens. Environ. 2009, 113, S56–S66.
- Lichtenthaler, H.K. Chlorophylls and carotenoids—Pigments of photosynthetic biomembranes. Methods Enzymol. 1987, 148, 350–382.
- Vohland, M.; Mader, S.; Dorigo, W. Applying different inversion techniques to retrieve stand variables of summer barley with PROSPECT plus SAIL. Int. J. Appl. Earth Observ. Geoinf. 2010, 12, 71–80.
- Markwell, J.; Osterman, J.C.; Mitchell, J.L. Calibration of the Minolta SPAD-502 leaf chlorophyll meter. Photosynth. Res. 1995, 46, 467–472.
- Bleiholder, H.; Weber, E.; Hess, M.; Wicke, H.; van den Boom, T.; Lancashire, P.; Buhr, L.; Hack, H.; Klose, R.; Stauss, R. Growth Stages of Mono- and Dicotyledonous Plants, BBCH Monograph; Federal Biological Research Centre for Agriculture and Forestry: Berlin/Braunschweig, Germany, 2001; p. 158.
- Feret, J.B.; Francois, C.; Asner, G.P.; Gitelson, A.A.; Martin, R.E.; Bidel, L.P.R.; Ustin, S.L.; le Maire, G.; Jacquemoud, S. PROSPECT-4 and 5: Advances in the leaf optical properties model separating photosynthetic pigments. Remote Sens. Environ. 2008, 112, 3030–3043.
- Verhoef, W.; Jia, L.; Xiao, Q.; Su, Z. Unified optical-thermal four-stream radiative transfer theory for homogeneous vegetation canopies. IEEE Trans. Geosci. Remote Sens. 2007, 45, 1808–1822.
- Baret, F.; Hagolle, O.; Geiger, B.; Bicheron, P.; Miras, B.; Huc, M.; Berthelot, B.; Nino, F.; Weiss, M.; Samain, O.; et al. LAI, fAPAR and fCover CYCLOPES global products derived from VEGETATION—Part 1: Principles of the algorithm. Remote Sens. Environ. 2007, 110, 275–286.
- Jacquemoud, S. Inversion of the prospect + sail canopy reflectance model from aviris equivalent spectra—Theoretical-study. Remote Sens. Environ. 1993, 44, 281–292.
- Weiss, M.; Baret, F. Evaluation of canopy biophysical variable retrieval performances from the accumulation of large swath satellite data. Remote Sens. Environ. 1999, 70, 293–306.
- Breiman, L.; Friedman, J.; Olshen, R.; Stone, C. Classification and Regression Trees; Chapman and Hall: London, UK, 1984.
- Liaw, A.; Wiener, M. Package “randomForest”. Breiman and Cutler’s Random Forests for Classification and Regression. 2014. Available online: http://cran.r-project.org/web/packages/randomForest/randomForest.pdf (accessed on 28 October 2014).
- R Core Team. R Core Team; R Foundation for Statistical Computing: Vienna, Austria. Available online: http://www.R-project.org/ (accessed on 5 November 2014).
- Gislason, P.; Benediktsson, J.; Sveinsson, J. Random Forests for land cover classification. Pattern Recognit. Lett. 2006, 27, 294–300.
- Waske, B.; Benediktsson, J.A.; Arnason, K.; Sveinsson, J.R. Mapping of hyperspectral AVIRIS data using machine-learning algorithms. Can. J. Remote Sens. 2009, 35, S106–S116.
- Okujeni, A.; van der Linden, S.; Jakimow, B.; Rabe, A.; Verrelst, J.; Hostert, P. A comparison of advanced regression algorithms for quantifying urban land cover. Remote Sens. 2014, 6, 6324–6346.
- Dalponte, M.; Orka, H.O.; Gobakken, T.; Gianelle, D.; Naesset, E. Tree species classification in boreal forests with hyperspectral data. IEEE Trans. Geosci. Remote Sens. 2013, 51, 2632–2645.
- Schwieder, M.; Leitao, P.J.; Suess, S.; Senf, C.; Hostert, P. Estimating fractional shrub cover using simulated EnMAP data: A comparison of three machine learning regression techniques. Remote Sens. 2014, 6, 3427–3445.
- Castaings, T.; Waske, B.; Benediktsson, J.A.; Chanussot, J. On the influence of feature reduction for the classification of hyperspectral images based on the extended morphological profile. Int. J. Remote Sens. 2010, 31, 5921–5939.
- Fassnacht, F.E.; Neumann, C.; Foerster, M.; Buddenbaum, H.; Ghosh, A.; Clasen, A.; Joshi, P.K.; Koch, B. Comparison of feature reduction algorithms for classifying tree species with hyperspectral data on three central European test sites. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2014, 7, 2547–2561.
- Atzberger, C.; Guerif, M.; Baret, F.; Werner, W. Comparative analysis of three chemometric techniques for the spectroradiometric assessment of canopy chlorophyll content in winter wheat. Comput. Electr. Agric. 2010, 73, 165–173.
- Jarmer, T. Spectroscopy and hyperspectral imagery for monitoring summer barley. Int. J. Remote Sens. 2013, 34, 6067–6078.
- Verrelst, J.; Alonso, L.; Rivera Caicedo, J.P.; Moreno, J.; Camps-Valls, G. Gaussian process retrieval of chlorophyll content from imaging spectroscopy data. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2013, 6, 867–874.
- Bacour, C.; Baret, F.; Jacquemoud, S. Information content of HyMap hyperspectral imagery. In Proceedings of the 1st International Symposium on Recent Advances in Quantitative Remote Sensing, Valencia, Spain, 16–20 September 2002; Sobrino, J., Ed.; University of Valencia: Valencia, Spain, 2002; pp. 503–508.
- Bacour, C.; Jacquemoud, S.; Tourbier, Y.; Dechambre, M.; Frangi, J.P. Design and analysis of numerical experiments to compare four canopy reflectance models. Remote Sens. Environ. 2002, 79, 72–83.
- Li, X.; Zhang, Y.; Bao, Y.; Luo, J.; Jin, X.; Xu, X.; Song, X.; Yang, G. Exploring the best hyperspectral features for LAI estimation using partial least squares regression. Remote Sens. 2014, 6, 6221–6241.
- Dormann, C.F.; Elith, J.; Bacher, S.; Buchmann, C.; Carl, G.; Carre, G.; Marquez, J.R.G.; Gruber, B.; Lafourcade, B.; Leitao, P.J.; et al. Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography 2013, 36, 27–46.
- Galvão, L. Crop type discrimination using hyperspectral data. In Hyperspectral Remote Sensing of Vegetation; Thenkabail, P.S., Lyon, J.G., Huete, A., Eds.; CRC Press, Taylor and Francis Group: New York, NY, USA, 2012; Chapter 17; pp. 397–422.
- Thenkabail, P.S.; Lyon, J.G.; Huete, A. Advances in hyperspectral remote sensing of vegetation and agricultural crops. In Hyperspectral Remote Sensing of Vegetation; Thenkabail, P.S., Lyon, J.G., Huete, A., Eds.; CRC Press, Taylor and Francis Group: New York, NY, USA, 2012; Chapter 1; pp. 3–29.
- Green, A.; Berman, M.; Switzer, P.; Craig, M. A transformation for ordering multispectral data in terms of image quality with implications for noise removal. IEEE Trans. Geosci. Remote Sens. 1988, 26, 65–74.
- Falco, N.; Benediktsson, J.A.; Bruzzone, L. A study on the effectiveness of different independent component analysis algorithms for hyperspectral image classification. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2014, 7, 2183–2199.
- Kuo, B.; Landgrebe, D. Nonparametric weighted feature extraction for classification. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1096–1105.
- Hughes, G. On the mean accuracy of statistical pattern recognizers. IEEE Trans. Inf. Theory 1968, 14, 55–63.
- Thenkabail, P.S.; Gumma, M.K.; Teluguntla, P.; Mohammed, I.A. Hyperspectral remote sensing of vegetation and agricultural crops. Photogramm. Eng. Remote Sens. 2014, 80, 697–709.
- Richardson, A.; Duigan, S.; Berlyn, G. An evaluation of noninvasive methods to estimate foliar chlorophyll content. New Phytol. 2002, 153, 185–194.
- Gitelson, A. Non-destructive estimation of foliar pigment (chlorophylls, carotenoids, and anthocyanins) contents: Evaluating a semi-analytical three-band model. In Hyperspectral Remote Sensing of Vegetation; Thenkabail, P.S., Lyon, J.G., Huete, A., Eds.; CRC Press, Taylor and Francis Group: New York, NY, USA, 2012; Chapter 6; pp. 141–166.
- Gnyp, M.L.; Bareth, G.; Li, F.; Lenz-Wiedemann, V.I.S.; Koppe, W.; Miao, Y.; Hennig, S.D.; Jia, L.; Laudien, R.; Chen, X.; et al. Development and implementation of a multiscale biomass model using hyperspectral vegetation indices for winter wheat in the North China Plain. Int. J. Appl. Earth Observ. Geoinf. 2014, 33, 232–242.
- Chan, J.C.W.; Paelinckx, D. Evaluation of Random Forest and Adaboost tree-based ensemble classification and spectral band selection for ecotope mapping using airborne hyperspectral imagery. Remote Sens. Environ. 2008, 112, 2999–3011.
© 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
Doktor, D.; Lausch, A.; Spengler, D.; Thurner, M. Extraction of Plant Physiological Status from Hyperspectral Signatures Using Machine Learning Methods. Remote Sens. 2014, 6, 12247-12274. https://doi.org/10.3390/rs61212247