New types of index formulations will be developed and tested across the full optical spectrum. Only SR and ND indices have been assessed here; alternative mathematical formulations remain to be tested as well, e.g. those used for the soil adjusted vegetation index (SAVI), the enhanced vegetation index (EVI), or more complex ones.
Although in this paper only two-band formulations have been assessed, the SI toolbox offers the option for a systematic analysis of a full optical spectrum of band combinations with SI formulations of up to ten different bands in the range of 400 to 2500 nm. Since there is no reason to believe that two-band indices lead to the most successful SI models, this multiple band approach may lead to the development of more accurate and sensitive spectral indices.
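As a minimal sketch of the systematic band-combination analysis described above, the following Python fragment scores every ordered band pair of a two-band index by the r² of its linear relation with a target parameter. The function names, the band ordering used in SR and ND, and the use of r² as the selection criterion are illustrative assumptions, not the toolbox's actual implementation.

```python
import itertools
import numpy as np

def simple_ratio(b1, b2):
    # SR formulation; the numerator/denominator order is an assumption.
    return b1 / b2

def normalized_difference(b1, b2):
    # ND formulation; the sign convention is an assumption.
    return (b1 - b2) / (b1 + b2)

def r_squared(x, y):
    # Squared Pearson correlation, i.e. r2 of a linear fit of y on x.
    r = np.corrcoef(x, y)[0, 1]
    return r * r

def best_two_band_index(spectra, target, index_fn):
    """Return (best r2, (i, j)) over all ordered band pairs.

    spectra: array of shape (n_samples, n_bands) with reflectances.
    target:  array of shape (n_samples,), e.g. LAI or LCC.
    """
    best_score, best_pair = -1.0, None
    n_bands = spectra.shape[1]
    for i, j in itertools.permutations(range(n_bands), 2):
        values = index_fn(spectra[:, i], spectra[:, j])
        score = r_squared(values, target)
        if score > best_score:
            best_score, best_pair = score, (i, j)
    return best_score, best_pair
```

The exhaustive loop grows quadratically with the number of bands for two-band indices and much faster for formulations with more bands, which is why a systematic search over up to ten bands needs tooling rather than manual testing.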
Regardless of the above-mentioned research objectives, the SI assessment toolbox essentially provides all necessary tools for quality assurance and quality control (QA/QC), and for rapid and comprehensive biophysical parameter mapping. The toolbox is intuitive and can assist significantly in precision farming and landscape ecology monitoring applications, allowing SIs to be optimized, even per land cover type.
3.6 Conclusions
A newly developed Spectral Indices (SI) assessment toolbox in the ARTMO modeling environment enables the analysis and assessment of the accuracy of an indefinite number of SI models. Basically, the toolbox offers a systematic but still empirical approach for the assessment of all possible 2, 3 or 4-band SI formulations. Datasets can be partitioned into calibration and validation subsets. These datasets may originate from simulations, e.g. as generated by the optical radiative transfer models in ARTMO, or from field campaigns. Several options have been included in the SI assessment approach, amongst which: (1) the addition of noise and the possibility to select fitting functions (e.g., linear, exponential, power or polynomial functions); (2) the possibility to formulate and evaluate virtually any type of spectral index model using up to ten spectral bands; and (3) the possibility to assess and apply SIs per land cover class.
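The calibration/validation partitioning and the choice of fitting function can be illustrated with a short Python sketch. It assumes a single index value per sample, a random split, and log-transformed least squares for the exponential and power fits; all names are hypothetical and do not reflect the toolbox's own code.

```python
import numpy as np

def fit_linear(x, y):
    a, b = np.polyfit(x, y, 1)
    return lambda v: a * v + b

def fit_exponential(x, y):
    # y = a * exp(b * x), fitted in log space (requires y > 0).
    b, log_a = np.polyfit(x, np.log(y), 1)
    return lambda v: np.exp(log_a) * np.exp(b * v)

def fit_power(x, y):
    # y = a * x**b, fitted in log-log space (requires x > 0 and y > 0).
    b, log_a = np.polyfit(np.log(x), np.log(y), 1)
    return lambda v: np.exp(log_a) * v ** b

def fit_polynomial(x, y, degree=2):
    coeffs = np.polyfit(x, y, degree)
    return lambda v: np.polyval(coeffs, v)

def rmse(predicted, observed):
    diff = np.asarray(predicted) - np.asarray(observed)
    return float(np.sqrt(np.mean(diff ** 2)))

def calibrate_and_validate(index, param, cal_fraction=0.5, seed=0):
    """Fit each function on the calibration subset, report validation RMSE."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(index))
    n_cal = int(round(cal_fraction * len(index)))
    cal, val = order[:n_cal], order[n_cal:]
    fitters = {
        "linear": fit_linear,
        "exponential": fit_exponential,
        "power": fit_power,
        "polynomial": fit_polynomial,
    }
    return {
        name: rmse(fit(index[cal], param[cal])(index[val]), param[val])
        for name, fit in fitters.items()
    }
```

Fitting in log space minimizes errors of the log-transformed parameter rather than of the parameter itself, so a direct nonlinear least-squares fit may give slightly different coefficients.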
Using HyMap data calibrated with field measured data, the predictive power of generic narrowband spectral indices in the 430-2490 nm range to quantify LAI and LCC has been assessed. For LAI retrieval, a B1 band in the green (B1: 539, 555, 570 nm) combined with a B2 band in the longwave SWIR (e.g., B2: 2421, 2453 nm) has been evaluated as the most accurate approach (RMSE: 0.61; r²: 0.83). For LCC retrieval, a B1 band in the far-red (692 nm) combined with a B2 band in the NIR (e.g., 1340 nm) or shortwave SWIR (e.g., 1661, 1686 nm) has been evaluated as the most accurate approach (RMSE: 4.21-4.70; r²: 0.91-0.93). In either case, the identification of the SWIR rather than the conventional NIR as an important spectral region reinforces our suggestion that the ARTMO SI assessment toolbox significantly facilitates the development of new and especially better performing spectral indices for biophysical parameter mapping applications.
4 LUT-based RTM inversion
Contents
4.1 Abstract
4.2 Introduction
4.3 Methodology
4.4 Results
4.5 Discussion
4.6 Conclusions
This chapter is based on:
Juan Pablo Rivera Caicedo, Jochem Verrelst, Ganna Leonenko and José Moreno (2013). Multiple cost functions and regularization options for improved retrieval of leaf chlorophyll content and LAI through inversion of the PROSAIL model. Remote Sensing 5(7): 3280-3304. DOI:10.3390/RS5073280
4.1 Abstract
Lookup-table (LUT)-based radiative transfer model inversion is considered a physically sound and robust method to retrieve biophysical parameters from Earth observation data, but regularization strategies are needed to mitigate the drawback of ill-posedness. We systematically evaluated various regularization options to improve leaf chlorophyll content (LCC) and leaf area index (LAI) retrievals over agricultural lands, including the role of 1) cost functions (CFs), 2) added noise, and 3) multiple solutions in LUT-based inversion. Three families of CFs were compared: information measures, M-estimates and minimum contrast methods. We selected only CFs without additional parameters to be tuned, so that they can be implemented immediately in processing chains. The coupled leaf/canopy model PROSAIL was inverted against simulated Sentinel-2 imagery at 20 m spatial resolution (8 bands) and validated against field data from the ESA-led SPARC (Barrax, Spain) campaign. For all 18 considered CFs, introducing noise and opting for the mean of multiple best solutions considerably improved retrievals; relative errors can be halved compared with retrievals without these regularization options. M-estimates were found most successful, but data normalization also influences the accuracy of the retrievals. Here, the best LCC retrievals were obtained using a normalized 'L1-estimate' function with a relative error of 17.6% (r²: 0.73), while the best LAI retrievals were obtained through the non-normalized 'least-squares estimator' (LSE) with a relative error of 15.3% (r²: 0.74).
4.2 Introduction
Leaf area index (LAI) and leaf chlorophyll content (LCC) are essential land biophysical parameters retrievable from optical Earth observation (EO) data [Whittaker and Marks, 1975; Lichtenthaler, 1987; Malenovský et al., 2012]. These parameters give insight into the phenological stage and health status (e.g., development, productivity, stress) of crops and forests [Lichtenthaler et al., 1996; Sampson et al., 2003]. The quantification of these parameters over large areas has become an important aspect of agroecological, environmental and climatic studies [Dorigo et al., 2007]. At the same time, remotely sensed observations are increasingly being applied at a within-field scale for dedicated agronomical monitoring applications [Gianquinto et al., 2011; Delegido et al., 2013].
Over the last few decades, various space-based LAI and LCC retrieval approaches have been proposed (see review in [Dorigo et al., 2007]), some of which eventually led to operational retrieval strategies. LAI in particular has proved to be a successful parameter for operational, global retrieval at resolutions of 250 m to 1 km (e.g. MODIS, CYCLOPES and Geoland2 products) [Myneni et al., 2002; Baret et al., 2007]. Moreover, considerable effort is being undertaken to generate a global LAI product at the 30-m Landsat scale [Ganguly et al., 2012]. However, operationally retrieved land LCC products are scarce. Until the loss of the ENVISAT spacecraft, LCC maps were routinely delivered at medium spatial resolution through the MERIS terrestrial chlorophyll index [Dash and Curran, 2004] or through a trained neural net [Bacour et al., 2006]. Routinely generated LCC maps derived from high spatial resolution images (e.g. ≤20 m) are still absent, although some operational approaches have been proposed for SPOT data [Houborg and Boegh, 2008; Houborg et al., 2009]. Meanwhile, a new generation of high resolution (i.e. 10-60 m) land monitoring EO missions is being constructed for launch, such as the forthcoming superspectral Sentinel-2 (S2) mission and hyperspectral missions such as EnMAP [Stuffler et al., 2007], PRISMA [Labate et al., 2009] and HyspIRI [Roberts et al., 2012]. Such unprecedented richness of high spectral and spatial resolution data streams makes the availability of robust retrieval methods more important than ever.
To be implemented in an operational processing chain, a method should deliver accurate estimates and be easy to implement in practice. Inversion of physically-based canopy radiative transfer models (RTMs) against actual EO data is generally considered one of the most robust approaches to map biophysical parameters over terrestrial surfaces [Dorigo et al., 2007; Darvishzadeh et al., 2008]. But this approach is not straightforward. According to Hadamard's postulates, mathematical models of physical phenomena are mathematically invertible if the solution of the inverse problem to be solved exists, is unique and depends continuously on the variables [Knyazikhin et al., 1998a]. Unfortunately, these conditions are not met. In fact, the inversion of canopy RTMs is by nature an ill-posed problem, mainly for two reasons [Durbha et al., 2007]: on the one hand, several combinations of canopy biophysical and leaf biochemical parameters have a mutually compensating effect on canopy reflectance, thus leading to very similar solutions. On the other hand, model uncertainties and simplifications (e.g. the 1D nature of some models) may induce large inaccuracies in the modelled canopy reflectance.
Several strategies have been proposed to circumvent the drawback of ill-posedness, including lookup-table (LUT)-based inversion strategies [Combal et al., 2003; Darvishzadeh et al., 2008; Knyazikhin et al., 1998a; Richter et al., 2009; Weiss et al., 2000], hybrid approaches in which LUTs are generated to feed machine learning approaches [Weiss and Baret, 1999; Walthall et al., 2004; Fang and Liang, 2005; Bacour et al., 2006; Durbha et al., 2007; Qu et al., 2008], and LUT-based iterative numerical optimization methods [Jacquemoud et al., 1995]. They all have their strengths and weaknesses in specific situations. The main advantage of LUT-based inversion approaches is that they can be fast, because the most computationally expensive part of the inversion procedure is completed before the inversion itself [Dorigo et al., 2007].
LUT-based inversion in its essential form, i.e. direct comparison of LUT spectra against an observed spectrum through a cost function (CF), in some cases also known as a distance, merit function, metric or divergence measure, is part of the majority of applied inversion approaches. Such a function yields a value for one or multiple biophysical parameters by minimizing the summed differences between simulated and measured reflectances for all wavelengths [Knyazikhin et al., 1998a]. Various regularization strategies have been proposed to further optimize the robustness of the estimates: 1) the use of prior knowledge about model parameters [Baret and Buis, 2008; Combal et al., 2002, 2003; Darvishzadeh et al., 2008; Dorigo et al., 2009], 2) the use of multiple best solutions in the inversion (instead of the single best solution) [Combal et al., 2002; Koetz et al., 2005; Richter et al., 2009, 2011], 3) adding noise to account for uncertainties attached to measurements and models [Koetz et al., 2005; Richter et al., 2009, 2011], and 4) the combination of single variables into synthetic variables such as canopy chlorophyll content [Weiss et al., 2000; Bacour et al., 2006; Dorigo et al., 2007; Darvishzadeh et al., 2012]. Nevertheless, the aforementioned approaches face limitations when implemented in a more operational context.
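The essential LUT-based inversion with two of the regularization options listed above (added noise and the mean of multiple best solutions) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the multiplicative Gaussian noise model, the RMSE cost function and the percentage-based selection of best solutions are assumptions, not the procedure used in this chapter.

```python
import numpy as np

def invert_lut(observed, lut_spectra, lut_params,
               noise_pct=0.0, top_pct=0.0, seed=0):
    """Estimate parameters for one observed spectrum from a LUT.

    observed:    array (n_bands,) with measured reflectances.
    lut_spectra: array (n_entries, n_bands) with simulated reflectances.
    lut_params:  array (n_entries,) or (n_entries, n_params).
    noise_pct:   std. dev. of multiplicative Gaussian noise, in percent
                 (assumed noise model).
    top_pct:     percentage of best-matching entries to average; 0 keeps
                 only the single best solution.
    """
    rng = np.random.default_rng(seed)
    spectra = np.asarray(lut_spectra, dtype=float)
    if noise_pct > 0:
        noise = rng.normal(0.0, noise_pct / 100.0, spectra.shape)
        spectra = spectra * (1.0 + noise)
    # Cost function: RMSE between each LUT spectrum and the observation.
    cost = np.sqrt(np.mean((spectra - observed) ** 2, axis=1))
    n_best = max(1, int(round(len(cost) * top_pct / 100.0)))
    best = np.argsort(cost)[:n_best]
    # Mean of the parameters belonging to the best-matching entries.
    return np.asarray(lut_params, dtype=float)[best].mean(axis=0)

# Toy usage with a made-up linear "forward model" standing in for an RTM.
lai_grid = np.linspace(0.0, 6.0, 61)
toy_lut = np.outer(lai_grid, [0.1, 0.2, 0.3]) + 0.05
estimate = invert_lut(toy_lut[30], toy_lut, lai_grid,
                      noise_pct=5.0, top_pct=5.0)
```

Because the LUT is generated once, each additional pixel only costs one vectorized distance computation and a sort, which is what makes the approach attractive for processing chains.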
First, in the majority of these studies the root mean square error (RMSE) was used as the CF between simulated and measured spectra. However, in the case of outliers and nonlinearity, the residuals are distorted and the key assumption for using RMSE (maximum likelihood estimation with Gaussian noise) is therefore violated [Leonenko et al., 2013]. The latter authors suggested that alternative CFs may provide a more robust way to estimate biophysical parameters, since they allow retrievals for cases where errors are not normally distributed and can deal with nonlinear high-parametric problems. Verrelst et al. [2014b] recently demonstrated that alternative CFs, in combination with the aforementioned regularization strategies, can considerably improve biophysical parameter retrievals. Yet only three alternative CFs, out of more than 60, have been extensively evaluated so far, which leaves an urgent need to evaluate the performance of other promising CFs.
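To make the contrast with RMSE concrete, the standard textbook forms of the RMSE, the least-squares estimator and the L1-estimate can be written as below. The notation is an assumption introduced here for illustration: p_i denotes the simulated and q_i the measured reflectance in band i, with n bands in total.

```latex
\begin{align}
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(p_i - q_i\bigr)^2}, \\
D_{\mathrm{LSE}} &= \sum_{i=1}^{n}\bigl(p_i - q_i\bigr)^2, \\
D_{L_1} &= \sum_{i=1}^{n}\bigl|p_i - q_i\bigr|.
\end{align}
```

As a small worked example, take residuals of 0.01, 0.01 and 0.10 in three bands. The squared terms sum to 0.0102, of which the single outlying band contributes about 98%, whereas the absolute terms sum to 0.12, of which it contributes about 83%. The L1-estimate is thus less dominated by a single outlying band.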
Second, the majority of these studies focus on a specific vegetation type such as croplands, often identified as a land cover class within an image [Richter et al., 2009, 2011; Atzberger and Richter, 2012]. However, this approach can prove problematic when applying LUT-based inversion over larger areas. While land cover classification schemes help to split the problem into sub-domains for which prior information is attributed separately [Chen et al., 2002], they assume that up-to-date knowledge of land cover types is available at high spatial resolution, which is usually not the case in an operational context. These limitations imply that alternatives