The SMOS satellite’s ability to see through dense canopy and its high temporal resolution are technological advancements that have generated new environmental data. One recently discovered application is tracking the growth and senescence of vegetation. Like traditional vegetation indices such as NDVI, the SMOS signal has a seasonal harmonic curve that is highly correlated with the phenology of row crops. However, unlike NDVI, the SMOS data product, τ, contains signal and noise from multiple processes. As mentioned previously, one likely confounding variable is change in soil surface roughness; another is the signal processing itself, which relies on a changing algorithm to reduce noise in the data. The goal of this paper has been twofold: first, to smooth the data in order to better understand the seasonal patterns of τ with respect to crop development; second, to provide an estimate, along with a measure of uncertainty, of the annual timing at which crops reach their maximum growth stage.
By estimating a DLM with a random walk and a harmonic term, we were able to model the time series of τ for 7 growing seasons from 2011 to 2017. The timing of peak τ was fairly homogeneous within a growing season. When ranking the regions spatially from low to high latitude, there were no statistically significant differences in the credible intervals for the DOY of peak τ. There was, however, heterogeneity in the timing of peak τ across growing seasons. Most notably, in 2012 τ reached its peak earlier than in any of the other 6 years. This early peak makes sense given the weather conditions of 2012. The Iowa Environmental Mesonet (IEM) has daily data on growing degree days for each region across years. When the data are aggregated from May 1 to September 1, 2012 was the hottest and driest of the growing seasons analyzed.
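The structure of a DLM with a random walk level and a single harmonic can be sketched in a few lines. The code below is a minimal illustration, not the estimation procedure used in this paper: it runs a plain Kalman filter with fixed variances rather than a full Bayesian fit, and the function names, the 365-day period, the variance values, and the simulated τ-like series are all assumptions made for the example.

```python
import math
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def harmonic_dlm(period):
    """Observation vector F and evolution matrix G for a level plus one harmonic."""
    w = 2.0 * math.pi / period
    F = [1.0, 1.0, 0.0]  # observation = level + harmonic component
    G = [[1.0, 0.0, 0.0],
         [0.0, math.cos(w), math.sin(w)],
         [0.0, -math.sin(w), math.cos(w)]]
    return F, G


def kalman_filter(y, F, G, V, W, m0, C0):
    """Return filtered means of the signal F'theta_t; None marks a missing day."""
    n = len(F)
    m, C = list(m0), [row[:] for row in C0]
    signal = []
    for obs in y:
        # Prediction step: a = G m, R = G C G' + W
        a = [sum(G[i][j] * m[j] for j in range(n)) for i in range(n)]
        GCGt = matmul(matmul(G, C), transpose(G))
        R = [[GCGt[i][j] + W[i][j] for j in range(n)] for i in range(n)]
        if obs is None:
            m, C = a, R  # no update when the observation is missing
        else:
            # Update step with scalar forecast variance Q = F'RF + V
            RF = [sum(R[i][j] * F[j] for j in range(n)) for i in range(n)]
            Q = sum(F[i] * RF[i] for i in range(n)) + V
            err = obs - sum(F[i] * a[i] for i in range(n))
            m = [a[i] + RF[i] * err / Q for i in range(n)]
            C = [[R[i][j] - RF[i] * RF[j] / Q for j in range(n)] for i in range(n)]
        signal.append(sum(F[i] * m[i] for i in range(n)))
    return signal


def simulate_tau(n_days=365, noise_sd=0.02, seed=1):
    """Hypothetical tau-like series: a sinusoid peaking near day 182 plus noise."""
    rng = random.Random(seed)
    return [0.3 - 0.2 * math.cos(2.0 * math.pi * t / 365.0) + rng.gauss(0.0, noise_sd)
            for t in range(n_days)]


def peak_day(signal):
    """Index of the largest filtered value."""
    return max(range(len(signal)), key=lambda t: signal[t])


if __name__ == "__main__":
    F, G = harmonic_dlm(period=365.0)
    W = [[1e-6, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # only the level evolves
    C0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    filtered = kalman_filter(simulate_tau(), F, G, 0.02 ** 2, W, [0.0, 0.0, 0.0], C0)
    print("day of peak filtered signal:", peak_day(filtered))
```

In this sketch a larger observational variance V makes the filter trust each measurement less, which smooths the filtered signal; a Bayesian fit would additionally give a posterior distribution, and hence a credible interval, for the day of the peak.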
We should point out some important limitations of this analysis. We assumed mutual independence of the regions and growing seasons. With only 7 years of data, the assumption of independent seasons seems reasonable; however, there is likely some seasonal and/or spatial dependence across regions within a fixed year. We also assumed the signal was a function of a single harmonic and a random walk term. However, there is an early dip at the start of the growing season and a second peak after harvest (early September). These features are hypothesized to result from changes in soil roughness, but they could provide information to agronomists if incorporated into the model. Lastly, the flexibility of the DLM allows the estimates of the latent states to change quickly in either the positive or negative direction. However, as crops develop they accumulate moisture monotonically until they reach their peak, and then gradually lose moisture and dry out. Rapid changes in measurements of τ are a result of noise in the satellite signal, not rapid changes in τ itself. Restricting the DLM to have non-decreasing latent states before the peak and non-increasing latent states after it would make the model more congruous with the mechanism of crop phenology.
Other directions for improving the current model involve reducing the uncertainty of the latent state estimates. As demonstrated in the simulation study, of all the model parameters, the observational variance of the DLM primarily controls the width of the posterior CIs for τ. Therefore, initially smoothing the data could improve the precision of the retrospective estimates of τ. Another possibility is to add more parametric structure to the DLM, either through covariates (such as accumulated thermal time) or through additional model terms in the evolution equation. Of course, as with all statistical modeling, reducing the variance of parameter estimates comes at the cost of potentially introducing bias into the model.
CHAPTER 3. A NONLINEAR HIERARCHICAL MODEL FOR MONITORING CROP GROWTH
A paper submitted to the Journal of Agricultural, Biological and Environmental Statistics (JABES)
Colin Lewis-Beck, Jarad Niemi, Petruţa Caragea, Brian Hornbuckle, Victoria Walker
3.1 Introduction
Accurately monitoring the development of row crops over a growing season is useful for both agronomists and climatologists (Zeng et al., 2016). Satellite measurements can be used to quantify the amount of ground vegetation present, and track the progression of crops through their life stages. Time series of satellite measurements exhibit a relatively stable signature across years, and certain characteristics, such as the maximum value over a growing season, are related to phenological states (Hornbuckle et al., 2016). If a crop’s vegetation index varies significantly from previous seasons, or differs from nearby crops within a growing season, this provides information about regional or local weather conditions, length of the growing season, timing of harvest, or expected crop yield (Bolton and Friedl, 2013). Modeling these curves, however, is challenging because of annual differences in crop development as well as heterogeneity across geographic regions within a season. Varying environmental conditions, agricultural practices, and noise in the data collection process contribute to differing vegetation patterns.
Much of the modeling using remote sensing vegetation data has sought to extract key phenological growth stages retrospectively. These approaches use filtering and parametric models to smooth time series of vegetation data. Estimates of the start, peak, and end of the growing season are then calculated using the processed data. For example, wavelet and Fourier transformations have been applied to enhanced vegetation index (EVI) data to extract the dates of key phenological stages (Sakamoto et al., 2005). Growing degree day (GDD) models and moderate resolution imaging spectroradiometer (MODIS) image data were integrated to estimate the phenological stages of rice crops (Boschetti et al., 2009). Other approaches include discrete stochastic models with GDD as an explanatory variable to predict the bloom dates of fruit crops (Cai et al., 2014). More complex models combine multiple covariates, such as historical crop data, simulated climate output, and different vegetation indices, to update growth forecasts throughout the year (Newlands et al., 2014).
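For readers unfamiliar with GDD, the quantity can be sketched as follows. This is a minimal illustration that assumes the clamped averaging method with a 50°F base and an 86°F cap, a convention commonly used for corn; the studies cited above may use other thresholds or methods, and the function names are hypothetical.

```python
def daily_gdd(t_max, t_min, base=50.0, cap=86.0):
    """Growing degree days for one day, in degrees Fahrenheit.

    Both temperatures are clamped to [base, cap] before averaging
    (assumed thresholds; other crops use other values).
    """
    hi = min(max(t_max, base), cap)
    lo = min(max(t_min, base), cap)
    return (hi + lo) / 2.0 - base


def accumulated_gdd(daily_highs, daily_lows, base=50.0, cap=86.0):
    """Running total of GDD over a sequence of days."""
    total = 0.0
    running = []
    for t_max, t_min in zip(daily_highs, daily_lows):
        total += daily_gdd(t_max, t_min, base, cap)
        running.append(total)
    return running
```

Accumulated GDD, sometimes called thermal time, increases monotonically over the season, which is why it is a natural explanatory variable for the timing of growth stages.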
On the parametric side, a frequently used model is the asymmetric Gaussian (AG), or double Gaussian, function. Related to the asymmetric Gaussian distribution, the AG function has a similar shape but lacks the normalizing constant (Wallis et al., 2014). In its most general form, the AG function has six parameters, which makes it more flexible than the standard Gaussian curve. Jonsson and Eklundh (2002) used nonlinear least squares to fit the AG function to normalized difference vegetation index (NDVI) data. More recent papers have smoothed vegetation data and estimated the dates of key phenological events using the AG curve (Beck et al., 2006; Atkinson et al., 2012). Another parametric model appearing in the crop phenology literature is the double logistic function. An extension of the logistic curve, the double logistic function combines two logistic functions to capture nonlinear patterns at the beginning and end of the growing season. Applications of this function to chlorophyll index data in India and to EVI data in North America are detailed in Atkinson et al. (2012) and Wu et al. (2014). Parametric curves can accurately smooth the pattern of crop growth and senescence, but they are typically fit using numerical optimization techniques that do not provide measures of uncertainty.
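To make the two curve families concrete, the sketch below evaluates one possible form of each. It assumes a baseline, an amplitude, a peak location, and separate width and exponent parameters on each side of the peak for the AG curve, and a difference of two logistic curves for the double logistic; the argument names and the exact number of parameters are illustrative and may not match the parameterizations used in the cited papers or later in this chapter.

```python
import math


def asymmetric_gaussian(t, base, amp, peak, width_left, width_right,
                        power_left=2.0, power_right=2.0):
    """Gaussian-shaped curve with different widths and exponents on each side of the peak."""
    if t <= peak:
        z = ((peak - t) / width_left) ** power_left
    else:
        z = ((t - peak) / width_right) ** power_right
    return base + amp * math.exp(-z)


def double_logistic(t, base, amp, mid_rise, rate_rise, mid_fall, rate_fall):
    """Rising logistic minus a later falling logistic, scaled and shifted.

    Intended for day-of-year inputs; very large |rate * (t - mid)| values
    would overflow math.exp.
    """
    rise = 1.0 / (1.0 + math.exp(-rate_rise * (t - mid_rise)))
    fall = 1.0 / (1.0 + math.exp(-rate_fall * (t - mid_fall)))
    return base + amp * (rise - fall)


if __name__ == "__main__":
    # Hypothetical season: slow green-up, faster senescence, peak on day 200.
    for day in (120, 160, 200, 240, 280):
        ag = asymmetric_gaussian(day, 0.1, 0.4, 200.0, 50.0, 30.0)
        dl = double_logistic(day, 0.1, 0.4, 150.0, 0.1, 270.0, 0.1)
        print(day, round(ag, 3), round(dl, 3))
```

Because the left and right sides of the AG curve have their own width and exponent, green-up and senescence can proceed at different rates while the curve stays continuous at the peak.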
In order to describe a population of vegetation curves within and across seasons, we propose a novel hierarchical model using the asymmetric Gaussian function. Modeling curves hierarchically within a season borrows information across satellite measurements taken at different locations; jointly modeling growing seasons borrows information about crop growth patterns across years. Estimation is performed using a Bayesian approach, which provides measures of uncertainty for model parameters and for functionals of model parameters that have practical importance, such as the length and shape of the growing season.
While nonlinear mean functions are familiar in the crop phenology literature, this paper offers a new modeling framework that reflects the structure of the satellite data, borrows information at multiple levels, incorporates prior information about crop phenology, and provides uncertainty quantification. The combination of noisy satellite measurements and nonlinear growth curves makes model parameters sensitive to outliers. As we will show, borrowing information across regions and years imposes “soft” constraints on model parameters and yields more stable estimates. Lastly, we demonstrate our model using data from a microwave remote sensing satellite and find it may provide an alternative to USDA ground-based estimates of crop phenology stages. This is especially important because the USDA stopped reporting key crop development stages after the 2013 growing season.
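The idea of a "soft" constraint can be illustrated with a toy calculation. The sketch below is not the model proposed in this paper, which is nonlinear and fit with a full Bayesian approach; it assumes a simple normal-normal setup with known variances, and the function name and all numbers are hypothetical.

```python
def shrink_toward_group(estimates, variances, group_mean, group_variance):
    """Posterior means in a normal-normal model with known variances.

    Each region's estimate is pulled toward the group mean, and noisier
    estimates (larger variance) are pulled harder.
    """
    posterior = []
    for y, v in zip(estimates, variances):
        precision_data = 1.0 / v
        precision_group = 1.0 / group_variance
        mean = (y * precision_data + group_mean * precision_group) / (
            precision_data + precision_group)
        posterior.append(mean)
    return posterior


if __name__ == "__main__":
    # Hypothetical peak-day estimates for three regions; the third is a noisy outlier.
    peak_days = [200.0, 205.0, 260.0]
    noise_variances = [4.0, 4.0, 400.0]
    print(shrink_toward_group(peak_days, noise_variances, 205.0, 25.0))
```

The well-measured regions barely move, while the noisy outlier is drawn most of the way back toward the group mean; no hard bound is imposed on any parameter.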
The remainder of the article is organized as follows. Section 2 introduces the satellite data. Section 3 describes the nonlinear hierarchical model, and presents a Bayesian approach to estimation. In Section 4, we compare estimated parameters from the AG function across growing seasons. Section 5 compares the AG function against other parametric models, estimates from USDA survey data, and data from another type of satellite. In Section 6, we discuss the benefits of our approach, model assumptions, and possible extensions.