Given a conceptual model of VIF prediction that includes two distinct sources of error, it is worth considering how an ensemble system (as opposed to a single deterministic NWP model) fits into that model. Every WRF run will have error, whether it is a deterministic run or a member of an ensemble; the benefit of an ensemble is that it samples at least part of that error, so that the error may be better understood and incorporated into the end user's decision process. (For a general history and summary of ensemble forecast systems, see Kalnay 2003; for a real-world example of the cost-benefit of using an ensemble for ceiling and VIF prediction in the airline industry, see Keith and Leyton 2007.) While the primary focus here is to identify and adapt for deficiencies in the WRF that result in prediction error in individual integrations, we perform this analysis in the context of an ensemble for several reasons. First, since each member of the ensemble varies not only in initial conditions (IC) but also in physics suites (the ensemble setup is detailed in Chapter III), we can be more confident that consistent errors occurring in every member are attributable to a systematic WRF deficiency rather than to a particular physics configuration or to errors in the IC. Second, MEPS and other ensembles are already in wide use in DOD and elsewhere, so we would limit the operational value of our findings if we examined NWP VIF prediction errors without also considering and measuring the ensemble dispersion characteristics of those errors; that is, the degree to which the members tend to collectively sample the errors. Using deterministic verification techniques, we will show that the WRF output in MEPS is subject to systematic deficiencies that negatively impact its skill in VIF prediction, and that this skill can be improved by adding a conservative statistical component to the framework. Although the aim of this work is not to revisit the design of the ensemble itself (e.g., number of members, perturbation strategies), typical probabilistic verification practices are used to demonstrate how the skill of the MEPS is impacted by this work’s findings, with the understanding that probabilistic verification measures are affected both by the errors from individual WRF members and by ensemble dispersion shortfalls. With little modification, the methodology and results developed here could just as well be applied to deterministic WRF output to reduce error and improve skill, albeit without the benefit of the error sampling that an ensemble provides.
Furthermore, the focus on systematic WRF deficiencies rather than individual member behavior is quite different from an ensemble calibration, which Eckel and Mass (2005) suggested should be performed separately on each member. Recent history suggests MEPS members will continue to be periodically added, deleted, and modified in attempts to improve some aspect of prediction (though not necessarily VIF prediction), so addressing the systematic deficiencies observed in most or all of the members represents the most impactful, enduring contribution toward achieving our aim. Instances where individual member behavior is particularly noteworthy will be highlighted to help inform future research on NWP development, particularly with regard to planetary boundary layer and microphysics parameterizations.
Besides error from NWP predictions and from visibility parameterizations, other sources of error exist that will not be thoroughly examined in this work but warrant consideration. Geiszler et al. (2000) alluded to error incurred by using a single model grid point for verification. Known as subgrid-scale variability or representativeness error, this error stems from the fact that the NWP predictions represent average values in a model grid box, yet the verifying observations are taken at a single point within that box. Even for the 4-km model grid used in this research, smaller-scale fog structure exists within the grid square that will contribute to error when verification is performed against a point observation. This research will not closely investigate subgrid-scale variability, but it is briefly examined and discussed in Chapter IV to gauge its potential impact. Where examined, it was not believed to substantially affect the results.
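The distinction between a grid-box average and a point observation can be illustrated with a minimal synthetic sketch. It is not drawn from MEPS data or from Geiszler et al. (2000); the function name, the Gaussian sub-grid noise, and all parameter values are assumptions chosen only for illustration, and real fog structure would be spatially correlated rather than independent from point to point. The sketch compares a "perfect" prediction of the box mean against a single location inside the box.

```python
import random
import statistics


def representativeness_rmse(subgrid_sd, n_cases=2000, n_points=16, seed=0):
    """RMSE between a perfect grid-box mean and one point inside the box."""
    rng = random.Random(seed)
    squared_errors = []
    for _ in range(n_cases):
        # Box-scale signal shared by every location inside the grid box.
        box_signal = rng.gauss(0.0, 3.0)
        # Hypothetical sub-grid structure: independent noise at each location.
        points = [box_signal + rng.gauss(0.0, subgrid_sd) for _ in range(n_points)]
        box_mean = statistics.fmean(points)  # what a perfect model would predict
        point_obs = points[0]                # what a single station would report
        squared_errors.append((box_mean - point_obs) ** 2)
    return statistics.fmean(squared_errors) ** 0.5


rmse_uniform = representativeness_rmse(subgrid_sd=0.0)
rmse_patchy = representativeness_rmse(subgrid_sd=1.0)
print(f"uniform box: {rmse_uniform:.3f}, patchy box: {rmse_patchy:.3f}")
```

Under these assumptions, the error against the point observation vanishes only when the box is uniform; with sub-grid variability present, even a perfect box-mean prediction verifies with nonzero error.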
Observation error can be defined as the measurement error of a given instrument or procedure. In an ensemble verification, Hacker et al. (2011) found that ignoring observation error had the effect of making the ensemble appear less dispersive than it is, which can in turn affect its apparent overall skill. Addressing observation error is less crucial in comparative verification, since it affects all techniques roughly equally over time, and it is not considered in this work. Nevertheless, the challenges inherent in gathering VIF observations mean observation error is likely to be greater than what might be expected for verification of temperature, for example. These challenges are documented in the next chapter.
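The effect of ignored observation error on apparent dispersion can be sketched with synthetic data. This is an illustration under stated assumptions, not the method of Hacker et al. (2011) and not MEPS output: the ensemble is constructed to be statistically consistent (truth and members are drawn from the same Gaussian distribution), and the function name, member count, and error magnitudes are all hypothetical. The sketch compares ensemble spread with the error of the ensemble mean measured against the truth and against noisy observations.

```python
import random
import statistics


def spread_vs_error(obs_error_sd=0.5, n_cases=4000, n_members=10, seed=1):
    """Compare ensemble spread with ensemble-mean RMSE for a consistent ensemble."""
    rng = random.Random(seed)
    variances, sq_err_truth, sq_err_obs = [], [], []
    for _ in range(n_cases):
        center = rng.gauss(0.0, 2.0)
        # Truth and members are drawn from the same distribution (assumption).
        truth = center + rng.gauss(0.0, 1.0)
        members = [center + rng.gauss(0.0, 1.0) for _ in range(n_members)]
        # The verifying observation is the truth plus measurement error.
        obs = truth + rng.gauss(0.0, obs_error_sd)
        ens_mean = statistics.fmean(members)
        variances.append(statistics.variance(members))
        sq_err_truth.append((ens_mean - truth) ** 2)
        sq_err_obs.append((ens_mean - obs) ** 2)
    # The finite-ensemble factor (n+1)/n makes spread comparable to mean error.
    spread_sq = statistics.fmean(variances) * (n_members + 1) / n_members
    return {
        "spread": spread_sq ** 0.5,
        "rmse_vs_truth": statistics.fmean(sq_err_truth) ** 0.5,
        "rmse_vs_obs": statistics.fmean(sq_err_obs) ** 0.5,
        "spread_plus_obs_error": (spread_sq + obs_error_sd ** 2) ** 0.5,
    }


result = spread_vs_error()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this setup the spread matches the error against the truth, but falls short of the error against the noisy observations, so the ensemble looks underdispersive even though it is not. Adding the observation error variance to the ensemble variance brings the two back into approximate agreement.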
Three other previous studies helped inform the setup and approach ultimately used in this research. Bang (2006) tested deterministic VIF predictions for a heavy fog case at Incheon, South Korea, using both the Weather Research and Forecasting (WRF) model and the Fifth-Generation Penn State/NCAR Mesoscale Model (MM5) at horizontal grid spacings ranging from 54 km to 2 km. The high-resolution WRF predictions were the most skillful, lending promise to the prospects of using MEPS, which is based on WRF runs at 4-km grid spacing, for this work. The study also found that the WRF model runs tended to underforecast fog and to dissipate it too rapidly.
Tardif (2007) examined the impact of NWP model vertical resolution on radiation fog prediction at the Paris-Charles de Gaulle airport. Using a sophisticated 1-D model designed specifically for fog (COBEL), he found that having more vertical layers near the surface improved the timing of fog onset. Onset tended to be delayed in the lower-resolution experiment because the model was unable to create a shallow fog layer, resulting in inadequate radiative cooling (note that fog droplets have higher longwave emissivity than unsaturated air, and therefore will cool a layer more quickly when present). When increasing the resolution is not possible, he suggested examining radiative cooling rates in the NWP model for signatures that may assist with radiation fog initiation. The lowest model level in MEPS (about 20 m above ground level) is even higher than the lowest model level in the low-resolution COBEL case (about 12.2 m above ground level), and we will show that similar behavior was observed.
Lastly, Zhou and Ferrier (2008) described a process for obtaining LWC values during radiation fog events by explicitly solving the governing equation that describes LWC as a function of turbulent exchange coefficient, droplet gravitational settling flux, condensation rate due to cooling, and height of the fog layer. Verification of the technique during an observed fog event was promising, and the authors suggested the technique could be used to adjust the initial LWC predictions provided by an NWP model, if the model is able to provide accurate predictions of the variables on which the equation depends. Our research examined the prospects for such an approach in MEPS, but as we will show, it would not provide large skill improvements because of the high number of missed fog cases in MEPS, for which the fog depth is zero and the technique maintains zero LWC.