In recent years there has been a growing interest in the application of probabilistic techniques to estimate consumer exposure to chemicals in food. In contrast to the deterministic methodology, probabilistic techniques allow the distribution of intakes for multiple individuals in a specified population to be estimated, taking into consideration the variability in food consumption between individuals and the variability in occurrence of residues in food commodities. As in the deterministic IESTI equations, estimating intake from one commodity for a single person on a single day requires the multiplication of the amount of commodity they consumed by the concentration of pesticide it contained, followed by a division by the person’s body weight. To assess how often that person’s intakes exceed the ARfD, this process can be repeated for every day of the year. If we want to assess what proportion of a population exceeds the ARfD, we need to repeat this calculation for each person in the population. Since this is not possible in practice, dietary exposure models are based on the principle that, if we have a representative sample from the population, we should be able to make inferences about characteristics of the whole population.
For dietary risk assessment, probabilistic approaches infer these characteristics by taking descriptions of the variation in consumption and body weights for multiple people and multiple days and combining them with a description of the variation in residue levels, selected at random. Consumption and body weight data are derived from national dietary surveys and residue concentrations are derived from supervised field trials or monitoring programmes, depending on whether the risk assessment is part of the registration process or not. The basic procedure is as follows:
1. Select one ‘person-day’ record from a dietary survey, comprising consumption and body weight. The consumption and body weight data are sampled together to account for the perceived dependencies between those quantities.
2. Sample a single concentration at random from a distribution describing the variation in pesticide residue levels.
3. Calculate the modelled intake for this person-day by multiplying consumption by concentration and dividing this product by body weight.
4. Repeat steps 1-3 for a large number of person-days, calculating a modelled intake for each.
5. Determine the percentage of modelled intakes for all the person-days that are below the ARfD for the pesticide.
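The five steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used by any of the models cited in this section: the function names, the person-day records, the residue values and the ARfD are all hypothetical, and both inputs are sampled empirically with replacement.

```python
import random

def simulate_intakes(person_days, residues, n_iter, rng):
    """Steps 1 to 4: one modelled intake (mg/kg bw/day) per sampled person-day."""
    intakes = []
    for _ in range(n_iter):
        # Step 1: consumption (kg/day) and body weight (kg) are drawn together
        # as one record, which preserves the dependency between them.
        consumption_kg, body_weight_kg = rng.choice(person_days)
        # Step 2: a residue concentration (mg/kg) is drawn independently.
        concentration = rng.choice(residues)
        # Step 3: intake = consumption x concentration / body weight.
        intakes.append(consumption_kg * concentration / body_weight_kg)
    return intakes

def percent_below(intakes, arfd):
    """Step 5: percentage of modelled intakes below the ARfD."""
    return 100.0 * sum(x < arfd for x in intakes) / len(intakes)

# Hypothetical inputs, for illustration only.
person_days = [(0.15, 62.0), (0.00, 70.5), (0.30, 18.2), (0.08, 81.0)]
residues = [0.02, 0.05, 0.11, 0.40]
ARFD = 0.002  # mg/kg bw/day (hypothetical)

rng = random.Random(42)
intakes = simulate_intakes(person_days, residues, 10_000, rng)
print(f"{percent_below(intakes, ARFD):.1f}% of modelled person-days are below the ARfD")
```

Passing the random number generator in explicitly keeps a run reproducible, which helps when comparing model variants on the same inputs.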
Until EFSA (2012) recently developed guidelines on the use of probabilistic methodology for modelling dietary exposure to pesticide residues, little guidance existed on how probabilistic dietary modelling should be conducted. EFSA (2012) proposes a tiered approach for probabilistic dietary risk assessments and focuses on a ‘basic’ assessment which may be refined if it results in uncertainty about the risk associated with pesticide exposure. This ‘basic’ assessment consists of two model runs, a pessimistic model run that is expected to overestimate intake and an optimistic model run that should lead to an underestimate of the intake. The idea is that if the former does not raise any concern for risk managers, the ‘true’ dietary intake should also not raise concerns. If the optimistic model indicates an unacceptable level of risk, it is considered that refining the model is unlikely to be worthwhile. Various probabilistic dietary risk assessment models have been developed (CREMe, McNamara et al., 2003; MCRA, De Boer and Van der Voet, 2011; Uni-HB, EFSA, 2007b).
Most models used in probabilistic dietary risk assessment include several of the following characteristics:
• Residue Levels
◦ Data: For a proposed new use, typically only supervised field trial data on composites of food items are available. Each composite sample consists of several units of a raw agricultural commodity from a supervised field trial. If the pesticide is already used for other commodities, monitoring data may be available for those commodities. If a product has been approved, monitoring data can be used to either assess the risk associated with a high residue event (i.e. one of the monitored samples has residue levels above the MRL) or for an evaluation of risk associated with pesticide exposure.
It is important to note that concentration data are often used as actual residue levels, not accounting for measurement errors and reporting/rounding errors. Data below the limit of determination may be modelled using simple replacement rules (e.g. set to LOD, half the LOD or zero) or by more advanced modelling that treats them as latent (censored) values from either a residue level distribution or a mixture distribution, allowing for a proportion of these values to be true zeros.
◦ Choice of Model: Currently pesticide residue levels may be modelled with empirical or parametric distributions. In the former case, composite residue samples are resampled with replacement. Sometimes a bootstrap approach (Efron, 1979) is applied to account for uncertainty. Bootstrapping involves resampling the data with replacement to generate new ‘data sets’ of the same size which can be described by empirical distributions. To model the variation in residue levels, these empirical distributions are then sampled with replacement. In the parametric case, a (set of) distribution(s) is fitted to the residue data and samples from this (set of) distribution(s) are drawn to generate estimates of the mean residue level. EFSA (2012) recommends using either an empirical distribution or a Lognormal distribution, although more advanced models have been suggested that make use of extreme value theory (Kennedy et al., 2011).
◦ Unit variation: Unit variation can be modelled using two different approaches (EFSA, 2012) depending on the data available:
− Sample-based: This approach comes from interpreting each of the composite samples as the average concentration of a population of a finite number of units (e.g. the potatoes in a bag of potatoes or a bunch of bananas). We can describe the variation in the mean residue levels using an empirical or parametric distribution, F, assuming composite data are representative of the field mean. Once we have generated a new mean R from F, the finite number of units, n, implies that there is an upper bound on the unit distribution: the highest possible residue is now equal to n × R (i.e. the case where all of the residue is contained in one unit). EFSA (2012) suggests that in this case a Beta distribution should be used to sample a unit residue value.
− Lot-based: This approach can be thought of as having m composite sample values based on taking n units (e.g. potatoes) from each of the m fields. In contrast to the sample-based approach, this method assumes that there are an infinite number of units in each field. We can again use an empirical or parametric distribution, F, to describe the variation in mean residues. To sample a unit residue level for a unit from a random field, a Lognormal distribution is assumed with the mean value sampled from F and the variance calculated using this mean and a variability factor, representing variation in residue levels between units. The value of the variability factor depends on the type of data. For supervised field trial data, the variability factor is sampled from a Lognormal distribution based on unit field trial data (EFSA, 2005) or fixed at a value of 3 or 6.83 (EFSA, 2007a). For monitoring data the variability factor is sampled from a Lognormal distribution based on unit monitoring data (EFSA, 2005) or fixed at 6.83, 5 or 1 (EFSA, 2007a).
◦ Food Processing: Residue levels are likely to be affected by various processing steps before the raw agricultural commodity is consumed. Dietary risk assessment models use fixed values of processing factors, defined as the ratio of the concentration in the processed food to that in the unprocessed food, when processing information is available.
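Two of the residue-level choices listed above, the simple replacement rules for values below the LOD and the lot-based sampling of a unit residue, can be sketched as follows. The text does not state how the Lognormal variance is obtained from the variability factor, so the sketch makes an explicit assumption: the variability factor v is taken to be the ratio of the 97.5th percentile of the unit distribution to its mean. All function names and data values are hypothetical.

```python
import math
import random

Z_975 = 1.96  # 97.5th percentile of the standard normal distribution (rounded)

def replace_non_detects(values, lod, rule="half"):
    """Apply a simple replacement rule; None marks a value below the LOD."""
    fill = {"lod": lod, "half": lod / 2.0, "zero": 0.0}[rule]
    return [fill if v is None else v for v in values]

def sigma_from_variability_factor(v):
    """Log-scale sd of a Lognormal whose 97.5th percentile is v times its mean.

    ASSUMPTION: this definition of v is not given in the text. Expects v >= 1.
    """
    # v = exp(z*sigma - sigma**2 / 2)  =>  sigma = z - sqrt(z**2 - 2*ln(v))
    # The discriminant is clamped at zero; it vanishes near v = 6.83,
    # the largest factor that this parametrisation can represent.
    disc = max(0.0, Z_975 ** 2 - 2.0 * math.log(v))
    return Z_975 - math.sqrt(disc)

def sample_unit_residue(composite_means, v, rng):
    """Lot-based sketch: draw a lot mean from the empirical F, then a unit residue."""
    lot_mean = rng.choice(composite_means)  # must be > 0 (so not the 'zero' rule)
    sigma = sigma_from_variability_factor(v)
    mu = math.log(lot_mean) - sigma ** 2 / 2.0  # Lognormal mean equals lot_mean
    return rng.lognormvariate(mu, sigma)

# Hypothetical composite residues (mg/kg); None is a non-detect.
composites = replace_non_detects([0.04, None, 0.12, 0.31], lod=0.01, rule="half")
rng = random.Random(1)
units = [sample_unit_residue(composites, 3.0, rng) for _ in range(5)]
print(composites)
print(units)
```

Under this assumed definition a factor of 1 gives no unit-to-unit variation, so every unit carries the lot mean.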
• Consumption
◦ Data: Consumption data are taken from dietary surveys for various age groups and are obtained from a wide range of survey types (see Section 1.3.1.2).
◦ Choice of Model: Variation in consumption is typically modelled empirically (EFSA, 2012), resampling the observed consumption data as recorded in a dietary survey with replacement, rather than by fitting parametric models to the data. This approach retains potentially complex patterns in the data, in particular correlations between consumption of different foods. However, modelling a variable empirically using the observed data is likely to underestimate the maximum intake. This is because it is unlikely that the survey recorded the most extreme eating event in the population for every commodity. An alternative would be to use parametric approaches, which allow values higher than the highest observed consumption amount, but this would require modelling of dependencies. In order to model dependencies using parametric approaches, many observations are needed. As these are often not available for food types that are consumed rarely, this approach may only be reasonable for some food types (e.g. staple foods consumed frequently such as bread or potatoes). One approach to model consumption parametrically is to use a latent Gaussian model (Allcroft2007, Chatterjee2008). Rather than introducing a parameter to account for non-consumption events, the model uses an underlying multivariate Gaussian distribution such that the part of the distribution below a defined threshold corresponds to zero consumption.
◦ Unit Weights: The total amount consumed (in kg food/day) needs to be converted into the number of items consumed so we can account for the effect of unit variation in residue levels on intake.
◦ Recipes: Dietary consumption surveys record data on food items ‘as eaten’ whereas dietary risk assessment models are based on residue levels on raw agricultural commodities (RACs). Therefore, consumption data from surveys need to be converted to (units of) RACs. This conversion consists of two steps: a) identify which ingredients are used and b) for each ingredient, convert the amount (e.g. flour, tomato puree) to a RAC (e.g. wheat, tomatoes) using standard recipes (e.g. a pizza contains 17% wheat and 8% tomatoes, etc.).
◦ Body Weight: Information on body weight comes from the consumption surveys. To account for the dependency of consumption and body weight, both quantities are often sampled together.
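The recipe and unit-weight conversions described above can be illustrated with a short sketch. It collapses the two recipe steps (food to ingredients, ingredients to RACs) into a single lookup for brevity. The pizza fractions are the example from the text; the recipe table, the unit weight and the function names are hypothetical.

```python
# Share of each raw agricultural commodity (RAC) per kg of food as eaten.
# Only the pizza fractions come from the text; a real recipe database is far larger.
RECIPES = {"pizza": {"wheat": 0.17, "tomatoes": 0.08}}

def to_rac(as_eaten_kg):
    """Convert {food: kg as eaten} into {RAC: kg}; unlisted foods are treated as RACs."""
    totals = {}
    for food, kg in as_eaten_kg.items():
        for rac, fraction in RECIPES.get(food, {food: 1.0}).items():
            totals[rac] = totals.get(rac, 0.0) + kg * fraction
    return totals

def split_into_units(amount_kg, unit_weight_kg):
    """Split a consumed amount into whole units plus one partial unit, if any."""
    portions = []
    remaining = amount_kg
    while remaining > 1e-12:
        portion = min(unit_weight_kg, remaining)
        portions.append(portion)
        remaining -= portion
    return portions

# One hypothetical person-day: 0.40 kg of pizza and 0.10 kg of raw tomatoes.
rac_kg = to_rac({"pizza": 0.40, "tomatoes": 0.10})
tomato_portions = split_into_units(rac_kg["tomatoes"], unit_weight_kg=0.10)
print(rac_kg)
print(tomato_portions)
```

Each portion returned by `split_into_units` can then be paired with its own sampled unit residue, which is how unit variation enters the intake calculation.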
• Model Characteristics
◦ Population: Dietary exposure assessments may focus on the whole population or on various subgroups of the population. The latter could refer to only those individuals who consume the commodity in question, vulnerable groups (e.g. children, pregnant women, etc.) or groups that are expected to have higher exposures from other routes (e.g. operators, workers, etc.).
◦ Monte Carlo: Monte Carlo approaches are often used to obtain population intake distributions by sampling from the consumption and residue level distributions.
◦ Uncertainty: Typically uncertainty in consumption and residue data is quantified using bootstrap or parametric approaches (EFSA, 2012). Uncertainty for other factors (e.g. processing factors) is generally not quantified with the exception of the variability factor.
◦ Model Output: Probabilistic dietary risk assessment methods will result in an intake distribution. If a probabilistic intake assessment replaced the deterministic IESTI equations, the outcome would be a probability that the ARfD is exceeded (with a confidence or credibility statement).
In this section we have discussed the data and models available for the pesticide registration process. In the next section we will discuss issues with both.
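As an illustration of the uncertainty and model output items listed above, the following minimal sketch attaches a bootstrap interval to the probability that the ARfD is exceeded. It assumes a simple percentile bootstrap in which both the person-day records and the residue data are resampled with replacement before each Monte Carlo run; the function names, data and ARfD are hypothetical, and this is not the specific procedure set out in EFSA (2012).

```python
import random

def exceedance_probability(person_days, residues, arfd, n_iter, rng):
    """Inner Monte Carlo loop: fraction of modelled person-days above the ARfD."""
    exceed = 0
    for _ in range(n_iter):
        consumption_kg, body_weight_kg = rng.choice(person_days)
        intake = consumption_kg * rng.choice(residues) / body_weight_kg
        if intake > arfd:
            exceed += 1
    return exceed / n_iter

def bootstrap_exceedance(person_days, residues, arfd, n_boot, n_iter, rng):
    """Outer bootstrap loop: approximate 95% percentile interval for the probability."""
    estimates = []
    for _ in range(n_boot):
        # Resample both data sets with replacement, keeping their original sizes.
        pd_star = rng.choices(person_days, k=len(person_days))
        res_star = rng.choices(residues, k=len(residues))
        estimates.append(exceedance_probability(pd_star, res_star, arfd, n_iter, rng))
    estimates.sort()
    lower = estimates[int(0.025 * (n_boot - 1))]
    upper = estimates[int(0.975 * (n_boot - 1))]
    return lower, upper

# Hypothetical inputs, for illustration only.
person_days = [(0.15, 62.0), (0.00, 70.5), (0.30, 18.2), (0.08, 81.0)]
residues = [0.02, 0.05, 0.11, 0.40]
rng = random.Random(2024)
low, high = bootstrap_exceedance(person_days, residues, 0.002, 200, 2000, rng)
print(f"P(intake > ARfD): 95% bootstrap interval [{low:.3f}, {high:.3f}]")
```

The nested loops separate variability (the inner Monte Carlo run) from uncertainty about the input data (the outer bootstrap), so the result is a probability accompanied by an interval rather than a single number.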