When collecting data for life cycle inventories, practitioners may use a value from a single data source (e.g. the mean of a sample of values taken from a single unit process), or a value representing a number of values taken from multiple sources (e.g. the mean of the means of individual samples taken from a variety of examples of the unit process). In SimaPro, an uncertainty value can be added to inputs and outputs using a built-in function based on the pedigree matrix detailed in Table 4.4. This allows the user to select a basic uncertainty value and the appropriate quality level for each data quality indicator. This is useful when the quantifying value of an input, for which the uncertainty is being calculated, is representative of a single unit process (a single data source). However, its use can be problematic when the value being entered is representative of a variety of data sources.
In this study, when literature values representing unit processes were used to build the life cycle inventories, efforts were made to collect data from a number of literature sources. This decision is based on the assumption that using a greater number of data sources can improve the accuracy of the quantifying process values or, in other words, reduce the associated uncertainty. This seems an intuitive approach. For example, if LCI data are being collected to represent the operation of a process at a national level, data from one example of that process within the country will not be as representative as data taken from all examples of the process (assuming there is sufficient variation nationwide). However, if a collection of means, each from a different literature source, is horizontally averaged to produce a mean value to be used as the quantifying value for a process input, this averaging will itself introduce further uncertainty in the form of spread.
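To make the notion of spread concrete, the following minimal Python sketch computes the geometric mean and the geometric standard deviation of a set of source means. The three values (8, 5 and 4) are the hypothetical ones shown in Figure 4.3. The function name is illustrative only, and the n − 1 denominator is taken from the form of equation 4.9 as an assumption of this sketch.

```python
import math

def geometric_spread(means):
    """Geometric standard deviation of positive source means (n - 1 denominator)."""
    logs = [math.log(m) for m in means]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    sum_sq = sum((v - mean_log) ** 2 for v in logs)
    return math.exp(math.sqrt(sum_sq / (len(logs) - 1)))

# Hypothetical source means, as entered in the Figure 4.3 example
source_means = [8.0, 5.0, 4.0]

geo_mean = math.exp(sum(math.log(m) for m in source_means) / len(source_means))
spread = geometric_spread(source_means)

print(f"geometric mean = {geo_mean:.4f}, spread = {spread:.4f}")
# prints: geometric mean = 5.4288, spread = 1.4245
```

Even though each source might report its own mean precisely, the disagreement between the three means produces a spread of roughly 1.42, which agrees with the x̅g and σgs values shown in Figure 4.3.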
Because the horizontal averaging of values introduces further uncertainty, applying the pedigree matrix to the mean value that this averaging produces is problematic. In this study, uncertainty estimates for the horizontal averaging of data from different sources were therefore generated following the method developed by Henriksson et al. (2014). The approach and calculation steps used are described in detail below, as the use of this method is a significant and somewhat novel component of this study, intended to improve the quality of life cycle assessments of aquaculture production systems.
Indicator and score: description
Reliability
1: Verified data based on measurements
2: Verified data partly based on assumptions OR non-verified data based on measurements
3: Non-verified data partly based on qualified estimates
4: Qualified estimate (e.g. by industrial expert); data derived from theoretical information (stoichiometry, enthalpy, etc.)
5: Non-qualified estimate
Completeness
1: Representative data from all sites relevant for the market considered over an adequate period to even out normal fluctuations
2: Representative data from >50% of the sites relevant for the market considered, over an adequate period to even out normal fluctuations
3: Representative data from only some sites (<<50%) relevant for the market considered or >50% of sites but from shorter periods
4: Representative data from only one site relevant for the market considered OR some sites but from shorter periods
5: Representativeness unknown or data from a small number of sites and from shorter periods
Temporal correlation
1: Less than 3 years of difference to the time period of the dataset
2: Less than 6 years of difference to the time period of the dataset
3: Less than 10 years of difference to the time period of the dataset
4: Less than 15 years of difference to the time period of the dataset
5: Age of data unknown or more than 15 years of difference to the time period of the dataset
Geographical correlation
1: Data from area under study
2: Average data from larger area in which the area under study is included
3: Data from smaller area than area under study, or from similar area
4: Data from area with slightly similar production conditions
5: Data from unknown OR distinctly different area
Further technological correlation
1: Data from enterprises, processes and materials under study
2: Data from processes and materials under study (i.e. identical technology) but from different enterprises
3: Data on related processes or materials but same technology, OR data from processes and materials under study but from different technology
4: Data on related processes or materials but different technology, OR data on laboratory scale processes and same technology
5: Data on related processes or materials but on laboratory scale of different technology
Indicator score Uncertainty factor
1.1 1 1.05 1.2 1.5 2 1.2 1.5 1 1.001 1 1.05 Further technological correlation 1 1.05 1.1 1.2 1.5 1 1.02 1.05 1.1 Reliability Completeness Temporal correlation Geographical correlation 1.2 1 1.03 1.1
The method developed by Henriksson et al. (2014) uses a weighting procedure that produces a weighted average of values that each come from a different data source. The method also generates an overall dispersion value (σo) that describes the uncertainty associated with the weighted mean. The mean is weighted using a weighting factor, calculated from values for representativeness (σr) and inherent uncertainty (σu). The value for representativeness is obtained from the uncertainty factors of the pedigree matrix, which is applied to each of the data points that are to be averaged. The inherent uncertainty of each data point is the standard deviation of the mean value reported by the individual data source. Applying the pedigree matrix to each data point is intended to avoid the problems, described above, of applying it to a single value obtained from the horizontal averaging of data values from different sources. The pedigree matrix used will depend upon the intended application, and the supplementary material of Henriksson et al. (2014) provides spreadsheets for calculating arithmetic weighted means and uncertainty values using the pedigree matrices described by Frischknecht et al. (2005) and Weidema et al. (2013). However, as the calculation of uncertainty in SimaPro uses a pedigree matrix blending features of both Frischknecht et al. (2005) and Weidema (2016), and as the majority of quantifying values for each input within a process are assumed to be from a lognormal distribution
Input Name / Unit: (entered by the user)

                                 Source 1      Source 2      Source 3
x̅gi                              8             5             4
σgu                              1.05          1.05          1.05
Reliability                      1 (1.00)      4 (1.20)      1 (1.00)
Completeness                     5 (1.20)      1 (1.00)      2 (1.02)
Temporal correlation             1 (1.00)      1 (1.00)      1 (1.00)
Geographical correlation         1 (1.00)      1 (1.00)      1 (1.00)
Further technical correlation    1 (1.00)      1 (1.00)      1 (1.00)
σgr                              1.095445115   1.095445115   1.009950494
σgu+r                            1.108930645   1.108930645   1.05104478
wi                               93.53865259   93.53865259   403.4672146
wi ln x̅i                         194.50816     150.5446538   559.3243245
[ln(x̅gi/x̅g)]²                    0.15          0.01          0.09

Outputs: x̅g(wt) = 4.62478197; x̅g = 5.428835233; σgu = 1.05; σgr = 1.009950494; σgs = 1.424504764; (σgo)² = 2.043408085
Figure 4.3. Spreadsheet built for the horizontal averaging of data points, producing a weighted geometric
mean and overall uncertainty value suitable to be used alongside the ecoinvent V3. database within SimaPro 8 software, based upon the protocol described by Henriksson et al. (2014). In this example, values from 3 hypothetical sources have been entered. The empty white cells are those into which the user enters a value.
Values (x̅gi) are entered from each separate source, each describing a particular common input or output of an
activity. The geometric standard deviation (σgu) of each of these values is entered in the cells below. Below
each of these cells are a further 5 white cells, each of which is a drop-down list, allowing the user to select the
correct qualitative level for the specific value. For example, an x̅gi value of 8 has a σgu of 1.05 and is assigned levels 1, 5, 1, 1 and 1 for the quality indicators reliability, completeness, temporal correlation, geographical correlation and technical correlation, respectively. The weighted mean (x̅g(wt)) of the three x̅gi values 8, 5 and 4, calculated using their respective standard deviations and quality scores, is an output of the spreadsheet, and in
(unless otherwise apparent), in this study the protocol developed by Henriksson et al. (2014) was modified to calculate the weighted geometric mean using the pedigree matrix as it features in SimaPro. This was used to build a spreadsheet for the horizontal averaging of data points to produce a weighted geometric mean and overall uncertainty value suitable to be used alongside the ecoinvent v3. database within SimaPro 8 software (Figure 4.3). The calculation procedure is described below. For the calculation of weighted means using the pedigrees of Frischknecht et al. (2005) and Weidema et al. (2016), as well as example calculations, descriptions of the equations and background information on the method's development, please refer to Henriksson et al. (2014).
Each mean value (x̅gi) from the literature sources was assumed to follow a lognormal distribution unless otherwise stated. The values of inherent uncertainty for each data point were entered as geometric standard deviations (σgu). When this value was not supplied by the data source, or the source contained insufficient data to calculate it, a default value was used. The default values were those provided as default basic uncertainty values within SimaPro, the most common of which is 1.05. Representativeness (σgr) was calculated from the sum of the squared logarithms of the uncertainty factors (Eq. 4.6), with the uncertainty factors being those provided by the pedigree matrix available in SimaPro. The weighting factor (w) was then calculated using equation 4.7. For each data point from each data source, values of σgu, σgr and w were generated. The weighted mean was then calculated using equation 4.8.

\[ \sigma_{gr} = \sqrt{\exp\left(\sqrt{[\ln(U_1)]^2 + [\ln(U_2)]^2 + [\ln(U_3)]^2 + [\ln(U_4)]^2 + [\ln(U_5)]^2 + [\ln(U_6)]^2}\right)} \qquad \text{(Eq. 4.6)} \]

\[ w = \frac{1}{[\ln(\sigma_{u+r})]^2} \qquad \text{(Eq. 4.7)} \]

\[ \bar{x}_{(wt)} = \frac{1}{\sum_i w_i} \sum_i w_i \bar{x}_i \qquad \text{(Eq. 4.8)} \]
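As a hedged illustration of equations 4.6 to 4.8, the Python sketch below recomputes the weights and the weighted mean for the three hypothetical sources shown in Figure 4.3. Several points are assumptions of this sketch rather than statements taken from the protocol. First, representativeness is taken as the square root of the exponentiated term, a reading inferred from the σgr values in the figure. Second, the weight is taken as the inverse of [ln σgu]² + [ln σgr]², which reproduces the wi values in the figure. Third, the weighted mean is taken in log space and back-transformed, following the figure's wi ln x̅i row. The function names are illustrative only.

```python
import math

def representativeness(factors):
    """Geometric SD for representativeness from pedigree uncertainty factors
    (assumed reading of Eq. 4.6: square root of exp(sqrt(sum of squared logs)))."""
    total = sum(math.log(u) ** 2 for u in factors)
    return math.sqrt(math.exp(math.sqrt(total)))

def weight(sigma_gu, sigma_gr):
    """Inverse of the combined squared log-dispersion of one data point (assumed form)."""
    return 1.0 / (math.log(sigma_gu) ** 2 + math.log(sigma_gr) ** 2)

def weighted_geometric_mean(means, weights):
    """Weighted mean computed on log-transformed values, then back-transformed."""
    numerator = sum(w * math.log(x) for x, w in zip(means, weights))
    return math.exp(numerator / sum(weights))

# Hypothetical sources from Figure 4.3: (mean, sigma_gu, pedigree uncertainty factors)
sources = [
    (8.0, 1.05, [1.00, 1.20, 1.00, 1.00, 1.00]),  # completeness level 5
    (5.0, 1.05, [1.20, 1.00, 1.00, 1.00, 1.00]),  # reliability level 4
    (4.0, 1.05, [1.00, 1.02, 1.00, 1.00, 1.00]),  # completeness level 2
]

means = [mean for mean, _, _ in sources]
sigma_gr = [representativeness(factors) for _, _, factors in sources]
weights = [weight(s_gu, s_gr) for (_, s_gu, _), s_gr in zip(sources, sigma_gr)]
x_wt = weighted_geometric_mean(means, weights)

print([f"{w:.2f}" for w in weights])  # ['93.54', '93.54', '403.47']
print(f"{x_wt:.4f}")                  # 4.6248
```

Under these assumptions the third source, which has the best completeness score, receives more than four times the weight of the other two and pulls the weighted mean (about 4.62) below the unweighted geometric mean (about 5.43).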
Overall dispersion was calculated using equation 4.10, the output of which is the square of the geometric standard deviation (σgo), the input value for uncertainty in SimaPro when lognormal distributions are assumed. As recommended by Henriksson et al. (2014), the lowest reported inherent uncertainty and representativeness were used in the calculation of overall dispersion, as their calculation for each data source contributes towards the weighting factors.

\[ (\sigma_{go})^2 = \exp\left(\sqrt{[\ln(\sigma_{gu}^2)]^2 + [\ln(\sigma_{gs}^2)]^2 + [\ln(\sigma_{gr}^2)]^2}\right) \qquad \text{(Eq. 4.10)} \]

Spread of the data values was calculated as the geometric standard deviation of the entered data points using equation 4.9,

\[ \sigma_{gs} = \exp\left(\sqrt{\frac{\sum_i [\ln(\bar{x}_{gi}/\bar{x}_g)]^2}{n - 1}}\right) \qquad \text{(Eq. 4.9)} \]

where x̅g is the