LIFE TRACHEMYS PROJECT (2011-2013)
V.5. ECOLOGICAL CONNECTIVITY ZONE (ZE)
A Bayesian Belief Network (BBN) is essentially a set of variables, represented as a network of nodes that are linked by probabilities, that affect some outcome(s) of interest (Marcot, 2006). A BBN can be represented in the form of a network diagram, known as a directed acyclic graph (DAG), to provide a visual representation of the components and dependencies of a domain (Newton, 2009a).
In a Bayesian network diagram, variables (known as nodes), data and parameters are represented by different shapes (such as ellipses and rectangles), which are connected by arrows (known as directed links) to indicate conditional dependencies (Newton, 2009a). A link between two nodes, from node A (a 'parent' node of B) to node B (a 'child' node of A), indicates that A and B are functionally related, or that A and B are statistically correlated (Newton, 2009a). Variables without parents are known as input nodes. Directed links are representations of conditional dependence and represent influence rather than causality, as there is no requirement that links represent causal impact (Lauritzen and Spiegelhalter, 1988; Jensen and Nielsen, 2007; Newton, 2009a). The graph links have directions, but no directed cycles (closed loops) are permitted (Newton, 2009a). Propagations can be made through the links in either direction of the network, enabling the model to be explored in reverse: in contrast to most other modelling approaches, the BBN can be used to infer the most likely set of conditions for a given outcome (McCann et al., 2006).
Each child node (i.e. a node linked to one or more parents) contains a conditional probability table (CPT) which gives the conditional probability for the node being in a specific state given the configuration of the states of its parent nodes (Newton, 2009a). When networks are compiled, Bayes' theorem is applied according to the values in the CPT, so that changes in the probability distribution for the states at node A are reflected in changes in the probability distribution for the states at node B (Jensen and Nielsen, 2007; Newton, 2009a). A BBN can be explored by changing the states of the nodes (or variables) incorporated within the model (Newton, 2009a) and when the state of a node is known, that variable is said to be instantiated (Jensen and Nielsen, 2007). Once a node has been instantiated, this will then influence the probabilities associated with the states of other nodes to which it is linked, according to the values in the CPTs (Newton, 2009a).
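The propagation described above can be illustrated with a minimal sketch of a two-node network computed by direct enumeration. The node names (Habitat, Species), their states and all probabilities are hypothetical and chosen only for illustration; they are not taken from any of the cited studies, and dedicated BBN software uses more general algorithms than this.

```python
# Hypothetical two-node network: Habitat (parent) -> Species (child).
# All probabilities below are invented for illustration only.

# Prior distribution of the parent (input) node.
p_habitat = {"good": 0.3, "poor": 0.7}

# CPT of the child node: P(Species | Habitat).
cpt_species = {
    "good": {"present": 0.8, "absent": 0.2},
    "poor": {"present": 0.1, "absent": 0.9},
}


def marginal_species(p_parent, cpt):
    """Forward propagation: P(s) = sum over h of P(s | h) * P(h)."""
    child_states = cpt[next(iter(cpt))]
    return {s: sum(cpt[h][s] * p_parent[h] for h in p_parent) for s in child_states}


def posterior_habitat(p_parent, cpt, observed):
    """Backward propagation via Bayes' theorem: P(h | Species = observed)."""
    joint = {h: cpt[h][observed] * p_parent[h] for h in p_parent}
    total = sum(joint.values())
    return {h: v / total for h, v in joint.items()}


# No evidence entered: marginal distribution of the child node.
prior_species = marginal_species(p_habitat, cpt_species)

# Instantiate the parent (habitat known to be good) and propagate forward.
species_given_good = marginal_species({"good": 1.0, "poor": 0.0}, cpt_species)

# Instantiate the child (species observed present) and reason in reverse.
habitat_given_present = posterior_habitat(p_habitat, cpt_species, "present")
```

With these invented numbers, the probability that the species is present is about 0.31 before any evidence is entered, and observing the species raises the probability of good habitat from 0.3 to roughly 0.77.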
There is general agreement that it is advantageous to have a relatively simple network structure with the minimum number of nodes (preferably three or fewer parent nodes) and no more states per node (preferably five or fewer) than are necessary (Marcot et al., 2006). This keeps the associated conditional-probability table (CPT) small enough to be tractable and understandable (Marcot et al., 2006). In addition, Marcot et al. (2006) suggest that the depth of the model – the number of layers of nodes – should be kept to four or fewer, to reduce propagation of unnecessary uncertainty from input nodes to output nodes and to prevent the sensitivity of the output node to input nodes being swamped and dampened by intermediate nodes.
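The effect of these guidelines on CPT size can be checked with simple arithmetic: a CPT holds one probability per child state for every combination of parent states. The sketch below assumes nothing beyond that counting rule; the function name is hypothetical.

```python
from math import prod


def cpt_size(child_states, parent_states):
    """Number of probabilities in a CPT: one per child state for
    every combination of parent states."""
    return child_states * prod(parent_states)


# A five-state child with three five-state parents (within the guideline).
within_guideline = cpt_size(5, [5, 5, 5])

# The same child with six five-state parents.
beyond_guideline = cpt_size(5, [5] * 6)
```

Here the first table has 625 entries and the second 78,125, which shows how quickly elicitation becomes impractical as parents are added.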
When developing BBNs, it will usually be necessary to discretise variables by defining a number of discrete states for each variable (this reflects the computational difficulties of performing Bayesian inference with continuous variables) (Newton, 2009a). Although algorithms are available for discretising continuous variables (for example, Clarke and Barton (2000)), these are intended for machine learning from (generally extensive) data and not for models based on expert knowledge. Discretisation is not necessarily straightforward, and some degree of interpretation and subjective judgement may be required (Newton, 2009a). Pollino et al. (2007) provide a useful description of how they achieved this, by establishing states, where possible using recognised classifications, management thresholds or guidelines. Where these were not available, sub-ranges were specified with the guidance of experts (Pollino et al., 2007). The number of 'states' or 'classes' assigned to each variable was not pre-determined, but evaluated and assigned on an individual basis. Pollino et al. (2007) note that while discretisation of continuous variables is not desirable, it facilitates the parameterisation process by simplifying expert elicitation, and it acknowledges that understanding of many parameters, and the data available to support such relationships, is often quite rudimentary.
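As a minimal sketch of threshold-based discretisation, the snippet below maps a continuous value to a named state. The variable (water depth in metres), the thresholds and the state labels are hypothetical stand-ins for the recognised classifications or expert-specified sub-ranges described above.

```python
from bisect import bisect_right

# Hypothetical class boundaries (water depth in metres) and state labels.
# n thresholds define n + 1 states.
DEPTH_THRESHOLDS = [0.5, 2.0]
DEPTH_STATES = ["shallow", "moderate", "deep"]


def discretise(value, thresholds, states):
    """Return the state whose sub-range contains value.

    A value exactly equal to a threshold is assigned to the upper class.
    """
    if len(states) != len(thresholds) + 1:
        raise ValueError("need exactly one more state than thresholds")
    return states[bisect_right(thresholds, value)]


depth_classes = [discretise(d, DEPTH_THRESHOLDS, DEPTH_STATES) for d in (0.2, 0.5, 1.0, 3.5)]
```

Where the boundaries fall is itself a modelling decision, which is why the thresholds are kept as explicit, named data rather than buried in the function.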
It is widely recognised that obtaining values to populate the CPTs is one of the main challenges of modelling with BBNs: appropriate values are often difficult to obtain because of a lack of suitable information, yet the entered values have a major influence on the performance of the model (Newton, 2009a). The problem increases with the number of directed links associated with each node (Newton, 2009a). It can also be a particularly difficult task for rare events and when the number of probabilities to be estimated is large (McCann et al., 2006). The sources of information that are used (often in combination) are expert knowledge; observational or experimental evidence or data, either available directly or extracted from the scientific literature; outputs of other empirical, mechanistic or stochastic models; and stakeholder consultation (Newton, 2009a). Different information sources, such as data and expert estimations, can be combined and weighted, which is a key advantage of BBNs (Pollino et al., 2007).
BBNs are able to learn CPT values directly from a data set, although this is rarely possible in investigations relating to conservation management, where available datasets are often limited. CPTs are therefore frequently completed using expert knowledge, but where information is lacking, conditional probability values may be based on very restricted information, something which should be borne in mind when interpreting results (Marcot et al., 2001; Newton et al., 2007). In addition, BBN models can easily be built that reflect personal biases, although peer review can help to prevent this (Marcot et al., 2001).
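One common way to weight an expert estimate against limited data is to treat the expert's probabilities as pseudo-counts (a Dirichlet prior with a chosen equivalent sample size) and add the observed counts. The sketch below illustrates that general technique for a single CPT row; it is not necessarily the procedure used in the studies cited above, and the function name, weight and counts are hypothetical.

```python
def combine_expert_and_data(expert_probs, expert_weight, observed_counts):
    """Estimate one CPT row from an expert estimate plus observed counts.

    expert_probs: expert's probability for each child state (sums to 1)
    expert_weight: equivalent sample size given to the expert estimate
    observed_counts: number of observations of each child state
    """
    n_observed = sum(observed_counts.get(state, 0) for state in expert_probs)
    total = expert_weight + n_observed
    return {
        state: (expert_weight * expert_probs[state] + observed_counts.get(state, 0)) / total
        for state in expert_probs
    }


# Invented example: the expert estimate is worth 10 observations,
# and 10 field records are available.
expert = {"present": 0.6, "absent": 0.4}
counts = {"present": 2, "absent": 8}
combined = combine_expert_and_data(expert, 10, counts)
```

With equal weight on each source, the combined estimate (0.4 present, 0.6 absent) lies midway between the expert's view and the observed frequencies. As more data accumulate the estimate moves towards the data, and with no data it returns the expert's values unchanged.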
The creation of a network diagram, representing the domain of interest (i.e. identification of relevant variables and the relationships between them), is the first stage of BBN development, followed by assignment of states and probabilities to each variable (Newton, 2009a). This can itself be a useful way of eliciting information from experts and structuring the information available, and the visual nature of the network can foster communication between interested parties (Newton, 2009a). This participatory modelling process can also help to document and communicate current understanding, identify key uncertainties or gaps in knowledge, and identify suitable indicators to provide a basis for monitoring and adaptive management (Nyberg et al., 2006; Smith et al., 2007; Newton, 2009a). This is seen as a key advantage of BBNs.
Although BBNs are somewhat similar to decision trees and other decision models, their interactive and graphical representation is a great advantage: it permits more effective communication of cumulative effects and of the outcomes of alternative conditions and decisions than do more static models such as decision trees and other traditional statistical approaches like classification or regression trees (McCann et al., 2006). They are also more readily understood by non-modellers (McCann et al., 2006), which is an important advantage given that they often rely on expert knowledge and may well be used by non-modellers.
The ease with which BBNs can be created and amended is an advantage over other modelling approaches (McCann et al., 2006). Different model structures may be explored and simulations can be run very rapidly at relatively low cost, as extensive computer programming or modelling expertise is not required to develop and update models (Smith et al., 2007). Because BBNs can instantly recalculate and display probabilities of conditions and outcomes as alternative decisions are specified (for example by comparing the probability of different outcomes arising from alternative management decisions), McCann et al. (2006) suggest that they offer a uniquely valuable tool for supporting decision-making. BBNs can also be used to infer the most likely set of causal conditions for a given outcome by solving the model's conditional probabilities backward through the model structure, which is something that many other models, such as decision trees, cannot provide (McCann et al., 2006).
The fact that BBNs use probabilistic, rather than deterministic, expressions to describe the relationships among variables also makes them different from most other environmental modelling approaches and makes them particularly useful in the context of risk assessment and for supporting decision making (Borsuk et al., 2004; Newton, 2009a). This use of BBNs can also be enhanced by incorporating decision nodes and utility nodes to create influence diagrams (related to decision trees) (Newton, 2009a). Decision nodes represent two or more choices or decisions (made by a user of the model) that influence the values of other response nodes and do not have CPTs associated with them (Nyberg et al., 2006; Newton, 2009a). A utility node represents some measure that can be used to assess the success or failure of a decision and is associated with a utility table, which specifies the utility of each configuration of the parents of the utility node (which may be influenced by a decision, through a link with a decision node) (Nyberg et al., 2006; Newton, 2009a). Once parameterised, such a model can be explored to identify the choice in each decision node that minimises the costs or maximises the benefits or values considered (Nyberg et al., 2006).
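A minimal sketch of how such an influence diagram is evaluated: for each option in the decision node, the expected utility is the probability-weighted sum of the utility table entries, and the preferred option is the one with the highest value. The decision options, outcome states, probabilities and utilities below are all hypothetical.

```python
# Hypothetical influence diagram: Decision -> Outcome, with a utility node
# whose parents are Decision and Outcome. All numbers are invented.

# CPT of the outcome node: P(Outcome | Decision).
p_outcome = {
    "no_action": {"recovers": 0.2, "declines": 0.8},
    "restore_habitat": {"recovers": 0.6, "declines": 0.4},
}

# Utility table: one value per configuration of the utility node's parents.
# Restoration carries a cost, so its utilities are lower for the same outcome.
utility = {
    ("no_action", "recovers"): 100,
    ("no_action", "declines"): 0,
    ("restore_habitat", "recovers"): 70,
    ("restore_habitat", "declines"): -30,
}


def expected_utilities(cpt, utility_table):
    """Expected utility of each decision: sum over outcomes of P * U."""
    return {
        decision: sum(p * utility_table[(decision, outcome)] for outcome, p in outcomes.items())
        for decision, outcomes in cpt.items()
    }


eu = expected_utilities(p_outcome, utility)
best_decision = max(eu, key=eu.get)
```

With these invented values the expected utilities are about 20 for no action and 30 for habitat restoration, so restoration is preferred despite its cost.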
BBNs are particularly useful when there are uncertainties in the available information used to construct a model, something typically associated with environmental modelling (Newton, 2009a). For example, Newton (2009c) used a Bayesian Network to produce IUCN Red List (www.redlist.org) classifications for taxa in situations where the input data were uncertain. Newton et al. (2007) suggest that a key feature of BBNs is that the results are presented as probability distributions or relative likelihoods of different outcomes, which provides a highly visual means of representing the uncertainty surrounding the potential outcomes of, for example, conservation management interventions. Smith et al. (2007) also highlight the suitability of BBNs for species habitat modelling in data- and information-poor environments, and for accounting for uncertainty in data and knowledge. These features, as well as the ability of BBNs to combine empirical data with expert judgement, make them an extremely flexible and useful modelling tool (Smith et al., 2007).
However, despite their advantages, BBNs are prone to many of the general limitations common to other modelling approaches, including the difficulty of incorporating all sources of causality, uncertainty and variability in a model without errors and inaccuracies (McCann et al., 2006). In addition, in some environmental domains, the interactions among variables may be highly complex and difficult to quantify (Gu et al., 1996). BBNs, like other decision models, may oversimplify criteria affecting a decision and fail to depict subtle variations of decisions and changing conditions that so often occur in real-world situations (Nyberg et al., 2006).
A further shortcoming of BBNs is that they do not strictly permit feedback functions, either within a node or from response (output) variables back to predictor (input) variables. Feedback can be important in many systems, such as density-dependent survivorship and reproduction in wildlife population models and consumer performance in economic models (Nyberg et al., 2006). BBNs are also poorly suited to examining dynamics over time, although there are some approaches that can be used to overcome this (Newton, 2009a). Another issue is that discretising continuous-variable distributions, as is necessary in most BBNs, might oversimplify state responses (Nyberg et al., 2006). Some of these drawbacks will be more of an issue in certain applications than others, but Nyberg et al. (2006) suggest that BBNs be viewed as decision-aiding tools to help inform and advise the decision-maker who, ultimately, must weigh the ramifications of decisions that can be far more subtle and complex than any model can depict. As with any modelling approach, it is important that BBNs are used with appropriate knowledge of their strengths and weaknesses (Nyberg et al., 2006).