Decentralised planning offers many advantages over centralised planning approaches, particularly in outdoor environments where there is less permanent and reliable infrastructure. Most importantly, decentralised planning avoids having a single point of failure, and the robots should continue to behave reasonably even if communication is temporarily interrupted. Also, distributing the computing efforts over multiple nodes can increase the computational resources available to the team. Often, faster decisions can be made since the relevant computing is performed primarily onboard the robot that executes each decision.

However, there are many challenges to performing decentralised planning. The algorithms are inherently parallel, typically without time synchronisation, which can make the behaviour unpredictable. Each processing node (robot) has less information about the world compared to centralised systems; this includes knowledge about the state of the environment, the state of the other robots, and the intentions of the other robots. Ideally, the robots should still make reasonable decisions even if communication is disrupted. We review existing approaches to addressing these challenges below, beginning with relatively simple planning problems, then leading to active perception problems.

Swarm robotics

The swarm robotics literature (Brambilla et al., 2013) considers systems of very large teams of robots where each robot has extremely limited knowledge of the world, such as only knowing the proximity to neighbouring robots. Additionally, communication is typically very limited, such as no explicit communication at all, or only being able to send small packets between local neighbours. Due to this limited information, the decision making performed by each robot is relatively simple. The focus is therefore on designing simple local decisions that result in emergent collective behaviours. This allows the team to complete tasks such as controlling local densities of robots (Demir et al., 2015), perimeter following (Caccavale and Schwager, 2017) and manipulation (Culbertson and Schwager, 2018). Unfortunately, this simple decision making and limited sensing is not enough to solve richer perception tasks. We believe it is much more appropriate to solve most active perception problems with smaller teams of robots that have increased onboard sensing, computation and communication capabilities.

Distributed task assignment

Task assignment problems (Munkres, 1957) involve assigning a set of tasks to a set of agents, such that each task is assigned to one agent, and each agent is assigned one task. Each task-agent pair has an associated assignment cost, and the aim is to minimise the sum of these costs. More formally, this problem involves finding a minimum-weight matching in a bipartite graph. The Hungarian algorithm solves this problem in a centralised manner in polynomial time (Munkres, 1957). Several decentralised algorithms have been proposed, such as the distributed Hungarian algorithm (Chopra et al., 2017) and local task swaps (Liu et al., 2015). Auction-based methods consider generalisations where each agent can be assigned more than one task, or each task can be assigned to more than one agent (Dias et al., 2006).
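To make the one-to-one formulation concrete, the following is a minimal sketch that solves a small instance by exhaustive enumeration. It is not the Hungarian algorithm, which reaches the same optimum in polynomial time; the function name `min_cost_assignment` and the cost values are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def min_cost_assignment(cost):
    """Brute-force one-to-one assignment (illustrative only, O(n!)).

    cost[i][j] is the cost of agent i performing task j.
    Returns (assignment, total), where assignment[i] is the task given to agent i.
    """
    n = len(cost)
    best_total, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # Sum of costs when agent i is assigned task perm[i].
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_total, best_perm = total, perm
    return best_perm, best_total


# Hypothetical 3-agent, 3-task cost matrix.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total = min_cost_assignment(cost)
print(assignment, total)
```

Note that the objective is a plain sum of per-pair costs, so the contribution of each pairing is independent of all the others.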

The main benefit of this formulation, and of the associated approaches, is that solutions are relatively easy to compute. However, the formulation is typically not expressive enough for active perception tasks since rewards and/or costs are not additive. Also, these approaches are myopic, particularly in the one-to-one case. Nevertheless, they have been used as a sub-routine of more sophisticated methods, e.g., for target tracking problems (Xu et al., 2013).

Non-myopic planning

Decentralised myopic methods with performance guarantees have been proposed for monotone submodular problems (Hollinger et al., 2009; Patten et al., 2013; Garg and Ayanian, 2014; Kemna et al., 2017). However, the benefits of non-myopic planning are equally applicable to the multi-robot case as they are to the single-robot case (discussed in Section 2.2.2). Existing decentralised planning algorithms for multi-robot informative path planning typically involve exploiting problem-specific characteristics. The auction-based methods mentioned above (Dias et al., 2006) involve each robot negotiating over which tasks it will perform, and are more appropriate for coverage and exploration problems (Zlot et al., 2002). Stranders et al. (2009) combine max-sum message passing (discussed further in Section 2.3.6) with branch and bound pruning (discussed further in Section 2.3.3) to find sequences of viewpoints that minimise the entropy of a Gaussian process. Hollinger et al. (2009) propose solving a finite-horizon POMDP (discussed in Section 2.3.1) for target search problems. Corah and Michael (2017) propose a distributed sequential greedy assignment algorithm for multi-robot exploration, and provide performance guarantees by exploiting a submodularity assumption. Gan et al. (2014) solve search problems with inter-agent collision avoidance by solving a constraint optimisation problem and refining trajectories to avoid collisions. Atanasov et al. (2015) propose a decentralised algorithm for tracking targets that have linear Gaussian dynamics, such as for active SLAM (discussed in Section 2.1.1).
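As a rough illustration of greedy assignment under a submodular objective, the sketch below has each robot in turn choose the candidate view with the largest marginal gain given the choices already made. This is a plain centralised, sequential version written for exposition; it is not the distributed algorithm of Corah and Michael (2017). The set-coverage objective, the function names and the candidate views are all assumptions.

```python
def coverage(selected_views):
    """Monotone submodular objective: number of distinct cells observed."""
    covered = set()
    for view in selected_views:
        covered |= view
    return len(covered)


def sequential_greedy(candidates_per_robot):
    """Robots choose one view each, in order, maximising marginal gain."""
    chosen = []
    for options in candidates_per_robot:
        base = coverage(chosen)
        # Marginal gain of a view is the extra coverage it adds to earlier choices.
        best = max(options, key=lambda view: coverage(chosen + [view]) - base)
        chosen.append(best)
    return chosen


# Hypothetical candidate views (sets of observed cell indices) for three robots.
candidates = [
    [frozenset({1, 2, 3}), frozenset({3, 4})],
    [frozenset({1, 2}), frozenset({4, 5})],
    [frozenset({2, 3}), frozenset({5, 6, 7})],
]
plan = sequential_greedy(candidates)
print(plan, coverage(plan))
```

Because each robot only needs the choices of the robots before it, the information that must be shared grows with position in the ordering, which is one reason distributed variants of this scheme are of interest.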

Our proposed algorithm in Chapter 3 is applicable to a general class of problems, which includes all problems mentioned above, since it does not rely on specific assumptions about the problem; however, our approach can readily incorporate problem-specific approximate solutions, such as those above, as heuristics to guide the search. The algorithm proposed in Chapter 6 is also a decentralised, non-myopic planning algorithm; however, in this case we make several assumptions about the problem in order to formulate an efficient solution for this specific case.

Communication considerations

The majority of decentralised coordination algorithms involve communicating each robot’s plan to other robots. This communicated information is used to ensure the team’s objectives are being met. One advantage of decentralised planning is that reasonable behaviour should still be exhibited if the communication breaks down. The role of communication has been considered in various ways, which we discuss as follows.

Several planners have been demonstrated to have a graceful degradation of performance as communication becomes less reliable. The max-sum algorithm (discussed further in Section 2.3.6) demonstrates this property, which is explained as being due to message redundancy (Farinelli et al., 2008). Otte and Correll (2013) demonstrate this property for a distributed RRT algorithm that solves coordinated path planning with collision avoidance. Otte et al. (2017) compare the performance of distributed auction algorithms for task allocation in harsh communication environments. The Dec-MCTS algorithm we propose in Chapter 3 is also demonstrated to have this useful property.

It is also possible to take an active approach to exploit this communication redundancy by explicitly selecting which messages to transmit. In most cases, “communication planning” has been performed where the messages are observations. The value of these messages can be measured by considering their effect on data fusion accuracy (Williamson et al., 2008; Kassir et al., 2015). Planning to communicate plans, rather than observations, is less common; Unhelkar and Shah (2016) address this problem for Dec-POMDP formulations by defining communication value as the reduction in reward caused by not communicating. We recently proposed a new approach to this problem that maintains a probabilistic belief over the future plans, and then measures the information value as the uncertainty of the reward distributions; we summarise our approach in the context of Dec-MCTS in Section 3.7.

An alternative, and complementary, approach to improve communication is to actively position the robots in order to improve the communication channel. This problem is known as communication-aware motion planning (CAMP), where communication objectives, such as maximising network throughput, are formulated as secondary objectives when planning the motion of the robots. It is not surprising that robot motion can be exploited to improve communication since communication quality is spatially varying, due to, e.g., the well-studied effects of path attenuation, or the more complex issue of multi-path fading (Lindhé, 2012). Examples of CAMP problems include selecting robot paths to maintain connectivity (Sabattini et al., 2013), to achieve periodic connectivity (Hollinger and Singh, 2012), or to communicate with a fixed base station (Ghaffarkhah and Mostofi, 2011; Lindhé and Johansson, 2013). In Chapter 5 and Chapter 6 we consider a new problem of this type where robots choose to stop and communicate at times and locations that have a high predicted probability of communication success.
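The idea of choosing a stopping location by predicted communication success can be sketched with a simple channel model. The example below assumes a log-distance path-loss model with log-normal shadowing, which is a common textbook model and not the model developed in Chapter 5; all parameter values, the function name `success_probability` and the candidate locations are hypothetical.

```python
import math


def success_probability(distance_m, tx_power_dbm=20.0, loss_at_1m_db=40.0,
                        path_loss_exponent=3.0, shadowing_std_db=6.0,
                        threshold_dbm=-80.0):
    """Probability that received power exceeds a threshold (assumed model).

    Mean received power follows a log-distance path-loss law; shadowing is
    modelled as zero-mean Gaussian noise in dB.
    """
    d = max(distance_m, 1.0)  # clamp to the 1 m reference distance
    mean_rx_dbm = tx_power_dbm - loss_at_1m_db - 10.0 * path_loss_exponent * math.log10(d)
    z = (mean_rx_dbm - threshold_dbm) / shadowing_std_db
    # Standard normal CDF evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical base station position and candidate stopping locations (metres).
base = (0.0, 0.0)
candidates = [(30.0, 40.0), (60.0, 80.0), (10.0, 0.0)]


def distance_to_base(p):
    return math.hypot(p[0] - base[0], p[1] - base[1])


best_stop = max(candidates, key=lambda p: success_probability(distance_to_base(p)))
print(best_stop, success_probability(distance_to_base(best_stop)))
```

In a planner, this probability would typically be traded off against the travel cost of reaching each location rather than maximised on its own.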

A difficulty in applying CAMP approaches in practice is that they assume a known model that maps pairs of spatial locations to communication quality. In general, learning this model with sufficient accuracy is an unsolved challenge. For specific tasks, sufficient models may be learnt in indoor environments (Banfi et al., 2017), or by using local measurements of multi-path fading in complex environments (Lindhé and Johansson, 2013). We propose new models suitable for mission monitoring in Chapter 5, which combine communication models with probabilistic trajectory prediction models (introduced in Section 2.1.2).