7. Development of the proposal
7.2. Building a digital herbarium
We now analyze the predictive performance of our model, that is, how well it predicts the disturbance statistics of the test set. Given the control signal and visual features at each point in the test set, we compute a predicted disturbance mean and variance. In Fig. 4.8, we depict the mean and 2σ envelope of our algorithm's prediction overlaid on the raw disturbance signal from our test set. Note that our algorithm faithfully predicts the statistics of the disturbance signal over a test set that was not used for training. While this picture does hint at the advantages of pursuing statistical estimation of model disturbance (e.g., the 2σ bound does a good job of capturing the rapid variation present in the raw disturbance data), a more compelling result emerges when one examines the full-state uncertainty propagation.
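One simple way to quantify how well a predicted envelope captures the raw signal is to count the fraction of test samples that fall inside it. The following is a minimal sketch under stated assumptions: the function name `envelope_coverage` and the toy numbers are hypothetical and are not taken from the experiments. For a well-calibrated Gaussian prediction, roughly 95% of samples would be expected inside the ±2σ band.

```python
import math


def envelope_coverage(observed, pred_mean, pred_var, n_sigma=2.0):
    """Return the fraction of samples inside mean +/- n_sigma * sqrt(var)."""
    if not (len(observed) == len(pred_mean) == len(pred_var)):
        raise ValueError("all sequences must have the same length")
    if not observed:
        raise ValueError("at least one sample is required")
    inside = 0
    for y, mu, var in zip(observed, pred_mean, pred_var):
        half_width = n_sigma * math.sqrt(var)
        if abs(y - mu) <= half_width:
            inside += 1
    return inside / len(observed)


# Hypothetical toy values, chosen only to exercise the function.
observed = [0.10, -0.30, 0.50, 1.20]
pred_mean = [0.00, 0.00, 0.40, 0.20]
pred_var = [0.04, 0.01, 0.09, 0.16]
coverage = envelope_coverage(observed, pred_mean, pred_var)
```

Here two of the four toy samples lie inside their envelopes, so `coverage` is 0.5; on real data the same computation would be run over the full test set.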
If we integrate a time series of controls $u_k$ according to (4.7), with $s(x_k, u_k)$ given by (4.19) and $\hat{g}(a_{k-1})$ given by our online dictionary model, we can predict not only the expected vehicle trajectory but also its full-state covariance. Figure 4.7 depicts example predicted trajectories from our test set. To illustrate the systematic errors that a control algorithm incurs when model disturbance is ignored, we compare (1) our initial prediction, before any training data has been ingested, and (2) the final prediction, after our trained model has converged.
Figure 4.8: The statistics of the disturbance prediction across the test set are visualized in blue, with a solid line for the mean predicted disturbance and a shaded envelope depicting ±2σ; the true disturbance is given by the green line.
In particular, the green dashed line denotes the actual driven path, the blue filled ellipses show the prediction based on our dictionary learning algorithm, and the red path and ellipses depict the average model. Examining the final prediction in these figures, it is clear that our learned model using visual perception successfully predicts significant terrain-dependent disturbance in both the grass and pavement settings. We do note that while the average model does a poor job of predicting the mean disturbance statistics, it does adopt a variance that admits the actually driven path.
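The joint propagation of the expected trajectory and its covariance can be illustrated with a linear stand-in model. This is only a sketch under stated assumptions: equations (4.7) and (4.19) are not reproduced in this section, so the matrices `A` and `B`, the function `propagate`, and all numerical values below are hypothetical. The per-step disturbance means and covariances play the role of the learned predictions, and the disturbances are assumed independent across time steps.

```python
def mat_mul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def mat_add(A, B):
    """Add two matrices of equal shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def propagate(mean, cov, controls, dist_means, dist_covs, A, B):
    """Propagate x_{k+1} = A x_k + B u_k + g_k with g_k ~ N(mu_k, Q_k).

    Mean:       m_{k+1} = A m_k + B u_k + mu_k
    Covariance: P_{k+1} = A P_k A^T + Q_k
    """
    n = len(mean)
    trajectory = [(mean, cov)]
    for u, mu, Q in zip(controls, dist_means, dist_covs):
        mean = [sum(A[i][j] * mean[j] for j in range(n)) + B[i] * u + mu[i]
                for i in range(n)]
        cov = mat_add(mat_mul(mat_mul(A, cov), transpose(A)), Q)
        trajectory.append((mean, cov))
    return trajectory


# Hypothetical position/velocity model with unit time step.
A = [[1.0, 1.0], [0.0, 1.0]]
B = [0.0, 1.0]
controls = [1.0, 1.0]
dist_means = [[0.0, -0.5], [0.0, -0.5]]        # e.g. terrain drag on velocity
dist_covs = [[[0.0, 0.0], [0.0, 0.25]]] * 2    # uncertainty enters via velocity
traj = propagate([0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]],
                 controls, dist_means, dist_covs, A, B)
final_mean, final_cov = traj[-1]
```

With a zero disturbance mean the same controls would predict a final velocity of 2.0 rather than 1.0, which is the kind of systematic error an uninformed model carries forward; the covariance grows at each step as the disturbance uncertainty accumulates.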
In this chapter we used an online dictionary learning technique to generate predictions of the model disturbance distribution given real-world experiential data. We are able to generate these predictions from motion control signals and low-level visual features, thus
bypassing the need for explicit terrain classification. Further, these predictions are an
important input to modern planning and control systems because they enable robust control decisions. Given the promising results shown here, the flexibility of the model with respect to input features, and the applicability of this technique to real-time operation, we are confident our approach can be adapted to field robotic systems. The success of this approach on a truly challenging learning problem suggests that dictionary methods may hold promise for collaborative multi-agent learning, which we investigate in the next chapter.
Chapter 5
Dictionary Learning in Multi-Agent Systems
In Chapters 2 and 3, we observed that GLM-based methods, while provably stable, lack competitive accuracy. On the other hand, dictionary learning methods define non-convex optimization problems for which stability guarantees are elusive, but they tend to perform well in practice (Chapter 4). Motivated by these observations, in this chapter we attempt to extend supervised dictionary methods to multi-agent settings with streaming data. We exploit the Lagrange duality tools that work well for the GLM setting, together with the dictionary learning techniques developed in the previous chapter. Ultimately, we would like to develop tools that allow multi-agent systems to achieve both stability and good inference performance from streaming data.
One salient motivation for this problem formulation is designing robotic teams which exhibit real-time adaptivity. More specifically, consider a team of mobile robots deployed throughout an unknown environment and charged with some high-level task such as exploration or navigation. Critical to the success of this team is the ability of each platform to learn from and adapt to its new surroundings. While individual platforms can make and learn from their own local observations, it would be far more efficient if the team could perform these tasks jointly using its communication network. In self-supervised learning-for-control tasks, for example, the team as a whole can more quickly cover the entire observation space, thereby providing individual platforms access to observations that they may not be able to experience locally. Additionally, the quality of inferences based on information aggregated over the entire network is likely to be superior to that of inferences based on local observations, as suggested by the classical theory of consistency of statistical estimators.
Thus, we extend online discriminative dictionary learning [12] to networked settings, where a team of agents seeks to learn a common dictionary and model parameters based upon streaming data. We consider tools from stochastic approximation [161] and its decentralized extensions, which have incorporated ideas from distributed optimization such as weighted averaging [77, 134, 157], dual reformulations where each agent ascends in the dual domain [78, 154], and primal-dual methods [9, 93, 134].
Our main technical contribution is the formulation of the online multi-agent discriminative dictionary learning problem as a distributed stochastic program and the development of a block variant of the primal-dual algorithm proposed in [9, 93] (Section 5.2). Moreover, we establish that under an attenuating step-size regime, an asymptotic stationary solution is attained in expectation (Section 5.3). In Section 5.4, we analyze the proposed framework's empirical performance on a texture classification problem based upon image data for a variety of network settings and demonstrate its capacity to solve a new class of collaborative multi-class classification problems in decentralized settings. In Section 5.5, we consider the algorithm's use in a mobile robotic team for navigability assessment of unknown environments.
5.1 Task-Driven Dictionaries for Multi-Agent Systems
We want to solve (4.12) in distributed settings where signal and observation pairs are independently observed by the agents of a network. Agents aim to learn a dictionary and model parameters that are common to all agents while having access to local information only. In particular, associated with each agent $i$ is a local random variable and associated output variable $(x_i, y_i)$, and each agent's goal is to learn over the aggregate training domain $\{(x_i, y_i)\}_{i=1}^{V}$. As in Chapters 2 and 3, let $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ be a symmetric and connected network with node set $\mathcal{V} = \{1, \ldots, V\}$ and $E = |\mathcal{E}|$ directed edges of the form $e = (i, j)$, and further define the neighborhood of $i$ as the set of nodes $n_i := \{j : (i, j) \in \mathcal{E}\}$ that share an edge with $i$. When each of the $V$ agents observes a pair $(x_i, y_i)$, the loss function in (4.12) can be written as a sum of local losses,
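\[
\ell(D, w; (x, y)) = \sum_{i=1}^{V} \ell_i(D_i, w_i; (x_i, y_i)), \tag{5.1}
\]
where we have defined the vertically concatenated dictionary $D := [D_1; \ldots; D_V] \in \mathbb{R}^{Vm \times k}$, with each $D_i \in \mathbb{R}^{m \times k}$, and model parameter $w := [w_1; \ldots; w_V] \in \mathbb{R}^{Vk}$.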
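The network definitions above map onto a small data structure. The sketch below is illustrative only: the helper names (`neighborhoods`, `is_symmetric`, `is_connected`) and the four-node line graph are assumptions, not part of the original formulation. It builds each neighborhood $n_i$ from a list of directed edges and checks the two assumptions placed on the network, symmetry and connectivity.

```python
from collections import deque


def neighborhoods(num_nodes, edges):
    """Return n_i = {j : (i, j) in E} for every node i in 1..V."""
    nbrs = {i: set() for i in range(1, num_nodes + 1)}
    for i, j in edges:
        nbrs[i].add(j)
    return nbrs


def is_symmetric(nbrs):
    """True if (i, j) in E implies (j, i) in E."""
    return all(i in nbrs[j] for i in nbrs for j in nbrs[i])


def is_connected(nbrs):
    """Breadth-first search from one node; True if every node is reached."""
    start = next(iter(nbrs))
    seen = {start}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        for j in nbrs[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(nbrs)


# Hypothetical line graph 1 - 2 - 3 - 4, stored as directed edges both ways.
edges = [(1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3)]
nbrs = neighborhoods(4, edges)
num_directed_edges = len(edges)  # E = |E| in the notation above
```

In this example agent 2 has neighborhood {1, 3}, so it would exchange information with agents 1 and 3 only.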
Substituting (5.1) into the objective in (4.12) yields a problem in which the agents learn dictionaries and classifiers that depend on their local observations only. The problem to be formulated here is one in which the agents learn common dictionaries and models. Since the network $\mathcal{G}$ is assumed to be connected, this relationship can be attained by imposing the constraints $D_i = D_j$ and $w_i = w_j$ for all pairs of neighboring nodes. With these constraints, we obtain the distributed stochastic program
\[
\{D_i^*, w_i^*\}_{i=1}^{V} := \operatorname*{argmin}_{D_i \in \mathcal{D},\, w_i \in \mathcal{W}} \sum_{i=1}^{V} \mathbb{E}_{y_i, x_i}\big[\ell_i(D_i, w_i; (x_i, y_i))\big] \quad \text{s.t.} \quad D_i = D_j,\; w_i = w_j,\; j \in n_i. \tag{5.2}
\]
When the agreement constraints in (5.2) are satisfied, the objective is equivalent to one in which all the observations are made at a central location and a single dictionary and model are learnt. Thus, (5.2) corresponds to a problem in which each agent $i$, having only observed the local pairs $(x_i, y_i)$, aims to learn a dictionary representation and model parameters that are optimal when information is aggregated globally over the network. The decentralized discriminative learning problem is to develop an iterative algorithm that relies on communication with neighbors only, so that agent $i$ learns the optimal (common) dictionary $D_i^* = D_j^*$ and discriminative model $w_i^* = w_j^*$. In the following section we present an algorithm that is shown in Section 5.3 to converge to a local optimum of (5.2).
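The role of the agreement constraints can be seen on a toy convex problem. The sketch below is not the algorithm of Section 5.2: it replaces the dictionary learning loss with an assumed scalar quadratic $\ell_i(w_i) = \tfrac{1}{2}(w_i - c_i)^2$ and applies a plain deterministic primal-dual (saddle point) iteration to the Lagrangian $\sum_i \ell_i(w_i) + \sum_{(i,j) \in \mathcal{E}} \lambda_{ij}(w_i - w_j)$. The function name, targets, step size, and iteration count are all hypothetical. Each agent uses only its own target and quantities tied to its neighbors, yet all agents reach the centralized minimizer, which for this loss is the mean of the $c_i$.

```python
def consensus_primal_dual(targets, nbrs, step=0.05, iters=2000):
    """Toy saddle-point iteration for min sum_i 0.5*(w_i - c_i)^2
    subject to w_i = w_j for all j in n_i (symmetric graph assumed)."""
    nodes = sorted(nbrs)
    w = {i: 0.0 for i in nodes}
    lam = {(i, j): 0.0 for i in nodes for j in nbrs[i]}
    for _ in range(iters):
        new_w = {}
        for i in nodes:
            # Gradient of the Lagrangian with respect to w_i.
            grad = w[i] - targets[i]
            grad += sum(lam[(i, j)] - lam[(j, i)] for j in nbrs[i])
            new_w[i] = w[i] - step * grad       # primal descent
        for (i, j) in lam:
            lam[(i, j)] += step * (w[i] - w[j])  # dual ascent, old w
        w = new_w
    return w


# Hypothetical three-agent line graph 1 - 2 - 3 with local targets c_i.
nbrs = {1: {2}, 2: {1, 3}, 3: {2}}
targets = {1: 1.0, 2: 2.0, 3: 6.0}
w = consensus_primal_dual(targets, nbrs)
```

All three entries of `w` approach 3.0, the average of the local targets, even though agents 1 and 3 never communicate directly. The non-convex, stochastic setting of (5.2) is considerably harder, which is why only convergence to a stationary point or local optimum is claimed there.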
Remark 4 Decentralized learning techniques may be applied to pattern recognition tasks in networks of autonomous robots operating in real time, provided that realizations of the output variables are generated by a process which is internal to the individual platforms. In particular, consider the formulation in (5.2), and let $y_i$ represent the difference between state information associated with a commanded trajectory and that which is observed by on-board sensors of robot $i$. Most robots are equipped with sensors such as gyroscopes, accelerometers, and inertial measurement units, which make this self-supervisory information available. In this case, the interconnected network of robots does not need external supervision or a human in the loop in order to perform discriminative learning. In Section 5.5 we propose solving problems of the form (5.2) in a network of interconnected robots operating in a field setting by tracking the difference between measurements made via inertial measurement units (IMUs) and movements which are controlled by a joystick.
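The self-supervised labels described in this remark amount to a per-timestep residual. The following sketch assumes the sign convention commanded minus measured and a hypothetical two-component state (for example, forward and angular velocity); the function name and values are illustrative and do not describe the experimental pipeline of Section 5.5.

```python
def self_supervised_labels(commanded, measured):
    """Return y_k = commanded_k - measured_k for each time step k."""
    if len(commanded) != len(measured):
        raise ValueError("sequences must have the same length")
    return [[c - m for c, m in zip(cmd, meas)]
            for cmd, meas in zip(commanded, measured)]


# Hypothetical joystick commands and IMU-derived measurements.
commanded = [[1.0, 0.5], [1.0, 0.0]]
measured = [[0.75, 0.5], [0.5, 0.25]]
labels = self_supervised_labels(commanded, measured)
```

Because both inputs are available on board, each robot can generate its own training targets without external annotation.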