The danger with scaling the multiagent substrate presented thus far is the confusion that may result from conflating coordinate axes that have different meanings. In particular, r(x), which is the internal x-axis of the agent’s ANN, is defined as a periodic function of the global x-axis of the team. Beyond conflating these potentially orthogonal axes, scaling the team also alters the geometric relationships among the nodes on the substrate. Specifically, the horizontal size of each ANN is reduced, and some of the nodes of new agents end up where the nodes of old agents previously existed. Figure 7.2 demonstrates this conflict by
[Figure 7.1 panels: 5 Agent Substrate, scaled to 7 Agent Substrate; axes X and Y, each from -1 to 1]
Figure 7.1: Initial Concept for Heterogeneous Scaling. Because multiagent HyperNEAT represents teams as a pattern of policies, it is possible in principle to interpolate new policies in the policy geometry for additional agents by sampling new points on the substrate. The substrate remains the same size and the additional agents are squeezed horizontally so that the new number of agents fits. Additionally, the r(x) function is altered slightly so that it repeats the correct number of times. However, as explained in Section 7.1.1, this approach to interpolation can still be improved.
showing the boundaries of agents on a team of size five and how those boundaries shift when the team is scaled up to seven agents. The problem is that if the learned policy geometry depends heavily on the global x coordinate, the pattern could be disrupted when the team size changes, because a given global coordinate may no longer fall within the agent on which the pattern was trained.
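The boundary shift described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the divided substrate splits the global interval [-1, 1] evenly among the agents and that r(x) rescales each agent's slice to a local [-1, 1]; the names `agent_index` and `local_x` are hypothetical and do not come from the original implementation.

```python
def agent_index(x, team_size):
    """Index of the agent whose slice of [-1, 1] contains global x.

    Assumes the substrate is divided evenly among the agents.
    """
    width = 2.0 / team_size
    # Clamp so that x == 1.0 falls inside the last slice.
    return min(int((x + 1.0) / width), team_size - 1)


def local_x(x, team_size):
    """Hypothetical r(x): position within the owning agent's slice,
    rescaled to the local range [-1, 1]."""
    width = 2.0 / team_size
    left = -1.0 + agent_index(x, team_size) * width
    return 2.0 * (x - left) / width - 1.0


# Compare which agent owns the same global x at team sizes five and seven.
for x in (-0.5, 0.1, 0.5):
    print(x, agent_index(x, 5), agent_index(x, 7))
```

Under these assumptions, the global coordinate x = 0.1 lies inside the third agent (index 2) of a five-agent team but inside the fourth agent (index 3) of a seven-agent team, so a pattern keyed to global x would be applied to a different agent after scaling.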
Thus, as an alternative to the original approach, axes that define coordinates within the agent’s “head” and within the team at large can avoid conflation by being orthogonal to each other. Because the ANN already has two axes (assuming it is in two-dimensional space), the geometric configuration that captures the appropriate orthogonal structure is
[Figure 7.2 panels: (a) 5 Agent Substrate, (b) 7 Agent Substrate; axes X and Y, each from -1 to 1]
Figure 7.2: Potential Problem with Scaling. The substrate based on the r(x) function has a fixed amount of space that is evenly divided among all the agents that are on the team. When additional agents are added, the agents are compressed horizontally to accommodate the new number, resulting in parts of the space that used to compose one agent now containing parts of other agents. This problem is illustrated by observing the agents’ borders at team sizes five (a) and seven (b). If the pattern of policies is based heavily on the global x coordinate, new policies may not correctly interpolate because of this conflation.
z-axis; r(x) is no longer necessary, and z simply represents the discrete position of the agent on the team. Additionally, because the ANN for each agent exists at a discrete z coordinate, there is no longer a need to compress the substrate coordinates: There is effectively room for an infinite number of agents without conflict. When scaling, agents can therefore simply be added to the stack at discrete points (figure 7.3). This method is called the stacked substrate to differentiate it from the previous substrate, which will be called the divided substrate.
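The contrast between the two substrates can be sketched in a few lines. The sketch assumes that stacked agents sit at consecutive integer z coordinates and that new agents are appended to the end of the stack; the text only states that z is a discrete position, so this placement, like the function names `stacked_z` and `divided_centers`, is an illustrative assumption.

```python
def stacked_z(team_size):
    """Assumed stacked placement: one integer z coordinate per agent."""
    return [float(i) for i in range(team_size)]


def divided_centers(team_size):
    """Center of each agent's slice when [-1, 1] is divided evenly
    (the divided substrate, under the same even-division assumption)."""
    width = 2.0 / team_size
    return [-1.0 + (i + 0.5) * width for i in range(team_size)]


# Scaling from five to seven agents.
print(stacked_z(7)[:5] == stacked_z(5))              # existing z values kept?
print(divided_centers(7)[:5] == divided_centers(5))  # existing centers kept?
```

In the stacked case the first five coordinates are identical before and after scaling, whereas in the divided case every agent's position on the global x-axis changes when the team grows.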
It was previously shown that the divided substrate could be seeded with a strong starter policy (Section 5.3). The stacked substrate also possesses this capability. Recall that a CPPN that represents a single agent takes the four inputs x1, y1, x2, and y2 (figure 5.1a).
[Figure 7.3 panels: 5 Agent Substrate, scaled to 7 Agent Substrate; agents stacked along the Z axis]
Figure 7.3: Stacked Substrate Heterogeneous Scaling. The stacked substrate places the ANN for each agent at a coordinate on the z-axis, effectively making a stack of two-dimensional substrates. It is scaled by inserting new two-dimensional substrate slices along the z-axis. Because each new slice exists at a fixed z coordinate, adding new agents does not affect existing agents. This representation is thus more amenable to scaling than the divided approach (figure 7.2).
substrate is three-dimensional. Therefore, a new z input is added to the network, although with no connections to the existing network (figure 7.4). Only one z input is necessary because each agent's ANN occupies a single point on the z-axis, so the source and target of every queried connection share the same z. In this way, the CPPN can now be queried for the policies of a team of agents. Because the only factor that differentiates the agents, z, is not yet connected to the network, the team is initially homogeneous (as with the initial seeded divided substrate). Once z is connected to the network by mutation, however, the CPPN can create variations of the seed policy based on the policy geometry along z.
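The seeding behavior can be illustrated with a toy model. In the sketch below, `seed_cppn` is a hypothetical stand-in for an evolved single-agent CPPN, and the unconnected z input is modeled as a connection weight of zero that a later mutation sets to a nonzero value. Real CPPNs are evolved networks, so this is only a sketch of the principle under those assumptions, not the thesis implementation.

```python
import math


def seed_cppn(x1, y1, x2, y2):
    """Hypothetical stand-in for an evolved single-agent CPPN."""
    return math.sin(x1 - x2) * math.exp(-((y1 - y2) ** 2))


def team_cppn(x1, y1, x2, y2, z, z_weight=0.0):
    """Seed CPPN with an added z input.

    z_weight=0.0 models a z input that is not yet connected; a nonzero
    value models the connection being added by mutation (an assumption
    made for illustration).
    """
    return seed_cppn(x1, y1, x2, y2) + z_weight * z


# Query the same connection for three agents at z = 0, 1, 2.
query = (0.5, -1.0, -0.5, 0.0)
before = [team_cppn(*query, z) for z in range(3)]
after = [team_cppn(*query, z, z_weight=0.1) for z in range(3)]
print(before)
print(after)
```

With the weight at zero, every agent receives exactly the seed's output for the queried connection, so the team is homogeneous. Once the weight is nonzero, the same query yields a different value at each z, which is how variation along z can arise.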
To demonstrate the benefit of the stacked substrate over the divided substrate, it is necessary to compare the two to ensure that the purported benefits of the stack do not come at the price of representational power or hidden disadvantages. Therefore, both substrates
[Figure 7.4 panels: Evolved Seed (CPPN with inputs X1, Y1, X2, Y2 and output Out), converted to Team Genome (the same CPPN with an additional Z input)]
Figure 7.4: Stacked Substrate Seeding. The stacked substrate can also be seeded with a high-performing single-agent policy. To do so, the original CPPN (left) is given a new z input that determines which agent is being sampled. This method preserves the original seed pattern, but again allows multiagent HyperNEAT to create a pattern of policies relevant to the team’s policy geometry, which varies along z.