“Ninety percent of life […] is just being there; and we have indeed charted lots of ways in which the facts of embodiment and environmental location bear substantial weight in explaining our adaptive success. […] [But] we should not be too quick to reject the more traditional explanatory apparatuses of computation and representation. Minds may be essentially embodied and embedded and still depend crucially on brains which compute and represent.” (Clark, 1997a: 143; text in brackets added)
Although Andy Clark acknowledges that most of our lives is “just being there” (that is, that our lives primarily unfold through embodied, practical and smooth engagement with the world), he claims that our “being there” has to be explained by means of two classical concepts of orthodox cognitive science: computation and representation.
His core idea can be put in this way: our cognitive practices are embodied and embedded and, precisely in virtue of these features, we should claim that cognition unfolds relying on representations that account for embodiment and embeddedness. Those kinds of representations are called “action-oriented representations”.
According to Clark, Action-Oriented Representations (AORs) radically differ from classic “chunky”, explicit, symbolic representations because, instead of
i) being representations whose key contents are tokenable strings of symbols, operated upon by a ‘read/write/copy’ architecture (Clark, Toribio 1994: 403),
and instead of
ii) re-presenting (i.e. mirroring, or accurately describing) properties of the world,
AORs simultaneously describe aspects of the world and prescribe possible actions. They are neither passive, crystallized pictures of properties of the external reality, nor pure control structures. On the contrary, they are poised between these two “cognitive slices” (Clark 1997a: 49). Moreover, they are described as “local”, “action-dependent”, and as reflecting “the profound role of bodily motion […] in shaping and simplifying the information-processing problems to be solved” (Clark 1997a: 149).
To better understand what Clark means by action-oriented representations, it is worth starting by asking what the prescriptive and descriptive functions of AORs are. To this end, it is useful to look at one of the sources that Clark draws on in his discussion (Clark 1997a: 50; 238, footnote 5): Ruth Millikan’s Pushmi-Pullyu Representations.
Millikan describes Pushmi-Pullyu Representations (PPRs) as representations that, although they are not the sum of a descriptive function and a directive function, face both ways at once. To make this point clearer, I take a closer look at the meaning of “descriptive” and “directive” by considering the example that opens Millikan’s article Pushmi-Pullyu Representations (Millikan 1995: 185-186).
Consider a list of groceries. It can be used for two distinct purposes: as a shopping list, which tells us what to buy, or as an inventory list, which tells us what has been bought. Used in the first way, the list has a directive function: it tells us what to do in order to accomplish a task. Moreover, it makes a normative claim (in the sense explained in §I.7) about the world, according to which the world is supposed to conform to the representation: if the shopping list does not match what the grocery bag contains, it is what is in the bag that is at fault. If, on the contrary, the grocery list is used as an inventory list, the normative relation between the two elements is inverted: the representation is supposed to conform to the world, so that if the list does not match what the grocery bag contains, it is the list that is at fault. The grocery list used as an inventory list, then, describes what is the case in the world.
PPRs have both dimensions: they tell us what to do by saying what is the case.
Another example of a PPR is the following (Millikan 1995: 190-192): the food call of a hen to its brood, whose function is to make the chicks come to the place where the food is. What is the structure of this very primitive representation? The call is evidently directive. It says something like “come here and eat”. On the other hand, the representation contains in itself the condition for the successful performance of the task (coming there and eating food). Indeed, when it directs action, it simultaneously says “here’s food now”. It gives information about the spatial location of the food by describing a state of affairs. What is interesting is that such a representation does not merely mirror external reality: by saying what is in the world, it affords a specific kind of action. It connects directly with action because its nature is action-oriented. This goal-directedness makes the representation vary as a direct function of a certain variation in the environment, “directly translating the shape of the environment into the shape of a certain kind of conforming action” (Millikan 1995: 190).
According to Millikan, pushmi-pullyu representations are also interesting conceptual tools for explaining human cognition. For instance, intentions could be considered PPRs. Millikan considers intentions to be internal states, that is, internal representations that have a directive nature (viz. they cause a certain kind of behavior). Nevertheless, she claims that explaining them only as directive is misleading. Indeed, it is commonly accepted that a person cannot actually intend to do something without believing that she will do it. If, then, intending implies having a belief that P, these kinds of representation also have a descriptive nature: the representational content of the belief describes what will be done.
Perceptual representations are another example of PPRs. They are mental representations that map variations of the agent’s perceived world by encoding those variations as possible perceptual actions: they map “variations in goals directly onto the represented future world” (Millikan 1995: 192). That is to say, perceptual pushmi-pullyu representations function as proxies of future perceptual situations: they represent how the world will look when the agent acts upon it.
This last example of pushmi-pullyu representation is particularly interesting for the purposes of this paragraph. Indeed, it seems to me that what Clark calls “Action-Oriented Representation” is precisely what Millikan calls “pushmi-pullyu perceptual representation”. To understand this point, it is worth having a look at the first example of Action-Oriented Representation that Clark gives in his book Being there: Maja Mataric’s work on mobots.
At the MIT Artificial Intelligence Laboratory, Mataric and colleagues worked on a project in AI aimed at designing a neurobiologically-flexible spatial representational model, which scientists implemented and tested on a physical autonomous mobile robot.52
Mataric’s mobot works in this way (Clark 1997a: 47 and ff.). It uses a set of quasi-independent layers, each of which constitutes a processing route from inputs to outputs. Each layer works on a specific part of the environment. One generates boundary tracing (the walls the robot follows while it avoids obstacles); another detects landmarks, registered as a combination of the robot’s motion and its sensory inputs; and a third layer uses this information to produce a map of the environment. Those representations constitute what is called a cognitive map. The map is made of a network of landmarks, each of which is a combination of motor and sensory readings. The nodes of the map process information in parallel: an active node excites the nodes in the area nearby, generating expectations of the next landmarks that will be encountered in the map.
The basic idea of this model is that the cognitive maps the robot makes use of are made of nodes that combine descriptive information about the robot’s own movement with perceptual information: this makes the map work as a controller for the robot’s action. Moreover, by mapping the relation between information about the robot’s movement and perceptual information about the environment through the propagation of signals among nodes, the cognitive map generates plans for real movement, making the robot able to react to real-time environmental conditions.
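The structure just described can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a reconstruction of Mataric’s implementation: the names Landmark, CognitiveMap, expected_next and plan are hypothetical, the landmark labels are invented, and the propagation of signals among nodes is approximated by a simple breadth-first spread from the goal node. What the sketch is meant to show is the double role of a node: it describes (a sensory reading paired with the robot’s own motion) and, through its links, it directs (which landmark to expect and move towards next).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Landmark:
    """A map node: a sensory reading paired with the motion that produced it."""
    name: str     # hypothetical label for the landmark
    sensed: str   # what the sensors reported, e.g. "left wall"
    heading: str  # how the robot was moving at the time, e.g. "north"


@dataclass
class CognitiveMap:
    """A network of landmarks (illustrative, not Mataric's actual code)."""
    links: dict = field(default_factory=dict)  # Landmark -> set of neighbours

    def connect(self, a, b):
        # Landmarks met one after the other become neighbouring nodes.
        self.links.setdefault(a, set()).add(b)
        self.links.setdefault(b, set()).add(a)

    def expected_next(self, current):
        # An active node "excites" its neighbours: these are the landmarks
        # the robot expects to encounter next.
        return self.links.get(current, set())

    def plan(self, start, goal):
        # Spread a signal outwards from the goal, one hop at a time, and
        # record how many hops it needs to reach each node.
        hops = {goal: 0}
        frontier = deque([goal])
        while frontier:
            node = frontier.popleft()
            for neighbour in self.links.get(node, ()):
                if neighbour not in hops:
                    hops[neighbour] = hops[node] + 1
                    frontier.append(neighbour)
        if start not in hops:
            return []  # the signal never reached the start: no known route
        # Follow the signal back: always step to a neighbour closer to the goal.
        route = [start]
        while route[-1] != goal:
            here = route[-1]
            route.append(min(self.links[here],
                             key=lambda n: hops.get(n, float("inf"))))
        return route


# A toy corridor with four landmarks met in sequence.
a = Landmark("A", "left wall", "north")
b = Landmark("B", "corridor", "north")
c = Landmark("C", "right wall", "east")
d = Landmark("D", "corridor", "east")
cmap = CognitiveMap()
for first, second in [(a, b), (b, c), (c, d)]:
    cmap.connect(first, second)

route = cmap.plan(a, d)
```

In this toy corridor, the node B, once active, yields A and C as expected neighbours, and the plan from A to D is simply the sequence of landmarks to traverse. The same data structure thus serves as a description of the environment and as a controller, which is the feature of the model that matters for the discussion of AORs.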
If I understand Clark’s description of Mataric’s robot correctly, it is possible to claim that the robot’s cognitive maps are very similar to what Millikan describes with the expression “perceptual Pushmi-Pullyu Representations”. Cognitive maps have this function: by generating plans for real movements, they internally simulate53 what should be done by the robot in the environmental space, making it ready to cope with changes (e.g. angles, obstacles, and so on) in its real-time perceptual space. The idea, then, is that the robot successfully copes with its environment because it internally represents what it can do during its engagement with the real world, encoding perceptual signals as possibilities or “orders” for action.
Now, action-oriented representations in human cognition, according to Clark, are really close to Mataric’s cognitive maps. They are internal personalized representational states, conceived as neural encodings, which map idiosyncratic, locally effective features to guide behavior (Clark 1997a: 151).
53 It is worth noting that Clark does not use the concept of simulation here. Nevertheless, it seems to me that “simulation” is a concept that fits the general theoretical frame in which Clark discusses representations. For example, in the article he wrote with Grush in 1999 (Clark, Grush, “Towards a Cognitive Robotics”, Adaptive Behavior, 7(1), pp. 5-16), which foreshadows many of the ideas developed by Clark in more recent publications, the example given in order to defend a form of minimal robust representationalism is that of the emulator. An emulator is a mechanism (circuitry, software routine, and so on) that takes as its inputs information about the starting (or current) state of a system and about the control commands issued, and then gives as its output a prediction of the next state of the system (Clark, Grush 1999: 4). This prediction sounds like a simulation of the state in which the system will be when it acts according to the motor commands issued. This interpretation of the simulative power of those representations is not meant to suggest that emulator theories and simulation theories collapse into one another. As Grush explains (Grush 2004), simulation theory and emulation theory differ because the former usually claims that motor commands are just simulated, whereas the latter claims that those commands are executed. Nevertheless, it seems to me that in Clark’s explanation there is a simulative aspect: the representation is a proxy of states that action in the environment will provide to the system. This interpretation of the simulative aspect of those representations can also be supported by considering Clark’s more recent work on Predictive Processing. For instance, in Clark 2013b, he talks about action planning and action selection in terms of simulation (“simulations that allow us to explore possible course of future action”), and he mentions Clark, Grush 1999 in this discussion (Clark 2013b: 1-2).
Clark considers them to be the most evolutionarily and developmentally basic kinds of representations (Clark 1997a: 152) because they seem to be at the core of humans’ fundamental and primordial cognitive activity: reacting selectively to environmental stimuli, which are complex and unruly (Clark, Toribio 1994: 419). This allows the agent to display the right kind of behavior in a given situation.
AORs are defined as:
a) action-specific because they are tailored to the production of the specific behavior required (they are the mental antecedent or a simulation of an action performed on-line, in the real world);
b) egocentric, because they encode features of the environment in a way that accounts for the robot’s history of sensorimotor experiences, namely features of the environment are represented as intertwined with memories of bodily motions;
c) intrinsically context-dependent, because context is “woven into the representation-using mechanism’s basic operating principles” (Wheeler 2010a: 326). This is to say that AORs co-vary with external states, explaining how something inside the agent is about something outside the agent (Chemero 2009: 50).
AORs, then, seem to be “neurally located” perceptual pushmi-pullyu representations. They encode perceptions as motor commands tailored to selected features of the environment: they are said to guide action because they represent parts of the environment as cognitive maps endowed with a conative power54. Indeed, like Millikan’s representations, those internal descriptions of selected parts of the perceptual array have a “you must do” nature, and they are endowed with this feature because of their semiotic structure. They are indexical or deictic representations55. They are entities that relate specifically to the agent, and they have a functional value because they play a specific role in the activity the agent is engaged in. They are not objective, they are not tokens of a symbolic type, but rather point to different situated objects (adapted from Agre 1997: 243). By referring to those specific objects as they are perceived by the agent, they make her ready to react, to perceptually engage with those objects without performing complex operations of subsumption of the singular represented item under the concept that explains it. AORs are perceptual in the sense that they represent this or that individual object, with its perceptual features, and specific sensorimotor commands, which are constrained by the situation in which the represented object is embedded and by the history of the agent’s previous situated experiences.
54 This is a liberal use of the expression “conative aspect of pragmatic representations”, found in Nanay 2013: 20.
55 This is my development of Clark’s short explanation of AORs, motivated by the connection he identifies between AORs, Millikan’s pushmi-pullyu representations and Agre’s deictic representations (Clark 1997: 152).
That is why those internal representations are said to be embodied and embedded56: “they stand for what’s happening to me, right here, right now” (Chemero 1998). They are embodied because they represent perceptual stimuli as tied to the agent’s sensorimotor experience, and they are embedded because the prescriptive representation of motor commands also encodes information about the local, specific environment, where by the expression “local environment” I refer to the agent’s peripersonal space. Moreover, they are active: the content of those representations activates a disposition towards embodied actions.
At a general level, it seems possible to claim that the reason why Clark talks about mental representations as embedded, embodied and action-specific is the one taken into account at the beginning of my discussion of second-wave cognitive sciences, in §I.4: the symbol grounding problem. The argument for “grounded representations” was the following: in order to account for cognition as a situated and active process, scientists should build i) action, ii) the body, and iii) the specificity of the context into representational mechanisms. In doing so, according to Clark’s approach to cognition, representations should also be consistent with an extended explanation of the mind, precisely because mental representational activities are conceived as dependent on parts of the cognitive machinery that are not embedded within the boundaries of the skull.
More specifically, the way Clark thinks of action-oriented representations seems to be an attempt to get rid of any highly intellectualistic and abstract conception of cognition, a project that characterizes his philosophical production.57
Nevertheless, despite this attempt to get rid of the intellectualism of classical cognitive sciences, Clark’s endorsement of AORs seems to hide a degree of conservatism, the reasons for which are, to me, not always clear.
Indeed, looking at his book Being there, one finds that AORs are introduced in Clark’s explanation as a critique of Gibson’s theory of direct perception. On this point Clark says:
56 See Rupert 2009: 200 for the expression “embedded representations”.
57
“A related view of internal representation was pioneered by the psychologist James Gibson (1950,1968,1979). This work made the mistake, however, of seeming to attack the notion of complex mediating inner states tout court.
Despite this rhetorical slip, Gibsonian approaches are most engagingly seen only as opposing the encoding or mirroring view of internal representation. Gibson's claim, thus sanitized, was that perception is not generally mediated by action-neutral, detailed inner-world models. It is not mediated by inner states which themselves require further inspection or computational effort (by some other inner agency) in order to yield appropriate actions. This is not, then, to deny the existence and the importance of mediating inner states altogether” (Clark 1997a: 50).
If I understand Clark correctly, it seems that AORs should be considered pervasive epistemic posits, in the sense that any perceptual episode should be said to be guided by those internal structures.
Nonetheless, it is not actually clear to me why, according to Clark, we need action-oriented representations to account for the relation between action and perception. Indeed, Clark only says that the attack on the notion of internal representation tout court is just a rhetorical slip, and later he says that the anti-representationalism of what he calls “Radical Embodied Cognition” (REC) is unwarranted and counterproductive, because “it invites competition where progress demands cooperation” (Clark 1997a: 149). Later, to justify the claim that anti-representationalism is unwarranted, he states that REC rejects representations tout court because:
i) it sticks to a narrow concept of mental representation (i.e. amodal, chunky, symbolic and explicit), without considering the possibility of explaining cognition as a continuum of representational degrees (see also Clark, Toribio 1994);
ii) it is concerned with cognitive phenomena that are not representation-hungry enough (Clark 1997a: 149), namely cases that involve simply physically present and simply specifiable parameters” (Clark, Toribio 1994: 412), and infers from this that cognition, in general, does not unfold through internal representations.
As for the first point, this claim sounds true only if we restrict our attention to Being There. Clark was one of the first scholars to introduce the term “action-oriented representation”. At that time, the debate about AORs was not widely developed, so the anti-representationalist approaches to cognition that Clark refers to in 1997 actually reacted to classical views of representations, and not to embodied-embedded representations. Obviously, as the debate went on, objections directed specifically against the concept of AORs were raised. I will consider these objections in the following paragraphs.
The second point, namely the idea that REC considers only cases that are not representation-hungry enough and on that basis endorses an anti-representationalist approach to cognition, is trickier, and it needs to be taken into account more seriously.
As previously said, in Being there, Clark develops the concept of action-oriented representation to reassess in a representational fashion Gibson’s idea that perception is an activity that takes place through the relation between an embodied and moving agent and her environment. Hence, as I said before, AORs seem to be an occasion to state that perceptual experiences, that is, cognitive processes usually considered low-level cognition, always unfold relying on AORs. The on-line, real-time, and active cognitive phenomenon of perception requires the mediation of internal representations in order to be explained. Indeed, according to Clark, the idea that perception is direct (i.e. non-representational, non-inferential, not internally regulated) is misleading: claiming that perception is direct is just a rhetorical device aimed at dismissing the idea that perception lies in the realm of passivity.