1.5.1 Overview
Some of the key theoretical disputes surrounding the so-called spontaneous
perspective taking phenomenon lie in the following areas: task relevance, empirical
inconsistencies, gaze cueing, mental models, mental imagery and rotation, mental
self-rotation, submentalizing, and knowledge attribution. Each will now be outlined,
along with empirical suggestions that could be used to further investigate these
debates.
1.5.2 Task Relevance
Relevance is a concept that influences many aspects of psychology that focus
on information processing (Schamber, Eisenberg & Nilan, 1990). However,
definitions of relevance are limited, as the concept is perceived intuitively
(Saracevic, 1996). For the current work, relevance involves “an interactive,
dynamic establishment of a relation by inference, with intentions toward a
context” (Saracevic, 1996, p. 206). In other words, relevance implies a dynamic
relationship between an input and an output (Cosijn & Ingwersen, 2000).
Consequently, in terms of perspective taking, both the input, such as the
empirical stimuli, and the output, such as the participant response, can be
criticised as lacking in task relevance. For example, relevance is increased
when the experimental task is associated with the alternative perspective and
not replaced by an irrelevant distractor task. Zwickel (2009) assessed this
issue by examining the importance of a human body, as did Frischen et al.
(2009), who investigated action and action cues.
In regard to spontaneous perspective taking, Zwickel and Muller (2010)
examined the impact of task relevance. Participants were required to respond to discs
stimulus. The distraction stimulus was a face with either a fearful or neutral
expression, or a rectangle, and the questions posed after the disc identification either
increased or decreased the relevance of the experimental task. As in other literature
investigating spontaneous perspective taking, congruent and incongruent RTs were
also compared. However, as the primary focus was on task relevance, Zwickel and
Muller (2010) emphasised that the mere presence of a face would produce
differences in RTs, regardless of congruency.
The authors found that perspective taking effects were apparent when a face was
presented with a fearful expression, but not with a neutral expression. This would
indicate that relevance to the task, such as an emotional response, increases the
magnitude of the perspective taking influence. Zwickel and Muller (2010) also
identified that merely observing the actions and action cues of another does not
necessarily result in spontaneous perspective taking.
In the dot perspective paradigm, participants were first presented with a screen
stating which perspective to adopt, ‘YOU’ or ‘HIM/HER’. Consequently, one could
argue that the screen highlights the relevance of the differing perspectives of
the avatar and the participant, thus contributing to the RT differences. In other
words, the screen may have increased the relevance of the task, priming the
participant’s response and therefore reducing the likelihood that an automatic
process occurred. Additionally, if spontaneous perspective taking is solely
dependent upon a degree of relevance, then finding the RT differences associated
with spontaneous perspective taking when a distractor task is present would be
problematic. As a consequence, relevance may not be the only influencing factor of
spontaneous perspective taking.
As spontaneous perspective taking is a relatively new line of investigation,
technicalities such as task relevance are yet to be examined extensively. Zwickel and
Muller (2010) have highlighted the importance that relevance has upon the
phenomenon, but whether a process can be truly ‘spontaneous’, yet dependent upon
definitive factors, is still under examination. Future work in this area should begin to
assess whether relevance is bound to perspective taking in terms of vision alone, or
whether it can impact other perspective taking abilities that are not primarily based in
vision.
1.5.3 Empirical Inconsistencies
Interestingly, some authors have found that the visibility manipulation,
otherwise known as the barrier method, modulates spontaneous perspective taking
when applied to the dot perspective paradigm (Furlanetto et al. 2016; Baker,
Levin, & Saylor, 2016), while others document the opposite (Cole et al. 2015;
2016; Conway et al., 2017). It could be argued that this is because early reports
of a phenomenon commonly focus on generating the effect, as it would be
unreasonable to expect authors to immediately undertake and report all the work
necessary to understand the mechanisms responsible for it, especially when
accounting for the current publication trend of null effects. It is also
understandable, as the visual cognition literature often examines an effect’s
various parameters and ‘boundary conditions’, initially asking questions such as
how long a phenomenon lasts and whether it is automatic, perceptual, attentional,
or the result of a decision process (e.g., inhibition of return, Posner & Cohen,
1984; attentional blink, Raymond, Shapiro, & Arnell, 1992). Thus, the number of
replication publications increases. However, theories that develop an
understanding of these inconsistent results are lacking in
spontaneous perspective taking research. Instead, the field has been dominated by a
long list of similar empirical investigations that, aside from their inherent interest,
have not generated many explanations.
One possible explanation for these inconsistencies within the dot perspective
paradigm resides in reflexive gaze following. Recall that during the dot
perspective paradigm, participants are required to judge the number of dots from
their own egocentric perspective and, on other trials, from the agent’s
allocentric perspective. Within certain experiments (e.g., Samson et al. 2010;
Santiesteban et al. 2014; Nielsen et al. 2015) this occurs within-block, such that
participants are informed at the start of each trial which perspective they should
adopt. Consequently, this procedure could be criticised on the grounds that
participant attention is drawn to the representation of the agent’s perspective
even when participants are not explicitly instructed to adopt it, because they
assume that adopting differing perspectives is an important part of the
experiment. It is worth noting that the effect of top-down knowledge upon
participant attention to features within an experimental set-up, and specifically
within the stimuli presented, has been well established since the findings of
Folk, Remington, and Johnston (1992). Indeed, work on attention has shown how a
stimulus that is nominally task irrelevant can in fact form part of an observer’s
response cue. Most importantly, this type of attentional influence has been shown
to occur in perspective taking paradigms (e.g., Stephenson & Wicklund, 1983). To
reiterate, merely instructing participants to consider their own egocentric
perspective seems to induce consideration of an alternative allocentric
perspective. As a consequence, other authors (e.g., Cole et al. 2015, 2016, 2017;
Conway et al., 2017) did not include the manipulation of forced adopted
perspectives (but recall that they also observed the effect when the agent could
not see). However, the extent to which spontaneous perspective taking depends upon
other processes still needs to be explored further, especially in relation to the
inconsistencies within the dot perspective paradigm.
1.5.4 Gaze Cueing
Another influential paradigm that has been argued to bear upon spontaneous
perspective taking is gaze cueing. Recall that gaze cueing is the finding that
observing another individual’s attention influences the attention of the observer
(Nuku & Bekkering, 2008; Teufel, Alexis, Clayton, & Davis, 2010; Teufel et al.,
2009; Teufel, Fletcher, & Davis, 2010). The majority of literature investigating this
phenomenon presents participants with a face whose gaze is directed to one side of
the display. This gaze shift is accompanied by a target that appears either in the
gazed-at direction (‘Valid’) or on the opposite side of the display (‘Invalid’;
Friesen & Kingstone, 1998; Langton & Bruce, 1999). Differences in RT lead authors
to conclude that seeing a gaze movement triggers the attention of the observer to
shift accordingly: RTs are shorter in Valid conditions and longer in Invalid
conditions. Additionally, it has been suggested that gaze direction can be used to
infer intentions and goals associated with the object that is being attended to
(Calder et al. 2002; Nuku & Bekkering, 2008; Morgan, Freeth, & Smith, 2018). Yet
other authors dispute this claim (Driver et al. 1999; Caron, Butler, & Brooks,
2002; Teufel et al. 2010). Cole et al. (2015) combined the gaze cueing procedure
with an occluding barrier, a manipulation traditionally used in nonhuman animal
attention tasks. The authors found the same validity pattern as other gaze cueing
research, irrespective of the barrier, indicating that gaze cueing is not
(reliably) modulated by mental state attribution in the form of ‘seeing’.
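The validity comparison described above is usually summarised as a cueing effect, that is, mean Invalid RT minus mean Valid RT, computed separately for each barrier condition. The following is a minimal sketch of that arithmetic only. The field names (`validity`, `barrier`, `rt`) and the trial values are hypothetical illustrations and are not data or analysis code from any of the cited studies.

```python
from statistics import mean


def cueing_effect(trials):
    """Mean Invalid RT minus mean Valid RT (ms).

    A positive value means responses were faster on Valid trials.
    """
    valid = [t["rt"] for t in trials if t["validity"] == "valid"]
    invalid = [t["rt"] for t in trials if t["validity"] == "invalid"]
    return mean(invalid) - mean(valid)


def cueing_effect_by_barrier(trials):
    """Compute the cueing effect separately for each barrier condition."""
    conditions = sorted({t["barrier"] for t in trials})
    return {
        c: cueing_effect([t for t in trials if t["barrier"] == c])
        for c in conditions
    }


# Hypothetical trials for illustration only (assumed values, not real data).
example_trials = [
    {"validity": "valid", "barrier": "absent", "rt": 400},
    {"validity": "invalid", "barrier": "absent", "rt": 430},
    {"validity": "valid", "barrier": "present", "rt": 410},
    {"validity": "invalid", "barrier": "present", "rt": 440},
]

effects = cueing_effect_by_barrier(example_trials)
```

In this sketch, a cueing effect of similar size with and without the barrier would correspond to the pattern reported by Cole et al. (2015), whereas an effect confined to the barrier-absent condition would suggest modulation by what the gazer can see. A real analysis would compute the effect per participant and test the validity by barrier interaction inferentially.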
In relation to the so-called spontaneous perspective taking notion, gaze cueing
can be argued to be significantly influential. For example, the findings of Samson
et al. (2010), Teufel et al. (2010), and Gardner et al. (2018) can all be argued to
be affected by gaze cueing. This criticism is supported by the work of Cole and
colleagues (2015; 2016), who were unable to isolate the spontaneous perspective
taking effect to conditions in which the avatar was able to see the target.
Instead, the effect was observed in Valid conditions regardless of the visibility
manipulation. However, this criticism mainly applies to the reflexive gaze
following and dot perspective paradigm methodologies. Conversely, the ambiguous
number paradigm emphasises comprehension, as participants are required to
interpret the ambiguous number, and the role of gaze cueing within it has not yet
been extensively examined. Therefore, future work would benefit from the addition
of occluding barriers in the ambiguous number paradigm, as has previously been
explored in the gaze cueing (Cole et al. 2015) and dot perspective (Cole et al.
2016) methods.
1.5.5 Mental Models
Craik (1943) proposed that humans process information using small-scale
representations known as mental models. Visual stimuli and written descriptions are
two examples of the information that can be used in the formation of these small-scale
representations. The depth of processing required to form these small-scale
representations is one area of investigation that has been popular in the development
of this field. For example, Mani and Johnson-Laird (1982) investigated
the importance of spatial descriptions for the formation of mental models. They
which reflects upon the improved recall. Mani and Johnson-Laird (1982) concluded
that there are two ways of encoding spatial descriptions. Firstly, propositional
representations are relatively easy to process but are harder to recall. Secondly,
mental models are harder to process but are easier to recall. Consequently, the work of
Mani and Johnson-Laird (1982) would suggest that mental models require a greater
depth of processing when being encoded, which improves recall. Craik and
Lockhart (1972) and Johnson-Laird and Bethell-Fox (1978) support this finding.
Once these representations are processed and encoded, they can then be used as a cue
or reminder to formulate judgements about a scene (Tversky, 1981). Applying this
concept to the spontaneous perspective taking notion, it could be argued that
participants may not be assuming the allocentric perspective, as suggested, but instead
developing a mental model of the scene. In other words, the participant is not
transforming their sense of self into the position of the avatar or agent, but is instead
using the avatar or agent, as well as all other available information, to create a mental
model of the scene. This mental model can then be used to form judgements
when the participant is asked questions regarding the scene. In this sense, the
discrepancies in RT may not be due to the assumption of an allocentric
perspective, but instead to the processing of the scene and the mental model
transformations required to generate the necessary judgements. However, this is a
considerable theoretical debate, which would require examination of brain region
activation to support or refute the mental models claim. This debate will now be
extended in relation to mental imagery and rotation in the following section.
1.5.6 Mental Imagery and Rotation
Building upon mental models, mental imagery and rotation is another
mental imagery, the form that mental models take has been heavily disputed. Kosslyn
(1994) claims that mental models are processed using visual representations, which
Dennett (1991) supports. For example, if an individual were asked to think about their
car, Kosslyn would claim that the individual would hold a small-scale image of their
car in their “mind’s eye”. However, Pylyshyn (1973) disagrees with this claim, and
instead suggests that the individual would use descriptions, prior experiences, and
pre-existing knowledge. Thus, Pylyshyn would suggest that when an individual is
required to think about their car, they would simply know what make, model and
colour it is due to pre-existing knowledge, and not because of a small-scale image
held in their mind’s eye. Interestingly, advances in neuroimaging have highlighted
different neural pathways activated by images and by reliance on prior knowledge
(O'Craven & Kanwisher, 2000; Kosslyn & Thompson, 2003), yet the results conflict
and the debate over mental imagery remains.
As previously stated, discrepancies in RT during experimentation on
spontaneous perspective taking could be a result of mental models and the required
transformation of the mental image, and not the assumption of an allocentric
perspective. Shepard and Metzler (1971) support this claim, as they found that RT
increased progressively with the number of mental rotations required for
processing. Just and Carpenter (1976) and Hochberg and Gellman (1977) also
support this claim. Hence, ‘spontaneous perspective taking’ may actually be a
function of the number of mental rotations required to process the mental model, and
not due to the computation of the allocentric perspective. However, in order to assess
this claim, clarification is needed in terms of the impact of mental transformations.
Consequently, future work would benefit from identifying the number of mental
1.5.7 Mental Self-Rotation or Object Rotation
An alternative account that may be able to explain the spontaneous perspective
taking phenomenon is object rotation. This is the ability to mentally rotate an object
in the absence of an allocentric perspective in the form of an agent (Shepard &
Metzler, 1971). Object rotation has been extensively investigated, particularly in
relation to spatial perspective taking (Huttenlocher & Presson, 1973; Levine,
Jankovic & Palij, 1982; Kessler & Thomson, 2010). Kessler and colleagues (e.g.,
Kessler, 2000; Kessler & Thomson, 2010; Kessler & Rutherford, 2010; Kessler &
Wang, 2012) acknowledged the embodied nature of perspective taking and identified
that the deeper the level of processing (e.g., level-2 visual perspective taking and
level-2 type spatial perspective taking), the more cognitively demanding the
process, and therefore the greater the effort required for the embodied process.
However, further clarification is still required regarding the specific aspects of
object rotation and the relatively new strand of literature investigating the
spontaneous perspective taking theory.
In contrast to mental rotation of the self, which the majority of perspective
taking research emphasises in relation to the assumption of the alternative
perspective (e.g., Samson et al. 2010; Baker et al. 2016; Gardner et al. 2018),
object rotation suggests that a different cognitive operation is performed. Instead
of a rotation of the self, in reference to either spatial frames of reference for
spatial perspective taking (Michelon & Zacks, 2006) or embodied line-of-sight
tracing and mental transformation for visual perspective taking (Surtees et al.
2013), object rotation emphasises a centralised rotation of a target object in
isolation (Kessler & Thomson, 2010). Consequently, disparities have arisen when
these processes are compared.
Kozhevnikov et al. (2006) identified that enhanced perspective taking ability
and Hegarty (2001) found that although perspective taking and object rotation
abilities are similar, improved performance of one of these skills was related to a
reduced ability of the other. Additionally, a number of experiments have identified
that mental self-rotation, used within perspective taking, is reportedly less
cognitively demanding (i.e., it is fast and accurate) compared with object rotation
(Keehner et al. 2006; Wraga, Creem & Proffitt, 1999; Wraga et al. 2005; Zacks &
Michelon, 2005). An increase in the angle of rotation required has also been found
to affect mental self-rotation used within perspective taking and object rotation
differently. For perspective taking, processing time remains fairly constant (e.g.,
Graf, 1994; Kozhevnikov & Hegarty, 2001; Keehner et al. 2006; Michelon & Zacks,
2006), whereas for object rotation, RT increases progressively with the angle of
rotation (e.g., Shepard & Metzler, 1971; Graf, 1994; Keehner et al. 2006; Michelon
& Zacks, 2006). It is to these fundamental differences that spontaneous
perspective taking can be applied, as this phenomenon has not yet been extensively
investigated in relation to object rotation.
Firstly, as stated above, enhanced navigational skills have been correlated with
mental self-rotation and perspective taking abilities, but not with object rotation
(Kozhevnikov et al. 2006). Recall that navigational skills have been used to support
spontaneous perspective taking within joint action tasks (e.g., Surtees et al. 2016b).
Thus, this distinction would counter the argument that object rotation, rather than
self-rotation, underlies spontaneous perspective taking. Secondly, the distinction
that an enhanced ability in one rotation process often accompanies a reduced
ability in the other (Kozhevnikov & Hegarty, 2001) cannot be applied without
specific experimentation on both rotation processes within spontaneous perspective
taking research. Thirdly, the distinction that mental self-rotation is less
cognitively demanding, in that it is faster and more accurate than object rotation,
is a key characteristic that directly relates to the spontaneous perspective taking
phenomenon. Currently, spontaneous perspective taking has been found to be rapid
and spontaneous in the assumption of an alternative visual perspective, which
corresponds to the embodied self-rotation account. However, if future work disputes
this claim, object rotation may be identified as one contributing mechanism. One
way that this distinction could be assessed is through additional conditions in
which an ambiguous object replaces the ambiguous number within the ambiguous
number paradigm. Lastly, it has been identified that increasing the angle of
rotation progressively increases object rotation RT, whereas it does not for mental
self-rotation. Hence, this could be one way to resolve the dispute over whether
object rotation influences the so-called spontaneous perspective taking phenomenon.
An experiment could be created, similar to the research carried out by Michelon and
Zacks (2006), in which the required rotation, be that in relation to the embodied
perspective or object rotation, is manipulated alongside the consistency of
perspective between the participant and agent. Thus, if RT increases progressively
with angular disparity, this would indicate that object rotation may be influencing
the so-called spontaneous perspective taking phenomenon and would warrant further
investigation.
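The angle-based prediction can be expressed as a slope: if RT rises steadily with angular disparity, the least-squares slope of RT on angle should be clearly positive, whereas a roughly constant processing time should yield a slope near zero. The sketch below illustrates only that calculation. The function name `rt_slope`, the angles, and the RT values are assumptions chosen for illustration and do not come from any cited experiment.

```python
def rt_slope(angles, rts):
    """Least-squares slope of RT (ms) on angular disparity (degrees)."""
    n = len(angles)
    mean_x = sum(angles) / n
    mean_y = sum(rts) / n
    # Covariance and variance terms of the ordinary least-squares estimate.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(angles, rts))
    sxx = sum((x - mean_x) ** 2 for x in angles)
    return sxy / sxx


# Hypothetical condition means for illustration only (assumed values).
angles = [0, 60, 120, 180]
angle_dependent_rts = [1000, 1300, 1600, 1900]  # RT rises with angle
angle_invariant_rts = [1000, 1000, 1000, 1000]  # RT flat across angles

steep = rt_slope(angles, angle_dependent_rts)
flat = rt_slope(angles, angle_invariant_rts)
```

Under these assumed values the first pattern gives a slope of 5 ms per degree and the second a slope of zero. In practice, slopes would be estimated per participant and compared across perspective consistency conditions with an inferential test, and a reliably positive slope would be only one piece of evidence for an object rotation contribution.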
As can be seen, there are a number of cross-comparisons that can be made
when investigating perspective taking in terms of mental transformations of the self or
target object. Consequently, this is one area of examination that future work critically
1.5.8 Submentalizing
As previously outlined, ToM can also be referred to as mentalizing, and can be
deconstructed to include submentalizing. Submentalizing is one component that can
be argued to hold significance for the perspective taking and spontaneous perspective
taking literature. Recall that mentalizing refers to fully functioning ToM abilities,