This section describes the way in which MASSIVE-1 presents itself to a normal user. This includes the appearance of the system and also the types of control and interaction which are made available to the user.
MASSIVE-1 supports communication and interaction between users via a combination of 3D graphics, real-time audio and text. Each of these three forms of interaction is realised as a spatial model medium. Each medium is handled by a specialised user client process which provides a medium-specific interface between a user and the virtual world. A user can employ almost any combination of the three client programs; the only restriction is that they cannot use the audio client on its own, since it has no built-in support for specifying movement within the virtual world. So one user may be using all three client programs on a graphical workstation to give real-time graphical and audio interaction supplemented by text-based mapping and messaging. Another user may be logged in over the network from a VT100 alphanumeric terminal and have access to the text medium alone. These two users will be able to interact through the common text medium. Tools within the world may provide additional support for cross-medium interaction, for example, the text-to-speech convertor of section 4.2.
A user designates one of their client programs to be the “master”, and this controls any other client programs which they may be using at the same time (the “slaves”). The master coordinates the activities of all of the user’s client programs so that they present a consistent view of the virtual environment.
The three client programs will be described in turn, starting with the graphical client, followed by the audio client and finally the text client.
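The rule on client combinations can be sketched in a few lines of Python. This is only an illustration, not MASSIVE-1 code: the names `Client`, `is_valid_combination` and `default_master` are hypothetical, and the preference order used to pick a master is an assumption (in MASSIVE-1 the user designates the master).

```python
from enum import Enum
from typing import Iterable, Optional


class Client(Enum):
    """The three medium-specific client programs (names are illustrative)."""
    GRAPHICS = "graphics"
    AUDIO = "audio"
    TEXT = "text"


def is_valid_combination(clients: Iterable[Client]) -> bool:
    """Any non-empty combination is allowed except the audio client alone."""
    chosen = set(clients)
    return bool(chosen) and chosen != {Client.AUDIO}


def default_master(clients: Iterable[Client]) -> Optional[Client]:
    """Pick a master for a valid combination, or None if it is invalid.

    Assumption: graphics is preferred over text. The audio client is never
    offered as master because it cannot specify movement.
    """
    chosen = set(clients)
    if not is_valid_combination(chosen):
        return None
    for candidate in (Client.GRAPHICS, Client.TEXT):
        if candidate in chosen:
            return candidate
    return None
```

For example, a VT100 user would run `[Client.TEXT]` alone, which is valid, whereas `[Client.AUDIO]` alone is rejected.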
4.1.1. Graphical client
The graphical medium client maintains a single 3D graphical view of the virtual world; an example is shown in figure 7 on page 36. Each user is represented within the graphical medium by a simple embodiment (a “blocky”) which is sufficient to convey the user’s position and orientation and an indication of their identity (by means of a name label and customised body colour). In addition, the blocky indicates which media a user has access to. For example, a blocky with “ears” is audio-capable, a blocky with one “eye” is a desktop (monoscopic) graphical user, while a blocky with a “T” on their forehead is a text-only user.
The graphical medium client can be a user’s master client or a slave client. When acting as a slave (to either a text client or to another graphical client) it just provides a view of the virtual world. However, when the graphical client is the master it provides the user with a number of navigation and interaction control facilities which are listed below.
• Variable speed movement in six degrees of freedom. This is controlled using the mouse in different parts of the screen with combinations of mouse buttons.
• A choice between three settings for focus, nimbus and aura. These are “wide”, “normal” and “narrow” and provide broad undirected interaction, mid-range semi-directed interaction and close-range highly directed interaction, respectively. These are stepped through using a single key press.
• The ability to continuously vary the angle and range of focus and nimbus. Like normal navigation this makes use of the mouse, but in combination with control keys. This is a relatively specialist facility.
• A choice between a number of simple graphical gestures. These include arm movements, pointing and “sleeping” (used to indicate that a user is not attending to the virtual world at present). These are selected using single key presses.
• A moving “mouth”. This appears on the embodiment when the user is speaking as a visual cue to speaker identity. This also acts as a diagnostic aid if audio communication is problematic, e.g. when the network is heavily loaded.
• An optional indication of the user’s focus and nimbus. This is represented by a wireframe cuboid which approximates the region of maximum focus and nimbus.
Additionally, whether the graphical client is the master or a slave, it allows the user to choose (using key presses) between a number of pre-set viewpoints specified relative to their embodiment. The normal choice includes: the view out of their embodiment’s eyes (the default); a view from above and behind their embodiment which shows other nearby objects; a view from overhead looking down on their embodiment which is effective as a map; and a view from in front looking back at their own embodiment. For each view the user can use keys or the mouse to zoom in and out.
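The three-way choice of focus, nimbus and aura settings, stepped through with a single key press, behaves like a small cyclic state. The sketch below is illustrative only: the class name `InteractionSetting`, the stepping order and the starting setting are assumptions, not details taken from MASSIVE-1.

```python
SETTINGS = ("wide", "normal", "narrow")


class InteractionSetting:
    """Cycles through the three focus/nimbus/aura settings.

    Assumptions: the starting setting is "normal" and each key press moves
    to the next entry of SETTINGS, wrapping around at the end.
    """

    def __init__(self, start: str = "normal") -> None:
        self._index = SETTINGS.index(start)

    @property
    def current(self) -> str:
        return SETTINGS[self._index]

    def step(self) -> str:
        """Handle one key press and return the newly selected setting."""
        self._index = (self._index + 1) % len(SETTINGS)
        return self.current
```

Three presses from any setting return to that setting, so a user can always reach any of the three with at most two key presses.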
The graphical client is the normal master client, but requires a reasonable-performance graphical workstation such as an SGI Indy.
Portals
One of the background concepts of the spatial model is that of a space or “world” within which objects and communication are situated. MASSIVE-1, like some other multi-user VR systems (e.g. DIVE [Hagsand, 1996]), includes “portals”: a portal is an object in a world which forms a link or gateway to another world. As users move about within a world they can step “into” a portal and be transported to a new world and location, or to a different location within the same world. A portal’s destination is specified when it is created by the world designer. Portals are unidirectional (but may be combined in pairs to create bidirectional links).
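A minimal data-structure sketch may make the portal behaviour concrete. It assumes that a location is a world name plus three coordinates; the names `Location`, `Portal`, `enter` and `make_bidirectional` are hypothetical and do not come from MASSIVE-1.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Location:
    """A position within a named world (this representation is assumed)."""
    world: str
    x: float
    y: float
    z: float


@dataclass(frozen=True)
class Portal:
    """A one-way link; frozen, since the destination is fixed at creation."""
    position: Location     # where the portal object sits
    destination: Location  # where a user who steps into it arrives


def enter(portal: Portal) -> Location:
    """Stepping into a portal transports the user to its destination."""
    return portal.destination


def make_bidirectional(a: Location, b: Location) -> Tuple[Portal, Portal]:
    """Combine two unidirectional portals into a two-way link."""
    return Portal(position=a, destination=b), Portal(position=b, destination=a)
```

Because the destination may name the same world as the portal's own position, the same structure covers both cross-world links and jumps within a single world.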
4.1.2. Audio client
The audio medium client exchanges awareness and configuration information in the audio medium and uses this information to establish real-time audio connections between pairs of users and between users and other audio-capable objects. Audio in MASSIVE-1 is single-channel μ-law PCM (Pulse Code Modulation) encoded data at 8 kHz; this is also referred to as “toll-quality audio” and is approximately the same quality as a domestic telephone call (but with significantly longer end-to-end delay). The audio client establishes audio connections only when awareness exceeds a threshold level. It also controls the playback volume to reflect the level of awareness so that sources heard with low awareness values are quiet while sources heard with high awareness values are loud.
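The awareness-driven behaviour of the audio client can be sketched as follows. The threshold value, the assumption that awareness lies in the range 0 to 1, and the linear mapping from awareness to gain are all illustrative choices, not MASSIVE-1's actual parameters. The bit-rate calculation assumes the standard 8 bits per sample of μ-law PCM.

```python
from typing import Optional

SAMPLE_RATE_HZ = 8000  # 8 kHz sampling
BITS_PER_SAMPLE = 8    # standard mu-law companding uses 8 bits per sample
CHANNELS = 1           # single channel


def stream_bit_rate() -> int:
    """Raw payload bit rate of one audio stream, in bits per second."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def playback_gain(awareness: float, threshold: float = 0.2) -> Optional[float]:
    """Return a gain in (0, 1], or None when no connection should exist.

    Assumptions: awareness is normalised to [0, 1], the threshold of 0.2 is
    a placeholder, and gain grows linearly with awareness.
    """
    if awareness <= threshold:
        return None  # awareness does not exceed the threshold: no connection
    return min(awareness, 1.0)
```

Under these assumptions each connected source costs 64 kbit/s of payload before any packet overhead, which is one reason to avoid connecting sources whose awareness is below the threshold.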
The audio client manages a separate audio server process for each user. This was created specifically for MASSIVE-1 because existing network audio tools (such as VAT [Jacobson, 1992] and RAT [Hardman et al., 1995]) did not allow sufficient external control, for example of per-source playback volume. The audio client always operates as a slave client under the control of a text or graphical client. This is because navigation and other aspects of system control cannot be achieved via the audio client (which has no speech recognition facility).
4.1.3. Text medium client
The text client provides a simple map view of the surrounding area and allows the user to send and receive simple text messages. Figure 8 on page 39 shows a screen shot of the text client during a meeting; this is the same scene as in figure 7 on page 36.
The display has four components.
• The status bar at the top shows the orientation, location and focus/nimbus mode of the user.
• The column down the right of the screen identifies the objects in aura range and shows mutual awareness values.
• The character-based map in the centre shows the user’s immediate surroundings in the virtual world. Objects are represented by letters with the key in the column down the right of the screen (the user’s own embodiment is shown to themselves as an “@” symbol). User orientations are indicated by dashes adjacent to the appropriate character.
• The text window at the bottom of the screen displays recent text messages and allows the user to compose their own messages.
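The character-based map component can be approximated with a short sketch. It assumes integer grid coordinates, a square window centred on the user, and "." for empty cells; the function name `render_map` is hypothetical, and the orientation dashes described above are omitted for brevity.

```python
from typing import Dict, Tuple

Point = Tuple[int, int]


def render_map(me: Point, others: Dict[str, Point], radius: int = 2) -> str:
    """Draw the user's surroundings as text, with the user at the centre.

    `others` maps a key letter to an object's grid position. Objects outside
    the window are simply not drawn. Positive y is drawn up the screen.
    """
    size = 2 * radius + 1
    grid = [["." for _ in range(size)] for _ in range(size)]
    for letter, (x, y) in others.items():
        col = x - me[0] + radius
        row = radius - (y - me[1])
        if 0 <= row < size and 0 <= col < size:
            grid[row][col] = letter
    grid[radius][radius] = "@"  # the user's own embodiment
    return "\n".join("".join(row) for row in grid)
```

Calling `render_map((0, 0), {"A": (1, 0), "B": (0, 2)})` places "A" immediately to the right of the "@" and "B" two rows above it.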
The text client can be a user’s master client or a slave client. When it is the master client it allows the user to move about using key presses, change between settings for focus, nimbus and aura and perform simple “gestures” (which in the text medium are short preset text messages). The text client always allows the user to type text messages which are distributed to other users. Distribution and presentation of text messages depends on awareness level as illustrated in table 5 on page 39. At very low levels of awareness nothing is observed (the message is not seen). At intermediate levels of awareness an observer sees that something is said, but does not see its contents. At higher levels of awareness an observer sees the full message but it is shown in brackets to indicate that the message is not part of a focused (and nimbused!) exchange. At the highest levels of awareness the full message is displayed.
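The four presentation tiers can be expressed as a simple function of awareness. The numeric thresholds below are placeholders (the actual levels are those given in table 5), awareness is assumed to be normalised to the range 0 to 1, and the exact display strings are invented for illustration.

```python
from typing import Optional, Tuple


def present_message(
    sender: str,
    message: str,
    awareness: float,
    thresholds: Tuple[float, float, float] = (0.25, 0.5, 0.75),
) -> Optional[str]:
    """Return what an observer sees, or None if the message is not seen.

    The threshold values are hypothetical placeholders.
    """
    low, mid, high = thresholds
    if awareness < low:
        return None                          # very low: nothing observed
    if awareness < mid:
        return f"{sender} says something"    # intermediate: contents hidden
    if awareness < high:
        return f"({sender}: {message})"      # higher: full text, bracketed
    return f"{sender}: {message}"            # highest: full message
```

For instance, `present_message("Ann", "hello", 0.6)` yields the bracketed form `(Ann: hello)`, marking the message as outside a focused exchange.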
In terms of user machine capabilities the text medium client is the “lowest common denominator” and can be used on a text-only terminal such as a VT100. A text-only