Now that it has been shown that people are able to perform gaze gestures, the question is which gestures to use. Gaze gestures should be easy to perform, easy to memorize, and separable from natural eye movements without the need for a gesture key. In this respect the PIN entry presented in the last section is an exception, as it uses a gesture key and provides its own special-purpose alphabet which resembles digits. In general, gaze gestures are not well suited for text entry because eye-typing with the dwell-time method is faster and more intuitive. The gaze gesture alphabet presented here is intended to serve as a set of commands for remote control. The feasibility of this alphabet is not proven, but it is based on observations from the user studies and on reasonable assumptions. It can therefore serve as a starting point for further research. The focus of this work is to find a reasonable alphabet which is separable from natural eye movements.
The notation with the eight directions, which originates from mouse gestures, allows a large number of possible gestures to be defined. The possibility of using the four corners of the display as helping points suggests restricting the gestures to those of EdgeWrite. A ‘stairway-shaped’ gesture like RDRDRD is difficult to perform without landmarks in the visual field for orientation, while visiting the four corners of the display with the gaze seems to be easy. The restriction to the EdgeWrite gestures ensures that the gaze does not have to leave the display when performing a gesture.
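The corner restriction can be illustrated with a minimal sketch. It assumes that every stroke spans the full width or height of the display, so that the four corners form a 2x2 grid, and that the diagonal tokens follow a numeric-keypad layout (7 = up-left, 9 = up-right, 1 = down-left, 3 = down-right). The function name `start_corners` is hypothetical and not part of the detection algorithm.

```python
# Direction of each stroke token as (dx, dy), with y growing downwards.
# The digits for the diagonals follow a numeric-keypad layout (assumed).
STEP = {"R": (1, 0), "L": (-1, 0), "D": (0, 1), "U": (0, -1),
        "7": (-1, -1), "9": (1, -1), "1": (-1, 1), "3": (1, 1)}

def start_corners(gesture):
    """Return the corners from which the gesture never leaves the 2x2 corner grid."""
    valid = []
    for start in [(0, 0), (1, 0), (0, 1), (1, 1)]:
        x, y = start
        for token in gesture:
            dx, dy = STEP[token]
            x, y = x + dx, y + dy
            if not (0 <= x <= 1 and 0 <= y <= 1):
                break  # this stroke would leave the display
        else:
            valid.append(start)
    return valid

print(start_corners("RDLU"))    # clockwise square: only from the top-left corner
print(start_corners("RDRDRD"))  # stairway: no corner works
```

Under these assumptions the stairway gesture RDRDRD has no valid starting corner, whereas RDLU can be performed entirely on the four corners.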
The inventors of the EdgeWrite gestures defined a large gesture alphabet with multiple gestures for each letter and digit of the Latin alphabet. In the context of gaze gestures it is not possible to use this alphabet. The reason is the absence of an extra modality (e.g. a gesture key or the touch signal of a pen) which could signal the start and end of a gesture. The gesture alphabet of EdgeWrite contains gestures which are part of another gesture; for example, the ‘1’ is a single stroke and is part of the ‘3’ (see Figure 57). To avoid ambiguities in the recognition, no gesture of a gaze gesture alphabet may be part of another gesture in the alphabet. An easy way to achieve this is to use gestures which all have the same number of strokes. Such an alphabet loses the similarity of the gestures to digits and letters. However, as the gestures are used for commands and not for text entry, this does not matter.
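The containment requirement can be checked mechanically. The following sketch treats each gesture as a string of stroke tokens and reports every pair in which one gesture occurs inside another. The function name `contained_pairs` and both example alphabets are hypothetical; the strings are illustrative and are not the EdgeWrite definitions.

```python
def contained_pairs(alphabet):
    """Return (inner, outer) pairs where one gesture string occurs inside another."""
    return [(a, b) for a in alphabet for b in alphabet
            if a != b and a in b]

# Illustrative alphabets only, not taken from EdgeWrite.
mixed_length = ["D", "RDL", "RDLU"]
four_stroke = ["RDLU", "DRUL", "R1R7"]

print(contained_pairs(mixed_length))  # the shorter gestures are parts of longer ones
print(contained_pairs(four_stroke))   # equal length: no gesture contains another
```

Two different strings of equal length can never contain one another, which is why a fixed number of strokes satisfies the requirement without further checks.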
From observations during the second user study (see 5.4.3) and the comments given by participants, it seems that users prefer gestures which end at the position where they started. This type of gesture makes it easy to repeat the gesture if it was not detected or if repetitive input is wanted, e.g. increasing the volume or switching to the next channel in a media application.
The gestures of the gaze gesture alphabet should also be separable from natural eye movements. Four-stroke gestures seem to be a good choice for gaze gestures and are therefore worth testing. Gestures with fewer strokes bear the danger of occurring within natural eye movements. Gestures with more than four strokes are complex and hard to memorize, according to some participants in the user study. Nevertheless, as people are able to memorize the alphabet for reading and writing, they should be able to learn even a complex gaze gesture alphabet. To memorize the letter A we do not learn an abstract stroke definition but a shape and a motion sequence for writing. Similarly, it is not necessary to memorize the notation for the gaze gestures. RDLU is the definition of a gesture suitable for the detection algorithm, but we memorize it as a square performed clockwise. It is also possible to construct and memorize complex gestures by concatenating simple ones. For example, the RD7DR7 gesture can be memorized as the upper right triangle followed by the lower left triangle. However, as long as there is no reason for complex gestures, they should be as simple as possible. One reason for needing more complex gestures is that the number of four-stroke gestures is limited to 18 by the demands given above and the following further demand.
A gesture that ends at its starting point should not get a different meaning when the same pattern starts at another corner. If the RDLU gesture has an assigned meaning, the gestures DLUR, LURD, and URDL should have the same meaning or no meaning. Otherwise, if the RDLU gesture is performed repetitively and one stroke is not detected correctly, the gesture results in an unwanted action. As an example with two-stroke gestures, let us assume that the gesture RL means ‘volume up’ and LR means ‘volume down’. Then the gesture to turn the volume down three steps is LRLRLR. If the first stroke is not recognized correctly, because the gesture was not performed well or because of a temporary malfunction of the eye tracker (changing light conditions, a hand obstructing the camera view), the detected gesture could be L:RLRLR, 7RLRLR or RLRLR. These gestures would turn the volume up two steps, which is the opposite of the user’s intention. One way to avoid such problems with concatenated gestures is to demand a rest in the eye movements between two gestures, i.e. the output of a colon from the detection algorithm. However, this slows down the execution of concatenated gestures.
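The volume example can be replayed with a small sketch. The recognizer below is a hypothetical greedy left-to-right scan for two-stroke gestures and is only an assumption about how a detector might consume the stroke stream; the names `decode` and `rotation_conflicts` are likewise invented for this illustration.

```python
def decode(stream, commands):
    """Greedy left-to-right scan for two-stroke gestures (hypothetical recognizer)."""
    actions, i = [], 0
    while i + 2 <= len(stream):
        action = commands.get(stream[i:i + 2])
        if action:
            actions.append(action)
            i += 2  # consume the recognized gesture
        else:
            i += 1  # skip an unusable token such as ':' or '7'
    return actions

def rotation_conflicts(commands):
    """Patterns whose start-point rotations were assigned different meanings."""
    meanings = {}
    for gesture, action in commands.items():
        # The smallest cyclic rotation serves as a start-independent key.
        key = min(gesture[i:] + gesture[:i] for i in range(len(gesture)))
        meanings.setdefault(key, set()).add(action)
    return {key for key, acts in meanings.items() if len(acts) > 1}

commands = {"RL": "volume up", "LR": "volume down"}

print(decode("LRLRLR", commands))    # three times 'volume down', as intended
for damaged in ("L:RLRLR", "7RLRLR", "RLRLR"):
    print(damaged, decode(damaged, commands))  # two times 'volume up' each
print(rotation_conflicts(commands))  # {'LR'}: RL and LR are the same pattern
```

With this scanner, each of the three damaged streams yields two ‘volume up’ actions. The conflict check flags the assignment because RL and LR are rotations of the same closed pattern, whereas an alphabet that gives RDLU, DLUR, LURD and URDL one common meaning would pass.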
Gestures which need only two points, typically RLRL, occur too often in natural eye movements. With a systematic construction of possible gaze gestures which use at least three points, it is possible to identify four types of four-stroke gestures, as given in Figure 72. The total number of different gesture commands obtained by mirroring or rotation is 18 (2 + 4 + 4 + 8). This is about the typical number of buttons on a remote control for a TV set. Allowing the gesture to start at any corner leads to 72 different gesture strings.
Figure 72: Four types of four-stroke gaze gestures and their number of variations v by rotation and mirroring
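The counts of 18 commands and 72 gesture strings can be reproduced by brute-force enumeration. The sketch below assumes that a gesture is a closed path over the four display corners, that every stroke leads to a different corner, and that at least three corners are visited. The diagonal tokens 7, 9, 1 and 3 follow a numeric-keypad layout, which is an assumption. Grouping the commands by the number of corners visited and the number of diagonal strokes yields classes of size 2, 4, 4 and 8; whether these classes coincide exactly with the four types drawn in Figure 72 is also an assumption.

```python
from collections import Counter
from itertools import product

# Corner coordinates, x to the right and y downwards (assumed layout).
CORNERS = [(0, 0), (1, 0), (0, 1), (1, 1)]

# Stroke token per direction (dx, dy); diagonal digits are assumed (7 = up-left).
TOKEN = {(1, 0): "R", (-1, 0): "L", (0, 1): "D", (0, -1): "U",
         (-1, -1): "7", (1, -1): "9", (-1, 1): "1", (1, 1): "3"}

def canonical(gesture):
    """Smallest cyclic rotation: the same pattern regardless of the start corner."""
    return min(gesture[i:] + gesture[:i] for i in range(len(gesture)))

points_used = {}  # gesture string -> number of distinct corners visited
for path in product(CORNERS, repeat=4):
    closed = list(path) + [path[0]]  # return to the starting corner
    pairs = list(zip(closed, closed[1:]))
    if any(a == b for a, b in pairs):
        continue  # every stroke must move to a different corner
    if len(set(path)) < 3:
        continue  # two-point gestures such as RLRL are excluded
    gesture = "".join(TOKEN[(b[0] - a[0], b[1] - a[1])] for a, b in pairs)
    points_used[gesture] = len(set(path))

strings = set(points_used)
commands = {canonical(g) for g in strings}
types = Counter((points_used[c], sum(ch in "1379" for ch in c)) for c in commands)

print(len(strings), len(commands))  # 72 18
print(sorted(types.values()))       # [2, 4, 4, 8]
```

Every command class contains exactly four strings, one per starting point along the closed path, which accounts for the factor of four between 18 and 72.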