• No se han encontrado resultados

4. OBLIGACIÓN

4.8 GLOSA POR APORTES

4.8.4 GLOSA POR RESPONSABILIDAD PATRONAL

4.8.4.6 RESPONSABILIDAD PATRONAL EN EL SEGURO DE RIESGOS DEL TRABAJO:

Augmented Reality vision systems, in isolation or within the Remote Access Laboratory framework, incorporate seven different types of technology [64]; object detection, object matching, object tracking, scene registration, camera calibration, display devices and three dimensional modelling. These technologies in combination raise two main difficulties for current AR systems. For users of visual AR systems to achieve immersion into the mixed environment, the computer-generated objects must be in synchronisation with reality. Secondly, a visual AR system must be able to understand the video scene. To perform the second task, CV processes must first locate reference points to obtain the appropriate registration, thus aligning virtual and real objects within the video stream. Discovery of reference points and locating objects between consecutive frames for CV systems, is a very complex and difficult to achieve. Without reliable discovery of selected points or objects, as frames arrive to the CV processes, AR effectiveness is diminished.

1.5.1 Synchronisation Challenges

Computer Vision (CV) techniques are the key to visual AR success [65]. For live video streaming, achieving synchronisation within the enhanced video stream (see Figure 1-5) between the real and virtual objects, requires efficient processing by the Vision Analysis (VA) systems. The VA system performs the necessary CV processes, and requires fast and accurate examination of the video stream so that as video frames arrive to the system, they perform their required functions, and pass the extracted knowledge along to sub-processes.

Ideally, VA systems should operate at the incoming video frame rate to ensure perfect synchronisation between real and virtual objects. In reality, matching the video frame rate is not always necessary. There is not always a case to process all incoming video frames, nor is there always a need to operate at the maximum possible frame rate. Immersive visual AR must still be veridical, and so requires adequate frame rates for virtual objects, such that they appear to be an integral part of the scene. Critical primary systems may find that frame rates cannot be compromised, and require fast refreshes. Assessment of less critical AR systems can benefit from shortcuts within the VA system, which may operate on limited frames. Shortcuts can reduce the effectiveness and capabilities, so should be assessed to determine their suitability for the AR system implementation.

Worst case scenario for AR RAL will require some level of real-time video processing. By this, it means that processing is not performed offline, but as video frames arrive, depending on the operational requirements of the system. For this works, real-time processing is indicative of online processing at the frame rate required by the AR RAL systems. While most AR RAL is not considered safety critical, there are still many complex VA processes that require careful development to ensure an appropriate level of user immersion. Computer Vision processes, designed to improve the synchronisation of real and virtual objects are outside the scope of this research. Computer Vision techniques reviewed to achieve the research question have been chosen to meet AR RAL requirements.

1.5.2 Registration Challenges

A visual AR system requires basic knowledge about the video scene. The view from the camera represents a certain pose in which the orientation of the camera and the location of objects within the scene are understood. Unfortunately, the camera pose is rarely known to the AR system and must be determined through CV processes. If the registration between real and virtual objects is out of alignment by a small amount, it will be detectable to the user [66]. It is generally believed that without a priori, the visual environment is too complex for effective registration [67]. Discovering the camera pose or locating Objects of Interest (OoI) require methods for extracting meaningful data from the video frames. Locating and isolating primary reference points

is a complex process and a limiting factor for any AR system[68], which can be reduced to two main methodologies.

1.5.2. (a) Fiducial Markers

Fiducial markers are unique images placed within the environment to identify key locations or objects. Augmented Reality systems use fiducial markers as two or three- dimensional registration/reference points. Initially, fiducial markers consisted of a combination of colours in a geometric pattern [69], which is unique within the video scene. Scanning the scene locates the solid regions of unicolour where the shape can be determined. Using markers consisting of multiple colours and shapes aid’s in the triangulation of the known reference points. Unicolour markers suffer with reliability, so multi-ringed coloured markers [70] where used to improve the detection process. Colour markers require colour processing and an understanding of geometric shapes which reduces their reliability. High false positive detection rates and inter-marker identification mistakes called for greater robust fiducial markers.

Bi-tonal binary fiducial markers provide a simplified detection and identification system [71]. Utilising black and white patterns as fiducial markers allows a larger image to be quickly segmented, filtering aspects of the scene that are irrelevant to the registration processes. Figure 1-6 shows an ARTag [71] fiducial marker which has a unique binary combination: that is rotating the image does not generate a combination that will represent any other marker pattern. The unique nature of the binary marker

allows camera pose to be ascertained as the combination of binary marks are only relevant in one orientation.

The problem with fiducial markers is the artificial nature of the item. Markers have to be placed at key locations to ensure proper two or three-dimensional referencing. Adding markers to objects within the video scene creates its own set of challenges. Some environments will not be suitable for fiducial markers. Moving items are not suitable specifically because of the constantly changing reference point. Other marker locations may affect the operation of the devices within the scene. Within a RAL environment, adding fiducial markers can render the experiment invalid. For example, experimental results for a pendulum are invalidated if a fiducial marker has to be applied to the pendulum. Determining the frame of reference or locating objects without adding artificial markers to the scene requires a better solution.

1.5.2. (b) Feature Points

Every object and shape within an image generate a series of interest points such as colours, patterns, edge, corners and vertices. Complex mathematical convolution is able to uncover some of these feature points, but at a significant processing cost. Feature point detection models produce significant data sets which still require post-processing before information about the current frame can be extracted. Filtering or associating the large number of feature points, such as the SUSAN [16] corner detected feature points in Figure 1-7, requires some prior knowledge or methods to extract points relevant to the task at hand.

Using feature points to calculate the frame of reference can be unreliable from frame to frame within the video sequence due to the variances in image capturing and

compression. Other forms of image distortion consist of lighting variations, lens distortion, noise within the image capture device (CCD) and process errors. Image variations from frame to frame are significant to overcome, necessitating additional complex image processing. The pixel attribute differences from the first frame to the second frame within the laboratory experiment of Figure 1-7 can be demonstrated in Figure 1-8. During frame subtraction, if frame one and two were truly identical, then the image in Figure 1-8 should be black. However, because the SUSAN corners detected in one frame are significantly different to the next frame Figure 1-8 shows a myriad of different detected corners. This becomes a problem for accurately assessing any possible frame of reference, as key reference points can be unstable.

1.5.3 Object Detection Challenges

Tomasi and Kanade [72] pose two seemingly simple questions regarding object detection challenges; how to select features within a video stream and then how to track them from frame-to-frame. This is the crux of the problem facing all computer vision research and a key factor in effective Augmented Reality systems. With this difficulty in mind, this works discovers Computer Vision models which reliably locate and track objects within the video stream, without the use of fiducial markers.

Addressing AR within the RAL framework is a relevantly recent application, and there has been little Augmented Reality principles applied to current Remote Access Laboratories systems. Research combining both fields has been mostly limited to static configurations where a video stream displays an experiment unfolding while video overlays of sensor measurements are displayed [52, 73, 74]. These implementations are clever and add worth as a demonstration of the capabilities for the combined fields. Some implementations are developed for their particular research needs or domain, and

are not practical for normal use. Engaging the user in an immersive interactive environment has had limited delivery. Augmented Reality is capable of delivering sensory feedback of remotely delivered content, such that any deficiency a Remote Access Laboratories user may suffer as a result of their isolation from the resources, is counterbalanced.

Most Augmented Reality systems rely on vision feedback. Remote Access Laboratories provides live video streams of the operating experiment. Feature extraction is a key aspect of image analysis within computer vision fields. It is also a key aspect for Augmented Reality systems in which the features of an image must be discovered before further sub-processing is possible. This research builds the basis for future AR RAL systems; as a result, models to extract key object features from live video streams (of remotely operated experiments) are developed, without relying on prior setups/configurations (such as fiducial markers), such that AR processes can enhance the video stream to the user.