Framework for effectiveness assessment of virtual reality training in assembly tasks



I would like to thank Fundación Carolina and its training program for granting me this scholarship and giving me the opportunity to study for a master’s degree at a leading Spanish university. I would also like to thank my supervisor, Angelica de Antonio, for proposing the subject of this work and for her guidance and support during its development. Lastly, I would like to thank my family and friends for understanding my absence and encouraging me to follow my dreams.

Organizational training is one of the most relevant tools applied by companies to develop a qualified workforce and remain competitive. The search for more effective forms of training has led to the application of virtual reality as a way of providing interactive learning capable of improving the quality and retention of the learned content. Despite this potential, consensus on the effectiveness of virtual reality for training is far from being reached, especially in manufacturing, where the few existing studies have presented contradictory conclusions. The reasons for this lack of consensus are many and include the absence of a common evaluation method and the existence of methodological and experimental design problems. Because the evaluation of a training program is an essential element of the learning process, capable of identifying the value of training as well as its shortfalls and successful aspects, this study proposes a framework to measure the effectiveness of virtual reality for training in assembly tasks. The motivation for this study came from the need to evaluate a virtual system for training in the assembly of automobile seats, a project under development in a partnership between the Technical University of Madrid and Seat Inc. Therefore, to evaluate the effectiveness of the virtual tool, this work applies the proposed framework to compare the virtual training with the current training method applied by the company. The two methods are compared in terms of participants’ performance results and their post-training feelings and opinions about the learning experience.
Additionally, in order to understand the factors that can make participants learn more or less effectively and to identify improvement opportunities, it is investigated whether participants’ performance results are influenced by their characteristics (learning style and age) and by their feelings and/or opinions about the training.

Keywords: Effectiveness of training, Virtual reality, Assembly tasks, Framework.

Organizational training is one of the most relevant tools applied by companies to develop a skilled workforce and remain competitive. In the search for a more effective training approach, virtual reality emerged as a way of providing highly interactive learning-by-doing training with the capacity to enhance the quality and retention of the transferred content. Despite this potential, consensus on the effectiveness of virtual reality for training is far from being reached, especially in the manufacturing area, where the few existing studies have presented conflicting conclusions. The reasons for the lack of agreement are many and include the absence of a common evaluation method and the existence of methodological and experimental design issues. As the evaluation of a training program is an essential element of the learning process, identifying the value of training as well as shortfalls and successful aspects, this study proposes a framework to measure the effectiveness of virtual reality for the training of assembly tasks. The motivation for this study came from the need to evaluate a virtual system for training in the assembly of automotive seats, a project under development in a partnership between the Technical University of Madrid and Seat Inc. (simulated name). Thus, in this work, in order to assess the effectiveness of the virtual tool, the proposed framework is applied to compare the virtual training with the current training approach applied by the company in terms of participants’ performance outcomes and their post-training feelings and opinions about the learning experience. Additionally, aiming to understand the factors that can make the participants perform more or less effectively and to identify refinement opportunities, it is investigated whether the participants’ performance results are influenced by their characteristics (learning style and age) and by their feelings and opinions about the training.

Keywords: Effectiveness of training, Virtual Reality, Assembly Tasks, Framework.


Organizational training is a process driven by companies with the aim of having employees learn competencies (knowledge, skills, or attitudes). The development and maintenance of a skilled workforce is considered one of the most relevant factors in keeping organizations competitive, allowing them to innovate, adapt, surpass, produce, improve safety, and achieve goals. In the United States, for example, it is estimated that companies spend more than $135 billion per year on training (Patel, 2010). Moreover, training is of great relevance for governments since it can have considerable social implications, influencing the unemployment rate and affecting economic growth, and it is thus considered a major instrument for local and national economic development (Salas, Eduardo, Tannenbaum, Kraiger, & Smith-Jentsch, 2012). Given such relevance, training has grown into a science in constant development, which in recent years has produced a substantial body of work on many aspects of the training process, providing organizations with essential information about the design, delivery, and implementation of training programs. The way training content is delivered has also evolved over the years, following the advancement of technology. Although classical approaches like paper-based training, instructional videos, and demonstrative training are widely used, the search for more effective ways of delivering learning culminated in the application of virtual reality (VR) for this purpose. The development of a virtual environment (VE) is costly and time-consuming; however, once this initial investment is overcome, it has the potential to reduce training costs and improve the effectiveness of a training program.
In some organizations, especially in manufacturing industries, a considerable part of the time and cost of training is due to the creation of expensive and complex simulated workstations where the trainees have to perform the tasks with real components whose assembly and disassembly can take a lot of time. Thus, the use of a VE that emulates the real training scenario can save time and money (Aguinis & Kraiger, 2009). Moreover, VR offers a highly interactive way of learning through a learning-by-doing approach that is typically regarded as enjoyable and that can thus improve the transfer of knowledge and increase the retention and recall of the contents. In addition, it offers an automatic way of measuring trainees’ performance, can be easily adapted to participants’ characteristics

(like learning style), and is credited with offering a safe and controlled training environment (Gavish, et al., 2013). Despite those benefits of using VR for training, there is still no consensus about its effectiveness as a training tool, since the studies in the area have presented conflicting conclusions. While in some areas like medicine this consensus is close to being reached, in others like assembly, the lack of a common approach to measuring effectiveness and the shortfalls (methodological and experimental design issues) of the already scarce papers hinder reaching a conclusion (Borsci, Lawson, & Broome, 2015).

Evaluating training outcomes is an essential element of the learning process. The evaluation of a training program makes it possible to identify the value of training instead of seeing it as just a cost, revealing the benefits obtained from training results and justifying the investment (Tennant, Boonkrong, & Roberts, 2002). Moreover, it allows the identification of shortfalls and successful aspects, thereby promoting a constant refinement of the training system. Lastly, it is considered one of the principal ways of setting objectives for the training, enabling the establishment of appropriate training objectives and learning outcomes (Salas, Eduardo, Tannenbaum, Kraiger, & Smith-Jentsch, 2012).

Given the above, the principal objective of this study is to propose a framework to evaluate the effectiveness of virtual tools for the training of assembly tasks. The motivation for the proposal of the framework came from the project SIEMA (Sistemas Inteligentes para ayuda al Entrenamiento en Montaje manual de Asientos de automóviles), a VR system for training in the assembly of automotive seats that has been developed in partnership between the Technical University of Madrid and Seat Inc. The goal is to apply the framework to evaluate the effectiveness of that virtual tool.
In this evaluation process, the objective is to compare SIEMA with the current training approach applied by the company in terms of participants’ performance outcomes and their post-training feelings and opinions about the learning experience. Moreover, correlation analyses will be performed to investigate whether the participants’ performance results are influenced by participants’ characteristics (learning style and age) and by their perceived usability, trust in technology, and perceived workload, allowing the

understanding of what makes participants learn more or less effectively and the identification of improvement opportunities.

The current study is organized as follows:

Chapter 2: All the basic concepts necessary for a better understanding of the current study are presented. First, the definition and structure of organizational training, as well as the most widespread theories of the science of training, are presented. Then, a brief review of current VR technologies is given and, lastly, an overall view of assembly tasks with special attention to the automotive industry is introduced.
Chapter 3: The results of an extensive literature review on the topic are presented. A final list of the most relevant studies on the effectiveness assessment of virtual technology for the training of assembly tasks is obtained.
Chapter 4: The framework for the effectiveness assessment of virtual reality training in assembly tasks is proposed, based on the most suitable criteria gathered from the literature review and the principal theoretical content of the science of training.
Chapter 5: A plan for the application of the proposed framework to SIEMA is suggested, taking into consideration the company’s training objectives and the peculiarities of the proposed virtual tool.
Chapter 6: Lastly, an overview of this study is presented and future work is discussed.

In this chapter, all the basic concepts necessary for a better understanding of the current study are explored. First, the definition and structure of organizational training are discussed, as well as the factors that can influence its effectiveness and how it can be assessed. Then, a review of virtual technologies is presented, followed by taxonomies of the main systems and hardware currently in use. Lastly, an overall view of assembly tasks with special attention to the automotive industry is introduced.

Organizational training has acquired different definitions over time and as the market has evolved. A classic one, which focuses on the content learned by trainees, can be found in (Goldstein & Ford, 2002), who stated that training is the systematic learning of skills, rules, concepts, or attitudes aimed at improving performance in another environment. A more modern description (Noe, 2010), which sees training as a company activity, says that training is a planned organizational effort to promote the learning of job competencies by its employees. More recently, a more complete explanation, which brings together the focus on learning of the first definition and the emphasis on the organization of the second, was proposed by (Kraiger & Culbertson, 2013): training is a “systematic process initiated by the organization that results in relatively permanent changes in the knowledge, skills, or attitudes of organizational members”. It is an interpretation that acknowledges the existence of learning without training (e.g., self-directed learning or incidental learning) and of training without learning (an ineffective training program). A definition that comes from the market but reflects those academic interpretations can be found in the Inc. Magazine Encyclopedia (Inc., 2018), which states that training is a formal and structured means (a set of educational methods and programs) employed by organizations to enhance the performance of their employees.
In other words, training is an activity that allows trainees to learn relevant content necessary for better performance in their jobs. The knowledge to be taught varies from highly specific procedures to broader and long-term skills, and from mechanical and repetitive tasks to management abilities; thus, it can be applied by any organization that wishes to

improve its production processes by developing a highly skilled labor force, thereby acquiring an advantage in an increasingly competitive market. Current business research attests that a company should be as efficient as possible in managing three areas: finance, products (or markets), and personnel (also called human capital or workforce) (Boudreau & Ramstad, 2005). Analyzing those three domains in terms of market competition, it is possible to see that the labor force is the most decisive factor for organizational success. The global economy and its mechanisms have created an environment where it is more or less equally hard (or easy, depending on the economic health) for most companies of the same size to obtain funds. Moreover, globalization has allowed industries to sell to basically any market. As for product innovation, although it continues to represent a great advantage, its differentiating impact is smaller than before, with products in the same category becoming increasingly similar to each other (as is the case with smartphones, for example). Thus, the development and maintenance of skilled personnel is the strongest advantage a company may have nowadays, and training plays a key role in this. Training is relevant not only for organizations but for governments as well. Building and maintaining a skilled workforce is seen as a great instrument for national economic development and has considerable social implications because it can influence the unemployment rate and affect economic growth. Through a well-designed training system, it is possible to tackle the problem of an aging labor force while also dealing with a new generation of personnel and its diverse motivations and ways of learning.
It is also possible to reintegrate displaced workers and create a flexible labor pool with the capacity to adapt to changes, which can be fundamental to attracting and retaining companies in the town, state, or country, reducing unemployment and boosting the local economy (Aguinis & Kraiger, 2009). Given such relevance for organizations and governments, training has grown into a science rich in theories and theoretical content that illustrate and study many aspects of the learning process, including training design, trainees’ characteristics, and environmental features, aiming basically to determine what makes training effective. Although (Campbell J. P., 1971), in a statement that became famous in the area, affirmed that the training literature is “voluminous, non-empirical, nontheoretical, poorly written, and dull”, a significant advance in this field has been seen in the last thirty years, which has allowed a better and deeper understanding of training and of which factors are responsible

for its effectiveness, and has served as a basis for the development of training systems in many areas (medical, military, industry, etc.) (Salas, Eduardo, Tannenbaum, Kraiger, & Smith-Jentsch, 2012). One of those studies has had a great impact on the training literature by creating a structure of the training process with proposed components and by bringing the theoretical content closer to training practice (analysis, design, and evaluation). According to (Cannon-Bowers, Tannenbaum, Salas, & Converse, 1991) and further refinements by (Arthur Jr & Bennett Jr, 2003) and (Borsci, Lawson, & Broome, 2015), the main components of a training program are:

1. Training Objectives: the goals of the training in terms of the organization’s aims and needs. It can be, for example, the learning of a specific procedure like a surgery or the training of all the operations involved in the construction of the seat of a new car model;
2. Training Contents: the knowledge, skills, and affective capacities that are expected to be learned by the trainees and that are in line with the training objectives. It can involve, for example, theoretical content (what to do), procedural skills (how to do it), and motivation (trainees’ attitude);
3. Training Method: how the content is delivered to the trainees. It can be classroom-style (paper-based, video-based, or physical demonstration), on-job, or through a virtual training system;
4. Evaluation Criteria + Expected Outcomes: a set of metrics that measure how well the training method has delivered the training contents in line with the training objectives.

As can be seen in Figure 2.1, all the training program components are connected to each other in such a way that the definition of one depends on the content of the others.
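As an illustration only, the four components listed above can be represented as a small data structure. The sketch below is not part of the cited models: the class name, field names, and the `is_complete` helper are hypothetical, and the example values simply reuse the car-seat example mentioned in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrainingProgram:
    """Hypothetical container for the four training program components."""
    objectives: str        # 1. goals in terms of the organization's aims and needs
    contents: List[str]    # 2. knowledge, skills, and attitudes to be learned
    method: str            # 3. how the content is delivered to the trainees
    # 4. metrics and expected outcomes; empty until the evaluation is designed
    evaluation_criteria: List[str] = field(default_factory=list)

    def is_complete(self) -> bool:
        """Return True only when every component has been defined."""
        return bool(
            self.objectives
            and self.contents
            and self.method
            and self.evaluation_criteria
        )


# Example values are illustrative and taken from the examples in the text.
seat_program = TrainingProgram(
    objectives="Assemble the seat of a new car model",
    contents=["theoretical content", "procedural skills", "motivation"],
    method="virtual training system",
)

# No evaluation criteria have been chosen yet, so the program is incomplete.
print(seat_program.is_complete())
```

Keeping the evaluation criteria as an explicit field reflects the point that they are chosen in light of the other three components rather than independently of them.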
Thus, training program managers can manipulate those components with the aim of optimizing learning transfer and selecting evaluation criteria able to match the expected outcomes with the training objectives (Borsci, Lawson, & Broome, 2015). An example can help to visualize and understand how those components are connected and can influence each other. Given a training program whose objective is to teach a surgical procedure (a), a very specialized set of operations (b), which mimics the real procedures as faithfully as possible, is required. A virtual-based training tool with a high level of physical fidelity (c) can be chosen to simulate the tasks and actions of the

procedure with a high degree of accuracy, given that surgery is a critical activity and the trainees need to reproduce the learned steps precisely in the real world. Thus, the designed evaluation criteria (d), taking into account the physical fidelity of the procedure, can include metrics able to attest how accurate the hand movements are, which can be measured, for example, by an analysis of the economy of movements.

Figure 2.1. Training program components and how they relate to each other. Source: (Borsci, Lawson, & Broome, 2015).

On the other hand, given a training program whose objective is to teach mechanics how to fix cars of a given brand (a), a virtual-based training tool (c) that delivers a very specific procedure (b) is not so useful, considering that there are different car models (and each car can have a particular problem), and this approach would not prepare the trainees well for the variability of the real world. In this case, a cognitive-fidelity virtual tool that teaches, instead of highly specific procedures, psychomotor and cognitive tasks that are similar (but not identical) to those the trainees will probably face in the real world is more suitable. Thus, the workers would learn skills that enable them to generalize the learned content (rather than reproduce it precisely) and apply it successfully in a similar real scenario. In terms of evaluation criteria (d), applying the analysis of the economy of movements would not be so useful, since the efficacy of a mechanic’s performance is not connected to how efficiently he or she moves the hands or to whether unnecessary movements are performed. Another prominent study in the training literature (Kraiger, Ford, & Salas, 1993) explores the concept of learning transfer and the different forms of content delivered by a training system.
Learning transfer measures how much of the learning during training is later applied in the real job or how it subsequently affects real performance (Salas,

Eduardo, Tannenbaum, Kraiger, & Smith-Jentsch, 2012). According to (Kraiger, Ford, & Salas, 1993), this transfer is expressed in the form of three types of content: cognitive, skill-based, and affective. The cognitive facet of learning transfer is related to the knowledge taught, that is, to what the trainees need to know. Skill-based content consists of the new skills participants need to learn (e.g., procedural skills, mental rotation ability, high-precision hand movements). Affective content consists of the expected trainees’ attitudes, like motivation and self-efficacy, representing what participants need to feel. Thus, training can be defined as a set of activities aimed at the acquisition of knowledge (cognitive), skills, and attitudes, promoting the changes in cognition and behavior necessary for adequate job performance. Hence, a well-designed training evaluation needs to measure the learning transfer in terms of those three types of content. The way training content is delivered for assembly tasks has varied and evolved over the years, following the advancement of technology; however, classical approaches are still in use. A simple method consists of using an illustrated guide sheet with the steps necessary for mounting the target object. In this work, this technique is called paper-based training. It is a conventional approach that can be applied only when the complexity of the procedure is not high (Hoedt, Claeys, Van Landeghem, & Cottyn, 2016). Another conventional form, typically applied more frequently than the previous one, is watching a video in which the product is built (video-based training). In the effectiveness assessment experiments analyzed in this document, the participants watched the video two or three times (Hoedt, Claeys, Van Landeghem, & Cottyn, 2017). Another classic way consists of watching a physical demonstration performed by an expert (demonstrative training), which is normally followed by trainees’ trial performances.
On-job training is an effective approach, especially for complex tasks, although it can have an impact on the overall efficacy of production. It is typically applied after one of the three previous approaches. Lastly, a very recent and still immature method is the use of virtual-based training systems, whose development is normally expensive and time-consuming, but whose application can save costs and improve the learning transfer (Gavish, et al., 2013). In many companies, setting up a physical training environment is a complex and costly task. Moreover, the assembly and disassembly of real components take a lot of time. Thus, the

use of a VE that emulates the real training scenario can save time and money. Additionally, VR allows a highly interactive learning-by-doing approach that can potentially improve the acquisition of knowledge as well as the retention and recall of the contents. Other benefits of using VR for training include easier training evaluation through the automatic measurement of trainees’ performance, easier adaptation to participants’ characteristics (like learning style), and a safer and more controlled training environment. However, as discussed further in this document, it is a technique whose effectiveness for delivering training content is still under study, although it has presented promising results in many areas (Brough, et al., 2007).

The definitions of training evaluation are as varied as those of training itself. (Kraiger & Culbertson, 2013) affirmed that training evaluation is the gathering and analysis of data aimed at understanding whether training objectives were achieved and/or whether achieving those objectives resulted in an improvement of job performance after training. (Rogelberg, 2007) says that training evaluation is a process that aims (1) to determine the effectiveness (the extent to which trainees and the organization benefit as planned) and/or efficiency (the ratio of benefits to costs) of training systems and (2) to collect data to improve the training process. The relevance of evaluating a training system lies in the fact that the effectiveness assessment is a way of analyzing the training approach in financial terms, providing data to justify the investment made and further improvements. Explicitly revealing the benefits obtained from training outcomes is a way of valuing the training not only in terms of costs but also in relation to the human resources gained (Lewis & Thornhill, 1994).
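The distinction between effectiveness and efficiency in the definition attributed to (Rogelberg, 2007) above can be made concrete with a small numerical sketch. Efficiency follows the stated ratio of benefits to costs; expressing effectiveness as the achieved share of the planned benefit is an assumption made here for illustration, and the function names and numbers are hypothetical rather than data from this study.

```python
def training_effectiveness(achieved_benefit: float, planned_benefit: float) -> float:
    """Share of the planned benefit actually achieved (assumed operationalization)."""
    if planned_benefit <= 0:
        raise ValueError("planned_benefit must be positive")
    return achieved_benefit / planned_benefit


def training_efficiency(benefit: float, cost: float) -> float:
    """Ratio of benefits to costs, following the definition quoted in the text."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return benefit / cost


# Illustrative numbers only; they are not results from any experiment.
effectiveness = training_effectiveness(achieved_benefit=80.0, planned_benefit=100.0)
efficiency = training_efficiency(benefit=80.0, cost=40.0)
print(effectiveness, efficiency)
```

Under these assumptions, a program can be efficient (benefits exceed costs) while still falling short of its planned benefit, which is why the two quantities are reported separately.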
Moreover, the evaluation allows the identification of the parts that are not working well and those that worked as expected by the training objectives, thereby promoting a constant refinement of the training system (Salas, Eduardo, Tannenbaum, Kraiger, & Smith-Jentsch, 2012). Although many authors have highlighted for years the importance of evaluating training systems ((Hesseling, 1966) affirmed that one of the main jobs of a training manager is to check whether the training methods achieved the desired outcomes, and (Mann & Robertson, 1996) stated that training effectiveness measurement is a vital element of the

learning process), this is the most ignored and worst-executed training activity (Lewis & Thornhill, 1994). It is estimated that only 35% of UK organizations evaluate their education, training, and development programs (Tennant, Boonkrong, & Roberts, 2002). (Kirkpatrick, 1959) is another classic and relevant study in the training literature and is still used as a basis for the development of evaluation approaches for training programs by organizations and researchers. According to it, there are four types of outcomes that should be taken into account when measuring the effectiveness of a training system: reactions, learning, behavior, and results. “Reactions” are the learners’ attitudes, opinions, and feelings about the training; “learning” measures the knowledge and skills learned during the training; “behavior” is related to the transfer of the learned content to the job; and “results” measure how the training is reflected in the company in terms of quality improvement and cost reduction. Kirkpatrick’s levels are, in a certain way, similar to the learning transfer contents of the multidimensional model of (Kraiger, Ford, & Salas, 1993) (the KSA model) described previously in this section. Specifically, the first two levels of Kirkpatrick’s model represent the same concepts as the three KSA model items: “reactions” and “affective” both denote the trainees’ affective and attitudinal responses to the training program, while “learning” represents the skills learned (same as the “skill-based” factor) and the knowledge acquired (same as the “knowledge” factor). After an extensive literature review on the topic (Chapter 3), it was identified that the KSA model is the most used evaluation model in studies about the effectiveness of VR for the training of assembly tasks. This model will be used in Chapter 4 for the proposal of the evaluation framework.
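The correspondence described above between Kirkpatrick’s levels and the KSA outcomes can be summarized as a simple lookup table. This is a minimal sketch: the dictionary and function names are hypothetical, the label “cognitive” is used for the knowledge factor, and the empty lists for “behavior” and “results” only reflect that the text maps the first two levels and states no KSA counterpart for the other two.

```python
from typing import Dict, List

# Mapping as described in the text: only the first two Kirkpatrick levels
# have a counterpart among the KSA learning outcomes.
KIRKPATRICK_TO_KSA: Dict[str, List[str]] = {
    "reactions": ["affective"],
    "learning": ["cognitive", "skill-based"],
    "behavior": [],   # no KSA counterpart stated in the text
    "results": [],    # no KSA counterpart stated in the text
}


def ksa_outcomes(level: str) -> List[str]:
    """Return the KSA outcomes associated with a Kirkpatrick level."""
    key = level.strip().lower()
    if key not in KIRKPATRICK_TO_KSA:
        raise KeyError(f"unknown Kirkpatrick level: {level!r}")
    return list(KIRKPATRICK_TO_KSA[key])


for level in KIRKPATRICK_TO_KSA:
    print(level, "->", ksa_outcomes(level))
```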
The increase in the quality of training research seen in the last thirty years, as well as the improvements resulting from practice in organizations, have contributed to a change in the way training is seen. From a one-off event, it has come to be regarded as a constant and iterative process embedded in a broader organizational scope, where factors before, during, and after training can influence the effectiveness of the program (Salas, Eduardo, Tannenbaum, Kraiger, & Smith-Jentsch, 2012). Pre-training factors, such as the way training is framed or individual characteristics, can influence participants’ training performance and affect the effectiveness of the training program. Some studies have found that participants’ previous experience or abilities can affect the training results. (Baldwin & Magjuka, 1991)

have identified more positive attitudes toward the training in people who chose to attend the training than in people who were required to participate, and (Maurer & Tarulli, 1994) have observed that trainees who perceived their work climate as supportive were more motivated to learn. Moreover, different people learn in different ways according to their characteristics. (Kolb & Kolb, 2013) identified four learning styles and created a test to help discover which style is the most appropriate for each person. The trainees’ learning style can also influence the training results. In-training factors also have an impact on training outcomes and, because of this, should be considered in the effectiveness assessment framework. According to (Chiaburu & Marinova, 2005), implicit and explicit reinforcement can have a great impact on trainees’ performance, as can trainees’ self-efficacy and motivation. Lastly, post-training factors such as goal setting and guided reflection, as well as the application of the learned content in practice, also influence training outcomes (Salas, Eduardo, Tannenbaum, Kraiger, & Smith-Jentsch, 2012). Thus, the evaluation criteria designed to evaluate a training program need to take into account pre-, in-, and post-training factors.

Since its proposal in the PhD dissertation of Ivan Sutherland (Sutherland, 1963), VR has evolved considerably, as has its definition, and it has become a multidisciplinary area where engineers (computer, electrical, and mechanical), physicists, chemists, biologists, psychologists, philosophers, computer graphics specialists, and designers have contributed theories and technology development (Muhanna, 2015). The definitions are many and can vary according to the area and background of the author and to the time when they were proposed. (Pimentel & Teixeira, 1993) and (Brooks, 1999) focus their definitions on the user experience.
The former says that VR is “an immersive and interactive experience generated by a computer”, while the latter interprets VR as “any experience in which the user is effectively immersed in a responsive virtual world”. (Zhao, 2002), proposing a definition that focuses more on the components of the system, affirmed that VR is a closed computer system composed of a VE, a physical environment, and a software and hardware interface, allowing interaction between human and computer. The definition of (Sherman & Craig, 2003) is more biological and emphasizes human senses: VR is a medium composed of interactive computer simulations in which

the position and actions of users “are sensed in order to replace or augment the feedback to one or more senses”, providing the feeling of “being mentally immersed in the simulation”. A more recent and complete definition can be found in (Dionisio, Burns III, & Gilbert, 2013), which sees VR as “computer-generated simulations of three-dimensional objects or environments with seemingly real, direct, or physical user interaction”. At the time of its proposal and the development of the first projects in the area, VR was received with high expectations by society at large. The initial excitement about the promises of the new technology collapsed after some years, but researchers and industry continued investigating the area, publishing their findings in conferences like IEEE VR or ACM VRST, and journals like Presence. However, in the last few years, VR technologies and applications have stopped being a future promise restricted to university and company research centers and are now accessible to many, being present in many sectors of our society (entertainment, education, health, industry, etc.), in what has been called The VR Revolution or The Second Wave of VR (Anthes, García-Hernández, Wiedemann, & Kranzlmüller, 2016). This “new era” of VR (the popularization era) in the tech industry started in 2010 when a teenager, Palmer Luckey, created the concept of the Oculus Rift (a VR headset), which in 2014 was bought by Facebook for 2 billion dollars. Several competitors have appeared since then, most driven by the gaming industry, and now, with a smartphone, anyone can have an immersive, high-quality VE experience with Samsung’s Gear VR. At the end of the same year, The New York Times provided, together with its Sunday print edition, a piece of cardboard that could be folded into a headset and used together with a smartphone as the screen. Although very limited and not highly immersive, it was the first time that millions of people had access to VR at the same time.
In 2017, Zuckerberg announced the first sub-$500 oculus-ready PC and Sony released the PlayStation VR ($300). It is clear, “VR is about to hit the mass” (Hern, 2016). Although in the academy the difference between VR and Augmented Reality (AR) are conceptually well established (since their beginning), in the media and in the mind of most of people they are referenced as the same technology and named VR. In fact, the future tends to blurry the boundaries between both, making the Reality-Virtuality Continuum (Figure 2.2) more continuum than ever. The tendency is that the VR headsets will enable the users to switch among any position of the continuum (Madary & 12.

Metzinger, 2016), from total reality to total virtuality, passing through augmented reality and augmented virtuality. A term that is not so new (it dates from 1994) has been used to represent technologies that merge the real and virtual worlds somewhere along the Reality-Virtuality Continuum: Mixed Reality (MR). Thus, a terminological distinction needs to be made. As AR is a form of MR and most of the studies in the area of virtual training use the term MR, in this work VR and MR will be treated as the two possible virtual technologies, and the term “virtual-based training” or “virtual training” will be used to refer to both of them.

Figure 2.2. The Reality-Virtuality Continuum. Source: (Milgram & Kishino, 1994).

The evolution of virtual technology described above has generated an increasing number of devices and systems. Thus, proposing a stable taxonomy is not an easy task, given that it can quickly become outdated. However, taxonomies are a fundamental part of any modern information architecture and can be very useful in a comparative work like the present document, where different studies are analyzed in search of a common understanding of a topic and, because of this, need to be observed from the same perspective. For that, two recent taxonomy proposals were used in this work to classify the training systems and the virtual devices used by them. (Muhanna, 2015) proposed a taxonomy for virtual systems whose main classification factor is the level of mental immersion. The basic systems are those that, lacking special input and output hardware devices, provide a low level of immersion. They are screen-based, pointer-driven, and presented as three-dimensional graphics. They are divided into hand-based and monitor-based systems. In the hand-based systems, the 3D content is displayed on hand-held devices such as smartphones and tablets (this is the case of AR mobile applications, which use the device camera to show augmented information over a real scene).
In the monitor-based systems, the 3D content is shown on the non-portable screens of desktop computers. They are cheap but offer little interactivity and only a low level of immersion.

On the other hand, the enhanced virtual systems use more powerful devices to deliver a more immersive experience and can be partially or fully immersive. Partially immersive virtual systems are characterized by showing the content under a narrow field of view, typically on a large screen. The content can be displayed on (a) wall projectors, where users interact with the system through a data glove with limited movements, (b) the ImmersaDesk, where a big screen projects two overlapping pictures of the same content and goggles allow it to be visualized in 3D, and (c) monocular head-based devices, AR head devices which project virtual objects onto the view of the real scene. Lastly, there are the fully immersive tools, which are represented by the room-based and binocular head-based systems and are characterized by providing a large field of view, typically covering the user's entire sight. In the room-based systems, a whole room (and not only a display) is equipped to provide the feeling of total immersion. The binocular head-based systems are represented mainly by the well-known Head Mounted Display (HMD), which is composed of two small screens that display the virtual scene to each of the participant’s eyes.

Figure 2.3. A proposed taxonomy of virtual systems. Source: (Muhanna, 2015).

To understand the existing virtual-based training solutions explored in this study, beyond identifying the type of virtual system, it is necessary to know the kinds of devices typically used in those systems. (Anthes, García-Hernández, Wiedemann, & Kranzlmüller, 2016) present an extensive and recent taxonomy of current VR hardware. The tools are divided into input and output devices, and those that are not able to provide a considerable sense of immersion (e.g., simple projectors or monitors), as well as expensive and space-demanding setups (like CAVE), are not

considered in this taxonomy, which focuses only on those devices with the potential to reach the mass market. The main output devices are the HMDs, which can be wired or mobile. The mobile ones (e.g., Samsung Gear) are simpler because they are not connected to an additional computer, so a small device has to process all the content. They are typically used for entertainment and normally display panoramas from a stationary point of view or interactive paths based on gaze-directed navigation. On the other hand, the wired HMDs (e.g., HTC Vive, Oculus Rift, and PlayStation VR) are more powerful because they are connected to a computer which processes most of the content; they can have a camera to enable AR or eye tracking, and they are normally equipped with a 6 Degrees of Freedom (DOF) tracking system. Other types of output devices are explored in the taxonomy, but they are out of the scope of the current study.

Figure 2.4. A proposed taxonomy of VR hardware – Output Devices. Source: (Anthes, García-Hernández, Wiedemann, & Kranzlmüller, 2016).

In terms of input devices, the principal category is the controllers, which are hand-worn devices composed of buttons or touchpads, similar to traditional game controllers, and which can be wired or mobile. The navigation devices promote a more intuitive experience by giving the illusion of traveling through endless spaces, and the tracking devices track the position of the user’s body, providing a more intuitive way of controlling

actions in the VE. The data gloves can be very useful for virtual systems where hand movements are relevant, as in the virtual tools for training of assembly tasks, since they are able to track the position of each of the user's fingers.

Figure 2.5. A proposed taxonomy of VR hardware – Input Devices. Source: (Anthes, García-Hernández, Wiedemann, & Kranzlmüller, 2016).

Assembly is a manufacturing activity in which a product is built by putting together the components and subassemblies that compose it, which have normally been produced at different times and locations (Graham, 1988). According to (Nof, Wilhelm, & Warnecke, 1997), assembly can be defined as “the aggregation of all processes by which various parts and subassemblies are built together to form a complete, geometrically designed assembly or product (such as a machine or an electronic circuit) either by an individual, batch or a continuous process”. In the early years, before the Industrial Revolution, manufacturing processes were driven by craftsmen who had to possess a high level of expertise in all the steps involved in the assembly of the product, since the same person was responsible for the construction of the whole object from start to finish. Thus, training a worker was a time-consuming and costly activity, which limited the production scale. The Industrial Revolution brought

more efficient production systems with the adoption of interchangeable parts, conveyor belts, and chutes. This was the rise of continuous assembly lines for mass production, which became famous with the classic demonstration by Henry Ford, of Ford Motor Co., at the beginning of the last century. Another big change in the manufacturing industry came in the seventies with the need for flexibility in design and production. This started the era of flexible assembly, in which computers and robots began to be used for the design, planning, and control of production (Boothroyd, 2005). Assembly includes a series of activities: fastening, performing inspections and functional tests, labeling, separating good assemblies from bad, and packaging and/or preparing them for final use (Graham, 1988). The principal activities during assembly were summarized by (Lotter, 1986) and can be seen in Figure 2.6.

Figure 2.6. Activity groups employed during assembly. Source: (Lotter, 1986).

Although robots have replaced workers in many of those activities, the process cannot be fully automated, especially in those industries where guaranteeing the safety of the products is vital (e.g., automotive and aerospace). Thus, the creation and maintenance of a skilled assembly workforce is still a priority for organizations and promotes production efficiency.

The design of an assembly line can vary considerably depending on the target product; however, all lines have in common a conveyor system composed of different workstations where different workers perform different operations. The work is synchronized and production rates are programmed, with no cross flow, backtracking, or repetitious procedure (Boothroyd, 2005). In the case of an automotive assembly line, everything starts with a bare chassis. Additional elements are then successively added to it as the growing object moves along the conveyor. Typically, there are supplementary lines whose subassemblies join the main line, and so on (Britannica, 2018). Along the line, each worker in a specific station performs a specific set of operations particular to that station. In some companies, to avoid health problems caused by repetitive movements, workers constantly rotate between stations. Naturally, this demands a constant training system in which employees continuously learn the operations of different stations.

When asked about the customization options of his Model T, Henry Ford answered: “You can have any color you want, as long as it's black”. However, to please an increasingly diverse clientele, manufacturers nowadays offer a variety of configurations for the same model. For instance, a luxury car model can have up to 1024 possible configurations (Parry, Newnes, & Huang, 2011). Thus, in the production line, a complex and sophisticated scheduling and control system allows different assemblies to be on the line simultaneously, guaranteeing that the appropriate part or optional piece arrives at the right moment to permit the desired combinations (Britannica, 2018).
Although in the assembly lines of some industries (e.g., petroleum refining and chemical manufacture) the whole process is run by machines with very little human supervision, in other industries such as automotive and aerospace, part of the production needs to be performed by humans (using sophisticated tools), especially where manual dexterity is needed (e.g., production of car seats). Moreover, machines are expensive and inflexible, so their use is only worthwhile if they produce a high level of output (Deaton, 2009).

Usually, many of the elements that compose a car are not made in the same place as the main production line. Companies typically buy those parts from suppliers or have special factories with their own production lines to produce the components (Deaton, 2009). This is the case of Seat Inc., the partner in this project for the development of a virtual training system, which produces automotive seats for different car models of the Group PSA. The assembly line for car seats is similar to that described previously in this document. Typically, there is a main line and sub-lines, depending on the number of seats required per vehicle and the location of those seats in the car. The conveyor is controlled by an automatic system and has pallets for placing the seats, which move through each station, where new elements are successively added. In the case of Seat Inc., for each car model there are two assembly lines, one for the cushion and another for the back, one feeding into the other. Each station is equipped with all the tools necessary to execute the operations corresponding to that position, a guide sheet with the sequence of operations to be performed, and usually a barcode scanner to scan item barcodes before their positioning. Most of the actions are followed by a safety check performed by the employee. Quality and safety are more important than speed, so the workers have adequate time to perform the operations without hurrying. Apart from the central automatic control, each station has manual emergency stop and release buttons.

Figure 2.7. Seat assembly line. Source: (Liker, 2013).

Up to this point, it is clear how important having and maintaining a skilled labor force is to any industry. This requires the design, application, and evaluation of a training program able to provide constant learning to the employees. In this process, the use of virtual-based training tools has been gaining ground in the automotive industry, mainly for three reasons. First, although creating a virtual training tool is time-consuming and costly, the investment tends to pay off because virtual tools can reduce overall training costs: traditional training typically demands physical facilities, real components, or even a dedicated production line, which can be more costly and time-consuming (time spent on the assembly and disassembly of the same product) than the production and maintenance of a VR/MR tool. Thus, virtual training is very useful when the procedures to be trained are expensive, difficult, or dangerous to reproduce in real life. Second, it promotes a rich interactive experience, allowing trainees to visualize and interact with simulated objects, thereby enhancing the quality of skill acquisition through a learning-by-doing approach. Additionally, those systems are customizable and can be adapted to individual needs and learning styles, increasing user motivation during training and consequently promoting more effective learning. The third reason can be drawn from the (Borsci, Lawson, & Broome, 2015) definition of effectiveness assessment: “is the process by which researchers could identify the shortfalls and the possible customizations of a training tool for satisfying the needs of an organization”. Using a virtual training system, trainers can easily collect all sorts of data about trainees’ performance and use it to assess, calibrate, and adjust the training process.

As mentioned before, VR has been used as a training tool in many areas and has shown promising, although still not consistent, results in terms of learning effectiveness. In this chapter, those studies are identified and analyzed, aiming at a better understanding of the current state of the evaluation of virtual-based training systems and at gathering relevant information for the proposal of a framework to measure the effectiveness of the SIEMA.

The use of VR as a training tool is a hot research area that has shown impressive growth in the last three decades. This is the conclusion of an extensive analysis of published peer-reviewed studies on training with virtual technologies. Papers containing in the title or in the keywords section the words “virtual reality”, “augmented reality”, or “mixed reality”, together with “training”, were searched for in the Scopus database (Elsevier, 2018), the largest abstract and citation database of peer-reviewed literature. The query used in the search,

(KEY(training) OR TITLE(training)) AND (TITLE("virtual reality") OR TITLE("augmented reality") OR TITLE("mixed reality") OR KEY("virtual reality") OR KEY("augmented reality") OR KEY("mixed reality")),

returned 7,704 documents, whose distribution per year can be seen in Figure 3.1.

Figure 3.1. Number of articles published per year related to the use of virtual technologies (virtual reality or mixed reality) for training.
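The search string above combines a small set of field-restricted terms, so it can be assembled programmatically, which makes refinements of the search easier to reproduce. The following Python sketch is illustrative only: the helper names (`field_term`, `build_query`) are hypothetical, and the code only builds the query text as it would be typed into the Scopus advanced search; it does not contact Scopus.

```python
# Illustrative sketch: helper names are hypothetical, not part of any Scopus tool.
TECH_TERMS = ["virtual reality", "augmented reality", "mixed reality"]


def field_term(field, term):
    """Wrap a term in a Scopus field code, quoting multi-word phrases."""
    quoted = f'"{term}"' if " " in term else term
    return f"{field}({quoted})"


def build_query(topic="training", tech_terms=TECH_TERMS):
    """Build the query: topic in keywords or title, AND any technology term
    in title or keywords."""
    topic_clause = " OR ".join(field_term(f, topic) for f in ("KEY", "TITLE"))
    tech_clause = " OR ".join(
        field_term(f, t) for f in ("TITLE", "KEY") for t in tech_terms
    )
    return f"({topic_clause}) AND ({tech_clause})"


if __name__ == "__main__":
    print(build_query())
```

Keeping the technology terms in a single list means that adding or removing a term changes both the title and the keyword clauses consistently.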

The decision to search over the keywords instead of considering only the titles was made because in some articles the titles are not clear and representative of the research content, while the keywords are more informative, especially in the case of Scopus, where, apart from the keywords defined by the authors, there are indexed keywords, a broader set of keywords which involves engineering controlled terms, Compendex keywords, engineering main headings, and domain-specific terms such as medical nomenclature. The first studies in the area date from 1991, and since then the area has grown considerably, reaching 626 works in 2017. As can be seen in Figure 3.2, apart from the native area of VR (Computer Science), most of the research documents are from the engineering and medical fields.

Figure 3.2. Areas of studies in virtual technology for training.

However, as identified by (Borsci, Lawson, & Broome, 2015), just a small portion of the studies have designed experiments to measure the performance of participants, and an even smaller number have focused on the effectiveness of the training tools. Extending the previous search, a new query was executed on Scopus looking for papers with the words “effectiveness” or “evaluation” in the title, abstract, or keywords fields. The query

(KEY(training) OR TITLE(training)) AND (TITLE("virtual reality") OR TITLE("augmented reality") OR TITLE("mixed reality") OR KEY("virtual reality") OR KEY("augmented reality") OR KEY("mixed reality")) AND (TITLE-ABS-KEY(effectiveness) OR TITLE-ABS-KEY(evaluation))

resulted in 1,799

hits (23% of the papers obtained by the first query). Naturally, not all of the returned articles really focus on the effectiveness assessment of the training tool; indeed, some of them only describe the design processes and technology decisions, while others just evaluate the trainees’ performance. As the target of this research is training tools for assembly tasks, a narrower search was executed on Scopus by adding the word “assembly” in the title, abstract, or keywords fields:

(KEY(training) OR TITLE(training)) AND (TITLE("virtual reality") OR TITLE("augmented reality") OR TITLE("mixed reality") OR KEY("virtual reality") OR KEY("augmented reality") OR KEY("mixed reality")) AND (TITLE-ABS-KEY(effectiveness) OR TITLE-ABS-KEY(evaluation)) AND TITLE-ABS-KEY(assembly)

A total of 57 documents were returned. However, some of those papers are not studies about the effectiveness of VR tools for assembly tasks but, for other reasons, simply contain the words “effectiveness” or “assembly” in one of their fields. Thus, a manual analysis was performed on the set of returned articles, which is summarized in the table of Appendix A. Over half (31, or 54%) of those 57 documents, although containing the words “effectiveness” or “evaluation”, are studies that explore only the design of the training tools (Werrlich, et al., 2018) (Webel, Bockholt, & Keil, 2011) or some technical aspects (a new method to calculate haptic feedback, for example) (Faas, 2011) (Eck, Pankratz, Sandor, Klinker, & Laga, 2015), or are reviews of the topic (Lawson, Salanitri, & Waterfield, 2015), and do not evaluate the training system against its learning goals or explore the effectiveness assessment. Additionally, more than half (14) of the 26 remaining papers focus the evaluation only on the trainees’ performance (typically in terms of time and number of errors).
However, according to many studies (Moskaliuk, Bertram, & Cress, 2012) (Gavish, et al., 2013) (Borsci, Lawson, & Broome, 2015), this way of measuring effectiveness is poor because it does not consider the whole training program of which the training tool is part, the trainees’ perceived effectiveness (subjective evaluation), or the capacity to recall the learned content. Excluding one study that could not be properly evaluated because it is written in Japanese, and considering different papers within the same project as a single study, a final list of seven studies which evaluate the effectiveness of virtual-based

training tools for assembly tasks is obtained and can be seen in Table 3.1. These papers were then analyzed in depth, aiming at understanding the current status of the research on the effectiveness of VR for the training of assembly tasks (section 3.2) and at the proposal of an evaluation framework for SIEMA (chapter 4).

Table 3.1. Studies about the effectiveness of virtual training for assembly tasks found on Scopus

Id 1 2. 3. 4 5. 6 7. Study (Murcia-López & Steed, 2018) (Hoedt, Claeys, Van Landeghem, & Cottyn, The evaluation of an elementary virtual training system for manual assembly, 2017) (Hoedt, Claeys, Van Landeghem, & Cottyn, 2016) (Jiang, Zheng, Zhou, & Zhang, 2016). Area General. (Borsci S. , Lawson, Jha, Burges, & Salanitri, 2016) (Carlson P. , Peters, Gilbert, Vance, & Luse, 2015) (Oren, Carlson, Gilbert, & Vance, 2012) (Gavish, et al., 2013). Automotive. (Jia, Bhatti, Nahavandi, & Horan, 2013) (Jia, Bhatti, & Nahavandi, 2012) (Jia, Bhatti, & Nahavandi, 2009a). Automotive. (Jia, Bhatti, & Nahavandi, 2009b). General. General Industry. General General Industry. Goal Effectiveness assessment Effectiveness assessment. Effectiveness Assessment Criteria Performance, Subjective evaluation, and Recall assessment Performance and Subjective evaluation. Effectiveness assessment Design and Effectiveness assessment Effectiveness assessment Effectiveness assessment Effectiveness assessment Effectiveness assessment Effectiveness assessment Effectiveness assessment Design and Effectiveness assessment Effectiveness assessment. Performance and Subjective evaluation Performance and Subjective evaluation Performance, Subjective evaluation, and Recall assessment Performance, Subjective evaluation, and Recall assessment Performance, Subjective evaluation, and Recall assessment Performance, Subjective evaluation, and Recall assessment Performance, Subjective evaluation, and Recall assessment.

Most of them focus on the evaluation of a virtual-based training tool described in more detail in previous papers, while others contain the design and the effectiveness assessment in the same document and thus typically present a limited description of the evaluation process. Some of the training systems were developed to cover a specific application area, such as the automotive or mechanical industry, but others opted to be generic, using puzzles or construction toys (LEGO® or MECCANO®) instead of real assembly tasks in the procedures to be taught. However, what all those studies have in common is that they employ more robust effectiveness evaluation criteria that concentrate not only on trainees’ performance but also on the perceived usability and effectiveness (typically referred to as subjective evaluation or heuristic analysis) and

on the assessment of the capacity to recall the learned skills and knowledge days after the training.

Although the car industry was one of the first to apply VR technology in its processes, especially in the prototyping and design areas (Sá & Zachmann, 1999), it took a while for it to start employing the technology for the knowledge acquisition of its employees. Thus, while some fields such as medicine and the military already have a well-established body of scientific work on the use of virtual training, automotive manufacturers are still taking their first steps (Borsci, Lawson, & Broome, 2015). Given the benefits of virtual training highlighted in section 2.1, it is expected that it can become a more powerful and effective tool for delivering training content than the traditional forms (video-based or learning-by-doing, for example). However, consensus about the effectiveness of VR for the training of assembly tasks is still far from being reached (Carlson P. , Peters, Gilbert, Vance, & Luse, 2015), given that the works in the area have presented conflicting conclusions. In some of them, virtual-based training is clearly superior to traditional learning approaches in terms of trainees’ performance, but in others the two showed an equal capacity to provide quality learning. This uncertainty about the effective use of virtual training systems for assembly tasks can be explained by several reasons, some of which stem from peculiarities of this application field that distinguish it from other industries, and others from methodological limitations or mistakes in some of those papers.
In other fields, such as healthcare and the military, different VR training tools are applied under basically the same training aims, conditions, and contents, where a specialized procedure with standardized rules is taught. This allows the creation of comparable benchmarks or evaluation criteria within each field that can be used as a standard way of measuring effectiveness (Borsci, Lawson, & Broome, 2015). This is not what happens with training for assembly tasks, in which the program goals, the stakeholders involved in the training (managers, suppliers, operators, etc.), and the VR tools vary considerably from company to company, and operators are trained to perform variable procedures (assembly, disassembly, maintenance, etc.), all of which hinder the

creation of comparable evaluation criteria (Michalos, Makris, Papakostas, Mourtzis, & Chryssolouris, 2010). Another factor that contributes to the lack of evaluation studies of VR training tools for assembly tasks, and to the consequent uncertainty about their effectiveness, is that the development of a virtual training system is a costly and time-consuming process, which makes companies reluctant to spend additional budget on having their employees allocated to an assessment activity instead of doing their jobs (Haque & Srinivasan, 2006). The high heterogeneity of training goals, conditions, and contents, as well as company budget restrictions, has limited the number of studies that aim to measure the effectiveness of VR-based training for assembly tasks. According to (Tang, Owen, Biocca, & Mou, 2003) and (Borsci, Lawson, & Broome, 2015), this lack of experimental evidence is evident in the manufacturing and automotive sectors. Moreover, part of the already limited number of evaluation studies suffers from diverse methodological and experimental design issues. In some of them (Jiang, Zheng, Zhou, & Zhang, 2016) (Gallegos-Nieto, Medellín-Castillo, González-Badillo, Lim, & Ritchie, 2017), the sample size was too small to support statistically significant claims, resulting in a relevant threat to conclusion validity. In others (Gavish, et al., 2013) (Carlson P. , Peters, Gilbert, Vance, & Luse, 2015) (Hoedt, Claeys, Van Landeghem, & Cottyn, 2017), just one task was used as the target procedure to be trained and performed, resulting in weak conclusions, given that they were obtained from just one experimental object (a threat to external validity).
Furthermore, the task(s) used in the experiments were not complex enough to avoid a ceiling effect resulting from experienced workers performing a very simple task (since the task is easy, it can be learned effectively regardless of the learning approach: VR, video-based training, learning-by-doing, etc.). Some studies (Gallegos-Nieto, Medellín-Castillo, González-Badillo, Lim, & Ritchie, 2017) (Stone, Watts, Zhong, & Wei, 2011) have found that effectiveness depends on the complexity of the task performed. According to them, complex tasks reveal the benefits of VR-based training more clearly than simple tasks. In short, the greater the task complexity, the greater the effectiveness.

Another experimental problem is that some of those studies (Brough, et al., 2007) (Jia, Bhatti, Nahavandi, & Horan, 2013) evaluate their VR training systems only in terms of user performance and perception and do not compare them with traditional training approaches. According to (Gallegos-Nieto, Medellín-Castillo, González-Badillo, Lim, & Ritchie, 2017), one of the most direct forms of quantifying effectiveness is to measure the reduction of the real assembly time after virtual training in comparison with the real assembly time when traditional training is used. The last issue identified in some of those studies is that they applied a limited set of evaluation criteria. Typically, the metrics used in the effectiveness assessment were the time and the number of errors during a test performance of the trained procedure. However, as some authors have highlighted (Riva & Mantovani, 2001) (Moskaliuk, Bertram, & Cress, 2012) (Gavish, et al., 2013), a training tool belongs to a larger training program of an organization and, thus, the analysis of the effectiveness of this tool cannot disregard the organizational needs, the training program aims, and the environment (actors and their relationships).

The seven studies about the effectiveness of virtual training for assembly tasks were analyzed in depth, aiming at a better understanding of the problem and at acquiring the knowledge needed to achieve the main goal of this work, which is to propose a framework for measuring the effectiveness of virtual-based training for assembly tasks. To the previous list of selected articles, three other studies (Brough, et al., 2007) (Borsci S. , Lawson, Salanitri, & Jha, 2016) (Gallegos-Nieto, Medellín-Castillo, González-Badillo, Lim, & Ritchie, 2017) were added manually (by looking for other papers by the same authors or in the reference lists of the papers already found).
They were not returned by the search filters but are relevant studies about the evaluation of virtual training tools. The final list can be seen in Table 3.2. In the first stage of this analysis, in the search for the most adequate set of evaluation measures for SIEMA, the effectiveness assessment criteria used by each of those papers were evaluated and compared with each other (details about this process are described in the next chapter). After this, a study evaluation form (see Appendix B) was

(36) created which was applied to all the target papers, aiming to have a structured view of those articles’ research in terms of evaluation criteria, experimental design (number and kind of participants, variation of target task complexity, comparison with traditional training approaches), technology used, training design, analysis of the results, and conclusion about the effectiveness. The analysis results can be seen in Table 3.2 and are summarized in the following paragraphs. As explained before, the term Virtual Reality is used in this work in a broad way, representing, given the Virtual Reality Continuum, not only the totally immersive tools, but the mixed reality (augmented reality and augmented virtuality) approaches as well. However, aiming to identify which technology is more used in the development of virtual training systems for assembly tasks and if there is scientific evidence that one is better than the other, the distinction between VR and MR was made during this study analysis. Although the number of studies that applied MR, especially AR, for the training of assembly tasks is much higher than VR, very few studies have compared the effectiveness of both and they could not find evidence that one is more effective than the other. The preference for MR is credited to the fact that this technology, as opposed to VR, can be used not only for training but also for the guiding of the employees during their real assembly activities post-training (Borsci, Lawson, & Broome, 2015). Another aspect considered in the analysis was the complexity of the tasks to be trained. As mentioned before, the complexity of the procedure to be learned can influence the effectiveness of the tool. Complexity definition and ways of measurement are diverse and depend on the area of study, but parameters like the number of parts, the shape of the parts, the number of possible orientations, among others, certainly can influence the easiness of mounting an object. 
In the manufacturing industry, some studies have presented different but related views of complexity. Goldwasser, Latombe, and Motwani (1996) explored the concept of assembly cost, stating that it depends on the number of steps, the number of directions, the number of re-orientations, and the depth of the assembly sequence. Ghandi and Masehian (2015) stated that when the parts have a simple geometric shape, the complexity can be measured in terms of the number of those parts, but when they have polygonal or polyhedral shapes, the total number of vertices is a better complexity indicator. Badrous and Elmaraghy (2010) were more objective and presented an equation to calculate the complexity of an assembly product:

$$C_{product} = \left[\frac{n_p}{N_p} + CI_{product}\right]\left[\log_2(N_p + 1)\right] + \left[\frac{n_s}{N_s}\right]\left[\log_2(N_s + 1)\right], \qquad CI_{product} = \sum_{p=1}^{n} x_p \, C_{part}$$

where $N_p$ is the total number of parts, $N_s$ is the total number of fasteners, $n_p$ is the number of unique parts, and $n_s$ is the number of unique fasteners. This equation was used to define the complexity of the tasks in the analyzed studies. However, some of the papers did not provide enough detail about the trained task to apply the equation.

Other elements analyzed in the review, which can also influence the effectiveness of a training tool, though on a smaller scale, are the control group learning approach, the type of target task, and the kind of participants selected for the experiments. Typically, the control group is trained with one of three traditional training approaches: a 2D form with images illustrating the assembly step by step, an instructional video, or learning-by-doing. According to Hoedt, Claeys, Van Landeghem, and Cottyn (2016), the learning strategy for the control group can influence the results and conclusions of the effectiveness assessment, and the lack of homogeneity in this choice among the studies “makes the comparison of different test results very hard and subjective”. Lastly, for the experiments to represent the real world as faithfully as possible, and thus to allow more reliable conclusions to be drawn (decreasing the threats to external validity), it is preferable to use real assembly tasks (some of the papers used LEGO® and others used puzzles) and workers from the manufacturing industry (in some studies, the trainees were students or randomly selected people).

Borsci, Lawson, Jha, Burges, and Salanitri (2016) is one of the most complete works on the evaluation of the effectiveness of virtual training for assembly tasks and the one that best deals with the methodological and experimental design issues explored previously in this section.
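As a brief aside on the product complexity equation given earlier in this section, the following is a minimal sketch of how it could be evaluated for a task, assuming the base-2 logarithm form of the equation. The function name, its parameters, and the example values are hypothetical and are not taken from any of the reviewed studies. The sketch also assumes that x_p is the share of each unique part in the product and that C_part is a per-part complexity factor between 0 and 1, since neither is defined in the text above.

```python
import math


def product_complexity(unique_parts, total_parts,
                       unique_fasteners, total_fasteners,
                       part_shares, part_complexities):
    """Evaluate the assembly product complexity equation (hypothetical helper).

    unique_parts (n_p), total_parts (N_p), unique_fasteners (n_s) and
    total_fasteners (N_s) follow the definitions given in the text.
    part_shares (x_p) and part_complexities (C_part) are assumed inputs,
    one value per unique part.
    """
    if total_parts <= 0:
        raise ValueError("total_parts must be positive")
    if len(part_shares) != len(part_complexities):
        raise ValueError("part_shares and part_complexities must match in length")

    # CI_product = sum over unique parts of x_p * C_part
    complexity_index = sum(x * c for x, c in zip(part_shares, part_complexities))

    # [n_p / N_p + CI_product] * log2(N_p + 1)
    parts_term = (unique_parts / total_parts + complexity_index) * math.log2(total_parts + 1)

    # [n_s / N_s] * log2(N_s + 1); treated as zero when there are no fasteners
    # (an assumption, to avoid dividing by zero)
    if total_fasteners > 0:
        fasteners_term = (unique_fasteners / total_fasteners) * math.log2(total_fasteners + 1)
    else:
        fasteners_term = 0.0

    return parts_term + fasteners_term


# Illustrative values only: 7 parts (2 unique kinds) and 3 identical fasteners.
example = product_complexity(
    unique_parts=2, total_parts=7,
    unique_fasteners=1, total_fasteners=3,
    part_shares=[0.5, 0.5], part_complexities=[0.4, 0.6],
)
print(round(example, 3))
```

With these illustrative inputs the result is about 3.02. A helper of this kind can only be applied when a paper reports the part and fastener counts of the trained task, which, as noted above, was not always the case.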
Borsci et al. developed a system to train professionals of any industry through a virtual learning-by-doing approach and designed an experiment in which sixty people (novice and intermediate workers in one of the steps of the automotive development process: design, project management, inline operations, etc.) were divided by their expertise and randomly assigned to one of three groups: one trained with a VR CAVE, another with VR zSpace (a portable 3D holographic interactive table), and the third with a video explanation of the procedure (control group). Only one task was used as the procedure to be learned, but it was a real operation (disassembling and replacing a faulty lower arm of the front suspension) on a real car model (Range Rover Evoque), composed of 24 steps. No further details were given about the task, but from the information presented it is possible to infer that it was a complex task, since it seems to involve a considerable number of components with non-trivial geometric shapes. Applying robust evaluation criteria composed not only of trainee performance data (time and number of errors) but also of a heuristic evaluation through feedback questionnaires that measured user perceptions (more details in the next chapter), they concluded that virtual-based training “can enhance significantly trainees’ acquisition of the procedural skills” and that it is “a powerful alternative to classic video training explanations”, although no statistically significant differences were found among the groups in terms of post-training performance and proficiency.

Jia, Bhatti, and Nahavandi (2009a, 2009b, 2012) and Jia, Bhatti, Nahavandi, and Horan (2013) are papers from the same study, focused on establishing a method for evaluating VR for the training of technical and procedural tasks. To this end, they created a virtual training system to prepare workers for an automotive assembly production line. The system was designed for high usability, so that the representation of the scenario and, above all, the user actions were as natural and as similar to the real world as possible. The use of a 6 DoF HMD, haptic devices (Phantom®), a data glove, and a 3D mouse helped in this process. In the experiment, seventy-six people of different backgrounds and ages were trained to perform seven tasks that varied in difficulty level (easy, moderate, and advanced). Moreover, different training modes (process demonstration, guided assembly, unguided assembly, and free play) were devised to allow progressive and effective learning.
The criteria used were based on classical training studies, which state that the assessment of a training tool should be guided by cognitive, skill-based, and affective metrics. The cognitive outcomes were measured by a memory test questionnaire applied two weeks after the training, while the skill-based evaluation was calculated from performance results (time and number of errors). The main contribution of this work lies in the affective assessment, with the creation of two questionnaires to measure the users’ perceived capability in performing the training (self-efficacy scale) and the participants’ personal belief that the tool was able to deliver effective learning (perceived VR efficacy scale). User perception of and satisfaction with the tool are considered relevant metrics for the evaluation of effectiveness by many studies (Borsci, Lawson, & Broome, 2015; Vélaz, Arce, Gutiérrez, Lozano-Rodero, & Suescun, 2014; Xia, Lopes, Restivo, & Yao, 2011; Gavish, et al., 2013; Hoedt, Claeys, Van Landeghem, & Cottyn, 2016). The evaluation of the tool was based on the outcomes of those metrics, but the problem with the experiment is that the authors did not include a control group and thus did not compare the effectiveness of the virtual tool with traditional training approaches.

Gavish et al. (2013) performed an experiment to evaluate the VR and AR platforms of the SKILLS Integrated Project (Casado, et al., 2009), an interdisciplinary project that aims at “the development of new methodologies for enactive (learning by doing) skills acquisition and transfer of skills using multi-modal interfaces in VR and AR environments”. Forty specialists from a company that manufactures packaging equipment for liquids were divided into four groups: VR, AR, control VR, and control AR. In the case of VR, the equipment was simple, consisting of just a screen displaying the 3D graphical scene and one haptic device for input actions. They used just one target task, but it was a real assembly procedure (mounting part of the electronic actuator of a motorized modulating valve) of medium complexity (twenty-five steps grouped into six sub-tasks). In terms of evaluation criteria, instead of just counting the number of performance errors, they categorized the mistakes (missing a piece, exchanging positions of pieces belonging to the same step, exchanging pieces between steps, and wrong placement of components) to better understand the trainees’ errors. They also applied two post-training questionnaires to measure the perceived usability and the perceived effectiveness of the training. Although they could not compare the performance of the VR and AR groups because the training conditions were different, they could compare the perceived learning transfer and the perceived usability to check which tool was better received by the users.
Participants gave higher scores to the AR platform than to the VR platform on most items of the post-training questionnaires. However, it is important to mention that the VR platform used was simple and not very immersive (it used a screen instead of an HMD). The problem with this study is that the procedure to be learned was relatively simple for participants who were specialists and current workers in the area. Thus, no differences could be found between the performance of the VR group and the Control-VR group, since the control group could learn the task easily just by watching a demonstration video. Another conclusion of the experiment is that virtual training takes more time (the VR and AR training groups required longer training time than the Control-VR and Control-AR groups). However, this conclusion is not relevant, given that it came from an experiment in which the training tools were not considered as part of a training program. A