One major obstacle that even we, as humans, encounter when observing people’s behavior is the difficulty of verbalizing and communicating in detail what we see. If we limit ourselves to everyday activities, we are generally able to provide a coarse description of what happened in a given period of time. This is usually achieved by listing verbs which represent the actions of interest. The problem becomes slightly more complex when we are asked to describe how specific actions are done. For example, if we had to convey what is involved in cooking a particular dish, we could present “mixing the ingredients” as a repetitive task where the arm makes a circular motion, while holding a wooden spoon, etc. Yet other actions, such as jumping to hit the ball during a volleyball game, might be more challenging, since a precise combination of precisely executed atomic motions is the only way to achieve the desired goal.

If we move to an even finer scale, describing the actual motion of our limbs and body parts to others so that they can repeat the same or a similar trajectory, we are left with little choice but to show an actual example of what we mean. Aside from a few motions for which we have words like “step”, “reach”, “grab”, “swing”, etc., we lack a taxonomy that appropriately represents what we understand and interpret effortlessly.

develop a notation that describes human motion, the direct mapping to visually observable aspects is still missing. The general consensus of the community seems to be that a fixed vocabulary for human motion is both non-existent and inappropriate. This is also our opinion, and we essentially agree with [Bob97] that the right approach must be hierarchical, with different taxonomies at different levels of abstraction. We base our decomposition of human motion primarily on the time-scale at which it happens, and in part on the semantics we commonly attach to it.

[Figure 5.1 diagram: three layers labeled Activities, Actions, and Movemes. Labels recovered from the figure: hold a meeting, dine, work out, walk, get seated, talk, open/close the door, take notes, run, drink, eat, kick, reach, pull, grab, sit down, carry, get up, step left, step right, throw, chew.]

Figure 5.1: A Hierarchical View. We interpret human motion in a hierarchical way. At the highest layer, a single word is sufficient to provide a compact description of an “activity”. We use the term “action” for shorter events that, joined together probabilistically, yield an activity. At the bottom layer are the “movemes”: atomic motions which are learned without supervision from data, and do not necessarily possess a verbal description. We arbitrarily name them for the sake of example.

At the highest level, we think of activities as happening over an extended period of time. The top portion of Figure 5.1 shows a few examples of what could be regarded as an activity. Humans generally have no trouble choosing words that identify an activity, and the choice is generally agreed upon by the majority, leaving no ambiguity.

a few seconds. Repetitive events such as walking are also classified as actions, since the elementary cycle that is repeated fits the definition of action. Another defining aspect of actions is that their combination, meant in a stochastic manner, yields an activity. For example, the activity of “holding a meeting” could be considered a probabilistic concatenation of actions such as “sitting down”, “opening a door”, “taking notes”, etc. In general, the duration, the order (to some extent), and even the presence or absence of some of the actions do not compromise the existence of the activity.
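To make the idea of a “probabilistic concatenation” concrete, the following is a minimal sketch, not the model developed in this work. It assumes, purely for illustration, that each action carries a probability of being present and a range of plausible durations, and that neighboring actions may occasionally swap places. The function `sample_activity`, the parameter `swap_prob`, and all numeric values are hypothetical.

```python
import random

# Hypothetical description of the activity "holding a meeting".
# Each entry: (action name, probability that the action occurs,
#              (shortest, longest) duration in seconds).
# All numbers are illustrative assumptions, not values from the text.
MEETING_ACTIONS = [
    ("opening a door", 0.90, (1.0, 3.0)),
    ("sitting down", 0.95, (1.0, 4.0)),
    ("taking notes", 0.60, (2.0, 6.0)),
]


def sample_activity(actions, rng, swap_prob=0.2):
    """Draw one realization of an activity as a list of (action, duration).

    Presence, duration, and (to some extent) order are all random,
    yet every realization is still an instance of the same activity.
    """
    sequence = []
    for name, p_present, (shortest, longest) in actions:
        # An action may be absent without invalidating the activity.
        if rng.random() < p_present:
            sequence.append((name, rng.uniform(shortest, longest)))

    # Loosen the nominal order: occasionally swap adjacent actions.
    for i in range(len(sequence) - 1):
        if rng.random() < swap_prob:
            sequence[i], sequence[i + 1] = sequence[i + 1], sequence[i]
    return sequence


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_activity(MEETING_ACTIONS, rng))
```

Each call produces a different sequence of actions, which mirrors the observation that an activity survives changes in duration, partial reordering, and missing actions.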

Elementary pieces of motion, limited in time to a few frames, which could even involve only part of the body (such as a limb, or the lower/upper body), belong to the bottom level. Inspired by the early work of Bregler [Bre97], we refer to them as movemes. Although we are able to associate names with some of these elementary motions, we do not feel this needs to be the case. In fact, unlike actions and activities, we believe that their definition should be inferred from data as the “best” set of short-term motions describing the actions observed. Unfortunately, the definition of “best” here is vague, since it is difficult to develop a meaningful metric that makes sense for every problem. Nevertheless, guided by a few requirements that we deem important, we propose a procedure for the identification of a dictionary of movemes in video sequences.