Human activities can be viewed as sequences of smaller, basic movements. For a specific activity, the order of these basic movements is always the same. For example, the activity ‘picking up an object’ can be split into a sequence of five separate actions: (1) a downward acceleration; (2) a downward deceleration; (3) no movement; (4) an upward acceleration; and finally (5) an upward deceleration. To understand a typical human activity in terms of a HMM, we start with the hidden Markov chain and what it should represent. To impose a temporal order on the model, we choose a left-to-right architecture with delta equal to 1 (see also 2.3.2). We let each state in this representation correspond to one of the basic movements of an activity. For the activity ‘picking up an object’ as described above, the number of states is then five. Because of the chosen architecture, each of these states must be visited, which means that none of the basic movements of the activity can be skipped. This is enough information to implicitly define the structure of the transition probability matrix A: it is an upper triangular matrix in which all elements are zero except those on and directly above the diagonal.

The second model parameter, namely the initial state distribution, is easy for the chosen Markov chain architecture. Since we are only considering left-to-right models, all chains must start from the left-most state. This means that the initial distribution will be of the form π = {1, 0, ..., 0}.
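As an illustration, the structure described so far can be written out for the five-state ‘picking up an object’ model. The entry names a_{ij} are notation introduced here for the sketch, and the value 1 in the last row follows only from the assumption that the final state has no outgoing transition other than to itself.

```latex
\[
A =
\begin{pmatrix}
a_{11} & a_{12} & 0      & 0      & 0      \\
0      & a_{22} & a_{23} & 0      & 0      \\
0      & 0      & a_{33} & a_{34} & 0      \\
0      & 0      & 0      & a_{44} & a_{45} \\
0      & 0      & 0      & 0      & 1
\end{pmatrix},
\qquad
a_{ii} + a_{i,i+1} = 1 \quad (i = 1, \dots, 4),
\qquad
\pi = (1, 0, 0, 0, 0).
\]
```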

The third model parameter is the signal probability distribution in each state. We have assumed that a state corresponds to a small, basic movement, which is characterized by the data being stationary in some sense (e.g. downward acceleration or no movement at all). In the theoretical ideal case, where specific human activities are always performed in the exact same way, and any resulting recorded data is very ’clean’, it would be possible to observe these stationary parts across all parallel data channels. The resulting sequence of these stationary parts then makes up a complete activity.

To some extent it is indeed possible to observe these stationary parts in real-life measurements; see for example figure 3.2. To account for noise in the signal and variation between different examples of the same activity, we model each state by a normal distribution with mean µ and covariance matrix Σ. Data points can then be probabilistically matched to whichever state was most likely to have generated them. We have now covered all parameters λ = (A, B, π) that make up a HMM. For each activity that needs to be classified, a separate HMM needs to be defined. Together, these activities make up our dictionary.
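The matching of a data point to its most likely state can be sketched as follows. This is a minimal illustration and not the implementation used in this work: the function names and the toy means and covariances are assumptions, and the sketch scores each point in isolation, ignoring the transition structure that the full HMM adds.

```python
import numpy as np

def gaussian_log_density(x, mean, cov):
    """Log-density of a multivariate normal N(mean, cov) evaluated at x."""
    d = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    # Mahalanobis distance computed without forming the explicit inverse.
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def most_likely_state(x, means, covs):
    """Index of the state whose Gaussian gives x the highest log-density."""
    scores = [gaussian_log_density(x, m, c) for m, c in zip(means, covs)]
    return int(np.argmax(scores))

# Toy example (hypothetical values): two states in a two-channel signal.
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
covs = [np.eye(2), np.eye(2)]
state = most_likely_state(np.array([4.5, 5.2]), means, covs)
```

In the HMM itself these per-state log-densities play the role of B; the decoding algorithm then combines them with A and π.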

3.3.1 Initialization

Before the training algorithms can be applied, the HMM parameters need to be initialized. The initial distribution π is always of the form {1, 0, ..., 0} as described above. For the initialization of the transition probability matrix, the only thing that is important is that all illegal state transitions have probability zero. The nonzero elements can be given any positive value between zero and one; the only constraint is that each row of the matrix should sum to one.

Figure 3.2: An example of an accelerometer data channel of a subject performing a ‘squat’ (x-accelerometer data; horizontal axis: time; vertical axis: acceleration in m/s²). The sensor was positioned on the spine. It is possible to observe stationary parts within the time series.
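The initialization of π and A described in this subsection might look like the sketch below. The function name, the random seed and the range from which the nonzero entries are drawn are all assumptions made for illustration; any positive values would do as long as the rows are normalized.

```python
import numpy as np

def init_left_to_right(n_states, seed=0):
    """Initial distribution and random transition matrix for a
    left-to-right HMM with delta = 1 (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Legal transitions: stay in the same state or move one state to the right.
    legal = np.eye(n_states) + np.eye(n_states, k=1)
    A = rng.uniform(0.1, 1.0, size=(n_states, n_states)) * legal
    # Normalize so that every row sums to one.
    A /= A.sum(axis=1, keepdims=True)
    pi = np.zeros(n_states)
    pi[0] = 1.0  # every chain starts in the left-most state
    return pi, A

pi, A = init_left_to_right(5)
```

Because Baum-Welch re-estimation never turns a zero transition probability into a nonzero one, the illegal transitions stay at zero throughout training.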

For the initialization of the signal probability distribution, the values of µ and Σ need to be specified. In contrast to the matrix A, here the initial values play a more important role because they determine the initial location of the hidden Markov chain states within the state space. Since the Baum-Welch algorithm is a local optimizer, and since it is as yet unknown to us how ‘mountainous’ the space is in which Baum-Welch operates, a smart choice of the initial values of µ and Σ seems essential.

However, for our preliminary testing, we will provide all HMMs with a ‘flat’ start. All µ_j and Σ_j will be given the value of the mean and variance of the complete training dataset. We will return to this question in chapter 5 and address it in more detail.
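A flat start could be computed as in the following sketch. It assumes the training data is an array of shape (samples, channels) and that "variance" means a diagonal covariance built from per-channel variances; both are assumptions of this example, as is the function name.

```python
import numpy as np

def flat_start(training_data, n_states):
    """Give every state the global mean and (diagonal) variance of the data."""
    global_mean = training_data.mean(axis=0)
    # Assumption: diagonal covariance from per-channel population variances.
    global_cov = np.diag(training_data.var(axis=0))
    means = np.tile(global_mean, (n_states, 1))    # shape (n_states, channels)
    covs = np.tile(global_cov, (n_states, 1, 1))   # shape (n_states, channels, channels)
    return means, covs

# Toy data (hypothetical): two samples from a two-channel sensor.
data = np.array([[0.0, 0.0], [2.0, 4.0]])
means, covs = flat_start(data, n_states=3)
```

With identical emission parameters in every state, the first re-estimation step is driven by the left-to-right transition structure alone.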

3.3.2 Training and testing

To train the HMMs, we need to provide a set of training data. This training set consists of a sequence of manually labeled examples of each of the activities from the dictionary. The activities are recorded in a random order, to make sure the classifier is not trained to recognize them in a certain order.

The HMMs are trained through application of the iterative Baum-Welch algorithm (see section 2.4.3). An open question is how often to iterate it. If a HMM goes through too many iterations, it may adjust to very specific random features of the training data that have little or nothing to do with the target activity. When the model overfits the data, the performance on the training examples may still increase while at the same time the performance on unseen data becomes worse. For the case of speech recognition, the HTK manual [43] suggests that five iterations should be sufficient. Compared to speech, our data is of a higher dimension, so a few more iterations may be needed.
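One common way to choose the number of iterations, not proposed in the text above and shown here only as a possible approach, is to monitor a score on held-out data and stop once it no longer improves. In the sketch below, `reestimate` stands for one Baum-Welch pass and `score` for a held-out evaluation; both are hypothetical callables supplied by the caller, as are the parameter names.

```python
def train_with_early_stopping(model, reestimate, score, max_iters=5, patience=1):
    """Apply `reestimate` up to `max_iters` times and return the model
    that scored best on held-out data, together with that score."""
    best_model, best_score = model, score(model)
    bad_rounds = 0
    for _ in range(max_iters):
        model = reestimate(model)
        current = score(model)
        if current > best_score:
            best_model, best_score = model, current
            bad_rounds = 0
        else:
            bad_rounds += 1
            if bad_rounds >= patience:
                break  # held-out score stopped improving
    return best_model, best_score

# Toy stand-ins (hypothetical): the "model" is an integer, each pass adds one,
# and the held-out score peaks when the model equals 3.
best, best_score = train_with_early_stopping(
    0, reestimate=lambda m: m + 1, score=lambda m: -(m - 3) ** 2
)
```

The default of five passes mirrors the HTK suggestion quoted above; the cap and the patience would need tuning for higher-dimensional sensor data.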

The trained models can now be provided with an independent test data set. The test data set again consists of a sequence of randomly ordered activities. The Viterbi algorithm (see section 2.4.2) is applied to transcribe the data.
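For reference, Viterbi decoding for a single HMM can be sketched in log-space as below. This is a simplified illustration: it decodes the state path within one model, whereas transcribing a sequence of activities requires decoding over the connected models of the whole dictionary. The function signature and the toy numbers are assumptions of this example.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B):
    """Most likely state path.
    log_B[t, j] is the log-likelihood of observation t under state j."""
    T, N = log_B.shape
    delta = np.empty((T, N))            # best log-score of a path ending in state j at time t
    psi = np.zeros((T, N), dtype=int)   # back-pointers to the best predecessor
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: come from i, go to j
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(N)] + log_B[t]
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path, delta[-1, path[-1]]

# Toy two-state left-to-right model (hypothetical values).
with np.errstate(divide="ignore"):      # log(0) = -inf marks illegal transitions
    log_pi = np.log(np.array([1.0, 0.0]))
    log_A = np.log(np.array([[0.5, 0.5], [0.0, 1.0]]))
log_B = np.log(np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]))
path, log_score = viterbi(log_pi, log_A, log_B)
```

Working with log-probabilities avoids numerical underflow on long recordings, and the entries equal to minus infinity ensure that illegal transitions are never selected.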