
We evaluate the various methodologies by training on the full set of time-points from one mouse and then testing our classifications on the full set of time-points from every other mouse. This validation scheme is designed to match the experimental situation, in which a model would be trained on time segments from one (or more) mice and used completely out of sample on data obtained from different mice. Of course, it is possible to train on randomly selected subsets of data (ideally, blocked subsets) from one mouse and test on hold-out subsets reserved from the same mouse. The comparative performance of "same mouse but out-of-sample" is much better,

There are only eight mice in the data set. However, we train on one, test on the other seven, and then repeat this procedure over all eight mice, yielding nearly 500,000 out-of-sample epochs. Thus, all results presented are statistically significant under the standard multinomial models, even though these models do not apply in the presence of strongly autocorrelated time series data. Consequently, we do not provide standard errors because it is not clear what the probability model is for the test statistic.
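
The train-on-one, test-on-the-rest loop described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy data, and the majority-class "classifier" are all hypothetical stand-ins for whatever model is actually being evaluated.

```python
from collections import Counter


def fit_majority(features, labels):
    """Toy stand-in for a real classifier: remember the most common label."""
    return Counter(labels).most_common(1)[0][0]


def predict_majority(model, features):
    """Predict the remembered label for every epoch."""
    return [model] * len(features)


def cross_mouse_evaluate(data, fit, predict):
    """Train on each mouse in turn and test on every other mouse.

    data maps a mouse id to (features, labels), one entry per epoch.
    Returns the pooled out-of-sample true labels and predictions.
    """
    truth, preds = [], []
    for train_id, (x_train, y_train) in data.items():
        model = fit(x_train, y_train)
        for test_id, (x_test, y_test) in data.items():
            if test_id == train_id:
                continue  # never test on the training mouse
            truth.extend(y_test)
            preds.extend(predict(model, x_test))
    return truth, preds


# Hypothetical toy data: three mice, three epochs each.
toy_data = {
    "mouse_a": ([0.1, 0.2, 0.9], ["NREM", "NREM", "WAKE"]),
    "mouse_b": ([0.8, 0.7, 0.3], ["WAKE", "WAKE", "NREM"]),
    "mouse_c": ([0.2, 0.5, 0.1], ["NREM", "REM", "NREM"]),
}
truth, preds = cross_mouse_evaluate(toy_data, fit_majority, predict_majority)
print(len(truth), "pooled out-of-sample epochs")
```

With eight mice the same loop produces 8 × 7 = 56 train/test pairs, so each mouse's epochs are scored seven times, once under each of the other mice's models.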

The EEG/EMG-based manual scoring of these eight mice is the "gold standard" for classification. There is an issue with internal inconsistency of manual scoring: classifications from different scorers agree in only about 92% of the epochs. Nevertheless, any model classifications which did not substantially match manual scoring would not be considered useful by sleep researchers. In addition to this overall error rate, we also consider the false positive and false negative rates for the rare and important REM state, which is of special interest to sleep researchers. They prefer a low REM false negative rate but can tolerate a high REM false positive rate because there are so few REM epochs.
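
As a rough sketch of the three error rates, the function below computes them from paired label sequences. The text does not spell out the denominators, so the definitions here are an assumption: the false positive rate is taken relative to the truly non-REM epochs and the false negative rate relative to the truly REM epochs. If false positives are instead measured against the epochs predicted as REM, the denominator would need to change. The function name and labels are illustrative.

```python
def rem_error_rates(truth, pred, rem_label="REM"):
    """Return (overall error, REM false positive rate, REM false negative rate).

    Assumed definitions (not confirmed by the text):
      false positive rate = predicted REM but truly non-REM / truly non-REM
      false negative rate = truly REM but predicted non-REM / truly REM
    """
    if len(truth) != len(pred) or not truth:
        raise ValueError("truth and pred must be non-empty and equally long")

    n = len(truth)
    overall = sum(t != p for t, p in zip(truth, pred)) / n

    false_pos = sum(t != rem_label and p == rem_label for t, p in zip(truth, pred))
    false_neg = sum(t == rem_label and p != rem_label for t, p in zip(truth, pred))
    n_rem = sum(t == rem_label for t in truth)
    n_non_rem = n - n_rem

    fp_rate = false_pos / n_non_rem if n_non_rem else float("nan")
    fn_rate = false_neg / n_rem if n_rem else float("nan")
    return overall, fp_rate, fn_rate


# Hypothetical manual scores and model classifications for six epochs.
manual = ["WAKE", "NREM", "REM", "REM", "NREM", "WAKE"]
model = ["WAKE", "REM", "REM", "NREM", "NREM", "WAKE"]
overall, fp_rate, fn_rate = rem_error_rates(manual, model)
print(overall, fp_rate, fn_rate)
```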

A second way we evaluate our predictions is by comparing the fitted duration distributions to the actual ones. This is important because sleep researchers are sometimes interested in estimating the parameters of these distributions and seeing how they vary across mice (McShane et al., 2010). In this case, fitting the distributions well is what is important, even if the epoch-by-epoch classifications themselves are not particularly accurate. We did this via χ² goodness-of-fit statistics. First, we formed bins for each of the six conditional states based on the empirical distributions. We started with a single bin for durations of length one and added individual durations until the bin had greater than 5% of the empirical bouts in it. We iterated this process, ensuring that all bins (including the terminal bin) had 5% or more data.

As an example, consider the state NREM->WAKE. 44.83% of the empirical NREM->WAKE bouts were of length one; hence this became its own bin. Likewise, 19.57% and 6.13% were of length two and three epochs respectively, thus leading to those durations being their own bins. 5.81% of empirical bouts had a duration of either four or five epochs, thus defining the fourth bin. 5.56% had durations of six, seven, eight or nine epochs, yielding the fifth bin. 5.04% had durations between ten and twenty-three epochs and 5.10% were between twenty-four and sixty-five epochs, yielding the sixth and seventh bins respectively. Finally, the remaining 7.95% of bouts were sixty-six epochs or longer in duration, thus defining the terminal bin.
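
The binning rule can be sketched as a greedy pass over the sorted bout durations. This is an illustrative reading of the procedure, with two assumptions: a bin is closed once it holds at least 5% of the bouts, and any leftover tail below 5% is absorbed into the last closed bin, which then becomes the open-ended terminal bin. The function names and the toy durations are hypothetical.

```python
from collections import Counter


def build_duration_bins(durations, min_share=0.05):
    """Greedily form (low, high) duration bins, each with >= min_share of bouts.

    Bins are inclusive; the terminal bin has high=None (open-ended).
    """
    if not durations:
        raise ValueError("need at least one bout duration")
    counts = Counter(durations)
    total = len(durations)

    upper_edges = []
    mass = 0
    for d in sorted(counts):
        mass += counts[d]
        if mass / total >= min_share:
            upper_edges.append(d)  # close the current bin at duration d
            mass = 0

    # Dropping the last upper edge makes the final bin open-ended and
    # folds any leftover tail (below min_share) into it.
    lows = [1] + [edge + 1 for edge in upper_edges[:-1]]
    highs = upper_edges[:-1] + [None]
    return list(zip(lows, highs))


def count_per_bin(durations, bins):
    """Count how many bouts fall in each bin."""
    counts = [0] * len(bins)
    for d in durations:
        for i, (low, high) in enumerate(bins):
            if d >= low and (high is None or d <= high):
                counts[i] += 1
                break
    return counts


# Hypothetical bout durations (in epochs) for one conditional state.
toy = ([1] * 45 + [2] * 20 + [3] * 6 + [4] * 3 + [5] * 3 + [6] * 2 + [7] * 2
       + [8] * 2 + [10] * 6 + [30] * 6 + [70] * 2 + [100] * 2)
bins = build_duration_bins(toy)
per_bin = count_per_bin(toy, bins)
print(bins)
print(per_bin)
```

In this toy data the durations 70 and 100 together hold fewer than 5% of the bouts, so they are merged with the preceding bin to form the terminal bin, mirroring the "sixty-six epochs or longer" bin in the example above.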

For each of the eight mice, we fit the model. We then computed the observed empirical duration distributions and fitted duration distributions on the other seven mice for each of the six conditional states. From these fits and the bins described above, we obtained six χ² statistics (one for each conditional state). We averaged each of these six across the eight mice used to fit the model, yielding six average χ²

[Figure: "Error Rates: Contemporaneous Covariates". Panels: Overall Error, REM False Positive, REM False Negative; error-rate axis from 0.0 to 1.0.]

Figure 7.14: Error Rates for Various Methods. The black horizontal lines indicate the mean across all methods and the black vertical lines denote ±1 standard error. Random forests error rates are given in green, Random forests with Markov model in blue, and Random forests with TDGMM in red.

These two sets of metrics, the three error rates and the six χ² statistics, allow us to evaluate the various methods "locally" and "globally". The error rates show how well the methods perform on an epoch-by-epoch basis, something which is important for replicating EEG/EMG-based manual scoring. On the other hand, the χ² statistics show how the methods perform in terms of fitting the entire curve of duration distributions, a task also relevant for sleep scientists and one that does not necessarily require good fits on an epoch-by-epoch basis.
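
For concreteness, the "global" χ² comparison can be sketched as below. The code assumes the usual Pearson form, with the empirical bout counts per bin as the observed values and the fitted bin probabilities scaled by the number of bouts as the expected values; the text does not state which distribution plays which role, so this assignment is an assumption. The function names and numbers are illustrative only.

```python
def chi_square_stat(observed_counts, fitted_probs):
    """Pearson chi-square: sum over bins of (observed - expected)^2 / expected.

    observed_counts: empirical number of bouts in each duration bin.
    fitted_probs: model-implied probability of each bin (should sum to 1).
    """
    if len(observed_counts) != len(fitted_probs):
        raise ValueError("one fitted probability is needed per bin")
    n = sum(observed_counts)
    stat = 0.0
    for observed, prob in zip(observed_counts, fitted_probs):
        expected = n * prob
        stat += (observed - expected) ** 2 / expected
    return stat


def average_by_state(stats_per_training_mouse):
    """Average each conditional state's statistic across training mice.

    stats_per_training_mouse: list of dicts mapping state name -> chi-square.
    """
    states = stats_per_training_mouse[0].keys()
    n_mice = len(stats_per_training_mouse)
    return {
        state: sum(stats[state] for stats in stats_per_training_mouse) / n_mice
        for state in states
    }


# Hypothetical three-bin example for a single conditional state.
stat = chi_square_stat([50, 30, 20], [0.5, 0.25, 0.25])

# Hypothetical statistics from two training mice, averaged per state.
averaged = average_by_state([{"NREM->WAKE": 2.0}, {"NREM->WAKE": 4.0}])
print(stat, averaged)
```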
