During the initial training of each sensor classifier, only one sensor bag is used per ESM label provided by a mobile user. Our model is thus trained with a limited amount of sample data for all labels (i.e., one bag per class label) and must then perform annotation prediction progressively, day by day, throughout the simulated data collection. In other words, the objective of the experiment is to predict annotations accurately from the stream of multidimensional sensor data during an ESM study, given the influence of the mobile user's in-situ contexts. Consequently, this experiment compares the annotation prediction performance of general approaches with that of our proposed semi-supervised approach.
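The one-bag-per-label initialisation can be illustrated with a minimal sketch. It assumes the labelled bags arrive in chronological order and that the bag kept for each label is its first occurrence, as the baseline description later in this section states. The function name `initial_training_pool` and the `(label, bag)` pair layout are hypothetical.

```python
def initial_training_pool(labelled_bags):
    """Keep only the first bag observed for each annotation label.

    labelled_bags: iterable of (label, bag) pairs in chronological order
    (hypothetical layout, chosen for illustration only).
    """
    pool = {}
    for label, bag in labelled_bags:
        if label not in pool:  # later bags of a known label are not used here
            pool[label] = bag
    return pool


# Usage: the second "walking" bag is ignored during initial training.
stream = [("walking", [1, 2]), ("sitting", [3]), ("walking", [9])]
pool = initial_training_pool(stream)
```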
In our work, the density-based bag summarisation component follows the same strategy as [Birant and Kut, 2007, Shao et al., 2016] and [Liono et al., 2018b] by setting the DBSCAN parameters (for density-based clustering) to Eps = 0.3 and MinPts = ln(n), where n is the number of feature instances in an unlabelled sensor feature bag Siu. In the co-training process, a random split is performed proportionally on the set of sensor classifiers to produce two different views, V_first (View 1) and V_second (View 2). In this case, the number of distinct sensor classifiers in a view is at least (n/2). At the end of the annotation prediction process, the binary value of the MAC evaluation is calculated under the condition of a mutual agreement between the views of sensor classifiers, where y_first == y_second, and its agreed posterior (i.e., Pr(y_agreed|X)) being under a certain threshold β = 0.9. Therefore, a MAC evaluation is considered valid when it satisfies the output of Equation 4.1, where macEvaluated == 1. Before the summarised sensor bags are added to TrainingPool (given a valid MAC evaluation) for the sensor classifiers to be re-trained, the summarised sensor bag is upsampled using an instance replication strategy with k replications, where k is drawn from the Poisson(δ) distribution with δ = 5. To simulate the active learning component of the semi-supervised module in CoAct-nnotate, we leverage the actual annotation at the end time of the time interval based on the actual user labels in the CrowdSignals dataset. For the time duration of recent sensor data on the given annotation a, 30 minutes of past mobile sensor data (i.e., tδ = 30 minutes) are
Experimental Evaluation 75
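The parameter choices described above for bag summarisation and upsampling can be sketched as follows. This is a minimal illustration, not the thesis implementation, and it rests on several assumptions: ln(n) is rounded up to an integer, k is drawn independently for each summarised instance, and the original instance is kept alongside its k replicas. The helper names are hypothetical. The constants `EPS` and the value returned by `dbscan_min_pts` would correspond to the `eps` and `min_samples` arguments of scikit-learn's `DBSCAN`.

```python
import math
import random

EPS = 0.3          # DBSCAN neighbourhood radius, as stated in the text
POISSON_MEAN = 5   # delta in the text


def dbscan_min_pts(n_instances):
    """MinPts = ln(n); rounding up to an integer is an assumption."""
    return max(1, math.ceil(math.log(n_instances)))


def sample_poisson(mean, rng):
    """Draw k ~ Poisson(mean) using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def upsample_bag(summarised_bag, rng, mean=POISSON_MEAN):
    """Replicate each summarised instance k times, with k ~ Poisson(mean).

    Assumption: the original instance is kept, so each instance appears
    1 + k times in the upsampled bag.
    """
    upsampled = []
    for instance in summarised_bag:
        k = sample_poisson(mean, rng)
        upsampled.extend([instance] * (1 + k))
    return upsampled


# Usage with a seeded generator so that runs are repeatable.
rng = random.Random(0)
bag = [(0.1, 0.2), (0.3, 0.4)]
upsampled = upsample_bag(bag, rng)
```

Drawing k per instance rather than once per bag is a design choice made here for illustration; the text does not specify which of the two is used.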
Since annotation prediction is crucial for mobile data collection in the wild, we simulate an experiment in which the end time of self-annotation (user-driven labelling) is the time point of ESM annotation. All participants involved in the CrowdSignals data collection are mobile users who own Android smartphones. The dataset contains different phone models, and the capability of smartphones to sense their context and environment varies between them. Due to this diversity of sensors across smartphone models, the performance of annotation prediction can be greatly influenced by the limited composition of sensor classifiers contained within a view.
As base classifiers for the mobile sensors, we use the following algorithms in our evaluation (using scikit-learn [Pedregosa et al., 2011]):
• Naive Bayes (NB)
• Support Vector Classifiers (SVC)
• Multilayer Perceptron (MLP) with 0.00001 as the L2 penalty (regularisation parameter), L-BFGS [Andrew and Gao, 2007] as the solver for weight optimisation and a structure of two hidden layers (consisting of five neurons for the first layer and two neurons for the second layer)
• Random Forests (RF) with 100 trees
• Decision Tree (DT)
• k Nearest Neighbour (k-NN) with k = 1 (1NN)
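The base classifiers listed above could be instantiated in scikit-learn roughly as follows. Only the hyperparameters named in the list come from the text; everything else is left at the library defaults. The Gaussian variant of Naive Bayes, `probability=True` for the SVC (so that a posterior is available), and the `seed` argument are assumptions made for this sketch.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_base_classifiers(seed=0):
    """Return one untrained estimator per base classifier in the list."""
    return {
        "NB": GaussianNB(),                      # NB variant is an assumption
        "SVC": SVC(probability=True),            # posterior output is an assumption
        "MLP": MLPClassifier(
            alpha=1e-5,                          # L2 penalty of 0.00001
            solver="lbfgs",                      # L-BFGS weight optimisation
            hidden_layer_sizes=(5, 2),           # five, then two hidden neurons
            random_state=seed,
        ),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "DT": DecisionTreeClassifier(random_state=seed),
        "1NN": KNeighborsClassifier(n_neighbors=1),
    }


classifiers = make_base_classifiers()
```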
As baselines, we use the following general approaches to annotation prediction:
• Multivariate time-window based annotation prediction (denoted as MAP). In the MAP approach, only one classifier is trained for all sensor feature dimensions and instances in TrainingPool.
• Non-multivariate time-window based annotation prediction (denoted as 1C1S). In the 1C1S approach, one classifier is trained per sensor.
• 1C1S with co-training (denoted as Co-1C1S). In the Co-1C1S approach, the concept of co-training is applied to perform multi-view annotation prediction. The basic view split operation is similar to that of CoAct-nnotate; the approaches differ in how the sensor classifiers are improved. For the improvement process, the predicted annotation (i.e., y_agreed) is used to label Su, which will be included in TrainingPool only if there is a mutual agreement (i.e., MA == 1) between the two views of sensor classifiers.
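The Co-1C1S improvement rule can be sketched as below. The text does not say how the sensor classifiers within a view are combined into a single prediction, so the majority vote used here is an assumption, and the function names are hypothetical. Only the agreement check and the conditional addition to the training pool follow the description above.

```python
from collections import Counter


def view_prediction(sensor_predictions):
    """Combine one view's per-sensor predictions by majority vote (assumption)."""
    return Counter(sensor_predictions).most_common(1)[0][0]


def co_1c1s_update(training_pool, bag, first_view_preds, second_view_preds):
    """Add the bag with the agreed label to the pool only if both views agree.

    Returns the mutual-agreement flag MA (1 if the views agree, else 0).
    """
    y_first = view_prediction(first_view_preds)
    y_second = view_prediction(second_view_preds)
    mutual_agreement = int(y_first == y_second)
    if mutual_agreement == 1:
        training_pool.append((bag, y_first))  # y_agreed labels the unlabelled bag
    return mutual_agreement


# Usage: the first bag is accepted, the second is rejected.
pool = []
ma_1 = co_1c1s_update(pool, "bag-1", ["walk", "walk", "sit"], ["walk", "run", "walk"])
ma_2 = co_1c1s_update(pool, "bag-2", ["sit", "sit"], ["walk", "walk"])
```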
For both the MAP and 1C1S approaches, the training of classifiers is based on the bags of all first occurrences of each a in A. In other words, only one bag is used for a class label during the training phase, which results in no progressive learning over time. In contrast, both Co-1C1S and CoAct-nnotate employ the concept of progressive learning through a co-training mechanism. The only differences between Co-1C1S and CoAct-nnotate are the criteria for sensor classifier improvement and the cost-efficient bag summarisation of Su in CoAct-nnotate.
In the feature extraction process of all annotation prediction approaches (MAP, 1C1S, Co-1C1S and CoAct-nnotate), time-interval based temporal segmentation is used for a given bag, with the size of the time window set to 60 seconds (1 minute) and a 50% overlap between consecutive windows. In each time window, statistical features are extracted, such as mean, median, maximum, minimum, standard deviation, interquartile range and root mean square.
In terms of general evaluation performance of annotation prediction, the correctness metric is used to measure the accuracy of an annotation predictor. The correctness metric is the fraction of correct predictions over the total number of annotation predictions, as expressed in the following equation:
\[
\mathit{Correctness} = \frac{\sum_{u=1}^{v} \mathit{annotation}_{u}^{\mathit{correct}}}{v} \tag{4.2}
\]
where v is the total number of annotation predictions and $\mathit{annotation}_{u}^{\mathit{correct}}$ is a binary value indicating whether the u-th annotation prediction is correct. To evaluate the performance of the systems empirically, the experiment is performed with 10 iterations per base classifier for each approach.
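The windowed feature extraction and the correctness metric can be illustrated with a short sketch using only the Python standard library. It assumes a single signal given as `(timestamp_seconds, value)` pairs sorted by time, keeps a partial window at the end of the bag, and uses the population standard deviation. These choices and all function names are assumptions made for illustration, not details taken from the text.

```python
import math
import statistics


def sliding_windows(samples, window_s=60.0, overlap=0.5):
    """Yield the values falling in successive, overlapping time windows.

    samples: list of (timestamp_seconds, value) pairs sorted by timestamp.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if not samples:
        return
    step = window_s * (1.0 - overlap)  # 30 s step for 60 s windows at 50% overlap
    start, last = samples[0][0], samples[-1][0]
    while start <= last:
        values = [v for t, v in samples if start <= t < start + window_s]
        if values:  # skip windows that contain no readings
            yield values
        start += step


def window_features(values):
    """Statistical features named in the text, for one window of one signal."""
    if len(values) >= 2:
        q1, _, q3 = statistics.quantiles(values, n=4)
        iqr = q3 - q1
    else:
        iqr = 0.0  # assumption: a single reading has no spread
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "max": max(values),
        "min": min(values),
        "std": statistics.pstdev(values),  # population std is an assumption
        "iqr": iqr,
        "rms": math.sqrt(statistics.fmean(v * v for v in values)),
    }


def correctness(predicted, actual):
    """Fraction of annotation predictions that match the actual labels (Eq. 4.2)."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(predicted)


# Usage: two minutes of a 1 Hz signal, then a four-prediction evaluation.
samples = [(float(t), float(t)) for t in range(120)]
features = [window_features(w) for w in sliding_windows(samples)]
score = correctness(["walk", "sit", "run", "sit"], ["walk", "sit", "sit", "sit"])
```

With the 1 Hz example, windows start at 0, 30, 60 and 90 seconds, so the last window covers only 30 seconds of data. Whether such partial windows are kept or discarded is not stated in the text.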