In Chapter 3, we contrasted several linguistic variables that have been found to correlate with cognitive decline across two settings: the constrained discourse of picture description tasks and the more spontaneous discourse of describing objects. In the object description setting, although all patients describe the same objects, no visual reference is provided, so they rely on their own mental image, perception, and personal experience of those objects. This increases the variety of syntactic structures and the diversity of vocabulary relative to standardized picture description tasks, and moves us a step closer to the type of analysis and challenges to be expected when studying completely spontaneous conversations.
Based on the previous literature on the analysis of spontaneous speech, we were also interested in the use of specific and general vocabulary, and in its significance for differentiating healthy controls from AD patients in both description tasks. We proposed a new metric for evaluating coverage of information and pertinence of the discourse based on the use of generic and specific vocabulary by healthy and cognitively impaired individuals. We also evaluated other linguistic features, such as lexical richness and the use of specific linguistic patterns that could provide insight into the types of syntactic structures that are most affected by cognitive impairment. Our experiments were carried out with native speakers of Spanish and English to test the multilingual robustness of our proposed metrics.
Our newly proposed metric for coverage of information solved the biggest limitation of the metric proposed in Chapter 2. Instead of using part of the healthy cohort to create the referent, we first extracted the general vocabulary of a healthy older population from two free-discourse corpora, one in Spanish and one in English. By contrasting the most-used vocabulary of the free-discourse corpora against that of the description tasks, we were able to extract the specific vocabulary for each task in its respective language. We then measured how many elements of the specific vocabulary participants covered in their descriptions (information coverage), and how much of their vocabulary corresponded to task-specific vocabulary (pertinence).
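The procedure described above can be illustrated with a minimal Python sketch. It is not the implementation used in our experiments. The function names, the frequency cut-offs (`n_task`, `n_general`), and the exact normalizations are assumptions made for illustration: coverage is taken here as the fraction of the specific vocabulary that appears in a description, and pertinence as the fraction of a description's tokens that belong to the specific vocabulary.

```python
from collections import Counter


def top_words(transcripts, n):
    """Return the n most frequent words across tokenized transcripts."""
    counts = Counter(tok for tokens in transcripts for tok in tokens)
    return {word for word, _ in counts.most_common(n)}


def specific_vocabulary(task_transcripts, general_transcripts,
                        n_task=50, n_general=50):
    """Task-specific vocabulary: frequent task words that are not
    frequent in the free-discourse (general) corpus.
    The cut-offs n_task and n_general are hypothetical."""
    general = top_words(general_transcripts, n_general)
    task = top_words(task_transcripts, n_task)
    return task - general


def information_coverage(tokens, specific):
    """Assumed definition: share of the specific vocabulary
    that the participant used at least once."""
    if not specific:
        return 0.0
    return len(set(tokens) & specific) / len(specific)


def pertinence(tokens, specific):
    """Assumed definition: share of the participant's tokens
    that belong to the specific vocabulary."""
    if not tokens:
        return 0.0
    return sum(1 for tok in tokens if tok in specific) / len(tokens)


# Toy data, invented for illustration only.
general_corpus = [["the", "day", "was", "nice", "and", "the", "family", "came"]]
task_corpus = [
    ["the", "boy", "takes", "the", "cookie", "from", "the", "jar"],
    ["the", "cookie", "jar", "is", "on", "the", "shelf"],
]
specific = specific_vocabulary(task_corpus, general_corpus)
description = ["the", "boy", "wants", "a", "cookie"]
coverage_value = information_coverage(description, specific)
pertinence_value = pertinence(description, specific)
```

In this toy example the specific vocabulary contains eight words, of which the description covers two ("boy" and "cookie"), so the coverage is 0.25 and the pertinence is 0.4. A real pipeline would also need tokenization and, most likely, lemmatization for each language, which are omitted here.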
We used these features, along with lexical richness measures and specific linguistic patterns, to train support vector machine and random forest learners. We found that our newly proposed metrics of information coverage and pertinence, based on the use of specific vocabulary, were the features most highly correlated with the severity of cognitive impairment for both types of tasks. For both corpora, the best results were obtained with the support vector machine learner with a linear kernel. In a 10-fold cross-validation experiment for classifying AD patients versus healthy controls, we obtained an average F-score and area under the curve (AUC) of 0.98 for the object description task, and an average F-score and AUC of 0.83 for the standard picture description task. Our results compared favorably to those of the state-of-the-art methods for both tasks, and to those of our previous works.
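For readers less familiar with the two evaluation measures reported above, the following sketch shows how an F-score and an AUC can be computed for a single cross-validation fold. It is plain Python written for illustration, not the evaluation code used in our experiments. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions.

```python
def f_score(y_true, y_pred):
    """F1 score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive is scored above a randomly chosen
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy fold, invented for illustration: 1 = AD patient, 0 = healthy control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]      # hypothetical classifier scores
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

fold_f = f_score(labels, predictions)
fold_auc = auc(labels, scores)
```

In a 10-fold setting, these two values would be computed once per held-out fold and then averaged, which is how the averages reported above should be read.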
We verified that our high results for the object description task were not caused by overfitting by repeating the classification without hyper-parameter tuning. We obtained an AUC of 0.95 using the default parameters of the SVM implementation. In addition, previous literature has reported AUC results of 0.96 and 0.97 on the same corpus. The apparently vast difference in the performance of our classifiers between the standardized picture description and object description tasks had more to do with the tasks and the characteristics of the cohorts in the corpus than with the features themselves.
From our experiments, we observed that the task of describing six common objects without any visual stimulus was highly taxing for patients with cognitive impairment, with over a tenth of the participants with Alzheimer’s disease being unable to complete the task. Furthermore, the education levels and conditions of the settings in which the task was performed favored the healthy cohort considerably. There was a significantly higher level of interaction between participants and examiners for the healthy cohort, which produced, on average, descriptions three times longer than those of the Alzheimer’s group. Unfortunately, all these factors made it difficult to compare the influence of the features in distinguishing AD patients from healthy controls in the contexts of standardized picture description tasks and of object descriptions.
We were able, however, to observe some phenomena related to the use of parts of speech that were consistent across English and Spanish speakers. A significantly increased use of pronouns and of nouns without verbs by the AD cohorts is a finding that has previously been reported for English-speaking and Portuguese-speaking AD patients, and that we now also found in Spanish-speaking AD patients. However, we also found that the variety of syntactic structures in the object description task was still partially restricted by the nature of the discourse, which did not allow us to make deeper observations about various linguistic patterns. This motivates extending our research to the study of spontaneous conversations in order to evaluate differences in these types of structures.
To evaluate the ability of our metrics to detect signs of AD at one of its earlier stages, we performed a classification of MCI patients versus healthy controls using the standardized picture description corpus. We obtained an average AUC of 0.79, an F-score of 0.80, and an accuracy of 0.79 for this task. This is an important improvement over the previous literature, which reported an accuracy of 0.65, since detecting MCI is challenging even for specialists.
6.3 Longitudinal characterization of language alterations in spontaneous speech