Prediction

The sparse regression framework combines populations of scans from multiple modalities into one linear model while retaining the classical statistical interpretability of that model. The scientific insights usually obtained from linear models, including the size and sign of coefficients and the statistical significance of the results, are all present and have their usual interpretation in the output of the method. For example, knowing whether a specific structure tends to increase or decrease in thickness with increasing age is critical for formulating neurobiologically principled interpretations of experimental data. This is a primary advantage of this method over more mainstream machine learning methods, such as support vector machines (SVM’s), random forests, and artificial neural networks (ANN’s), which emphasize predictive accuracy at the cost of model interpretability. This interpretability comes at a cost, however: a fundamental premise of statistical hypothesis testing is that the model is specified before the data are seen (Wasserman and Roeder, 2009). As such, any exploratory model fitting, including our proposed sparse regression technique, cannot use the same data for choosing a model as for evaluating its fit. Therefore, we must use a set of training data for generating the model and a separate testing set for evaluating it. This training-testing split is standard in machine learning, but runs counter to the spirit of inferential statistics, which seeks to draw conclusions about a broader population from a limited study sample. Because the statistical inference is performed only on the testing dataset, which is a fraction of the data available, the statistical power of studies is decreased. At the same time, the efficiency of the estimator is increased, as the regions of interest are tuned to the sample population and outcome of interest. Nevertheless, increasing the number of subjects available for statistical inference would greatly increase the appeal of the sparse regression method to the broader imaging community. At least two avenues provide a promising path forward for this issue.
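The training-testing workflow described above can be illustrated with a minimal sketch on simulated data. It is not the implementation used in this work: the plain coordinate-descent Lasso, the penalty `lam=0.5`, the 50/50 split, and the simulated predictors (columns 3 and 17 carry the signal) are all assumptions chosen for illustration. The sparse model is selected on the training half only, and the sign, size, and t-statistics of the coefficients are then computed by ordinary least squares on the held-out half.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent Lasso for (1/2n)*||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                      # residual y - Xb, with b = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]       # remove predictor j from the fit
            rho = X[:, j] @ r / n
            # soft-thresholding update for coordinate j
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]       # add predictor j back
    return b

def ols_t_stats(X, y):
    """OLS with intercept; returns slopes and their t-statistics."""
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    sigma2 = resid @ resid / (n - p - 1)
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xd.T @ Xd)))
    return beta[1:], (beta / se)[1:]

# Simulated data (assumption): 200 subjects, 50 candidate predictors.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 3] - 1.5 * X[:, 17] + rng.normal(size=n)

# Model selection uses the training half only.
train, test = np.arange(0, n // 2), np.arange(n // 2, n)
X_train = X[train] - X[train].mean(axis=0)
y_train = y[train] - y[train].mean()
selected = np.flatnonzero(lasso_cd(X_train, y_train, lam=0.5) != 0)

# Inference uses the held-out half only, so the model was fixed beforehand.
coef, t_stats = ols_t_stats(X[test][:, selected], y[test])
for j, c, t in zip(selected, coef, t_stats):
    print(f"predictor {j}: coefficient {c:+.2f}, t = {t:+.1f}")
```

Only half of the subjects contribute to the t-statistics here, which is the loss of statistical power discussed above.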

One way to maximize the number of subjects used for inferential statistics would be to use sparse regression to obtain regions of interest in a related but distinct dataset from the one under study. Creating standardized data-driven regions of interest would eliminate the need to perform the sparse regression for new studies, so all of the subjects in a new study would be available for statistical analysis, with no subset set aside for training. Although this approach is theoretically straightforward, establishing regions of interest that are robust to heterogeneity in patient population, pulse sequences, scanner model, etc. is not a trivial task and requires careful and rigorous validation across a wide variety of datasets. Despite these challenges, standardized regions of interest have been created for some major diseases. Standardized ROI’s have been investigated for years in AD, for specific applications such as PET scanning in AD (Landau et al., 2012) and cortical thinning in early AD (Dickerson et al., 2009). The regions of interest in these examples were created from univariate voxelwise correlations between values obtained from imaging and a clinical outcome. Defining ROI’s based on multivariate approaches such as the sparse regression technique proposed here may help capture a more realistic representation of the brain’s activity and function as an interconnected network, rather than as a collection of points working independently.

A drawback of prespecified ROI’s for a given set of outcomes and imaging modalities is that they can only be defined for established target diseases and populations. Because prespecified ROI’s are by nature generated for a specific study population and disease, they cannot be applied to studies of unrelated pathological processes or vastly different patient populations. Nevertheless, generating ROI’s from multivariate analyses of neuroimaging datasets for at least some common diseases would retain the advantages of ROI’s developed specifically for multivariate methodologies while still allowing statistical inference on the entire dataset.
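As a rough sketch of how prespecified ROI's would be used in a new study, the example below averages each subject's image within fixed ROI masks and fits an ordinary linear model on every subject, with no training subset held back. The masks, image dimensions, and outcome are simulated and purely hypothetical; in practice the masks would come from a separate, previously validated dataset.

```python
import numpy as np

def roi_means(images, roi_masks):
    """Average each subject's image within each ROI.

    images:    (n_subjects, n_voxels) array of voxel values
    roi_masks: (n_rois, n_voxels) boolean array, fixed before the study
    """
    weights = roi_masks / roi_masks.sum(axis=1, keepdims=True)
    return images @ weights.T

# Hypothetical new study: 120 subjects, 400 voxels, two prespecified ROI's.
rng = np.random.default_rng(1)
n_subjects, n_voxels = 120, 400
images = rng.normal(size=(n_subjects, n_voxels))
roi_masks = np.zeros((2, n_voxels), dtype=bool)
roi_masks[0, 10:30] = True
roi_masks[1, 200:240] = True

features = roi_means(images, roi_masks)

# Simulated outcome (assumption): depends on the first ROI only.
outcome = 5.0 * features[:, 0] + rng.normal(size=n_subjects)

# The ROI's were fixed in advance, so all subjects enter the inference.
design = np.column_stack([np.ones(n_subjects), features])
beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
resid = outcome - design @ beta
dof = n_subjects - design.shape[1]
se = np.sqrt(np.diag(resid @ resid / dof * np.linalg.inv(design.T @ design)))
t_stats = beta / se
print("degrees of freedom:", dof)
print("ROI coefficients:", beta[1:], "t-statistics:", t_stats[1:])
```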

Another avenue that could provide statistics over an anatomically interpretable multivariate network of brain regions, without resorting to training and testing splits, is to compare some statistic derived from a model against a null distribution. Classical linear model theory uses the F-statistic to compare the fit of two nested models with different numbers of parameters. In the sparse high-dimensional setting, where there are many possible models to choose from and the model design is not set in advance of the analysis, the F-statistic is far too liberal and would result in including too many components. Recent advances in significance testing for the Lasso, a popular sparse regression technique, point to possible avenues for future research. Standard multivariate linear model theory examines the drop in residual sum of squares (RSS) obtained by including additional parameters in a model and compares that to a χ² distribution to decide whether or not an additional predictor is significant. When the model design is not fixed, as is the case in high-dimensional settings such as brain mapping, this distribution leads to statistical tests for significance that are far too liberal. Lockhart et al. have proposed a “covariance test statistic” as an alternative to the drop in RSS; under the null hypothesis, this statistic asymptotically follows an exponential distribution (Lockhart et al., 2014). Similar techniques may be used to define an empirical null distribution for brain mapping that can be used to determine the significance of additional brain regions in a sparse multivariate model. Recently, progress has been made towards characterizing the null distribution of weights of support vector machine classifiers in predicting outcomes from brain images (Gaonkar and Davatzikos, 2013), and extending this work to include spatial regularization in the case of linear regression is feasible.
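A small simulation can show why the fixed-model χ² reference is too liberal once the predictor is chosen adaptively. The sketch below is not the covariance test of Lockhart et al.; it swaps in a simpler permutation-based empirical null for the largest single-predictor drop in RSS. The sample size, number of predictors, and number of simulations are arbitrary assumptions.

```python
import numpy as np

def max_rss_drop(X, y):
    """Largest drop in RSS from adding any one predictor to an
    intercept-only model, scaled by the sample variance of y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    drops = (Xc.T @ yc) ** 2 / (Xc ** 2).sum(axis=0)
    return drops.max() / yc.var(ddof=1)

rng = np.random.default_rng(2)
n, p, n_sims = 100, 50, 500
CHI2_1_CRIT = 3.841  # 95th percentile of a chi-squared with 1 degree of freedom

# Global null: y is unrelated to every predictor. Comparing the best
# predictor's statistic to the fixed-model cutoff rejects far too often.
null_stats = np.array([
    max_rss_drop(rng.normal(size=(n, p)), rng.normal(size=n))
    for _ in range(n_sims)
])
naive_fpr = (null_stats > CHI2_1_CRIT).mean()

# Empirical null for one dataset: permuting y breaks any association
# with X while keeping the adaptive "pick the best predictor" step.
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
perm_stats = np.array([
    max_rss_drop(X, rng.permutation(y)) for _ in range(n_sims)
])
perm_threshold = np.quantile(perm_stats, 0.95)

print(f"false positive rate with the chi-squared cutoff: {naive_fpr:.2f}")
print(f"permutation-based 95% cutoff: {perm_threshold:.2f}")
```

With 50 candidate predictors, the nominal 5% test applied to the best predictor rejects in most null simulations, and the permutation cutoff sits well above 3.841.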

6.3. Perspectives on Different Sources of Imaging
