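by the forward selection and stepwise selection methods, model A:

$$y = \beta_0 + \beta_1(\text{sugars}) + \beta_2(\text{fiber}) + \beta_3(\text{sodium}) + \beta_4(\text{fat}) + \beta_5(\text{protein}) + \beta_6(\text{carbohydrates}) + \beta_7(\text{calories}) + \beta_8(\text{vitamins}) + \beta_9(\text{potassium}) + \varepsilon$$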
VARIABLE SELECTION CRITERIA
Thus, we have two candidate models vying for designation as the best model, with each model chosen by two model selection procedures. The only difference between the models is the inclusion of the shelf 2 indicator variable. Let us take a moment to examine why the forward selection method did not include this variable, whereas the backward elimination method did.
So far, we have blindly applied the variable selection procedures, using the default selection criteria given by the algorithms. In many cases, these default values work quite nicely, but the analyst should always be aware of the thresholds being applied to omit or retain variables in the variable selection process. Figure 3.13 shows the dialog box for setting the entry/removal thresholds for the Clementine software, with the default values shown. Variables will be added to the model only if the associated p-value for the partial F-test is smaller than the entry value specified in this dialog box, and removed only if the p-value is larger than the removal value specified.
If analysts wish to be more parsimonious, a lower entry threshold may be specified, which will make it more difficult for variables of borderline significance to be entered into the model. Similarly, a lower removal threshold will make it easier to omit variables of borderline significance. On the other hand, if the analyst wishes to be more inclusive, higher levels for the entry and removal thresholds may be specified. Clearly, however, the entry threshold value must be less than the removal value.
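The entry and removal rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the Clementine implementation: the function names are hypothetical, the removal value of 0.10 is an assumed default (it is shown only in Figure 3.13), and the p-values in the usage lines are invented to show each case.

```python
def check_thresholds(entry, removal):
    """Raise ValueError unless 0 < entry < removal < 1, as the text requires."""
    if not (0.0 < entry < removal < 1.0):
        raise ValueError("entry threshold must be less than removal threshold")


def should_enter(p_value, entry):
    """A candidate variable is added only if its partial F-test p-value is below entry."""
    return p_value < entry


def should_remove(p_value, removal):
    """A variable already in the model is dropped only if its p-value exceeds removal."""
    return p_value > removal


ENTRY = 0.05     # entry threshold
REMOVAL = 0.10   # assumed removal threshold (hypothetical default)
check_thresholds(ENTRY, REMOVAL)

# Hypothetical p-values illustrating the three possible situations.
print(should_enter(0.03, ENTRY))     # clearly significant candidate
print(should_enter(0.07, ENTRY))     # borderline candidate, kept out
print(should_remove(0.07, REMOVAL))  # same p-value would not force removal
print(should_remove(0.20, REMOVAL))  # weak variable, removed
```

Lowering `ENTRY` makes the first test harder to pass, and lowering `REMOVAL` makes the second test easier to trigger, which is the parsimonious setting described in the paragraph.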
So what is the significance of the shelf 2 indicator variable given that the other variables in Table 3.15 are in the model? Table 3.20 shows that the p-value for the t-test for the shelf 2 variable is 0.05. Earlier we learned how the t-test and the (appropriate) F-test were equivalent. Thus, it would follow that the p-value for the sequential F-test for inclusion of shelf 2 is 0.05.
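The equivalence between the two tests can be illustrated numerically: for a single added variable, the square of the t-statistic should agree with the sequential F-statistic up to rounding. The short sketch below uses the shelf 2 coefficient and standard error reported in Table 3.20; the variable names are chosen here for illustration only.

```python
# Values for shelf 2 as reported in Table 3.20.
coef_shelf2 = 0.3140
se_shelf2 = 0.1573

t_stat = coef_shelf2 / se_shelf2   # t-statistic for shelf 2
f_from_t = t_stat ** 2             # equivalent F-statistic with 1 numerator df

print(round(t_stat, 3), round(f_from_t, 3))
```

The squared t-statistic comes out very close to 4, consistent with the reported T value of 2.00.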
We can verify this p-value using the sequential F-test directly, as follows. From Table 3.21 we have the regression sum of squares for the full model (including shelf 2 but ignoring cups and weight) equal to 14,979.560. Also from Table 3.21 we have
SPH SPH
JWDD006-03 JWDD006-Larose November 25, 2005 17:26 Char Count= 0
136 CHAPTER 3 MULTIPLE REGRESSION AND MODEL BUILDING
TABLE 3.20 Shelf 2 Indicator Variable Has a Significance of 0.05, as Shown by the t-Test

The regression equation is
Rating = 54.3 - 0.230 Calories + 3.25 Protein - 1.67 Fat - 0.0552 Sodium + 3.47 Fiber + 1.16 Carbos - 0.708 Sugars - 0.0330 Potassium - 0.0496 Vitamins + 0.314 shelf 2

Predictor    Coef          SE Coef      T         P       VIF
Constant     54.2968       0.4488       120.99    0.000
Calories     -0.229610     0.007845     -29.27    0.000   6.8
Protein      3.24665       0.08582      37.83     0.000   2.6
Fat          -1.66844      0.09650      -17.29    0.000   2.7
Sodium       -0.0552464    0.0008142    -67.86    0.000   1.4
Fiber        3.46905       0.07165      48.41     0.000   8.5
Carbos       1.16030       0.03127      37.11     0.000   5.1
Sugars       -0.70776      0.03343      -21.17    0.000   6.4
Potassium    -0.032982     0.002416     -13.65    0.000   8.6
Vitamins     -0.049640     0.002940     -16.88    0.000   1.3
shelf 2      0.3140        0.1573       2.00      0.050   1.4

S = 0.510915   R-Sq = 99.9%   R-Sq(adj) = 99.9%

Analysis of Variance
Source            DF    SS         MS        F          P
Regression        10    14979.6    1498.0    5738.55    0.000
Residual Error    66    17.2       0.3
Total             76    14996.8
the regression sum of squares from the reduced model (not including shelf 2) given as 14,978.521. Thus, we have
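$$SS_{\text{shelf 2} \mid \text{all other variables}} = SS_{\text{all variables}} - SS_{\text{all variables except shelf 2}} = 14{,}979.560 - 14{,}978.521 = 1.039$$

From Table 3.21 we have $MSE_{\text{all variables}} = 0.261$. Hence,

$$F(\text{shelf 2} \mid \text{all other variables}) = \frac{SS_{\text{shelf 2} \mid \text{all other variables}}}{MSE_{\text{all variables}}} = \frac{1.039}{0.261} = 3.9808$$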
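The arithmetic of the sequential F-statistic can be reproduced directly from the sums of squares quoted from Table 3.21. The following is a small sketch with illustrative variable names; it also recomputes the mean square error from the residual sum of squares and degrees of freedom listed in that table as a consistency check.

```python
# Quantities quoted from Table 3.21.
ss_reg_full = 14979.560      # regression SS, model including shelf 2
ss_reg_reduced = 14978.521   # regression SS, model without shelf 2
mse_full = 0.261             # mean square error of the full model

# Extra sum of squares attributable to shelf 2, given all other variables.
ss_extra = ss_reg_full - ss_reg_reduced

# Sequential F-statistic with 1 numerator degree of freedom.
f_stat = ss_extra / mse_full

# Consistency check: residual SS divided by residual df for the full model.
mse_check = 17.228 / 66

print(round(ss_extra, 3), round(f_stat, 4), round(mse_check, 3))
```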
TABLE 3.21 Regression ANOVA Tables Without and with Shelf 2

Model              Source        Sum of Squares    df    Mean Square    F            Significance
Without shelf 2    Regression    14,978.521        9     1,664.280      6,104.043    0.000
                   Residual      18.268            67    0.273
                   Total         14,996.788        76
With shelf 2       Regression    14,979.560        10    1,497.956      5,738.554    0.000
                   Residual      17.228            66    0.261
                   Total         14,996.788        76
Figure 3.14 Adjusting the entry threshold for the forward selection algorithm.
This value of 3.9808 for the sequential F-statistic lies at the 95th percentile of the $F_{1,\,n-p-2} = F_{1,65}$ distribution, thereby verifying our p-value of 0.05 for the inclusion of the shelf 2 indicator variable in the model.
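The percentile claim can be checked without statistical tables. The sketch below relies on the fact that an F-distribution with 1 numerator degree of freedom is the square of a t-distribution, and approximates the tail area by Simpson's rule using only the standard library. The function names are illustrative. The text quotes 65 denominator degrees of freedom, while Table 3.21 lists 66 residual degrees of freedom for the full model, so both are evaluated; the two results should differ only slightly.

```python
import math


def t_density(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def f_upper_tail_1df(f_stat, df_denom, steps=2000):
    """Approximate P(F > f_stat) for an F(1, df_denom) distribution.

    Since F(1, d) is the square of t(d), the upper tail equals
    1 - 2 * integral of the t density from 0 to sqrt(f_stat).
    The integral uses composite Simpson's rule (steps must be even).
    """
    upper = math.sqrt(f_stat)
    h = upper / steps
    total = t_density(0.0, df_denom) + t_density(upper, df_denom)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df_denom)
    return 1.0 - 2.0 * (h / 3.0) * total


p_65 = f_upper_tail_1df(3.9808, 65)
p_66 = f_upper_tail_1df(3.9808, 66)
print(round(p_65, 4), round(p_66, 4))
```

Both tail areas land very close to 0.05, in line with the p-value reported for shelf 2.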
Now recall that 0.05 happens to be the default entry threshold for both the forward selection and stepwise selection procedures. Thus, if we adjust the entry threshold level just a touch upward (say, to 0.051), we would expect shelf 2 to be included in the final models from both of these procedures. Figure 3.14 shows the dialog box for adjusting the entry threshold level for Clementine’s forward selection algorithm, with the level moved up slightly to 0.051. Finally, Table 3.22 shows the model summary results from the forward selection algorithm using the adjusted entry threshold value of 0.051. Note that, as expected, shelf 2 is now included, as the last variable to be entered into the model. Otherwise, Table 3.22 is exactly the same as Table 3.15, the forward selection results using the default threshold value.
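The effect of the threshold adjustment comes down to a strict comparison, which a short sketch makes explicit. It uses the reported p-value of 0.05 for shelf 2 and the two entry thresholds from the text. The removal threshold of 0.10 is an assumption made here for illustration, since the default removal value appears only in Figure 3.13.

```python
P_SHELF2 = 0.05          # reported p-value for shelf 2 (Table 3.20)
ASSUMED_REMOVAL = 0.10   # hypothetical removal threshold, not stated in the text

# Forward and stepwise selection add a variable only if p < entry threshold.
entered_at_default = P_SHELF2 < 0.05
entered_at_adjusted = P_SHELF2 < 0.051

# Backward elimination drops a variable only if p > removal threshold.
removed_by_backward = P_SHELF2 > ASSUMED_REMOVAL

print(entered_at_default, entered_at_adjusted, removed_by_backward)
```

Under these assumptions shelf 2 fails the strict entry test at 0.05, passes it at 0.051, and is never removed by backward elimination, which matches the behavior of the procedures described above.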
TABLE 3.22 Model Summary Results for the Forward Selection Procedure, After Adjusting the Entry Threshold Upward Slightly and with Inclusion of Shelf 2

Model    R         R²       Adjusted R²    Std. Error of the Estimate
1        0.762a    0.580    0.575          9.16160
2        0.899b    0.808    0.803          6.23743
3        0.948c    0.899    0.895          4.54638
4        0.981d    0.962    0.960          2.82604
5        0.985e    0.970    0.968          2.50543
6        0.987f    0.975    0.973          2.31269
7        0.995g    0.990    0.989          1.47893
8        0.998h    0.995    0.995          1.01477
9        0.999i    0.999    0.999          0.52216
10       0.999j    0.999    0.999          0.51091

a Predictors: (constant), sugars.
b Predictors: (constant), sugars, fiber.
c Predictors: (constant), sugars, fiber, sodium.
d Predictors: (constant), sugars, fiber, sodium, fat.
e Predictors: (constant), sugars, fiber, sodium, fat, protein.
f Predictors: (constant), sugars, fiber, sodium, fat, protein, carbohydrates.
g Predictors: (constant), sugars, fiber, sodium, fat, protein, carbohydrates, calories.
h Predictors: (constant), sugars, fiber, sodium, fat, protein, carbohydrates, calories, vitamins.
i Predictors: (constant), sugars, fiber, sodium, fat, protein, carbohydrates, calories, vitamins, potassium.
j Predictors: (constant), sugars, fiber, sodium, fat, protein, carbohydrates, calories, vitamins, potassium, shelf 2.
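At this point, all four of our variable selection algorithms point to the same model as the best model. We now designate model B as our working model:

$$y = \beta_0 + \beta_1(\text{sugars}) + \beta_2(\text{fiber}) + \beta_3(\text{sodium}) + \beta_4(\text{fat}) + \beta_5(\text{protein}) + \beta_6(\text{carbohydrates}) + \beta_7(\text{calories}) + \beta_8(\text{vitamins}) + \beta_9(\text{potassium}) + \beta_{10}(\text{shelf 2}) + \varepsilon$$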
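To see what the working model does in practice, the estimated coefficients from Table 3.20 can be plugged into the equation. The sketch below is illustrative only: the function name and dictionary keys are chosen here, and the cereal's predictor values are hypothetical, not taken from the data set.

```python
# Estimated coefficients copied from Table 3.20.
COEF = {
    "calories": -0.229610, "protein": 3.24665, "fat": -1.66844,
    "sodium": -0.0552464, "fiber": 3.46905, "carbos": 1.16030,
    "sugars": -0.70776, "potassium": -0.032982, "vitamins": -0.049640,
    "shelf2": 0.3140,
}
INTERCEPT = 54.2968


def predict_rating(cereal):
    """Estimated nutritional rating for a dict of predictor values."""
    return INTERCEPT + sum(COEF[name] * cereal[name] for name in COEF)


# Hypothetical cereal; these values are invented for illustration.
cereal = {
    "calories": 110, "protein": 2, "fat": 1, "sodium": 200, "fiber": 1,
    "carbos": 20, "sugars": 6, "potassium": 50, "vitamins": 25, "shelf2": 1,
}

rating_on_shelf2 = predict_rating(cereal)
rating_off_shelf2 = predict_rating(dict(cereal, shelf2=0))
print(round(rating_on_shelf2, 3), round(rating_on_shelf2 - rating_off_shelf2, 3))
```

Because shelf 2 is an indicator variable, switching it from 0 to 1 shifts the estimated rating by exactly its coefficient, with all other predictors held fixed.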
Let us simply reiterate that one need not report only one model as a final model. Two or three models may be carried forward, and input sought from managers about which model may be most ameliorative of the business or research problem. However, it is often convenient to have one “working model” selected, because of the complexity of model building in the multivariate environment. Note, however, that the variable selection criteria for choosing the “best” model do not account for the multicollinearity that still exists among the predictors. Alert readers will have seen from Table 3.20 that the variance inflation factors for four or five variables are rather high, and will need some attention.
But first we need to address a problem that our working model has with a set of outliers. Figure 3.15 is a plot of the standardized residuals versus the fitted values for the current working model. Note the set of four outliers in the lower section of the plot. These are all cereals whose nutritional rating is lower than expected given their predictor variable levels. These cereals are:
• Record 46: Raisin Nut Bran
• Record 52: Apple Cinnamon Cheerios
• Record 55: Honey Nut Cheerios
• Record 56: Oatmeal Raisin Crisp
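Reading outliers off a residual plot can also be done programmatically. The following is a minimal sketch, assuming a cutoff of 2 in absolute value for the standardized residuals, which is a common rule of thumb and not a value given in the text. The record numbers and residuals in the usage lines are hypothetical and do not correspond to the cereals listed above.

```python
def flag_outliers(std_residuals, cutoff=2.0):
    """Return record ids whose standardized residual exceeds cutoff in absolute value.

    std_residuals maps a record id to its standardized residual.
    The default cutoff of 2.0 is an assumed rule of thumb.
    """
    return [rec for rec, resid in std_residuals.items() if abs(resid) > cutoff]


# Hypothetical standardized residuals for five illustrative records.
example = {1: 0.4, 2: -0.8, 3: -3.1, 4: 1.2, 5: -2.6}
flagged = flag_outliers(example)
print(flagged)
```

A negative standardized residual means the observed rating falls below the fitted value, which is the situation described for the four cereals.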
[Figure 3.15: standardized residuals (vertical axis) versus fitted values (horizontal axis) for the current working model.]