I have an interesting question on feature selection for an ordinal logistic regression model in R (the same applies in Python).
Suppose you have done everything possible to boil down a list of features, and you still find that not all of them are significant, i.e. p-value > 0.05.
At this point, model evaluation metrics are better when all features are included than when some of the non-significant features are removed.
In this case, do you typically remove features:
A, that have a negative coefficient and are not significant, or
B, that are not significant (regardless of sign)?
Thanks
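For context, the p-values in question are Wald tests on the fitted coefficients: coefficient divided by its standard error, compared to a normal distribution. A minimal sketch of how they arise, using a plain binary logistic regression as a simpler stand-in for the ordinal case, fit by Newton's method in pure Python. The data and the helper names (`fit_logistic`, `wald_pvalue`) are made up for illustration:

```python
import math

def fit_logistic(xs, ys, iters=25):
    """Fit y ~ intercept + slope * x by Newton's method.

    Returns (b0, b1), (se0, se1): coefficients and Wald standard errors.
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Observed information X'WX with W = diag(p * (1 - p))
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Gradient of the log-likelihood: X'(y - p)
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum(x * (y - p) for x, y, p in zip(xs, ys, ps))
        # Newton step: beta += (X'WX)^(-1) * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard errors: sqrt of the diagonal of (X'WX)^(-1) at convergence
    se0 = math.sqrt(h11 / det)
    se1 = math.sqrt(h00 / det)
    return (b0, b1), (se0, se1)

def wald_pvalue(beta, se):
    """Two-sided p-value for the Wald z-statistic beta / se."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))

# Made-up example data
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
(b0, b1), (se0, se1) = fit_logistic(xs, ys)
print("slope %.3f, SE %.3f, p = %.3f" % (b1, se1, wald_pvalue(b1, se1)))
```

This is essentially what `summary(model)` reports per coefficient in R: each p-value tests whether that coefficient differs from zero, given the other features in the model.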
What do you mean by the p-value of the feature? As in its correlation with the target before model training, or the relevance assigned by the model?
Danning Zhan as in the output of summary(model), i.e. after training.
Miranda Gorman with respect to the coefficients: the magnitude of a coefficient tells you how much impact the feature has on the target (provided the features are on comparable scales), and its sign tells you the direction of the association the model found between the feature and the target. Those are the important things to look at in the model, along with the standard evaluation metrics.
I assume you are using R, because such summary values aren't reported by scikit-learn in Python (statsmodels does provide them, though). From what I have read, these p-values seem kind of pointless, as each one looks at an individual feature and its individual impact, while the purpose of the model is to consider them all together. But I may be wrong; I have never seen such statistics being used before.
Danning Zhan I tend to share the same view. This probably explains why including all features, regardless of whether they are significant, creates a better model here. In other cases, removing the non-significant (or less important) features can generate better results.
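One principled way to arbitrate between "keep everything" and "drop the non-significant features" is a likelihood-ratio test between the two nested models, rather than eyeballing individual Wald p-values (in R this would be something like anova(reduced, full) on the two fitted models). A minimal sketch with made-up data, again using binary logistic regression as a stand-in for the ordinal case; the helper names and data are invented for illustration:

```python
import math

def loglik(xs, ys, b0, b1):
    """Bernoulli log-likelihood of y ~ intercept + slope * x."""
    ll = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        ll += math.log(p if y else 1.0 - p)
    return ll

def fit_slope(xs, ys, iters=25):
    """Newton's-method fit of y ~ intercept + slope * x."""
    b0 = b1 = 0.0
    for _ in range(iters):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum(x * (y - p) for x, y, p in zip(xs, ys, ps))
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up example data
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

# Reduced model: intercept only (closed-form MLE: p = mean of y)
pbar = sum(ys) / len(ys)
ll_reduced = sum(math.log(pbar if y else 1.0 - pbar) for y in ys)

# Full model: intercept + x
b0, b1 = fit_slope(xs, ys)
ll_full = loglik(xs, ys, b0, b1)

# LR statistic ~ chi-square, df = number of dropped features (1 here)
stat = 2.0 * (ll_full - ll_reduced)
p_value = math.erfc(math.sqrt(stat / 2.0))  # chi2_1 survival function
print("LR stat %.3f, p = %.3f" % (stat, p_value))
```

If the test fails to reject, the dropped features add nothing detectable to the fit, and you can prefer the smaller model even when the full model's in-sample metrics look slightly better.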