Lead Data Scientist Question:

How would you validate a multiple regression model that you built to predict a quantitative outcome variable?

Answer:

Proposed methods for model validation:

☛ If the values predicted by the model are far outside of the response variable range, this would immediately indicate poor estimation or model inaccuracy.
☛ If the predicted values seem reasonable, examine the estimated coefficients; any of the following would indicate poor estimation or multicollinearity: signs opposite to what is expected, unusually large or small values, or inconsistent results when the model is fed new data.
☛ Use the model to predict on new data, and use the coefficient of determination (R squared) computed on those predictions as a measure of model validity.
☛ Use data splitting: estimate the model parameters on one subset of the data and validate the predictions on a separate, held-out subset.
☛ Use jackknife resampling if the dataset contains only a small number of observations, and measure validity with R squared and mean squared error (MSE).
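
The checks above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive procedure: it assumes NumPy is available, the data is synthetic, and the helper names (fit_ols, holdout_validate, leave_one_out_validate) are hypothetical. For the small-sample case it uses delete-one (leave-one-out) refitting, the same delete-one scheme the jackknife is built on, and scores the left-out predictions with R squared and MSE.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit of a multiple regression; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    return beta[0] + X @ beta[1:]

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def holdout_validate(X, y, test_fraction=0.25, seed=0):
    """Data splitting: estimate on one subset, validate on the other."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test, train = idx[:n_test], idx[n_test:]
    beta = fit_ols(X[train], y[train])
    pred = predict(beta, X[test])
    # Count predictions falling outside the response range seen in training.
    out_of_range = int(np.sum((pred < y[train].min()) | (pred > y[train].max())))
    return {"beta": beta, "r2": r_squared(y[test], pred),
            "mse": mse(y[test], pred), "out_of_range": out_of_range}

def leave_one_out_validate(X, y):
    """Delete-one refitting: predict each row from a model fit without it."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta = fit_ols(X[keep], y[keep])
        preds[i] = predict(beta, X[i:i + 1])[0]
    return {"r2": r_squared(y, preds), "mse": mse(y, preds)}

# Synthetic data (an assumption for illustration): y = 3 + 2*x1 - 1.5*x2 + noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=60)

holdout = holdout_validate(X, y)
loo = leave_one_out_validate(X, y)

print("coefficients:", np.round(holdout["beta"], 2))
print("hold-out R^2:", round(holdout["r2"], 3), "MSE:", round(holdout["mse"], 3))
print("predictions outside training range:", holdout["out_of_range"])
print("leave-one-out R^2:", round(loo["r2"], 3), "MSE:", round(loo["mse"], 3))
```

In this sketch the printed coefficients can be compared against the expected signs (positive for the first predictor, negative for the second), and a large gap between the hold-out and leave-one-out scores would be a reason to look more closely at the model.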
