Lead Data Scientist Question:

Tell us how you would validate a multiple regression model that you built to predict a quantitative outcome variable.

Answer:

Proposed methods for model validation:

☛ If the values predicted by the model fall far outside the range of the response variable, this immediately indicates poor estimation or model inaccuracy.
☛ If the predicted values seem reasonable, examine the estimated coefficients. Any of the following would indicate poor estimation or multicollinearity: signs opposite to what is expected, unusually large or small values, or inconsistent behavior when the model is fed new data.
☛ Use the model to predict on new data, and use the coefficient of determination (R squared) between the predicted and observed values as a measure of model validity.
☛ Use data splitting to form a separate dataset for estimating model parameters, and another for validating predictions.
☛ Use jackknife resampling if the dataset contains a small number of instances, and measure validity with R squared and mean squared error (MSE).
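
The checks above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive validation procedure: the data are synthetic, the helper names (fit_ols, predict, r_squared, mse) are hypothetical, the 150/50 split and the 30-row "small sample" are arbitrary choices, and the jackknife step is implemented as delete-one (leave-one-out) prediction, which is one common way to apply the idea.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    return beta[0] + X @ beta[1:]

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

# Synthetic data: an assumption made purely for illustration.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
true_beta = np.array([2.0, 1.5, -0.7, 0.3])  # intercept, then three slopes
y = true_beta[0] + X @ true_beta[1:] + rng.normal(scale=0.5, size=n)

# Data splitting: estimate parameters on one part, validate on the other.
idx = rng.permutation(n)
train, test = idx[:150], idx[150:]
beta = fit_ols(X[train], y[train])
y_hat = predict(beta, X[test])

# Range check: share of predictions outside the observed response range.
outside = (y_hat < y[train].min()) | (y_hat > y[train].max())
frac_outside = float(outside.mean())

# R squared and MSE on data the model has not seen.
holdout_r2 = r_squared(y[test], y_hat)
holdout_mse = mse(y[test], y_hat)

# Delete-one (jackknife-style) validation on a deliberately small sample:
# refit without each row in turn, then predict the row that was left out.
small = idx[:30]
loo_pred = np.empty(len(small))
for k in range(len(small)):
    keep = np.delete(small, k)
    b = fit_ols(X[keep], y[keep])
    loo_pred[k] = predict(b, X[small[k:k + 1]])[0]
loo_r2 = r_squared(y[small], loo_pred)
loo_mse = mse(y[small], loo_pred)

print("coefficients:", np.round(beta, 3))
print("fraction of predictions outside range:", frac_outside)
print("hold-out R squared / MSE:", holdout_r2, holdout_mse)
print("delete-one R squared / MSE:", loo_r2, loo_mse)
```

On real data, the printed coefficients would be compared against the signs and magnitudes expected from domain knowledge, and a large gap between in-sample and hold-out R squared would suggest overfitting.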
