Lead Data Scientist Question:

What cross-validation technique would you use on a time series dataset?


Answer:

Standard k-fold cross-validation is not appropriate here, because a time series is not randomly distributed data: it is inherently ordered chronologically. Shuffling observations into random folds would let the model train on data from after the period it is tested on.

For time series data, use a technique such as forward chaining, where you train the model on past data and then evaluate it on the data that immediately follows. Each fold extends the training window by one block and tests on the next:

fold 1: training[1], test[2]

fold 2: training[1 2], test[3]

fold 3: training[1 2 3], test[4]

fold 4: training[1 2 3 4], test[5]
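
The forward-chaining scheme listed above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `forward_chaining_splits` and the idea of numbering the data in five consecutive time blocks are assumptions made for the example.

```python
def forward_chaining_splits(n_blocks):
    """Yield (train_blocks, test_block) pairs for expanding-window validation.

    Blocks are numbered 1..n_blocks in chronological order (an assumption
    for this sketch). Each fold trains on every block before the test block,
    so the model never sees data from the future.
    """
    for test_block in range(2, n_blocks + 1):
        train_blocks = list(range(1, test_block))
        yield train_blocks, test_block


# Five chronological blocks give four folds.
splits = list(forward_chaining_splits(5))
for fold, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold}: training{train}, test[{test}]")
```

In practice, each block number would map to a contiguous slice of rows sorted by timestamp. scikit-learn offers a comparable expanding-window splitter, `TimeSeriesSplit`, in `sklearn.model_selection`.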
