- I ran a model on a dataset to predict diabetes based on inactivity and obesity.
- As I worked on it, the model began to overfit.
- Once I had identified the overfitting, I used scikit-learn to test the model by performing k-fold cross validation on the data.
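- The objective of cross validation is to split the observations in the training dataset into several partitions (folds), so that every observation is used for both training and validation.
- The number of partitions (k) is chosen in advance, usually with the dataset size in mind; the number of observations then determines how many samples fall into each fold.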
- I split the data into k = 5 folds, then trained and evaluated the model five times, using a different fold as the validation set each time and the remaining four folds for training.
- Performance metrics from each fold are averaged to estimate the model’s generalization performance.
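- Cross validation helps guard against overfitting to some extent: it does not change the model itself, but it gives a more reliable estimate of how the model performs on unseen data, which makes overfitting easier to detect and correct.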
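
The 5-fold procedure described above can be sketched roughly as follows. This is a minimal illustration, not the original analysis: the data is a synthetic stand-in generated with `make_classification` (two features standing in for inactivity and obesity), and the choice of `LogisticRegression` and accuracy as the metric are assumptions, since the post does not name the model or metric used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the diabetes dataset (assumption): two numeric
# features playing the role of inactivity and obesity, plus a binary label.
X, y = make_classification(
    n_samples=500,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    random_state=0,
)

# Hypothetical model choice; the original post does not say which was used.
model = LogisticRegression()

# Split the data into k = 5 folds, shuffling once so folds are not ordered.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Train on 4 folds and validate on the remaining fold, once per fold.
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

# Average the per-fold scores to estimate generalization performance.
mean_score = scores.mean()
std_score = scores.std()

print("Accuracy per fold:", np.round(scores, 3))
print(f"Mean accuracy: {mean_score:.3f} (std {std_score:.3f})")
```

A large gap between the training accuracy and the mean cross-validated accuracy is one sign of overfitting; a large spread across folds suggests the estimate is sensitive to how the data happens to be split.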