Lecture 10
Regularization and Generalization
Today’s Topics:
1. Improving Generalization
2. Data Augmentation
3. Early Stopping
4. L1 and L2 Regularization
5. Dropout
1. Improving Generalization
Generalization refers to how well a trained model performs on unseen data.
A model that generalizes well captures the true underlying patterns in the dataset instead of memorizing noise from the training set.
Achieving high training accuracy is not enough — the goal is to ensure that the model performs well on new data.
Overfitting occurs when a model is too closely tailored to the training set, leading to poor test performance.
Key strategies to improve generalization:
- Collect more diverse data
- Use data augmentation
- Reduce model capacity (simpler models)
- Apply regularization (L1, L2, dropout)
- Use early stopping
- Employ transfer/self-supervised learning
2. Data Augmentation
Data augmentation expands the effective training set by applying label-preserving transformations to existing examples, improving robustness and reducing overfitting.
Useful when labeled data is scarce or costly (e.g., medical imaging).
Common augmentations:
- Random crop / resize
- Flipping, rotation, translation, zoom
- Adding noise or color jitter
- Mixup, CutMix
These simulate real-world variations and encourage invariant feature learning.
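As an illustration, here is a minimal sketch of an augmentation pipeline built with torchvision (the library choice and the specific parameter values are assumptions, not something the lecture prescribes):

```python
import torchvision.transforms as T

train_transform = T.Compose([
    T.RandomResizedCrop(224),          # random crop, then resize to 224x224
    T.RandomHorizontalFlip(p=0.5),     # flip half of the images
    T.RandomRotation(degrees=15),      # small random rotations
    T.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    T.ToTensor(),
    T.RandomErasing(p=0.25),           # occlusion noise (operates on tensors)
])
```

Because transformations are sampled on the fly, each epoch sees slightly different variants of the same underlying images; validation and test data are left unaugmented.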
3. Early Stopping
Early stopping halts training when validation performance stops improving — preventing overfitting.
Procedure
- Split data into training/validation/test.
- Track validation performance.
- Stop training when validation loss stops decreasing (typically after a patience window of several epochs), then restore the weights from the best epoch.
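A minimal sketch of this loop for a PyTorch-style model, assuming user-supplied train_one_epoch and evaluate helpers (hypothetical names, not a library API) and a patience window before stopping:

```python
import copy

def train_with_early_stopping(model, train_one_epoch, evaluate,
                              max_epochs=100, patience=5):
    """Early stopping on validation loss with best-checkpoint restore."""
    best_loss = float("inf")
    best_state = copy.deepcopy(model.state_dict())
    epochs_without_improvement = 0

    for epoch in range(max_epochs):
        train_one_epoch(model)
        val_loss = evaluate(model)          # loss on the validation split

        if val_loss < best_loss:
            best_loss = val_loss
            best_state = copy.deepcopy(model.state_dict())
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                       # validation loss stopped improving

    model.load_state_dict(best_state)       # restore the best checkpoint
    return model
```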
4. L1 and L2 Regularization
Regularization penalizes large weights, encouraging simpler models and preventing overfitting.
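Concretely, both variants add a penalty term to the data loss. With weights $w_i$ and regularization strength $\lambda$, the standard objectives can be written as follows (scaling conventions vary; some texts put a factor of 1/2 on the L2 term):

```latex
L_{\mathrm{L1}}(w) = L_{\mathrm{data}}(w) + \lambda \sum_i |w_i|
\qquad
L_{\mathrm{L2}}(w) = L_{\mathrm{data}}(w) + \lambda \sum_i w_i^2
```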
4.1 L1 Regularization (Lasso)
- Promotes sparsity (many weights → 0)
- Useful for feature selection
4.2 L2 Regularization (Ridge)
- Penalizes the squared magnitude of weights
- Shrinks weights smoothly toward zero (rarely exactly zero)
- Commonly implemented as weight decay in optimizers
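In practice, L2 is usually applied via the optimizer's weight_decay argument, while L1 is added to the loss by hand. A minimal PyTorch sketch (the model and coefficient values are illustrative assumptions):

```python
import torch
import torch.nn as nn

model = nn.Linear(20, 1)        # stand-in model for illustration
criterion = nn.MSELoss()

# L2: most optimizers expose the squared-weight penalty as weight decay.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)

def loss_with_l1(inputs, targets, l1_lambda=1e-4):
    """Data loss plus an explicit L1 penalty summed over all parameters."""
    data_loss = criterion(model(inputs), targets)
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    return data_loss + l1_lambda * l1_penalty

# Usage inside a training step:
#   optimizer.zero_grad(); loss = loss_with_l1(x, y); loss.backward(); optimizer.step()
```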
5. Dropout
Dropout randomly zeroes a fraction of neuron activations during each training step, forcing the network to learn redundant, distributed representations. At test time dropout is disabled; in the standard inverted-dropout formulation, kept activations are scaled during training so that expected activations match at inference.
Why it works
- Prevents co-adaptation
- Acts like an implicit ensemble of many subnetworks
- Improves robustness
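A minimal PyTorch sketch (layer sizes and the dropout rate are illustrative); nn.Dropout implements inverted dropout, so no extra scaling is needed at test time:

```python
import torch
import torch.nn as nn

# A small MLP with dropout between layers.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # zeroes 50% of activations on each forward pass
    nn.Linear(256, 10),
)

model.train()            # dropout active; kept units scaled by 1/(1-p)
x = torch.randn(32, 784)
out_train = model(x)     # stochastic: different units dropped each call

model.eval()             # dropout disabled; full network runs deterministically
out_eval = model(x)
```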
Summary
Regularization and generalization methods, such as data augmentation, early stopping, L1/L2 penalties, and dropout, are essential to ensure models learn meaningful patterns rather than noise.
They enable robust generalization and stable performance on unseen data.