Recommended Reading
My Recommended Books & Papers
Books
- Pearl et al. Causal inference in statistics: a primer. John Wiley & Sons 2016.
- Peters et al. Elements of causal inference: foundations and learning algorithms. MIT Press 2017.
- Bishop. Pattern recognition and machine learning. Springer 2006.
- Tengyu Ma. Lecture notes for machine learning theory.
- Mohri et al. Foundations of machine learning (second edition). MIT Press 2018.
- Pierre Alquier. User-friendly introduction to PAC-Bayes bounds. 2024.
- Endre Süli and David F. Mayers. An introduction to numerical analysis. Cambridge University Press 2003.
Papers
Why importance weighting fails under overparameterization
- Zhai et al. Understanding why generalized reweighting does not improve over ERM. In ICLR 2023.
- Xu et al. Understanding the role of importance weighting for deep learning. In ICLR 2021.
- Byrd and Lipton. What is the effect of importance weighting in deep learning? In ICML 2019.
Long-tail learning
- Menon et al. Long-tail learning via logit adjustment. In ICLR 2021.
- Ren et al. Balanced MSE for imbalanced visual regression. In CVPR 2022.
- Cao et al. Learning imbalanced datasets with label-distribution-aware margin loss. In NeurIPS 2019.
- Kini et al. Label-imbalanced and group-sensitive classification under overparameterization. In NeurIPS 2021.
- Kang et al. Decoupling representation and classifier for long-tailed recognition. In ICLR 2020.
Robust fine-tuning
- Wortsman et al. Robust fine-tuning of zero-shot models. In CVPR 2022.
- Kumar et al. Calibrated ensembles can mitigate accuracy tradeoffs under distribution shift. In UAI 2022.
- Kumar et al. Fine-tuning can distort pretrained features and underperform out-of-distribution. In ICLR 2022.
- Li et al. Explicit inductive bias for transfer learning with convolutional networks. In ICML 2018.
- Tian et al. Trainable projected gradient method for robust fine-tuning. In CVPR 2023.
Robustness
- Foret et al. Sharpness-aware minimization for efficiently improving generalization. In ICLR 2021.
- Martin Arjovsky. Out of distribution generalization in machine learning. PhD Thesis.
- Qiu et al. Simple and fast group robustness by automatic feature reweighting. In ICML 2023.
- Khani and Liang. Removing spurious features can hurt accuracy and affect groups disproportionately. In FAccT 2021.
- Zhou et al. Sparse invariant risk minimization. In ICML 2022.
Ensemble
- Tumer and Ghosh. Error correlation and error reduction in ensemble classifiers. Connection Science 1996.
- Tumer and Ghosh. Analysis of decision boundaries in linearly combined neural classifiers. Pattern Recognition 1996.
Others
- Sun et al. Out-of-distribution detection with deep nearest neighbors. In ICML 2022.
- Du et al. Algorithmic regularization in learning deep homogeneous models: layers are automatically balanced. In NeurIPS 2018.
- Bartlett et al. Local Rademacher complexities. The Annals of Statistics 2005.
- Bartlett et al. Benign overfitting in linear regression. PNAS 2020.