Calibration
General Topic
- On Calibration of Modern Neural Networks
We discover that modern neural networks, unlike those from a decade ago, are poorly calibrated.
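The fix advocated in this paper is temperature scaling: a single scalar T, fit on a held-out validation set, rescales the logits before the softmax. A minimal sketch (function and variable names are illustrative, not from the paper's code):

```python
import torch

def fit_temperature(val_logits: torch.Tensor, val_labels: torch.Tensor) -> float:
    """Learn a single temperature T on validation logits by minimizing the NLL."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so that T stays positive
    optimizer = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)
    nll = torch.nn.CrossEntropyLoss()

    def closure():
        optimizer.zero_grad()
        loss = nll(val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().item()

# At test time, calibrated probabilities are softmax(test_logits / T).
```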
- Unified Uncertainty Calibration
- The difference between aleatoric uncertainty and epistemic uncertainty
- Aleatoric Uncertainty: arises from irreducible sources of randomness in the data-labelling process.
- Epistemic Uncertainty: for example, with a predictor trained on a dataset of cow and camel images, a good predictor raises its aleatoric uncertainty when a test image depicts a cow or a camel but is too blurry to tell which; epistemic uncertainty arises instead when the image depicts something other than these two animals, e.g. a screwdriver.
- In short, aleatoric uncertainty relates to the inherent noise or randomness in labelling the data, while epistemic uncertainty relates to how atypical or unknown the input is (see the decomposition sketch below).
Blog Post
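This is not the paper's unified calibration procedure, but the aleatoric/epistemic distinction above can be made concrete with the usual entropy decomposition over an ensemble (or MC-dropout samples): aleatoric uncertainty is the average entropy of the individual members' predictions, and epistemic uncertainty is the extra entropy that comes from disagreement between members. A rough sketch, assuming an array of per-member softmax outputs:

```python
import numpy as np

def decompose_uncertainty(member_probs: np.ndarray, eps: float = 1e-12):
    """member_probs: (n_members, n_classes) softmax outputs for one input.

    Returns (total, aleatoric, epistemic) where
      total     = entropy of the averaged prediction,
      aleatoric = average entropy of each member's prediction,
      epistemic = total - aleatoric (mutual information / disagreement).
    """
    mean_p = member_probs.mean(axis=0)
    total = -(mean_p * np.log(mean_p + eps)).sum()
    aleatoric = -(member_probs * np.log(member_probs + eps)).sum(axis=1).mean()
    epistemic = total - aleatoric
    return total, aleatoric, epistemic

# Blurry cow/camel image -> members agree on a flat distribution (high aleatoric).
# Screwdriver image      -> members disagree with each other (high epistemic).
```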
Related Works
Calibration & Fairness
- On Fairness and Calibration
In this paper, we investigate the tension between minimizing error disparity across different population groups while maintaining calibrated probability estimates.
- Sample Complexity of Uniform Convergence for Multicalibration
Multicalibration gives a comprehensive methodology to address group fairness. Our work gives sample complexity bounds for uniform convergence guarantees of multicalibration error.
Calibration & OOD
- On Calibration and Out-of-domain Generalization: Summary JP
In this work, we draw a link between OOD performance and model calibration, arguing that calibration across multiple domains can be viewed as a special case of an invariant representation leading to better OOD generalization.
- Trainable Calibration Measures for Neural Networks from Kernel Mean Embeddings: Main reference of the paper above. Summary
We propose a more principled fix that minimizes an explicit calibration error during training.
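The trainable measure the paper introduces is MMCE, a kernel-based penalty on the gap between confidence and correctness that can be added to the NLL during training. A rough sketch of the unweighted form, assuming a Laplacian kernel on confidences (the paper also derives a re-weighted variant, omitted here):

```python
import torch

def mmce_penalty(probs: torch.Tensor, labels: torch.Tensor, width: float = 0.4) -> torch.Tensor:
    """Kernel-based calibration penalty in the spirit of MMCE.

    probs:  (N, C) predicted probabilities (differentiable).
    labels: (N,)   ground-truth class indices.
    """
    conf, pred = probs.max(dim=1)                 # r_i: confidence of the top prediction
    correct = (pred == labels).float()            # c_i: 1 if the prediction is right
    diff = correct - conf                         # (c_i - r_i)
    # Laplacian kernel on confidences: k(r_i, r_j) = exp(-|r_i - r_j| / width)
    kernel = torch.exp(-torch.abs(conf.unsqueeze(0) - conf.unsqueeze(1)) / width)
    mmce_sq = (diff.unsqueeze(0) * diff.unsqueeze(1) * kernel).mean()
    return mmce_sq.clamp(min=0).sqrt()

# Training objective (lambda is a hyperparameter): loss = nll + lambda * mmce_penalty(probs, labels)
```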
- Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
We present a large-scale benchmark of existing state-of-the-art methods on classification problems and investigate the effect of dataset shift on accuracy and calibration.
Approaches to Overcoming the Calibration Problem
- Preventing Failures Due to Dataset Shift: Learning Predictive Models That Transport
We propose a proactive approach which learns a relationship in the training domain that will generalize to the target domain by incorporating prior knowledge of aspects of the data generating process that are expected to differ as expressed in a causal selection diagram.
- Calibrating deep neural networks using focal loss
We provide a thorough analysis of the factors causing miscalibration, and use the insights we glean from this to justify the empirically excellent performance of focal loss.
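For reference, the multi-class focal loss itself is a one-line change from cross-entropy: the log-likelihood of the true class is down-weighted by (1 - p_t)^γ, so easy, confidently classified examples contribute less and the network is pushed less hard toward overconfidence. A minimal sketch (γ = 3 as a typical fixed value; the paper also studies scheduled variants):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, labels: torch.Tensor, gamma: float = 3.0) -> torch.Tensor:
    """Multi-class focal loss: FL = -(1 - p_t)^gamma * log(p_t), averaged over the batch."""
    log_pt = F.log_softmax(logits, dim=1).gather(1, labels.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return (-(1.0 - pt) ** gamma * log_pt).mean()
```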
- Intra Order-preserving Functions for Calibration of Multi-Class Neural Networks
In this work, we aim to learn general post-hoc calibration functions that can preserve the top-k predictions of any deep network.
- Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration, poster, summary
We propose a natively multiclass calibration method applicable to classifiers from any model class, derived from Dirichlet distributions and generalising the beta calibration method from binary classification.
Blog Post
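The Dirichlet calibration map from the paper above has a simple parametric form: take the element-wise log of the predicted probabilities, apply a linear transform, and re-normalise with a softmax, fitting the transform on a held-out set. A minimal sketch without the paper's ODIR regularisation (class names and the fitting routine are illustrative):

```python
import torch

class DirichletCalibrator(torch.nn.Module):
    """Post-hoc map: softmax(W * log(q) + b), fitted on held-out probabilities."""
    def __init__(self, n_classes: int):
        super().__init__()
        self.linear = torch.nn.Linear(n_classes, n_classes)

    def forward(self, probs: torch.Tensor) -> torch.Tensor:
        return torch.softmax(self.linear(torch.log(probs.clamp_min(1e-12))), dim=1)

def fit(calibrator, val_probs, val_labels, epochs: int = 200, lr: float = 0.01):
    opt = torch.optim.Adam(calibrator.parameters(), lr=lr)
    nll = torch.nn.NLLLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = nll(torch.log(calibrator(val_probs) + 1e-12), val_labels)
        loss.backward()
        opt.step()
    return calibrator
```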
Calibration Metrics
- Measuring Calibration in Deep Learning
We propose two new measures for calibration, the Static Calibration Error (SCE) and Adaptive Calibration Error (ACE).
Blog Post
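SCE and ACE above are refinements of the standard binned Expected Calibration Error (ECE), which only looks at the top-label confidence. As a point of reference, a minimal ECE implementation with equal-width confidence bins:

```python
import numpy as np

def expected_calibration_error(probs: np.ndarray, labels: np.ndarray, n_bins: int = 15) -> float:
    """Standard top-label ECE with equal-width confidence bins.

    probs:  (N, C) predicted probabilities.
    labels: (N,)   ground-truth class indices.
    """
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            ece += in_bin.mean() * abs(correct[in_bin].mean() - conf[in_bin].mean())
    return ece
```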
Model Selection
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning: Summary JP
In this work, we present a scalable marginal-likelihood estimation method to select both hyperparameters and network architectures, based on the training data alone.