9. Generative Modeling: From Uncertainty Quantification to Stochastic Downscaling#
9.1. Learning Objectives:#
Define generative modeling
Know different ways of adding uncertainty to ML models
Understand how to evaluate such uncertainty estimates
Know three state-of-the-art generative models
Recognize applications of latent data representations
9.2. Generative Modeling#
Generative modeling refers to a class of models within artificial intelligence that focuses on creating new data samples that resemble a given dataset. Rather than performing predictive tasks such as classification or regression, generative models aim to understand and mimic the underlying structure of the data in order to generate entirely new samples.
These models learn the probability distribution of the input data, capturing the patterns and relationships among the data points. They can then generate new instances that, ideally, exhibit similar characteristics to the original dataset.
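As a toy illustration of this idea, the sketch below (written for this chapter, not taken from any particular method) "learns" a distribution by fitting a simple Gaussian to observed data and then samples new points from it. Real generative models replace this hand-picked parametric form with a learned, far more flexible distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Training" data: draws from some unknown, skewed process.
data = rng.gamma(shape=2.0, scale=1.5, size=1000)

# Step 1: learn the data distribution. Here we simply fit the mean and
# standard deviation of a Gaussian to the observations.
mu, sigma = data.mean(), data.std()

# Step 2: generate new samples from the learned distribution.
new_samples = rng.normal(mu, sigma, size=10)
print(new_samples)
```

A single Gaussian cannot reproduce the skewness of this dataset, which is precisely why the flexible neural generative models discussed later in this chapter are needed.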
9.3. Add Uncertainty to Machine Learning Models#
Incorporating uncertainty into machine learning models is pivotal, especially in scenarios where decisions rely on understanding the confidence or reliability of predictions. Several methods exist to inject uncertainty into models:
Ensemble Prediction: Ensemble methods, like bagging or boosting, involve training multiple models on different subsets of data or using diverse algorithms. The variance among predictions across these models reflects uncertainty. Aggregating these predictions, such as through averaging or weighted averaging, provides more robust estimates.
Multi-Model Approaches: Combining predictions from diverse models, possibly stemming from different architectures or learning algorithms, can provide a broader perspective on uncertainty. Integrating outputs from these models allows capturing various sources of uncertainty present in the data.
Parametric Distributional Prediction: Certain models, like Bayesian Neural Networks or Gaussian Processes, parameterize probability distributions over predictions. These models learn the uncertainty in terms of distribution parameters, enabling the estimation of full predictive distributions rather than point estimates.
Non-Parametric Distributional Prediction: Methods like Kernel Density Estimation or Histogram-based approaches directly estimate the underlying probability distribution of the predictions. These techniques offer non-parametric ways to model uncertainty without assuming specific functional forms for distributions.
Monte Carlo Dropout (MC Dropout): Applying dropout during both training and testing in neural networks, and then performing multiple forward passes during testing, allows the model to approximate Bayesian inference. The variance in predictions across these passes provides estimates of uncertainty.
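As a concrete example of parametric distributional prediction, the hedged sketch below shows a small PyTorch network that outputs a mean and a log-variance for every input and is trained with the Gaussian negative log-likelihood. The architecture, sizes, and variable names are illustrative assumptions, not a reference implementation.

```python
import torch
import torch.nn as nn

class GaussianRegressor(nn.Module):
    """Predicts the parameters (mean, log-variance) of a Gaussian for each input."""
    def __init__(self, n_in, n_hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_in, n_hidden), nn.ReLU())
        self.mean_head = nn.Linear(n_hidden, 1)
        self.logvar_head = nn.Linear(n_hidden, 1)

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.logvar_head(h)

def gaussian_nll(mean, logvar, y):
    # Negative log-likelihood of y under N(mean, exp(logvar)), up to a constant.
    return (0.5 * (logvar + (y - mean) ** 2 / logvar.exp())).mean()

model = GaussianRegressor(n_in=5)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x, y = torch.randn(128, 5), torch.randn(128, 1)  # placeholder data
for _ in range(100):
    mean, logvar = model(x)
    loss = gaussian_nll(mean, logvar, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# exp(logvar) is the predicted variance: an input-dependent uncertainty estimate.
```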
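A minimal MC Dropout sketch, assuming a PyTorch model whose dropout layers are deliberately kept active at prediction time; the network, dropout rate, and number of passes are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

# A small network with dropout; in MC Dropout the same dropout layers used
# during training are also applied at prediction time.
model = nn.Sequential(
    nn.Linear(5, 64), nn.ReLU(), nn.Dropout(p=0.2), nn.Linear(64, 1)
)

def mc_dropout_predict(model, x, n_passes=50):
    model.train()  # keep dropout active (do NOT switch to model.eval())
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_passes)])
    # The mean across passes is the prediction; the spread is the uncertainty estimate.
    return preds.mean(dim=0), preds.std(dim=0)

x = torch.randn(8, 5)  # placeholder inputs
mean, std = mc_dropout_predict(model, x)
```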
Fig 1. Summary schematics and key ideas for non-Bayesian uncertainty quantification approaches (Haynes et al. 2023)
By representing uncertainty explicitly, models become more robust, reliable, and adaptable to real-world complexities.
Fig 2. Information flow for a machine learning model with uncertainty quantification (Haynes et al. 2023)
9.4. Evaluate Uncertainty#
Evaluating uncertainty estimates in machine learning models involves various methods and metrics, such as:
Spread-Skill Plot: Assesses how well the predicted uncertainty matches the actual error. Predictions are binned by their predicted spread (e.g., ensemble standard deviation), and the mean spread in each bin is plotted against the corresponding error (e.g., RMSE). A well-calibrated model falls close to the 1:1 line: cases assigned larger spread should indeed exhibit larger errors.
Continuous Ranked Probability Score (CRPS): Measures the discrepancy between the predicted cumulative distribution function and the empirical (step-function) CDF of the observation, integrated over all thresholds; for a deterministic forecast it reduces to the absolute error. Lower CRPS values indicate better-calibrated and sharper uncertainty estimates.
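As an illustration, the sketch below implements the standard empirical CRPS estimator for an ensemble forecast, CRPS = E|X - y| - 0.5·E|X - X'|, with the expectations taken over ensemble members; verification packages such as properscoring provide equivalent, optimized implementations.

```python
import numpy as np

def crps_ensemble(ensemble, obs):
    """Empirical CRPS for a 1-D ensemble forecast and a scalar observation.

    Uses CRPS = E|X - y| - 0.5 * E|X - X'|, with expectations taken over
    the ensemble members (the standard, slightly biased estimator).
    """
    ensemble = np.asarray(ensemble, dtype=float)
    term1 = np.mean(np.abs(ensemble - obs))
    term2 = 0.5 * np.mean(np.abs(ensemble[:, None] - ensemble[None, :]))
    return term1 - term2

members = np.array([2.1, 2.4, 1.9, 2.8, 2.2])  # toy ensemble forecast
print(crps_ensemble(members, obs=2.5))
```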
9.5. Notable Generative Models:#
Autoencoders: Neural networks designed to learn a compressed representation of the input data, which can then be used to generate similar data.
Generative Adversarial Networks (GANs): Comprising two networks—a generator and a discriminator—that compete against each other. The generator creates synthetic samples, and the discriminator tries to distinguish between real and generated data. Through adversarial training, the generator improves its ability to produce more realistic samples.
Variational Autoencoders (VAEs): Autoencoders that learn a probabilistic latent space: the encoder outputs the parameters of a distribution over latent codes, and training balances reconstruction quality against a KL-divergence term that keeps this distribution close to a simple prior. New data points are generated by decoding samples drawn from that prior (a minimal sketch follows this list).
Probabilistic Graphical Models (PGMs): A framework for modeling probability distributions using graphs, incorporating nodes to represent random variables and edges to signify dependencies between variables.
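To make the VAE idea concrete, here is a minimal sketch, assuming flattened inputs scaled to [0, 1] (e.g., 28×28 images); the layer sizes and latent dimension are arbitrary illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    """Minimal fully-connected VAE for flat input vectors."""
    def __init__(self, n_in=784, n_latent=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_in, 256), nn.ReLU())
        self.mu = nn.Linear(256, n_latent)
        self.logvar = nn.Linear(256, n_latent)
        self.dec = nn.Sequential(nn.Linear(n_latent, 256), nn.ReLU(),
                                 nn.Linear(256, n_in), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I).
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term + KL divergence between q(z|x) and the N(0, I) prior.
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

model = VAE()
x = torch.rand(32, 784)                   # placeholder batch in [0, 1]
x_hat, mu, logvar = model(x)
loss = vae_loss(x, x_hat, mu, logvar)

# After training, new samples are generated by decoding draws from the prior:
samples = model.dec(torch.randn(8, 16))
```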
9.6. Latent data representations#
Latent data representations, existing on a latent manifold, are condensed, abstract encodings extracted from raw data through methods like autoencoders, VAEs, or GANs. These representations capture essential features while reducing dimensionality, residing within a structured latent space.
In semi-supervised learning, the latent manifold underlying these representations can capture the structure or distribution of the data and thereby improve performance when only a small portion of the data is labeled. Unlabeled points that lie close to clusters or structures in the latent manifold help the model generalize, so it learns from labeled and unlabeled data alike. The structure of the manifold informs decision boundaries and the assignment of labels to unlabeled points, while the smoothness and continuity of the latent space guide the model’s predictions, allowing more accurate classification even with limited labeled data. A sketch of this workflow follows below.
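A hedged sketch of this semi-supervised workflow: an autoencoder is first trained on all (mostly unlabeled) data, and a simple classifier is then fitted on the latent codes of the few labeled examples. The data, sizes, and the choice of scikit-learn’s LogisticRegression are illustrative assumptions.

```python
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

# Placeholder data: 1000 samples, only the first 50 have labels.
x_all = torch.randn(1000, 20)
y_few = torch.randint(0, 2, (50,))

# Step 1: learn a latent representation from ALL data (no labels needed).
encoder = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 4))
decoder = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 20))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
for _ in range(200):
    z = encoder(x_all)
    loss = ((decoder(z) - x_all) ** 2).mean()  # reconstruction loss
    opt.zero_grad()
    loss.backward()
    opt.step()

# Step 2: train a simple classifier on the latent codes of the labeled subset,
# then propagate labels to unlabeled points via their position in latent space.
with torch.no_grad():
    z_labeled = encoder(x_all[:50]).numpy()
    z_unlabeled = encoder(x_all[50:]).numpy()
clf = LogisticRegression().fit(z_labeled, y_few.numpy())
pseudo_labels = clf.predict(z_unlabeled)
```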