Invited Session Abstracts
66 Presentations | 19 Sessions | 37 In-person | 29 Online
Advanced Bayesian Methods for Analyzing Complex Data
Bldg 219 Room 102
3 Presentations
1. Bayesian confounder selection using decision trees
Speaker: Chanmin Kim, Sungkyunkwan University
Online
Abstract:
Causal mediation analysis plays a vital role in understanding the mechanisms through which an exposure variable influences an outcome, by decomposing the total effect into direct and indirect effects. A major challenge in mediation analysis is the presence of unobserved or high-dimensional confounders that can bias these causal effect estimates. In this talk, we propose a Bayesian nonparametric framework for confounder selection in causal mediation analysis, leveraging a flexible extension of Bayesian Additive Regression Trees (BART). Our approach incorporates sparsity-inducing priors to identify potential confounders that satisfy a modified disjunctive cause criterion—ensuring appropriate adjustment for variables that affect the exposure, mediator, or outcome. We demonstrate the consistency of confounder selection under high-dimensional settings, providing theoretical guarantees for the posterior concentration and selection probabilities. Through extensive simulation studies, we compare our method against existing approaches, illustrating its superior performance in selecting true confounders and accurately estimating direct and indirect causal effects. Finally, we apply our propo
Keywords:
Bayesian additive regression trees, causal inference, confounder selection
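As a point of reference for the abstract above, the direct/indirect decomposition it refers to is the standard one from causal mediation analysis (notation ours, not the speaker's):
\[
\mathrm{TE} \;=\; \underbrace{\mathbb{E}\{Y(1, M(1)) - Y(1, M(0))\}}_{\text{natural indirect effect}} \;+\; \underbrace{\mathbb{E}\{Y(1, M(0)) - Y(0, M(0))\}}_{\text{natural direct effect}},
\]
where Y(a, m) is the potential outcome under exposure a and mediator level m, and M(a) is the potential mediator under exposure a. Identifying both components requires adjusting for confounders of the exposure-mediator, exposure-outcome, and mediator-outcome relationships, which is the selection problem the proposed BART-based priors address.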
2. A modified VAR-deGARCH model for asynchronous multivariate financial time series via variational Bayesian inference
Speaker: Ray-Bing Chen, National Tsing Hua University
In-person
Abstract:
This study proposes a modified VAR-deGARCH model, denoted by M-VAR-deGARCH, for modeling asynchronous multivariate financial time series with GARCH effects and simultaneously accommodating the latest market information. A variational Bayesian (VB) procedure is developed for the M-VAR-deGARCH model to infer structure selection and parameter estimation. We conduct extensive simulations and empirical studies to evaluate the fitting and forecasting performance of the M-VAR-deGARCH model. The simulation results reveal that the proposed VB procedure produces satisfactory selection performance. In addition, our empirical studies find that the latest market information in Asia can provide helpful information to predict market trends in Europe and South Africa, especially when momentous events occur.
Keywords:
Asynchronous time series, GARCH, Variable selection, Variational Bayesian inference, Vector autoregressive model
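The exact M-VAR-deGARCH specification is not spelled out in the abstract; as a hedged sketch, the class of models being modified is a vector autoregression with univariate GARCH(1,1) errors,
\[
\mathbf{y}_t = \mathbf{c} + \sum_{\ell=1}^{p} \Phi_\ell\, \mathbf{y}_{t-\ell} + \boldsymbol{\varepsilon}_t, \qquad
\varepsilon_{i,t} = \sigma_{i,t} z_{i,t}, \qquad
\sigma_{i,t}^2 = \omega_i + \alpha_i\, \varepsilon_{i,t-1}^2 + \beta_i\, \sigma_{i,t-1}^2,
\]
with the variational Bayesian procedure selecting which entries of the coefficient matrices \Phi_\ell are nonzero; the adjustments for asynchronous trading hours and the latest market information are specific to the paper and not reproduced here.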
3. Bayesian Group Sparsity for Detecting Multiple Structural Breaks in an AR(p) Process
Speaker: Kuo-Jung Lee, National Cheng Kung University
In-person
Abstract:
This paper proposes a Bayesian group sparsity method for modeling time series with structural breaks in autoregressive (AR) processes. Structural breaks are treated as unknown and estimated jointly with segment-specific AR parameters within a unified Bayesian variable selection framework, eliminating the need to pre-specify the number or location of breakpoints. The method uses groupwise Gibbs sampling (GWGS) for efficient estimation, even in long time series. Simulation studies confirm its ability to detect structural changes and recover regime-specific dynamics. Applications to U.S. macroeconomic indicators and S&P 500 stock returns show that the approach effectively captures shifts in economic conditions and financial market volatility. Compared to traditional Bayesian methods, the proposed model is more computationally efficient and suitable for high-frequency or complex data, offering a practical tool for structural break detection in economic and financial analysis.
Keywords:
Bayesian variable selection; Group sparsity; Inflation persistence; Multiple structural changes; Nonstationary autoregressive process
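One common way to cast multiple structural breaks as a group-sparsity problem, consistent in spirit with (though not necessarily identical to) the parameterization in the abstract above, is
\[
y_t = \sum_{j=1}^{p} \phi_{j,t}\, y_{t-j} + \varepsilon_t, \qquad
\boldsymbol{\phi}_t = \boldsymbol{\phi}_{t-1} + \boldsymbol{\delta}_t,
\]
where a group-sparse prior sets most increment vectors \boldsymbol{\delta}_t exactly to zero; each nonzero \boldsymbol{\delta}_t marks a breakpoint, so the number and locations of breaks are inferred jointly with the segment-specific AR coefficients rather than fixed in advance.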
Advanced Bayesian methods
Bldg 219 Room 105
3 Presentations
1. Bayesian Analysis of Tensor Product Neural Networks
Speaker: Yongdai Kim, Seoul National University
In-person
Abstract:
The growing emphasis on interpretability in machine learning has brought renewed attention to the functional ANOVA model, which offers a principled approach for decomposing high-dimensional functions into interpretable lower-dimensional components. Recently, the Tensor Product Neural Network (TPNN) has been proposed to estimate each component in the functional ANOVA model accurately and stably. The number of basis TPNNs, however, must be specified a priori, which significantly limits the applicability of TPNN. In this work, we propose a Bayesian TPNN which can learn the number of basis TPNNs as well as the basis TPNNs themselves. We develop an efficient MCMC algorithm and illustrate that the proposed Bayesian TPNN performs well by analyzing multiple benchmark datasets.
Keywords:
Explainable AI, Functional ANOVA model, Tensor Product Neural Networks
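The functional ANOVA decomposition underlying the abstract above is standard and worth writing out, since it is what each basis TPNN approximates:
\[
f(x_1, \ldots, x_d) = f_0 + \sum_{j} f_j(x_j) + \sum_{j<k} f_{jk}(x_j, x_k) + \cdots,
\]
where each component depends on only a small subset of inputs and is therefore individually interpretable. In the TPNN construction, each component is represented through tensor products of univariate networks (as the name suggests); the Bayesian extension discussed in the talk additionally places a prior on the number of such basis networks.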
2. Efficient MCMC for Bayesian Neural Networks
Speaker: Juho Lee, KAIST
In-person
Abstract:
Bayesian Neural Networks (BNNs) offer a principled framework for modeling predictive uncertainty and improving out-of-distribution (OOD) robustness by estimating posterior distributions over network parameters. Stochastic Gradient Markov Chain Monte Carlo (SGMCMC) enables scalable posterior sampling by combining stochastic gradients with Langevin dynamics, but often suffers from limited sample diversity, hindering uncertainty estimation and performance. We propose a simple and effective method to boost sample diversity in SGMCMC without tempering or multiple chains. By reparameterizing each weight matrix as a product of matrices, our approach induces trajectories that explore the parameter space more effectively. This leads to faster mixing and more diverse samples under the same computational budget, without increasing inference cost. Extensive experiments on image classification, including OOD robustness, loss landscape analyses, and comparisons with Hamiltonian Monte Carlo, validate the superiority of our method.
Keywords:
Bayesian neural networks, Stochastic gradient MCMC, Parameter expansion
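A minimal illustration of the reparameterization idea, under our own simplifying assumptions (plain SGLD, a Gaussian likelihood and prior, a single weight matrix stored as a product of two factors), is sketched below; it is not the authors' algorithm, only the kind of expanded parameterization the abstract describes.

import numpy as np

def sgld_step(params, grad_log_post, step_size, rng):
    """One SGLD update: theta <- theta + (step/2) * grad log posterior + N(0, step) noise."""
    grads = grad_log_post(params)
    return [p + 0.5 * step_size * g + rng.normal(scale=np.sqrt(step_size), size=p.shape)
            for p, g in zip(params, grads)]

# Toy data and a weight matrix W parameterized as a product U @ V (expanded parameterization).
rng = np.random.default_rng(0)
d_out, d_in, r = 8, 4, 8                      # r >= min(d_out, d_in) keeps the product full rank
U = 0.1 * rng.normal(size=(d_out, r))
V = 0.1 * rng.normal(size=(r, d_in))
X = rng.normal(size=(32, d_in))               # minibatch of inputs
y = rng.normal(size=(32, d_out))              # minibatch of targets

def grad_log_post(params):
    U, V = params
    W = U @ V
    resid = y - X @ W.T                       # Gaussian likelihood with unit noise
    g_W = resid.T @ X - W                     # gradient of log posterior w.r.t. W (N(0, I) prior on W)
    return [g_W @ V.T, U.T @ g_W]             # chain rule back to the factors U and V

for _ in range(100):                          # a few Langevin steps on the factored parameterization
    U, V = sgld_step([U, V], grad_log_post, step_size=1e-4, rng=rng)

Sampling (U, V) instead of W changes the geometry the Langevin trajectories explore, which is the mechanism the talk credits for faster mixing at the same computational budget.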
3. L2-norm posterior contraction in Gaussian models with unknown variance
Speaker: Seonghyun Jeong, Yonsei University
In-person
Abstract:
The testing-based approach is a fundamental tool for establishing posterior contraction rates. Although the Hellinger metric is attractive owing to the existence of a desirable test function, it is not directly applicable in Gaussian models, because translating the Hellinger metric into more intuitive metrics typically requires strong boundedness conditions. When the variance is known, this issue can be addressed by directly constructing a test function relative to the L2-metric using the likelihood ratio test. However, when the variance is unknown, existing results are limited and rely on restrictive assumptions. To overcome this limitation, we derive a test function tailored to an unknown variance setting with respect to the L2-metric and provide sufficient conditions for posterior contraction based on the testing-based approach. We apply this result to analyze high-dimensional regression and nonparametric regression.
Keywords:
Bayesian nonparametrics, High-dimensional regression, Nonparametric regression, Testing-based posterior contraction
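For orientation, the testing-based approach to contraction (in its standard form, not specific to this talk) asks for tests \varphi_n that separate the truth f_0 from the complement of a shrinking ball: together with the usual prior-mass and sieve conditions, posterior contraction at rate \epsilon_n with respect to a metric d follows if
\[
\mathbb{E}_{f_0}[\varphi_n] \le e^{-c\, n \epsilon_n^2}, \qquad
\sup_{f \in \mathcal{F}_n:\, d(f, f_0) > M\epsilon_n} \mathbb{E}_{f}[1 - \varphi_n] \le e^{-c\, n \epsilon_n^2}
\]
for constants c, M > 0. The contribution described above is the construction of such tests for the L2 metric in Gaussian models when the error variance is itself unknown.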
Advanced Bayesian methods for high-dimensional heterogeneous data
Bldg 219 Room 103
4 Presentations
1. Bayesian covariate-assisted interaction analysis for multivariate count data in microbiome study
Speaker: Juhee Lee, University of California, Santa Cruz
Online
Abstract:
Understanding covariate-dependent interdependencies among features is of great interest in various applications. Motivated by a dataset of multivariate counts from a microbiome study, where microbial abundance and interaction patterns may change with environmental factors, we develop a Bayesian covariate-dependent factor model that flexibly estimates heteroscedasticity in the covariance matrix due to covariates. Our approach employs covariance regression through linear regression on a lower-dimensional factor loading matrix. This formulation, combined with joint sparsity induced by the Dir-HS prior for the factor loadings, provides robust estimation of covariate-dependent covariance in high-dimensional settings. The model uses a regression approach to the mean abundance and addresses the varying mean and covariance structure with covariates. Furthermore, the model tackles significant statistical challenges such as discreteness, over-dispersion, compositionality, and high dimensionality that are common in microbiome data analyses, using a flexible nonparametric Bayesian approach. We thoroughly explore the properties of the model and perform extensive simulation studies to examine it
Keywords:
Covariate-dependent interdependencies, Microbiome data analysis, Multivariate counts, Bayesian factor model, Covariance regression, Heteroscedasticity.
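A hedged sketch of the covariance-regression construction described above (notation ours): with latent factors \boldsymbol{\eta}_i and covariate-dependent loadings
\[
\boldsymbol{\theta}_i = \Lambda(\mathbf{x}_i)\, \boldsymbol{\eta}_i + \boldsymbol{\epsilon}_i, \qquad
\Lambda(\mathbf{x}_i) = \Lambda_0 + \sum_{k} x_{ik} \Lambda_k, \qquad
\operatorname{Cov}(\boldsymbol{\theta}_i \mid \mathbf{x}_i) = \Lambda(\mathbf{x}_i)\Lambda(\mathbf{x}_i)^{\top} + \Sigma_{\epsilon},
\]
a linear regression on the loading matrix translates directly into covariate-dependent covariance, and joint shrinkage of the \Lambda_k via the Dir-HS prior keeps the estimate stable in high dimensions. The link from the latent \boldsymbol{\theta}_i to the observed multivariate counts handles over-dispersion and compositionality and is not detailed in the abstract.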
2. DAG trend filtering for genomic denoising via higher-order Bayesian networks and DAG shrinkage processes
Speaker: Weixuan Zhu, Xiamen University
Online
Abstract:
Graph-based denoising is a critical preprocessing step for analyzing noisy data, particularly in genomic applications where gene regulatory networks exhibit inherent directional dependencies. This paper introduces a novel directed acyclic graph trend filtering framework that leverages higher-order Bayesian networks and graphical shrinkage processes to enhance local adaptivity in signal smoothing along the directed edges of a graph. Unlike traditional graph trend filtering, which assumes undirected graphs, the proposed method explicitly respects the directional structure of graphs, improving interpretability and accuracy in capturing dependencies. We employ a Hamiltonian Monte Carlo algorithm for efficient posterior inference. Through simulations and genomic applications, the proposed method outperforms a state-of-the-art graph trend filtering algorithm in terms of mean squared error reduction and signal-to-noise ratio improvement, demonstrating its utility in recovering true signals while accounting for meaningful structural information.
Keywords:
Directed acyclic graph; Gene regulatory networks; Graph trend filtering; Graphical shrinkage process; Higher-order smoothing.
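Classical graph trend filtering penalizes differences of the signal across undirected edges; a directed analogue in the spirit of the abstract above (our notation, not necessarily the authors' exact operator) penalizes each node's deviation from a combination of its parents,
\[
\mathrm{pen}(\boldsymbol{\beta}) = \sum_{j} \Big|\, \beta_j - \sum_{i \in \mathrm{pa}(j)} w_{ij}\, \beta_i \Big|,
\]
with the Bayesian version replacing the \ell_1 penalty by a graphical shrinkage prior on these node-wise discrepancies (and higher-order versions applying the difference operator repeatedly), sampled with Hamiltonian Monte Carlo.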
3. Generalized Bayesian nonparametric clustering framework for high-dimensional spatial data
Speaker: Bencong Zhu, Hong Kong University of Science and Technology
Online
Abstract:
The advent of next-generation sequencing-based spatially resolved transcriptomics (SRT) techniques has transformed genomic research by enabling high-throughput gene expression profiling while preserving spatial context. Identifying spatial domains within SRT data is a critical task, with numerous computational approaches currently available. However, most existing methods rely on a multi-stage process that involves ad-hoc dimension reduction techniques to manage the high dimensionality of SRT data. Additionally, many approaches depend on arbitrarily specifying the number of clusters, which can result in information loss and suboptimal downstream analysis. To address these limitations, we propose a novel Bayesian nonparametric mixture of factor analysis (BNPMFA) model, which incorporates a Markov random field-constrained Gibbs-type prior for partitioning high-dimensional spatial omics data. This new prior effectively integrates the spatial constraints inherent in SRT data while simultaneously inferring cluster membership and determining the optimal number of spatial domains. We have established the theoretical identifiability of cluster membership within this framework.
Keywords:
spatially resolved transcriptomics, factor analysis, Gibbs-type priors, Markov random field
4. Spatial panel data model with multi-dimensional heterogeneity: A Bayesian nonparametric approach
Speaker: Jianchao Zhuo, Xiamen University
Online
Abstract:
This paper introduces an evolving grouped pattern of heterogeneity to the spatial panel autoregressive model, where group membership is left unrestricted and allowed to change over time. This approach enables dynamic clustering while simultaneously accounting for spatial dependence among cross-sectional units. Our model extends existing heterogeneous spatial panel data models by incorporating a hidden Markov model to represent group membership, which allows membership to transition over time. Rather than assuming static group structures or a fixed number of temporal membership breaks, our approach can accommodate uncertainty regarding the number of groups and their evolution. In an empirical study, we estimate the evolving group structure with spatial contemporaneous effects and unobserved heterogeneity to reveal the dynamic structure of industrial upgrading in China.
Keywords:
Bayesian nonparametrics, spatial panel data, group heterogeneity
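As a hedged sketch of the model class (notation ours), a spatial autoregressive panel with evolving group membership might read
\[
y_{it} = \rho_{g_{it}} \sum_{j \ne i} w_{ij}\, y_{jt} + \mathbf{x}_{it}^{\top} \boldsymbol{\beta}_{g_{it}} + \alpha_i + \varepsilon_{it},
\qquad
P(g_{i,t+1} = l \mid g_{it} = k) = \pi_{kl},
\]
where w_{ij} are spatial weights, g_{it} is the latent group label of unit i at time t following a hidden Markov chain, and a Bayesian nonparametric prior on the labels leaves the number of groups unrestricted.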
Advancements in Bayesian Modeling of Complex Data Patterns
Bldg 219 Room 104
4 Presentations
1. Region Selection with Spatially Dependent Continuous Shrinkage Prior with an Application to Hurricane Prediction
Speaker: Shuang Zhou, Arizona State University
Online
Abstract:
In this talk, we focus on developing a novel spatially dependent shrinkage prior for high-dimensional areal data under the Bayesian framework. The motivating problem originated from a climate data application that aims at predicting hurricane occurrences in the Atlantic basin region in the United States and understanding the climate system by extracting the significant sub-regions that may be related to hurricane occurrence. A high-dimensional Bayesian Poisson model is mainly discussed in this work, with both the covariate vector and the coefficient vector representing certain spatial correlation patterns. Unfortunately, current Bayesian variable selection techniques hardly capture the spatial correlation structure present in areal data. Therefore, we propose to apply continuous shrinkage priors to Bayesian spatial models, such as the conditional autoregressive (CAR) model, for the purpose of region selection. In this talk, numerical results will be presented to show the robust performance of our method for region selection under various spatial settings, and a real data application is discussed regarding hurricane prediction for the Atlantic basin region from 1950 to 2013.
Keywords:
High-dimensional; Spatial model; Dependent shrinkage prior; Areal data
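One way to combine conditional autoregressive dependence with continuous shrinkage, in the spirit of the abstract above (our notation, not the exact prior proposed), is
\[
\boldsymbol{\beta} \mid \boldsymbol{\lambda}, \tau \sim \mathrm{N}\!\big(\mathbf{0},\; \tau^2\, \Lambda^{1/2} (D - \rho W)^{-1} \Lambda^{1/2}\big), \qquad \Lambda = \operatorname{diag}(\lambda_1^2, \ldots, \lambda_p^2),
\]
where W is the adjacency matrix of the areal units, D its diagonal of row sums, (D - \rho W)^{-1} supplies CAR-type spatial correlation, and heavy-tailed local scales \lambda_j (as in the horseshoe) supply shrinkage toward zero, so that contiguous blocks of coefficients tend to be retained or discarded together.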
2. A Bayesian Nonparametric Approach to Recycling Classification Models via Clustering under Domain and Category Shift
Speaker: Zeya Wang, University of Kentucky
Online
Abstract:
Recycling pretrained classification models for new domains has been extensively studied under the closed-set assumption that source and target domains share identical label spaces. However, this assumption does not hold when unseen classes appear in the target domain. Addressing this category shift remains particularly challenging in the source-free setting, where access to source data is unavailable. This more general and realistic scenario accounts for unknown target classes with no prior information on their identities or quantity. Most existing methods proposed for this task treat all unknown classes as a single group during both training and evaluation, limiting their capacity to model the underlying structure within the unknown class space. In this work, we present Adapt via Bayesian Nonparametric Clustering (ABC), a novel framework that, unlike prior methods, explicitly achieves fine-grained classification including unknown target classes, offering a more structured vision of the problem. Experiments on standard benchmarks demonstrate ABC’s superior performance and effective clustering of unknown classes.
Keywords:
Bayesian Nonparametric Clustering, Dirichlet Process, Transfer Learning
3. Scalable and robust regression models for continuous proportional data
Speaker: Changwoo Lee, Duke University
Online
Abstract:
Beta regression is used routinely for continuous proportional data, but it often encounters practical issues such as a lack of robustness of regression parameter estimates to misspecification of the beta distribution. We develop an improved class of generalized linear models starting with the continuous binomial (cobin) distribution and further extending to dispersion mixtures of cobin distributions (micobin). The proposed cobin regression and micobin regression models have attractive robustness, computation, and flexibility properties. A key innovation is the Kolmogorov-Gamma data augmentation scheme, which facilitates Gibbs sampling for Bayesian computation, including in hierarchical cases involving nested, longitudinal, or spatial data. We demonstrate robustness, ability to handle responses exactly at the boundary (0 or 1), and computational efficiency relative to beta regression in simulation experiments and through analysis of the benthic macroinvertebrate multimetric index of US lakes using lake watershed covariates.
Keywords:
Bayesian, Data augmentation, Generalized linear model, Latent Gaussian model, Markov chain Monte Carlo
4. Bayesian Non-parametrics for Spatio-temporal Data Sets
Speaker: Marcin Jurek, Southern Methodist University
Online
Abstract:
Many environmental phenomena that evolve through time are extremely complex and difficult to model adequately. For example, modern remote sensing tools in atmospheric sciences allow us to obtain millions of measurements of certain variables with high frequency. At the same time, representing their temporal evolution requires very complex models and enormous computing power. In addition, these complicated models often take a very long time to develop theoretically and to implement, which leaves only a handful of highly specialized researchers able to use these tools. A promising approach to solving these problems is based on Gaussian Process State Space models (GPSSMs), a Bayesian nonparametric method which consists of imposing a Gaussian process prior on the unknown evolution system. However, the existing techniques using GPSSMs were built for low-dimensional systems, and without adjustments, they would be computationally infeasible if the dimension of the system is high. In this talk, we show how this approach can be scaled to high-dimensional environmental problems.
Keywords:
Gaussian process, spatio-temporal data, Bayesian nonparametrics
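The Gaussian process state space model referenced above has the generic form
\[
\mathbf{x}_{t+1} = f(\mathbf{x}_t) + \boldsymbol{\eta}_t, \qquad
\mathbf{y}_t = g(\mathbf{x}_t) + \boldsymbol{\epsilon}_t, \qquad
f \sim \mathcal{GP}(m, k),
\]
with latent state x_t, observations y_t, and a Gaussian process prior on the unknown evolution operator f. The difficulty the talk addresses is that x_t in environmental applications can be a gridded field with thousands to millions of entries, far beyond what standard GPSSM inference is built for.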
Advances in High-Dimensional Inference
Bldg 219 Room B119
3 Presentations
1. Bayesian Multilevel Network Recovery Selection
Speaker: Inyoung Kim, Virginia Tech
In-person
Abstract:
In this talk, we examine multilevel network recovery selection under a two-level structure in which higher-level variables contain lower-level variables nested within them. Due to the dependency structure, variables work together to accomplish certain tasks at both levels. Our main interest is to simultaneously explore variable selection and identify dependency structures among both higher and lower-level variables under a nonadditive model framework. We develop a multi-level nonparametric kernel machine approach with a newly proposed multilevel Ising spike-slab prior, utilizing Markov-chain Monte Carlo and variational Bayes inference to identify multi-level variables and jointly build the network. The variational inference approach is novel in utilizing the sampled dependency structure as the observed variable rather than the response. In addition to the variable selection and network recovery capabilities, our approach can produce both mean and quantile estimations of the original response variable of interest. We demonstrate the advantages of our approach using simulation studies and a genetic pathway-based analysis.
Keywords:
Network Estimation, Quantile Regression, Variable Selection
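For context, graph-coupled (Ising) spike-and-slab priors place dependence directly on the binary inclusion indicators; a generic single-level version (our notation; the talk's multilevel prior extends this across the nested levels) is
\[
p(\boldsymbol{\gamma}) \propto \exp\Big( a \sum_{j} \gamma_j + b \sum_{j \sim k} \gamma_j \gamma_k \Big), \qquad \gamma_j \in \{0, 1\},
\]
where j \sim k ranges over edges of the network, a controls overall sparsity, b > 0 encourages connected variables to be selected together, and each \gamma_j switches the corresponding effect between a point-mass spike and a diffuse slab.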
2. Shape-Constrained Estimation of Standard Errors in Reversible Markov Chains
Speaker: Hyebin Song, Penn State University
Online
Abstract:
In this talk, I will present novel nonparametric, shape-constrained methods for estimating the autocovariance sequence of a reversible Markov chain. One primary motivation for this problem is the estimation of Markov chain Monte Carlo standard errors (MCSE), which quantify the uncertainty associated with estimates produced by MCMC algorithms. Our proposed estimator is based on the key observation that the autocovariance sequence of a reversible Markov chain can be represented as a moment sequence, which naturally imposes shape constraints such as monotonicity and convexity. I will discuss ordinary and weighted least squares formulations of the estimator, and outline their theoretical properties. In particular, we show that the resulting estimator is strongly consistent for the asymptotic variance of the MCMC sample mean, and l2-consistent for the full autocovariance sequence. I will also present an efficient algorithm for computing the estimator using convex optimization techniques, and I demonstrate its practical utility through empirical studies.
Keywords:
shape-constrained estimation, Markov chain Monte Carlo standard error, autocovariance
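The moment-sequence representation behind the estimator is a known consequence of the spectral theorem for reversible chains: for a stationary reversible chain and square-integrable function h,
\[
\gamma_k = \operatorname{Cov}\{h(X_0), h(X_k)\} = \int_{-1}^{1} x^{k}\, d\mu(x), \qquad
\sigma_{\infty}^2 = \gamma_0 + 2 \sum_{k \ge 1} \gamma_k,
\]
for some finite positive measure \mu on [-1, 1], where \sigma_{\infty}^2 is the asymptotic variance entering the MCMC standard error. In particular, the paired sums \gamma_{2m} + \gamma_{2m+1} are nonnegative, nonincreasing, and convex in m, which are exactly the shape constraints the least-squares estimators impose.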
3. Hierarchical Skinny Gibbs Sampler in Logistic Regression
Speaker: Xia Wang, University of Cincinnati
In-person
Abstract:
We introduce a highly scalable tuning-free algorithm for variable selection in logistic regression using Polya-Gamma data augmentation. The proposed method is both theoretically consistent and robust to potential mis-specification of the tuning parameter, achieved through a hierarchical approach. Existing works suitable for high-dimensional settings primarily rely on t-approximation of the logistic density, which is not based on the original likelihood. The proposed method not only builds upon the exact logistic likelihood, offering superior empirical performance, but is also more computationally efficient, particularly in cases involving highly correlated covariates, as demonstrated in a comprehensive simulation study. We apply our method to a gene expression PCR dataset from mice and an RNA-seq dataset from asthma studies in humans. By comparing its performance to existing frequentist and Bayesian methods in variable selection, we demonstrate the competitive predictive capabilities of the Polya-Gamma-based approach. Our results indicate that this method enhances the accuracy of variable selection and improves the robustness of predictions in complex, high-dimensional datasets.
Keywords:
Logistic regression, Pólya-Gamma distribution, Hierarchical Skinny Gibbs, Spike-and-Slab prior
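The augmentation at the core of the sampler rests on the Pólya-Gamma identity
\[
\frac{(e^{\psi})^{a}}{(1 + e^{\psi})^{b}} = 2^{-b}\, e^{\kappa \psi} \int_{0}^{\infty} e^{-\omega \psi^{2}/2}\, p(\omega)\, d\omega, \qquad \kappa = a - b/2,
\]
where p(\omega) is the Pólya-Gamma PG(b, 0) density. With \psi_i = \mathbf{x}_i^{\top}\boldsymbol{\beta}, the logistic likelihood is conditionally Gaussian in \boldsymbol{\beta} given the \omega_i, so \boldsymbol{\beta}, the latent \omega_i, and the spike-and-slab inclusion indicators can all be updated by Gibbs steps; the hierarchical layer described above places a prior on the tuning parameter so that no hand calibration is needed.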
Advances in Likelihood-Free and High-Dimensional Bayesian Inference
Bldg 219 Room 104
4 Presentations
1. Learning Summary Statistics for Likelihood-Free Bayesian Inference
Speaker: Rong Tang, Hong Kong University of Science and Technology
Online
Abstract:
The challenge of performing Bayesian inference in models where likelihood functions are difficult to evaluate but sampling is straightforward has driven the development of likelihood-free methods such as approximate Bayesian computation (ABC). A key element in ABC is the use of summary statistics to reduce the dimensionality of the data, thereby avoiding the curse of dimensionality in nonparametric conditional density estimation as the observed data size grows. However, the selection of informative summary statistics that can capture the essential information about the parameter contained in the full data remains challenging. In this work, we propose a general framework for learning informative summary statistics and the subsequent posterior inference based on the summaries. The proposed method provides a global posterior approximation applicable to any dataset, rather than being limited to a single dataset. In addition, more refined posterior approximations for specific datasets can be obtained by integrating this approach with MCMC-ABC methods.
Keywords:
Approximate Bayesian Inference, Intractable likelihood, M-estimation, Dimension Reduction, Summary Statistics
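For readers unfamiliar with the baseline being improved upon, a minimal rejection-ABC loop with a fixed summary function is sketched below (a generic illustration with hypothetical toy choices; the learned summaries and the global posterior approximation proposed in the talk are not shown).

import numpy as np

def rejection_abc(observed, simulate, sample_prior, summary, n_draws, eps, rng):
    """Keep prior draws whose simulated summaries fall within eps of the observed summaries."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        s_sim = summary(simulate(theta, rng))
        if np.linalg.norm(s_sim - s_obs) < eps:
            accepted.append(theta)
    return np.array(accepted)

# Toy example: infer the mean of a normal model, using the sample mean as the summary statistic.
rng = np.random.default_rng(1)
obs = rng.normal(loc=2.0, size=200)
posterior_draws = rejection_abc(
    observed=obs,
    simulate=lambda th, r: r.normal(loc=th, size=200),
    sample_prior=lambda r: r.normal(loc=0.0, scale=5.0),
    summary=lambda x: np.atleast_1d(x.mean()),
    n_draws=20000, eps=0.05, rng=rng)

The quality of the accepted draws hinges entirely on how informative the summary function is, which is exactly the step the proposed framework learns rather than hand-picks.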
2. Bayesian Optimal Change Point Detection in High Dimensions
Speaker: Kyoungjae Lee, Sungkyunkwan University
In-person
Abstract:
We propose the first Bayesian methods for detecting change points in high-dimensional mean and covariance structures. These methods are constructed using pairwise Bayes factors, leveraging modularization to identify significant changes in individual components efficiently. We establish that the proposed methods consistently detect and estimate change points under much milder conditions than existing approaches in the literature. Additionally, we demonstrate that their localization rates are nearly optimal. The practical performance of the proposed methods is evaluated through extensive simulation studies, where they are compared to state-of-the-art techniques. The results show comparable or superior performance across most scenarios. Notably, the methods effectively detect change points whenever signals of sufficient magnitude are present, irrespective of the number of signals. Finally, we apply the proposed methods to genetic and financial datasets, illustrating their practical utility in real-world applications.
Keywords:
High-dimensional change point detection, mean vector, covariance matrix, maximum pairwise Bayes factor.
3. Applying Multi-Objective Bayesian Optimization to Likelihood-Free Inference
Speaker: David Chen, National University of Singapore
In-person
Abstract:
Scientific statistical models are often defined by generative processes for simulating synthetic data, but many, such as sequential sampling models (SSMs) used in psychology and consumer behavior, involve intractable likelihoods. Likelihood-free inference (LFI) methods address this challenge, enabling Bayesian parameter inference for such models. We propose to apply Multi-objective Bayesian Optimization (MOBO) to LFI for estimating parameters from multi-source data, such as SSM parameters from response times and choice outcomes. This approach models discrepancies for each data source separately, using MOBO to efficiently approximate the joint likelihood. This multivariate approach also identifies conflicting information from different data sources and provides insight into their differing importance for estimating individual parameters. We demonstrate the advantages of MOBO over single-discrepancy methods through a synthetic data example and a real-world application evaluating ride-hailing drivers' preferences for electric vehicle rentals in Singapore. While focused on SSMs, our method generalizes to likelihood-free calibration for other multi-source models.
Keywords:
Likelihood-Free Inference, Sequential Sampling Models, Multi-objective Bayesian Optimization
4. Weighted Fisher Divergence for High-Dimensional Gaussian Variational Inference
Speaker: Linda S. L. Tan, National University of Singapore
Online
Abstract:
This talk considers Gaussian variational approximation with sparse precision matrices in high dimensional problems. Although the optimal Gaussian approximation is usually defined as the one closest to the target posterior in Kullback-Leibler divergence, our work studies the weighted Fisher divergence, which focuses on gradient differences between the target posterior and its approximation, with the Fisher and score-based divergences being special cases. We make three main contributions. First, we compare approximations for weighted Fisher divergences under mean-field assumptions for both Gaussian and non-Gaussian targets with Kullback-Leibler approximations. Second, we go beyond mean-field and consider approximations with sparse precision matrices reflecting posterior conditional independence structure for hierarchical models. Using stochastic gradient descent to enforce sparsity, we develop two approaches to minimize the weighted Fisher divergence, based on the reparametrization trick and a batch approximation of the objective. Finally, we examine the performance of our methods for logistic regression, generalized linear mixed models and stochastic volatility models.
Keywords:
Fisher divergence, Score-based divergence, Stochastic gradient descent, Gaussian variational approximation
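The divergence family in question can be written explicitly: for a variational density q and target posterior p(\theta \mid y),
\[
D_{W}(q \,\|\, p) = \mathbb{E}_{q}\Big[ \big(\nabla_{\theta} \log q(\theta) - \nabla_{\theta} \log p(\theta \mid y)\big)^{\top} W\, \big(\nabla_{\theta} \log q(\theta) - \nabla_{\theta} \log p(\theta \mid y)\big) \Big],
\]
with a positive definite weight W; W = I recovers the Fisher divergence, and other weightings give the score-based variants treated in the talk. Because only gradients of the log target appear, the objective avoids intractable normalizing constants, and q is taken to be Gaussian with a sparse precision matrix matching the hierarchical model's conditional independence structure.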
Advances in theory and computation methods for Bayesian inference
Bldg 219 Room 103
3 Presentations
1. Enhancing Scalability in Bayesian Nonparametric Factor Analysis of Spatiotemporal Data
Speaker: Cheng Li, National University of Singapore
Online
Abstract:
We propose novel computational strategies for Bayesian nonparametric latent factor spatiotemporal models that are computationally feasible for moderate to large spatiotemporal data. Although such Bayesian models are flexible and enable spatial clustering, they face a prohibitively high computational cost in posterior sampling when the spatial and temporal dimensions increase to a couple hundred. We address this challenge with several speed-up proposals. We integrate a new slice sampling algorithm that permits varying numbers of spatial mixture components, which are guaranteed to be non-increasing through the posterior sampling iterations and thus effectively reducing the number of mixture parameters. Additionally, we introduce a spatial latent nearest-neighbor Gaussian process prior and new sequential updating algorithms for the spatially varying latent variables in the stick-breaking process prior. Our new proposals lead to significantly enhanced computational scalability and storage efficiency while maintaining capabilities for both spatiotemporal prediction and clustering of locations with similar temporal trajectories.
Keywords:
Nearest-neighbor Gaussian process, Sequential updates, Slice sampling
2. Metropolis-Adjusted Subdifferential Langevin Algorithm
Speaker: Ning Ning, Texas A&M University
In-person
Abstract:
The Metropolis-Adjusted Langevin Algorithm (MALA) is a widely used Markov Chain Monte Carlo (MCMC) method for sampling from high-dimensional distributions. However, MALA relies on differentiability assumptions that restrict its applicability. In this paper, we introduce the Metropolis-Adjusted Subdifferential Langevin Algorithm (MASLA), a generalization of MALA that extends its applicability to distributions whose log-densities are locally Lipschitz, generally non-differentiable, and non-convex. We establish the theoretical foundation of MASLA by proving its convergence to a set-valued differential inclusion equation, ensuring well-defined long-run behavior. Furthermore, we evaluate the performance of MASLA by comparing it with other sampling algorithms in settings where they are applicable. Our results demonstrate the effectiveness of MASLA in handling a broader class of distributions while maintaining computational efficiency.
Keywords:
Markov chain Monte Carlo, Metropolis-adjusted Langevin algorithm, Generalized subdifferential, Non-convex and non-smooth optimization
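For context, the MALA proposal that is being generalized is
\[
\theta' = \theta + \frac{\epsilon^{2}}{2}\, \nabla \log \pi(\theta) + \epsilon\, \xi, \qquad \xi \sim \mathrm{N}(0, I),
\]
followed by a Metropolis-Hastings accept/reject step. MASLA replaces \nabla \log \pi, which need not exist for a locally Lipschitz, non-smooth, non-convex target, with an element of a generalized subdifferential, and its long-run behavior is correspondingly described by a set-valued differential inclusion rather than an ordinary differential equation.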
3. Online Bernstein-von Mises Theorem
Speaker: Minwoo Chae, Pohang University of Science and Technology
In-person
Abstract:
Online learning is an inferential paradigm in which parameters are updated incrementally from sequentially available data, in contrast to batch learning, where the entire dataset is processed at once. In this talk, we assume that mini-batches from the full dataset become available sequentially. The Bayesian framework, which updates beliefs about unknown parameters after observing each mini-batch, is naturally suited for online learning. At each step, we update the posterior distribution using the current prior and new observations, with the updated posterior serving as the prior for the next step. However, this recursive Bayesian updating is rarely computationally tractable unless the model and prior are conjugate. When the model is regular, the updated posterior can be approximated by a normal distribution, as justified by the Bernstein-von Mises theorem. We adopt a variational approximation at each step and investigate the frequentist properties of the final posterior obtained through this sequential procedure. Under mild assumptions, we show that the accumulated approximation error becomes negligible once the mini-batch size exceeds a threshold depending on the parameter dimension.
Keywords:
Bayesian online learning, Bernstein-von Mises theorem, variational approximation
Bayesian Approaches for Modeling Tensors, Distributions and Clustering
Bldg 219 Room 104
3 Presentations
1. Bayesian Tensor Modeling for Distribution-on-Distribution Regression
Speaker: Justin Strait, Los Alamos National Laboratories
Online
Abstract:
In this work, we propose a fully Bayesian model for learning univariate distributional outcomes from several univariate input distributions, as motivated by problems in multifidelity statistical emulation for computer models with distributional output. In particular, we jointly model all quantiles of the outcome distribution by specifying a flexible tensor-valued regression parameter which relates each outcome quantile to all input distributions at their own individual quantiles. Due to the high-dimensionality of the tensor, we assume a low rank structure and propose use of a multi-way shrinkage prior on tensor margins. Distributions are represented by log quantile density (LQD) functions, which have been shown to facilitate specification of functional data models without requiring additional constraints on the functions. We assess our model's performance through a comprehensive simulation study based on multifidelity gas transport simulations through discrete fracture networks (DFN). Then, we apply our model to learn the structure relating low-fidelity gas transport simulation output to corresponding high-fidelity output for the purposes of emulation.
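As a point of reference for the low-rank assumption mentioned above, a common way to constrain a high-dimensional coefficient tensor is a rank-R CP (PARAFAC) decomposition with shrinkage priors on the margin vectors; the display below is a generic sketch of that idea, not the paper's exact specification.

\mathcal{B} = \sum_{r=1}^{R} \lambda_r \, \beta_r^{(1)} \circ \beta_r^{(2)} \circ \cdots \circ \beta_r^{(D)},

where \circ denotes the outer product and multi-way shrinkage priors on the margins \beta_r^{(d)} (and scales \lambda_r) encourage an effectively low rank.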
Keywords:
TBD
2. Efficient Decision Trees for Tensor Regressions
Speaker: Hengrui Luo, Rice University
Online
Abstract:
This talk covers recent progress in tree-based methods for tensor regressions, which are interpretable nonparametric tools that assist in various learning tasks. In particular, we develop single and ensemble tree methods for tensor-input regressions. We begin with scalar-on-tensor (tensor input and scalar output) regression and design efficient computational strategies to handle tensor inputs, whose search space is more complex than that of vector inputs. We then extend our tensor tree model to tensor-on-tensor (tensor input and tensor output) regressions using ensemble approaches with theoretical guarantees. We will also identify some existing challenges in applying tree-based and nonparametric methods to ultra-high-dimensional tensor data, such as MRI/fMRI data with limited numbers of individuals and possible missing data. We will wrap up with a ranking perspective and raise a couple of open questions about extending this ranking perspective to tensor-input scenarios.
Keywords:
TBD
3. Predictor-Informed Bayesian Nonparametric Clustering
Speaker: Jeremy Gaskins, University of Louisville
Online
Abstract:
In this project we are interested in performing clustering of observations such that the cluster membership is influenced by a set of covariates. To that end, we employ the Bayesian nonparametric Common Atom Model (CAM), which is a nested clustering algorithm that utilizes a (fixed) group membership for each observation to encourage more similar clustering of members of the same group. CAM operates by assuming each group has its own vector of cluster probabilities, which are themselves clustered to allow similar clustering for some groups. We extend this approach by treating the group membership as an unknown latent variable determined by a collection of covariate predictors. Consequently, observations with similar predictor values will be in the same latent group and are more likely to be clustered together than observations with disparate predictors. We propose a pyramid group model that flexibly partitions the predictor space into these latent group memberships. This pyramid model operates similarly to a Bayesian regression tree process, except that it uses the same splitting rule at all nodes of the same tree depth, which facilitates improved mixing. We propose a block Gibbs sampler for our model to perform posterior inference. Our methodology is demonstrated in simulation and real data examples. In the real data application, we utilize the RAND Health and Retirement Study to cluster and predict patient outcomes in terms of the number of days spent overnight in the hospital.
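Schematically, a common atoms construction lets every group j share one set of atoms while carrying its own weight vector, and those weight vectors are themselves clustered. The display below is a generic sketch of that structure rather than the authors' exact prior.

G_j = \sum_{k=1}^{\infty} \pi_{jk} \, \delta_{\theta_k}, \qquad \theta_k \overset{iid}{\sim} H, \qquad (\pi_{j1}, \pi_{j2}, \dots) \sim Q,

where the atoms \theta_k are common to all groups and Q is itself a discrete nonparametric prior over weight vectors, so groups sharing the same weight vector induce similar clusterings of their members.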
Keywords:
TBD
Bayesian Inference in Modern Machine Learning
Bldg 219 Room 105
3 Presentations
1. Semi-supervised Bayesian Spatial Topic Modeling for Identifying Multicellular Spatial Tissue Structures in Multiplex Imaging Data
Speaker: Junsouk Choi, University of Michigan
In-person
Abstract:
Understanding spatial architecture of tissues is essential for decoding complex interactions within cellular ecosystems and their implications for disease pathology and clinical outcomes. Recent advances in multiplex imaging technologies have enabled high-resolution profiling of cellular phenotypes and their spatial distributions, revealing the pivotal role of tissue structures in modulating immune responses and driving disease progression. To systematically identify and characterize spatial tissue architecture from such data, we propose a novel semi-supervised Bayesian spatial topic model, which integrates spatial Gaussian processes into latent Dirichlet allocation to flexibly model spatial dependencies inherent in tissue organization. Furthermore, by jointly analyzing multiple multiplex images, the proposed approach identifies consistent and coherent spatial structures across samples and incorporates clinical covariates to guide and enhance these discoveries. We applied our method to a lung cancer multiplex imaging dataset, revealing biologically meaningful tumor microenvironment patterns that were consistent across patients and significantly associated with clinical features.
Keywords:
Multiplex imaging, Tissue microenvironment, Spatial topic models, Semi-supervised clustering, Gaussian processes
2. Statistical Modeling of Subcellular Expression Patterns for High-resolution Spatial Transcriptomics
Speaker: Jade Wang, University of Michigan
Online
Abstract:
Advances in spatially resolved transcriptomic technologies are producing gene expression data at increasingly higher throughput, scale, and resolution. Identifying patterns of sub-cellular mRNA localization in spatially resolved transcriptomic studies is essential for understanding the cellular dynamics of RNA processing. Here, we present a statistical method, expression gradient-based mRNA localization analysis (ELLA), that integrates high-resolution spatially resolved gene expression data with histology imaging data to identify the sub-cellular mRNA localization patterns in various spatially resolved transcriptomic techniques. ELLA models spatial count data through a nonhomogeneous Poisson process model and relies on an expression gradient function to characterize the sub-cellular mRNA localization pattern, producing effective control of type I errors and yielding high statistical power. Analyzing four spatially resolved transcriptomic datasets using ELLA, we identified genes in multiple cell types with various sub-cellular localization patterns.
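For orientation, the likelihood of a nonhomogeneous Poisson process over a cellular domain S with intensity \lambda(\cdot) has the standard form below; in ELLA the intensity is tied to an expression gradient function of a transcript's relative position within the cell, and the exact parameterization is as given in the paper.

\log L(\lambda) = \sum_{i=1}^{n} \log \lambda(s_i) - \int_{S} \lambda(s) \, ds,

where s_1, \dots, s_n are the observed transcript locations.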
Keywords:
Spatially resolved transcriptomics, subcellular mRNA localization, spatial variable genes, ELLA, nonhomogeneous Poisson process
3. Deep Generative Quantile Bayes
Speaker: Jungeum Kim, University of Chicago
Online
Abstract:
This paper develops a multivariate Bayesian posterior sampling method through generative quantile learning. Our method learns a mapping that can transform (spherically) uniform random vectors into posterior samples without adversarial training. We utilize Monge-Kantorovich depth in multivariate quantiles to directly sample from Bayesian credible sets, a unique feature not offered by typical posterior sampling methods. To enhance training in quantile mapping, we designed a neural network that automatically performs summary statistic extraction. This additional neural network structure has performance benefits including support shrinkage (or posterior contraction) as the observation sample size increases. We demonstrate the usefulness of our approach on several examples where the absence of likelihood renders classical MCMC infeasible. Finally, we provide frequentist theoretical justifications for our quantile learning framework.
Keywords:
Neural posterior sampling, Quantile learning, Conditional vector quantiles, Support shrinkage
Bayesian Learning With Latent Variables
Bldg 219 Room 105
3 Presentations
1. Causal representation learning: Identifying latent causal factors from unstructured data
Speaker: Yixin Wang, University of Michigan
Online
Abstract:
Causal inference traditionally involves analyzing tabular data where variables like treatment, outcome, covariates, and colliders are manually labeled by humans. However, many complex causal inference problems rely on unstructured data sources such as images, text and videos that depict overall situations. These causal problems require a crucial first step - extracting the high-level latent causal factors from the low-level unstructured data inputs, a task known as "causal representation learning." In this talk, we explore how to identify latent causal factors from unstructured data, whether from passive observations, interventional experiments, or multi-domain datasets.
Keywords:
causal inference, representation learning, latent variables
2. Predictive variational inference: Learn the predictively optimal posterior distribution
Speaker: Yuling Yao, The University of Texas at Austin
Online
Abstract:
Vanilla variational inference finds an optimal approximation to the Bayesian posterior distribution, but even the exact Bayesian posterior is often not meaningful under model misspecification. We propose predictive variational inference (PVI): a general inference framework that seeks and samples from an optimal posterior density such that the resulting posterior predictive distribution is as close to the true data generating process as possible, where this closeness is measured by multiple scoring rules. Because of this objective, predictive variational inference is generally not the same as, and does not even attempt to approximate, the Bayesian posterior, even asymptotically. Rather, we interpret it as an implicit hierarchical expansion. Further, the learned posterior uncertainty detects heterogeneity of parameters among the population, enabling automatic model diagnosis. This framework applies to both likelihood-exact and likelihood-free models. We demonstrate its application in real data examples.
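One way to write the kind of objective described above (a hedged schematic, not necessarily the paper's exact formulation) is to choose the variational density q so that the induced posterior predictive scores well against draws from the data-generating process:

q^{*} = \arg\min_{q \in \mathcal{Q}} \; \mathbb{E}_{y \sim P_0}\left[ S\big(p_q(\cdot), y\big) \right], \qquad p_q(y) = \int p(y \mid \theta) \, q(\theta) \, d\theta,

where S is a (possibly combined) proper scoring rule and P_0 the true data-generating process; in practice the outer expectation is approximated from the observed data.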
Keywords:
generalized Bayes, variational inference, posterior predictive, simulation-based inference
3. Bayesian federated quantification learning under distribution shift
Speaker: Zehang Li, University of California, Santa Cruz
Online
Abstract:
In regions lacking medically certified causes of death, verbal autopsy (VA) is a critical and widely used tool to ascertain the cause of death through interviews with caregivers. In this talk, we develop a novel Bayesian Federated Learning (BFL) framework for both individual-level cause-of-death classification and population-level quantification of cause-specific mortality fractions (CSMFs) using VA data, in a target domain with limited or no local labeled data. The proposed framework is modular, computationally efficient, and compatible with a wide range of existing VA algorithms as candidate models, facilitating flexible deployment in real-world mortality surveillance systems. We validate the performance of BFL through extensive experiments on two real-world VA datasets under varying levels of distribution shift.
Keywords:
quantification learning, classification, federated learning
Bayesian Methods for Translational Biomedical Research
Bldg 219 Room 105
3 Presentations
1. Detection of Cell-type-specific Differentially Methylated Regions in Epigenome-Wide Association Studies
Speaker: Yingying Wei, The Chinese University of Hong Kong
Online
Abstract:
Epidemiologists are interested in investigating DNA methylation at cytosine-phosphate-guanine (CpG) sites in large cohorts through epigenome-wide association studies (EWAS). However, the observed EWAS data are bulk data with signals aggregated from distinct cell types. As a result, there has recently been active research on detecting cell-type-specific risk CpG sites from EWAS data. Although existing methods significantly improve detection at the aggregated level (identifying a CpG site as a risk site as long as it is associated with the phenotype in any cell type), they have low power in detecting cell-type-specific associations for EWAS with typical sample sizes. Here, we develop a new method, Fine-scale inference for Differentially Methylated Regions (FineDMR), which borrows strength from nearby CpG sites to improve cell-type-specific association detection. Via a Bayesian hierarchical model built upon Gaussian process functional regression, FineDMR takes advantage of the spatial dependencies between CpG sites. Simulation studies and real data analysis show that FineDMR substantially improves the power in detecting cell-type-specific associations for EWAS data.
Keywords:
Bayesian hierarchical model, Gaussian process functional regression, epigenome-wide association studies, cell-type-specific associations
2. Balancing the effective sample size in prior across different doses in the curve-free Bayesian decision-theoretic design for dose-finding trials
Speaker: Dehua Bi, Stanford University
In-person
Abstract:
The primary goal of dose allocation in phase I trials is to minimize patient exposure to subtherapeutic or excessively toxic doses, while accurately recommending a phase II dose that is as close as possible to the maximum tolerated dose (MTD). Fan et al. (2012) introduced a curve-free Bayesian decision-theoretic design (CFBD), which leverages the assumption of a monotonic dose-toxicity relationship without directly modeling dose-toxicity curves. This approach has also been extended to drug combinations for determining the MTD (Lee et al., 2017). Although CFBD has demonstrated improved trial efficiency by using fewer patients while maintaining high accuracy in identifying the MTD, it may artificially inflate the effective sample sizes of the updated prior distributions, particularly at the lowest and highest dose levels. This can lead to either overshooting or undershooting the target dose. In this paper, we propose a modification to CFBD's prior distribution updates that balances effective sample sizes across different doses. Simulation results show that with the modified prior specification, CFBD achieves a more focused dose allocation at the MTD and offers more precise dose recommendations with fewer patients on average. It also demonstrates robust performance relative to other well-known dose-finding designs in the literature.
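The notion of prior effective sample size (ESS) used above is easiest to see for a Beta prior on a dose-level toxicity probability; the following is a standard fact, stated only for orientation.

p_d \sim \mathrm{Beta}(a_d, b_d) \;\Longrightarrow\; \mathrm{ESS}(p_d) = a_d + b_d,

so after observing y_d toxicities in n_d patients the update \mathrm{Beta}(a_d + y_d, \, b_d + n_d - y_d) has ESS a_d + b_d + n_d. Keeping the implied prior contribution a_d + b_d comparable across doses is the kind of balance the proposed modification targets.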
Keywords:
TBD
3. BESS-Surv: A Bayesian estimator of sample size for survival data
Speaker: Jiaxin Liu, BayeSoft Inc.
In-person
Abstract:
Bayesian statistics offers a coherent and adaptable framework for quantifying uncertainty, which is useful in sample size estimation (SSE) and re-estimation (SSR) for clinical trials. Building upon the Bayesian Estimator Sample Size (BESS) methodology (Bi and Ji, 2025), we propose BESS-Surv, an extension tailored for randomized clinical trials with time-to-event outcomes. BESS-Surv estimates the sample size by balancing the evidence in the observed data against the desired confidence in the trial decisions. It allows for adaptive sample size adjustments based on interim treatment effects, reducing the required sample size when strong effects are observed, or increasing the probability of correct decision-making when effects are less pronounced. Simulation studies and a case study will be presented to illustrate BESS-Surv's performance.
Keywords:
TBD
Bayesian Spatial Modeling, Variable Selection, and Applications
Bldg 219 Room 103
4 Presentations
1. Double-robust Bayesian variable selection and model prediction with spherically symmetric errors
Speaker: Dr. Min Wang, University of Texas at San Antonio
In-person
Abstract:
Response surface methodology is a well-established tool for improving an overall manufacturing process so that quality requirements are fulfilled. This work proposes a double-robust Bayesian modeling method that can simultaneously cope with variable selection, model form uncertainty, and non-normality for quality prediction. Double robustness is achieved by specifying the class of spherically symmetric distributions for the errors and accounting for model form uncertainty through Bayesian model averaging. Furthermore, with a special choice of the sub-harmonic priors for the regression coefficients, a closed-form expression of the marginal posterior distribution of each candidate model is obtained, which is not only free of the error distributions (other than spherical symmetry) but also can be easily computed using standard software. To provide a better interpretation of the model, a special prior is specified for the model space to maintain and reflect the hierarchical or structural relationships among input variables. The proposed Bayesian method has the properties of variable selection consistency and prediction consistency under Bayesian model averaging. Through numerical experiments and a case study, the proposed double-robust Bayesian modeling method is shown to achieve results superior to those of existing established methods in prediction and variable selection in linear models under different types of error distributions.
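For reference, Bayesian model averaging combines candidate models through their posterior model probabilities; with closed-form marginal likelihoods m(y | M_k), as in the setting above, the weights are cheap to compute. The display is the standard BMA identity, not anything specific to this paper.

p(\Delta \mid y) = \sum_{k} p(\Delta \mid M_k, y) \, p(M_k \mid y), \qquad p(M_k \mid y) = \frac{m(y \mid M_k) \, p(M_k)}{\sum_{l} m(y \mid M_l) \, p(M_l)},

where \Delta is a quantity of interest (e.g., a prediction) and p(M_k) encodes the structured prior over the model space.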
Keywords:
Bayesian model averaging; consistency; variable selection; robustness; response surface methodology.
2. Bayesian envelope dimension reduction in spatial-temporal setting
Speaker: Wenbo Wu, University of Texas at San Antonio
In-person
Abstract:
The recently developed Bayesian framework for the envelope model enhances the interpretability of model parameters and simplifies the incorporation of prior information. However, the current method presumes an independent error structure within the model and fails to account for the additional complexities introduced by spatially correlated data. Therefore, we propose a Bayesian framework for the spatial envelope model. By incorporating prior information on the model parameters, the proposed method offers a more flexible and robust framework for capturing uncertainty and spatial correlation in high-dimensional data. Furthermore, we investigate appropriate prior specifications for the spatial parameters, study the propriety of the posterior distribution, and develop a posterior sampling method to sample both the posterior distributions of the spatial parameters and the conditional posterior distributions associated with the envelope model.
Keywords:
envelope model, spatial correlation, Gibbs sampling
3. Bayesian variable selection in the high-dimensional Cox model with the horseshoe prior
Speaker: Zhuanzhuan Ma, University of Texas at Rio Grande Valley
In-person
Abstract:
With the rapid advancement of computing technology, high-dimensional statistical inference has become increasingly relevant and essential, particularly in genomics and computational biology. In our ongoing work, we focus on variable selection in the Cox proportional hazards model to identify important covariates that associate genomic features with patients’ censored survival times. To ease the computational burden and model complexity, we incorporate the Limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) algorithm to obtain the maximum a posteriori (MAP) estimator, with the regression coefficients regularized by the horseshoe prior. A key theoretical contribution of this work is the derivation of posterior contraction rates under high-dimensional asymptotics, along with sufficient conditions for the consistency of the MAP estimator. We carry out simulations to compare the finite sample performances of the proposed method with the existing methods in terms of the accuracy of variable selection. Finally, a real-data application is provided for illustrative purposes.
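For readers less familiar with the prior, the horseshoe places a global-local scale mixture on each regression coefficient; its standard form is:

\beta_j \mid \lambda_j, \tau \sim \mathcal{N}(0, \lambda_j^2 \tau^2), \qquad \lambda_j \sim \mathrm{C}^{+}(0, 1), \qquad \tau \sim \mathrm{C}^{+}(0, 1),

where \mathrm{C}^{+}(0, 1) is the standard half-Cauchy distribution. The heavy tails leave large signals essentially unshrunk while the unbounded density at zero aggressively shrinks noise coefficients, which is what makes regularized (MAP-type) estimation attractive in the high-dimensional Cox setting.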
Keywords:
Cox's proportional hazards model, variable selection, horseshoe prior, genomic data
4. Detecting malware without ground truth
Speaker: Keying Ye, University of Texas at San Antonio
In-person
Abstract:
One of the real-world problems when it comes to classification is finding the right tool (or set of tools) to classify objects into their true classes. We sometimes use multiple classification tools and decide on the class of a given observation based on a majority decision across these tools. In this research, we use a Bayesian framework to study the predictive distribution of crowdsourcing data. Simulation studies are carried out using synthetic data with known ground truth, and estimation accuracy is compared. The proposed approach is applied to a real data set. Furthermore, some theoretical work has been conducted.
Keywords:
Empirical Bayesian analysis, crowdsourcing, malware detection, predictive distributions
Enhancing interpretability with Bayesian approaches: From methodological foundations to applications
Bldg 219 Room 105
3 Presentations
1. Catalytic priors: Using synthetic data to specify prior distributions in Bayesian analysis
Speaker: Dongming Huang, National University of Singapore
In-person
Abstract:
Catalytic prior distributions provide general, easy-to-use, and interpretable specifications of prior distributions for Bayesian analysis. They are particularly beneficial when the observed data are inadequate for reliably estimating a complex target model. A catalytic prior distribution stabilizes a high-dimensional "working model" by shrinking it toward a "simplified model." The shrinkage is achieved by supplementing the observed data with a small amount of "synthetic data" generated from a predictive distribution under the simpler model. We apply this framework to generalized linear models, where we propose various strategies for the specification of a tuning parameter governing the degree of shrinkage and study the resulting properties. The catalytic priors have simple interpretations and are easy to formulate. In our numerical experiments and a real-world study, the performance of inference based on the catalytic prior is superior or comparable to that of other commonly used prior distributions.
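The basic construction can be written compactly; if my reading of the catalytic-prior literature is right, M synthetic observations (y_i^*, x_i^*) drawn from the fitted simplified model enter the prior as down-weighted pseudo-data with total prior weight \tau:

\pi_{\mathrm{cat}}(\theta) \propto \left\{ \prod_{i=1}^{M} f\big(y_i^{*} \mid x_i^{*}, \theta\big) \right\}^{\tau / M},

so the posterior is proportional to the working-model likelihood on the observed data times this synthetic-data factor, and \tau controls how strongly the complex model is shrunk toward the simplified one.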
Keywords:
prior specification, regularization, synthetic data, causal inference
2. Bayesian additive regression trees in transcriptome-wide association studies
Speaker: Min Chen, The University of Texas at Dallas
Online
Abstract:
While genome-wide association studies (GWAS) have identified numerous trait-associated loci, the underlying causal genes and mechanisms often remain unclear. Transcriptome-wide association studies (TWAS) can uncover molecular and causal mechanisms underlying variant-trait associations. Traditional TWAS often focuses on one gene at a time and usually ignores gene regulatory relationships. We propose a novel network-based joint modeling approach that employs multivariate Bayesian Additive Regression Trees (BART) in gene networks constructed using Hi-C data. By integrating spatial genomic architecture, it offers a robust approach to uncovering complex regulatory mechanisms. This method identifies biologically meaningful gene clusters, enhancing predictive power and interpretability. The model consistently outperforms traditional methods in identifying key SNPs and enhancing predictive accuracy in simulations and in large-scale genomic datasets from GEUVADIS and GTEx.
Keywords:
GWAS, TWAS, multivariate BART, gene-network, 3D spatial regulatory interactions
3. AI-powered Bayesian methods for interpretable pathology image analysis
Speaker: Qiwei Li, The University of Texas at Dallas
In-person
Abstract:
Statistics traditionally emphasizes human-driven analysis supported by computational tools, whereas AI primarily depends on computer algorithms with guidance from human insight. Nonetheless, each milestone in statistical development opens new frontiers for AI and offers fresh perspectives within statistics itself. This interplay fosters discoveries initiated from either domain that ultimately enrich the other. In this talk, I will illustrate how the integration of statistical spatial and shape analysis and AI enables more interpretable and predictive pathways from histopathology images to clinically meaningful insights. Recent advances in deep learning have made it possible to detect and classify tissue regions and individual cells at scale from digital histopathology images. I will introduce several novel AI-powered Bayesian models for analyzing these images. These methods offer new insights into cell-cell interactions, spatial cellular architecture, and tumor boundaries in the context of cancer progression, supported by multiple case studies.
Keywords:
Bayesian, medical image, histology image, spatially resolved transcriptomics, spatial analysis, shape analysis
General Bayesian approaches and Bayesian predictive syntheses
Bldg 219 Room 104
4 Presentations
1. Tree boosting for learning density ratios with generalized Bayesian uncertainty quantification
Speaker: Naoki Awaya, Waseda University
In-person
Abstract:
Learning the density ratio from two samples of observations is a fundamental task for detecting and quantifying differences between two groups. To provide an accurate approximation of density ratios with reasonable computational cost, we propose a variant of the AdaBoost algorithm, historically used for classification and regression tasks. Similar to the standard AdaBoost, the proposed algorithm sequentially updates the estimate by adding tree-based weak learners, while observations are weighted based on the gap between the current density ratio estimate and group allocation. A novel loss function, called the balancing loss, is inspired by the commonly used loss in classification AdaBoost but is tailored to facilitate direct density ratio estimation. Our numerical experiments demonstrate that the proposed algorithm outperforms existing approaches in terms of both accuracy and computational efficiency. Additionally, we introduce a generalized Bayesian framework for uncertainty quantification, allowing for the assessment of statistical significance at each observed point.
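As background for the connection between classification and density ratios exploited above, the short sketch below estimates a density ratio via the standard probabilistic-classification identity r(x) = p1(x)/p0(x) = (n0/n1) * g(x)/(1 - g(x)), where g(x) is the probability that x came from group 1. It uses scikit-learn's gradient-boosted trees purely as an illustrative stand-in; it is not the balancing-loss AdaBoost algorithm proposed in the talk, nor its generalized Bayesian uncertainty quantification.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def density_ratio_via_classifier(x0, x1):
    """Estimate r(x) = p1(x) / p0(x) from two samples using a classifier."""
    X = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(len(x0)), np.ones(len(x1))])
    clf = GradientBoostingClassifier().fit(X, y)

    def ratio(x):
        g = clf.predict_proba(x)[:, 1]          # estimated P(label = 1 | x)
        g = np.clip(g, 1e-6, 1 - 1e-6)          # guard against division by zero
        return (len(x0) / len(x1)) * g / (1.0 - g)

    return ratio

rng = np.random.default_rng(1)
x0 = rng.normal(0.0, 1.0, size=(500, 1))
x1 = rng.normal(0.5, 1.0, size=(500, 1))
r = density_ratio_via_classifier(x0, x1)
print(r(np.array([[0.0], [1.0]])))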
Keywords:
Non-parametric inference, tree-based models, generalized bayes
2. General Bayesian quantile regression for counts via generative modeling
Speaker: Yuta Yamauchi, Nagoya University
In-person
Abstract:
While count data frequently arise in biomedical applications, such as the length of hospital stay, their discrete nature poses significant challenges for appropriately modeling conditional quantiles. To address this practical difficulty, we propose a novel general Bayesian framework for quantile regression tailored to count data. We seek the regression parameter on the conditional quantile by minimizing the expected loss with respect to the distribution of the conditional quantile of the latent continuous variable associated with the observed count response variable. By modeling the unknown conditional distribution through a Bayesian nonparametric kernel mixture for the joint distribution of the count response and covariates, we obtain the posterior distribution of the regression parameter via a simple optimization. We numerically show that the proposed method reduces the bias and improves the estimation accuracy relative to existing crude approaches to count quantile regression. Furthermore, we analyze the length of hospital stay for acute myocardial infarction and demonstrate that the proposed method gives more interpretable and flexible results than the existing ones.
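The loss underlying quantile regression is the usual check (pinball) loss, stated below for reference; the difficulty the talk addresses is that the conditional quantile of a discrete response is not a smooth functional, which motivates working with an associated latent continuous variable.

\rho_{\tau}(u) = u \big( \tau - \mathbf{1}\{u < 0\} \big), \qquad q_{\tau}(x) = \arg\min_{q} \; \mathbb{E}\big[ \rho_{\tau}(Y - q) \mid X = x \big].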
Keywords:
Quantile regression, Nonparametric Bayesian learning, Markov chain Monte Carlo, Pitman-Yor process, Rounded Gaussian distribution
3. Dynamic Bayesian regression quantile synthesis for forecasting outlook-at-risk
Speaker: Genya Kobayashi, Meiji University
In-person
Abstract:
This study aims to provide a Bayesian approach to accurate quantile forecasting for time series data through the Bayesian predictive synthesis. The proposed dynamic Bayesian regression quantile introduces predictions from the agent quantile predictive models as latent factors and lets the weights for the agent models vary across time, constituting a dynamic latent factor model for quantiles. We also consider extending the model for quantile prediction of multiple time series data by introducing an additional factor structure to the synthesis weights. The performance of the proposed approach is demonstrated using the US inflation rate and GDP growth rates for some developed countries.
Keywords:
quantile regression, dynamic linear model, Bayesian predictive synthesis
4. Locally adaptive Bayesian spatiotemporal conditional autoregressive model
Speaker: Takahiro Onizuka, Chiba University
In-person
Abstract:
Spatio-temporal smoothing is a method for estimating the underlying spatio-temporal trend by removing observational noise from the data. In typical applications where multi-year data are observed for multiple adjacent regions, it is common to assume that the spatial trend changes smoothly over space based on the actual geographical adjacency, and that this spatial trend also evolves smoothly over time. However, when the data contain local structural changes that violate these smoothness assumptions, conventional methods may oversmooth the data and fail to capture important local variations. In this study, we propose a novel spatio-temporal smoothing approach that is capable of capturing such local spatio-temporal changes. The effectiveness of the proposed method is demonstrated through numerical experiments.
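For orientation, a proper conditional autoregressive (CAR) prior on areal effects \theta = (\theta_1, \dots, \theta_n) specifies each region given its neighbours as below; the local adaptivity discussed in the talk comes from how the spatial and temporal smoothing components and their shrinkage priors are specified, which goes beyond this standard display.

\theta_i \mid \theta_{-i} \sim \mathcal{N}\!\left( \frac{\rho \sum_{j \sim i} \theta_j}{|\{ j : j \sim i \}|}, \; \frac{\sigma^2}{|\{ j : j \sim i \}|} \right),

where j \sim i indicates geographic adjacency and \rho \in [0, 1) controls the strength of spatial dependence.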
Keywords:
spatiotemporal model, conditional autoregressive model, shrinkage priors
New development in Bayesian analysis and the applications
Bldg 219 Room 103
4 Presentations
1. Bayesian empirical likelihood inference for the mean absolute deviation
Speaker: Hongyan Jiang, Huaiyin Institute of Technology
Online
Abstract:
The mean absolute deviation (MAD) is a direct measure of the dispersion of a random variable about its mean. In this paper, the empirical likelihood (EL) and the adjusted EL methods for the MAD are proposed. The Bayesian empirical likelihood, the Bayesian adjusted empirical likelihood, the Bayesian jackknife empirical likelihood, and the Bayesian adjusted jackknife empirical likelihood methods are used to construct credible intervals for the MAD. Simulation results show that the proposed EL method performs better than the jackknife empirical likelihood (JEL) method of Zhao et al., and that proper prior information improves the coverage rates of the confidence/credible intervals. Two real datasets are used to illustrate the new procedures.
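To make the target parameter concrete, the MAD of X with mean \mu is \Delta = \mathbb{E}|X - \mu|, and one natural empirical-likelihood formulation builds on the pair of estimating equations below; the specific adjusted and jackknife variants, and their Bayesian versions, are as developed in the paper.

\mathbb{E}\big[ X - \mu \big] = 0, \qquad \mathbb{E}\big[ \, |X - \mu| - \Delta \, \big] = 0,

with the empirical likelihood profiled over \mu to yield inference on \Delta.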
Keywords:
Adjusted empirical likelihood, Bayesian empirical likelihood, Bayesian jackknife empirical likelihood, empirical likelihood, mean absolute deviation
2. Bayesian nonparametric model for heterogeneous treatment effects with zero-inflated data
Speaker: Yisheng Li, MD Anderson Cancer Center
Online
Abstract:
Precision medicine calls for statistical models to be developed for identifying and evaluating potentially heterogeneous treatment effects in a robust manner. The oft-cited existing methods for assessing treatment effect heterogeneity are based on parametric models with interactions or conditioning on covariate values, the performance of which is sensitive to the omission of important covariates and/or the choice of their values. We propose a new Bayesian nonparametric (BNP) method for estimating heterogeneous causal effects in studies with zero-inflated outcome data, which arise commonly in health-related studies. We employ the enriched Dirichlet process (EDP) mixture in our BNP approach, establishing a connection between an outcome and a covariate DP mixture. This enables us to estimate posterior distributions concurrently, facilitating flexible inference regarding individual causal effects. We show in simulation studies that the proposed method outperforms two other BNP methods for inference on the conditional average treatment effects. We apply the proposed method to a study of the relationship between heart radiation dose and the blood level of a cardiac toxicity biomarker.
Keywords:
Enriched Dirichlet process, high-sensitivity cardiac troponin T, heterogeneous effects, missing at random
3. BiGER: Fast and Accurate Bayesian Rank Aggregation for Genomic Data
Speaker: Sherry Wang, University of Texas at Arlington
Online
Abstract:
With the rise of large-scale genomic studies, large gene lists targeting important diseases are increasingly common. While evaluating each study individually gives valuable insights on specific samples and study designs, the wealth of available evidence in the literature calls for robust and efficient meta-analytic methods. Crucially, the diverse assumptions and experimental protocols underlying different studies require a flexible but rigorous method for aggregation. To address these issues, we propose BiGER, a fast and accurate Bayesian rank aggregation method for the inference of latent global rankings. Unlike existing methods in the field, BiGER accommodates mixed gene lists with top-ranked and top-unranked genes as well as bottom-tied and missing genes, by design. Using a Bayesian hierarchical framework combined with variational inference, BiGER efficiently aggregates large-scale gene lists with high accuracy, while providing valuable insights into source-specific reliability for researchers. Through both simulated and real datasets, we show that BiGER is a useful tool for reliable meta-analysis in genomic studies.
Keywords:
Bayesian hierarchical modeling, gene expression, Gibbs sampling, meta-analysis, posterior inference, variational inference
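BiGER itself is not spelled out in the abstract; as a point of contrast, the sketch below shows the kind of naive Borda-style aggregation that Bayesian rank-aggregation methods improve on, with unranked genes handled by a tied worst rank (an assumed convention, purely illustrative).

import numpy as np

def borda_aggregate(gene_lists, universe):
    # Naive Borda-style aggregation: genes missing from a list share the
    # worst (tied) rank for that list; lower average rank = higher ranking.
    scores = {g: [] for g in universe}
    for ranked in gene_lists:
        worst = len(ranked) + (len(universe) - len(ranked) + 1) / 2.0
        pos = {g: i + 1 for i, g in enumerate(ranked)}
        for g in universe:
            scores[g].append(pos.get(g, worst))
    return sorted(universe, key=lambda g: np.mean(scores[g]))

lists = [["TP53", "EGFR", "KRAS"], ["EGFR", "TP53"], ["KRAS", "BRAF", "TP53"]]
universe = sorted({g for lst in lists for g in lst})
print(borda_aggregate(lists, universe))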
4. Interval Estimation for the Youden Index and Optimal Cut-Off Point in AUC-Based Optimal Combinations of Multivariate Normal Biomarkers With Covariates
Speaker: Yichuan Zhao, Georgia State University
Online
Abstract:
In this talk, we present interval estimation methods for the Youden index and the optimal cut‐off point in the context of AUC‐based optimal combinations of multivariate normally distributed biomarkers, considering the presence of covariates. We propose a generalized pivotal confidence interval, a Bayesian credible interval, and several bootstrap confidence intervals for both the Youden index and its corresponding cut‐off point. To evaluate the performance of these confidence and credible intervals, we conducted a Monte Carlo simulation study. Finally, we illustrate the proposed methods using a diabetic dataset.
Keywords:
Bayesian credible interval, covariates, generalized pivotal variable, linear combinations, ROC curve, Youden index
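For readers who want the definition being interval-estimated, the sketch below computes the Youden index and its maximizing cut-off for a single biomarker assumed normal in each group, together with a percentile bootstrap interval; it is a minimal univariate illustration without covariates or biomarker combinations, not the generalized pivotal or Bayesian procedures proposed in the talk.

import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def youden_from_normal(mu0, sd0, mu1, sd1):
    # Youden index J = max_c {Se(c) + Sp(c) - 1} and the maximizing cut-off
    # for a biomarker normal in the healthy (0) and diseased (1) groups.
    neg_j = lambda c: -(norm.cdf(c, mu0, sd0) - norm.cdf(c, mu1, sd1))
    res = minimize_scalar(neg_j,
                          bounds=(min(mu0, mu1) - 5 * max(sd0, sd1),
                                  max(mu0, mu1) + 5 * max(sd0, sd1)),
                          method="bounded")
    return -res.fun, res.x

rng = np.random.default_rng(1)
healthy = rng.normal(0.0, 1.0, 200)
diseased = rng.normal(1.5, 1.2, 150)

def estimate(h, d):
    return youden_from_normal(h.mean(), h.std(ddof=1), d.mean(), d.std(ddof=1))

j_hat, c_hat = estimate(healthy, diseased)

# Percentile bootstrap intervals for J and the cut-off.
boot = np.array([estimate(rng.choice(healthy, len(healthy), replace=True),
                          rng.choice(diseased, len(diseased), replace=True))
                 for _ in range(2000)])
print("J =", round(j_hat, 3), "cut-off =", round(c_hat, 3))
print("95% bootstrap CI for J:", np.percentile(boot[:, 0], [2.5, 97.5]))
print("95% bootstrap CI for cut-off:", np.percentile(boot[:, 1], [2.5, 97.5]))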
Recent Advances in Bayesian Methods for Spatial Biomedical Data
Bldg 219 Room 102
4 Presentations
1. AI-powered Bayesian Methods for Analyzing Spatial Biomedical Data
Speaker: Li Qiwei, The University of Texas at Dallas
In-person
Abstract:
Statistics relies more on human analyses with computer aids, while AI relies more on computer algorithms with aid from humans. Nevertheless, each milestone that expands the scope of statistics opens new avenues for AI and creates new insights in statistics, so that findings initiated in either field can benefit the other. In this talk, I will demonstrate how the marriage between spatial statistics and AI leads to more explainable and predictable paths from raw spatial biomedical data to conclusions. The first part concerns the spatial modeling of AI-reconstructed pathology images. Recent developments in deep-learning methods have enabled us to identify and classify individual cells from digital pathology images at a large scale. The randomly distributed cells can be viewed as a realization of a marked point process. I will present two novel Bayesian models for characterizing spatial correlations in a multi-type spatial point pattern. The new method provides a unique perspective for understanding the role of cell-cell interactions in cancer progression, demonstrated through a case study of 188 lung cancer patients. The second part concerns the spatial modeling of emerging spatially resolved transcriptomics data. Recent technological breakthroughs in spatial molecular profiling have enabled the comprehensive molecular characterization of single cells while preserving their spatial and morphological contexts. This new bioinformatics scenario advances our understanding of molecular and cellular spatial organizations in tissues, fueling the next generation of scientific discovery. I will focus on integrating information from AI tools into Bayesian models to address key questions in this field, such as spatially variable gene detection and spatial domain identification.
Keywords:
TBD
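As a concrete (and much simplified) example of the kind of multi-type spatial summary that Bayesian point-process models formalize, the sketch below computes a cross-type nearest-neighbour distance between two simulated cell types and compares it to a random-labelling null; all quantities are illustrative and not part of the presented models.

import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)
# Two cell types scattered in a unit square (stand-in for cells detected on an image).
tumor = rng.uniform(0, 1, size=(300, 2))
immune = rng.uniform(0, 1, size=(200, 2))

def mean_cross_nn(a, b):
    # Mean distance from each point of type a to its nearest type-b neighbour.
    d, _ = cKDTree(b).query(a, k=1)
    return d.mean()

obs = mean_cross_nn(tumor, immune)

# Random-labelling null: shuffle the type labels and recompute the summary.
pooled = np.vstack([tumor, immune])
null = []
for _ in range(500):
    idx = rng.permutation(len(pooled))
    null.append(mean_cross_nn(pooled[idx[:len(tumor)]], pooled[idx[len(tumor):]]))
print("observed:", round(obs, 4), " null 95% band:",
      np.round(np.percentile(null, [2.5, 97.5]), 4))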
2. Differential Inference for Spatial Transcriptomics Data
Speaker: Song Fangda, The Chinese University of Hong Kong, Shenzhen
In-person
Abstract:
Spatial transcriptomics experiments are becoming increasingly complex, involving patients or tissues collected under multiple biological conditions. Comparative studies of spatial transcriptomics data enable the discovery of different spatial expression patterns between biological conditions. Moreover, spatial expression patterns are highly heterogeneous across subregions within a single slide. We therefore develop a Bayesian hierarchical model on the spatial metrics of subregions to simultaneously classify the subregions and conduct differential inference for spatial summary statistics. Subregion classification and differential inference for the spatial metrics enable us to identify the perturbations induced by disease conditions and explain the underlying biological mechanisms.
Keywords:
Bayesian Hierarchical Modelling, Spatial Transcriptomics Experiments, Differential Inference
3. Spatially Aware Adjusted Rand Index for Evaluating Spatial Transcriptomics Clustering
Speaker: Yan Yinqiao, Beijing University of Technology
In-person
Abstract:
Spatial transcriptomics (ST) clustering plays a crucial role in elucidating tissue spatial heterogeneity, and an accurate ST clustering result can greatly benefit downstream biological analyses. As various ST clustering approaches have been proposed in recent years, comparing their clustering accuracy has become important in benchmarking studies. However, the widely used metric, the adjusted Rand index (ARI), entirely ignores the spatial information in ST data, which prevents ARI from fully evaluating spatial ST clustering methods. We propose a spatially aware Rand index (spRI) and its adjusted version (spARI) that incorporate spatial distance information. The spatially aware feature of spRI adaptively differentiates disagreement object pairs based on their distances, providing an evaluation metric that favors spatial coherence of clustering. The spARI is obtained by adjusting spRI for chance so that its expectation is zero under an appropriate null model. Statistical properties of spRI and spARI are discussed. Applications to a simulation study and two ST datasets demonstrate the improved utility of spARI compared to ARI in evaluating ST clustering methods.
Keywords:
clustering evaluation, hypergeometric distribution, Rand index, spatial transcriptomics
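The abstract does not give the exact spRI weighting, so the sketch below implements an assumed toy version in which agreeing pairs score 1 and disagreeing pairs receive partial credit that grows with spatial distance; it illustrates the general idea of a spatially aware Rand index, not the published definition.

import numpy as np
from itertools import combinations

def spatially_weighted_rand(labels_a, labels_b, coords, lam=1.0):
    # Toy spatially weighted Rand index: agreeing pairs score 1; disagreeing
    # pairs score 1 - exp(-d/lam), so disagreements between distant spots are
    # penalized less than disagreements between neighbours (assumed weighting).
    n = len(labels_a)
    total = 0.0
    for i, j in combinations(range(n), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            total += 1.0
        else:
            d = np.linalg.norm(coords[i] - coords[j])
            total += 1.0 - np.exp(-d / lam)
    return total / (n * (n - 1) / 2)

rng = np.random.default_rng(3)
coords = rng.uniform(0, 10, size=(60, 2))
truth = (coords[:, 0] > 5).astype(int)          # two spatial domains
noisy = truth.copy()
flip = rng.choice(60, size=6, replace=False)    # a few mislabelled spots
noisy[flip] = 1 - noisy[flip]
print("weighted RI:", round(spatially_weighted_rand(truth, noisy, coords), 3))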
4. Tissue Annotation in Lung Cancer Histopathology with Batch Effect Correction
Speaker: Zhai Yibo, The Chinese University of Hong Kong
In-person
Abstract:
Histopathological analysis of hematoxylin and eosin (H&E)-stained slides plays a critical role in lung cancer diagnosis and prognosis. However, accurate annotation of tumor regions and other tissue structures typically requires expert pathologists, making annotation of large numbers of images time-consuming, labor-intensive, and expensive. In this study, we propose a Bayesian model that facilitates automated annotation of diverse tissue regions, including regions of interest (ROIs), from lung cancer H&E slides. At the same time, our method can correct for batch effects, such as staining variability and scanner-induced color differences, ensuring that the extracted image features are comparable across datasets. Application of our proposed model to lung cancer H&E slides shows that the identified features provide meaningful biological interpretations in the context of epidermal growth factor receptor (EGFR) mutations, which are clinically important for lung cancer diagnosis and treatment strategy.
Keywords:
Pathology images, Batch effects, Model-based clustering
Recent Developments in a Bayesian Framework
Bldg 219 Room 105
4 Presentations
1. Extended Fiducial Inference: Toward an Automated Process of Statistical Inference
Speaker: Sehwan Kim, Ewha Womans University
Online
Abstract:
While fiducial inference has been widely regarded as R.A. Fisher's great blunder, the goal he initially set – 'inferring the uncertainty of model parameters on the basis of observations' – has been continually pursued by many statisticians. To this end, we develop a new statistical inference method called extended fiducial inference (EFI). The new method achieves the goal of fiducial inference by leveraging advanced statistical computing techniques while remaining scalable for big data. EFI involves jointly imputing the random errors realized in the observations using stochastic gradient Markov chain Monte Carlo and estimating the inverse function using a sparse deep neural network (DNN). The consistency of the sparse DNN estimator ensures that the uncertainty embedded in the observations is properly propagated to the model parameters through the estimated inverse function, thereby validating downstream statistical inference. Compared to frequentist and Bayesian methods, EFI offers significant advantages in parameter estimation and hypothesis testing, especially when outliers are present in the observations. EFI also provides an innovative framework for semi-supervised learning.
Keywords:
Complex Hypothesis Test, Markov chain Monte Carlo, Semi-Supervised Learning, Sparse deep learning, Uncertainty Quantification
2. Bayesian Order-based Structure Learning for multiple DAGs
Speaker: Hyunwoong Chang, University of Texas at Dallas
In-person
Abstract:
We propose a novel order-based structure learning method for jointly estimating multiple directed acyclic graph (DAG) models. While identifiability issues are expected to be alleviated by pooling information from multiple datasets, high computational costs make it difficult to incorporate the between-dataset variability into the model. Our proposed model addresses this challenge with a Bayesian framework tailored to fast Markov chain Monte Carlo (MCMC) sampling, scalable to a large number of datasets. The proposed approach remains effective even when the faithfulness assumption, which is widely considered unrealistic in practice, is violated. We demonstrate the efficacy of the method through extensive simulation studies and showcase its practical advantages with a case-control study of major depressive disorder (MDD) using single-cell RNA sequencing data.
Keywords:
Bayesian Statistics, graphical model, structure learning, Markov chain Monte Carlo, genetics
3. Flexible modeling of between-location heterogeneity in time-series count data using a Nonparametric Bayesian Poisson hurdle model: a case study in Japan
Speaker: Jinsu Park, Chungbuk National University
In-person
Abstract:
In environmental epidemiology, the short-term association between temperature and suicide has been examined by analyzing daily time-series data on suicide and temperature collected from multiple locations. A two-stage meta-regression has been conventionally used. A Poisson regression is fitted for each location in the first stage, and location-specific association parameter estimates are pooled, adjusted, and regressed onto location-specific variables using meta-regressions in the second stage. However, several limitations of the two-stage approaches have been reported. In this study, we propose a nonparametric Bayesian Poisson hurdle random effects model to investigate heterogeneity in the temperature-suicide association across multiple locations. The proposed model consists of two parts, binary and positive, with random coefficients specified to describe heterogeneity. Furthermore, random coefficients combined with location-specific indicators were assumed to follow a Dirichlet process mixture of normals to identify the subgroups. The proposed methodology was validated through a simulation study and applied to data from a nationwide temperature-suicide association study in Japan.
Keywords:
Bayesian inference, temperature-suicide association, Poisson hurdle model, Dirichlet process mixture, model-based clustering
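The hurdle likelihood at the core of the model is standard, and the sketch below writes it out for a logit binary part and a zero-truncated Poisson positive part, fitted here by plain maximum likelihood on simulated data; the nonparametric Bayesian random-effects and Dirichlet process components of the talk are not reproduced.

import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln, expit

def hurdle_negloglik(params, y, X):
    # Negative log-likelihood of a Poisson hurdle model:
    # logit P(Y > 0) = X @ a (binary part), log(rate) = X @ b (zero-truncated part).
    k = X.shape[1]
    a, b = params[:k], params[k:]
    pi = expit(X @ a)                 # probability of a positive count
    lam = np.exp(X @ b)               # rate of the zero-truncated Poisson
    pos = y > 0
    ll = np.sum(np.log1p(-pi[~pos]))  # zeros
    ll += np.sum(np.log(pi[pos])      # positives: Bernoulli part ...
                 + y[pos] * np.log(lam[pos]) - lam[pos] - gammaln(y[pos] + 1)
                 - np.log1p(-np.exp(-lam[pos])))   # ... times truncated Poisson
    return -ll

def rtruncpois(rng, lam):
    # Draw zero-truncated Poisson variates by simple rejection.
    out = rng.poisson(lam)
    while np.any(out == 0):
        idx = out == 0
        out[idx] = rng.poisson(lam[idx])
    return out

rng = np.random.default_rng(5)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
pi_true, lam_true = expit(X @ [0.3, 1.0]), np.exp(X @ [0.5, 0.4])
y = np.zeros(n, dtype=int)
positive = rng.uniform(size=n) < pi_true
y[positive] = rtruncpois(rng, lam_true[positive])
fit = minimize(hurdle_negloglik, np.zeros(4), args=(y, X), method="BFGS")
print("estimated (a, b):", np.round(fit.x, 2))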
4. Posterior asymptotics of high-dimensional spiked covariance model with inverse-Wishart prior
Speaker: Kwangmin Lee, Chonnam National University
In-person
Abstract:
We study Bayesian inference for the spiked eigenstructure of high-dimensional covariance matrices, where a small number of eigenvalues (spikes) are significantly larger than the remaining bulk. Our goal is to estimate the spiked eigenvalues, their associated eigenvectors, and the number of spikes. To this end, we impose an inverse-Wishart prior on the unknown covariance matrix and derive the posterior distributions of the eigenvalues and eigenvectors by transforming the posterior distribution of the full covariance matrix. We show that the degrees of freedom parameter in the inverse-Wishart prior governs the shrinkage of the eigenvalues, and we propose a data-driven choice of this hyperparameter to correct bias in the estimated eigenvalues. Furthermore, we prove that under the spiked covariance model, the posterior distribution of the spiked eigenvectors concentrates around the true eigenvectors, and in the single-spike setting, the posterior achieves minimax optimality. We also introduce a Bayesian method for selecting the number of spikes, which provides a principled way to quantify uncertainty in determining the intrinsic dimension of principal components.
Keywords:
High-dimensional, Bayesian, Principal component analysis
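The conjugate update behind the approach is simple to state: for zero-mean Gaussian data with an inverse-Wishart prior IW(nu0, Psi0) on the covariance, the posterior is IW(nu0 + n, Psi0 + X'X). The sketch below draws from this posterior for a simulated spiked covariance matrix and reads off the leading eigenvalues; the degrees-of-freedom setting is a fixed weak default rather than the data-driven, bias-correcting choice proposed in the talk.

import numpy as np
from scipy.stats import invwishart

rng = np.random.default_rng(6)
p, n, k = 20, 200, 2
# Spiked covariance: two large eigenvalues on top of identity noise.
U = np.linalg.qr(rng.normal(size=(p, k)))[0]
Sigma = np.eye(p) + U @ np.diag([25.0, 10.0]) @ U.T
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

# Conjugate update for zero-mean Gaussian data with Sigma ~ IW(nu0, Psi0).
nu0, Psi0 = p + 2, np.eye(p)
nu_n, Psi_n = nu0 + n, Psi0 + X.T @ X

draws = invwishart(df=nu_n, scale=Psi_n).rvs(size=500, random_state=6)
top_eigs = np.sort(np.linalg.eigvalsh(draws), axis=1)[:, -k:]   # two largest per draw
print("posterior means of the two spiked eigenvalues:", top_eigs.mean(axis=0).round(2))
print("true spiked eigenvalues:", np.sort(np.linalg.eigvalsh(Sigma))[-k:].round(2))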
Recent advances in Bayesian computational methods
Bldg 219 Room 103
3 Presentations
1. When Transfer Learning Meets Bayesian Analysis
Speaker: Xiaozhou Wang, East China Normal University
In-person
Abstract:
Transfer learning leverages knowledge gained from related source domains to improve performance on tasks in a target domain. In this talk, I will introduce some research results on the integration of transfer learning with Bayesian analysis. The proposed algorithms and the corresponding theoretical properties are presented. Numerical experiments are conducted to demonstrate the effectiveness of the methodologies.
Keywords:
Bayesian statistics, transfer learning, machine learning
2. Improving approximate Bayesian computation methods using machine learning methods
Speaker: Shijia Wang, ShanghaiTech University
In-person
Abstract:
We address a challenge of Markov chain Monte Carlo (MCMC) algorithms within the approximate Bayesian computation (ABC) framework: they often get trapped in local optima due to their inherently local exploration mechanism. We propose a novel Global-Local ABC-MCMC algorithm that combines the "exploration" capabilities of global proposals with the "exploitation" finesse of local proposals. We integrate iterative importance resampling into the likelihood-free framework to establish an effective global proposal distribution, and we select the optimal mixture of global and local moves based on a relative version of the expected squared jump distance via sequential optimization. Furthermore, we propose two adaptive schemes. The first is a normalizing-flow-based probabilistic distribution learning model that iteratively improves the importance sampling proposal. The second optimizes the efficiency of the local sampler by utilizing Langevin dynamics and common random numbers. We numerically demonstrate that our method improves sampling efficiency and achieves more reliable convergence for complex posteriors.
Keywords:
approximate Bayesian computation, Markov chain Monte Carlo, Global-Local proposal
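A bare-bones version of mixing global and local moves in ABC-MCMC is sketched below for a toy normal-mean model: with probability w the proposal is an independent draw from the prior (a crude stand-in for the importance-resampling/normalizing-flow global proposal of the talk), otherwise a random walk; the threshold, step size, and summary statistic are illustrative choices.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
obs = rng.normal(2.0, 1.0, size=100)
s_obs = obs.mean()                       # summary statistic

prior = norm(0.0, 10.0)                  # prior on the mean
eps, step, w_global, iters = 0.1, 0.5, 0.2, 5000

def distance(theta):
    # Distance between simulated and observed summaries.
    sim = rng.normal(theta, 1.0, size=len(obs))
    return abs(sim.mean() - s_obs)

theta = prior.rvs(random_state=rng)
chain = []
for _ in range(iters):
    if rng.uniform() < w_global:
        # Global move: independent draw from the prior; with a prior proposal
        # the prior and proposal densities cancel in the MH ratio.
        prop, log_mh = prior.rvs(random_state=rng), 0.0
    else:
        # Local move: symmetric random walk; only the prior ratio remains.
        prop = theta + step * rng.normal()
        log_mh = prior.logpdf(prop) - prior.logpdf(theta)
    if distance(prop) < eps and np.log(rng.uniform()) < log_mh:
        theta = prop
    chain.append(theta)

print("ABC posterior mean / sd:",
      np.mean(chain[1000:]).round(3), np.std(chain[1000:]).round(3))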
3. Poisson Hyperplane Processes with Rectified Linear Units
Speaker: Shufei Ge, ShanghaiTech University
In-person
Abstract:
Neural networks have shown state-of-the-art performance in various classification and regression tasks. Rectified linear units (ReLU) are often used as activation functions for the hidden layers of a neural network. In this article, we establish a connection between Poisson hyperplane processes (PHP) and two-layer ReLU neural networks. We show that a PHP with a Gaussian prior is an alternative probabilistic representation of a two-layer ReLU neural network. In addition, we show that a two-layer neural network constructed by a PHP is scalable to large-scale problems via decomposition propositions. Finally, we propose an annealed sequential Monte Carlo algorithm for Bayesian inference. Our numerical experiments demonstrate that the proposed method outperforms the classic two-layer ReLU neural network.
Keywords:
neural network, Poisson hyperplane processes, sequential Monte Carlo
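A quick way to see the connection is to treat randomly drawn hyperplanes as the hidden units of a two-layer ReLU network and place a Gaussian prior on the output weights, as in the sketch below; the Poisson hyperplane construction details and the annealed sequential Monte Carlo inference of the paper are not reproduced, and all settings are illustrative.

import numpy as np

rng = np.random.default_rng(8)
n, d = 300, 1
X = rng.uniform(-3, 3, size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

# Draw a Poisson number of random hyperplanes (w, b); each contributes a
# ReLU feature max(0, x @ w + b), i.e. one hidden unit of a two-layer network.
m = rng.poisson(50)
W = rng.normal(size=(d, m))
b = rng.uniform(-3, 3, size=m)
Phi = np.maximum(0.0, X @ W + b)

# Gaussian prior on the output weights gives a ridge-type posterior mean.
tau2, sigma2 = 1.0, 0.1 ** 2
A = Phi.T @ Phi / sigma2 + np.eye(m) / tau2
w_mean = np.linalg.solve(A, Phi.T @ y / sigma2)

X_test = np.linspace(-3, 3, 5).reshape(-1, 1)
pred = np.maximum(0.0, X_test @ W + b) @ w_mean
print(np.column_stack([X_test[:, 0], np.sin(X_test[:, 0]), pred]).round(2))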
Recent development in Bayesian methodology and computation
Bldg 219 Room 103
4 Presentations
1. A Stein Gradient Descent Approach for Doubly Intractable Distributions
Speaker: Jaewoo Park, Yonsei University
In-person
Abstract:
Bayesian inference for doubly intractable distributions is challenging because they include intractable terms, which are functions of parameters of interest. Although several alternatives have been developed for such models, they are computationally intensive due to repeated auxiliary variable simulations. We propose a novel Monte Carlo Stein variational gradient descent (MC-SVGD) approach for inference for doubly intractable distributions. Through an efficient gradient approximation, our MC-SVGD approach rapidly transforms an arbitrary reference distribution to approximate the posterior distribution of interest, without necessitating any predefined variational distribution class for the posterior. Such a transport map is obtained by minimizing Kullback-Leibler divergence between the transformed and posterior distributions in a reproducing kernel Hilbert space (RKHS). We also investigate the convergence rate of the proposed method. We illustrate the application of the method to challenging examples, including a Potts model, an exponential random graph model, and a Conway--Maxwell--Poisson regression model.
Keywords:
doubly-intractable distributions, variational inference, Markov chain Monte Carlo, kernel Stein discrepancy, importance sampling
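For orientation, the sketch below implements one standard SVGD update with an RBF kernel on a tractable Gaussian target; in the doubly intractable setting of the talk, the score grad log p would itself be replaced by a Monte Carlo estimate based on simulated auxiliary variables, which this sketch does not do.

import numpy as np

def svgd_update(particles, grad_logp, h=None):
    # One Stein variational gradient descent step with an RBF kernel:
    # phi(x_i) = mean_j [ k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i) ].
    n, d = particles.shape
    diff = particles[:, None, :] - particles[None, :, :]        # x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)
    if h is None:                                               # median heuristic
        h = np.median(sq) / np.log(n + 1) + 1e-8
    K = np.exp(-sq / h)                                         # k(x_i, x_j)
    grads = grad_logp(particles)                                # (n, d)
    # grad_{x_j} k(x_j, x_i) = (2 / h) * (x_i - x_j) * k(x_i, x_j)
    phi = (K @ grads + 2.0 / h * np.sum(K[:, :, None] * diff, axis=1)) / n
    return phi

# Toy target: N(mu, I) with a closed-form score, purely for illustration.
mu = np.array([2.0, -1.0])
grad_logp = lambda x: -(x - mu)

rng = np.random.default_rng(9)
particles = rng.normal(size=(200, 2))
for _ in range(500):
    particles += 0.1 * svgd_update(particles, grad_logp)
print("particle mean:", particles.mean(axis=0).round(2))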
2. Application to Bayesian Statistical Methods in Environmental Epidemiology
Speaker: Daewon Yang, Chungnam National University
In-person
Abstract:
Numerous studies have shown a U- or J-shaped association between ambient temperature and mortality, which has shifted over time due to climate change and adaptation. Our preliminary analysis found that these temporal changes are not continuous, challenging the linearity assumption often made in conventional two-stage models. Segmenting the study period into sub-periods can reduce power, and assuming normality in second-stage mixed-effects models increases sensitivity to outliers. To overcome these issues, we propose a robust Bayesian change-point model. In the first stage, we estimated temperature-mortality relationships using distributed lag nonlinear models across three-year sub-periods. In the second stage, we applied mixed-effects modeling with a Gaussian–t mixture to account for outliers, followed by hierarchical Bayesian change-point modeling with a no-return constraint. Our approach identified two significant change points, indicating a recent decline in heat-mortality risk and suggesting successful adaptation in Japan. This framework enables robust detection of temporal shifts in functional data and can be applied beyond climate-related mortality studies.
Keywords:
Change-point model, Bayesian nonparametrics, environmental epidemiology, temperature-mortality association
3. Bayesian Envelope Models for Biomedical Data: A Novel Approach to Dimension Reduction
Speaker: Yeonhee Park, Sungkyunkwan University
In-person
Abstract:
Bayesian envelope models provide a powerful approach to dimension reduction in high-dimensional biomedical data. Originating from Cook et al. (2010), the envelope model improves multivariate regression by identifying material and immaterial parts of variation relevant to the response. This presentation introduces two recent advancements in Bayesian envelope methodology and their applications in imaging genetics and cell line data analysis for targeted treatments. The first is the Bayesian Simultaneous Partial Envelope Model, which reduces dimensions of both multivariate responses and selected predictors, preserving important genetic signals while removing irrelevant variation. The second is a Bayesian method for the Multivariate Probit Model with a Latent Envelope, tailored for correlated binary outcomes. This model, the first envelope approach for non-continuous multivariate responses under a generalized linear model framework, addresses identifiability challenges by reparameterizing the space with constraints and applying essential identifiability. Applications to biomedical data show both models reduce noise while maintaining key signals, enhancing inference and prediction.
Keywords:
Latent envelope, multivariate regression, multivariate probit model, reducing subspace, simultaneous envelope.
4. Bayesian Regression for Aggregate Ordinal Outcomes with Imprecise Categories
Speaker: Yeongjin Gwon, University of Nebraska Medical Center
Online
Abstract:
Comparing emerging treatment options is often challenging because of the sparsity of direct comparisons from head-to-head trials and inconsistencies in outcome measures among published placebo-controlled trials for each treatment. The ordinal response variable will inevitably contain unknown response categories because they cannot be directly derived from published data. In this talk, we propose a statistical methodology to overcome this common but unresolved issue in the context of network meta-regression for aggregate ordinal outcomes. Specifically, we introduce unobserved latent counts and model these counts within a Bayesian framework. The proposed approach includes several existing models as special cases and allows us to conduct a proper statistical analysis in the presence of trials with certain missing categories. We then develop an efficient Markov chain Monte Carlo sampling algorithm to carry out Bayesian computation. A variation of the deviance information criterion is used to assess goodness-of-fit under different distributions of the latent counts. A case study demonstrating the usefulness of the proposed methodology is presented.
Keywords:
Bayesian SUCRA, Collapsed Gibbs sampling, Indirect comparison, Latent count