Invited Sessions


Invited Session Abstracts

Conference Abstract Collection
66 Presentations
19 Sessions
37 In-person
29 Online

Advanced Bayesian Methods for Analyzing Complex Data

Bldg 219 Room 102
3 Presentations
Organizer: Keunbaik Lee / Sungkyunkwan University
Chair: Keunbaik Lee / Sungkyunkwan University

1. Bayesian confounder selection using decision trees

Speaker: Chanmin Kim, Sungkyunkwan University Online
Email: chanmin.kim@skku.edu
Abstract:
Causal mediation analysis plays a vital role in understanding the mechanisms through which an exposure variable influences an outcome, by decomposing the total effect into direct and indirect effects. A major challenge in mediation analysis is the presence of unobserved or high-dimensional confounders that can bias these causal effect estimates. In this talk, we propose a Bayesian nonparametric framework for confounder selection in causal mediation analysis, leveraging a flexible extension of Bayesian Additive Regression Trees (BART). Our approach incorporates sparsity-inducing priors to identify potential confounders that satisfy a modified disjunctive cause criterion—ensuring appropriate adjustment for variables that affect the exposure, mediator, or outcome. We demonstrate the consistency of confounder selection under high-dimensional settings, providing theoretical guarantees for the posterior concentration and selection probabilities. Through extensive simulation studies, we compare our method against existing approaches, illustrating its superior performance in selecting true confounders and accurately estimating direct and indirect causal effects. Finally, we apply our propo
Keywords:
Bayesian additive regression trees, causal inference, confounder selection

2. A modified VAR-deGARCH model for asynchronous multivariate financial time series via variational Bayesian inference

Speaker: Ray-Bing Chen, National Tsing Hua University In-person
Email: rbchen@stat.nthu.edu.tw Coauthors: Wei-Ting Lai/National Central University, Shih-Feng Huang/National Central University
Abstract:
This study proposes a modified VAR-deGARCH model, denoted by M-VAR-deGARCH, for modeling asynchronous multivariate financial time series with GARCH effects and simultaneously accommodating the latest market information. A variational Bayesian (VB) procedure is developed for the M-VAR-deGARCH model to infer structure selection and parameter estimation. We conduct extensive simulations and empirical studies to evaluate the fitting and forecasting performance of the M-VAR-deGARCH model. The simulation results reveal that the proposed VB procedure produces satisfactory selection performance. In addition, our empirical studies find that the latest market information in Asia can provide helpful information to predict market trends in Europe and South Africa, especially when momentous events occur.
Keywords:
Asynchronous time series, GARCH, Variable selection, Variational Bayesian inference, Vector autoregressive model
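The M-VAR-deGARCH model builds on a vector autoregressive (VAR) backbone. As a minimal illustration of that backbone only (the coefficient values, dimensions, and least-squares estimator below are illustrative assumptions, not the authors' model or their variational Bayesian procedure), a VAR(1) process can be simulated and estimated as follows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative VAR(1) coefficient matrix (not from the paper)
A = np.array([[0.5, 0.1],
              [0.0, 0.4]])
n, d = 5000, 2

# Simulate y_t = A y_{t-1} + e_t with unit-variance Gaussian innovations
y = np.zeros((n, d))
for t in range(1, n):
    y[t] = A @ y[t - 1] + rng.standard_normal(d)

# Least-squares estimate: regress y_t on y_{t-1}
X, Y = y[:-1], y[1:]
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T  # shape (d, d)
```

Structure selection in the talk's setting amounts to deciding which entries of matrices like `A` are nonzero, which the proposed VB procedure infers jointly with the GARCH components.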

3. Bayesian Group Sparsity for Detecting Multiple Structural Breaks in an AR(p) Process

Speaker: Kuo-Jung Lee, National Cheng Kung University In-person
Email: kuojunglee@ncku.edu.tw Coauthors: Kuo-Jung Lee/National Cheng Kung University, Yi-Chi Chen/National Cheng Kung University
Abstract:
This paper proposes a Bayesian group sparsity method for modeling time series with structural breaks in autoregressive (AR) processes. Structural breaks are treated as unknown and estimated jointly with segment-specific AR parameters within a unified Bayesian variable selection framework, eliminating the need to pre-specify the number or location of breakpoints. The method uses groupwise Gibbs sampling (GWGS) for efficient estimation, even in long time series. Simulation studies confirm its ability to detect structural changes and recover regime-specific dynamics. Applications to U.S. macroeconomic indicators and S&P 500 stock returns show that the approach effectively captures shifts in economic conditions and financial market volatility. Compared to traditional Bayesian methods, the proposed model is more computationally efficient and suitable for high-frequency or complex data, offering a practical tool for structural break detection in economic and financial analysis.
Keywords:
Bayesian variable selection; Group sparsity; Inflation persistence; Multiple structural changes; Nonstationary autoregressive process

Advanced Bayesian methods

Bldg 219 Room 105
3 Presentations
Organizer: Minwoo Chae / POSTECH
Chair: Minwoo Chae / POSTECH

1. Bayesian Analysis of Tensor Product Neural Networks

Speaker: Yongdai Kim, Seoul National University In-person
Email: ydkim0903@gmail.com Coauthors: Seokhun Park / Seoul National University, Choeun Kim / Seoul National University, Jihu Lee / Seoul National University, Yunseop Shin / Seoul National University
Abstract:
The growing emphasis on interpretability in machine learning has brought renewed attention to the functional ANOVA model, which offers a principled approach for decomposing high-dimensional functions into interpretable lower-dimensional components. Recently, the Tensor Product Neural Network (TPNN) has been proposed to estimate each component in the functional ANOVA model accurately and stably. The number of basis TPNNs, however, must be specified a priori, which significantly limits the applicability of TPNN. In this work, we propose a Bayesian TPNN that can learn the number of basis TPNNs as well as the basis TPNNs themselves. We develop an efficient MCMC algorithm and illustrate that the proposed Bayesian TPNN performs well by analyzing multiple benchmark datasets.
Keywords:
Explainable AI, Functional ANOVA model, Tensor Product Neural Networks

2. Efficient MCMC for Bayesian Neural Networks

Speaker: Juho Lee, KAIST In-person
Email: juholee@kaist.ac.kr Coauthors: Hyunsu Kim / KAIST, Giung Nam / KAIST, Chulhee Yun / KAIST, Hongseok Yang / KAIST
Abstract:
Bayesian Neural Networks (BNNs) offer a principled framework for modeling predictive uncertainty and improving out-of-distribution (OOD) robustness by estimating posterior distributions over network parameters. Stochastic Gradient Markov Chain Monte Carlo (SGMCMC) enables scalable posterior sampling by combining stochastic gradients with Langevin dynamics, but often suffers from limited sample diversity, hindering uncertainty estimation and performance. We propose a simple and effective method to boost sample diversity in SGMCMC without tempering or multiple chains. By reparameterizing each weight matrix as a product of matrices, our approach induces trajectories that explore the parameter space more effectively. This leads to faster mixing and more diverse samples under the same computational budget, without increasing inference cost. Extensive experiments on image classification, including OOD robustness, loss landscape analyses, and comparisons with Hamiltonian Monte Carlo, validate the superiority of our method.
Keywords:
Bayesian neural networks, Stochastic gradient MCMC, Parameter expansion
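The core reparameterization idea, writing a weight as a product of factors so the Langevin trajectory moves in factor space, can be sketched on a toy Bayesian linear regression (all settings, factor sizes, and the prior below are illustrative assumptions, not the authors' algorithm or their experiments):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy Bayesian linear regression: y = X w + noise, with the weight vector
# reparameterized as the matrix product w = U @ V (U: d x k, V: k x 1),
# so Langevin updates are applied to the factors rather than to w directly.
n, d, k = 100, 5, 3
w_true = np.arange(1.0, d + 1.0)
X = rng.standard_normal((n, d))
y = X @ w_true + rng.standard_normal(n)

U = 0.5 * rng.standard_normal((d, k))
V = 0.5 * rng.standard_normal((k, 1))
eps, lam = 1e-5, 1e-2            # step size; Gaussian prior precision on factors
samples = []
for it in range(8000):
    r = y - X @ (U @ V).ravel()          # residual
    g_w = (X.T @ r)[:, None]             # log-likelihood gradient wrt w
    g_U = g_w @ V.T - lam * U            # chain rule through w = U V
    g_V = U.T @ g_w - lam * V
    # SGLD-style update: half-step gradient plus injected Gaussian noise
    U += 0.5 * eps * g_U + np.sqrt(eps) * rng.standard_normal(U.shape)
    V += 0.5 * eps * g_V + np.sqrt(eps) * rng.standard_normal(V.shape)
    if it >= 4000:
        samples.append((U @ V).ravel())

w_hat = np.mean(samples, axis=0)         # posterior-mean estimate of w
```

The factorized trajectory explores the (non-identifiable) factor space while the induced product w = U V remains the quantity of interest; the talk's contribution is that this expansion improves mixing and sample diversity in deep networks.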

3. L2-norm posterior contraction in Gaussian models with unknown variance

Speaker: Seonghyun Jeong, Yonsei University In-person
Email: sjeong@yonsei.ac.kr
Abstract:
The testing-based approach is a fundamental tool for establishing posterior contraction rates. Although the Hellinger metric is attractive owing to the existence of a desirable test function, it is not directly applicable in Gaussian models, because translating the Hellinger metric into more intuitive metrics typically requires strong boundedness conditions. When the variance is known, this issue can be addressed by directly constructing a test function relative to the L2-metric using the likelihood ratio test. However, when the variance is unknown, existing results are limited and rely on restrictive assumptions. To overcome this limitation, we derive a test function tailored to an unknown variance setting with respect to the L2-metric and provide sufficient conditions for posterior contraction based on the testing-based approach. We apply this result to analyze high-dimensional regression and nonparametric regression.
Keywords:
Bayesian nonparametrics, High-dimensional regression, Nonparametric regression, Testing-based posterior contraction

Advanced Bayesian methods for high-dimensional heterogeneous data

Bldg 219 Room 103
4 Presentations
Organizer: Qiwei Li / The University of Texas at Dallas
Chair: Qiwei Li / The University of Texas at Dallas

1. Bayesian covariate-assisted interaction analysis for multivariate count data in microbiome study

Speaker: Juhee Lee, University of California, Santa Cruz Online
Email: jle297@ucsc.edu Coauthors: Shuangjie Zhang / Texas A&M University, Michael Patnode / University of California, Santa Cruz
Abstract:
Understanding covariate-dependent interdependencies among features is of great interest in various applications. Motivated by a dataset of multivariate counts from a microbiome study, where microbial abundance and interaction patterns may change with environmental factors, we develop a Bayesian covariate-dependent factor model that flexibly estimates heteroscedasticity in the covariance matrix due to covariates. Our approach employs covariance regression through linear regression on a lower-dimensional factor loading matrix. This formulation, combined with joint sparsity induced by the Dir-HS prior for the factor loadings, provides robust estimation of covariate-dependent covariance in high-dimensional settings. The model uses a regression approach to the mean abundance and addresses the varying mean and covariance structure with covariates. Furthermore, the model tackles significant statistical challenges such as discreteness, over-dispersion, compositionality, and high dimensionality that are common in microbiome data analyses, using a flexible nonparametric Bayesian approach. We thoroughly explore the properties of the model and perform extensive simulation studies to examine it
Keywords:
Covariate-dependent interdependencies, Microbiome data analysis, Multivariate counts, Bayesian factor model, Covariance regression, Heteroscedasticity.

2. DAG trend filtering for genomic denoising via higher-order Bayesian networks and DAG shrinkage processes

Speaker: Weixuan Zhu, Xiamen University Online
Email: zhuweixuan@gmail.com Coauthors: Yang Ni / Texas A&M University, Fan Liao / Xiamen University
Abstract:
Graph-based denoising is a critical preprocessing step for analyzing noisy data, particularly in genomic applications where gene regulatory networks exhibit inherent directional dependencies. This paper introduces a novel directed acyclic graph trend filtering framework that leverages higher-order Bayesian networks and graphical shrinkage processes to enhance local adaptivity in signal smoothing along the directed edges of a graph. Unlike traditional graph trend filtering, which assumes undirected graphs, the proposed method explicitly respects the directional structure of graphs, improving interpretability and accuracy in capturing dependencies. We employ a Hamiltonian Monte Carlo algorithm for efficient posterior inference. Through simulations and genomic applications, the proposed method outperforms a state-of-the-art graph trend filtering algorithm in terms of mean squared error reduction and signal-to-noise ratio improvement, demonstrating its utility in recovering true signals while accounting for meaningful structural information.
Keywords:
Directed acyclic graph; Gene regulatory networks; Graph trend filtering; Graphical shrinkage process; Higher-order smoothing.

3. Generalized Bayesian nonparametric clustering framework for high-dimensional spatial data

Speaker: Bencong Zhu, Hong Kong University of Science and Technology Online
Email: mabczhu@ust.hk
Abstract:
The advent of next-generation sequencing-based spatially resolved transcriptomics (SRT) techniques has transformed genomic research by enabling high-throughput gene expression profiling while preserving spatial context. Identifying spatial domains within SRT data is a critical task, with numerous computational approaches currently available. However, most existing methods rely on a multi-stage process that involves ad-hoc dimension reduction techniques to manage the high dimensionality of SRT data. Additionally, many approaches depend on arbitrarily specifying the number of clusters, which can result in information loss and suboptimal downstream analysis. To address these limitations, we propose a novel Bayesian nonparametric mixture of factor analysis (BNPMFA) model, which incorporates a Markov random field-constrained Gibbs-type prior for partitioning high-dimensional spatial omics data. This new prior effectively integrates the spatial constraints inherent in SRT data while simultaneously inferring cluster membership and determining the optimal number of spatial domains. We have established the theoretical identifiability of cluster membership within this framework.
Keywords:
spatially resolved transcriptomics, factor analysis, Gibbs-type priors, Markov random field

4. Spatial panel data model with multi-dimensional heterogeneity: A Bayesian nonparametric approach

Speaker: Jianchao Zhuo, Xiamen University Online
Email: xiaoyihan@xmu.edu.cn Coauthors: Jianchao Zhuo / Xiamen University, Xiaoyi Han / Xiamen University, Weixuan Zhu / Xiamen University
Abstract:
This paper introduces an evolving grouped pattern of heterogeneity into the spatial panel autoregressive model, where group membership is left unrestricted and allowed to change over time. This approach enables dynamic clustering while simultaneously accounting for spatial dependence among cross-sectional units. Our model extends existing heterogeneous spatial panel data models by incorporating a hidden Markov model to represent group membership, which allows membership to transition over time. Rather than assuming static group structures or a fixed number of temporal membership breaks, our approach accommodates uncertainty regarding the number of groups and their evolution. In an empirical study, we estimate the evolving group structure with spatial contemporaneous effects and unobserved heterogeneity to reveal the dynamic structure of industrial upgrading in China.
Keywords:
Bayesian nonparametrics, spatial panel data, group heterogeneity

Advancements in Bayesian Modeling of Complex Data Patterns

Bldg 219 Room 104
4 Presentations
Organizer: Quan Zhou / Texas A&M University
Chair: Hyunwoong Chang / University of Texas at Dallas

1. Region Selection with Spatially Dependent Continuous Shrinkage Prior with an Application to Hurricane Prediction

Speaker: Shuang Zhou, Arizona State University Online
Email: szhou98@asu.edu Coauthors: Zihan Zhu / Case Western Reserve University, Xueying Tang / University of Arizona
Abstract:
In this talk, we focus on developing a novel spatially dependent shrinkage prior for high-dimensional areal data under the Bayesian framework. The motivating problem originates from a climate application that aims at predicting hurricane occurrences in the Atlantic basin region of the United States and understanding the climate system by extracting the significant sub-regions that may be related to hurricane occurrence. A high-dimensional Bayesian Poisson model is mainly discussed in this work, with both the covariate vector and the coefficient vector exhibiting certain spatial correlation patterns. Unfortunately, current Bayesian variable selection techniques hardly capture the spatial correlation structure present in areal data. Therefore, we propose to apply continuous shrinkage priors to Bayesian spatial models, such as the conditional autoregressive (CAR) model, for the purpose of region selection. In this talk, numerical results will be presented to show the robust performance of our method for region selection under various spatial settings, and a real data application is discussed regarding hurricane prediction for the Atlantic basin region from 1950 to 2013.
Keywords:
High-dimensional; Spatial model; Dependent shrinkage prior; Areal data
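The CAR model mentioned in the abstract is a Gaussian prior specified through a sparse precision matrix built from areal adjacency. A minimal sketch of a proper CAR prior on a lattice (the lattice size, tau, and rho are illustrative; the paper's spatially dependent shrinkage construction is not reproduced here):

```python
import numpy as np

# Rook-adjacency matrix W for an m x m lattice of areal units
m = 8
N = m * m
W = np.zeros((N, N))
for i in range(m):
    for j in range(m):
        u = i * m + j
        if i + 1 < m:
            W[u, u + m] = W[u + m, u] = 1.0   # vertical neighbour
        if j + 1 < m:
            W[u, u + 1] = W[u + 1, u] = 1.0   # horizontal neighbour

# Proper CAR precision: Q = tau * (D - rho * W), with D = diag(row sums).
# For |rho| < 1 this matrix is strictly diagonally dominant, hence positive definite.
tau, rho = 1.0, 0.95
D = np.diag(W.sum(axis=1))
Q = tau * (D - rho * W)

# Draw x ~ N(0, Q^{-1}) using the Cholesky factor of the precision matrix
rng = np.random.default_rng(2)
L = np.linalg.cholesky(Q)                  # Q = L L^T
x = np.linalg.solve(L.T, rng.standard_normal(N))
```

Working with the precision (rather than the covariance) keeps sampling and conditioning cheap, since Q inherits the sparsity of the adjacency structure.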

2. A Bayesian Nonparametric Approach to Recycling Classification Models via Clustering under Domain and Category Shift

Speaker: Zeya Wang, University of Kentucky Online
Email: zeya.wang@uky.edu
Abstract:
Recycling pretrained classification models for new domains has been extensively studied under the closed-set assumption that source and target domains share identical label spaces. However, this assumption does not hold when unseen classes appear in the target domain. Addressing this category shift remains particularly challenging in the source-free setting, where access to source data is unavailable. This more general and realistic scenario accounts for unknown target classes with no prior information on their identities or quantity. Most existing methods proposed for this task treat all unknown classes as a single group during both training and evaluation, limiting their capacity to model the underlying structure within the unknown class space. In this work, we present Adapt via Bayesian Nonparametric Clustering (ABC), a novel framework that, unlike prior methods, explicitly achieves fine-grained classification including unknown target classes, offering a more structured vision of the problem. Experiments on standard benchmarks demonstrate ABC’s superior performance and effective clustering of unknown classes.
Keywords:
Bayesian Nonparametric Clustering, Dirichlet Process, Transfer Learning
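The Dirichlet process underlying such clustering frameworks induces a Chinese restaurant process over cluster assignments, which does not fix the number of clusters in advance. A minimal generative sketch (the concentration parameter and sample size are illustrative assumptions; nothing here reproduces the ABC framework itself):

```python
import numpy as np

def crp_partition(n, alpha, rng):
    """Sample cluster assignments for n items from a Chinese restaurant
    process with concentration alpha: item t joins an existing cluster
    with probability proportional to its size, or opens a new cluster
    with probability proportional to alpha."""
    assignments = [0]
    counts = [1]
    for _ in range(1, n):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        z = rng.choice(len(probs), p=probs)
        if z == len(counts):
            counts.append(1)        # open a new cluster
        else:
            counts[z] += 1
        assignments.append(z)
    return assignments, counts

rng = np.random.default_rng(3)
assignments, counts = crp_partition(500, alpha=1.0, rng=rng)
```

The expected number of clusters grows roughly like alpha * log(n), which is what lets a framework of this kind discover an unknown number of unseen target classes rather than lumping them into a single group.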

3. Scalable and robust regression models for continuous proportional data

Speaker: Changwoo Lee, Duke University Online
Email: changwoo.lee@duke.edu Coauthors: Changwoo Lee / Duke University, Benjamin Dahl / Duke University, Otso Ovaskainen / University of Jyvaskyla, David Dunson / Duke University
Abstract:
Beta regression is used routinely for continuous proportional data, but it often encounters practical issues such as a lack of robustness of regression parameter estimates to misspecification of the beta distribution. We develop an improved class of generalized linear models starting with the continuous binomial (cobin) distribution and further extending to dispersion mixtures of cobin distributions (micobin). The proposed cobin regression and micobin regression models have attractive robustness, computation, and flexibility properties. A key innovation is the Kolmogorov-Gamma data augmentation scheme, which facilitates Gibbs sampling for Bayesian computation, including in hierarchical cases involving nested, longitudinal, or spatial data. We demonstrate robustness, ability to handle responses exactly at the boundary (0 or 1), and computational efficiency relative to beta regression in simulation experiments and through analysis of the benthic macroinvertebrate multimetric index of US lakes using lake watershed covariates.
Keywords:
Bayesian, Data augmentation, Generalized linear model, Latent Gaussian model, Markov chain Monte Carlo

4. Bayesian Non-parametrics for Spatio-temporal Data Sets

Speaker: Marcin Jurek, Southern Methodist University Online
Email: mjurek@smu.edu
Abstract:
Many environmental phenomena that evolve through time are extremely complex and difficult to model adequately. For example, modern remote sensing tools in the atmospheric sciences allow us to obtain millions of measurements of certain variables at high frequency. At the same time, representing their temporal evolution requires very complex models and enormous computing power. In addition, these complicated models often take a very long time to develop theoretically and to implement, leaving only a handful of highly specialized researchers able to use these tools. A promising approach to solving these problems is based on Gaussian Process State Space Models (GPSSMs), a Bayesian nonparametric method that imposes a Gaussian process prior on the unknown evolution system. However, existing GPSSM techniques were built for low-dimensional systems and, without adjustments, would be computationally infeasible when the dimension of the system is high. In this talk, we show how this approach can be scaled to high-dimensional environmental problems.
Keywords:
Gaussian process, spatio-temporal data, Bayesian nonparametrics

Advances in High-Dimensional Inference

Bldg 219 Room B119
3 Presentations
Organizer: Xia Wang / University of Cincinnati, USA
Chair: Xia Wang / University of Cincinnati, USA

1. Bayesian Multilevel Network Recovery Selection

Speaker: Inyoung Kim, Virginia Tech In-person
Email: inyoungk@vt.edu Coauthors: Inyoung Kim / Virginia Tech, Mohamed Salem / Virginia Tech
Abstract:
In this talk, we examine multilevel network recovery selection under a two-level structure in which higher-level variables contain lower-level variables nested within them. Due to the dependency structure, variables work together to accomplish certain tasks at both levels. Our main interest is to simultaneously explore variable selection and identify dependency structures among both higher and lower-level variables under a nonadditive model framework. We develop a multi-level nonparametric kernel machine approach with a newly proposed multilevel Ising spike-slab prior, utilizing Markov-chain Monte Carlo and variational Bayes inference to identify multi-level variables and jointly build the network. The variational inference approach is novel in utilizing the sampled dependency structure as the observed variable rather than the response. In addition to the variable selection and network recovery capabilities, our approach can produce both mean and quantile estimations of the original response variable of interest. We demonstrate the advantages of our approach using simulation studies and a genetic pathway-based analysis.
Keywords:
Network Estimation, Quantile Regression, Variable Selection

2. Shape-Constrained Estimation of Standard Errors in Reversible Markov Chains

Speaker: Hyebin Song, Penn State University Online
Email: hps5320@psu.edu
Abstract:
In this talk, I will present novel nonparametric, shape-constrained methods for estimating the autocovariance sequence of a reversible Markov chain. One primary motivation for this problem is the estimation of Markov chain Monte Carlo standard errors (MCSE), which quantify the uncertainty associated with estimates produced by MCMC algorithms. Our proposed estimator is based on the key observation that the autocovariance sequence of a reversible Markov chain can be represented as a moment sequence, which naturally imposes shape constraints such as monotonicity and convexity. I will discuss ordinary and weighted least squares formulations of the estimator, and outline their theoretical properties. In particular, we show that the resulting estimator is strongly consistent for the asymptotic variance of the MCMC sample mean, and l2-consistent for the full autocovariance sequence. I will also present an efficient algorithm for computing the estimator using convex optimization techniques, and I demonstrate its practical utility through empirical studies.
Keywords:
shape-constrained estimation, Markov chain Monte Carlo standard error, autocovariance
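For a reversible AR(1) chain the autocovariance sequence gamma_k = rho^k * gamma_0 is exactly a moment sequence (the moments of a point mass at rho), which is the structure the proposed estimator exploits. Below, a minimal illustration with the empirical autocovariances and a naive truncated-sum variance estimate for contrast (this is not the shape-constrained estimator; rho, n, and the truncation lag are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# Reversible AR(1) chain: x_t = rho * x_{t-1} + e_t, e_t ~ N(0, 1)
rho, n = 0.5, 50_000
gamma0 = 1.0 / (1.0 - rho**2)                 # stationary variance = 4/3
x = np.empty(n)
x[0] = rng.normal(scale=np.sqrt(gamma0))      # start at stationarity
for t in range(1, n):
    x[t] = rho * x[t - 1] + rng.standard_normal()

# Empirical autocovariances; the truth here, gamma_k = rho**k * gamma0,
# is a moment sequence and hence monotone and convex in k
K = 50
xc = x - x.mean()
gamma_hat = np.array([np.dot(xc[: n - k], xc[k:]) / n for k in range(K + 1)])

# Naive truncated estimate of the asymptotic variance of the sample mean,
# sigma^2 = gamma_0 + 2 * sum_k gamma_k (true value: gamma0*(1+rho)/(1-rho) = 4)
sigma2_hat = gamma_hat[0] + 2.0 * gamma_hat[1:].sum()
```

The naive estimator depends on the arbitrary truncation lag K; projecting gamma_hat onto the moment-sequence cone, as in the talk, removes that tuning choice and guarantees consistency.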

3. Hierarchical Skinny Gibbs Sampler in Logistic Regression

Speaker: Xia Wang, University of Cincinnati In-person
Email: xia.wang@uc.edu Coauthors: Eric Odoom / University of Cincinnati, Xuan Cao / University of Cincinnati, Jiarong Ouyang / University of Cincinnati
Abstract:
We introduce a highly scalable tuning-free algorithm for variable selection in logistic regression using Polya-Gamma data augmentation. The proposed method is both theoretically consistent and robust to potential mis-specification of the tuning parameter, achieved through a hierarchical approach. Existing works suitable for high-dimensional settings primarily rely on t-approximation of the logistic density, which is not based on the original likelihood. The proposed method not only builds upon the exact logistic likelihood, offering superior empirical performance, but is also more computationally efficient, particularly in cases involving highly correlated covariates, as demonstrated in a comprehensive simulation study. We apply our method to a gene expression PCR dataset from mice and an RNA-seq dataset from asthma studies in humans. By comparing its performance to existing frequentist and Bayesian methods in variable selection, we demonstrate the competitive predictive capabilities of the Polya-Gamma-based approach. Our results indicate that this method enhances the accuracy of variable selection and improves the robustness of predictions in complex, high-dimensional datasets.
Keywords:
Logistic regression, Pólya-Gamma distribution, Hierarchical Skinny Gibbs, Spike-and-Slab prior
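The Pólya-Gamma augmentation underlying this sampler renders the conditional of the coefficients Gaussian. A sketch of the standard (non-hierarchical, non-sparse) Pólya-Gamma Gibbs step with a plain Gaussian prior and a truncated-series PG(1, z) sampler (the truncation level, prior, and data are illustrative assumptions; the hierarchical Skinny Gibbs construction is the paper's contribution and is not reproduced):

```python
import numpy as np

rng = np.random.default_rng(5)

def rpg1(z, K=200):
    """Approximate draws from PG(1, z) via the truncated series
    omega = (1/(2 pi^2)) * sum_k g_k / ((k - 1/2)^2 + z^2 / (4 pi^2)),
    with g_k ~ Exponential(1)."""
    k = np.arange(1, K + 1)
    denom = (k - 0.5) ** 2 + (z[:, None] ** 2) / (4.0 * np.pi**2)
    g = rng.exponential(size=(z.shape[0], K))
    return (g / denom).sum(axis=1) / (2.0 * np.pi**2)

# Simulated logistic-regression data
n, d = 500, 3
beta_true = np.array([1.0, -1.0, 0.5])
X = rng.standard_normal((n, d))
p = 1.0 / (1.0 + np.exp(-X @ beta_true))
y = rng.binomial(1, p).astype(float)

# Gibbs sampler with beta ~ N(0, 10 I) prior
B0inv = np.eye(d) / 10.0
kappa = y - 0.5
beta = np.zeros(d)
draws = []
for it in range(400):
    omega = rpg1(X @ beta)                        # omega_i | beta ~ PG(1, x_i' beta)
    V = np.linalg.inv(X.T @ (omega[:, None] * X) + B0inv)
    m = V @ (X.T @ kappa)                         # beta | omega ~ N(m, V)
    beta = m + np.linalg.cholesky(V) @ rng.standard_normal(d)
    if it >= 100:
        draws.append(beta)

beta_hat = np.mean(draws, axis=0)
```

Because both conditionals are exact (up to the series truncation), the sampler targets the exact logistic likelihood rather than a t-approximation, which is the property the abstract emphasizes.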

Advances in Likelihood-Free and High-Dimensional Bayesian Inference

Bldg 219 Room 104
4 Presentations
Organizer: Cheng Li / National University of Singapore
Chair: Dongming Huang / National University of Singapore

1. Learning Summary Statistics for Likelihood-Free Bayesian Inference

Speaker: Rong Tang, Hong Kong University of Science and Technology Online
Email: martang@ust.hk
Abstract:
The challenge of performing Bayesian inference in models where likelihood functions are difficult to evaluate but sampling is straightforward has driven the development of likelihood-free methods such as approximate Bayesian computation (ABC). A key element in ABC is the use of summary statistics to reduce the dimensionality of the data, thereby avoiding the curse of dimensionality in nonparametric conditional density estimation as the observed data size grows. However, selecting informative summary statistics that capture the essential information about the parameter contained in the full data remains challenging. In this work, we propose a general framework for learning informative summary statistics and for the subsequent posterior inference based on the summaries. The proposed method provides a global posterior approximation applicable to any dataset, rather than being limited to a single dataset. In addition, more refined posterior approximations for specific datasets can be obtained by integrating this approach with MCMC-ABC methods.
Keywords:
Approximate Bayesian Inference, Intractable likelihood, M-estimation, Dimension Reduction, Summary Statistics
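The role of the summary statistic can be seen in canonical rejection ABC on a toy Gaussian model, where the sample mean happens to be sufficient, so ABC matches the analytic posterior as the tolerance shrinks (all settings below are illustrative; the talk concerns learning such summaries when no sufficient statistic is available):

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy model: y_i ~ N(theta, 1); prior theta ~ N(0, 5^2)
n = 20
y_obs = rng.normal(loc=2.0, size=n)
s_obs = y_obs.mean()                    # summary statistic (sufficient here)

# Rejection ABC: draw theta from the prior, simulate the summary, keep
# theta when the simulated summary lands within eps of the observed one.
# (The sample mean of n iid N(theta, 1) draws is N(theta, 1/n), so the
# simulated summary is drawn directly.)
n_sims, eps = 200_000, 0.05
theta = rng.normal(scale=5.0, size=n_sims)
s_sim = theta + rng.normal(size=n_sims) / np.sqrt(n)
accepted = theta[np.abs(s_sim - s_obs) < eps]

# Analytic posterior mean for comparison (conjugate Normal-Normal)
post_mean = s_obs * n / (n + 1.0 / 25.0)
abc_mean = accepted.mean()
```

With an uninformative summary the accepted sample would revert toward the prior, which is why the choice (or learning) of summaries dominates ABC accuracy.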

2. Bayesian Optimal Change Point Detection in High Dimensions

Speaker: Kyoungjae Lee, Sungkyunkwan University In-person
Email: leekjstat@gmail.com Coauthors: Jaehoon Kim / Sungkyunkwan University, Lizhen Lin / The University of Maryland
Abstract:
We propose the first Bayesian methods for detecting change points in high-dimensional mean and covariance structures. These methods are constructed using pairwise Bayes factors, leveraging modularization to identify significant changes in individual components efficiently. We establish that the proposed methods consistently detect and estimate change points under much milder conditions than existing approaches in the literature. Additionally, we demonstrate that their localization rates are nearly optimal. The practical performance of the proposed methods is evaluated through extensive simulation studies, where they are compared to state-of-the-art techniques. The results show comparable or superior performance across most scenarios. Notably, the methods effectively detect change points whenever signals of sufficient magnitude are present, irrespective of the number of signals. Finally, we apply the proposed methods to genetic and financial datasets, illustrating their practical utility in real-world applications.
Keywords:
High-dimensional change point detection, mean vector, covariance matrix, maximum pairwise Bayes factor.
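The Bayes-factor logic can be illustrated in the simplest univariate case: a single change in a Gaussian mean with known unit variance, scanning the log Bayes factor of "split at t" against "no change" under a conjugate N(0, tau^2) prior per segment (this toy scan is illustrative only; the paper's maximum pairwise Bayes factor construction for high dimensions is not reproduced):

```python
import numpy as np

rng = np.random.default_rng(7)

# Data: unit-variance Gaussian, mean jumps from 0 to 1.5 at t = 60
n, cp, delta = 100, 60, 1.5
y = rng.standard_normal(n)
y[cp:] += delta

def log_marginal(seg, tau2=9.0):
    """Log marginal likelihood of a segment with y_i ~ N(mu, 1) and
    conjugate prior mu ~ N(0, tau2), integrating mu out analytically."""
    m = seg.size
    s = seg.sum()
    return (-0.5 * m * np.log(2 * np.pi)
            - 0.5 * np.log(1.0 + m * tau2)
            - 0.5 * np.dot(seg, seg)
            + tau2 * s**2 / (2.0 * (1.0 + m * tau2)))

# Log Bayes factor of "change at t" against "no change", for each t
lm_full = log_marginal(y)
log_bf = np.array([log_marginal(y[:t]) + log_marginal(y[t:]) - lm_full
                   for t in range(5, n - 5)])
t_hat = 5 + int(np.argmax(log_bf))       # estimated change location
```

A change is declared when the maximal log Bayes factor exceeds a threshold; the paper's modularization applies this pairwise idea component by component in high dimensions.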

3. Applying Multi-Objective Bayesian Optimization to Likelihood-Free Inference

Speaker: David Chen, National University of Singapore In-person
Email: e1039688@u.nus.edu Coauthors: Li Xinwei / National University of Singapore, Euijin Kim / Ajou University, Prateek Bansal / National University of Singapore
Abstract:
Scientific statistical models are often defined by generative processes for simulating synthetic data, but many, such as the sequential sampling models (SSMs) used in psychology and consumer behavior, involve intractable likelihoods. Likelihood-free inference (LFI) methods address this challenge, enabling Bayesian parameter inference for such models. We propose to apply multi-objective Bayesian optimization (MOBO) to LFI for parameter estimation from multi-source data, such as estimating SSM parameters from response times and choice outcomes. This approach models discrepancies for each data source separately, using MOBO to efficiently approximate the joint likelihood. The multivariate approach also identifies conflicting information from different data sources and provides insight into their relative importance for estimating individual parameters. We demonstrate the advantages of MOBO over single-discrepancy methods through a synthetic data example and a real-world application evaluating ride-hailing drivers' preferences for electric vehicle rentals in Singapore. While focused on SSMs, our method generalizes to likelihood-free calibration for other multi-source models.
Keywords:
Likelihood-Free Inference, Sequential Sampling Models, Multi-objective Bayesian Optimization

4. Weighted Fisher Divergence for High-Dimensional Gaussian Variational Inference

Speaker: Linda S. L. Tan, National University of Singapore Online
Email: statsll@nus.edu.sg Coauthors: Aoxiang Chen / National University of Singapore, David J. Nott / National University of Singapore
Abstract:
This talk considers Gaussian variational approximation with sparse precision matrices in high dimensional problems. Although the optimal Gaussian approximation is usually defined as the one closest to the target posterior in Kullback-Leibler divergence, our work studies the weighted Fisher divergence, which focuses on gradient differences between the target posterior and its approximation, with the Fisher and score-based divergences being special cases. We make three main contributions. First, we compare approximations for weighted Fisher divergences under mean-field assumptions for both Gaussian and non-Gaussian targets with Kullback-Leibler approximations. Second, we go beyond mean-field and consider approximations with sparse precision matrices reflecting posterior conditional independence structure for hierarchical models. Using stochastic gradient descent to enforce sparsity, we develop two approaches to minimize the weighted Fisher divergence, based on the reparametrization trick and a batch approximation of the objective. Finally, we examine the performance of our methods for logistic regression, generalized linear mixed models and stochastic volatility models.
Keywords:
Fisher divergence, Score-based divergence, Stochastic gradient descent, Gaussian variational approximation
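The unweighted Fisher divergence F(q||p) = E_q ||grad log q - grad log p||^2 admits a closed form between univariate Gaussians, which makes a Monte Carlo sanity check easy (the weighting schemes and sparse-precision machinery are the talk's contributions and are not shown):

```python
import numpy as np

rng = np.random.default_rng(8)

# q = N(m1, s1^2), p = N(m2, s2^2); the score difference is affine in x:
#   d(x) = grad log q(x) - grad log p(x) = a*x + b,
#   a = 1/s2^2 - 1/s1^2,  b = m1/s1^2 - m2/s2^2,
# so F(q||p) = E_q[d(X)^2] = (a*m1 + b)^2 + a^2 * s1^2 in closed form.
m1, s1, m2, s2 = 1.0, 1.0, 0.0, 2.0
a = 1.0 / s2**2 - 1.0 / s1**2
b = m1 / s1**2 - m2 / s2**2
closed_form = (a * m1 + b) ** 2 + a**2 * s1**2

# Monte Carlo estimate of E_q[d(X)^2]
xs = rng.normal(m1, s1, size=1_000_000)
mc = np.mean((a * xs + b) ** 2)
```

Because the divergence depends only on score differences, minimizing it never requires the normalizing constant of the target, which is what makes it attractive for variational approximation of posteriors.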

Advances in theory and computation methods for Bayesian inference

Bldg 219 Room 103
3 Presentations
Organizer: Rong Tang / Hong Kong University of Science and Technology
Chair: Rong Tang / Hong Kong University of Science and Technology

1. Enhancing Scalability in Bayesian Nonparametric Factor Analysis of Spatiotemporal Data

Speaker: Cheng Li, National University of Singapore Online
Email: stalic@nus.edu.sg Coauthors: Yifan Cheng
Abstract:
We propose novel computational strategies for Bayesian nonparametric latent factor spatiotemporal models that are computationally feasible for moderate to large spatiotemporal data. Although such Bayesian models are flexible and enable spatial clustering, they face a prohibitively high computational cost in posterior sampling when the spatial and temporal dimensions increase to a few hundred. We address this challenge with several speed-up proposals. We integrate a new slice sampling algorithm that permits varying numbers of spatial mixture components, which are guaranteed to be non-increasing across posterior sampling iterations, effectively reducing the number of mixture parameters. Additionally, we introduce a spatial latent nearest-neighbor Gaussian process prior and new sequential updating algorithms for the spatially varying latent variables in the stick-breaking process prior. Our new proposals lead to significantly enhanced computational scalability and storage efficiency while maintaining capabilities for both spatiotemporal prediction and clustering of locations with similar temporal trajectories.
Keywords:
Nearest-neighbor Gaussian process, Sequential updates, Slice sampling

2. Metropolis-Adjusted Subdifferential Langevin Algorithm

Speaker: Ning Ning, Texas A&M University In-person
Email: patning@tamu.edu
Abstract:
The Metropolis-Adjusted Langevin Algorithm (MALA) is a widely used Markov Chain Monte Carlo (MCMC) method for sampling from high-dimensional distributions. However, MALA relies on differentiability assumptions that restrict its applicability. In this paper, we introduce the Metropolis-Adjusted Subdifferential Langevin Algorithm (MASLA), a generalization of MALA that extends its applicability to distributions whose log-densities are locally Lipschitz, generally non-differentiable, and non-convex. We establish the theoretical foundation of MASLA by proving its convergence to a set-valued differential inclusion equation, ensuring well-defined long-run behavior. Furthermore, we evaluate the performance of MASLA by comparing it with other sampling algorithms in settings where they are applicable. Our results demonstrate the effectiveness of MASLA in handling a broader class of distributions while maintaining computational efficiency.
Keywords:
Markov chain Monte Carlo, Metropolis-adjusted Langevin algorithm, Generalized subdifferential, Non-convex and non-smooth optimization
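
The setting can be illustrated with plain MALA in which the gradient is replaced by a subgradient of a non-differentiable log-density. The sketch below samples a standard Laplace target, log p(x) = -|x|; this is an illustrative choice and a vanilla construction, not the MASLA algorithm of the talk:

```python
import numpy as np

rng = np.random.default_rng(1)

def logp(x):
    return -abs(x)          # standard Laplace, up to a constant

def subgrad(x):
    return -np.sign(x)      # a subgradient of log p; 0 at the kink x = 0

def mala_step(x, h):
    """One Metropolis-adjusted Langevin step with step size h."""
    mean_fwd = x + 0.5 * h * subgrad(x)
    y = mean_fwd + np.sqrt(h) * rng.standard_normal()
    mean_bwd = y + 0.5 * h * subgrad(y)
    # log proposal densities (Gaussian with variance h); constants cancel
    log_q_fwd = -(y - mean_fwd) ** 2 / (2 * h)
    log_q_bwd = -(x - mean_bwd) ** 2 / (2 * h)
    log_alpha = logp(y) - logp(x) + log_q_bwd - log_q_fwd
    return y if np.log(rng.random()) < log_alpha else x

x, h, samples = 0.0, 0.5, []
for t in range(30000):
    x = mala_step(x, h)
    if t >= 5000:           # discard burn-in
        samples.append(x)
samples = np.asarray(samples)
print(samples.mean(), np.mean(np.abs(samples)))  # ~0 and ~1 for Laplace(0, 1)
```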

3. Online Bernstein-von Mises Theorem

Speaker: Minwoo Chae, Pohang University of Science and Technology In-person
Email: mchae@postech.ac.kr Coauthors: Jeyong Lee / Pohang University of Science and Technology (POSTECH), Junhyeok Choi / Pohang University of Science and Technology (POSTECH)
Abstract:
Online learning is an inferential paradigm in which parameters are updated incrementally from sequentially available data, in contrast to batch learning, where the entire dataset is processed at once. In this talk, we assume that mini-batches from the full dataset become available sequentially. The Bayesian framework, which updates beliefs about unknown parameters after observing each mini-batch, is naturally suited for online learning. At each step, we update the posterior distribution using the current prior and new observations, with the updated posterior serving as the prior for the next step. However, this recursive Bayesian updating is rarely computationally tractable unless the model and prior are conjugate. When the model is regular, the updated posterior can be approximated by a normal distribution, as justified by the Bernstein-von Mises theorem. We adopt a variational approximation at each step and investigate the frequentist properties of the final posterior obtained through this sequential procedure. Under mild assumptions, we show that the accumulated approximation error becomes negligible once the mini-batch size exceeds a threshold depending on the parameter dimension.
Keywords:
Bayesian online learning, Bernstein-von Mises theorem, variational approximation
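
In the conjugate normal-normal case the recursive update described in the abstract is exact and takes only a few lines. This sketch (with illustrative numbers) checks that streaming over mini-batches reproduces the batch posterior:

```python
import numpy as np

rng = np.random.default_rng(2)

S2 = 1.0                       # known observation variance
y = rng.normal(0.7, np.sqrt(S2), size=1000)
batches = np.split(y, 10)      # mini-batches arriving sequentially

def update(m, v, batch, s2=S2):
    """Conjugate normal-normal update: posterior N(m', v') for the mean
    after observing one mini-batch, starting from prior N(m, v)."""
    n = len(batch)
    v_new = 1.0 / (1.0 / v + n / s2)
    m_new = v_new * (m / v + batch.sum() / s2)
    return m_new, v_new

# online: the updated posterior becomes the prior for the next mini-batch
m, v = 0.0, 100.0
for b in batches:
    m, v = update(m, v, b)

# batch: process the full dataset at once
m_full, v_full = update(0.0, 100.0, y)

print(m, v)          # identical to (m_full, v_full) up to rounding
```

For non-conjugate models this exactness is lost, which is where the variational approximation and the accumulated-error analysis of the talk come in.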

Bayesian Approaches for Modeling Tensors, Distributions and Clustering

Bldg 219 Room 104
3 Presentations
Organizer: Rajarshi Guhaniyogi / Texas A&M University
Chair: Rajarshi Guhaniyogi / Texas A&M University

1. Bayesian Tensor Modeling for Distribution-on-Distribution Regression

Speaker: Justin Strait, Los Alamos National Laboratory Online
Email: jstrait@lanl.gov
Abstract:
In this work, we propose a fully Bayesian model for learning univariate distributional outcomes from several univariate input distributions, as motivated by problems in multifidelity statistical emulation for computer models with distributional output. In particular, we jointly model all quantiles of the outcome distribution by specifying a flexible tensor-valued regression parameter which relates each outcome quantile to all input distributions at their own individual quantiles. Due to the high-dimensionality of the tensor, we assume a low rank structure and propose use of a multi-way shrinkage prior on tensor margins. Distributions are represented by log quantile density (LQD) functions, which have been shown to facilitate specification of functional data models without requiring additional constraints on the functions. We assess our model's performance through a comprehensive simulation study based on multifidelity gas transport simulations through discrete fracture networks (DFN). Then, we apply our model to learn the structure relating low-fidelity gas transport simulation output to corresponding high-fidelity output for the purposes of emulation.
Keywords:
TBD
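
The log quantile density (LQD) representation can be sketched empirically. For an Exp(1) distribution, Q(t) = -log(1-t), so the LQD is psi(t) = log Q'(t) = -log(1-t); the sample size, grid, and difference step below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.exponential(1.0, size=200_000)   # Exp(1) sample

t = np.arange(0.1, 0.91, 0.05)           # interior quantile grid
dt = 0.02
# quantile density q(t) = Q'(t), estimated by central differences
q_hat = (np.quantile(x, t + dt) - np.quantile(x, t - dt)) / (2 * dt)
lqd_hat = np.log(q_hat)                  # empirical log quantile density

lqd_true = -np.log(1 - t)                # analytic LQD, since q(t) = 1/(1-t)
print(np.max(np.abs(lqd_hat - lqd_true)))   # small on this grid
```

The LQD is unconstrained (any continuous function is a valid LQD), which is what lets the model treat distributional data as ordinary functional data.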

2. Efficient Decision Trees for Tensor Regressions

Speaker: Hengrui Luo, Rice University Online
Email: hl180@rice.edu
Abstract:
This talk covers recent progress in tree-based methods for tensor regressions, which are interpretable nonparametric tools that assist in various learning tasks. In particular, we develop single-tree and ensemble methods for tensor-input regressions. We begin with scalar-on-tensor (tensor input, scalar output) regression and design efficient computational strategies to handle tensor inputs, whose search space is more complex than that of vector inputs. We then extend our tensor tree model to tensor-on-tensor (tensor input, tensor output) regressions using ensemble approaches with theoretical guarantees. We will also identify existing challenges in applying tree-based and nonparametric methods to ultra-high-dimensional tensor data, such as MRI/fMRI data with few individuals and possible missing data. We wrap up with a ranking perspective and raise a couple of open questions that arise when extending this perspective to tensor-input scenarios.
Keywords:
TBD

3. Predictor-Informed Bayesian Nonparametric Clustering

Speaker: Jeremy Gaskins, University of Louisville Online
Email: jeremy.gaskins@louisville.edu
Abstract:
In this project we are interested in performing clustering of observations such that the cluster membership is influenced by a set of covariates. To that end, we employ the Bayesian nonparametric Common Atom Model (CAM), which is a nested clustering algorithm that utilizes a (fixed) group membership for each observation to encourage more similar clustering of members of the same group. CAM operates by assuming each group has its own vector of cluster probabilities, which are themselves clustered to allow similar clustering for some groups. We extend this approach by treating the group membership as an unknown latent variable determined by a collection of covariate predictors. Consequently, observations with similar predictor values will be in the same latent group and are more likely to be clustered together than observations with disparate predictors. We propose a pyramid group model that flexibly partitions the predictor space into these latent group memberships. This pyramid model operates similarly to a Bayesian regression tree process except that it uses the same splitting rule at all nodes at the same tree depth, which facilitates improved mixing. We propose a block Gibbs sampler for our model to perform posterior inference. Our methodology is demonstrated in simulation and real data examples. In the real data application, we utilize the RAND Health and Retirement Study to cluster and predict patient outcomes in terms of the number of days spent overnight in the hospital.
Keywords:
TBD

Bayesian Inference in Modern Machine Learning

Bldg 219 Room 105
3 Presentations
Organizer: Quan Zhou / Texas A&M University
Chair: Qiwei Li / The University of Texas at Dallas

1. Semi-supervised Bayesian Spatial Topic Modeling for Identifying Multicellular Spatial Tissue Structures in Multiplex Imaging Data

Speaker: Junsouk Choi, University of Michigan In-person
Email: junsoukchoi@korea.ac.kr Coauthors: Junsouk Choi / Korea University, Jian Kang / University of Michigan, Veerabhadran Baladandayuthapani / University of Michigan
Abstract:
Understanding spatial architecture of tissues is essential for decoding complex interactions within cellular ecosystems and their implications for disease pathology and clinical outcomes. Recent advances in multiplex imaging technologies have enabled high-resolution profiling of cellular phenotypes and their spatial distributions, revealing the pivotal role of tissue structures in modulating immune responses and driving disease progression. To systematically identify and characterize spatial tissue architecture from such data, we propose a novel semi-supervised Bayesian spatial topic model, which integrates spatial Gaussian processes into latent Dirichlet allocation to flexibly model spatial dependencies inherent in tissue organization. Furthermore, by jointly analyzing multiple multiplex images, the proposed approach identifies consistent and coherent spatial structures across samples and incorporates clinical covariates to guide and enhance these discoveries. We applied our method to a lung cancer multiplex imaging dataset, revealing biologically meaningful tumor microenvironment patterns that were consistent across patients and significantly associated with clinical features.
Keywords:
Multiplex imaging, Tissue microenvironment, Spatial topic models, Semi-supervised clustering, Gaussian processes

2. Statistical Modeling of Subcellular Expression Patterns for High-resolution Spatial Transcriptomics

Speaker: Jade Wang, University of Michigan Online
Email: jadewang@umich.edu Coauthors: Jade Wang University of Michigan, Xiang Zhou / University of Michigan
Abstract:
Advances in spatially resolved transcriptomic technologies are producing gene expression data at increasingly higher throughput, scale, and resolution. Identifying patterns of sub-cellular mRNA localization in spatially resolved transcriptomic studies is essential for understanding the cellular dynamics of RNA processing. Here, we present a statistical method, expression gradient-based mRNA localization analysis (ELLA), that integrates high-resolution spatially resolved gene expression data with histology imaging data to identify the sub-cellular mRNA localization patterns in various spatially resolved transcriptomic techniques. ELLA models spatial count data through a nonhomogeneous Poisson process model and relies on an expression gradient function to characterize the sub-cellular mRNA localization pattern, producing effective control of type I errors and yielding high statistical power. Analyzing four spatially resolved transcriptomic datasets using ELLA, we identified genes in multiple cell types with various sub-cellular localization patterns.
Keywords:
Spatially resolved transcriptomics, subcellular mRNA localization, spatial variable genes, ELLA, nonhomogeneous Poisson process
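
A generic sketch of simulating a nonhomogeneous Poisson process by thinning illustrates the model class ELLA builds on (this is not the ELLA method; the one-dimensional intensity function is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate_nhpp(intensity, t_max, lam_max):
    """Simulate a nonhomogeneous Poisson process on [0, t_max] by thinning:
    draw candidates from a rate-lam_max homogeneous process and keep each
    with probability intensity(t) / lam_max (requires intensity <= lam_max)."""
    n_cand = rng.poisson(lam_max * t_max)
    cand = rng.uniform(0.0, t_max, size=n_cand)
    keep = rng.random(n_cand) < intensity(cand) / lam_max
    return np.sort(cand[keep])

lam = lambda t: 1.0 + 0.5 * np.sin(t)        # illustrative intensity
counts = [len(simulate_nhpp(lam, 10.0, 1.5)) for _ in range(2000)]
# E[N] = integral of lam over [0, 10] = 10 + 0.5 * (1 - cos 10) ~ 10.92
print(np.mean(counts))
```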

3. Deep Generative Quantile Bayes

Speaker: Jungeum Kim, University of Chicago Online
Email: jkim255@ncsu.edu Coauthors: Jungeum Kim / The University of Chicago Booth School of Business, Percy S. Zhai / The University of Chicago Booth School of Business, Veronika Ročková / The University of Chicago Booth School of Business
Abstract:
This paper develops a multivariate Bayesian posterior sampling method through generative quantile learning. Our method learns a mapping that can transform (spherically) uniform random vectors into posterior samples without adversarial training. We utilize Monge-Kantorovich depth in multivariate quantiles to directly sample from Bayesian credible sets, a unique feature not offered by typical posterior sampling methods. To enhance training in quantile mapping, we designed a neural network that automatically performs summary statistic extraction. This additional neural network structure has performance benefits including support shrinkage (or posterior contraction) as the observation sample size increases. We demonstrate the usefulness of our approach on several examples where the absence of likelihood renders classical MCMC infeasible. Finally, we provide frequentist theoretical justifications for our quantile learning framework.
Keywords:
Neural posterior sampling, Quantile learning, Conditional vector quantiles, Support shrinkage

Bayesian Learning With Latent Variables

Bldg 219 Room 105
3 Presentations
Organizer: Zehang Li / University of California Santa Cruz
Chair: Zehang Li / University of California Santa Cruz

1. Causal representation learning: Identifying latent causal factors from unstructured data

Speaker: Yixin Wang, University of Michigan Online
Email: yixinw@umich.edu
Abstract:
Causal inference traditionally involves analyzing tabular data where variables like treatment, outcome, covariates, and colliders are manually labeled by humans. However, many complex causal inference problems rely on unstructured data sources such as images, text and videos that depict overall situations. These causal problems require a crucial first step - extracting the high-level latent causal factors from the low-level unstructured data inputs, a task known as "causal representation learning." In this talk, we explore how to identify latent causal factors from unstructured data, whether from passive observations, interventional experiments, or multi-domain datasets.
Keywords:
causal inference, representation learning, latent variables

2. Predictive variational inference: Learn the predictively optimal posterior distribution

Speaker: Yuling Yao, The University of Texas at Austin Online
Email: yyao@austin.utexas.edu
Abstract:
Vanilla variational inference finds an optimal approximation to the Bayesian posterior distribution, but even the exact Bayesian posterior is often not meaningful under model misspecification. We propose predictive variational inference (PVI): a general inference framework that seeks and samples from an optimal posterior density such that the resulting posterior predictive distribution is as close as possible to the true data-generating process, with closeness measured by multiple scoring rules. Because of this objective, predictive variational inference generally does not coincide with, or even attempt to approximate, the Bayesian posterior, even asymptotically. Rather, we interpret it as an implicit hierarchical expansion. Further, the learned posterior uncertainty detects heterogeneity of parameters across the population, enabling automatic model diagnosis. This framework applies to both likelihood-exact and likelihood-free models. We demonstrate its application in real data examples.
Keywords:
generalized Bayes, variational inference, posterior predictive, simulation-based inference

3. Bayesian federated quantification learning under distribution shift

Speaker: Zehang Li, University of California, Santa Cruz Online
Email: lizehang@ucsc.edu Coauthors: Yu Zhu / University of California, Santa Cruz
Abstract:
In regions lacking medically certified causes of death, verbal autopsy (VA) is a critical and widely used tool to ascertain the cause of death through interviews with caregivers. In this talk, we develop a novel Bayesian Federated Learning (BFL) framework for both individual-level cause-of-death classification and population-level quantification of cause-specific mortality fractions (CSMFs) using VA data, in a target domain with limited or no local labeled data. The proposed framework is modular, computationally efficient, and compatible with a wide range of existing VA algorithms as candidate models, facilitating flexible deployment in real-world mortality surveillance systems. We validate the performance of BFL through extensive experiments on two real-world VA datasets under varying levels of distribution shift.
Keywords:
quantification learning, classification, federated learning

Bayesian Methods for Translational Biomedical Research

Bldg 219 Room 105
3 Presentations
Organizer: Yuan Ji / The University of Chicago
Chair: Guanyu Hu / University of Texas Health Science Center at Houston

1. Detection of Cell-type-specific Differentially Methylated Regions in Epigenome-Wide Association Studies

Speaker: Yingying Wei, The Chinese University of Hong Kong Online
Email: ywei@cuhk.edu.hk Coauthors: Ruofan Jian / The Chinese University of Hong Kong, Yingying Wei / The Chinese University of Hong Kong
Abstract:
Epidemiologists are interested in investigating DNA methylation at cytosine-phosphate-guanine (CpG) sites in large cohorts through epigenome-wide association studies (EWAS). However, the observed EWAS data are bulk data with signals aggregated from distinct cell types. As a result, there has recently been active research on detecting cell-type-specific risk CpG sites from EWAS data. However, although existing methods significantly improve detection at the aggregate level (identifying a CpG site as a risk site as long as it is associated with the phenotype in any cell type), they have low power in detecting cell-type-specific associations for EWAS with typical sample sizes. Here, we develop a new method, Fine-scale inference for Differentially Methylated Regions (FineDMR), that borrows strength across nearby CpG sites to improve cell-type-specific association detection. Via a Bayesian hierarchical model built upon Gaussian process functional regression, FineDMR takes advantage of the spatial dependencies between CpG sites. Simulation studies and real data analysis show that FineDMR substantially improves the power to detect cell-type-specific associations in EWAS data.
Keywords:
Bayesian hierarchical model, Gaussian process functional regression, epigenome-wide association studies, cell-type-specific associations

2. Balancing the effective sample size in prior across different doses in the curve-free Bayesian decision-theoretic design for dose-finding trials

Speaker: Dehua Bi, Stanford University In-person
Email: dehuabi@stanford.edu Coauthors: Jiapeng Xu / Department of Biomedical Data Science and Center for Innovative Study Design, Stanford University, Shenghua Kelly Fan / Department of Statistics and Biostatistics, California State University at East Bay, Bee Leng Lee / Department of Mathematics and Statistics, San José State University, Ying Lu / Department of Biomedical Data Science and Center for Innovative Study Design, Stanford University
Abstract:
The primary goal of dose allocation in phase I trials is to minimize patient exposure to subtherapeutic or excessively toxic doses, while accurately recommending a phase II dose that is as close as possible to the maximum tolerated dose (MTD). Fan et al. (2012) introduced a curve-free Bayesian decision-theoretic design (CFBD), which leverages the assumption of a monotonic dose-toxicity relationship without directly modeling dose-toxicity curves. This approach has also been extended to drug combinations for determining the MTD (Lee et al., 2017). Although CFBD has demonstrated improved trial efficiency by using fewer patients while maintaining high accuracy in identifying the MTD, it may artificially inflate the effective sample sizes for the updated prior distributions, particularly at the lowest and highest dose levels. This can lead to either overshooting or undershooting the target dose. In this paper, we propose a modification to CFBD’s prior distribution updates that balances effective sample sizes across different doses. Simulation results show that with the modified prior specification, CFBD achieves a more focused dose allocation at the MTD and offers more precise dose recommendations with fewer patients on average. It also demonstrates robustness relative to other well-known dose-finding designs in the literature.
Keywords:
TBD

3. BESS-Surv: A Bayesian estimator of sample size for survival data

Speaker: Jiaxin Liu, BayeSoft Inc. In-person
Email: . Coauthors: Yuan Ji / The University of Chicago
Abstract:
Bayesian statistics offers a coherent and adaptable framework for quantifying uncertainty, useful in sample size estimation (SSE) and re-estimation (SSR) for clinical trials. Building upon the Bayesian Estimator Sample Size (BESS) methodology (Bi and Ji, 2025), we propose BESS-Surv, an extension tailored for randomized clinical trials with time-to-event outcomes. BESS-Surv estimates the sample size by balancing the evidence in the observed data and the desirable confidence of the trial decisions. It allows for adaptive sample size adjustments based on interim treatment effects—reducing the required sample size when strong effects are observed, or increasing the probability of correct decision-making when effects are less pronounced. Simulation studies and a case study will be presented to illustrate BESS-Surv’s performance.
Keywords:
TBD

Bayesian Spatial Modeling, Variable Selection, and Applications

Bldg 219 Room 103
4 Presentations
Organizer: Keying Ye / University of Texas at San Antonio
Chair: Keying Ye / University of Texas at San Antonio

1. Double-robust Bayesian variable selection and model prediction with spherically symmetric errors

Speaker: Dr. Min Wang, University of Texas at San Antonio In-person
Email: min.wang3@utsa.edu
Abstract:
Response surface methodology is an effective tool for improving an overall manufacturing process so that quality requirements are fulfilled. This work proposes a double-robust Bayesian modeling method that can simultaneously cope with variable selection, model form uncertainty, and non-normality for quality prediction. Double robustness is achieved by specifying the class of spherically symmetric distributions for the errors and accounting for model form uncertainty through Bayesian model averaging. Furthermore, with a special choice of sub-harmonic priors for the regression coefficients, a closed-form expression for the marginal posterior distribution of each candidate model is obtained, which is not only free of the error distribution (other than spherical symmetry) but can also be easily computed using standard software. To provide a better interpretation of the model, a special prior is specified for the model space to maintain and reflect the hierarchical or structural relationships among input variables. The proposed Bayesian method has the properties of variable selection consistency and prediction consistency under Bayesian model averaging. Through numerical experiments and a case study, the proposed double-robust Bayesian modeling method is shown to achieve results superior to those of existing established methods in prediction and variable selection in linear models under different types of error distributions.
Keywords:
Bayesian model averaging; consistency; variable selection; robustness; response surface methodology.

2. Bayesian envelope dimension reduction in spatial-temporal setting

Speaker: Wenbo Wu, University of Texas at San Antonio In-person
Email: wenbo.wu@utsa.edu Coauthors: Reisa Widjaja/University of Wisconsin-Lacrosse, Wenbo Wu/University of Texas at San Antonio, Victor De Oliveira/University of Texas at San Antonio, Keying Ye/University of Texas at San Antonio
Abstract:
The recently developed Bayesian framework for the envelope model enhances the interpretability of model parameters and simplifies the incorporation of prior information. However, the current method presumes an independent error structure within the model and fails to account for the additional complexities introduced by spatially correlated data. Therefore, we propose a Bayesian framework for the spatial envelope model. By incorporating prior information on the model parameters, the proposed method offers a more flexible and robust framework for capturing uncertainty and spatial correlation in high-dimensional data. Furthermore, we investigate appropriate prior specifications for the spatial parameters, study the propriety of the posterior distribution, and develop a posterior sampling method to sample both the posterior distributions of spatial parameters and the conditional posterior distributions associated with the envelope model.
Keywords:
envelope model, spatial correlation, Gibbs sampling

3. Bayesian variable selection in the high-dimensional Cox model with the horseshoe prior

Speaker: Zhuanzhuan Ma, University of Texas at Rio Grande Valley In-person
Email: zhuanzhuan.ma@utrgv.edu Coauthors: Samiran Sinha / Texas A&M University, Mai Dao / Wichita State University, Yu-Chien Bo Ning / Harvard University
Abstract:
With the rapid advancement of computing technology, high-dimensional statistical inference has become increasingly relevant and essential, particularly in genomics and computational biology. In our ongoing work, we focus on variable selection in the Cox proportional hazards model to identify important covariates that associate genomic features with patients’ censored survival times. To ease the computational burden and model complexity, we incorporate the Limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) algorithm to obtain the maximum a posteriori (MAP) estimator, with the regression coefficients regularized by the horseshoe prior. A key theoretical contribution of this work is the derivation of posterior contraction rates under high-dimensional asymptotics, along with sufficient conditions for the consistency of the MAP estimator. We carry out simulations to compare the finite sample performances of the proposed method with the existing methods in terms of the accuracy of variable selection. Finally, a real-data application is provided for illustrative purposes.
Keywords:
Cox's proportional hazards model, variable selection, horseshoe prior, genomic data
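
A minimal sketch of MAP estimation for the Cox partial likelihood via L-BFGS: the horseshoe density has no closed form, so a simple ridge penalty stands in for it here, and the simulated data (n, p, true coefficients, censoring rule) are illustrative assumptions, not the talk's setup:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)

# simulated survival data: only the first two covariates matter
n, p = 200, 5
X = rng.standard_normal((n, p))
beta_true = np.array([1.0, -1.0, 0.0, 0.0, 0.0])
T = rng.exponential(1.0 / np.exp(X @ beta_true))   # hazard rate exp(x'beta)
C = np.quantile(T, 0.7)                            # fixed censoring time
time = np.minimum(T, C)
event = (T <= C).astype(float)

def neg_log_posterior(beta, lam=0.5):
    """Negative Cox log partial likelihood plus a ridge penalty
    (a placeholder for the horseshoe, whose density has no closed form)."""
    xb = X @ beta
    order = np.argsort(-time)                      # descending: nested risk sets
    xb_sorted, ev_sorted = xb[order], event[order]
    # running log-sum-exp over each observation's risk set {j : t_j >= t_i}
    lse = np.logaddexp.accumulate(xb_sorted)
    nll = np.sum(ev_sorted * (lse - xb_sorted))
    return nll + lam * np.sum(beta ** 2)

fit = minimize(neg_log_posterior, np.zeros(p), method="L-BFGS-B")
print(np.round(fit.x, 2))   # signs of the first two coefficients recovered
```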

4. Detecting malware without ground truth

Speaker: Keying Ye, University of Texas at San Antonio In-person
Email: keying.ye@utsa.edu Coauthors: Ambassador Negash/Foot Locker, Inc., Zifei Han/University of International Business and Economics, Min Wang/The University of Texas at San Antonio, Shouhuai Xu/University of Colorado at Colorado Springs
Abstract:
One of the real-world challenges in classification is finding the right tool (or set of tools) to classify objects into their true classes. We sometimes use multiple classification tools and decide the class of a given observation by a majority vote over their results. In this research, we use a Bayesian framework to study the predictive distribution of crowdsourcing data. Simulation studies are carried out using synthetic data with known ground truth, and estimation accuracies are compared. The proposed approach is also applied to a real data set. Furthermore, some theoretical work has been conducted.
Keywords:
Empirical Bayesian analysis, crowdsourcing, malware detection, predictive distributions

Enhancing interpretability with Bayesian approaches: From methodological foundations to applications

Bldg 219 Room 105
3 Presentations
Organizer: Qiwei Li / The University of Texas at Dallas
Chair: Qiwei Li / The University of Texas at Dallas

1. Catalytic priors: Using synthetic data to specify prior distributions in Bayesian analysis

Speaker: Dongming Huang, National University of Singapore In-person
Email: stahd@nus.edu.sg Coauthors: Dongming Huang / National University of Singapore, Feicheng Wang / Harvard University, Donald Rubin / Harvard University, Samuel Kou / Harvard University
Abstract:
Catalytic prior distributions provide general, easy-to-use, and interpretable specifications of prior distributions for Bayesian analysis. They are particularly beneficial when the observed data are insufficient to reliably estimate a complex target model. A catalytic prior distribution stabilizes a high-dimensional "working model" by shrinking it toward a "simplified model." The shrinkage is achieved by supplementing the observed data with a small amount of "synthetic data" generated from a predictive distribution under the simpler model. We apply this framework to generalized linear models, where we propose various strategies for specifying a tuning parameter governing the degree of shrinkage and study the resulting properties. Catalytic priors have simple interpretations and are easy to formulate. In our numerical experiments and a real-world study, the performance of inference based on the catalytic prior is superior to, or comparable with, that of other commonly used prior distributions.
Keywords:
prior specification, regularization, synthetic data, causal inference
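
The synthetic-data mechanism can be sketched for logistic regression as a weighted likelihood, with each synthetic point weighted tau/M so that tau acts as the prior's effective sample size. The data, the simplified (intercept-only) model, and the values of M and tau below are illustrative choices, not the paper's recommendations:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)

# small observed dataset: n = 20, a strong effect of the first covariate only
n, p = 20, 3
X = rng.standard_normal((n, p))
y = (rng.random(n) < 1 / (1 + np.exp(-2.0 * X[:, 0]))).astype(float)

# simplified model: intercept-only, with predictive Bernoulli(mean(y))
p_hat = y.mean()

# synthetic data from the simplified model, with resampled covariates
M = 200
X_syn = X[rng.integers(0, n, size=M)]
y_syn = (rng.random(M) < p_hat).astype(float)

def fit(tau):
    """Weighted logistic MLE: real data has weight 1, each synthetic
    point weight tau / M, so tau is the prior's effective sample size."""
    Xa = np.vstack([np.column_stack([np.ones(len(Z)), Z]) for Z in (X, X_syn)])
    ya = np.concatenate([y, y_syn])
    w = np.concatenate([np.ones(n), np.full(M, tau / M)])
    def nll(b):
        xb = Xa @ b
        return np.sum(w * (np.logaddexp(0.0, xb) - ya * xb))
    return minimize(nll, np.zeros(p + 1), method="L-BFGS-B").x

b_weak, b_strong = fit(0.5), fit(200.0)
# a larger tau shrinks the slopes toward the intercept-only model
print(np.linalg.norm(b_weak[1:]), np.linalg.norm(b_strong[1:]))
```

Even a small tau regularizes the fit enough to keep the estimate finite under data separation, which is one practical motivation for catalytic priors.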

2. Bayesian additive regression trees in transcriptome-wide association studies

Speaker: Min Chen, The University of Texas at Dallas Online
Email: mchen@utdallas.edu Coauthors: Hasini Gammune / University of Texas at Dallas, Ketki Joshi / University of Texas at Dallas, Zhenyu Xuan / University of Texas at Dallas, Min Chen/ University of Texas at Dallas
Abstract:
While genome-wide association studies (GWAS) have identified numerous trait-associated loci, the underlying causal genes and mechanisms often remain unclear. Transcriptome-wide association studies (TWAS) can uncover molecular and causal mechanisms underlying variant-trait associations. Traditional TWAS often focus on one gene at a time and usually ignore gene regulatory relationships. We propose a novel network-based joint modeling approach that employs multivariate Bayesian Additive Regression Trees (BART) in gene networks constructed using Hi-C data. By integrating spatial genomic architecture, it offers a robust approach to uncovering complex regulatory mechanisms. This method identifies biologically meaningful gene clusters, enhancing predictive power and interpretability. In simulations and large-scale genomic datasets from GEUVADIS and GTEx, the model consistently outperforms traditional methods in identifying key SNPs and improving predictive accuracy.
Keywords:
GWAS, TWAS, multivariate BART, gene-network, 3D spatial regulatory interactions

3. AI-powered Bayesian methods for interpretable pathology image analysis

Speaker: Qiwei Li, The University of Texas at Dallas In-person
Email: qiwei.li@utdallas.edu
Abstract:
Statistics traditionally emphasizes human-driven analysis supported by computational tools, whereas AI primarily depends on computer algorithms with guidance from human insight. Nonetheless, each milestone in statistical development opens new frontiers for AI and offers fresh perspectives within statistics itself. This interplay fosters discoveries initiated from either domain that ultimately enrich the other. In this talk, I will illustrate how the integration of statistical spatial and shape analysis and AI enables more interpretable and predictive pathways from histopathology images to clinically meaningful insights. Recent advances in deep learning have made it possible to detect and classify tissue regions and individual cells at scale from digital histopathology images. I will introduce several novel AI-powered Bayesian models for analyzing these images. These methods offer new insights into cell-cell interactions, spatial cellular architecture, and tumor boundaries in the context of cancer progression, supported by multiple case studies.
Keywords:
Bayesian, medical image, histology image, spatially resolved transcriptomics, spatial analysis, shape analysis

General Bayesian approaches and Bayesian predictive syntheses

Bldg 219 Room 104
4 Presentations
Organizer: Genya Kobayashi / Meiji University
Chair: Genya Kobayashi / Meiji University

1. Tree boosting for learning density ratios with generalized Bayesian uncertainty quantification

Speaker: Naoki Awaya, Waseda University In-person
Email: nawaya@waseda.jp Coauthors: Li Ma / Duke University
Abstract:
Learning the density ratio from two samples of observations is a fundamental task for detecting and quantifying differences between two groups. To provide an accurate approximation of density ratios with reasonable computational cost, we propose a variant of the AdaBoost algorithm, historically used for classification and regression tasks. Similar to the standard AdaBoost, the proposed algorithm sequentially updates the estimate by adding tree-based weak learners, while observations are weighted based on the gap between the current density ratio estimate and group allocation. A novel loss function, called the balancing loss, is inspired by the commonly used loss in classification AdaBoost but is tailored to facilitate direct density ratio estimation. Our numerical experiments demonstrate that the proposed algorithm outperforms existing approaches in terms of both accuracy and computational efficiency. Additionally, we introduce a generalized Bayesian framework for uncertainty quantification, allowing for the assessment of statistical significance at each observed point.
Keywords:
Nonparametric inference, tree-based models, generalized Bayes
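
The density-ratio task above can be illustrated with the standard classification trick (this is a baseline identity, not the authors' balancing-loss boosting): with equal sample sizes, q(x)/p(x) equals the odds P(label=1 | x)/P(label=0 | x) of a well-calibrated probabilistic classifier. A minimal numpy sketch using logistic regression as the classifier, with illustrative data and the hypothetical helper `density_ratio`:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two samples: numerator q ~ N(1, 1) labeled 1, denominator p ~ N(0, 1) labeled 0.
x_num = rng.normal(1.0, 1.0, 2000)
x_den = rng.normal(0.0, 1.0, 2000)
x = np.concatenate([x_num, x_den])
y = np.concatenate([np.ones(2000), np.zeros(2000)])

# Fit P(label=1 | x) by logistic regression via plain gradient descent.
X = np.column_stack([np.ones_like(x), x])
beta = np.zeros(2)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    beta += 0.1 * X.T @ (y - p) / len(y)

def density_ratio(x0):
    """Estimate q(x0)/p(x0) via the classifier odds (equal sample sizes)."""
    p = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x0)))
    return p / (1.0 - p)
```

The talk's method replaces the single classifier with sequentially added tree-based weak learners under the balancing loss, and adds generalized Bayesian uncertainty quantification on top of the point estimate.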

2. General Bayesian quantile regression for counts via generative modeling

Speaker: Yuta Yamauchi, Nagoya University In-person
Email: yamauchi.yuta.f0@f.mail.nagoya-u.ac.jp Coauthors: Genya Kobayashi / Meiji University, Shonosuke Sugasawa / Keio University
Abstract:
While count data frequently arise in biomedical applications, such as the length of hospital stay, their discrete nature poses significant challenges for appropriately modeling conditional quantiles. To address this practical difficulty, we propose a novel general Bayesian framework for quantile regression tailored to count data. We seek the regression parameter for the conditional quantile by minimizing the expected loss with respect to the distribution of the conditional quantile of the latent continuous variable associated with the observed count response. By modeling the unknown conditional distribution through a Bayesian nonparametric kernel mixture for the joint distribution of the count response and covariates, we obtain the posterior distribution of the regression parameter via a simple optimization. We numerically show that the proposed method improves on the bias and estimation accuracy of existing crude approaches to count quantile regression. Furthermore, we analyze the length of hospital stay for acute myocardial infarction and demonstrate that the proposed method gives more interpretable and flexible results than existing ones.
Keywords:
Quantile regression, Nonparametric Bayesian learning, Markov chain Monte Carlo, Pitman-Yor process, Rounded Gaussian distribution

3. Dynamic Bayesian regression quantile synthesis for forecasting outlook-at-risk

Speaker: Genya Kobayashi, Meiji University In-person
Email: gkobayashi@meiji.ac.jp Coauthors: Yuta Yamauchi/Nagoya University, Shonosuke Sugasawa / Keio University, Dongu Han / Korea University
Abstract:
This study provides a Bayesian approach to accurate quantile forecasting for time series data through Bayesian predictive synthesis. The proposed dynamic Bayesian regression quantile introduces predictions from agent quantile predictive models as latent factors and lets the weights for the agent models vary over time, constituting a dynamic latent factor model for quantiles. We also consider extending the model to quantile prediction for multiple time series by introducing an additional factor structure on the synthesis weights. The performance of the proposed approach is demonstrated using the US inflation rate and GDP growth rates for several developed countries.
Keywords:
quantile regression, dynamic linear model, Bayesian predictive synthesis

4. Locally adaptive Bayesian spatiotemporal conditional autoregressive model

Speaker: Takahiro Onizuka, Chiba University In-person
Email: onizuka@chiba-u.jp
Abstract:
Spatio-temporal smoothing is a method for estimating the underlying spatio-temporal trend by removing observational noise from the data. In typical applications where multi-year data are observed for multiple adjacent regions, it is common to assume that the spatial trend changes smoothly over space based on the actual geographical adjacency, and that this spatial trend also evolves smoothly over time. However, when the data contain local structural changes that violate these smoothness assumptions, conventional methods may oversmooth the data and fail to capture important local variations. In this study, we propose a novel spatio-temporal smoothing approach that is capable of capturing such local spatio-temporal changes. The effectiveness of the proposed method is demonstrated through numerical experiments.
Keywords:
spatiotemporal model, conditional autoregressive model, shrinkage priors

New development in Bayesian analysis and the applications

Bldg 219 Room 103
4 Presentations
Organizer: Yichuan Zhao / Georgia State University
Chair: Yichuan Zhao / Georgia State University

1. Bayesian empirical likelihood inference for the mean absolute deviation

Speaker: Hongyan Jiang, Huaiyin Institute of Technology Online
Email: hyitjhy@hotmail.com Coauthors: Hongyan Jiang / Huaiyin Institute of Technology, Yichuan Zhao / Georgia State University
Abstract:
The mean absolute deviation (MAD) is a direct measure of the dispersion of a random variable about its mean. In this paper, empirical likelihood (EL) and adjusted EL methods for the MAD are proposed. The Bayesian empirical likelihood, Bayesian adjusted empirical likelihood, Bayesian jackknife empirical likelihood, and Bayesian adjusted jackknife empirical likelihood methods are used to construct credible intervals for the MAD. Simulation results show that the proposed EL method performs better than the jackknife empirical likelihood (JEL) of Zhao et al., and that proper prior information improves the coverage rates of confidence/credible intervals. Two real datasets are used to illustrate the new procedures.
Keywords:
Adjusted empirical likelihood, Bayesian empirical likelihood, Bayesian jackknife empirical likelihood, empirical likelihood, mean absolute deviation
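
For context, the MAD itself is simple to compute, and a percentile-bootstrap interval gives a crude frequentist baseline (this is not the EL/JEL or Bayesian credible interval machinery of the talk; `bootstrap_ci` and all settings below are illustrative):

```python
import numpy as np

def mad(x):
    """Mean absolute deviation about the sample mean."""
    return np.mean(np.abs(x - x.mean()))

def bootstrap_ci(x, stat, level=0.95, B=2000, seed=1):
    """Percentile bootstrap interval for a statistic of one sample."""
    rng = np.random.default_rng(seed)
    reps = np.array([stat(rng.choice(x, size=len(x), replace=True))
                     for _ in range(B)])
    lo, hi = np.quantile(reps, [(1 - level) / 2, (1 + level) / 2])
    return lo, hi

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, 500)   # for N(0, 4), true MAD = 2 * sqrt(2/pi) ~ 1.596
lo, hi = bootstrap_ci(x, mad)
```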

2. Bayesian nonparametric model for heterogeneous treatment effects with zero-inflated data

Speaker: Yisheng Li, MD Anderson Cancer Center Online
Email: ysli@mdanderson.org Coauthors: Chanmin Kim / SungKyunKwan University, Ting Xu / The University of Texas MD Anderson Cancer Center, Zhongxing Liao / The University of Texas MD Anderson Cancer Center
Abstract:
Precision medicine calls for statistical models that identify and evaluate potentially heterogeneous treatment effects in a robust manner. Oft-cited existing methods for assessing treatment effect heterogeneity are based on parametric models with interactions or conditioning on covariate values, the performance of which is sensitive to the omission of important covariates and/or the choice of their values. We propose a new Bayesian nonparametric (BNP) method for estimating heterogeneous causal effects in studies with zero-inflated outcome data, which arise commonly in health-related studies. We employ the enriched Dirichlet process (EDP) mixture in our BNP approach, establishing a connection between an outcome and a covariate DP mixture. This enables us to estimate posterior distributions concurrently, facilitating flexible inference on individual causal effects. We show in simulation studies that the proposed method outperforms two other BNP methods for inference on conditional average treatment effects. We apply the proposed method to a study of the relationship between heart radiation dose and the blood level of a cardiac toxicity biomarker.
Keywords:
Enriched Dirichlet process, high-sensitivity cardiac troponin T, heterogeneous effects, missing at random

3. BiGER: Fast and Accurate Bayesian Rank Aggregation for Genomic Data

Speaker: Sherry Wang, University of Texas at Arlington Online
Email: xinlei.wang@uta.edu Coauthors: Kaiwen Wang / Southern Methodist University, Yuqiu Yang / University of Texas Southwestern Medical Center, Yusen Xia / Georgia State University, Guanghua Xiao / University of Texas Southwestern Medical Center, Johan Lim / Seoul National University
Abstract:
With the rise of large-scale genomic studies, large gene lists targeting important diseases are increasingly common. While evaluating each study individually gives valuable insights on specific samples and study designs, the wealth of available evidence in the literature calls for robust and efficient meta-analytic methods. Crucially, the diverse assumptions and experimental protocols underlying different studies require a flexible but rigorous method for aggregation. To address these issues, we propose BiGER, a fast and accurate Bayesian rank aggregation method for the inference of latent global rankings. Unlike existing methods in the field, BiGER accommodates mixed gene lists with top-ranked and top-unranked genes as well as bottom-tied and missing genes, by design. Using a Bayesian hierarchical framework combined with variational inference, BiGER efficiently aggregates large-scale gene lists with high accuracy, while providing valuable insights into source-specific reliability for researchers. Through both simulated and real datasets, we show that BiGER is a useful tool for reliable meta-analysis in genomic studies.
Keywords:
Bayesian hierarchical modeling, gene expression, Gibbs sampling, meta-analysis, posterior inference, variational inference

4. Interval Estimation for the Youden Index and Optimal Cut-Off Point in AUC-Based Optimal Combinations of Multivariate Normal Biomarkers With Covariates

Speaker: Yichuan Zhao, Georgia State University Online
Email: yichuan@gsu.edu Coauthors: Hossein Nadeb/Yazd University
Abstract:
In this talk, we present interval estimation methods for the Youden index and the optimal cut‐off point in the context of AUC‐based optimal combinations of multivariate normally distributed biomarkers, considering the presence of covariates. We propose a generalized pivotal confidence interval, a Bayesian credible interval, and several bootstrap confidence intervals for both the Youden index and its corresponding cut‐off point. To evaluate the performance of these confidence and credible intervals, we conducted a Monte Carlo simulation study. Finally, we illustrate the proposed methods using a diabetic dataset.
Keywords:
Bayesian credible interval, covariates, generalized pivotal variable, linear combinations, ROC curve, Youden index

Recent Advances in Bayesian Methods for Spatial Biomedical Data

Bldg 219 Room 102
4 Presentations
Organizer: Yingying Wei / The Chinese University of Hong Kong
Chair: Yingying Wei / The Chinese University of Hong Kong

1. AI-powered Bayesian Methods for Analyzing Spatial Biomedical Data

Speaker: Li Qiwei, The University of Texas at Dallas In-person
Email: qiwei.li@utdallas.edu
Abstract:
Statistics relies more on human analysis aided by computers, while AI relies more on computer algorithms aided by humans. Nevertheless, each milestone in statistics opens new avenues for AI and yields new insights within statistics itself, so that findings initiated in either field benefit the other. In this talk, I will demonstrate how the marriage between spatial statistics and AI leads to more explainable and predictive paths from raw spatial biomedical data to conclusions. The first part concerns the spatial modeling of AI-reconstructed pathology images. Recent developments in deep-learning methods have enabled us to identify and classify individual cells from digital pathology images at large scale. The resulting cell locations can be viewed as a realization of a marked point process. I will present two novel Bayesian models for characterizing spatial correlations in a multi-type spatial point pattern. The new methods provide a unique perspective on the role of cell-cell interactions in cancer progression, demonstrated through a case study of 188 lung cancer patients. The second part concerns the spatial modeling of emerging spatially resolved transcriptomics data. Recent technological breakthroughs in spatial molecular profiling have enabled comprehensive molecular characterization of single cells while preserving their spatial and morphological contexts. This advances our understanding of molecular and cellular spatial organization in tissues, fueling the next generation of scientific discovery. I will focus on integrating information from AI tools into Bayesian models to address key questions in this field, such as spatially variable gene detection and spatial domain identification.
Keywords:
TBD

2. Differential Inference for Spatial Transcriptomics Data

Speaker: Song Fangda, The Chinese University of Hong Kong, Shenzhen In-person
Email: sfd1994895@gmail.com Coauthors: Jean Yang / University of Sydney, Yingying Wei / The Chinese University of Hong Kong
Abstract:
Spatial transcriptomics experiments are becoming increasingly complex, involving patients or tissues collected under multiple biological conditions. Comparative studies of spatial transcriptomics data enable the discovery of differing spatial expression patterns across biological conditions. Moreover, spatial expression patterns are highly heterogeneous across subregions within a single slide. We therefore develop a Bayesian hierarchical model on spatial metrics of subregions to simultaneously classify the subregions and conduct differential inference for spatial summary statistics. Subregion classification and differential inference for spatial metrics together enable us to identify perturbations induced by disease conditions and to explain the underlying biological mechanisms.
Keywords:
Bayesian Hierarchical Modelling, Spatial Transcriptomics Experiments, Differential Inference

3. Spatially Aware Adjusted Rand Index for Evaluating Spatial Transcriptomics Clustering

Speaker: Yan Yinqiao, Beijing University of Technology In-person
Email: yinqiaoyan@bjut.edu.cn Coauthors: Xiangnan Feng / Fudan University, Xiangyu Luo / Renmin University of China
Abstract:
Spatial transcriptomics (ST) clustering plays a crucial role in elucidating tissue spatial heterogeneity, and an accurate ST clustering result can greatly benefit downstream biological analyses. As various ST clustering approaches have been proposed in recent years, comparing their clustering accuracy has become important in benchmarking studies. However, the widely used metric, the adjusted Rand index (ARI), entirely ignores the spatial information in ST data, which prevents it from fully evaluating spatial ST clustering methods. We propose a spatially aware Rand index (spRI) and its adjusted version (spARI) that incorporate spatial distance information. The spatially aware feature of spRI adaptively differentiates disagreement object pairs based on their distances, providing an evaluation metric that favors spatial coherence of clustering. spARI is obtained by adjusting spRI for random chance so that its expectation is zero under an appropriate null model. Statistical properties of spRI and spARI are discussed. Applications to a simulation study and two ST datasets demonstrate the improved utility of spARI over ARI in evaluating ST clustering methods.
Keywords:
clustering evaluation, hypergeometric distribution, Rand index, spatial transcriptomics
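
The baseline metric being critiqued, the standard ARI, depends only on the label contingency table and discards spatial coordinates entirely; a self-contained sketch makes that limitation concrete (the spatially aware spRI/spARI weighting is defined in the authors' paper and is not reproduced here):

```python
import numpy as np
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Standard ARI from the contingency table; note it never sees
    the spatial locations of the objects."""
    a_vals, a_inv = np.unique(labels_a, return_inverse=True)
    b_vals, b_inv = np.unique(labels_b, return_inverse=True)
    table = np.zeros((len(a_vals), len(b_vals)), dtype=int)
    for i, j in zip(a_inv, b_inv):
        table[i, j] += 1
    n = int(table.sum())
    sum_comb = sum(comb(int(v), 2) for v in table.ravel())
    sum_a = sum(comb(int(v), 2) for v in table.sum(axis=1))
    sum_b = sum(comb(int(v), 2) for v in table.sum(axis=0))
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    return (sum_comb - expected) / (max_index - expected)
```

Two clusterings that disagree only near a spatial boundary and two that disagree in scattered spots can receive the same ARI, which is exactly the behavior spRI/spARI are designed to distinguish.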

4. Tissue Annotation in Lung Cancer Histopathology with Batch Effect Correction

Speaker: Zhai Yibo, The Chinese University of Hong Kong In-person
Email: yibo.zhai@link.cuhk.edu.hk Coauthors: Yingying Wei / The Chinese University of Hong Kong, Shuangge Ma / Yale University
Abstract:
Histopathological analysis of hematoxylin and eosin (H&E)-stained slides plays a critical role in lung cancer diagnosis and prognosis. However, accurate annotation of tumor regions and other tissue structures typically requires expert pathologists, making annotation of large numbers of images time-consuming, labor-intensive, and expensive. In this study, we propose a Bayesian model that facilitates automated annotation of diverse tissue regions, including regions of interest (ROIs), in lung cancer H&E slides. At the same time, our method corrects for batch effects, such as staining variability and scanner-induced color differences, ensuring that the extracted image features are comparable across datasets. Application of the proposed model to lung cancer H&E slides shows that the features it identifies provide meaningful biological interpretations in the context of epidermal growth factor receptor (EGFR) mutations, which are clinically important for lung cancer diagnosis and therapy selection.
Keywords:
Pathology images, Batch effects, Model-based clustering

Recent Developments in a Bayesian Framework

Bldg 219 Room 105
4 Presentations
Organizer: Minwoo Kim / Pusan National University
Chair: Younggeun Kim / Michigan State University

1. Extended Fiducial Inference: Toward an Automated Process of Statistical Inference

Speaker: Sehwan Kim, Ewha Womans University Online
Email: sehwankim@ewha.ac.kr Coauthors: Faming Liang / Purdue University, Yan Sun / University of Pennsylvania
Abstract:
While fiducial inference was widely considered R.A. Fisher's big blunder, the goal he initially set, 'inferring the uncertainty of model parameters on the basis of observations', has been continually pursued by many statisticians. To this end, we develop a new statistical inference method called extended fiducial inference (EFI). The new method achieves the goal of fiducial inference by leveraging advanced statistical computing techniques while remaining scalable for big data. EFI involves jointly imputing the random errors realized in the observations using stochastic gradient Markov chain Monte Carlo and estimating the inverse function using a sparse deep neural network (DNN). The consistency of the sparse DNN estimator ensures that the uncertainty embedded in the observations is properly propagated to the model parameters through the estimated inverse function, thereby validating downstream statistical inference. Compared to frequentist and Bayesian methods, EFI offers significant advantages in parameter estimation and hypothesis testing, especially when outliers are present in the observations. EFI also provides an innovative framework for semi-supervised learning.
Keywords:
Complex Hypothesis Test, Markov chain Monte Carlo, Semi-Supervised Learning, Sparse deep learning, Uncertainty Quantification

2. Bayesian Order-based Structure Learning for multiple DAGs

Speaker: Hyunwoong Chang, University of Texas at Dallas In-person
Email: hwchang@utdallas.edu Coauthors: Hyunwoong Chang / University of Texas at Dallas, Fariha Taskin / University of Texas at Dallas
Abstract:
We propose a novel order-based structure learning method for jointly estimating multiple directed acyclic graph (DAG) models. While identifiability issues are expected to be alleviated by pooling information from multiple datasets, high computational costs make it difficult to incorporate between-dataset variability into the model. Our proposed model addresses this challenge with a Bayesian framework tailored to fast Markov chain Monte Carlo (MCMC) sampling, scalable to a large number of datasets. The proposed approach remains effective even when the faithfulness assumption, widely considered unrealistic in practice, is violated. We demonstrate the efficacy of the method through extensive simulation studies and showcase its practical advantages with a case-control study of major depressive disorder (MDD) using single-cell RNA sequencing data.
Keywords:
Bayesian Statistics, graphical model, structure learning, Markov chain Monte Carlo, genetics

3. Flexible modeling of between-location heterogeneity in time-series count data using a Nonparametric Bayesian Poisson hurdle model: a case study in Japan

Speaker: Jinsu Park, Chungbuk National University In-person
Email: jspark@chungbuk.ac.kr Coauthors: Kisung Sim / Samsung SDS, Daewon Yang / Chungnam National University, Yoonhee Kim / University of Tokyo, Masahiro Hashizume / University of Tokyo, Yeonseung Chung / Korea Advanced Institute of Science and Technology
Abstract:
In environmental epidemiology, the short-term association between temperature and suicide has been examined by analyzing daily time-series data on suicide and temperature collected from multiple locations. A two-stage meta-regression has been conventionally used. A Poisson regression is fitted for each location in the first stage, and location-specific association parameter estimates are pooled, adjusted, and regressed onto location-specific variables using meta-regressions in the second stage. However, several limitations of the two-stage approaches have been reported. In this study, we propose a nonparametric Bayesian Poisson hurdle random effects model to investigate heterogeneity in the temperature-suicide association across multiple locations. The proposed model consists of two parts, binary and positive, with random coefficients specified to describe heterogeneity. Furthermore, random coefficients combined with location-specific indicators were assumed to follow a Dirichlet process mixture of normals to identify the subgroups. The proposed methodology was validated through a simulation study and applied to data from a nationwide temperature-suicide association study in Japan.
Keywords:
Bayesian inference, temperature-suicide association, Poisson hurdle model, Dirichlet process mixture, model-based clustering

4. Posterior asymptotics of high-dimensional spiked covariance model with inverse-Wishart prior

Speaker: Kwangmin Lee, Chonnam National University In-person
Email: klee564@jnu.ac.kr Coauthors: Sewon Park/Samsung SDS, Seongmin Kim/Seoul National University, Jaeyong Lee/Seoul National University
Abstract:
We study Bayesian inference for the spiked eigenstructure of high-dimensional covariance matrices, where a small number of eigenvalues (spikes) are significantly larger than the remaining bulk. Our goal is to estimate the spiked eigenvalues, their associated eigenvectors, and the number of spikes. To this end, we impose an inverse-Wishart prior on the unknown covariance matrix and derive the posterior distributions of the eigenvalues and eigenvectors by transforming the posterior distribution of the full covariance matrix. We show that the degrees of freedom parameter in the inverse-Wishart prior governs the shrinkage of the eigenvalues, and we propose a data-driven choice of this hyperparameter to correct bias in the estimated eigenvalues. Furthermore, we prove that under the spiked covariance model, the posterior distribution of the spiked eigenvectors concentrates around the true eigenvectors, and in the single-spike setting, the posterior achieves minimax optimality. We also introduce a Bayesian method for selecting the number of spikes, which provides a principled way to quantify uncertainty in determining the intrinsic dimension of principal components.
Keywords:
High-dimensional, Bayesian, Principal component analysis
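
As a simple non-Bayesian point of reference for the spiked model: with unit-variance noise, sample eigenvalues from the bulk concentrate below the Marchenko-Pastur edge (1 + sqrt(p/n))^2, so counting eigenvalues above that edge (with a small safety margin) gives a crude estimate of the number of spikes. A numpy sketch with illustrative dimensions; the talk's inverse-Wishart posterior approach is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, k = 50, 1000, 3

# Spiked covariance: identity bulk plus k = 3 large spikes (25, 16, 9).
spikes = np.array([25.0, 16.0, 9.0])
U = np.linalg.qr(rng.normal(size=(p, k)))[0]          # orthonormal spike directions
Sigma = np.eye(p) + U @ np.diag(spikes - 1.0) @ U.T
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

S = np.cov(X, rowvar=False)                            # sample covariance
evals = np.sort(np.linalg.eigvalsh(S))[::-1]           # eigenvalues, descending

# Crude spike count: eigenvalues above the Marchenko-Pastur bulk edge
# for unit-variance noise, with a 10% safety margin.
bulk_edge = (1.0 + np.sqrt(p / n)) ** 2
k_hat = int(np.sum(evals > 1.1 * bulk_edge))
```

The Bayesian treatment in the talk additionally corrects the shrinkage bias of the estimated eigenvalues and quantifies uncertainty in both the eigenvectors and the spike count.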

Recent advances in Bayesian computational methods

Bldg 219 Room 103
3 Presentations
Organizer: Shijia Wang / ShanghaiTech University
Chair: Shijia Wang / ShanghaiTech University

1. When Transfer Learning Meets Bayesian Analysis

Speaker: Xiaozhou Wang, East China Normal University In-person
Email: xzwang@sfs.ecnu.edu.cn
Abstract:
Transfer learning leverages knowledge gained from related source domains to improve performance on tasks in a target domain. In this talk, I will introduce some research results on the integration of transfer learning with Bayesian analysis. The proposed algorithms and the corresponding theoretical properties are presented. Numerical experiments are conducted to demonstrate the effectiveness of the methodologies.
Keywords:
Bayesian statistics, transfer learning, machine learning

2. Improving approximate Bayesian computation methods using machine learning methods

Speaker: Shijia Wang, ShanghaiTech University In-person
Email: wangshj1@shanghaitech.edu.cn
Abstract:
We address a challenge of Markov chain Monte Carlo (MCMC) algorithms within the approximate Bayesian computation (ABC) framework, which often get trapped in local optima due to their inherently local exploration mechanism. We propose a novel Global-Local ABC-MCMC algorithm that combines the "exploration" capabilities of global proposals with the "exploitation" finesse of local proposals. We integrate iterative importance resampling into the likelihood-free framework to establish an effective global proposal distribution, and select the optimal mixture of global and local moves based on a relative version of the expected squared jumping distance via sequential optimization. Furthermore, we propose two adaptive schemes: the first uses a normalizing-flow-based probabilistic distribution learning model to iteratively improve the importance sampling proposal; the second optimizes the efficiency of the local sampler by utilizing Langevin dynamics and common random numbers. We numerically demonstrate that our method improves sampling efficiency and achieves more reliable convergence for complex posteriors.
Keywords:
approximate Bayesian computation, Markov chain Monte Carlo, Global-Local proposal
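
The likelihood-free setting can be illustrated with the most basic ABC rejection sampler (plain rejection, not the Global-Local ABC-MCMC proposed in the talk); the model, summary statistic, and tolerance below are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
y_obs = rng.normal(2.0, 1.0, 100)        # "observed" data, true mean 2
s_obs = y_obs.mean()                     # summary statistic

# ABC rejection: draw theta from the prior, simulate pseudo-data,
# keep theta when the simulated summary is within eps of the observed one.
accepted = []
for _ in range(20000):
    theta = rng.normal(0.0, 5.0)         # N(0, 25) prior
    s_sim = rng.normal(theta, 1.0, 100).mean()
    if abs(s_sim - s_obs) < 0.05:        # tolerance eps
        accepted.append(theta)
accepted = np.asarray(accepted)
```

Rejection wastes almost every prior draw, and naive ABC-MCMC fixes that at the cost of local trapping, which is precisely the trade-off the global-local proposal mixture is designed to resolve.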

3. Poisson Hyperplane Processes with Rectified Linear Units

Speaker: Shufei Ge, ShanghaiTech University In-person
Email: geshf@shanghaitech.edu.cn
Abstract:
Neural networks have shown state-of-the-art performance in various classification and regression tasks. Rectified linear units (ReLU) are often used as activation functions for the hidden layers of a neural network. In this article, we establish the connection between Poisson hyperplane processes (PHP) and two-layer ReLU neural networks. We show that the PHP with a Gaussian prior is an alternative probabilistic representation of a two-layer ReLU neural network. In addition, we show that a two-layer neural network constructed by the PHP is scalable to large-scale problems via decomposition propositions. Finally, we propose an annealed sequential Monte Carlo algorithm for Bayesian inference. Our numerical experiments demonstrate that the proposed method outperforms the classic two-layer ReLU neural network.
Keywords:
neural network, Poisson hyperplane processes, sequential Monte Carlo
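
The hyperplane view of a two-layer ReLU network can be made concrete in one dimension: each unit max(0, a_k x + b_k) is a hinge at the hyperplane (here, a breakpoint) a_k x + b_k = 0, and the network is a weighted sum of such hinges. A random-feature sketch of this representation, with random hyperplanes and a least-squares output layer (this illustrates the representation only, not the PHP prior or the annealed SMC inference of the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random hyperplanes (a_k, b_k) define ReLU units max(0, a_k * x + b_k);
# fitting only the output weights is a random-feature regression.
K = 200
a = rng.choice([-1.0, 1.0], K)
b = rng.uniform(-3.0, 3.0, K)

x = np.linspace(-3, 3, 400)
Phi = np.maximum(0.0, np.outer(x, a) + b)      # (400, K) hidden activations
y = np.sin(x)                                   # target function

w, *_ = np.linalg.lstsq(Phi, y, rcond=None)     # least-squares output layer
y_hat = Phi @ w
err = np.max(np.abs(y_hat - y))                 # max fitting error
```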

Recent development in Bayesian methodology and computation

Bldg 219 Room 103
4 Presentations
Organizer: Yeongjin Gwon / University of Nebraska Medical Center
Chair: Jaewoo Park / Yonsei University

1. A Stein Gradient Descent Approach for Doubly Intractable Distributions

Speaker: Jaewoo Park, Yonsei University In-person
Email: jwpark88@yonsei.ac.kr Coauthors: Heesang Lee / Yonsei University, Songhee Kim / Yonsei University, Bokgyeong Kang / Dongguk University
Abstract:
Bayesian inference for doubly intractable distributions is challenging because such distributions involve intractable terms that are functions of the parameters of interest. Although several alternatives have been developed for such models, they are computationally intensive due to repeated auxiliary-variable simulations. We propose a novel Monte Carlo Stein variational gradient descent (MC-SVGD) approach for inference on doubly intractable distributions. Through an efficient gradient approximation, our MC-SVGD approach rapidly transforms an arbitrary reference distribution to approximate the posterior distribution of interest, without requiring any predefined variational distribution class for the posterior. The transport map is obtained by minimizing the Kullback-Leibler divergence between the transformed and posterior distributions in a reproducing kernel Hilbert space (RKHS). We also investigate the convergence rate of the proposed method. We illustrate the method on challenging examples, including a Potts model, an exponential random graph model, and a Conway-Maxwell-Poisson regression model.
Keywords:
doubly-intractable distributions, variational inference, Markov chain Monte Carlo, kernel Stein discrepancy, importance sampling
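
For orientation, plain SVGD (without the Monte Carlo gradient approximation for the intractable term that the talk develops) transports particles with the kernelized update phi(x_i) = (1/n) sum_j [k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i)]. A 1-D numpy sketch for a tractable Gaussian target, with illustrative step sizes and bandwidth heuristic:

```python
import numpy as np

def rbf_kernel(x, h):
    """RBF kernel matrix K[i, j] = k(x_i, x_j) and its gradient w.r.t. x_j."""
    diff = x[:, None] - x[None, :]
    K = np.exp(-diff**2 / (2 * h**2))
    gradK = diff / h**2 * K            # d/dx_j k(x_i, x_j): the repulsive term
    return K, gradK

def svgd(score, x0, steps=500, lr=0.05):
    """SVGD for a 1-D target with score function `score` = grad log p."""
    x = x0.copy()
    n = len(x)
    for _ in range(steps):
        h = np.median(np.abs(x[:, None] - x[None, :])) + 1e-6  # bandwidth heuristic
        K, gradK = rbf_kernel(x, h)
        phi = (K @ score(x) + gradK.sum(axis=1)) / n
        x += lr * phi
    return x

rng = np.random.default_rng(0)
# Target N(2, 1): score(x) = -(x - 2).
particles = svgd(lambda x: -(x - 2.0), rng.normal(0.0, 1.0, 50))
```

For a doubly intractable posterior the score itself contains the intractable term, which is where the MC-SVGD gradient approximation of the talk comes in.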

2. Application to Bayesian Statistical Methods in Environmental Epidemiology

Speaker: Daewon Yang, Chungnam National University In-person
Email: summet73y.da@gmail.com Coauthors: Daewon Yang / Department of Information and Statistics, Chungnam National University, Daejeon, South Korea, Taeryon Choi / Department of Statistics, Korea University, Seoul, South Korea, Jinsu Park / Department of Information Statistics, Chungbuk National University, Chungju, South Korea, Hohyun Jung / School of Mathematics, Statistics and Data Science, Sungshin Women's University, Seoul, South Korea, Yeonseung Chung / Department of Mathematical Sciences, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Abstract:
Numerous studies have shown a U- or J-shaped association between ambient temperature and mortality, which has shifted over time due to climate change and adaptation. Our preliminary analysis found that these temporal changes are not continuous, challenging the linearity assumption often made in conventional two-stage models. Segmenting the study period into sub-periods can lead to reduced power, and assuming normality in second-stage mixed-effects models increases sensitivity to outliers. To overcome these issues, we propose a robust Bayesian change-point model. In the first stage, we estimated temperature-mortality relationships using distributed lag nonlinear models across three-year sub-periods. In the second stage, we applied mixed-effects modeling with a Gaussian-t mixture to account for outliers, followed by hierarchical Bayesian change-point modeling with a no-return constraint. Our approach identified two significant change points, indicating a recent decline in heat-mortality risk and suggesting successful adaptation in Japan. This framework enables robust detection of temporal shifts in functional data and can be applied beyond climate-related mortality studies.
Keywords:
Change-point model, Bayesian nonparametrics, environmental epidemiology, temperature-mortality association

3. Bayesian Envelope Models for Biomedical Data: A Novel Approach to Dimension Reduction

Speaker: Yeonhee Park, Sungkyunkwan University In-person
Email: yeonheepark@skku.edu
Abstract:
Bayesian envelope models provide a powerful approach to dimension reduction in high-dimensional biomedical data. Originating from Cook et al. (2010), the envelope model improves multivariate regression by identifying material and immaterial parts of variation relevant to the response. This presentation introduces two recent advancements in Bayesian envelope methodology and their applications in imaging genetics and cell line data analysis for targeted treatments. The first is the Bayesian Simultaneous Partial Envelope Model, which reduces dimensions of both multivariate responses and selected predictors, preserving important genetic signals while removing irrelevant variation. The second is a Bayesian method for the Multivariate Probit Model with a Latent Envelope, tailored for correlated binary outcomes. This model, the first envelope approach for non-continuous multivariate responses under a generalized linear model framework, addresses identifiability challenges by reparameterizing the space with constraints and applying essential identifiability. Applications to biomedical data show both models reduce noise while maintaining key signals, enhancing inference and prediction.
Keywords:
Latent envelope, multivariate regression, multivariate probit model, reducing subspace, simultaneous envelope.

4. Bayesian Regression for Aggregate Ordinal Outcomes with Imprecise Categories

Speaker: Yeongjin Gwon, University of Nebraska Medical Center Online
Email: yeongjin.gwon@unmc.edu Coauthors: Ming-Hui Chen / University of Connecticut, Joseph Ibrahim / University of North Carolina, May Mo / Amgen, Jun Xiang / Amgen, Amy H Xia/ Amgen
Abstract:
Comparing emerging treatment options is often challenging because of the sparsity of direct comparisons from head-to-head trials and inconsistencies in outcome measures among published placebo-controlled trials for each treatment. The ordinal response variable will inevitably contain unknown response categories because they cannot be directly derived from published data. In this talk, we propose a statistical methodology to overcome this common but unresolved issue in the context of network meta-regression for aggregate ordinal outcomes. Specifically, we introduce unobserved latent counts and model them within a Bayesian framework. The proposed approach includes several existing models as special cases and allows us to conduct a proper statistical analysis in the presence of trials with certain missing categories. We then develop an efficient Markov chain Monte Carlo sampling algorithm to carry out the Bayesian computation. A variation of the deviance information criterion is used to assess goodness-of-fit under different distributions of the latent counts. A case study demonstrates the usefulness of the proposed methodology.
Keywords:
Bayesian SUCRA, Collapsed Gibbs sampling, Indirect comparison, Latent count