Policy Gradients for Optimal Parallel Tempering MCMC

stp2y, December 30, 2024

[Submitted on 3 Sep 2024 (v1), last revised 26 Dec 2024 (this version, v2)]

Paper: Policy Gradients for Optimal Parallel Tempering MCMC, by Daniel Zhao and 1 other author

Abstract: Parallel tempering is a meta-algorithm for Markov Chain Monte Carlo that uses multiple chains to sample from tempered versions of the target distribution, enhancing mixing in multi-modal distributions that are challenging for traditional methods. The effectiveness of parallel tempering is heavily influenced by the selection of chain temperatures. Here, we present an adaptive temperature selection algorithm that dynamically adjusts temperatures during sampling using a policy gradient approach. Experiments demonstrate that our method can achieve lower integrated autocorrelation times compared to traditional geometrically spaced temperatures and uniform acceptance rate schemes on benchmark distributions.
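
For readers unfamiliar with the baseline the abstract compares against, here is a minimal sketch of parallel tempering with a fixed, geometrically spaced temperature ladder. It is not the paper's policy gradient method. The bimodal target, the function names (`log_target`, `geometric_ladder`, `parallel_tempering`), the step size, and the ladder endpoints are all illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np

def log_target(x):
    # Assumed toy target (not from the paper): unnormalized equal
    # mixture of N(-4, 1) and N(4, 1), evaluated in log space.
    return np.logaddexp(-0.5 * (x + 4.0) ** 2, -0.5 * (x - 4.0) ** 2)

def geometric_ladder(n_chains, beta_min):
    # Inverse temperatures from 1 (the target) down to beta_min,
    # with a constant ratio between neighbours.
    return beta_min ** (np.arange(n_chains) / (n_chains - 1))

def parallel_tempering(n_steps, betas, step_size=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n_chains = len(betas)
    x = np.zeros(n_chains)                 # one state per chain
    samples = np.empty(n_steps)            # trace of the beta = 1 chain
    accepted = np.zeros(n_chains - 1)
    attempted = np.zeros(n_chains - 1)
    for t in range(n_steps):
        # Random-walk Metropolis update of every chain on pi(x)^beta.
        proposal = x + step_size * rng.standard_normal(n_chains)
        log_ratio = betas * (log_target(proposal) - log_target(x))
        accept = np.log(rng.random(n_chains)) < log_ratio
        x = np.where(accept, proposal, x)
        # Propose swaps between adjacent chains, alternating between
        # even-indexed and odd-indexed pairs on successive steps.
        for i in range(t % 2, n_chains - 1, 2):
            attempted[i] += 1
            log_swap = (betas[i] - betas[i + 1]) * (
                log_target(x[i + 1]) - log_target(x[i])
            )
            if np.log(rng.random()) < log_swap:
                x[i], x[i + 1] = x[i + 1], x[i]
                accepted[i] += 1
        samples[t] = x[0]
    return samples, accepted / np.maximum(attempted, 1)

betas = geometric_ladder(n_chains=5, beta_min=0.01)
samples, swap_rates = parallel_tempering(n_steps=20000, betas=betas)
print("inverse temperatures:", betas)
print("adjacent swap acceptance rates:", swap_rates)
```

In this sketch the ladder is chosen once and never changes. An adaptive scheme of the kind the abstract describes would instead adjust `betas` while sampling, so the fixed ladder here only serves as a point of reference.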

Submission history

From: Daniel Zhao
[v1] Tue, 3 Sep 2024 03:12:45 UTC (5,585 KB)
[v2] Thu, 26 Dec 2024 07:17:52 UTC (5,395 KB)
