This repository contains the code needed to reproduce all the experiments in the "Simulation Based Bayesian Optimization" paper: https://arxiv.org/abs/2401.10811
This paper introduces Simulation Based Bayesian Optimization (SBBO) as a novel approach to optimizing acquisition functions in Bayesian Optimization that only requires sampling-based access to posterior predictive distributions. SBBO allows the use of surrogate probabilistic models tailored for combinatorial spaces with discrete variables. Any Bayesian model in which posterior inference is carried out through Markov chain Monte Carlo can be selected as the surrogate model in SBBO. In applications involving combinatorial optimization, we demonstrate empirically the effectiveness of SBBO using various choices of surrogate models.
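As a rough illustration of the core idea (this is not code from the repository), the sketch below estimates an acquisition value such as expected improvement purely from posterior predictive samples, which is the only access SBBO requires of its surrogate. `draw_posterior_predictive` is a hypothetical stand-in for any MCMC-based surrogate.

```python
# Illustrative sketch only (not part of this repository): estimating an
# acquisition function from posterior predictive samples alone, which is
# the kind of access SBBO assumes. `draw_posterior_predictive` is a
# hypothetical stand-in for any MCMC-based surrogate model.
import numpy as np

def expected_improvement(x, draw_posterior_predictive, y_best, n_samples=500):
    """Monte Carlo estimate of EI at x (maximization convention assumed)."""
    # Draw y_1, ..., y_S from the posterior predictive p(y | x, data).
    ys = np.array([draw_posterior_predictive(x) for _ in range(n_samples)])
    # EI(x) = E[max(y - y_best, 0)], estimated by the sample average.
    return np.maximum(ys - y_best, 0.0).mean()
```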
The code is written in Python. A conda environment with all necessary dependencies can be created using
```
conda env create -f sbbo.yml
```
and activated through

```
conda activate sbbo
```
In addition, the `sbbo` package must be installed by running the following in the root directory:

```
pip install -e .
```
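If the installation succeeded, the package should be importable (assuming the top-level module is named `sbbo`):

```
python -c "import sbbo"
```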
To run SBBO, execute the following:
```
python -u run.py --problem {problem} --learner {learner} --acqfun {acqfun} \
    --niters {niters} --search {search} --seed_conf {seed_conf} --nexp {nexp}
```
The `--problem` argument selects the optimization problem on which SBBO will be run. Currently implemented options are:

- `BQP`: Binary quadratic problem (Section 4.1 of the paper).
- `CON`: Contamination problem (Section 4.2 of the paper).
- `pRNA`: RNA design problem (Section 4.3 of the paper).
The `--learner` argument selects the surrogate probabilistic model. Available options are:

- `BOCS`: Sparse Bayesian linear regression with pairwise interactions.
- `BNN`: Bayesian neural network.
- `GPr`: Tanimoto Gaussian process regression.
- `NGBdec`: Natural gradient boosting with a shallow decision tree as base learner.
- `NGBlinCV`: Natural gradient boosting with a sparse linear regression as base learner.
The `--acqfun` argument selects the acquisition function. Specify `EI` for expected improvement or `PI` for probability of improvement.
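For reference, under a maximization convention with incumbent best observed value $y^\star$, these are commonly defined over the posterior predictive distribution of $y$ at $x$ as

$$\mathrm{EI}(x) = \mathbb{E}\left[\max(y - y^\star,\, 0)\right], \qquad \mathrm{PI}(x) = P(y > y^\star),$$

though the signs flip if the problem is phrased as minimization.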
`--niters` specifies the number of function evaluations. `--search` selects the algorithm used to choose the next evaluation location. Possible options are:
- `MH`: SBBO with a Metropolis-Hastings sampling scheme (see the sketch after this list).
- `Gibbs`: SBBO with a Gibbs sampling scheme.
- `RS`: Random local search.
- `SA`: Simulated annealing.
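As a simplified, hedged sketch of what the `MH` option does conceptually (this is not the repository's actual implementation): a Metropolis-Hastings chain over binary configurations whose stationary distribution concentrates on high-acquisition points. All names here are illustrative.

```python
# Simplified, illustrative Metropolis-Hastings search over binary vectors,
# targeting an exponentiated acquisition function; not the repository's
# actual implementation. `acq` is any sample-based acquisition estimate,
# e.g. the expected_improvement sketch shown earlier.
import numpy as np

def mh_search(acq, dim, n_steps=1000, temperature=1.0, rng=None):
    rng = rng or np.random.default_rng()
    x = rng.integers(0, 2, size=dim)          # random initial configuration
    a = acq(x)
    best_x, best_a = x.copy(), a
    for _ in range(n_steps):
        proposal = x.copy()
        proposal[rng.integers(dim)] ^= 1      # flip one randomly chosen bit
        a_new = acq(proposal)
        # Accept with probability min(1, exp((a_new - a) / temperature)).
        if np.log(rng.random()) < (a_new - a) / temperature:
            x, a = proposal, a_new
            if a > best_a:
                best_x, best_a = x.copy(), a
    return best_x
```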
`--seed_conf` sets the random seed; set it to 23 to reproduce the experiments in the paper. Finally, `--nexp` serves as a label for the experiment number.
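For example, the following would launch one run on the contamination problem with the BOCS surrogate, expected improvement, and the Gibbs search scheme (the `--niters` value here is illustrative; use the value reported in the paper for an exact reproduction):

```
python -u run.py --problem CON --learner BOCS --acqfun EI --niters 100 --search Gibbs --seed_conf 23 --nexp 1
```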
After running the previous command with the selected options, results will be stored in the `results` folder.
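To quickly check which result files a run produced, something like the following works (a minimal sketch; the exact file layout inside `results` depends on the options used):

```python
# Minimal sketch: list the files written under the results folder.
from pathlib import Path

for path in sorted(Path("results").rglob("*")):
    if path.is_file():
        print(path)
```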
To reproduce the convergence plot from the appendix, run the `convergence.ipynb` Jupyter notebook located in the `notebooks` folder.
The code to reproduce the tables and plots is written in R. The `tidyverse`, `kableExtra`, and `latex2exp` libraries need to be installed.
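They can be installed from CRAN, for example with:

```
Rscript -e 'install.packages(c("tidyverse", "kableExtra", "latex2exp"))'
```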
First, the results generated by the experiments in the previous section must be preprocessed by running

```
Rscript preprocess_results.R
```
Then, the plots in the paper can be reproduced by running

```
Rscript plots.R
```
The resulting images will be stored in the `figs` folder.
Finally, the tables can be reproduced by running

```
Rscript tables.R
```
The LaTeX code for each table will be printed.