It is desirable to reduce the convergence time of optimisers used by large accelerator facilities in order to maximise user time. A popular technique is Bayesian optimisation, which typically uses Gaussian processes (GPs) to construct a surrogate of the real machine's response to the decision variables. GPs belong to a class of algorithms called kernel methods, which assign pairwise similarities to all data points via a kernel function. While GPs are well suited to tasks such as injection tuning, the kernel matrix must be stored and inverted at inference time, incurring an $O(N^3)$ time complexity and limiting datasets to a few thousand examples in practice. The modeller is free to design the kernel, subject to some mild regularity conditions. We investigate whether modifications to the kernel structure, informed by the physics of our problems, can improve sample efficiency. Thompson sampling (TS) is also investigated as a stochastic alternative to deterministic acquisition functions such as Upper Confidence Bound and Expected Improvement, in an effort to improve convergence.
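The GP-surrogate TS loop described above can be sketched in plain NumPy. This is a minimal illustration, not the facility's actual implementation: the RBF kernel, its hyperparameters, and the toy one-dimensional objective are all assumptions chosen for brevity. The $O(N^3)$ cost mentioned above appears as the linear solve against the training-kernel matrix.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3, variance=1.0):
    """Squared-exponential kernel: pairwise similarity between 1-D points."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Exact GP posterior mean and covariance at the query points."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    k_qq = rbf_kernel(x_query, x_query)
    sol = np.linalg.solve(k_tt, k_tq)  # O(N^3) in the training-set size
    mean = sol.T @ y_train
    cov = k_qq - k_tq.T @ sol
    return mean, cov

def thompson_step(x_train, y_train, x_query, rng):
    """Draw one posterior sample and propose the point that maximises it."""
    mean, cov = gp_posterior(x_train, y_train, x_query)
    jitter = 1e-9 * np.eye(len(x_query))  # numerical stabilisation
    sample = rng.multivariate_normal(mean, cov + jitter)
    return x_query[np.argmax(sample)]

rng = np.random.default_rng(0)
f = lambda x: -(x - 0.6) ** 2  # toy stand-in for the machine response
x_train = np.array([0.1, 0.5, 0.9])
y_train = f(x_train)
x_query = np.linspace(0.0, 1.0, 201)
for _ in range(5):  # a few TS iterations
    x_next = thompson_step(x_train, y_train, x_query, rng)
    x_train = np.append(x_train, x_next)
    y_train = np.append(y_train, f(x_next))
```

Because each proposal is the argmax of a random posterior draw rather than of a fixed acquisition surface, TS explores stochastically while still concentrating evaluations near the optimum as the posterior tightens.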
| In which format do you intend to submit your paper? | LaTeX |
| --- | --- |