After learning about it, I went on to replace the UCT formula in MCTS with it and the results were... not much better, actually. But it made me understand both a little better.
Thompson Sampling, a.k.a. Bayesian Bandits, is a powerful method for runtime performance optimization. We use it in ClickHouse to optimize compression and to choose between different instruction sets: https://clickhouse.com/blog/lz4-compression-in-clickhouse
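For anyone who hasn't seen the mechanics, here is a minimal sketch of the idea (not the ClickHouse implementation; the variant names and the success criterion are made up): keep a Beta posterior per variant, draw one sample from each, run the argmax, and update the posterior of the variant you actually ran.

    import random

    # Hypothetical variants to pick between at runtime (illustrative names only).
    variants = ["kernel_scalar", "kernel_sse", "kernel_avx2"]

    # Beta(wins + 1, losses + 1) posterior per variant; "success" could mean
    # the call finished under some latency budget -- the criterion is up to you.
    stats = {v: {"wins": 0, "losses": 0} for v in variants}

    def choose_variant():
        # Thompson sampling: one draw from each posterior, act on the argmax.
        draws = {v: random.betavariate(s["wins"] + 1, s["losses"] + 1)
                 for v, s in stats.items()}
        return max(draws, key=draws.get)

    def record_outcome(variant, success):
        # Update only the variant that was actually run.
        stats[variant]["wins" if success else "losses"] += 1

Variants that keep losing get chosen less and less often, but never with exactly zero probability, so exploration never fully shuts off.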
This is great. I remember finding another really good resource on the Bernoulli bandit that was interactive. Putting feelers out there to see if anyone knows what I’m talking about off the top of their heads.
You take the action that is optimal under the hypothesis represented by your posterior sample; this yields a new observation. You add it to the dataset and train a new NN.
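A rough sketch of that loop, with a bootstrapped ensemble of cheap models standing in for posterior samples over the network (one common approximation; the environment, reward, and model here are all placeholders, not anyone's production code):

    import random

    class BootstrapModel:
        # Toy stand-in for the NN: a running mean reward per action,
        # refit on a bootstrap resample of the dataset each round.
        def __init__(self, n_actions):
            self.n_actions = n_actions
            self.means = [0.0] * n_actions

        def fit(self, dataset):
            resample = [random.choice(dataset) for _ in dataset]
            totals = [0.0] * self.n_actions
            counts = [0] * self.n_actions
            for action, reward in resample:
                totals[action] += reward
                counts[action] += 1
            self.means = [totals[a] / counts[a] if counts[a] else 0.0
                          for a in range(self.n_actions)]

        def best_action(self):
            return max(range(self.n_actions), key=lambda a: self.means[a])

    def run(env_step, n_actions=3, n_models=10, n_rounds=200):
        ensemble = [BootstrapModel(n_actions) for _ in range(n_models)]
        dataset = []
        for _ in range(n_rounds):
            model = random.choice(ensemble)   # draw an approximate posterior sample
            action = model.best_action()      # act as if that sample were the truth
            reward = env_step(action)         # new observation from the world
            dataset.append((action, reward))  # add it to the dataset
            for m in ensemble:
                m.fit(dataset)                # the "train a new NN" step, done cheaply
        return dataset

    # Example: three arms with true mean rewards 0.1, 0.5, 0.3.
    run(lambda a: random.gauss([0.1, 0.5, 0.3][a], 1.0))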
https://iosband.github.io/2015/07/19/Efficient-experimentati...
It can also learn how different variants perform in different contexts.
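One simple way to get that contextual behaviour (a hypothetical per-context version of the Beta-posterior sketch above, with made-up context keys) is to keep a separate posterior per (context, variant) pair:

    from collections import defaultdict
    import random

    # One Beta posterior per (context, variant) pair; the context key could be
    # a device type, a payload-size bucket, a locale -- whatever you condition on.
    posteriors = defaultdict(lambda: [0, 0])  # [wins, losses]

    def choose(context, variants):
        draws = {v: random.betavariate(posteriors[(context, v)][0] + 1,
                                       posteriors[(context, v)][1] + 1)
                 for v in variants}
        return max(draws, key=draws.get)

    def update(context, variant, success):
        posteriors[(context, variant)][0 if success else 1] += 1

Past a handful of contexts you'd fold the context into a model (e.g. a Bayesian regression per variant) rather than keeping independent counters for every combination.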