
Artur Bekasov
Articles
-
Mar 27, 2024 |
amazon.science | Jacek R. Golebiowski | Philipp Schmidt | Artur Bekasov | Huijun Yu
This repository contains code for evaluating the methods proposed in "Learning action embeddings for off-policy evaluation". To get started, we recommend the Example.ipynb notebook, which demonstrates the benefits of the method proposed in Section 3 and implements everything in a few lines of code. To run the notebook, you only need Python 3 with standard machine learning libraries.
-
Jan 19, 2024 |
amazon.science | Jacek R. Golebiowski | Philipp Schmidt | Artur Bekasov | Matej Cief
Off-policy evaluation (OPE) methods allow us to estimate the expected reward of a policy using logged data collected by a different policy. However, when the number of actions is large, or certain actions are under-explored by the logging policy, existing estimators based on inverse-propensity scoring (IPS) can have high or even infinite variance.
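To make the IPS idea concrete, here is a minimal sketch of the standard IPS estimator on a context-free bandit. It is not the action-embedding method from the paper or its repository. The function names (`ips_estimate`, `simulate_logged_data`), the three-action setup, and all probabilities are assumptions chosen for illustration only.

```python
import numpy as np


def ips_estimate(rewards, logging_probs, target_probs):
    """Plain IPS: average of rewards reweighted by target/logging propensities."""
    rewards = np.asarray(rewards, dtype=float)
    weights = np.asarray(target_probs, dtype=float) / np.asarray(logging_probs, dtype=float)
    return float(np.mean(weights * rewards))


def simulate_logged_data(n_samples, logging_policy, mean_rewards, seed=0):
    """Hypothetical data generator: sample actions from the logging policy
    and draw Bernoulli rewards with action-dependent means."""
    rng = np.random.default_rng(seed)
    actions = rng.choice(len(logging_policy), size=n_samples, p=logging_policy)
    rewards = rng.binomial(1, mean_rewards[actions]).astype(float)
    return actions, rewards


# Illustrative (made-up) problem: three actions, context-free policies.
mean_rewards = np.array([0.2, 0.5, 0.8])
logging_policy = np.array([0.6, 0.3, 0.1])  # rarely plays the best action
target_policy = np.array([0.1, 0.2, 0.7])   # mostly plays the best action

actions, rewards = simulate_logged_data(100_000, logging_policy, mean_rewards)
estimate = ips_estimate(rewards, logging_policy[actions], target_policy[actions])
true_value = float(target_policy @ mean_rewards)

# The largest importance weight grows as the logging policy under-explores
# an action the target policy favors; this is what drives the variance up.
max_weight = float(np.max(target_policy / logging_policy))

print(f"IPS estimate: {estimate:.3f}, true value: {true_value:.3f}, max weight: {max_weight:.1f}")
```

In this toy setup, the third action is logged only 10% of the time but chosen 70% of the time by the target policy, so its samples carry a large weight. Shrinking its logging probability further inflates that weight, which illustrates why IPS variance grows when actions are under-explored.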