
Leon Lang
Articles
-
Nov 16, 2024 |
lesswrong.com | Seth Herd | Thane Ruthenis | Leon Lang | Thomas Kehrenberg
As part of the court case between Elon Musk and Sam Altman, a substantial number of emails between Elon, Sam Altman, Ilya Sutskever, and Greg Brockman have been released. I have found reading through these really valuable, and I haven't found an online source that compiles all of them in an easy-to-read format. So I made one. I used some AI assistance to generate this originally, and then went meticulously through each email and checked it for differences.
-
Oct 22, 2024 |
lesswrong.com | Leon Lang
This is a post on our recent paper “When Your AIs Deceive You: Challenges with Partial Observability in Reinforcement Learning from Human Feedback”, done with my great co-authors Davis Foote, Stuart Russell, Anca Dragan, Erik Jenner, and Scott Emmons. It has recently been accepted to NeurIPS, though the camera-ready version is not yet available.
-
Jul 16, 2023 |
lesswrong.com | Leon Lang
This post is an attempt to make more explicit an argument that sometimes seems to be implied (or stated without deeper explanation) in AI safety discussions, but that I haven’t seen spelled out this directly or in this particular form. This post contains nothing fundamentally new, but offers a slightly different, and I hope more intuitive, perspective on reasoning that has been heard elsewhere. In particular, I attempt to make the argument more tangible with a visualization.