Leon Lang

Featured in:

Articles

OpenAI Email Archives (from Musk v. Altman) — LessWrong

Nov 16, 2024 | lesswrong.com | Seth Herd | Thane Ruthenis | Leon Lang | Thomas Kehrenberg

As part of the court case between Elon Musk and Sam Altman, a substantial number of emails between Elon, Sam Altman, Ilya Sutskever, and Greg Brockman have been released. I have found reading through these really valuable, and I haven't found an online source that compiles all of them in an easy-to-read format. So I made one. I used some AI assistance to generate this originally, and then went meticulously through each email and checked them for differences.

[Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF — LessWrong

Oct 22, 2024 | lesswrong.com | Leon Lang

This is a post on our recent paper “When Your AIs Deceive You: Challenges with Partial Observability in Reinforcement Learning from Human Feedback”, done with my great co-authors Davis Foote, Stuart Russell, Anca Dragan, Erik Jenner, and Scott Emmons. It has recently been accepted to NeurIPS (though with the camera-ready version not yet available).

Runaway Optimizers in Mind Space — LessWrong

Jul 16, 2023 | lesswrong.com | Leon Lang

This post is an attempt to make an argument more explicit that sometimes seems to be implied (or stated without deeper explanation) in AI safety discussions, but that I haven’t seen spelled out that directly or in this particular form. This post contains nothing fundamentally new, but a slightly different – and I hope more intuitive – perspective on reasoning that has been heard elsewhere. In particular, I attempt to make the argument more tangible with a visualization.

Contact details

Emails

[email protected]

Socials & Sites
