
Joe Carlsmith

Featured in: substack.com

Articles

  • 2 months ago | lesswrong.com | Joe Carlsmith

    (Audio version here, or search for "Joe Carlsmith Audio" on your podcast app.) “There comes a moment when the children who have been playing at burglars hush suddenly: was that a real footstep in the hall?” - C.S. Lewis. Sometimes, my thinking feels more “real” to me; and sometimes, it feels more “fake.” I want to do the real version, so I want to understand this spectrum better. This essay offers some reflections. I give a bunch of examples of this “fake vs.

  • Nov 12, 2024 | lesswrong.com | Joe Carlsmith

    (This is the fourth in a series of four posts about how we might solve the alignment problem. See the first post for an introduction to this project and a summary of the series as a whole.) In the first post in this series, I distinguished between “motivation control” (trying to ensure that a superintelligent AI’s motivations have specific properties) and “option control” (trying to ensure that a superintelligent AI’s options have specific properties).

  • Nov 4, 2024 | lesswrong.com | Joe Carlsmith

    (This is the third in a series of four posts about how we might solve the alignment problem. See the first post for an introduction to this project and a summary of the posts that have been released thus far.) In the first post in this series, I distinguished between “motivation control” (trying to ensure that a superintelligent AI’s motivations have specific properties) and “option control” (trying to ensure that a superintelligent AI’s options have specific properties).

  • Oct 30, 2024 | lesswrong.com | Joe Carlsmith

    (This is the second in a series of four posts about how we might solve the alignment problem. See the first post for an introduction to this project and a summary of the posts that have been released thus far.) In my last post, I laid out the ontology I’m going to use for thinking about approaches to solving the alignment problem (that is, the problem of building superintelligent AI agents, and becoming able to elicit their beneficial capabilities, without succumbing to the bad kind of AI takeover).

  • Oct 28, 2024 | lesswrong.com | Joe Carlsmith

    In my last post, I laid out my picture of what it would even be to solve the alignment problem. In this series of posts, I want to talk about how we might solve it. To be clear: I don’t think that humans necessarily need to solve the whole problem – at least, not on our own.
