
Fedor Ryzhenkov
Articles
-
Aug 20, 2024 |
lesswrong.com | Fedor Ryzhenkov | Nathan Helm-Burger | Philipp Alexander Kreer | Prithviraj Singh Shahani
This June, Apart Research and Apollo Research joined forces to host the Deception Detection Hackathon, bringing together students, researchers, and engineers from around the world to tackle a pressing challenge in AI safety: preventing AI from deceiving humans and overseers. The hackathon took place both online and in multiple physical locations simultaneously.
-
Jun 29, 2024 |
lesswrong.com | Fedor Ryzhenkov
Coauthored by Fedor Ryzhenkov and Dmitrii Volkov (Palisade Research). At Palisade, we often discuss the latest safety results with policymakers and think tanks who seek to understand the state of current technology. This document condenses and streamlines the various internal notes we wrote when discussing Anthropic's "Scaling Monosemanticity".