
Marc Carauleanu
Articles
-
Jul 30, 2024 |
lesswrong.com | Steve Byrnes | Marc Carauleanu | Mike Vaiana | Gunnar Zarncke
Many thanks to Bogdan Ionut-Cirstea, Steve Byrnes, Gunnar Zarncke, Jack Foxabbott and Seong Hah Cho for critical comments and feedback on earlier and ongoing versions of this work. Summary: In this post, we introduce self-other overlap training: optimizing for similar internal representations when the model reasons about itself and others while preserving performance.
-
Dec 18, 2023 |
lesswrong.com | Cameron Berg | Roman Leventov | Judd Rosenblatt | Marc Carauleanu
Many thanks to Samuel Hammond, Cate Hall, Beren Millidge, Sumner Norman, Steve Byrnes, Lucius Bushnaq, Joar Skalse, Kyle Gracey, Gunnar Zarncke, Ross Nordby, David Lambert, Simeon Campos, Bogdan Ionut-Cirstea, Ryan Kidd, and Eric Ho for critical comments and suggestions on earlier drafts of this agenda, as well as Philip Gubbins, Diogo de Lucena, Rob Luke, and Mason Seale from AE Studio for their support and feedback throughout.