
Martin Soto
Articles
-
Jan 21, 2025 |
lesswrong.com | Martín Soto
This is the abstract and introduction of our new paper, with some discussion of implications for AI Safety at the end. Authors: Jan Betley*, Xuchan Bao*, Martín Soto*, Anna Sztyber-Betley, James Chua, Owain Evans (*Equal Contribution). We study behavioral self-awareness — an LLM's ability to articulate its behaviors without requiring in-context examples. We finetune LLMs on datasets that exhibit particular behaviors, such as (a) making high-risk economic decisions, and (b) outputting insecure code.
-
Aug 27, 2024 |
lesswrong.com | Martín Soto
Two new articles from The Information offer insider information on OpenAI's next models and moves. They are paywalled, but here are the new bits of information: Strawberry is more expensive and slower at inference time, but can solve complex problems on the first try without hallucinations.
-
Aug 1, 2024 |
lesswrong.com | Martín Soto
TL;DR: Let’s start iterating on experiments that approximate real, society-scale multi-AI deployment. Epistemic status: These ideas seem like my most prominent delta with the average AI Safety researcher, have stood the test of time, and are shared by others I intellectually respect. Please attack them fiercely! Some authors have already written about multi-polar AI failure. I especially like how Andrew Critch has tried to sketch concrete stories for it.
-
May 13, 2024 |
lesswrong.com | Martín Soto
-
Apr 6, 2024 |
lesswrong.com | Martín Soto
Grant Snider created this comic (which became a meme). Richard Ngo extended it into posthuman/transhumanist literature. That's cool, but I'd have gone for different categories myself. Here they are together with their explanations. Top: Man vs Agency (Other names: Superintelligence, Singularity, Self-improving technology, Embodied consequentialism.) Because Nature creates Society creates Technology creates Agency.