Articles

  • Jan 21, 2025 | lesswrong.com | Martín Soto

    This is the abstract and introduction of our new paper, with some discussion of implications for AI Safety at the end. Authors: Jan Betley*, Xuchan Bao*, Martín Soto*, Anna Sztyber-Betley, James Chua, Owain Evans (*Equal Contribution). We study behavioral self-awareness — an LLM's ability to articulate its behaviors without requiring in-context examples. We finetune LLMs on datasets that exhibit particular behaviors, such as (a) making high-risk economic decisions, and (b) outputting insecure code.

  • Aug 27, 2024 | lesswrong.com | Martín Soto

    Two new The Information articles with insider information on OpenAI's next models and moves. They are paywalled, but here are the new bits of information: Strawberry is more expensive and slower at inference time, but can solve complex problems on the first try without hallucinations.

  • Aug 1, 2024 | lesswrong.com | Martín Soto

    TL;DR: Let’s start iterating on experiments that approximate real, society-scale multi-AI deployment. Epistemic status: These ideas seem like my most prominent delta with the average AI Safety researcher, have stood the test of time, and are shared by others I intellectually respect. Please attack them fiercely! Some authors have already written about multi-polar AI failure. I especially like how Andrew Critch has tried to sketch concrete stories for it.

  • May 13, 2024 | lesswrong.com | Martín Soto

  • Apr 6, 2024 | lesswrong.com | Martín Soto

    Grant Snider created this comic (which became a meme). Richard Ngo extended it into posthuman/transhumanist literature. That's cool, but I'd have gone for different categories myself. Here they are together with their explanations. Top: Man vs Agency (Other names: Superintelligence, Singularity, Self-improving technology, Embodied consequentialism.) Because Nature creates Society creates Technology creates Agency.
