Articles

  • Jan 10, 2025 | lesswrong.com | Sam Marks

    Anthropic’s Alignment Science team conducts technical research aimed at mitigating the risk of catastrophes caused by future advanced AI systems, such as mass loss of life or permanent loss of human control. A central challenge we face is identifying concrete technical work that can be done today to prevent these risks. Future worlds where our research matters—that is, worlds that carry substantial catastrophic risk from AI—will have been radically transformed by AI development.

  • Jul 7, 2024 | alliancemagazine.org | Sam Marks

    Imagine being charged with critical life-changing responsibilities while being starved by the same public actors to whom you are accountable. This is the crazy-making situation nonprofits are finding themselves in. Whether housing the unhoused, providing safe spaces for women fleeing intimate partner violence, or providing childcare, many of society’s most critical services rely on timely, predictable funding from government agencies.

  • Jun 12, 2024 | alliancemagazine.org | Sam Marks | Claudia Cahalane

    Imagine being charged with critical life-changing responsibilities while being starved by the same public actors to whom you are accountable. This is the crazy-making situation nonprofits are finding themselves in. Whether housing the unhoused, providing safe spaces for women fleeing intimate partner violence, or providing childcare, many of society’s most critical services rely on timely, predictable funding from government agencies.

  • Apr 18, 2024 | lesswrong.com | Sam Marks

    In a new preprint, Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models, my coauthors and I introduce a technique, Sparse Human-Interpretable Feature Trimming (SHIFT), which I think is the strongest proof-of-concept yet for applying AI interpretability to existential risk reduction. In this post, I will explain how SHIFT fits into a broader agenda for what I call cognition-based oversight.

  • Feb 24, 2024 | studentnewspaper.org | Sam Marks
