Sam Bowman

Articles

  • Sep 3, 2024 | lesswrong.com | Sam Bowman

    This piece reflects my current best guess at the major goals that Anthropic (or another similarly positioned AI developer) will need to accomplish to have things go well with the development of broadly superhuman AI. Given my role and background, it’s disproportionately focused on technical research and on averting emerging catastrophic risks.

  • Apr 17, 2024 | lesswrong.com | Arjun Panickssery | Sam Bowman | Shi Feng

    Self-evaluation using LLMs is used in reward modeling, model-based benchmarks like GPTScore and AlpacaEval, self-refinement, and constitutional AI. LLMs have been shown to be accurate at approximating human annotators on some tasks. But these methods are threatened by self-preference, a bias in which an LLM evaluator scores its own outputs higher than texts written by other LLMs or humans, relative to the judgments of human annotators.

  • Sep 25, 2023 | mommybites.com | Sam Bowman

    Whether you’re a parent with a remote job or you’re looking for an opportunity to work from home, you need to figure out how to do so while raising your kids. From moving to a better space to creating boundaries, there are many tactics you can try to juggle all of your important responsibilities so you can focus on your work and your family.

    Control Your Circumstances

    The best way to effectively balance both work and parenthood is to control your circumstances so you have fewer surprises.

  • Aug 4, 2023 | mommybites.com | Sam Bowman

    A Mother’s Guide to Planning a Safe and Memorable Vacation for Your Kids

    Whether you’re just traveling as a family unit or making an adventure with another group, moms need to be in full control. There’s a lot to prepare for and a lot that can happen, and by planning ahead, you can be ready for everything. So, if you have some vacation dates on the calendar, use these helpful tips to ensure the trip goes off without a hitch.

  • Jul 28, 2023 | lesswrong.com | Nina Rimsky | Ethan Perez | Sam Bowman | Logan Riggs

    Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort, under the mentorship of Evan Hubinger. I generate an activation steering vector using Anthropic's sycophancy dataset and then find that this can be used to increase or reduce performance on TruthfulQA, indicating a common direction between sycophancy on questions of opinion and untruthfulness on questions relating to common misconceptions.
