
Quintin Pope
Articles
-
Feb 27, 2024 |
lesswrong.com | Nora Belrose | Quintin Pope
Crossposted from the AI Optimists blog. AI doom scenarios often suppose that future AIs will engage in scheming: planning to escape, gain power, and pursue ulterior motives, while deceiving us into thinking they are aligned with our interests. The worry is that if a schemer escapes, it may seek world domination to ensure humans do not interfere with its plans, whatever they may be.
-
Nov 15, 2023 |
lesswrong.com | Steven Byrnes | Joe Carlsmith | Quintin Pope | Rubi J. Hudson
(Cross-posted from my website) I’ve written a report about whether advanced AIs will fake alignment during training in order to get power later – a behavior I call “scheming” (also sometimes called “deceptive alignment”). The report is available on arXiv here. There’s also an audio version here, and I’ve included the introductory section below (with audio for that section here). This section includes a full summary of the report, which covers most of the main points and technical terminology.
-
Nov 4, 2023 |
lesswrong.com | Quintin Pope | Nora Belrose | Nathaniel Monson
So I think the issue is that when we discuss what I'd call the "standard argument from evolution", you can read two slightly different claims into it. My original post was a bit muddled because I think those claims are often conflated, and before writing this reply I hadn't managed to explicitly distinguish them.
-
Aug 10, 2023 |
lesswrong.com | Kshitij Sachan | Jacob Pfau | Quintin Pope | Violet Hour
This work was done at Redwood Research. The views expressed are my own and do not necessarily reflect the views of the organization. Thanks to Ryan Greenblatt, Fabien Roger, and Jenny Nitishinskaya for running some of the initial experiments and to Gabe Wu and Max Nadeau for revising this post. I conducted experiments to see if language models could use 'filler tokens'—unrelated text output before the final answer—for additional computation.
-
Jul 17, 2023 |
biorxiv.org | Quintin Pope | Rohan Varma | Xiaoli Z. Fern | Christine Tataru