
Robert Wiblin

Exploring the inviolate sphere of ideas one interview at a time: https://t.co/2YMw00bkIQ

Featured in: 80000hours.org, medium.com

Articles

  • 1 week ago | 80000hours.org | Robert Wiblin | Lukas Finnveden

    Once we get to a world where it is technologically possible to replace those researchers with AI systems — which could just be fully obedient, instruction-following AI systems — then you could feasibly have a situation where there’s just one person at the top of the organisation that gives a command: “This is how I want the next AI system to be developed.” And then this army of loyal, obedient AIs will then do all of the technical work.

  • 2 weeks ago | 80000hours.org | Robert Wiblin

    Another thing you might do once you’ve caught your AI trying to escape is let the AI think it succeeded, and see what it does. Presumably when the AI got caught, it was in the midst of taking some actions that it thought would let it launch a rogue deployment. If your model knows a bunch of security vulnerabilities in your software, it might start deploying all those things. — Buck Shlegeris

    Most AI safety conversations centre on alignment: ensuring AI systems share our values and goals.

  • 1 month ago | 80000hours.org | Robert Wiblin

    You should really pause and reflect on the fact that many companies now are saying what we want to do is build AGI — AI that is as good as humans. OK, what does it look like? What does a good society look like when we have humans and we have trillions of AI beings going around that are functionally much more capable? What’s the vision like? How do we coexist in an ethical and morally respectable way? And it’s like… there’s nothing. We’re careening towards this vision that is just a void, essentially.

  • 2 months ago | 80000hours.org | Robert Wiblin | Eliezer Yudkowsky

    So one criticism of “the AGI ideology,” as these people would put it, is that AGI is not foreordained… But when we talk about it as if inherently it’s coming, and it will have certain properties, that deprives citizens of agency. Now, the counterposition I would offer is: you don’t want to equip groups trying to shape history with a naive model of what’s possible.

  • 2 months ago | 80000hours.org | Robert Wiblin

    The basic law on this is that you can change [a charity’s] purpose for one of four reasons. One is that the purpose has become illegal. Second, that it’s become impossible to fulfil. Third, that it is impracticable. And fourth, that it is wasteful — you have more assets than you need to fulfil your purpose. I think the only one [that maybe applies] is an argument that it’s impracticable for some reason. I think that’s still a very tough hurdle. Is it now impracticable?

Contact details

Socials & Sites


X (formerly Twitter)

Followers: 38K
Tweets: 683
DMs Open: Yes
Rob Wiblin @robertwiblin
19 Apr 25

RT @nabeelqu: So we're in an RL-scaling world, where reward hacking and deception by AI models is common; a capabilities scaling arms race;…

Rob Wiblin @robertwiblin
19 Apr 25

RT @labenz: OpenAI's o3/o4 models show *huge* gains toward "automating the job of an OpenAI research engineer." With that, the risk of AI-e…

Rob Wiblin @robertwiblin
19 Apr 25

RT @DKokotajlo: Internal deployment was always the main threat model IMO. IIRC I tried to get the Preparedness Framework to cover internal…