
Donald Hobson
Articles
-
Sep 24, 2024 | lesswrong.com | Donald Hobson
A response to this paper: https://asi-safety-lab.com/DL/Kill-Switch-For_ASI_EW_21_12_14.pdf
A substantial fraction of my argument comes down to this: it is plausible that a hypothetical hypercompetent civilization could coordinate to build such infrastructure. However, we don't live in a hypothetical hypercompetent civilization. We are not remotely close, and any attempt to get us there will almost certainly fail.
-
Sep 24, 2024 | lesswrong.com | Donald Hobson
Current LLM-based AI systems are getting pretty good at maths by writing formal proofs in Lean or similar: https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/
So, how can we use these models to help align AI? The Simple Alignment Solution Assumption states that many problems in alignment, for example corrigibility or value learning, have simple solutions. I mean "simple" in the sense that logical induction is a simple solution, or that expected utility maximization is.
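For a sense of what "writing formal proofs in Lean" looks like, here is a minimal Lean 4 example of the kind of small statement such a system might prove; the theorem and tactic script are illustrative, not taken from the DeepMind work.

```lean
-- Commutativity of addition on the naturals, proved by induction:
-- the sort of small formal proof an LLM-based prover can emit and
-- a proof checker can verify mechanically.
theorem add_comm' (m n : Nat) : m + n = n + m := by
  induction n with
  | zero => simp
  | succ k ih => rw [Nat.add_succ, ih, Nat.succ_add]
```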
-
Sep 6, 2024 | lesswrong.com | Donald Hobson
My voting system works like this: each voter expresses their preferences over all candidates on a real-numbered utility scale. Then a maximal lottery is selected, that is, a lottery over candidates that, in a majority comparison, at least ties against every other lottery over candidates: https://en.wikipedia.org/wiki/Maximal_lotteries
Let's describe this in more detail. Suppose there are 3 candidates: A, B, C.
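To make the maximal-lottery step concrete, here is a minimal sketch (my construction, not the post's code) that computes it via the standard zero-sum-game linear program over the pairwise margin matrix; the ballot format and candidate names are hypothetical.

```python
# Maximal lottery from utility ballots: build the pairwise margin
# matrix M, then solve the symmetric zero-sum game with an LP.
# Assumes scipy is available.
import numpy as np
from scipy.optimize import linprog

def maximal_lottery(ballots, candidates):
    """ballots: list of dicts mapping candidate -> real-valued utility."""
    n = len(candidates)
    # M[i][j] = #voters preferring i over j - #voters preferring j over i.
    M = np.zeros((n, n))
    for b in ballots:
        for i, ci in enumerate(candidates):
            for j, cj in enumerate(candidates):
                if b[ci] > b[cj]:
                    M[i, j] += 1
                elif b[ci] < b[cj]:
                    M[i, j] -= 1
    # Maximal lottery = optimal mixed strategy p of the game M:
    # maximize v subject to (p @ M)_j >= v, sum(p) = 1, p >= 0.
    # Variables x = (p_1..p_n, v); linprog minimizes, so minimize -v.
    c = np.zeros(n + 1); c[-1] = -1.0
    A_ub = np.hstack([-M.T, np.ones((n, 1))])   # v - (p @ M)_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, n)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0, None)] * n + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return dict(zip(candidates, res.x[:n]))

# A Condorcet cycle: the maximal lottery mixes ~1/3 on each candidate.
ballots = [{"A": 1.0, "B": 0.5, "C": 0.0},
           {"B": 1.0, "C": 0.5, "A": 0.0},
           {"C": 1.0, "A": 0.5, "B": 0.0}]
print(maximal_lottery(ballots, ["A", "B", "C"]))
```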
-
Mar 5, 2024 | lesswrong.com | Gerald M. Monroe | Daniel Kokotajlo | Donald Hobson
There’s a basic high-level story which worries a lot of people. The story goes like this: as AIs become more capable, the default outcome of AI training is a system which, unbeknownst to us, is using its advanced capabilities to scheme against us. This process likely ends in AI takeover, and thence our death. We are not currently dead. So, any argument for our death by way of AI must offer us a causal story.
-
Feb 22, 2024 | lesswrong.com | Gerald M. Monroe | Donald Hobson
Given that humans on their own haven't yet found these better architectures, humans + imitative AI doesn't seem like it would find the problem trivial. Humans on their own already did invent better RL algorithms for optimizing at the network-architecture layer. Background: https://arxiv.org/pdf/1611.01578.pdf and https://arxiv.org/pdf/1707.07012.pdf (page 6). With NAS, the RL model learns the relationship between [architecture] and [predicted performance].
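To make that NAS loop concrete, here is a minimal sketch (my construction, not code from the papers) of a REINFORCE controller over a toy architecture space; the choice set and the stubbed reward function are hypothetical stand-ins for training a child network and measuring its validation accuracy.

```python
# A controller samples architecture choices, observes a (stubbed)
# validation accuracy as reward, and updates its sampling
# distribution with REINFORCE plus a moving-average baseline.
import numpy as np

rng = np.random.default_rng(0)
CHOICES = [16, 32, 64, 128]                  # hypothetical layer widths
N_LAYERS = 3
logits = np.zeros((N_LAYERS, len(CHOICES)))  # controller parameters

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def reward(arch):
    # Stand-in for "train the child network, return validation accuracy".
    # Here: pretend a wide first layer and narrow last layer do best.
    return 1.0 / (1.0 + abs(arch[0] - 64) + abs(arch[-1] - 16))

baseline, lr = 0.0, 0.1
for step in range(500):
    probs = [softmax(l) for l in logits]
    idx = [rng.choice(len(CHOICES), p=p) for p in probs]
    arch = [CHOICES[i] for i in idx]
    r = reward(arch)
    baseline = 0.9 * baseline + 0.1 * r
    for layer, (i, p) in enumerate(zip(idx, probs)):
        grad = -p
        grad[i] += 1.0                               # grad of log pi(choice)
        logits[layer] += lr * (r - baseline) * grad  # REINFORCE update

print("best architecture found:", [CHOICES[int(np.argmax(l))] for l in logits])
```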