
Nathaniel Monson
Articles
-
Nov 4, 2023 |
lesswrong.com | Quintin Pope | Nora Belrose | Nathaniel Monson
So I think the issue is that when we discuss what I'd call the "standard argument from evolution", you can read two slightly different claims into it. My original post was a bit muddled because I think those claims are often conflated, and before writing this reply I hadn't managed to explicitly distinguish them.
-
Oct 20, 2023 |
lesswrong.com | Daniel Kokotajlo | Cleo Nardo | Mateusz Bagiński | Nathaniel Monson
[This comment got long. The TLDR is that, on my proposal, all [?[1]] instances of shutdown-resistance are already strictly dispreferred to no-resistance, so shutdown-resisting actions won’t be chosen. Trammelling won’t stop shutdown-resistance from being strictly dispreferred to no-resistance because trammelling only turns preferential gaps into strict preferences. Trammelling won’t remove or overturn already-existing strict preferences.] Your comment suggests a nice way to think about things.
-
Sep 28, 2023 |
lesswrong.com | Garrett Baker | Vanessa Kosoy | Nathaniel Monson | Julian Bradshaw
When Stanislav Petrov's missile alert system pinged, the world was not watching. Russia was not watching. Perhaps a number of superiors in the military were staying in the loop about Stanislav's outpost, waiting for updates. It wasn't theatre. In contrast, LessWrong's historical Petrov Day celebrations have been pretty flashy affairs. Great big red buttons, intimidating countdown timers, and all that. That's probably not what the next "don't destroy the world" moment will look like.
-
Jul 21, 2023 |
lesswrong.com | Nathaniel Monson
In the context of transformer models, the "positional embedding matrix" is the thing that encodes the meaning of positions within a prompt.
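As a rough illustration of that idea, here is a minimal sketch assuming learned absolute positional embeddings (the GPT-2-style setup, which the excerpt does not specify). The names `W_E`, `W_pos`, `make_embedding`, and `embed_prompt`, as well as the sizes, are hypothetical and chosen only for this example. Each row of `W_pos` is the vector for one position, and it is added to the token's own embedding.

```python
import random

def make_embedding(rows, d_model, seed):
    """Build a small random matrix standing in for trained weights."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(d_model)] for _ in range(rows)]

def embed_prompt(token_ids, tok_emb, pos_emb):
    """Add the row for each position to the embedding of the token at that position."""
    if len(token_ids) > len(pos_emb):
        raise ValueError("prompt longer than the maximum context length")
    return [
        [t + p for t, p in zip(tok_emb[tok], pos_emb[pos])]
        for pos, tok in enumerate(token_ids)
    ]

# Hypothetical sizes, far smaller than any real model.
VOCAB_SIZE, MAX_CONTEXT, D_MODEL = 50, 8, 4
W_E = make_embedding(VOCAB_SIZE, D_MODEL, seed=0)     # token embedding matrix
W_pos = make_embedding(MAX_CONTEXT, D_MODEL, seed=1)  # positional embedding matrix

# The same token id (7) appears at positions 0 and 2.
x = embed_prompt([7, 3, 7], W_E, W_pos)
```

In this sketch the two copies of token 7 end up with different input vectors, and the difference between them is exactly the difference between rows 0 and 2 of `W_pos`. That is the sense in which the matrix carries positional information.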