Theoretical Computer Science MSc student at the University of [Redacted] in the United Kingdom.
I’m an aspiring alignment theorist; my research vibes are descriptive formal theories of intelligent systems (and their safety properties) with a bias towards constructive theories.
I think it's important that our theories of intelligent systems remain rooted in the characteristics of real-world intelligent systems; we cannot develop adequate theory from the null string as input.
DragonGod
“Heretical Thoughts on AI” by Eli Dourado
[Question] Why The Focus on Expected Utility Maximisers?
I don’t buy this argument for a few reasons:
SBF met Will MacAskill in 2013, and it was following that discussion that SBF decided to earn to give.
EA wasn't a powerful or influential movement back in 2013; it was quite a fringe cause.
SBF had been in EA since his college days, long before his career in quantitative finance and later in crypto.
SBF didn’t latch onto EA after he acquired some measure of power or when EA was a force to be reckoned with, but pretty early on. He was in a sense “homegrown” within EA.
The "SBF was a sociopath using EA to launder his reputation" narrative is just motivated credulity, IMO. There is little evidence in favour of it; it's just something that sounds too good to be true and absolves us of responsibility.
Astrid's hypothesis isn't very credible when you consider that she doesn't seem to be aware of SBF's history within EA. What's the angle here? There's nothing suggesting SBF planned to enter finance as a college student before MacAskill sold him on earning to give.
Sparks of Artificial General Intelligence: Early experiments with GPT-4 | Microsoft Research
Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind
[1911.08265] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | arXiv
Beren’s “Deconfusing Direct vs Amortised Optimisation”
The Limit of Language Models
Contra “Strong Coherence”
[Question] Is InstructGPT Following Instructions in Other Languages Surprising?
“Dangers of AI and the End of Human Civilization” Yudkowsky on Lex Fridman
[Yann Lecun] A Path Towards Autonomous Machine Intelligence
AI Risk Management Framework | NIST
[Question] Is “Recursive Self-Improvement” Relevant in the Deep Learning Paradigm?
2. Non-Violence: Argument gets counter-argument. Argument does not get bullet. Argument does not get doxxing, death threats, or coercion.[1]
I'd want to include some kinds of social responses as unacceptable as well: derision, mockery, attempts to make the argument low-status, ad hominems, etc.
You can choose not to engage with bad arguments, but you shouldn't engage by sidestepping the argument itself and instead trying to execute some social manoeuvre to discredit it.
I think an issue is that GPT is used to mean two things:
A predictive model whose output is a probability distribution over token space given its prompt and context
Particular techniques/strategies for sampling from the predictive model to generate responses/completions for a given prompt.
[See the Appendix]
The latter kind of GPT is what I think is rightly called a "Simulator".
From @janus’ Simulators (italicised by me):
It is exactly because GPT the predictive model exists that sampling from GPT is considered simulation; I don't think there's any real tension in the ontology here.
Appendix
Credit for highlighting this distinction belongs to @Cleo Nardo:
To summarise:
Static GPT: GPT as predictor
Dynamic GPT: GPT as simulator
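To make the distinction concrete, here's a minimal sketch of the two senses. GPT-2 via Hugging Face transformers and a plain temperature-sampling loop are my illustrative assumptions here, not anything specified in the posts above:

```python
# Minimal sketch: "Static GPT" (predictor) vs "Dynamic GPT" (simulator).
# GPT-2 and plain temperature sampling are stand-ins chosen purely for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def next_token_distribution(prompt: str) -> torch.Tensor:
    """Static GPT: one forward pass of the predictive model, returning a
    probability distribution over the token vocabulary for the next token."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]  # logits at the final position
    return torch.softmax(logits, dim=-1)

def simulate(prompt: str, steps: int = 20, temperature: float = 1.0) -> str:
    """Dynamic GPT: repeatedly sample from the predictor's distribution and
    feed the sampled token back in, rolling out a trajectory (a "simulation")."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(steps):
        with torch.no_grad():
            logits = model(input_ids).logits[0, -1] / temperature
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # one sampling rule among many
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)
    return tokenizer.decode(input_ids[0])
```

The point of the sketch is that `simulate` is nothing but repeated calls to the predictor plus a sampling rule: the simulator is a way of *using* the predictive model, which is why I don't see any tension in the ontology.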