Theoretical Computer Science MSc student at the University of [Redacted] in the United Kingdom.
I’m an aspiring alignment theorist; my research vibes are descriptive formal theories of intelligent systems (and their safety properties) with a bias towards constructive theories.
I think it's important that our theories of intelligent systems remain rooted in the characteristics of real-world intelligent systems; we cannot develop adequate theory from the null string as input.
DragonGod
“Heretical Thoughts on AI” by Eli Dourado
[Question] Why The Focus on Expected Utility Maximisers?
I don’t buy this argument for a few reasons:
SBF met Will MacAskill in 2013, and it was following that discussion that SBF decided to earn to give.
EA wasn't a powerful or influential movement back in 2013; it was quite a fringe cause.
SBF had been in EA since his college days, long before his career in quantitative finance and later in crypto.
SBF didn’t latch onto EA after he acquired some measure of power or when EA was a force to be reckoned with, but pretty early on. He was in a sense “homegrown” within EA.
The "SBF was a sociopath using EA to launder his reputation" narrative is just motivated credulity, IMO. There is little evidence in favour of it; it's just something that sounds too good to be true and absolves us of responsibility.
Astrid's hypothesis isn't very credible when you consider that she doesn't seem to be aware of SBF's history within EA. What's the angle here? There's nothing suggesting SBF planned to enter finance as a college student before MacAskill sold him on earning to give.
Sparks of Artificial General Intelligence: Early experiments with GPT-4 | Microsoft Research
Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind
[1911.08265] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | arXiv
Beren’s “Deconfusing Direct vs Amortised Optimisation”
The Limit of Language Models
Contra “Strong Coherence”
[Question] Is InstructGPT Following Instructions in Other Languages Surprising?
“Dangers of AI and the End of Human Civilization” Yudkowsky on Lex Fridman
[Yann Lecun] A Path Towards Autonomous Machine Intelligence
AI Risk Management Framework | NIST
[Question] Is “Recursive Self-Improvement” Relevant in the Deep Learning Paradigm?
2. Non-Violence: Argument gets counter-argument. Argument does not get bullet. Argument does not get doxxing, death threats, or coercion.[1]
I'd want to include some kinds of social responses as unacceptable as well: derision, mockery, attempts to make the argument low-status, ad hominems, etc.
You can choose not to engage with bad arguments, but you shouldn't engage by sidestepping the argument itself and instead trying to execute some social manoeuvre to discredit it.
I think an issue is that GPT is used to mean two things:
A predictive model whose output is a probability distribution over token space given its prompt and context
Particular techniques/strategies for sampling from the predictive model to generate responses/completions for a given prompt.
[See the Appendix]
The latter kind of GPT is what I think is rightly called a "Simulator".
From @janus’ Simulators (italicised by me):
It is exactly because GPT the predictive model exists that sampling from GPT is considered simulation; I don't think there's any real tension in the ontology here.
Appendix
Credit for highlighting this distinction belongs to @Cleo Nardo:
To summarise:
Static GPT: GPT as predictor
Dynamic GPT: GPT as simulator
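To make the distinction concrete, here's a minimal sketch of the two senses. GPT-2 via Hugging Face transformers and a plain temperature-sampling loop are my illustrative assumptions here, not anything specified in the posts above:

```python
# Minimal sketch: "Static GPT" (predictor) vs "Dynamic GPT" (simulator).
# GPT-2 and plain temperature sampling are stand-ins chosen purely for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def next_token_distribution(prompt: str) -> torch.Tensor:
    """Static GPT: one forward pass of the predictive model, returning a
    probability distribution over the token vocabulary for the next token."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]  # logits at the final position
    return torch.softmax(logits, dim=-1)

def simulate(prompt: str, steps: int = 20, temperature: float = 1.0) -> str:
    """Dynamic GPT: repeatedly sample from the predictor's distribution and
    feed the sampled token back in, rolling out a trajectory (a "simulation")."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(steps):
        with torch.no_grad():
            logits = model(input_ids).logits[0, -1] / temperature
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # one sampling rule among many
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)
    return tokenizer.decode(input_ids[0])
```

The point of the sketch is that `simulate` is nothing but repeated calls to the predictor plus a sampling rule: the simulator is a way of *using* the predictive model, which is why I don't see any tension in the ontology.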