As a person not affiliated with Conjecture, I want to record some of my scattered reactions. A lot of upvotes on a post like this without substantial comments seem… unfair?
On one hand, it is always interesting to read something like that. Many of us have pondered Conjecture, asking ourselves whether what they are doing and the way they are doing it make sense. E.g. their infohazard policy has been remarkable, super-interesting, and controversial. My own reflections on that have been rather involved and complicated.
On the other hand, when I am reading the included Conjecture response, what they are saying there seems to me to make total sense (if I were in an artificial binary position of having to fully side with the post or with them, I would have sided with Conjecture on this). Although one has to note that their https://www.conjecture.dev/a-standing-offer-for-public-discussions-on-ai/ is returning a 404 at the moment. Is that offer still standing?
Specifically, on their research quality: the Simulator theory has certainly been controversial, but many people find it extremely valuable, and I personally tend to recommend it to people as (in my opinion) the most important conceptual breakthrough of 2022, together with the notes I took on the subject. It is particularly valuable as a deconfusion tool for what LLMs are and aren’t, and I found that framing LLM-related problems in terms of properties of simulation runs, and in terms of sculpting and controlling the simulations, is very productive. So I am super-grateful for that part of their research output.
On the other hand, I did notice that the authors of that work and Conjecture had parted ways (and when I noticed that I told myself, “perhaps I don’t need to follow that org all that closely anymore, although it is still a remarkable org”).
I think what makes writing comments on posts like this one difficult is that the post is really structured and phrased in such a way as to make this a situation of personal conflict, internal to the relatively narrow AI safety community.
I have not downvoted the post, but I don’t like this aspect, I am not sure this is the right way to approach things...
I felt exactly the same, until I had read this June 2020 paper: Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention.
It turns out that using Transformers in the autoregressive mode (with each output token being added back to the input by concatenating the previous input and the new output token, and sending the new version of the input through the model again and again) results in them emulating the dynamics of recurrent neural networks, and that clarifies things a lot...
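To make the paper's core equivalence concrete, here is a minimal NumPy sketch (my own illustration, not code from the paper): with a kernel feature map in place of the softmax, causal linear attention computed the usual "parallel" way gives exactly the same outputs as a token-by-token recurrence over a running state, i.e. an RNN. The feature map `phi` (elu(x) + 1) is the one used in the paper; the dimensions and random inputs are arbitrary.

```python
import numpy as np

def phi(x):
    # Feature map from the paper: elu(x) + 1, which keeps features positive
    return np.where(x > 0, x + 1.0, np.exp(x))

rng = np.random.default_rng(0)
T, d = 6, 4                      # sequence length and head dimension (arbitrary)
Q = rng.normal(size=(T, d))
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))

# Parallel form: causal linear attention,
# out_i = sum_{j<=i} (phi(q_i)·phi(k_j)) v_j / sum_{j<=i} phi(q_i)·phi(k_j)
out_parallel = np.zeros((T, d))
for i in range(T):
    w = phi(Q[i]) @ phi(K[: i + 1]).T          # unnormalized attention weights
    out_parallel[i] = (w @ V[: i + 1]) / w.sum()

# Recurrent form: the same outputs from a running state (S, z),
# updated one token at a time like an RNN hidden state
S = np.zeros((d, d))   # accumulates phi(k_j) v_j^T
z = np.zeros(d)        # accumulates phi(k_j)
out_recurrent = np.zeros((T, d))
for i in range(T):
    S += np.outer(phi(K[i]), V[i])
    z += phi(K[i])
    q = phi(Q[i])
    out_recurrent[i] = (q @ S) / (q @ z)

# The two formulations agree exactly (up to floating point)
assert np.allclose(out_parallel, out_recurrent)
```

The recurrent form is why autoregressive generation with such a model needs only constant memory per step: the whole prefix is summarized in the fixed-size state (S, z), which is precisely the RNN view of the Transformer.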