Thanks for the kind feedback! Any suggestions for a more interesting title?
International treaty for global compute caps
Priorities for the UK Foundation Models Taskforce
Conjecture: A standing offer for public debates on AI
Apologies for the 404 on the page; it’s an annoying cache bug. Try a hard refresh of your browser page (Cmd + Shift + R) and it should work.
The “1000” instead of “10000” was a typo in the summary.
In the transcript Connor states “SLT over the last 10000 years, yes, and I think you could claim the same over the last 150”. Fixed now, thanks for flagging!
Shah (DeepMind) and Leahy (Conjecture) Discuss Alignment Cruxes
Which one? All of them seem to be working for me.
Pessimism of the intellect, optimism of the will.
People from OpenPhil, FTX FF and MIRI were not interested in discussing at the time. We also talked with MIRI about moderating, but it didn’t work out in the end.
People from Anthropic told us their organization is very strict on public communications, and very wary of PR risks, so they did not participate in the end.
In the post I overgeneralized to avoid going into full detail.
Yes, some people mentioned it was confusing to have two posts (I had originally posted the Summary and Transcript separately because they were both very lengthy), so I merged them into one and added headers pointing to the Summary and Transcript for easier navigation.
Thanks, I was looking for a way to do that but didn’t know the space in italics hack!
Another formatting question: how do I make headers and sections collapsible? It would be great to make the “Summary” and “Transcript” sections collapsible, considering how long the post is.
Christiano (ARC) and GA (Conjecture) Discuss Alignment Cruxes
Retrospective on the 2022 Conjecture AI Discussions
Thanks, fixed them!
Full Transcript: Eliezer Yudkowsky on the Bankless podcast
I really don’t think that AI dungeon was the source of this idea (why do you think that?)
We’ve heard the story from a variety of sources all pointing to AI Dungeon, and to the fact that the idea was kept from spreading for a significant amount of time. This @gwern Reddit comment, and previous ones in the thread, cover the story well.
And even granting the claim about chain of thought, I disagree about where current progress is coming from. What exactly is the significant capability increase from fine-tuning models to do chain of thought? This isn’t part of ChatGPT or Codex or AlphaCode. What exactly is the story?
Regarding the effects of chain-of-thought prompting on progress[1], there are two levels of impact: first-order effects and second-order effects.
At the first order, once chain of thought became public, a large number of groups started using it explicitly to fine-tune their models.
Aside from non-public examples, a prominent one is PaLM, Google’s most powerful model to date. Moreover, chain of thought makes models much more useful for internal R&D with just prompting and no fine-tuning.
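For readers less familiar with the technique, here is a minimal sketch of the difference between direct prompting and chain-of-thought prompting. The example question and wording are my own illustration, not drawn from any lab’s actual prompts or training data:

```python
# Minimal sketch of chain-of-thought prompting (illustrative only; the
# questions and wording are assumptions for the example, not taken from
# any lab's actual prompts or fine-tuning data).

# Direct prompting: ask for the answer immediately.
direct_prompt = (
    "Q: A train travels 60 km in 1.5 hours. What is its average speed?\n"
    "A:"
)

# Chain-of-thought prompting: show a worked example that spells out the
# intermediate reasoning, then pose the new question the same way, nudging
# the model to produce its reasoning steps before the final answer.
cot_prompt = (
    "Q: A jug holds 3 litres and is 2/3 full. How many litres are in it?\n"
    "A: Let's think step by step. Two thirds of 3 litres is 2 litres. "
    "So the answer is 2.\n\n"
    "Q: A train travels 60 km in 1.5 hours. What is its average speed?\n"
    "A: Let's think step by step."
)

print(cot_prompt)
```

The same completions that a model produces under this kind of prompting can then be collected and used as fine-tuning data, which is the first-order effect described above.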
We don’t know what OpenAI used for ChatGPT, or future models: if you have some information about that, it would be super useful to hear about it!
At the second order, implementing this straightforwardly improved the impressiveness and capabilities of models, making them more obviously powerful to the outside world and more useful for customers, and leading to an increase in attention and investment in the field.
Due to compounding, the earlier these additional investments arrive, the sooner large downstream effects will happen.
[1] This is also partially replying to @Rohin Shah’s question in another comment:
“Why do you believe this ‘drastically’ slowed down progress?”
We’d maybe be at our current capability level in 2018, [...] the world would have had more time to respond to the looming risk, and we would have done more good safety research.
It’s pretty hard to predict the outcome of “raising awareness of problem X” ahead of time. While it might be net good right now because we’re in a pretty bad spot, we have plenty of examples from the past where greater awareness of AI risk has arguably led to strongly negative outcomes down the line, due to people channeling their interest in the problem into somehow pushing capabilities even faster and harder.
In terms of explicit claims:
“So one extreme side of the spectrum is build things as fast as possible, release things as much as possible, maximize technological progress [...].
The other extreme position, which I also have some sympathy for, despite it being the absolutely opposite position, is you know, Oh my god this stuff is really scary.
The most extreme version of it was, you know, we should just pause, we should just stop, we should just stop building the technology for, indefinitely, or for some specified period of time. [...] And you know, that extreme position doesn’t make much sense to me either.”
Dario Amodei, Anthropic CEO, explaining his company’s “Responsible Scaling Policy” on the Logan Bartlett Podcast on Oct 6, 2023.
Starts at around 49:40.