Alan E Dunne

Karma: 16

Alan E Dunne 25 May 2026 19:45 UTC
1 point
0
on: Belief as Psychosis
See
https://www.lesswrong.com/posts/j2W3zs7KTZXt2Wzah/how-do-you-feel-about-lesswrong-these-days-open-feedback#sXJxcMhT7t4NmbJpc
about the ninth top-level comment under the first answer:
TurnTrout 1y211 “
My sense is that neither of us have been very persuaded by those conversations, and I claim that’s not very surprising, in a way that’s epistemically defensible for both of us. I’ve spent literal years working through the topic myself in great detail, so it would be very surprising if my view was easily swayed by a short comment chain—and similarly I expect that the same thing is true of you, where you’ve spent much more time thinking about this and have much more detailed thoughts than are easy to represent in a simple comment chain.
I’ve thought about this claim more over the last year. I now disagree. I think that this explanation makes us feel good but ultimately isn’t true.
I can point to several times where I have quickly changed my mind on issues that I have spent months or years considering:
1. in early 2022, I discarded my entire alignment worldview over the course of two weeks due to Quintin Pope’s arguments. Most of the evidence which changed my mind was comm’d over Gdoc threads. I had formed my worldview over the course of four years of thought, and it crumbled pretty quickly.
2. In mid-2022, realizing that reward is not the optimization target took me about 10 minutes, even though I had spent 4 years and thousands of hours thinking about optimal policies. I realized while reading an RL paper say “agents are trained to maximize reward”; reflexively asking myself what evidence existed for that claim; and coming back mostly blank. So that’s not quite a comment thread, but still seems like the same low-bandwidth medium.
3. In early 2023, a basic RL result came out opposite the way which shard theory predicted. I went on a walk and thought about how maybe shard theory was all wrong and maybe I didn’t know what I was talking about. I didn’t need someone to beat me over the head with days of arguments and experimental results. In the end, I came back from my walk and realized I’d plotted the data incorrectly (the predicted outcome did in fact occur).
I think I’ve probably changed my mind on a range of smaller issues (closer to the size of the deceptive alignment case) but have forgotten about them. The presence of example (1) above particularly suggests to me the presence of similar google-doc-mediated insights which happened fast; where I remember one example, probably I have forgotten several more.
To conclude, I think people in comment sections do in fact spend lots of effort to avoid looking dumb, wrong, or falsified, and forget that they’re supposed to be seeking truth.
It seems to me that often people rehearse fancy and cool-sounding reasons for believing roughly the same things they always believed, and comment threads don’t often change important beliefs. Feels more like people defensively explaining why they aren’t idiots, or why they don’t have to change their mind. I mean, if so—I get it, sometimes I feel that way too. But it sucks and I think it happens a lot.
In part, I think, because the site makes truth-seeking harder by spotlighting monkey-brain social-agreement elements.”
Also:
https://www.lesswrong.com/w/updated-beliefs-examples-thereof?sortedBy=new
and the implications of Less Wrong having such a tag

Alan E Dunne 13 Mar 2025 22:19 UTC
−2 points
0
on: Alan E Dunne’s Shortform
On catastrophic risk and effective persuasion:
https://us06web.zoom.us/webinar/register/WN_j555baQqRjeWhiefEAQ82Q#/registration

Alan E Dunne 29 Sep 2023 19:59 UTC
1 point
0
in reply to: Steven Byrnes’s comment on: My Current Thoughts on the AI Strategic Landscape
Insofar as the slave carries out immediate work from fear of consequences, they are locally aligned with the master’s will.

Alan E Dunne 25 Sep 2023 19:42 UTC
1 point
0
on: Public Opinion on AI Safety: AIMS 2023 and 2021 Summary
How did you get respondents? Why are they “nationally representative”?

Alan E Dunne 24 Sep 2023 17:11 UTC
1 point
0
in reply to: Connor Barber’s comment on: What is to be done? (About the profit motive)
1/ evidence for these statements?
2/ in what sense is it profitable to throw away food or maintain empty dwellings that is distinct from “maintaining everyone else’s quality of life”?
3/ if the evil is that some people’s needs are not valued enough, could that not be remedied by giving them money, making it profitable to meet their needs?

Alan E Dunne 17 Sep 2023 18:26 UTC
2 points
1
on: Polarization is Not (Standard) Bayesian
Is martingale different from conservation of expected evidence?
https://www.lesswrong.com/posts/jiBFC7DcCrZjGmZnJ/conservation-of-expected-evidence
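One way to make the question concrete: conservation of expected evidence says the prior equals the probability-weighted average of the possible posteriors, which is the one-step form of the martingale property for a sequence of Bayesian credences. A minimal sketch of that one-step check follows; the function names and the probabilities 0.3, 0.8 and 0.1 are made up for illustration and are not taken from the post.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h, observed):
    """Bayes update on a binary observation (True = E seen, False = E not seen)."""
    like_h = p_e_given_h if observed else 1 - p_e_given_h
    like_not_h = p_e_given_not_h if observed else 1 - p_e_given_not_h
    joint_h = prior * like_h
    return joint_h / (joint_h + (1 - prior) * like_not_h)


def expected_posterior(prior, p_e_given_h, p_e_given_not_h):
    """Average the two possible posteriors, weighted by how likely each observation is."""
    p_e = prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    return (p_e * posterior(prior, p_e_given_h, p_e_given_not_h, True)
            + (1 - p_e) * posterior(prior, p_e_given_h, p_e_given_not_h, False))


# Hypothetical numbers: prior 0.3, P(E|H) = 0.8, P(E|not H) = 0.1.
prior = 0.3
gap = abs(expected_posterior(prior, 0.8, 0.1) - prior)
print(gap < 1e-12)  # the expected posterior matches the prior up to rounding
```

The check covers only a single update with likelihoods strictly between 0 and 1; whether the post's "martingale" condition adds anything beyond this is the question being asked.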

Alan E Dunne 10 Aug 2023 20:58 UTC
3 points
0
on: Evaluating GPT-4 Theory of Mind Capabilities
With Respect
Given that in more than a third of the cases where GPT and the answer set disagreed you thought GPT was right and the answer set was wrong, did you check for cases where GPT and the answer set agreed on an answer you thought was wrong?
Yours Sincerely

Alan E Dunne 8 Jul 2023 20:34 UTC
7 points
0
on: Review & rebuttal of “Why machines will never rule the world: artificial intelligence without fear”
Astral Codex Ten: https://astralcodexten.substack.com/p/your-book-review-why-machines-will

Alan E Dunne 7 Jul 2023 16:23 UTC
2 points
1
in reply to: Howie Lempel’s comment on: What are the best non-LW places to read on alignment progress?
This seems to have stopped in July 2022.

Alan E Dunne 5 Jul 2023 22:02 UTC
3 points
0
in reply to: Noosphere89’s comment on: [Linkpost] Introducing Superalignment
“Finally, we can test our entire pipeline by deliberately training misaligned models, and confirming that our techniques detect the worst kinds of misalignments (adversarial testing).”

Alan E Dunne 23 Jun 2023 18:06 UTC
1 point
0
in reply to: Alan E Dunne’s comment on: Idea: medical hypotheses app for mysterious chronic illnesses
In a “heatplot” or plots cf https://www.elsblog.org/the_empirical_legal_studi/2023/05/heatplots-for-correlation-coefficients-graphs.html

Alan E Dunne 23 Jun 2023 18:02 UTC
1 point
0
on: Idea: medical hypotheses app for mysterious chronic illnesses
You could also study the distribution of correlation strengths found over the range of correlations tested (or possible), and see how it compares with what would be expected by chance.
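A rough sketch of that comparison, assuming a shuffle-based (permutation) baseline as the meaning of "expected by chance": compute every pairwise correlation, then recompute them many times after shuffling each column independently, and compare how often strong correlations show up in each set. The data, variable names and the 0.3 threshold are all hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def all_pair_correlations(columns):
    """Correlation for every unordered pair of named columns."""
    names = sorted(columns)
    return {(a, b): pearson(columns[a], columns[b])
            for i, a in enumerate(names) for b in names[i + 1:]}


def chance_correlations(columns, rng, rounds):
    """Pairwise correlations after shuffling each column independently."""
    out = []
    for _ in range(rounds):
        shuffled = {k: rng.sample(v, len(v)) for k, v in columns.items()}
        out.extend(all_pair_correlations(shuffled).values())
    return out


def share_at_least(values, threshold):
    """Fraction of correlations whose absolute value reaches the threshold."""
    return sum(abs(v) >= threshold for v in values) / len(values)


# Hypothetical data: five unrelated variables plus one built to track var0.
rng = random.Random(0)
columns = {f"var{i}": [rng.gauss(0, 1) for _ in range(200)] for i in range(5)}
columns["linked"] = [x + rng.gauss(0, 1) for x in columns["var0"]]

observed = all_pair_correlations(columns)
baseline = chance_correlations(columns, rng, rounds=200)
print(share_at_least(observed.values(), 0.3), share_at_least(baseline, 0.3))
```

If the observed share of strong correlations is no larger than the shuffled baseline, the strong ones found are about what chance alone would produce.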

Alan E Dunne 12 Jun 2023 1:39 UTC
1 point
0
on: Andrew Ng wants to have a conversation about extinction risk from AI
https://www.lesswrong.com/posts/QzkTfj4HGpLEdNjXX/an-artificially-structured-argument-for-expecting-agi-ruin

Alan E Dunne 31 May 2023 21:25 UTC
6 points
0
on: Statement on AI Extinction—Signed by AGI Labs, Top Academics, and Many Other Notable Figures
skeptical reaction with one expression of support: https://statmodeling.stat.columbia.edu/2023/05/31/jurassic-ai-extinction/

Alan E Dunne 26 May 2023 14:25 UTC
5 points
1
in reply to: Daniel Kokotajlo’s comment on: Book Review: How Minds Change
https://statmodeling.stat.columbia.edu/2015/12/16/lacour-and-green-1-this-american-life-0/
and generally “beware the one of just one study”

Alan E Dunne 25 May 2023 18:46 UTC
3 points
0
on: Science in a High-Dimensional World
In 26 models taken from volumes 21 to 25 of the journal Law and Human Behavior, the highest R-squared (the proportion of VARIANCE, not variation, explained) was 40% and the second highest was 24%.
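To illustrate why the variance/variation distinction matters, here is a small sketch, assuming ordinary least squares with one predictor: R-squared is 1 minus the ratio of residual to total sum of squares, and the residual standard deviation shrinks only by a factor of sqrt(1 - R-squared). The toy data and function names are hypothetical; only the 40% and 24% figures come from the comment.

```python
def r_squared(xs, ys):
    """R-squared of a least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def residual_sd_fraction(r2):
    """Share of the original standard deviation left unexplained."""
    return (1 - r2) ** 0.5


# Toy data (hypothetical), where the fitted line explains 64% of the variance.
print(round(r_squared([1, 2, 3, 4], [1, 3, 2, 4]), 2))

# The two R-squared values quoted above.
for r2 in (0.40, 0.24):
    print(r2, round(residual_sd_fraction(r2), 3))
```

On this reading, a model with an R-squared of 40% still leaves a residual spread of roughly 77% of the original standard deviation, and one with 24% leaves roughly 87%.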

Alan E Dunne 21 May 2023 16:58 UTC
1 point
0
in reply to: Bezzi’s comment on: What fact that you know is true but most people aren’t ready to accept it?
Evidence?