I’m an independent researcher currently working on a sequence of posts about consciousness. You can send me anonymous feedback here: https://www.admonymous.co/rafaelharth. If it’s about a post, you can add [q] or [nq] at the end if you want me to quote or not quote it in the comment section.
Rafael Harth
Inner Alignment: Explain like I’m 12 Edition
How to evaluate (50%) predictions
[Question] How to think about and deal with OpenAI
The case for Doing Something Else (if Alignment is doomed)
Why it’s so hard to talk about Consciousness
A guide to Iterated Amplification & Debate
Not-Useless Advice For Dealing With Things You Don’t Want to Do
Insights from Linear Algebra Done Right
One thing I’d like to say at this point is that I think you (jessicata) have shown very high levels of integrity in responding to comments. There’s been some harsh criticism of your post, and regardless of how justified it is, it takes character not to get defensive, especially given the subject matter. To me, this is also a factor in how I think about the post itself.
We tend to forget complicated things
I feel like you can summarize most of this post in one paragraph:
It is not the case that an observation of things happening in the past automatically translates into a high probability of them continuing to happen. Solomonoff Induction actually operates over possible programs that generate our observation set (and, by extension, the observable universe), and it may or may not be the case that the simplest universe is one in which any given trend persists into the future. There are also no easy rules that tell you when this happens; you just have to do the hard work of comparing world models.
I’m not sure the post says sufficiently many other things to justify its length.
I can’t really argue against this post insofar as it’s a description of your mental state, but it certainly doesn’t apply to me. I became way happier after trying to save the world, and I very much decided to try to save the world because of ethical considerations rather than because that’s what I happened to find fun. (And all of this is still true today.)
The “AI Dungeons” Dragon Model is heavily path dependent (testing GPT-3 on ethics)
Understanding Machine Learning (I)
[Question] Do you vote based on what you think total karma should be?
Eliezer Yudkowsky’s portrayal of a single self-recursively improving AGI (later overturned by some applied ML researchers)
I’ve found myself doubting this claim, so I’ve read the post in question. As far as I can tell, it’s a reasonable summary of the fast-takeoff position that many people still hold today. If all you meant to say was that there was disagreement, then fine. But saying ‘later overturned’ makes it sound like there is consensus, not that people are still having the same disagreement they’ve had for 13 years. (And your characterization in the paragraph I’ll quote below also gives that impression.)
In hindsight, judgements read as simplistic and naive in similar repeating ways (relying on one metric, study, or paradigm and failing to factor in mean reversion or model error there; fixating on the individual and ignoring societal interactions; assuming validity across contexts):
I think this argument doesn’t deserve anywhere near as much thought as you’ve given it. Caplan is committing a logical error, nothing else.
He probably reasoned as follows:

- If determinism is true, I am computable.
- Therefore, a large enough computer can compute what I will say.
- Since my reactions are just more physics, they should be computable as well, hence it should also be possible to tell me what I will do after hearing the result.
This is wrong because “what Caplan outputs after seeing the prediction of our physics simulator” is a system larger than the physics simulator and hence not computable by the physics simulator. Caplan’s thought experiment works as soon as you make it so the physics simulator is not causally entangled with Caplan.

I don’t think fixed points have any place in this analysis. Obviously, Caplan can choose to implement a function without a fixed point; in fact, he’s saying this in the comment you quoted. The question is why he can do this, since (by the above) he supposedly can’t.
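For concreteness, here’s a minimal sketch of that point, under my own illustrative assumption that Caplan’s policy is simply to do the opposite of whatever he is told he will do (the names `caplan` and `announce_prediction` are hypothetical, not from anyone’s actual argument):

```python
def caplan(prediction):
    # Illustrative policy: do the opposite of whatever he is told he will do.
    return "stay home" if prediction == "go out" else "go out"


def announce_prediction(policy):
    # A would-be perfect predictor that must announce its prediction to Caplan.
    # It can only be correct if the policy has a fixed point, i.e. some action a
    # with policy(a) == a.
    for guess in ("go out", "stay home"):
        if policy(guess) == guess:
            return guess
    return None  # no fixed point: every announced prediction gets falsified


print(announce_prediction(caplan))  # -> None
```

The simulator can compute Caplan-in-isolation just fine; it’s the combined system of “simulator plus Caplan reacting to its announced output” that has no consistent solution when the policy lacks a fixed point.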
See also my phrasing of this problem and Richard_Kennaway’s answer. I think the real problem with his quote is that it’s so badly phrased that the argument isn’t even explicit, which paradoxically makes it harder to refute. You first have to reconstruct the argument, and then it gets easier to see why it’s wrong. But I don’t think there’s anything interesting there.
I would agree with this if Eliezer had never properly engaged with critics, but he’s done that extensively. I don’t think there should be a norm that you have to engage with everyone, and “ok, choose one point and I’ll respond to that” seems better than no engagement at all. (Would you have been more enraged if he hadn’t commented at all?)