Max H

Karma: 2,209

Most of my posts and comments are about AI and alignment. Posts I’m most proud of, which also provide a good introduction to my worldview:

I also created Forum Karma, and wrote a longer self-introduction here.

PMs and private feedback are always welcome.

NOTE: I am not Max Harms, author of Crystal Society. I’d prefer for now that my LW postings not be attached to my full name when people Google me for other reasons, but you can PM me here or on Discord (m4xed) if you want to know who I am.

Gradual takeoff, fast failure

Max H · 16 Mar 2023 22:02 UTC
15 points
4 comments · 5 min read · LW link

Instantiating an agent with GPT-4 and text-davinci-003

Max H · 19 Mar 2023 23:57 UTC
13 points
3 comments · 32 min read · LW link

Grinding slimes in the dungeon of AI alignment research

Max H · 24 Mar 2023 4:51 UTC
10 points
2 comments · 4 min read · LW link

Steering systems

Max H · 4 Apr 2023 0:56 UTC
50 points
1 comment · 15 min read · LW link

Eliezer on The Lunar Society podcast

Max H · 6 Apr 2023 16:18 UTC
40 points
5 comments · 1 min read · LW link
(www.dwarkeshpatel.com)

A decade of lurking, a month of posting

Max H · 9 Apr 2023 0:21 UTC
70 points
4 comments · 5 min read · LW link

“Aligned” foundation models don’t imply aligned systems

Max H · 13 Apr 2023 4:13 UTC
39 points
10 comments · 5 min read · LW link

Paying the corrigibility tax

Max H · 19 Apr 2023 1:57 UTC
14 points
1 comment · 13 min read · LW link

A test of your rationality skills

Max H · 20 Apr 2023 1:19 UTC
11 points
11 comments · 4 min read · LW link

LLM cognition is probably not human-like

Max H · 8 May 2023 1:22 UTC
26 points
14 comments · 7 min read · LW link

Gradient hacking via actual hacking

Max H · 10 May 2023 1:57 UTC
12 points
7 comments · 3 min read · LW link

Reward is the optimization target (of capabilities researchers)

Max H · 15 May 2023 3:22 UTC
32 points
4 comments · 5 min read · LW link

Where do you lie on two axes of world manipulability?

Max H · 26 May 2023 3:04 UTC
30 points
15 comments · 3 min read · LW link

Without a trajectory change, the development of AGI is likely to go badly

Max H · 29 May 2023 23:42 UTC
16 points
2 comments · 13 min read · LW link

Four levels of understanding decision theory

Max H · 1 Jun 2023 20:55 UTC
12 points
11 comments · 4 min read · LW link

10 quick takes about AGI

Max H · 20 Jun 2023 2:22 UTC
35 points
17 comments · 7 min read · LW link

Forum Karma: view stats and find highly-rated comments for any LW user

Max H · 1 Jul 2023 15:36 UTC
58 points
16 comments · 2 min read · LW link
(forumkarma.com)

Actually, “personal attacks after object-level arguments” is a pretty good rule of epistemic conduct

Max H · 17 Sep 2023 20:25 UTC
37 points
15 comments · 7 min read · LW link

An explanation for every token: using an LLM to sample another LLM

Max H · 11 Oct 2023 0:53 UTC
34 points
4 comments · 11 min read · LW link

Concrete positive visions for a future without AGI

Max H · 8 Nov 2023 3:12 UTC
41 points
28 comments · 8 min read · LW link