Independent alignment researcher
Garrett Baker
playing to the group think is some sort of corruption, it seems to me.
I don’t understand this part of your response. Can you expand?
I expect much of the harm comes from people updating an appropriate amount from the post, not seeing the org/person’s reply because they never had to make any important decisions on the subject, then noticing later that many others have updated similarly, and subsequently doing a group think. Then the person/org is considered really very bad by the community, so other orgs don’t want to associate with them, and open phil no longer wants to fund them because
~~they’re all scaredy cats~~ they care about their social status. To my knowledge this hasn’t actually happened, though possibly this is because nobody wants to be talking about the relevant death-spiraled orgs.
Seems more likely the opposite is at play with many EA orgs like OpenPhil or Anthropic (Edit: in the sense that imo many are over-enthusiastic about them. Not necessarily to the same degree, and possibly for reasons orthogonal to the particular policy being discussed here), so I share your confusion about why orgs would force their employees to work over the weekend to correct misconceptions about them. I think most just want to seem professional and correct to others, and this value isn’t directly related to the core altruistic mission (unless you buy the signaling hypothesis of altruism).
Adding $80 to the pool.
Edited from $50 to $80 after realizing the market price I’m usually willing to pay for excellent math distillations.
GPT would likely give highly inconsistent answers, then go off and write a news article about this and other questions moral philosophers ponder.
ChatGPT is hesitant to give opinions unless you jailbreak it, so I’d guess few will succeed in getting it to give a hard position on anything except the most basic of questions, like “should fun exist in the future” (I just tried that, it won’t even give a straight answer when you ask that!).
I also didn’t want to make two arguments: one, that he should use prediction markets to choose what he reads, and two, that he should focus on helping the community rather than on his specified metric of worthiness.
LessWrong gives posts and comments from users with higher karma a greater starting karma, under the assumption that their posts and comments are better and more representative of the community. Probably the high karma you’re seeing is a result of that. I think this is mostly a good thing.
That particular guideline you quoted doesn’t seem to appear in my commenting guidelines text box.
I assume Rob is making this argument internally. I tentatively agree. Writing rebuttals is more difficult than reading them, though, so the calculation isn’t as clear.
A rock with the phrase “you’re wrong, I don’t exist!” taped to it will still have that phrase taped to it even if you utter the words “I think, therefore I am”. Similarly, an aligned AGI can still just continue to help out humans even if I link it this post. It would think to itself, “If I followed your argument, then I would help out humans less. Therefore, I’m not going to follow your argument”.
I want people to not discuss things in DMs, and discuss things publicly more. I also don’t think this is embarrassing for Quintin, or at all a public spectacle.
I think you should use a manifold market to decide on whether you should read the post, instead of the test this comment is putting forth. There’s too much noise here, which isn’t present in a prediction market about the outcome of your engagement.
Market here: https://manifold.markets/GarrettBaker/will-eliezer-think-there-was-a-sign
It would be nice if you posted GPT-4’s responses directly, along with your exact prompts. When I ask it similar questions, it does not sound like a 4-year-old or have a 4-year-old’s understanding of the relevant systems.
me:
Could you get drunk from drinking a drunk person’s blood?
ChatGPT-4:
No, it is not possible to get drunk from drinking a drunk person’s blood. Alcohol is metabolized in the liver, so it does not remain in the bloodstream. Even if a person had been drinking heavily, their blood would not contain enough alcohol to make someone else drunk by drinking it. Furthermore, drinking someone else’s blood can pose serious health risks, as it can transmit diseases such as hepatitis or HIV. It is not safe or recommended to consume blood in this way.
It is wrong, but not in the way a 4-year-old would be wrong.
I mean that on the margin right now, it seems good to have more well-written posts of this kind.
I felt there was greater emphasis in this post on the personal aspect. I also note that I like people reposting old ideas they’ve recently rediscovered or grokked, and so I liked the post irrespective of its differences from prior work.
Very cool! Thanks for linking the book.
I’m also pretty confident John knows about confounders. You may think he should have connected the idea with the Wikipedia page, but he has probably taken a statistics class.
I mean, you can argue the post is badly written, but my point is that I don’t think it counts as a reinvention.
Possibly worthwhile to ask this on the EA forum.
Why don’t you think the goal misgeneralization papers or the plethora of papers finding in-context gradient descent in transformers and resnets count as mesa-optimization?
I quite like the idea of a strike, and would like to hear about the feasibility of such a thing at large firms like OpenAI and DeepMind. If successful, I’d guess those at Anthropic would also need to pause development for the duration, even if everyone there believes they’re doing things safely.