Thanks for the comment! As someone who strong-upvoted and strong-agreed with Charlie’s comment, I’ll try to explain why I liked it.
I sometimes see people talking about how LessWrong comments are discouragingly critical, and I mostly feel confused, because I don’t really relate. I was very excited to see what the LW comments would be in response to this post, which is a major reason I asked you to cross-post it. I generally feel the same way about comments on my own posts, whether critical or positive. Positive comments feel nice, but I feel like I learn more from critical comments, so they’re probably equally good in my opinion. As long as the commenter puts non-negligible effort into conveying an interesting idea and doesn’t say “you/your post is stupid and bad,” I’m excited to get pretty much any critique.[1]
FWIW, I didn’t see Charlie’s comment as an attack,[2] but as a step in a conversational dance. Like, if this were a collaborative storytelling exercise, you were like “the hero found a magic sword, which would let him slay the villain” and Charlie was like “but the villain had his own magic that blocks the sword” and I as the audience was like “oh, an interesting twist, I can’t wait to find out what happens next.”
It would be better if Charlie had spelled out what he meant by “but RL,” and I can appreciate why you felt that was underexplained and confusing. Like, to continue the analogy, Charlie didn’t explain how the villain’s magic actually works or explain how the hero might get around it, which left you doing a lot of work to try to guess what Charlie was thinking. He also made some claims about sycophancy which were apparently wrong, and which you did a very good job of refuting.[3]
But I still think his underlying point was useful and a great starter for further discussion (from you or others). I’d very loosely restate it as “the labs are focusing more and more on RL lately. In the limit as you do more RL, your AI tends toward reward maximization, which is different from, and often at odds with, being a ‘nice guy.’ I wonder how this plays into the dynamic you described in your post!” I took the “I could be totally wrong about any of this” as implicit given we’re on LW, but idk if that’s accurate.
Yeah, I don’t know what to do about this. I’d be sad if some critical comments went away, even the somewhat less rigorous ones, since many feel useful to me. Of course, I would be even sadder if some posts didn’t get written at all because authors were discouraged by those comments, and I feel bad when people whose posts I like a lot feel bad about their posts.
I can sympathize with spending more time than I hoped to on replies to other people’s comments and feeling a bit burned out and frustrated by the end.[4] I still feel happy about their comments existing, though. Maybe we’d ideally have a stronger norm here saying “if you don’t have time to continue telling the story, it’s okay to stop on a cliffhanger.” I guess please feel free to not respond to this comment, or to respond very minimally.
[1] Not that I’ve never felt bad about a polite but critical comment on my work, but I still mostly feel grateful for those comments and consider them a net good.
[2] Not sure if you’d describe it that way either.
[3] I was very surprised by the refutation and learned a lot from it. Just another example of why I love when people post and comment on LessWrong!! :D
[4] This one too, actually. I feel like it’s a good comment, but I do also feel like “man, probably not many people are going to read this, and I had other things to work on, why do I do this to myself.”