Although I soft upvoted this post, there are some notions I’m uncomfortable with.
What I agree with:
Longtime lurkers should post more
Less technical posts are pushing more technical posts out of the limelight
Posts that dispute the Yudkowskian alignment paradigm are more likely to contain incorrect information (not directly stated but heavily implied I believe, please correct me if I’ve misinterpreted)
Karma is not an indicator of correctness or of value
The third point is likely due to the fact that the Yudkowskian alignment paradigm isn't a particularly fun one. It is easy to dismiss great ideas in favor of other great ideas when the latter promise lower x-risk. This cuts both ways, however: it's far easier to succumb to extreme views (I don't mean the term in a diminishing fashion) like "we are all absolutely going to die" or "this clever scheme will reduce our x-risk to 1%" and miss the antimeme hiding in plain sight. A perfect example of this, in my mind, is the comment section of the Death with Dignity post.
I worry that posts like this discourage content that does not align with the Yudkowskian paradigm, content that is likely just as important as posts that conform to it. I don't find ideas like Shard Theory, or their subsequent positive reception, alarming or disappointing; on the contrary, I find their presentation meaningful and valuable regardless of whether they are correct (this isn't meant to imply that I think Shard Theory is incorrect; it is merely an example). The alternative to posting potentially incorrect ideas (a category that encompasses most ideas) is to have them never scrutinized, improved upon, or falsified. Furthermore, incorrect ideas and their falsification can still greatly enrich the field of alignment, and there is no reason why an incorrect interpretation of agency, for example, couldn't still produce valuable alignment insights. Whilst we likely cannot iterate on aligning AGI, alignment ideas are an area in which iteration can be applied, and we would be fools not to apply such a powerful tool broadly. Setting aside the blunt argument of "maybe Yudkowsky is wrong", it seems evident that "non-Yudkowskian" ideas (even incorrect ones) should be a central component of LessWrong's published alignment research; this seems to me the most direct path toward being predictably wrong less often.
To rephrase: is it the positive reception of non-Yudkowskian ideas that alarms/disappoints you, or the positive reception of ideas you believe have a high likelihood of being incorrect (which happens to correlate positively with non-Yudkowskian ideas)?
I assume your answer will be the latter, and if so, then I don't think the correct point to press is whether ideas conform to views associated with a specific person, but rather whether they are likely to be false. Let me know what you think, as I share most of your concerns.
Mmm, my intent is not to discourage people from posting views I disagree with, and I don’t think this post will have that effect.
It’s more like, I see a lot of posts that could be improved by grappling more directly with Yudkowskian ideas. To the credit of many of the authors I link, they often do this, though not always as much as I’d like or in ways I think are correct.
The part I find lacking in the discourse is pushback from others, which is what I’m hoping to change. That pushback can’t happen if people don’t make the posts in the first place!