One discussion question I’d be interested in hearing people’s thoughts on, which has to do with how I used the word ‘misalignment’ in the headline:
Do people think that companies like Twitter/X/xAI that (seemingly) don’t align their tools to broader human values are indeed creating tools that exhibit ‘misalignment’? Or are these tools seen not as ‘misaligned,’ but as aligned only to their makers’ motives (e.g., profit), which is to be expected? Relatedly, how should we be thinking about the alignment framework, especially in its historical context: as a program that was perhaps overly idealistic or optimistic about what companies would do to make AI generally safe and beneficial, or as a program that is, and always was, meant only to make AI aligned with its corporate controllers?
I imagine the framing of this question itself might be objected to in various ways; I just dashed this out.