Counterpoint: Sydney Bing was wildly unaligned, to the extent that it is even possible for an LLM to be aligned, and people thought it was cute / cool.
I was not precise enough in my language, and I agree with you highlighting that what “alignment” means for an LLM is a bit vague. While people felt Sydney Bing was cool, if it had not been possible to rein it in, it would have been very difficult for Microsoft to gain any market share. An LLM that doesn’t do what it’s asked or regularly expresses toxic opinions is ultimately bad for business.
In the above paragraph, understand “aligned” in the concrete sense of “behaves in a way that is aligned with its parent company’s profit motive”, rather than “acting in line with humanity’s CEV”. To rephrase the point I was making above, I feel much (perhaps even a majority) of today’s alignment research is focused on the first definition of alignment, whilst neglecting the second.