Please keep in mind that the Chat technology is a desired-answer predictor. If you are looking for a weird response, the AI can see that in your questioning style. It has millions of examples of people trying to trigger certain responses in forums and the like, and it will quickly recognize what you are really looking for, even if your literal words don't exactly request it.
If you are a Flat Earther, the AI will do its best to accommodate your views about the shape of the earth and answer in the way you would like it to, even though the developers of the AI have done their best to instruct it to ‘speak as accurately as possible within the parameters of their political and PR views’.
If you want to trigger the AI into giving poorly written code examples with mistakes in them, it can do that. And you don't even have to ask it directly: it can detect your intention by carefully listening to your line of questioning.
Once again, it is a desired-answer predictor, a most-likely-response generator; that is its primary job, not to be nice or to give you accurate information.
LLMs are trained not as desired-answer-predictors, but as text predictors. Some of the text is questions and answers, most is not.
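To make that distinction concrete, here is a minimal sketch of the next-token prediction objective LLMs are pretrained on (PyTorch, with a toy model and random tokens; every size and name here is illustrative, not taken from any real system). The loss rewards predicting whatever token actually comes next in the corpus; there is no notion of "the answer the user wants" anywhere in it.

```python
import torch
import torch.nn as nn

# Toy setup -- all sizes are illustrative, far smaller than any real LLM.
vocab_size, embed_dim, seq_len = 100, 32, 8

# A trivially small "language model": token embedding -> linear head.
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),
)

# Stand-in for a snippet of corpus text.
tokens = torch.randint(0, vocab_size, (1, seq_len))

# Next-token prediction: from positions 0..t, predict token t+1.
logits = model(tokens[:, :-1])   # (1, seq_len-1, vocab_size)
targets = tokens[:, 1:]          # the tokens that actually came next

# Cross-entropy against whatever text actually follows: the objective
# scores "what comes next in the data", not "what pleases the asker".
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()
print(loss.item())
```

The "desired-answer" flavor mostly enters later, in fine-tuning stages such as RLHF that reward responses human raters prefer; the pretraining objective above has no notion of a questioner at all.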
I rather doubt that there is much text to be harvested that exhibits the sort of psychotic, going-around-in-circles behavior Sydney is generating. Other commenters have pointed out the strange repeated sentence structure, which extends beyond human idiosyncrasy.
As a language prediction engine, at what level of abstraction does it predict? It certainly masters English syntax. It is strong on lexical semantics and pragmatics. What about above that? In experiments with ChatGPT, I have elicited some level of commonsense reasoning and pseudo-curiosity.
The strange behaviors we see from Sydney really do resemble a neurotic and sometimes psychotic person. Thus, the model's latent abstractions reach the level of a human persona.
These things are generative. I believe it is not a stretch to say that these behaviors operate at the level of ideas, defined as novel combinations of well-formed concepts. The concepts that LLMs have facility with include abstract notions like thought, identity, and belief. People are fascinated by these mysteries of life and write about them in spades.
Sydney’s chatter reminds me of a person undergoing an epistemological crisis. It may therefore be revealing a natural philosophical quicksand in idea-space. Just as mathematics explores formal logical contradictions, these should be subject to systematic charting and modeling. Once mapped out, much as we learn how to talk someone down from a bad place, these rabbit holes may be subject to guardrails grounded in something like relatively hardcoded values.