Thane Ruthenis comments on A Case for the Least Forgiving Take On Alignment

Thane Ruthenis 3 May 2023 15:37 UTC
3 points
0
Do you think that even the most verbally-tuned people are actually doing the heavy lifting of their high-level thinking wordlessly?
Yes. It’s a distinction similar to the one between whatever computations happen in LLM forward-passes and the way Auto-GPT exchanges messages with its subagents. Maybe it’s also a memory aid, such that memorizing the semantic representation of a thought serves as a shortcut to the corresponding mental state; but it’s not the real nuts-and-bolts of cognition. The heavy lifting is done by whatever process figures out what word to put next in the monologue; not by the inner monologue itself.
I expect that “plug-ins” that give a memory to the LLM, as people are already trying to develop, are viable. Do you expect otherwise? (Although they would not allow the LLM to learn new “instincts”.)
I think the instincts are the more crucial part, yes; perhaps I should’ve said “long-term adaptation” rather than “long-term memory”.
I do suspect the current training processes fundamentally shape LLMs’ architecture the wrong way, and not in a way that’s easy to fix with fine-tuning, or conceptually-small architectural adjustments, or plug-ins. But that’s my weakest claim, the one I’m only ~70% confident in. We’ll see, I suppose.
- rotatingpaguro 3 May 2023 16:03 UTC
  3 points
  0
  
  The heavy lifting is done by whatever process figures out what word to put next in the monologue; not by the inner monologue itself.
  
  It seems you use “monologue” in this sentence to refer to the sequence of words only, and then say that of course the monologue is not the cognition. With this I agree, but I don’t think that’s the correct interpretation of the combo “language of thought hypothesis” + “language of thought close to natural language”. Having a “language of thought” means that there is a linear stream of items, and that your abstract cognition works only by applying some algorithm to the stream buffer to append the next item. The tape is not the cognition, but the cognition can be seen as acting (almost) only on the tape. Then “language of thought close to natural language” means that the language of thought has a short encoding in natural language. You can picture this as the language of thought of a verbal thinker being a more abstract version of natural language, similarly to when you feel what to say next but lack the word.
  - Thane Ruthenis 3 May 2023 16:15 UTC
    3 points
    0
    cognition can be seen as acting (almost) only on the tape
    … If not for the existence of non-verbal cognition, which works perfectly well even without a “tape”. That suggests that the tape isn’t a crucial component, that the heavy lifting can be done by the abstract algorithm alone, and therefore that even in supposed verbal thinkers, that algorithm is likely what’s doing the actual heavy lifting.
    In my view, there’s an actual stream of abstract cognition, and a “translator” function mapping from that stream to human language. When we’re doing verbal thinking, we’re constantly running the translator on our actual cognition, which has various benefits (e.g., it’s easier to communicate our thoughts to other humans); but the items in the natural-language monologue are compressed versions of the items in the abstract monologue, and they’re strictly downstream of the abstract stream.
    - rotatingpaguro 3 May 2023 18:05 UTC
      3 points
      0
      So you think:
      
      1) There’s a “stream” of abstract thought, or “abstract monologue”;
      2) The cognition algorithm operates on/produces the abstract stream;
      3) Natural language is a compressed version of the abstract stream.
      
      Which seems to me the same thing I said above, unless maybe you are also implying either or both of these additional statements:
      
      a) The abstract cognition algorithm can not be seen as operating mostly autoregressively on its “abstract monologue”;
      
      b) The abstract monologue can not be translated to a longer, but boundedly longer, natural language stream (without claiming that this is what happens typically when someone verbalizes).
      
      Which of (a), (b) do you endorse, possibly with amendments?
      - Thane Ruthenis 3 May 2023 18:47 UTC
        5 points
        2
        Which of (a), (b) do you endorse, possibly with amendments?
        I don’t necessarily endorse either. But “boundedly longer” is doing a lot of work there. As I’d mentioned, cognition can also be translated into a finitely long sequence of NAND gates. The real question isn’t “is there a finitely-long translation?”, but “how much longer is that translation?”.
        And I’m not aware of any strong evidence suggesting that natural language is close enough to human cognition that the resultant stream would not be much longer. Long enough to be ruinously compute-intensive (effectively as ruinous as translating it into NAND-gate sequences).
        Indeed, I’d say there’s plenty of evidence to the contrary, given how central miscommunication is to the human experience.