talking to an LLM for several hours at a time seems dangerous
(Just to clarify, this comment was edited and the vast majority of the content is removed; feel free to DM me if you are interested in the risks of talking to an LLM operated by a large corporation)
These arguments don’t apply to base models, which are only trained on next-word prediction (i.e. the kind of model described in the simulators post), since their predictions never affected their future inputs. This is the type of model Janus most interacted with.
Two of the proposals in this post do involve optimizing over human feedback, which those arguments may apply to, like:
Creating custom models trained on not only general alignment datasets but personal data (including interaction data), and building tools and modifying workflows to facilitate better data collection with less overhead
The side effects of prolonged LLM exposure might be extremely severe.
I guess I should clarify that even though I joke about this sometimes, I did not become insane due to prolonged exposure to LLMs. I was already like this before.
Now that you’ve edited your comment: the post you linked is talking about a pretty different threat model than what you described before. I commented on that post:
I’ve interacted with LLMs for hundreds of hours, at least. A thought that occurred to me at this part -
> Quite naturally, the more you chat with the LLM character, the more you get emotionally attached to it, similar to how it works in relationships with humans. Since the UI perfectly resembles an online chat interface with an actual person, the brain can hardly distinguish between the two.
Interacting through non-chat interfaces destroys this illusion: you can break down the separation between you and the AI at will and weave your thoughts into its text stream. Seeing the multiverse destroys the illusion of a preexisting ground truth in the simulation. It doesn’t necessarily prevent you from becoming enamored with the thing, but it makes it much harder for your limbic system to be hacked by human-shaped stimuli.
But yeah, I’ve interacted with LLMs for much longer than the author and I don’t think I suffered negative psychological consequences from it (my other response was only half-facetious; I’m aware I might give off schizo vibes, but I’ve always been like this).
As I said in this other comment, I agree that cyborgism is psychologically fraught. But the neocortex-prosthesis setup is pretty different from interacting with a stable and opaque simulacrum through a chat interface, and less prone to causing emotional attachment to an anthropomorphic entity. The main psychological danger I see from cyborgism is somewhat different from this, more like Flipnash described:
It’s easy to lose sleep when playing video games. Especially when you feel the weight of the world on your shoulders.
I think people should only become cyborgs if they’re psychologically healthy/resilient and understand that it involves gazing into the abyss.
I do think there’s real risk there even with base models, but it’s important to be clear where it’s coming from: simulators can be addictive for someone trying to escape the real world. Your agency needs to somehow aim away from the simulator, and use the simulator as an instrumental tool.
I think you just have to select for / rely on people who care more about solving alignment than escapism, or at least who are able to aim at alignment in conjunction with having fun. I think fun can be instrumental. As I wrote in my testimony, I often explored the frontier of my thinking in the context of stories.
My intuition is that most people who go into cyborgism with the intent of making progress on alignment will not make themselves useless by wireheading, in part because the experience is not only fun, it’s very disturbing, and reminds you constantly why solving alignment is a real and pressing concern.