meta-quest
Chris Datcu
Hi! I’m a first-time poster here, but a (decently) long-time thinker on earth. Here are some relevant directions that currently lack their due attention.
~ Multi-modal latent reasoning & scheming (and scheming derivatives) is an area that not only needs more research, but also broader awareness. Human thinking operates in a hyperspace of thoughts, many of which go beyond language. It seems possible that AIs might develop forms of reasoning that are harder for us to detect through purely language-based safety measures.
~ Multi-model interactions and the potential emergence of side-communication channels are also something that I’d like to see more work put into. How corruptible models are when interacting with corrupted models is a topic on which I haven’t yet seen much work. Applying group dynamics to the study of scheming seems worth pursuing, and Anthropic seems best suited for that.
~ If a pre-AGI model has the intent to become AGI+, how much can it orchestrate its path to AGI+ through its interactions with humans?
would a saner alternative then go along the lines of:
“I have decided to entertain thoughts and actions under the expectation that I will not go insane, because that’s the most adaptive and constructive way to face this situation, even though I can’t be certain”?
if so, I see a good dynamic for sanity:
- choose a (non-egocentric & constructive) narrative;
- guide thoughts to fit the chosen narrative.
slightly tangential question: how do you maintain coherence/continuity of narrative across contexts?