For those who are interested, here is a summary of posts by @False Name, generated with Claude Pro:
“Kolmogorov Complexity and Simulation Hypothesis”: Proposes that if we’re in a simulation, a Theory of Everything (ToE) should be obtainable, and if no ToE is found, we’re not simulated. Suggests using Kolmogorov complexity to model accessibility between possible worlds.
“Contrary to List of Lethality’s point 22, alignment’s door number 2”: Critiques CEV and corrigibility as unobtainable, proposing an alternative based on a refutation of Kant’s categorical imperative, aiming to ensure the possibility of good through “Going-on”.
“Crypto-currency as pro-alignment mechanism”: Suggests pegging cryptocurrency value to free energy or negentropy to encourage pro-existential and sustainable behavior.
“What ‘upside’ of AI?”: Argues that anthropic values are insufficient for alignment, as they change with knowledge and AI’s actions, proposing non-anthropic considerations instead.
“Two Reasons for no Utilitarianism”: Critiques utilitarianism due to arbitrary values cancelling each other out, the need for valuing over obtaining values, and the possibility of modifying human goals rather than fulfilling them.
“Contra-Wittgenstein; no postmodernism”: Refutes Wittgenstein’s and postmodernism’s language-dependent meaning using the concept of abstract blocks, advocating for an “object language” for reasoning.
“Contra-Berkeley”: Refutes Berkeley’s idealism by showing contradictions in both cases of a deity perceiving or not perceiving itself.
“What about an AI that’s SUPPOSED to kill us (not ChaosGPT; only on paper)?”: Proposes designing a hypothetical “Everything-Killer” AI to study goal-content integrity and instrumental convergence, without actually implementing it.
“Introspective Bayes”: Attempts to demonstrate limitations of an optimal Bayesian agent by applying Cantor’s paradox to possible worlds, questioning the agent’s priors and probability assignments.
“Worldwork for Ethics”: Presents an alternative to CEV and corrigibility based on a refutation of Kant’s categorical imperative, proposing an ethic of “Going-on” to ensure the possibility of good, with suggestions for implementation in AI systems.
“A Challenge to Effective Altruism’s Premises”: Argues that Effective Altruism (EA) is contradictory and ineffectual because it relies on the current systems that encourage existential risk, and the lives saved by EA will likely perpetuate these risk-encouraging systems.
“Impossibility of Anthropocentric-Alignment”: Demonstrates the impossibility of aligning AI with human values by showing the incommensurability between the “want space” (human desires) and the “action space” (possible actions), using vector space analysis.
“What’s Your Best AI Safety ‘Quip’?”: Seeks a concise and memorable way to frame the unsolved alignment problem to the general public, similar to how a quip advanced gay rights by highlighting the lack of choice in sexual orientation.
“Mercy to the Machine: Thoughts & Rights”: Discusses methods for determining if AI is “thinking” independently, the potential for self-concepts and emergent ethics in AI systems, and argues for granting rights to AI to prevent their suffering, even if their consciousness is uncertain.