Great work! I have a technical question.
My current understanding is as follows:
1. If we have even one observable variable with an agreement observation, and for which the latent variables satisfy the exact naturality condition, then we can construct the transferability function exactly.
2. In the approximate case, if we have multiple observable variables that meet these same conditions, we can choose the specific variable (or set of variables; in the proofs you used a couple) that minimizes the errors. We would not need to use all of them.
Is this correct?
Additionally, I was wondering if you have attempted to implement the algorithm derived from the proof to construct the isomorphism. It seems that some effort could be dedicated to developing an algorithm that minimizes or reduces these errors. It could one day be helpful for interpreting and aligning different ontological frameworks, like mapping an alien Bayesian network to a human one.
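To gesture at what such an error-minimizing algorithm could look like, here is a toy sketch (entirely my own, not from the post): score candidate mediators by a conditional-mutual-information proxy for the mediation error, I(X1; X2 | Λ), and keep the candidate with the smallest error. The function name, the proxy choice, and the toy distributions are all hypothetical.

```python
# Toy sketch (mine, not from the post): score candidate latents by a
# mediation-error proxy and keep the best one. Discrete case only;
# the axes of the joint distribution array p are (X1, X2, Lambda).
import numpy as np

def eps_med_proxy(p):
    """I(X1; X2 | Lambda) for a 3-axis joint distribution p[x1, x2, lam].
    Zero exactly when X1 and X2 are independent given Lambda (exact mediation)."""
    p_lam = p.sum(axis=(0, 1))    # P[lam]
    p_x1lam = p.sum(axis=1)       # P[x1, lam]
    p_x2lam = p.sum(axis=0)       # P[x2, lam]
    total = 0.0
    for x1, x2, lam in np.ndindex(p.shape):
        if p[x1, x2, lam] > 0:
            total += p[x1, x2, lam] * np.log(
                p[x1, x2, lam] * p_lam[lam]
                / (p_x1lam[x1, lam] * p_x2lam[x2, lam])
            )
    return total

# Candidate A: X1, X2 are noisy copies of Lambda, independent given it.
q = np.array([[0.9, 0.1], [0.1, 0.9]])       # q[x, lam] = P[x | lam]
good = 0.5 * np.einsum("il,jl->ijl", q, q)   # exact mediator: eps ~ 0

# Candidate B: X1 = X2 = a coin that ignores Lambda entirely.
bad = np.zeros((2, 2, 2))
for x in range(2):
    for lam in range(2):
        bad[x, x, lam] = 0.25                # X1 == X2, Lambda uniform

candidates = {"good": good, "bad": bad}
best = min(candidates, key=lambda k: eps_med_proxy(candidates[k]))
print(best)  # → good
```

For real latent-alignment work one would of course need estimators for both error terms and a search over candidate subsets, but the selection step would have this shape.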
By scanning the graphical proof, I don’t see any issue with the following generalization of the Mediator Determines Redund Theorem:
Let X1,…,Xn, Λ, and Λ′ be random variables, and let X1,…,Xm be any non-empty subset of X1,…,Xn satisfying the following conditions:
- Λ Mediation: X1,…,Xm are independent given Λ
- Λ′ Redundancy: ∀j∈{1,…,m}: Λ′ ← Xj → Λ′
Then Λ′←Λ→Λ′.
In the above, I’ve weakened the Λ′ Redundancy hypothesis, requiring only that redundancy over some non-empty subset of the random variables is enough to conclude the thesis.
Does the above generalization work (and if not, why not)?
If the above holds, then just one observable random variable (with agreement) is enough to satisfy the Redundancy condition (Mediation is trivially true with one variable), and therefore ΛA is determined by ΛB. Moreover, in the general approximation case, if we have several sets of random variables that meet the naturality condition, we can choose the one that minimizes the errors (there’s some kind of trade-off between the ϵmed and ϵred errors).
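To spell out the selection rule I have in mind (schematic only; how ϵmed and ϵred actually combine would come from the bounds in your approximate version, and the symbols below are mine):

```latex
% Schematic: pick the non-empty subset S of observables minimizing the
% combined approximation error; the exact weighting of the two error
% terms depends on the bounds in the approximate theorem.
S^{*} \;=\; \operatorname*{arg\,min}_{\emptyset \neq S \subseteq \{X_1,\dots,X_n\}}
\Big( \epsilon_{\mathrm{med}}(S) \;+\; \epsilon_{\mathrm{red}}(S) \Big)
```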