Thanks for giving good context on your collaborative approach to rationality!
I deliberately bolded “sufficiently capable” and “extensive corpus of knowledge” as key general conditions. I stated that I view this as sitting along the Scaling Hypothesis trajectory: sufficient capabilities are tied to compute and parameters, and extensive knowledge is tied to data.
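To make that framing concrete, here is a minimal sketch of a Chinchilla-style scaling law (Hoffmann et al., 2022), where loss falls as a power law in parameter count N and training tokens D, so “capability” and “knowledge” scale along separate axes. The constants below are roughly the published fits; treat the numbers as illustrative rather than authoritative.

```python
# Minimal sketch of a Chinchilla-style scaling law (Hoffmann et al., 2022):
# loss falls as a power law in parameters N ("capability") and tokens D ("knowledge").
# Constants are approximately the published fits; illustrative only.

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    E, A, B = 1.69, 406.4, 410.7   # irreducible loss + fit coefficients
    alpha, beta = 0.34, 0.28       # parameter and data exponents
    return E + A / n_params**alpha + B / n_tokens**beta

# Scaling one axis alone hits a floor set by the other:
print(chinchilla_loss(70e9, 1.4e12))   # roughly Chinchilla's operating point
print(chinchilla_loss(700e9, 1.4e12))  # 10x params, same data: modest gain
```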
Getting to the point where the system is sufficiently capable across extensive knowledge is the part that I state requires human endeavour and ingenuity. The 8 points listed at the end are the core factors of my world model, which I believe need to be considered during this endeavour.
To give a concrete, exciting example: based on recent discussions I had in SF, it seems we’re close to a new approach for deterministic interpretability of common frontier model architectures. If true, this improves bidirectional integration between humans & AI (improved information exchange) and the accuracy of normative closure (stating what is being attempted versus an objective). I’ll post a review of the paper when it comes out, if I stop getting rate-limited lol.
That’s interesting, looking forward to hearing about that paper. Does this “new approach” use the CoT, or some other means?
Thanks for the clarification on your intended meaning. For my personal taste, I would prefer you be more careful that the language you use does not appear to deny real complexities or to assert guaranteed success.
For instance, the conditional you state is:
IF we give a sufficiently capable intelligent system access to an extensive, comprehensive corpus of knowledge
THEN two interesting things will happen
And you just confirmed in your prior comment that “sufficient capabilities are tied to compute and parameters”.
I am having trouble interpreting that in a way that does not approximately mean “alignment will inevitably happen automatically when we scale up”.
Perhaps if you could give me an idea of the high-level implications of your framework, that might give me a better context for interpreting your intent. What does it entail? What actions does it advocate for?
And you just confirmed in your prior comment that “sufficient capabilities are tied to compute and parameters”.
I am having trouble interpreting that in a way that does not approximately mean “alignment will inevitably happen automatically when we scale up”.
Sorry, this is another case where I play with language a bit: I view “parametrisation of an intelligent system” as a broad statement that includes architecting it in different ways. For example, some recent, more capable models ship a snapshot with fewer parameters than earlier snapshots; in that case, I take “parametrisation” to be a process that includes both summing the literal model parameters across the whole process and engineering novel architecture.
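To illustrate that broad reading of “parametrisation”, here is a toy sketch where the deployed snapshot is smaller than the process that produced it, e.g. a large teacher distilled into a small student. All names and sizes here are hypothetical, chosen only to show the distinction between snapshot parameters and process-wide parameters.

```python
# Hypothetical illustration of the broad "parametrisation" reading above:
# count parameters across the whole training pipeline (e.g., a large teacher
# distilled into a smaller deployed snapshot), not just the final checkpoint.

from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    params: float  # parameter count of the model used at this stage

pipeline = [
    Stage("pretrained teacher", 1.0e12),  # illustrative sizes, not real models
    Stage("distilled student", 8.0e9),
]

final_snapshot = pipeline[-1].params
process_total = sum(s.params for s in pipeline)
print(f"deployed snapshot: {final_snapshot:.1e} params")
print(f"'parametrisation' across process: {process_total:.1e} params")
```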
Perhaps if you could give me an idea of the high-level implications of your framework, that might give me a better context for interpreting your intent. What does it entail? What actions does it advocate for?
At a high level, I’m sharing things I derive from my world-model for humans + superintelligence. I’m advocating for exploration of these topics and discussing how this is changing my approach to the AI alignment efforts that I think hold the most promise.
If this “playing with language” is merely a stylistic choice, I would personally prefer you not intentionally redefine words with known meanings to mean something else. If this was instead due to the challenges of compressing complex ideas into fewer words, I can definitely relate to that challenge. But either way, I think your use of “parameters” in that way is confusing and undermines the reader’s ability to interpret your ideas accurately and efficiently.