Could you pitch the value of this, perhaps by linking the relevant section of your agenda post? Why should someone who encounters this paper be interested in it as more than a recreational math curiosity? E.g., where does it fit in your agenda? Why would someone trying to solve ASI-grade alignment need to invent this paper, or, if that's not obviously the case, what led you to be interested? If there were another potential Vanessa out there who was feeling pressure to solve ASI alignment and was getting interested in the current “field consensus”, why should they get interested in this line of work, and specifically in this subthread, such that reading this particular work would be a good use of their time?
This is a comment I’ve been meaning to make on most of your technical posts, because you seem to be one of very few people doing work like yours, and I think part of the reason is that its necessity isn’t explained. I’m planning to write a post pitching, among other things, your work, in the hope of laying out why the field should consider trying paths like yours harder; but I’m still relatively a noob at it, and you arguing for it yourself would probably capture the actual reasons more coherently and with better justification.
(I’d be interested in chatting on Discord before you respond here.)
I did link the relevant section of my agenda post:
A brief and simplified summary:
In order to have powerful learning algorithms with safety guarantees, we first need learning algorithms with powerful generalization guarantees that we know how to rigorously formulate (otherwise, how do you know the algorithm will correctly infer the intended goal/behavior from the training data?).
Additionally, in order to formally specify “aligned to human values”, we need to formally specify “human values”, and it seems likely that the specification of “X’s values” should be something akin to “the utility function w.r.t. which X has [specific type of powerful performance guarantees]”. These powerful performance guarantees are probably a form/extension of powerful generalization guarantees.
Both reasons require us to understand the kind of natural powerful generalization guarantees that efficient learning algorithms can satisfy. Moreover, such understanding would likely be applicable to deep learning as well, as it seems likely deep learning algorithms satisfy such guarantees, but we currently don’t know how to formulate them.
I conjecture that a key missing ingredient in deriving efficient learning algorithms with powerful guarantees (more powerful than anything we already understand in computational learning theory) is understanding the role of compositionality in learning. This is because compositionality is a ubiquitous feature of our thinking about the world, and, intuitively, particular forms of compositionality are strong candidates for properties that are both very general and strong enough to enable efficient learning. This line of thinking already led me to some success in the context of control theory, which is a necessary ingredient of the kind of guarantees we will ultimately need.
I identified sequence prediction / online learning in the deterministic realizable case as a relatively easy (but already highly non-trivial) starting point for investigating compositional learning.
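To make the setting concrete: a minimal sketch of online learning in the deterministic realizable case, using the standard halving algorithm (this is textbook material, not the construction from the paper or the post; the hypothesis class and data stream below are made up for illustration). A finite hypothesis class is assumed to contain the true target; the learner predicts the majority vote of the still-consistent hypotheses and discards any hypothesis that errs, which guarantees at most log2(|H|) mistakes.

```python
# Halving algorithm in the deterministic realizable online setting:
# the true labeling function is assumed to lie in a finite class H.
import math

def majority_predict(version_space, x):
    votes = sum(h(x) for h in version_space)
    return 2 * votes >= len(version_space)  # ties predict True

def run_halving(hypotheses, stream):
    """stream yields (x, true_label); returns the number of mistakes."""
    version_space = list(hypotheses)
    mistakes = 0
    for x, y in stream:
        if majority_predict(version_space, x) != y:
            mistakes += 1
        # realizability: the target always survives this filter
        version_space = [h for h in version_space if h(x) == y]
    return mistakes

# Illustration: threshold hypotheses on {0,...,8}; hidden target is t = 5.
H = [lambda x, t=t: x >= t for t in range(9)]
target = lambda x: x >= 5
stream = [(x, target(x)) for x in [3, 6, 5, 4, 7, 0, 2, 1]]
mistakes = run_halving(H, stream)
assert mistakes <= math.log2(len(H))  # mistake bound holds
```

Each mistake at least halves the version space, which is where the logarithmic mistake bound comes from; the open question is what stronger, compositional structure on the hypothesis class buys beyond this.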
For the reasons stated in the OP, this led me to ambiguous online learning.
I’m open to chatting on Discord.