I agree that discernment is necessary (so maybe expand to RCRD?).
This lens is pretty clarifying, I think, relative to just repeatedly pointing out that “agency” in the sense of relentlessly pursuing a goal is trivially easy to add via scaffolding (so not the missing piece many people think it is), and that LLMs are already creative as hell. They might need a little prompting to get creative enough, but again, that’s trivial.
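To make the “agency via scaffolding is trivial” point concrete, here’s a minimal sketch of the kind of loop I mean. The `call_llm` function is a hypothetical stand-in for whatever model API you’d actually use; the point is just that relentless goal pursuit is a few lines of orchestration, not a new capability.

```python
# Minimal sketch: "relentless" goal pursuit via scaffolding.
# call_llm is a hypothetical stand-in for a real model API call.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("swap in your model API of choice")

def relentless_agent(goal: str, max_steps: int = 100) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):
        # Ask for the next step toward the goal, given everything so far.
        step = call_llm(
            f"Goal: {goal}\nProgress so far: {history}\n"
            "Propose and carry out the next step."
        )
        history.append(step)
        # Ask the model itself whether the goal is met; loop otherwise.
        done = call_llm(f"Goal: {goal}\nLatest result: {step}\nIs the goal met? yes/no")
        if done.strip().lower().startswith("yes"):
            break
    return history
```

Nothing in that loop is the hard part. The hard part is whether the “is this step any good / is the goal met” judgment is worth anything, which is the discernment question.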
Hm, what about “relentless creative refinement”? I’m not sure what “resourcefulness” directly points at.
Anyway, discernment does seem like the limiting factor. You’ve got to discern which of your relentlessly creative efforts are most worth further pursuit. I think discernment is a somewhat better term than the others I’ve seen used for this missing capability. Getting the right term seems worthwhile.
The following is just some of my follow-on thoughts on the path to discernment in agentic LLMs, and therefore on timelines. Perhaps this will be fodder for a future post. It’s pretty divergent from the original topic, so feel free to ignore.
Thinking about how humans acquire discernment in a given area should give some clues as to how hard it would be to add that to agentic LLMs.
Humans sometimes do discernment (IMO) with a bunch of very complex, explicit System 2 analysis of a situation to get a decent guess at whether an approach is good or working. Over enough examples and experiences, we can compile those many effortful judgments into effortless, mysterious intuitive judgments (closer to how “discernment” is usually used). Or we gather enough data and experience by using some faster rubric, like “I think those pants are fashionable because something else that person is wearing seems probably fashionable.”
So it’s either a bunch of online learning specific to a situation, or careful analysis following strategies and formulas that are less situation-specific and more general, but quite time-consuming. For instance, Google’s co-scientist project has a highly complex scaffold to create, evolve, and evaluate scientific hypotheses, including discerning their worth against the literature and in other ways, and it seems to work. But that system doesn’t have continuous learning to compile those laborious judgments into better fast ones. It’s unclear how far you could get by fine-tuning on the results of those judgments in a given domain.
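For what that kind of scaffolded judgment looks like in code, here’s a rough sketch of the general generate / evaluate / evolve pattern. This is not Google’s actual co-scientist architecture, just the general shape; `call_llm` and `parse_score` are hypothetical stubs.

```python
# Rough sketch of a generate -> evaluate -> evolve loop for hypotheses.
# Not the actual co-scientist system; call_llm is a hypothetical model stub.
import re

def call_llm(prompt: str) -> str:
    raise NotImplementedError("swap in a real model API")

def parse_score(text: str) -> float:
    # Hypothetical helper: pull the first number out of the critique text.
    match = re.search(r"\d+(\.\d+)?", text)
    return float(match.group()) if match else 0.0

def hypothesis_loop(question: str, rounds: int = 3, pool_size: int = 8) -> list[str]:
    pool = [call_llm(f"Propose a hypothesis for: {question}") for _ in range(pool_size)]
    for _ in range(rounds):
        scored = []
        for h in pool:
            # The "discernment" step: explicit, laborious System-2-style critique.
            critique = call_llm(
                f"Question: {question}\nHypothesis: {h}\n"
                "Critique against known literature and give a 1-10 score."
            )
            scored.append((parse_score(critique), h))
        scored.sort(reverse=True)
        survivors = [h for _, h in scored[: pool_size // 2]]
        # Evolve: refine or recombine the best candidates to refill the pool.
        pool = survivors + [
            call_llm(f"Improve or combine these hypotheses: {survivors}")
            for _ in range(pool_size - len(survivors))
        ]
    return pool
```

The expensive part is the per-hypothesis critique; compiling that into something cheap and intuitive is exactly the fine-tuning question above.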
The other approach would be to create training datasets that include many more, and better, explicit value judgments than text corpora usually do. I don’t know how easy or hard that would be.
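I’m imagining something like candidate-plus-judgment-plus-rationale records; the field names here are invented, just to show the shape such data might take.

```python
# Hypothetical shape of a "value judgment" training record (field names invented).
import json

record = {
    "context": "Choosing between two experimental designs for testing drug X.",
    "candidate": "Design A: small pilot with surrogate endpoints.",
    "judgment": "weak",
    "rationale": "Surrogate endpoints here historically correlate poorly with real outcomes.",
}

# One JSON object per line is a common format for fine-tuning data.
print(json.dumps(record))
```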
To me this suggests that adding discernment isn’t trivial, but also doesn’t require breakthroughs to add some, leaving the question of how much discernment is enough for any given purpose.