Suppose we had a CoT-style transcript of every thought, word, email and action by the founder of a successful startup over the course of several years of its founding, and used this for RL: then we’d get a reward signal every time they landed a funding round, sales went up significantly, a hire they made or contract they signed clearly worked out well, and so forth — not enough training data by itself for RL, but perhaps a useful contribution.
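The reward structure described here would be extremely sparse. As a rough sketch (the event types and reward values below are purely illustrative assumptions, not anything from the comment), the signal extraction might look like this:

```python
# Hypothetical sketch of extracting a sparse RL reward signal from a
# founder's transcript: almost every step yields zero reward, with
# occasional positive rewards at clear business milestones.

MILESTONE_REWARDS = {
    "funding_round_closed": 1.0,       # landed a funding round
    "significant_sales_increase": 0.5,  # sales went up significantly
    "hire_worked_out": 0.3,             # a hire clearly worked out well
    "contract_succeeded": 0.3,          # a signed contract paid off
}

def rewards_for_transcript(events):
    """Map each transcript event to a scalar reward (0.0 for most steps)."""
    return [MILESTONE_REWARDS.get(e.get("type"), 0.0) for e in events]

# Years of transcript, only a handful of rewarded steps:
transcript = [
    {"type": "email_sent"},
    {"type": "funding_round_closed"},
    {"type": "meeting_held"},
    {"type": "significant_sales_increase"},
]
print(rewards_for_transcript(transcript))  # [0.0, 1.0, 0.0, 0.5]
```

The point of the sketch is just the ratio: a multi-year transcript might contain millions of steps and only a few dozen rewarded events, which is why this alone is not enough training data for RL.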
The SGD safety-pretraining equivalent would be to include that transcript in the pretraining dataset (or, since such data is very rare and useful/high-quality, perhaps in an entrepreneurship-specific fine-tuning dataset). So far, very similar. You would also (likely AI-assisted) look through the whole transcript, and if you located any portions where the behavior was less wise or less moral/aligned than the behavior we’d like to see from an aligned AI-entrepreneur, label that portion with <|unaligned|> tags (or whatever), and perhaps also supplement it with commentary on subjects like why it is less wise/moral/aligned than the standards for an aligned AI, what should have been done instead, and speculations about the likely results of those counterfactual actions.
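A minimal sketch of that labeling pass, assuming a hypothetical `looks_unaligned` judgment (in practice an AI-assisted review, not a string match) and illustrative placeholder commentary — only the tagging scheme itself comes from the description above:

```python
# Illustrative sketch of the safety-pretraining labeling step: scan
# transcript segments, wrap less wise/moral/aligned portions in
# <|unaligned|> tags, and attach commentary. The classifier below is a
# hypothetical stand-in for an AI-assisted review pass.

def looks_unaligned(segment: str) -> bool:
    """Hypothetical judgment call; a real pass would use an AI reviewer."""
    return "mislead" in segment.lower()

def label_transcript(segments):
    """Tag unaligned portions and append commentary; pass others through."""
    out = []
    for seg in segments:
        if looks_unaligned(seg):
            commentary = ("Commentary: this falls short of aligned-AI "
                          "standards; a better action and its likely "
                          "results would be discussed here.")
            out.append(f"<|unaligned|>{seg}<|/unaligned|> {commentary}")
        else:
            out.append(seg)
    return "\n".join(out)

segments = [
    "Negotiated the contract transparently.",
    "Decided to mislead the investor about churn.",
]
print(label_transcript(segments))
```

The aligned behavior stays in the dataset unchanged; only the flagged portions get tags and commentary, so the model learns both what the founder did and where an aligned AI should diverge.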
I don’t think this approach would lead to an AI that can autonomously come up with a new out-of-the-box innovative business plan, and found the company, and grow it to $1B/year revenue, over the course of years, all with literally zero human intervention.
…So I expect that future AI programmers will keep trying different approaches until they succeed via some other approach.
And such “other approaches” certainly exist—for example, Jeff Bezos’s brain was able to found Amazon without training on any such dataset, right?
(Such datasets don’t exist anyway, and can’t exist, since human founders can’t write down every one of their thoughts, there are too many of them and they are not generally formulated in English.)
It’s unclear to me how one could fine-tune a high-quality automated-CEO AI without such training sets (which I agree are impractical to gather — that was actually part of my point, though one might have access to, say, a CEO’s email logs, diary, and meeting transcripts). Similarly, to train one using RL, one would need an accurate simulation environment covering a startup and all its employees, customers, competitors, and other world events — which also sounds rather impractical.
In practice, I suspect we’ll first train an AI assistant/advisor to CEOs, and then use that to gather the data to train an automated-CEO model. Or else we’ll train something so capable that it can generalize from more tractable training tasks to being a CEO, and do a better job than a human even on a task it hasn’t been specifically trained on.