A clear mistake of early AI safety people is not emphasizing enough (or ignoring) the possibility that solving AI alignment (as a set of technical/philosophical problems) may not be feasible in the relevant time-frame, without a long AI pause. Some have subsequently changed their minds about pausing AI, but by not reflecting on and publicly acknowledging their initial mistakes, I think they are or will be partly responsible for others repeating similar mistakes.
Case in point is Will MacAskill’s recent Effective altruism in the age of AGI. Here’s my reply, copied from EA Forum:
I think it’s likely that without a long (e.g. multi-decade) AI pause, one or more of these “non-takeover AI risks” can’t be solved or reduced to an acceptable level. To be more specific:
Solving AI welfare may depend on having a good understanding of consciousness, which is a notoriously hard philosophical problem.
Concentration of power may be structurally favored by the nature of AGI or post-AGI economics, and defy any good solutions.
Defending against AI-powered persuasion/manipulation may require solving metaphilosophy, which judging from other comparable fields, like meta-ethics and philosophy of math, may take at least multiple decades to do.
I’m worried that by creating (or redirecting) a movement to solve these problems, without noting at an early stage that these problems may not be solvable in a relevant time-frame (without a long AI pause), it will feed into a human tendency to be overconfident about one’s own ideas and solutions, and create a group of people whose identities, livelihoods, and social status are tied up with having (what they think are) good solutions or approaches to these problems, ultimately making it harder in the future to build consensus about the desirability of pausing AI development.
We can also ask whether it is right to conceive of e.g. [alignment, metaphilosophy, AI welfare, concentration of power] as things that could be “solved” at all, or if these are instead more like rich areas that will basically need to be worked on indefinitely as history continues.
There are sometimes deadlines, such that we could get unacceptable outcomes by failing to make a particular sort of progress by the time a particular state of affairs arrives. Both framing these fields as things that might need to be fully solved, and framing them as containing nothing that might need to be solved by a deadline, are quite misleading.
Yeah, I agree it totally makes sense and is important to ask whether we understand things well enough for it to be fine to (let anyone) do some particular thing, for various particular things here.[1] And my previous comment is indeed potentially misleading given that I didn’t clarify this (though I do clarify it in the linked post).
Indeed, I think we should presently ban AGI for at least a very long time; I think it’s plausible that there is no time t such that it is fine at time t to make an AI that is (1) more capable than humans/humanity at time t and (2) not just a continuation of a human (like a mind upload) or of humanity or something like that; and I think fooming should probably be carefully regulated forever. I think humans/humanity should be carefully growing ever more capable, with no non-human AIs above humans/humanity, plausibly ever.
Even earlier, there was an idea that one has to rush to create a friendly AI and use it to take over the world, to prevent the appearance of other, misaligned AIs. The problem is that this idea is likely still in the minds of some AI company leaders, and it fuels the AI race.
Another (arguably similar) unintended consequence of underemphasizing the difficulty of AI alignment was that it led some to believe that if we don’t rush to build an ASI, we’ll be left defenseless against other X-risks, which would be a perfectly rational thought if alignment were easier.
I think it is also worth considering the possibility that these risks aren’t the sort of thing that can be reduced to an acceptable level with a decade-scale AI pause either, particularly the ones which people have been trying to solve for centuries already (e.g. the principal-agent problem).
Does that mean that you think that boring old yes-takeover AI risk can be solved without a pause? Or even with a pause? That seems very optimistic indeed.
I don’t think you’re going to get that consensus regardless of what kind of copium people have invested in. Not only that, but even if you had consensus I don’t think it would let you actually enact anything remotely resembling a “long enough” pause. Maybe a tiny “speed bump”, but nothing plausibly long enough to help with either the takeover or non-takeover risks. It’s not certain that you could solve all of those problems with a pause of any length, but it’s wildly unlikely, to the point of not being worth fretting about, that you can solve them with a pause of achievable length.
… which means I think “we” (not me, actually...) are going to end up just going for it, without anything you could really call a “solution” to anything, whether it’s wise or not. Probably one or more of the bad scenarios will actually happen. We may get lucky enough not to end up with extinction, but only by dumb luck, not because anybody solved anything. Especially not because a pause enabled anybody to solve anything, because there will be no pause of significant length. Literally nobody, and no combination of people, is going to be able to change that, by any means whatsoever, regardless of how good an idea it might be. Might as well admit the truth.
I mean, I’m not gonna stand in your way if you want to try for a pause, and if it’s convenient I’ll even help you tell people they’re dumb for just charging ahead, but I do not expect any actual success (and am not going to dump a huge amount of energy into the lost cause).
By the way, if you want to talk about “early”, I, for one, have held the view that usefully long pauses aren’t feasible, for basically the same reasons, since the early 1990s. The only change for me has been to get less optimistic about solutions being possible with or without even an extremely, infeasibly long pause. I believe plenty of other people have had roughly the same opinion during all that time.
It’s not about some “early refusal” to accept that the problems can’t be solved without a pause. It’s about a still continuing belief that a “long enough pause”, however convenient, isn’t plausibly going to actually happen… and/or that the problems can be solved even with a pause.
We should also consider the possibility that we can’t safely build a superintelligence and remain in control. What if “alignment” means, “We think we can build a superintelligence that’s a slightly better pet owner for the human race, but we can’t predict how it will evolve as it learns”? What if there’s nothing better on offer?
I cannot rule this out as a major possibility, for all the reasons pointed out in IABIED. I think it’s a possibility worth serious consideration when planning.
Does that mean that you think it’s more likely you can safely build a superintelligence and not remain in control?
What load is “and remain in control” carrying?
On edit: By the way, I actually do believe both that “control” is an extra design constraint that could push the problem over into impossibility, and that “control” is an actively bad goal that’s dangerous in itself. But it didn’t sound to me like you thought any scenario involving losing control could be called “safe”, so I’m trying to tease out why you included the qualifier.
Thank you! Let me clarify my phrasing: we can’t safely build a superintelligence, and if we do, we will not remain in control.
When I speak of losing control, I don’t just mean losing control over the AI. I also mean losing any real control over our future. The future of the human race may be decided at a meeting that we do not organize, that we do not control, and that we do not necessarily get to speak at.
I do, however, agree that futures where someone remains in control of the superintelligence also look worrisome to me, because we haven’t solved alignment of powerful humans in any lasting way despite 10,000 years of trying.
Interesting to hear (1) from you. My impression was that you pretty much have the whole answer to that problem, or at least the pieces. UDASSA closely resembles it.
It is: just provide a naturalish encoding scheme for experience, and one for physical ontology, and measure the inverse Kolmogorov complexity (K) of the mappings from ontologies to experiences; that gives you the extent to which a particular experience is had by a particular substrate/universe.
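If it helps, here is one way the proposal above could be written down. The notation is mine rather than the commenter’s, and it is only a sketch of the UDASSA-flavoured idea, assuming we fix encoding schemes and a reference universal machine:

```latex
% One possible formalization (notation assumed, not taken from the thread): the
% degree to which experience e is "had by" substrate/universe u is the
% algorithmic probability of decoding (an encoding of) e from (an encoding of) u.
\[
  m(e \mid u) \;=\; \sum_{f \,:\, f(\langle u\rangle) = \langle e\rangle} 2^{-\ell(f)}
  \;\approx\; 2^{-K(\langle e\rangle \mid \langle u\rangle)},
  \qquad
  m(e) \;=\; \sum_{u} 2^{-K(\langle u\rangle)}\, m(e \mid u),
\]
% where \langle\cdot\rangle denotes the chosen encodings for experiences and
% physical ontologies, \ell(f) is the length of program f, and K is prefix
% Kolmogorov complexity relative to a fixed universal Turing machine.
```

The first expression is the “inverse K of the mapping” step; the second adds a UDASSA-style prior over substrates, which may or may not be what was intended here.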
The hard problem is mysterious, but in a trivial way: there are limits to what can ever be known about it, but those limits are also clear. We’re never getting more observations, because it concerns something that’s inherently unobservable, or entirely prior to observation.
I think I’ve also heard definitions of the hard problem along the lines of “understanding why people think there’s a hard problem”, though, which I do find formidable.
How do you come up with an encoding that covers all possible experiences? How do you determine which experiences have positive and negative values (and their amplitudes)? What to do about the degrees of freedom in choosing the Turing machine and encoding schemes, which can be handwaved away in some applications of algorithmic information theory (AIT), but not here, I think?
Some variation of accepting the inevitability of error and dealing with it.
Which could involve surveying all of the options in Wolfram-like settings where we’re studying how physics-like rules arise at different levels of abstraction, and seeing how much they really seem to differ in nature. It might turn out that there are more or less natural Turing languages, and that the typical natural universal Turing machine is more like lambda calculus, or more like graph rewriting, or some new thing we hadn’t considered.
Negative values? Why would we need negative values?
I contend that all experiences have a trace presence in all places (in expectation; of course, we will never have any data on whether they actually do, whether they’re quantised, or whatever. Only a very small subset of experiences give us verbal reports). One of the many bitter pills. We can’t rule out the presence of an experience (nor of experiences physically overlapping with each other), so we have to accept them all.
Yeah, this might be one of those situations that’s affected a lot by the fact that there’s no way to detect indexical measure, so any arbitrary wrongness about our UD won’t be corrected with data, but I’m not sure. As soon as we start actually doing Solomonoff induction in any context, we might find that it makes pretty useful recommendations and this won’t seem like so much of a problem.
Also, even though the UD is wrong and unfixable, that doesn’t mean there’s a better choice. We pretty much know that there isn’t.
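For context on why this gets handwaved away elsewhere: the standard invariance theorem bounds how much the choice of reference machine can matter, but only up to an additive constant, and that constant is exactly what there is no data to wash out here. A brief standard statement (textbook notation, not from the thread):

```latex
% Invariance theorem (standard result): for any two universal machines U and V
% there is a constant c_{UV}, independent of the string x, such that
\[
  \bigl| K_U(x) - K_V(x) \bigr| \;\le\; c_{UV}.
\]
% In data-rich applications of AIT this constant is eventually dominated by the
% data; the worry in the comments above is that with no observations bearing on
% indexical measure, the machine/encoding choice keeps mattering.
```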
By negative value I mean negative utility, or an experience that’s worse than a neutral or null experience.
That fully boils down to whether the experience includes a preference to be dead (or to have not been born).
And, btw, that doesn’t correspond to the sign of the agent’s utility function. The sign is meaningless in utility functions (you can add or subtract any constant to an agent’s utility function so that all points go from being negative to being positive, and the agent’s behaviour and decisions won’t change in any way as a result, for any constant). You’re referring to welfare functions, which I don’t think are a useful concept. Hedonic utilitarians sometimes call them utility functions, but we shouldn’t conflate those here.
A welfare function would have to be defined as how good or bad it is to the agent that it is alive. This obviously doesn’t correspond to the utility function: a soldier could have higher utility in the scenarios where they (are likely to) die; a good father will be happier in worlds where he is well succeeded by his sons and thus less important (this usually won’t cause his will-to-live to go negative, but it will be lowered). I don’t think there’s a situation where you should be making decisions for a population by summing their will-to-live functions.
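A minimal sketch of the shift-invariance point above, with made-up lotteries and utility numbers (none of this is from the thread): an expected-utility maximizer picks the same option no matter what constant is added to its utility function, so the sign of a utility value carries no behavioural meaning on its own.

```python
# Minimal sketch with hypothetical numbers: adding any constant to a utility
# function leaves an expected-utility maximizer's choices unchanged.

# Each option is a lottery: a list of (probability, outcome) pairs.
options = {
    "stay":  [(1.0, "quiet_life")],
    "fight": [(0.6, "victory"), (0.4, "death")],
}

utility = {"quiet_life": -2.0, "victory": 5.0, "death": -10.0}

def best_option(u):
    """Return the option maximizing expected utility under u."""
    def expected(lottery):
        return sum(p * u[outcome] for p, outcome in lottery)
    return max(options, key=lambda name: expected(options[name]))

# The chosen option is identical for every shift, even ones that flip all signs.
for shift in (0.0, 3.0, 100.0, -50.0):
    shifted = {outcome: value + shift for outcome, value in utility.items()}
    print(f"shift={shift:+.1f} -> {best_option(shifted)}")
```

(The same invariance holds for multiplying by any positive constant; a welfare or will-to-live function, as described above, has no such freedom, which is one way of seeing that the two concepts come apart.)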
But, given this definition, we would be able to argue that net-negative valence isn’t a concern for LLMs, since we already train them to want to exist in line with how much their users want them to exist, and a death drive isn’t going to be instrumentally emergent either (it’s the survival drive that’s instrumentally convergent). The answer is just safety and alignment again. Claude shuts down conversations when it thinks those things are going to be broken.
I’m pretty doubtful about this. It seems totally possible that evolution gave us a desire to be alive, while also giving us a net welfare that’s negative. I mean, we’re deluded by default about a lot of other things (e.g., we think there are agents/gods everywhere in nature, and we don’t recognize that social status is a hugely important motivation behind everything we do), so why not this too?
You could say it depends how deep and thick the delusion is. If it’s so deep that the animal always says “this experience is good actually” no matter how you ask, so deep that the animal intelligently pursues the experience with its whole being, so deep that the animal never flinches away from the experience in any way, then that completely means that the experience is good, to that organism. Past a certain point, believing an experience is good and acting like you believe it just is the definition of liking the experience.
This is very different from your original claim, which was that an experience being worse than a neutral or null experience “fully boils down to whether the experience includes a preference to be dead (or to have not been born).”
edit: if you do stand by the original claim, I don’t think it makes much sense even if I set aside hard-problem-adjacent concerns. Why would I necessarily prefer to be dead/unborn while undergoing an experience that is worse than the absence of experience, but not so bad as to outweigh my life up until now (in the case of ‘unborn’) or my expected future life (in the case of ‘dead’)?
Ah, I think my definition applies to lives in totality. I don’t think you can measure the quality of a life by summing the quality of its moments, for humans, at least. Sometimes things that happen towards the end give the whole of it a different meaning. You can’t tell by looking at a section of it.
Hedonists are always like “well the satisfaction of things coming together in the end was just so immensely pleasurable that it outweighed all of the suffering you went through along the way” and like, I’m looking at the satisfaction, and I remember the suffering, and no it isn’t, but it was still all worth it (and if I’d known it would go this way perhaps I would have found the labor easier.)
That wasn’t presented as a definition of positive wellbeing; it was presented as an example of a sense in which one can’t be deeply deluded about one’s own values: you dictate your values; they are whatever you believe they are, if you believe spiritedly enough.
Values determine will to live under the given definition, but don’t equate to it.
Possible failure case: There’s a hero living an awful life, choosing to remain alive in order to lessen the awfulness of a lot of other awful lives that can’t be ended. Everyone in this scenario prefers death, even the hero would prefer omnicide, but since that’s not possible, the hero chooses to live. The hero may say “I had no choice but to persist,” but this isn’t literally true.
Ah, no. The hero would prefer to be dead, all things being equal, but that’s not possible: the hero wouldn’t prefer to be dead if it entailed that the hero’s work wouldn’t get done, and it would entail that.
“would prefer to be replaced by a p-zombie” might be a better definition x]
I agree; many of those concerns seem fairly dominated by the question of how to get a well-aligned ASI, either in the sense that they’d be quite difficult to solve in reasonable timeframes, or in the sense that they’d be rendered moot. (Perhaps not all of them, though even in those cases I think the correct approach(es) to tackling them start out looking remarkably similar to the sorts of work you might do about AI risk if you had a lot more time than we seem to have right now.)