Inner alignment for simulators
Broadly agreed. I’d written a similar analysis of the issue before, where I also take into account path dynamics (i. e., how and why we actually get to Azazel from a random initialization). But that post is a bit outdated.
My current best argument for it goes as follows:
The central issue — the reason why “naive” approaches to just training an ML model to make good predictions will likely result in a mesa-optimizer — is that all such setups are “outer-misaligned” by default. They don’t optimize AIs towards being good world-models; they optimize them for making specific good predictions and then channeling those predictions through a low-bandwidth communication channel. (Answering a question, predicting the second part of a video/text, etc.)
That is, they don’t just simulate a world: they simulate a world, then locate some specific data they need to extract from the simulation, and translate them into a format understandable to humans.
As simulation complexity grows, it seems likely that these last steps would require powerful general intelligence/GPS as well. And at that point, it’s entirely unclear what mesa-objectives/values/shards it would develop. (Seems like it almost fully depends on the structure of the goal-space. And imagine if e. g. “translate the output into humanese” and “convince humans of the output” are very nearby, and then the model starts superintelligently optimizing for the latter?)
In addition, we can’t just train simulators “not to be optimizers”, in terms of locating optimization processes/agents within them and penalizing such structures. It’s plausible that advanced world-modeling is impossible without general-purpose search, and it would certainly be necessary inasmuch as the world-model would need to model humans.
If such a system seemed intuitively universal but wasn’t exploding, what kind of observation would tell you that it isn’t universal after all, and therefore salvage your claim?
Given that we don’t understand how current LLMs work, or what the “space of problems” generally looks like, it’s difficult to come up with concrete tests that I’m confident I won’t goalpost-move on. A prospective one might be something like this:
If you invent or find a board game of similar complexity to chess that [the ML model] has never seen before and explain the rules using only text (and, if [the ML model] is multimodal, also images), [a pre-AGI model] will not be able to perform as well at the game as an average human who has never seen the game before and is learning it for the first time in the same way.
I. e., an AGI would be able to learn problem-solving in completely novel, “basically-off-distribution” domains. And if a system that has capabilities like this doesn’t explode (and it’s not deliberately trained to be myopic or something), that would falsify my view.
But for me to be confident in putting weight on that test, we’d need to clarify some specific details about the “minimum level of complexity” of the new board game, and that it’s “different enough” from all known board games for the AI to be unable to just generalize from them… And given that it’s unclear in which directions it’s easy to generalize, I expect I wouldn’t be confident in any metric we’d be able to come up with.
I guess a sufficient condition for AGI would be “is able to invent a new scientific field out of whole cloth, with no human steering, as an instrumental goal towards solving some other task”. But that’s obviously an overly high bar.
As far as empirical tests for AGI-ness go, I’m hoping for interpretability-based ones instead. I. e., that we’re able to formalize what “general intelligence” means, then search for search in our models.
As far as my epistemic position, I expect three scenarios here:
We develop powerful interpretability tools, and they directly show whether my claims about general intelligence hold.
We don’t develop powerful interpretability tools before an AGI explodes like I fear and kills us all.
AI capabilities gradually improve until we get to scientific-field-inventing AGIs, and my mind changes only then.
In scenarios where I’m wrong, I mostly don’t expect to ever encounter any black-boxy test like you’re suggesting which seems convincing to me, before I encounter overwhelming evidence that makes convincing me a moot point.
(Which is not to say my position on this can’t be moved at all — I’m open to mechanical arguments about why cognition doesn’t work how I’m saying it does, etc. But I don’t expect to ever observe an LLM whose mix of capabilities and incapabilities makes me go “oh, I guess that meets my minimal AGI standard but it doesn’t explode, guess I was wrong on that”.)
Yeah, I think there’s a sharp-ish discontinuity at the point where we get to AGI. “General intelligence” is, as the name suggests, general — it implements some cognition that can efficiently derive novel heuristics for solving any problem/navigating arbitrary novel problem domains. And a system that can’t do that is, well, not an AGI.
Conceptually, the distinction between an AGI and a pre-AGI system feels similar to the distinction between a system that’s Turing-complete and one that isn’t:
Any Turing-complete system implements a set of rules that suffices to represent any mathematical structure/run any program. A system that’s just “slightly below” Turing-completeness, however, is dramatically more limited.
Similarly, an AGI has a complete set of some cognitive features that make it truly universal — features it can use to bootstrap any other capability it needs, from scratch. By contrast, even a slightly “pre-AGI” system would be qualitatively inferior, not simply quantitatively so.
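To make the flavor of that cliff concrete — this is my own stock illustration from the Chomsky hierarchy, not part of the original argument — a finite-state machine, which sits just one rung below automata with unbounded memory, cannot recognize balanced parentheses at arbitrary nesting depth, while a single unbounded counter suffices:

```python
def balanced(s: str) -> bool:
    """Decides balanced parentheses using one unbounded counter — a capability
    no finite-state machine (fixed memory) has for arbitrary nesting depth."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closing paren with nothing open
                return False
    return depth == 0

print(balanced("(()())"))  # True
print(balanced("(()"))     # False
```

The point of the analogy: the gap between “fixed memory” and “one unbounded counter” is qualitative, not quantitative — no amount of scaling the finite-state machine closes it.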
There’s still some fuzziness around the edges, like whether any significantly useful R&D capabilities only appear with post-AGI cognition, or to what extent being an AGI is a sufficient, and not merely a necessary, condition for an omnicidal explosion.
But I do think there’s a meaningful sense in which AGI-ness is a binary, not a continuum. (I’m also hopeful regarding nailing all of this down mathematically, instead of just vaguely gesturing at it like this.)
Oh, I think I should’ve made clearer that I wasn’t aiming that rant at you specifically. Just outlining my general impression of how the two views feel socially.
A lot of slow takeoff, gradual capabilities ramp-up, multipolar AGI world type of thinking. Personally, I agree with him this sort of scenario seems both more desirable and more likely.
I think the operative word in “seems more likely” here is “seems”. It seems like a more sophisticated, more realistic, more modern, and satisfyingly nuanced view, compared to “the very first AGI we train explodes like a nuclear bomb and unilaterally sets the atmosphere on fire, killing everyone instantly”. The latter seems like an old view, a boringly simplistic retrofuturistic plot. It feels like there’s a relationship between these two scenarios, and that the latter one is a rough first-order approximation someone lifted out of e. g. The Terminator to get people interested in the whole “AI apocalypse” idea at the onset of it all. Then we gained a better understanding, sketched out detailed possibilities that take into account how AI and AI research actually work in practice, and refined that rough scenario. As a result, we got that picture of a slower multipolar catastrophe.
A pleasingly complicated view! One that respectfully takes into account all of these complicated systems of society and stuff. It sure feels like how these things work in real life! “It’s not like the AI wakes up and decides to be evil,” perish the thought.
That seeming has very little to do with reality. The unilateral-explosion isn’t the old, outdated scenario — it’s simply a different scenario, operating on a different model of how intelligence explosions proceed. And as far as its proponents are concerned, its arguments haven’t been overturned at all, and nothing about how DL works rules it out.
But it sure seems like the rough naive view that the Real Experts have grown out of a while ago; and that those who refuse to update simply haven’t done that growing-up, haven’t realized there’s a world outside their chosen field with all these Complicated Factors you need to take into account.
It makes it pretty hard to argue against. It’s so low-status.
… At least, that’s how that argument feels to me, on a social level.
(Edit: Uh, to be clear, I’m not saying that there are no reasons to buy the multipolar scenario except “it seems shiny”, or that a reasonable person could not come to believe it for valid reasons. I think it’s incorrect, and that some properties unfairly advantage it in the social context, but I’m not saying it’s totally illegitimate.)
Goals are functions over the concepts in one’s internal ontology, yes. But having a concept for something doesn’t mean caring about it — your knowing what a “paperclip” is doesn’t make you a paperclip-maximizer.
The idea here isn’t to train an AI with the goals we want from scratch, it’s to train an advanced world-model that would instrumentally represent the concepts we care about, interpret that world-model, then use it as a foundation to train/build a different agent that would care about these concepts.
Now this is admittedly very different from the thesis that value is complex and fragile.
I disagree. The fact that some concept is very complicated doesn’t mean it won’t be represented in any advanced AGI’s ontology. Humans’ psychology, or the specific tools necessary to build nanomachines, or the agent-foundations theory necessary to design aligned successor agents, are all also “complex and fragile” concepts (in the sense that getting a small detail wrong would result in a grand failure of prediction/planning), but we can expect such concepts to be convergently learned.
Not that I necessarily expect “human values” specifically to actually be a natural abstraction — an indirect pointer at “moral philosophy”/DWIM/corrigibility seems much more plausible and much less complex.
Two agents with the same ontology and very different purposes would behave in very different ways.
I don’t understand this objection. I’m not making any claim isomorphic to “two agents with the same ontology would have the same goals”. It sounds like maybe you think I’m arguing that if we can make the AI’s world-model human-like, it would necessarily also be aligned? That’s not my point at all.
The motivation is outlined at the start of 1A: I’m saying that if we can learn how to interpret arbitrary advanced world-models, we’d be able to more precisely “aim” our AGI at any target we want, or even manually engineer some structures over its cognition that would ensure the AGI’s aligned/corrigible behavior.
I agree that the AI would only learn the abstraction layers it’d have a use for. But I wouldn’t take it as far as you do. With “human values” specifically, I agree the problem may be just that muddled — but not with any of the other nice targets: moral philosophy, corrigibility, and DWIM should be more concrete.
The alternative would be a straight-up failure of the NAH, I think; your assertion that “abstractions can be on a continuum” seems directly at odds with it. Which isn’t impossible, but this post is premised on the NAH working.
the opaque test is something like an obfuscated physics simulation
I think it’d need to be something weirder than just a physics simulation, to reach the necessary level of obfuscation. Like an interwoven array of highly-specialized heuristics and physical models which blend together in a truly incomprehensible way, and which itself can’t tell whether there’s etheric interference involved or not. The way Fermat’s test can’t tell a Carmichael number from a prime — it just doesn’t interact with the input number in a way that’d reveal the difference between their internal structures.
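For concreteness, here’s the standard number-theoretic fact being invoked (a well-known property of Carmichael numbers, not anything from the discussion itself): the smallest Carmichael number, 561 = 3 × 11 × 17, passes the Fermat test for every base coprime to it, exactly as a true prime does:

```python
import math

def fermat_test(n, bases):
    """n "passes" for base a iff a^(n-1) ≡ 1 (mod n) — necessary but not
    sufficient for primality."""
    return all(pow(a, n - 1, n) == 1 for a in bases)

# 561 = 3 * 11 * 17 is composite, yet it satisfies Fermat's congruence
# for every base coprime to it (Korselt's criterion).
bases = [a for a in range(2, 20) if math.gcd(a, 561) == 1]
print(fermat_test(561, bases))  # True — composite, but "prime-shaped" to this test
print(fermat_test(557, bases))  # True — 557 actually is prime
```

The test simply never interacts with the input in a way that exposes the structural difference between the two cases — which is the property the analogy leans on.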
By analogy, we’d need some “simulation” which doesn’t interact with the sensory input in a way that can reveal a structural difference between the presence of a specific type of tampering and the absence of any tampering at all (while still detecting many other types of tampering). Otherwise, we’d be able to detect the undesirable behavior with sufficiently advanced interpretability tools. Inasmuch as physical simulations spin out causal models of events, they wouldn’t fit the bill.
It’s a really weird image, and it seems like it ought to be impossible for any complex real-life scenarios. Maybe it’s provably impossible, i. e. we can mathematically prove that any model of the world with the necessary capabilities would have distinguishable states for “no interference” and “yes interference”.
Models of world-models is a research direction I’m currently very interested in, so hopefully we can just rule that scenario out, eventually.
It seems like there are plenty of hopes
Oh, I agree. I’m just saying that there don’t seem to be any approaches aside from “figure out whether this sort of worst case is even possible, and under what circumstances” and “figure out how to distinguish bad states from good states at the object-level, for whatever concrete task you’re training the AI on”.
Lazy World Models
It seems like “generators” should just be simple functions over natural abstractions? But I see two different ways to go with this, inspired either by the minimal latents approach, or by the redundant-information one.
First, suppose I want to figure out a high-level model of some city, say Berlin. I already have a “city” abstraction, let’s call it P(Λ), which summarizes my general knowledge about cities in terms of a probability distribution over possible structures. I also know a bunch of facts about Berlin specifically; let’s call their sum F. Then my probability distribution over Berlin’s structure is just P(X_Berlin) = P(Λ|F).
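A deliberately toy sketch of that first approach — the features and probabilities below are made up purely for illustration — conditioning a generic “city” prior on Berlin-specific facts:

```python
from fractions import Fraction

# Prior P(Λ): a "city" abstraction as a joint distribution over two
# toy structural features (invented numbers, purely illustrative).
prior = {
    ("has_metro", "coastal"): Fraction(1, 10),
    ("has_metro", "inland"):  Fraction(3, 10),
    ("no_metro",  "coastal"): Fraction(2, 10),
    ("no_metro",  "inland"):  Fraction(4, 10),
}

def condition(dist, consistent_with_facts):
    """P(Λ|F): keep only structures consistent with the known facts, renormalize."""
    kept = {s: p for s, p in dist.items() if consistent_with_facts(s)}
    total = sum(kept.values())
    return {s: p / total for s, p in kept.items()}

# Facts F about "Berlin" (toy): it has a metro system.
p_berlin = condition(prior, lambda s: s[0] == "has_metro")
print(p_berlin[("has_metro", "inland")])  # 3/4
```

The point is only the shape of the operation: the general abstraction is reused wholesale, and the city-specific model is just the prior restricted to whatever F rules in.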
Alternatively, suppose I want to model the low-level dynamics of some object I have an abstract representation for — in this case, suppose it’s the business scene of Berlin. I condition my abstraction of a business, P(B), on everything I know about Berlin, getting P(B|X_Berlin), then sample from the resulting distribution several times until I get a “representative set”. Then I model its behavior directly.
This doesn’t seem quite right, though.
… there seems to be a selection effect where GPS-instances that don’t care about preserving future API-call-context gets removed, leaving only subagent-y GPS-instances over time.
You’re not taking into account larger selection effects on agents, which select against agents that purge all those “myopic” GPS-instances. The advantage of shards and other quick-and-dirty heuristics is that they’re fast — they’re what you’re using in a fight, or when making quick logical leaps, etc. Agents which purge all of them, and keep only slow deliberative reasoning, don’t live long. Or, rather, agents which are dominated by strong deliberative reasoning tend not to do that to begin with, because they recognize the value of said quick heuristics.
In other words: not all shards/subagents are completely selfish and sociopathic — some (or most) want to keep select others around. So even those that don’t “defend themselves” can be protected by others, or not be targeted to begin with.
A “chips-are-tasty” shard is probably not so advanced as to have reflective capabilities, and e. g. a more powerful “health” shard might want it removed. But if you have some even more powerful preferences for “getting to enjoy things”, or a dislike of erasing your preferences for practical reasons, the health-shard’s attempts might be suppressed.
A shard which implements a bunch of highly effective heuristics for escaping certain death is probably not one that any other shard/GPS instance would want removed.
So the concern is that “the AI generates a random number, sees that it passes the Fermat test, and outputs it” is the same as “the AI generates a random action, sees that it passes [some completely opaque test that approves any action that either includes no tampering OR includes etheric interference], and outputs it”, right?
Yeah, in that case, the only viable way to handle this is to get something into the system that can distinguish between no tampering and etheric interference. Just like the only way to train an AI to distinguish primes from Carmichael numbers is to find a way to… distinguish them.
Okay, that’s literally tautological. I’m not sure this problem has any internal structure that makes it possible to engage with further, then. I guess I can link the Gooder Regulator Theorem, which seems to formalize the idea that “to get a model that learns to distinguish between two underlying system-states, we need a test that can distinguish between those two system-states”.
What are your current thoughts on the exact type signature of abstractions? In the Telephone Theorem post, they’re described as distributions over the local deterministic constraints. The current post also mentions that the “core” part of an abstraction is the distribution P[Λ], and its ability to explain variance in individual instances of Xi.
Applying the deterministic-constraint framework to trees, I assume it says something like “given certain ground-truth conditions (e. g., the environment of a savannah + the genetic code of a given tree), the growth of tree branches of that tree species is constrained like so, the rate of mutation is constrained like so, the spread of saplings like so, and therefore we should expect to see such-and-such distribution of trees over the landscape, and they’ll have such-and-such forms”.
Is that roughly correct? Have you arrived at any different framework for thinking about type signatures?
I think there’s a sense in which the Fermat test is a capability problem, not an interpretability/alignment problem.
It’s basically isomorphic to a situation in which sensor tampering is done via a method that never shows up in the AI’s training data. E. g., suppose it’s done via “etheric interference”, which we don’t know about, and which never fails and therefore never leads to any discrepancies in the data, so the AI can’t learn it via SSL either. Then the AI just… can’t learn about it, period. It’s not that it could in theory pick up on it but instead throws out that data, rolling both “no tampering” and “etheric tampering” into the same mechanistic explanation — it’s simply given no data on it to begin with. Only if etheric tampering ever fails, in a way that impacts the visible data, can the AI notice it and internally represent it.
Same for the Fermat primality test. If, during the training, we feed the AI a dataset of numbers that happens not to include any Carmichael numbers, it’s basically equivalent to teaching the AI on a dataset that includes no tampering of some specific type. If then, at runtime, a Carmichael number shows up (someone tampers with the sensors in that OOD way), the AI just fails, because it hasn’t been given the necessary training data not to fail.
So, yeah, if our sensors are going to be interfered-with by magic/aliens/supercriminals, in novel ways we don’t know about and which don’t show up on our training data, our AI won’t be able to foresee that either. But that’s a pure capability problem, solved via capability methods (catch them in the act and train the AI on the new data).
Edit: And if our threat model is the AI itself doing the tampering — well, it can hardly do it via a method it doesn’t know about, can it? And if it’s generating ~random actions and accidentally tampers with the sensors in a way it didn’t intend, that also seems like a capability problem that’ll be solved by routine capability work.
Also, in this case the AI’s generate-random-actions habits probably result in many more unintended side-effects than just sensor tampering, so its error should be easily noticed and corrected. The corner case is where there’s some domain of actions we don’t know about and the AI doesn’t know about, whose only side-effect is sensor tampering. But that seems very unlikely, and — again — that problem has nothing to do with the AI’s decision-making process, alignment, or lack thereof, and everything to do with no-one in the situation (AI or human) knowing how the world works.
There’s a specialized tool for that, actually! Snactiv. I’ve tried it, it’s pretty easy to use + you don’t have to put it down and pick it up when switching between snacking and typing.
You are arguing about agents being behaviorally aligned in some way on distribution, not arguing about agents being structured as wrapper-minds
I think I’m doing both. I’m using behavioral arguments as a foundation, because they’re easier to form, and then arguing that specific behaviors at a given level of capabilities can only be caused by some specific internal structures.
You are arguing that the above holds in setups where the only source of parameter updates is episodic reward, not arguing that the above holds in general across autonomous learning setups
Yeah, that’s a legitimate difference from my initial position: I wasn’t considering alternate setups like this when I wrote the post.
The part I think I’m still fuzzy on is why the agent limits out to caring about some correlate(s) of G, rather than caring about some correlate(s) of R.
Mainly because I don’t want to associate my statements with “reward is the optimization target”, which I think is a rather wrong intuition. As long as we’re talking about the fuzzy category of “correlates”, I don’t think it matters much? Inasmuch as R and G are themselves each other’s close correlates, a close correlate of one is likely a close correlate of the other.
(1) Hmm I don’t understand how this works if we’re randomizing the environments, because aren’t we breaking those correlations so the agent doesn’t latch onto them instead of the real goal?
(2) Also, in what you’re describing, it doesn’t seem like this agent is actually pursuing one fixed goal across contexts, since in each context, the mechanistic reason why it makes the decisions it does is because it perceives this specific G-correlate in this context, and not because it represents that perceived thing as being a correlate of G.
Consider an agent that’s been trained on a large number of games, until it reached the point where it can be presented with a completely unfamiliar game and be seen to win at it. What’s likely happening, internally?
The agent looks at the unfamiliar environment. It engages in some general information-gathering activity, fine-tuning the heuristics for it as new rules are discovered, building a runtime world-model of this new environment.
Once that’s done, it needs to decide what to do in it. It feeds the world-model to some “goal generator” feature, and it spits out some goals over this environment (which are then fed to the heuristics generator, etc.).
The agent then pursues those goals (potentially branching them out into sub-goals, etc.), and its pursuit of these goals tends to lead to it winning the game.
To Q1: The agent doesn’t have hard-coded environment-specific correlates of G that it pursues; the agent has a procedure for deriving-at-runtime an environment-specific correlate of G.
To Q2: Doesn’t it? We’re prompting the agent with thousands of different games, and the goal generator reliably spits out a correlate of “win the game” in every one of them; and then the agent is primarily motivated by that correlate. Isn’t this basically the same as “pursuing a correlate of G independent of the environment”?
As to whether it’s motivated to pursue the G-correlate because it’s a G-correlate — to answer that, we need to speculate on the internals of the “goal generator”. If it reliably spits out local G-correlates, even in environments it never saw before, doesn’t that imply that it has a representation of a context-independent correlate of G, which it uses as a starting point for deriving local goals?
If we were prompting the agent only with games it has seen before, then the goal-generator might just be a compressed lookup table: the agent would’ve been able to just memorize a goal for every environment it’s seen, and this procedure just de-compresses them.
But if that works even in OOD environments — what alternative internal structure do you suggest the goal-generator might have, if not one that contains a context-independent G-correlate?
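To make the contrast concrete, here’s a deliberately toy sketch — entirely my own illustration, with made-up structure and names — of the two candidate goal-generator internals:

```python
# Toy contrast (hypothetical): a memorizing goal-generator vs. one that
# derives local goals from a context-independent representation of G ("win").

MEMORIZED_GOALS = {
    "chess": "checkmate the opponent",
    "go": "surround more territory",
}

def lookup_goal_generator(env_name: str) -> str:
    """Compressed lookup table: only covers environments seen in training."""
    return MEMORIZED_GOALS[env_name]  # raises KeyError on anything OOD

def derived_goal_generator(world_model: dict) -> str:
    """Starts from a context-independent G-correlate ("reach terminal states
    rated as wins") and specializes it using the runtime world-model."""
    win_states = world_model["terminal_states_favoring_self"]
    return f"steer the game toward: {win_states}"

# The derived generator handles a never-before-seen game; the lookup one can't.
novel_game = {"terminal_states_favoring_self": "opponent's flag captured"}
print(derived_goal_generator(novel_game))
```

The lookup table fails closed on anything outside its keys; the derived version only needs the runtime world-model to expose a win condition, which is the structural difference the question is pointing at.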
Well, you do address this:
there’s no requirement that the mechanistic cause of the heuristic-generator suggesting a particular goal in this environment is because it represented that environment-specific goal as subserving G or some other fixed goal, rather than because it recognized the decision-relevant factors in the environment from previously reinforced experiences (without the need for some fixed goal to motivate the recognition of those factors).
… I don’t see a meaningful difference, here. There’s some data structure internal to the goal generator, which it uses as a starting point when deriving a goal for a new environment. Reasoning from that data-structure reliably results in the goal generator spitting out a local G-correlate. What are the practical differences between describing that data structure as “a context-independent correlate of G” versus “decision-relevant factors in the environment”?
Or, perhaps a better question to ask is, what are some examples of these “decision-relevant factors in the environment”?
E. g., in the games example, I imagine something like:
The agent is exposed to the new environment; a multiplayer FPS, say.
It gathers data and incrementally builds a world-model, finding local natural abstractions. 3D space, playable characters, specific weapons, movements available, etc.
As it’s doing that, it also builds more abstract models. Eventually, it reduces the game to its pure mathematical game-theoretic representation, perhaps viewing it as a zero-sum game.
Then it recognizes some factors in that abstract representation, goes “in environments like this, I must behave like this”, and “behave like this” is some efficient strategy for scoring the highest.
Then that strategy is passed down the layers of abstraction, translated from the minimalist math representation to some functions/heuristics over the given FPS’ actual mechanics.
Do you have something significantly different in mind?
As I mentioned, bad exploration and deceptive alignment are names for the same phenomenon at different levels of cognitive sophistication
I still don’t see it. I imagine “deceptive alignment”, here, to mean something like:
“The agent knows G, and that scoring well at G reinforces its cognition, but it doesn’t care about G. Instead, it cares about some V. Whenever it notices its capabilities improve, it reasons that this’ll make it better at achieving V, so it attempts to do better at G because it wants the outer optimizer to preferentially reinforce said capabilities improvement.”
This lets it decouple its capabilities growth from G-caring: its reasoning starts from V, and only features G as an instrumental goal.
But what’s the bad-exploration low-sophistication equivalent of this, available before it can do such complicated reasoning, that still lets it couple capabilities growth with better performance on G?
Can you walk me through that spectrum, of “bad exploration” to “deceptive alignment”? How does one incrementally transform into the other?
For reference, I think you’ve formed a pretty accurate model of my model.
Given this analysis, it seems like the default behavior is for the GPS API-calls to gradient hack away whatever other API-calls that would predictably result in in-distribution behaviors not getting preserved (e.g., value-compilation).
Yup. But this requires these GPS instances to be advanced enough to do gradient-hacking, and indeed be concerned with preventing their current values from being updated away. Two reasons not to expect that:
Different GPS instances aren’t exactly “subagents”, they’re more like planning processes tasked to solve a given problem.
Consider an impulse to escape from certain death. It’s an “instinctive” GPS instance; the GPS has been prompted with “escape certain death”, and that prompt is not downstream of abstract moral philosophy. It’s an instinctive reaction.
But this GPS instance doesn’t care about preventing future GPS instances from updating away the conditions that caused it to be initiated (i. e., the agent’s tendency to escape certain death). It’s just tasked with deriving a plan for the current situation.
It wouldn’t bother looking up what abstract-moral-philosophy is plotting and maybe try to counter-plot. It wouldn’t care about that.
(And even if it does care, it’d be far from its main priority, so it wouldn’t do effective counter-plotting.)
The hard-coded pointer to value compilation might simply be chiseled-in before the agent is advanced enough to do gradient-hacking. In that case, even if a given GPS instance would care to plot against abstract values, it just wouldn’t know how (or know that it needs to).
That said, you’re absolutely right that it does happen in real-life agents. Some humans are suspicious of abstract arguments for the greater good and refuse to e. g. go from deontologists to utilitarians. The strength of the drive for value compilation relative to the shards’ strength varies, and depending on it, the process of value compilation may be frozen at some arbitrary point.
It partly falls under the meta-cognition section. But in even more extreme cases, a person may simply refuse to engage in value-compilation at all, expressing a preference not to be coherent.
… Which is an interesting point, actually. We want there to be some value compilation, or agents just wouldn’t be able to generalize OOD at all. But it’s not obvious that we want maximum value compilation. Maximum value compilation leads to e. g. an AGI with a value-humans shard who decides to do a galaxy-brained merger of that shard with something else and ends up indifferent to human welfare. But maybe we can replicate what human deontologists are doing, and alter the “power distribution” among the AGI’s processes such that value compilation freezes just before this point?
I may be overlooking some reason this wouldn’t work, but seems promising at first glance.
I genuinely think it might be harder than the actual technical problem of alignment, and we ought to look for any path which isn’t that hopelessly doomed.