Claims like “ideal utility maximisation is computationally intractable”
This implies it could be impossible to know, given the parameters and a function, what the arg max is. I don’t see how you can get around this. I don’t understand what you are ultimately concluding follows from that statement being false, or from it not being related to computation.
But that doesn’t mean that any particular, non-parameterized instance of the problem cannot be solved some other way, e.g. by exploiting a regularity in the particular instance,
This is probably equivalent, in the typical case, to using a probabilistic solution or an approximation. Exploiting a regularity means either you are using the average solution for that regularity, or the regularity has already been solved and is computable. I think the latter is somewhat unlikely.
using a heuristic or approximation or probabilistic solution,
This introduces accumulated uncertainty.
or that a human or AI can find a way of sidestepping the need to solve the problem entirely.
The whole world model is a chaotic, incomputable system. Example: how people will react to your plan, given no certain knowledge of which people will observe your plan.
that the main bottleneck on AI capabilities progress going forward will be researcher time to think up, design, implement, and run experiments.
It will be hardware, as it has always been. The current paradigm makes it seem like algorithms progress faster than hardware does, and that progress is bottlenecked by hardware. I don’t see any evidence against this.
Such an estimate, if accurate, would imply that the compute and energy requirements for human-level AGI are roughly approximated by scaling current AI systems by 5x.
Unsure about this claim. How are you relating neuron count to current systems? I think the actual bio anchors claim puts it at ~2040. 5x puts it within the next 2 years. We are also rapidly approaching the limits of silicon.
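As a hedged sketch of the arithmetic behind “within the next 2 years”: if frontier training compute keeps growing at roughly 2–3x per year (my own ballpark assumption, not a figure from the post or this comment), a 5x scale-up lands in about that window.

```python
# Rough sketch: time to a 5x scale-up, assuming frontier training compute grows
# ~2.5x per year (an assumed round number, not a sourced figure).
import math

annual_growth = 2.5
target_multiple = 5
years = math.log(target_multiple) / math.log(annual_growth)
print(f"~{years:.1f} years to a {target_multiple}x scale-up at {annual_growth}x/year")  # ~1.8 years
```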
Evolution discovering something doesn’t necessarily mean it is easy, just that it’s possible. We also have way more energy available than is provably required for human-level AGI.
An AI system that is only slightly superhuman might be capable of re-arranging most of the matter and energy in the visible universe arbitrarily
This claim seems fantastical. Ignoring the fact that we already have slightly superhuman narrow AIs that don’t enable us to do any of that, and that groups of smart humans are slightly superhuman, the AI would have to reverse entropy and probably generate unlimited energy to do that.
A misaligned human-level genius isn’t exactly scary either, unless it’s directing us somehow and gains a lot of influence. History is full of these, and yet we still stand.
LLMs are evidence that abstract reasoning ability emerges as a side effect of solving any sufficiently hard and general problem with enough effort.
I don’t see why this has to be the case. If language itself plays a role in the development of abstract thought, that could explain why LLMs in particular seem to have it.
Even at human level, 99% honesty for AI isn’t good enough.
This claim isn’t particularly convincing. A lying politician might lie 70% of the time and a relatively honest one 30% of the time; humans don’t come anywhere near 99% honesty either. I am also pretty confused by the story. What exactly is it portraying?
I don’t understand what you are ultimately concluding follows from that statement being false, or from it not being related to computation.
One thing I’m saying is that arguments of the form: “doing X is computationally intractable, therefore a superintelligence won’t be able to do X” are typically using a loose, informal definition of “computationally intractable” which I suspect makes the arguments not go through. Usually because X is something like “build nanotech”, but the actual thing that is provably NP-hard is something more like “solve (in full generality) some abstracted modelling problem which is claimed to be required to build nanotech”.
Another thing is that even if X itself is computationally intractable, something epsilon different from X might not be. Non-rhetorical question: what does it matter if utility maximization is not “ideal”?
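As a toy illustration of the X vs. X−ε point (my own example, not from the post): finding the exact optimum of a travelling-salesman instance is NP-hard in general, but a tour that is merely close to optimal is cheap to find.

```python
# Sketch: exact TSP needs a brute-force search over all tours (exponential),
# while a greedy nearest-neighbour tour is O(n^2) and usually close.
import itertools, math, random

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(8)]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tour_length(order):
    return sum(dist(pts[order[i]], pts[order[(i + 1) % len(order)]]) for i in range(len(order)))

# Exact optimum: check every permutation of the remaining cities.
best_perm = min(itertools.permutations(range(1, len(pts))), key=lambda p: tour_length((0,) + p))
exact = tour_length((0,) + best_perm)

# "Epsilon different" problem: a good-enough tour, built greedily.
unvisited, tour = set(range(1, len(pts))), [0]
while unvisited:
    nxt = min(unvisited, key=lambda j: dist(pts[tour[-1]], pts[j]))
    tour.append(nxt)
    unvisited.remove(nxt)
greedy = tour_length(tour)

print(f"exact {exact:.3f}, greedy {greedy:.3f}, overhead {100 * (greedy / exact - 1):.1f}%")
```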
The whole world model is a chaotic, incomputable system. Example: how people will react to your plan, given no certain knowledge of which people will observe your plan.
This is a great example. Non-computability has a precise technical meaning. If you want to link this meaning to a claim about a plan in the physical world or a limit on the ability to model human behavior, you have to actually do the work to make the link in a precise and valid way. (I’ve never seen anyone do this convincingly.)
Another example is the halting problem. It’s provably undecidable, meaning there is no fully general algorithm that will tell you whether a program halts or not. And yet, for many important programs that I encounter in practice, I can tell at a glance whether they will halt or not, and prove it so. (In fact, under certain sampling / distribution assumptions, the halting problem is overwhelmingly likely to be solvable for a given randomly sampled program.)
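A minimal sketch of that last point (my own example): no general halting decider can exist, yet specific programs are often trivially decidable by inspection.

```python
# The halting problem is undecidable in general, but these two instances are
# decidable at a glance.

def halts_obviously():
    total = 0
    for i in range(10):   # bounded loop, no recursion: provably halts
        total += i
    return total

def loops_obviously():
    while True:           # no exit condition: provably never halts
        pass

# A simple static check ("only bounded for-loops, no recursion") already settles
# halting for a large, useful class of programs, with no contradiction of the
# undecidability theorem.
print(halts_obviously())  # prints 45; loops_obviously() is never called
```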
One thing I’m saying is that arguments of the form: “doing X is computationally intractable, therefore a superintelligence won’t be able to do X” are typically using a loose, informal definition of “computationally intractable” which I suspect makes the arguments not go through.
Not really. For an extreme example, simulating the entire universe is computationally intractable because it would require more resources than exist in the universe. By intractable I personally just mean that it requires much more effort to simulate or guess than to experiment. An obvious real-world example is cracking a hash, like you said. None of this has to do with NP-hardness; it’s more an estimate of the raw computational power required, and whether that exceeds the computational resources theoretically available. Another factor is the amount of uncertainty the calculation will necessarily carry.
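Here’s a back-of-the-envelope version of that “raw computational power vs. theoretically available resources” framing, using the hash example; the numbers are my own rough, commonly cited estimates, not figures from this thread.

```python
# Sketch: energy to brute-force a 256-bit preimage at the Landauer limit vs. the
# Sun's entire lifetime energy output (all figures order-of-magnitude only).
import math

k_B = 1.38e-23                      # Boltzmann constant, J/K
T = 300                             # room temperature, K
landauer_J_per_op = k_B * T * math.log(2)

ops_needed = 2**255                 # expected guesses for a 256-bit brute force
energy_needed_J = ops_needed * landauer_J_per_op

sun_luminosity_W = 3.8e26
sun_lifetime_s = 3.1e17             # ~10 billion years
energy_available_J = sun_luminosity_W * sun_lifetime_s

print(f"needed ~{energy_needed_J:.1e} J, Sun's lifetime output ~{energy_available_J:.1e} J")
print(f"shortfall: ~{energy_needed_J / energy_available_J:.0e}x")
# Even an ideal computer at the thermodynamic limit falls ~12 orders of magnitude short.
```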
Another thing is that even if X itself is computationally intractable, something epsilon different from X might not be.
If X reduces to X-ε, then X-ε can’t be tractable either, or else X wouldn’t be intractable in the first place. And I don’t see any way around adding uncertainty by computing an X-ε that genuinely differs from X; again, this is just approximation restated. In a model for building nanotech, an easy way this pops up is when there are multiple valid hypotheses given the raw data, but you need high-powered tools to experimentally determine which one is right because of some unknown unknowns (say, a constant that hasn’t yet been measured to the requisite precision). This isn’t too contrived; it pops up in the real world a lot.
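A toy model of that underdetermination (my own made-up numbers): two theories that both fit within today’s error bars, where only a more precise measurement can separate them.

```python
# Two candidate theories predicting the same constant; both are "valid hypotheses"
# until the measurement uncertainty shrinks enough to exclude one.
theories = {"A": 2.72, "B": 2.70}
true_value = 2.718                  # the value nature actually uses (unknown in advance)

for label, sigma in [("current instruments", 0.05), ("better instruments", 0.005)]:
    survivors = [name for name, pred in theories.items()
                 if abs(pred - true_value) <= 2 * sigma]
    print(f"{label} (±{sigma}): compatible theories = {survivors}")
# current instruments (±0.05): compatible theories = ['A', 'B']
# better instruments (±0.005): compatible theories = ['A']
```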
If you want to link this meaning to a claim about a plan in the physical world or limit on ability to model human behavior, you have to actually do the work to make the link in a precise and valid way.
Easy: lots of independent decision makers means there is an exponential number of states the entire system can be in. I’m sure there are plenty of patterns in human behavior, but humans are diverse enough in their values and reactions that it would be difficult to know with certainty how particular humans will react. Therefore you can only assign probabilities to which state an action might cause the system to transition to. A lot of these probabilities will be best guesses as well, so there will be some sort of compounding error.
Concretely, in a world-domination plan, social engineering isn’t guaranteed to work. All it takes is for one of your actions to get the wrong reaction, i.e. to evoke suspicion, and you get shut down. And it may be impossible to know for certain whether you can avoid that ending.
I am not trying to relate it to computability theory, just showing that you get compounding uncertainty. Simulation of quantum dynamics at sufficient scale is an obvious example.
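For concreteness, a sketch with illustrative numbers of my own choosing: per-step uncertainty compounds over a long plan, and the space of joint reactions grows exponentially with the number of independent actors.

```python
# Compounding uncertainty over a multi-step plan, plus the exponential state count
# from many independent observers (all numbers are arbitrary illustrations).
n_steps, p_step = 50, 0.98          # 50 actions, each 98% likely to avoid suspicion
print(f"overall success probability: {p_step ** n_steps:.1%}")              # ~36%

n_people, reactions_each = 40, 3    # 40 observers, 3 coarse reactions each
print(f"joint reaction states to model: {reactions_each ** n_people:.2e}")  # ~1.2e19
```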
Another example is the halting problem. It’s provably undecidable, meaning there is no fully general algorithm that will tell you whether a program halts or not. And yet, for many important programs that I encounter in practice, I can tell at a glance whether they will halt or not, and prove it so.
If you know whether a program falls into a set of solvable programs, you’ve already solved the halting problem for that program. I assume you still double-check your logic. I also don’t see how you can prove anything for sufficiently large modules.
I don’t see why the halting problem is relevant here, nor do I see that paper proving anything about real-world programs. It’s talking about arbitrary tapes and programs. I don’t see how it directly relates to real-life programming or other problem classes.
And if you want to talk about NP-hardness, it seems that decision makers regularly encounter NP-hard problems. It’s not obvious that they’re avoidable, or why they would be. I don’t see why an ASI would, for example, be able to avoid the hardness of determining an optimal resource allocation. The only sidesteps I can think of are not needing the resources at all, or using a suboptimal solution.
This paper seems to explain it better than I can hope to: https://royalsocietypublishing.org/doi/10.1098/rstb.2018.0138
And settling for suboptimal solutions means getting stuck in local maxima, which seems to undermine the whole rearranging-atoms line of thought.
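To make the “use a suboptimal solution” sidestep concrete, here is a toy sketch (mine, not from the thread): 0/1 knapsack is NP-hard in general, and the cheap greedy answer can settle well short of the true optimum.

```python
# Exact vs. greedy 0/1 knapsack on a tiny instance: the greedy heuristic is fast
# but settles for a suboptimal allocation.
from itertools import combinations

items = [(60, 10), (100, 20), (120, 30)]   # (value, weight)
capacity = 50

def exact_best():
    best = 0
    for r in range(len(items) + 1):        # try every subset (exponential in #items)
        for combo in combinations(items, r):
            if sum(w for _, w in combo) <= capacity:
                best = max(best, sum(v for v, _ in combo))
    return best

def greedy():
    total_v = total_w = 0
    for v, w in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if total_w + w <= capacity:        # take items by value density while they fit
            total_v, total_w = total_v + v, total_w + w
    return total_v

print(f"exact {exact_best()}, greedy {greedy()}")   # exact 220, greedy 160
```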
How are you relating neuron count to current systems?
I’m not; the main point of the comparison method I propose is that it sidesteps the need to relate neurons in the brain to operations in AI systems.
The relevant question is what fraction of the brain is used to carry out a high-level task like speech recognition or visual processing. How to accurately measure that fraction might involve considering the number of neurons, energy consumption, or synaptic operations in a particular region of the brain, as a percent of total brain capacity. But the comparison between brains and AI systems is only on overall performance characteristics at high-level tasks.
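If I’ve understood the proposed comparison, the arithmetic would look something like the sketch below; the fraction and the cost figure are made-up placeholders, not measurements from the post.

```python
# Hedged sketch of the fraction-of-brain comparison: if an AI system roughly matches
# human performance on a task occupying some fraction of the brain, scale its cost
# by the inverse of that fraction (all inputs here are hypothetical).
task_fraction_of_brain = 0.2      # e.g. assume task X uses ~20% of brain capacity
ai_cost_for_task = 1.0            # cost of a current system matching humans at X
                                  # (in whatever units: FLOP/s, energy, dollars)

whole_brain_equivalent = ai_cost_for_task / task_fraction_of_brain
print(f"implied whole-brain-equivalent cost: ~{whole_brain_equivalent:.0f}x a current system")
# These placeholder inputs reproduce the post's illustrative "5x current systems";
# the real work is measuring the fraction and the performance match.
```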
5x puts it within the next 2 years.
I haven’t actually done the research to estimate what fractions of the brain are used to do which tasks; 5x was just an example. But it wouldn’t surprise me if there is enough existing compute already lying around for AGI, given the right algorithms. (And further, that those algorithms are not too hard for current human researchers to discover, for reasons covered in this post and elsewhere.)
I’m not; the main point of the comparison method I propose is that it sidesteps the need to relate neurons in the brain to operations in AI systems.
At best it may show that we need a constant factor less than what the brain has (a claim I’m highly doubtful of) to reach its intelligence.
And no one disputes that we can get better-than-human performance at narrower tasks with less-than-human compute. However, such narrow AIs also augment human capabilities.
How exactly are you sidestepping the computation requirements? The brain is fairly efficient at what it has to do; I would be surprised if, given the brain’s constraints, you could get more than 1 OOM more efficient. A brain also has much longer to learn.
But it wouldn’t surprise me if there is enough existing compute already lying around for AGI, given the right algorithms. (And further, that those algorithms are not too hard for current human researchers to discover, for reasons covered in this post and elsewhere.)
Do you have any evidence for these claims? I don’t think your evolution argument does much to show that they’re easy to find. I am also not convinced that current hardware is enough: the brain is far more efficient and parallel at approximate calculations than our current hardware. The exponential growth we’ve seen in model performance has always been accompanied by exponential growth in hardware. The algorithms used are typically really simple, which is what makes them scalable.
Maybe an algorithm could make the computer I’m using superintelligent, but I highly doubt that.
Also, I think it would be helpful to retract the numbers, or at least say they’re just a guess.