faul_sname comments on Tyler Cowen’s challenge to develop an ‘actual mathematical model’ for AI X-Risk

faul_sname 17 May 2023 8:29 UTC
8 points
One mathematical model that seems like it would be particularly valuable to have here is a model of the shape of the curve relating resources invested to optimization power. The reason I think an explicit model would be valuable there is that a lot of the AI risk discussion centers around recursive self-improvement. For example, instrumental convergence / orthogonality thesis / pivotal acts are relevant mostly in contexts where we expect a single agent to become more powerful than everyone else combined. (I am aware that there are other types of risk associated with AI, like “better AI tools will allow for worse outcomes from malicious humans / accidents”. Those are outside the scope of the particular model I’m discussing.)

To expand on what I mean by this, let’s consider a couple of examples of recursive self-improvement.

For the first example, let’s consider the game of Factorio. Let’s specifically consider the “mine coal + iron ore + stone / smelt iron / make miners and smelters” loop. Each miner produces some raw materials, and those raw materials can be used to craft more miners. This feedback loop is extremely rapid, and once that cycle gets started the number of miners placed grows exponentially until all available ore patches are covered with miners.
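
As a rough illustration of that loop, here is a minimal sketch. The function name `simulate_miners`, the one-new-miner-per-miner-per-tick rate, and the slot count are all assumptions made for illustration, not actual Factorio numbers.

```shell
# Toy model of the miner feedback loop (all numbers are illustrative assumptions).
# Each tick, every existing miner yields enough material for one more miner,
# so the count doubles until every ore slot is occupied.
simulate_miners() {
    sm_miners=$1   # starting number of miners
    sm_slots=$2    # total ore-patch slots available (the hard ceiling)
    sm_ticks=$3    # number of time steps to print
    sm_t=0
    while [ "$sm_t" -lt "$sm_ticks" ]; do
        echo "$sm_t $sm_miners"
        sm_miners=$((sm_miners * 2))
        # Growth stops once the ore patches are fully covered.
        if [ "$sm_miners" -gt "$sm_slots" ]; then
            sm_miners=$sm_slots
        fi
        sm_t=$((sm_t + 1))
    done
}

simulate_miners 1 100 10
```

With these made-up numbers the count doubles each tick (1, 2, 4, ... 64) and then sits at the 100-slot ceiling. The curve is exponential only until the resource limit binds.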

For our second example, let’s consider the case of an optimizing compiler like gcc. A compiler takes some code, and turns it into an executable. An optimizing compiler does the same thing, but also checks if there are any ways for it to output an executable that does the same thing, but more efficiently. Some of the optimization steps will give better results in expectation the more resources you allocate to them, at the cost of (sometimes enormously) greater required time and memory for the optimization step, and as such optimizing compilers like gcc have a number of flags that let you specify exactly how hard it should try.

Let’s consider the following program:
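```
# Start with the smallest inlining budget.
INLINE_LIMIT=1
# <snip gcc source download / configure steps>
while true; do
    # Rebuild gcc with the currently installed gcc, trying slightly harder to inline.
    make CC="gcc" CFLAGS="-O3 -finline-limit=$INLINE_LIMIT"
    # Replace the installed gcc with the freshly built one.
    make install
    INLINE_LIMIT=$((INLINE_LIMIT+1))
done
```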
This is also a thing which will recursively self-improve, in the technical sense of “the result of each iteration will, in expectation, be better than the result of the previous iteration, and the improvements it finds help it more efficiently find future improvements”. However, it seems pretty obvious that this “recursive self-improver” will not do the kind of exponential takeoff we care about.

The difference between these two cases comes down to the shapes of the curves. So one area of mathematical modeling that I think would be pretty valuable would be:
1. Figure out what shapes of curves lead to gaining orders of magnitude more capabilities in a short period of time, given constant hardware
2. The same question, but given the ability to rent or buy more hardware
3. The same question, but now it can also invest in improving chip fabs, with each improvement requiring the same increase in investment that we have previously observed for chip fabs
4. What do the empirical scaling laws for deep learning look like? Do they look like they come in under the curves from 1-3? What if we look at the change in the best scaling laws over time—where does that line point?
5. Check whether your model now says that we should have been eaten by a recursively self-improving AI in 1982. If it says that, the model may require additional work.

I will throw in an additional $300 bounty for an explicit model of this specific question, subject to the usual caveats (payable to only one person, can’t be in a sanctioned country, etc), because I personally would like to know.
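
To make “shapes of curves” slightly more concrete, here is one toy family. It is purely an illustrative assumption of mine, not an empirical claim or an answer to the questions above. Suppose capability $C$ is reinvested into further improvement with a returns exponent $\alpha$ and a rate constant $k$ (both hypothetical parameters):

$$
\frac{dC}{dt} = k\,C^{\alpha}, \qquad C(0) = C_0 > 0,\quad k > 0
$$

Separating variables gives

$$
C(t) =
\begin{cases}
\left(C_0^{\,1-\alpha} + (1-\alpha)\,k\,t\right)^{\frac{1}{1-\alpha}} & \alpha \neq 1 \\
C_0\,e^{k t} & \alpha = 1
\end{cases}
$$

For $\alpha < 1$ growth is only polynomial in $t$. For $\alpha = 1$ it is exponential. For $\alpha > 1$ the solution is valid only for $t < t^{*}$ and blows up in finite time at

$$
t^{*} = \frac{C_0^{\,1-\alpha}}{(\alpha - 1)\,k}
$$

Under this sketch, the Factorio loop would look roughly like $\alpha \approx 1$ until the ore runs out, while the gcc loop would presumably sit well below $\alpha = 1$. Which regime real AI systems fall into is exactly the open question.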

Edit: Apparently Tyler Cowen didn’t actually bounty this. My $300 bounty offer stands, but it looks like you will not be getting additional money from Tyler.