In what sense are these two viewpoints in tension?
This seems more a question of “observable by whom” vs “observable in principle.”
This works too, yeah.
Yeah, I was thinking it’s hard to beat dried salted meat, hard cheese, and oil or butter.
You also don’t have to assume that all the food travels the whole way. If (hypothetically) you want to send 1 soldier’s worth of food and water 7 days away, and each person can only carry 3 days’ worth at a time, then you can try to have 3 days’ worth deposited 6 days out, and then have a porter make a 2-day round trip carrying 1 day’s worth to leave for that soldier to pick up on day 7. Then someone needs to have carried that 3 days’ worth to 6 days out, which you can do by having more porters make 1-day round trips from 5 days out, etc. Basically, you need exponentially more people and supplies the farther out your supply chains stretch. I think I first read about this in the context of the Incas, because potatoes are less calorie-dense per pound than dried grains, so it’s an even bigger problem? Being able to get water along the way, and ideally to pillage the enemy’s supplies, is also a very big deal.
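To make the blow-up concrete, here is a toy version of that relay math (a sketch of my own, assuming one-day relay legs, porters who eat one day’s worth of food per day of walking, and the 3-day carrying capacity from the example above; the numbers are illustrative, not historical):

```python
# Toy supply-relay model: a porter carrying `capacity` days of food eats 1 day's
# worth per day of walking, so a 1-day-out, 1-day-back leg burns 2 days and
# delivers at most capacity - 2. Each extra leg multiplies the food that must
# leave the origin by capacity / (capacity - 2).

def food_at_origin(days_delivered, relay_legs, capacity=3):
    """Days of food needed at the origin to leave `days_delivered` days of food
    at a depot `relay_legs` one-day legs away."""
    needed = days_delivered
    for _ in range(relay_legs):
        needed *= capacity / (capacity - 2)
    return needed

print(food_at_origin(1, 4))  # 81.0 -- with capacity 3, every extra leg triples the cost
```

With a carrying capacity of 3 days, each extra leg triples the food that has to leave home, which is the exponential part.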
I think at that point the limiting factors become the logistics of food, waste, water, and waste heat. In Age of Em Robin Hanson spends time talking about fractal plumbing systems and the like, for this kind of reason.
All good points, many I agree with. If nothing else, I think that humanity should pre-commit to following this strategy whenever we find ourselves in the strong position. It’s the right choice ethically, and may also be protective against some potentially hostile outside forces.
However, I don’t think the acausal trade case is strong enough that I would expect all sufficiently powerful civilizations to have adopted it. If I imagine two powerful civilizations with roughly identical starting points, one of which expanded while being willing to pay costs to accommodate weaker allies while the other did not and instead seized whatever they could, then it is not clear to me who wins when they meet. If I imagine a process by which a civilization becomes strong enough to travel the stars and destroy humanity, it’s not clear to me that this requires it to have the kinds of minds that will deeply accept this reasoning.
It might even be that the Fermi paradox makes the case stronger—if sapient life is rare, then the costs paid by the strong to cooperate are low, and it’s easier to hold to such a strategy/ideal.
This seems to completely ignore transaction costs for forming and maintaining an alliance? Differences in the costs to create and sustain different types of alliance-members? Differences in the potential to replace some types of alliance-members with other or new types? There can be entities for whom forming an alliance that contains humanity will cause them to incur greater costs than humanity’s membership can ever repay.
Also, I agree that in a wide range of contexts this strategy is great for the weak and for the only-locally-strong. But if any entity knows it is strong in a universal or cosmic sense, this would no longer apply to it. Plus everyone less strong would also know this, and anyone who truly believed they were this strong would act as though this no longer applied to them either. I feel like there’s a problem here akin to the unexpected hanging paradox that I’m not sure how to resolve except by denying the validity of the argument.
On screen space:
When, if ever, should I expect actually-useful smart glasses or other tech to give me access to arbitrarily large, high-res virtual displays without taking up a lot of physical space or confining me to a single, fixed desk?
On both the Three Body Problem and economic history: It really is remarkably difficult to get people to see that 1) Humans are horrible, and used to be more horrible, 2) Everything is broken, and used to be much more broken, and 3) Actual humans doing actual physical things have made everything much better on net, and in the long run “on net” is usually what matters.
On the Paul Ehrlich organization: Even if someone agrees with these ideas, do they not worry what this makes kids feel about themselves? Like, I can just see it: “But I’m the youngest of 3! My parents are horrible and I’m the worst of all!”
And this, like shame-based cultural norm enforcement, disproportionately punishes those who care enough to want to be pro-social and conscientious, with extra suffering.
Got it, makes sense, agreed.
I agree that filling a context window with worked sudoku examples wouldn’t help for solving hidoku. But there is a common element to the two games. Both look like math, but aren’t about numbers except that there’s an ordered sequence; the sequence of items could just as easily be an alphabetically ordered set of words. Both are much more about geometry, or topology, or graph theory: how a set of points is connected. I would not be surprised to learn that there is a set of tokens, containing no examples of either game, combined with a checker (like your link has) that points out when a mistake has been made, that enables solving a wide range of similar games.
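To make “a checker that points out when a mistake has been made” concrete, here’s a minimal sketch of my own (not the one in your link), where the only game-specific part is a list of constraints; a sudoku-style rule and a hidoku-style rule are just two possible entries in that list:

```python
# Generic mistake-checker: a "game" is a list of constraints, each a function
# that inspects the partly-filled grid and returns a complaint (or None).
# grid is a dict mapping (row, col) cells to symbols, with None for empty.

def find_mistakes(grid, constraints):
    """Run every constraint against the grid and collect any complaints."""
    return [msg for check in constraints for msg in [check(grid)] if msg]

def all_different(cells, name):
    """Sudoku-style rule: the filled cells in this group must hold distinct symbols."""
    def check(grid):
        filled = [grid[c] for c in cells if grid.get(c) is not None]
        if len(filled) != len(set(filled)):
            return f"duplicate value in {name}"
    return check

def consecutive_adjacent():
    """Hidoku-style rule: consecutive numbers must sit in (king-move) adjacent cells."""
    def check(grid):
        where = {v: c for c, v in grid.items() if v is not None}
        for v, (r, c) in where.items():
            nxt = where.get(v + 1)
            if nxt is not None and max(abs(nxt[0] - r), abs(nxt[1] - c)) > 1:
                return f"{v} and {v + 1} are not adjacent"
    return check
```

The point is just that the checking loop doesn’t care whether the symbols are digits or words; only the connectivity rules change.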
I think one of the things humans do better than current LLMs is that, as we learn a new task, we vary what counts as a token and how we nest tokens. How do we chunk things? In sudoku, each box is a chunk, each row and column is a chunk, the board is a chunk, “sudoku” is a chunk, “checking an answer” is a chunk, “playing a game” is a chunk, and there are probably lots of others I’m ignoring. I don’t think just prompting an LLM with the full text of “How to Solve It” in its context window would get us to a solution, but at some level I do think it’s possible to make explicit, in words and diagrams, what it is humans do to solve things, in a way legible to it. I think it largely resembles repeatedly telescoping in and out to lower and higher abstractions, applying different concepts and contexts, locally sanity-checking ourselves, correcting locally obvious insanity, and continuing until we hit some sort of reflective consistency. Different humans have different limits on what contexts they can successfully do this in.
Oh, by “as qualitatively smart as humans” I meant “as qualitatively smart as the best human experts”.
I think that is more comparable to saying “as smart as humanity.” No individual human is as smart as humanity in general.
inside the fuzz
This is an excellent short mental handle for this concept. I’ll definitely be using it.
I was going to say the same. I can’t count the number of times a human customer service agent has tried to do something for me, or told me they already did do something for me, only for me to later find out they were wrong (because of a mistake they made), lying (because their scripts required it or their metrics essentially forced them into it), or foiled (because of badly designed backend systems opaque to both of us).
Here’s a simple test: Ask an AI to open and manage a local pizza restaurant, buying kitchen equipment, dealing with contractors, selecting recipes, hiring human employees to serve or clean, registering the business, handling inspections, paying taxes, etc. None of these are expert-level skills. But frontier models are missing several key abilities. So I do not consider them AGI.
I agree that these are things current AI systems don’t/can’t do, and that they aren’t considered expert-level skills for humans. I disagree that this is a simple test, or the kind of thing a typical human can do without lots of feedback, failures, or assistance. Many very smart humans fail at some or all of these tasks. They give up on starting a business, mess up their taxes, have a hard time navigating bureaucratic red tape, and don’t ever learn to cook. I agree that if an AI could do these things it would be much harder to argue against it being AGI, but it’s important to remember that many healthy, intelligent, adult humans can’t, at least not reliably. Also, remember that most restaurants fail within a couple of years even after making it through all these hoops. The failure rate is very high even for experienced restaurateurs doing the managing.
I suppose you could argue for a definition of general intelligence that excludes a substantial fraction of humans, but for many reasons I wouldn’t recommend it.
Chris’ latest reply to my other comment resolved a confusion I had, so I now realize my comment above isn’t actually talking about the same thing as you.
I’m definitely one of those non-experts who has never done actual machine learning, but AFAICT the article you linked both relies on, and never explicitly mentions, the fact that the ‘principle of indifference’ is about the epistemological state of the reasoner, while arguing that the cases where the reasoner lacks the knowledge to hold a more accurate prior mean the principle itself is wrong.
The training of an LLM is not a random process, therefore indifference will not accurately predict the outcome of this process. This does not imply anything about other forms of AI, or about whether people reasoning in the absence of knowledge about the training process were making a mistake. It also does not imply sufficient control over the outcome of the training process to ensure that the LLM will, in general, want to do what we want it to want to do, let alone to do what we want it to do.
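A toy way to see that point (my own illustration, not anything from the linked article): indifference describes drawing parameters at random well, but not running an optimizer. Treat “training” as gradient descent on a one-parameter loss and compare it with uniform sampling over the same range:

```python
import random

def loss_grad(w):
    return 2 * (w - 2.0)  # gradient of the toy loss (w - 2)^2, minimized at w = 2

def train(w0, steps=200, lr=0.1):
    """'Training': plain gradient descent from a random starting point."""
    w = w0
    for _ in range(steps):
        w -= lr * loss_grad(w)
    return w

random.seed(0)
sampled = [random.uniform(-10, 10) for _ in range(1000)]          # indifference: uniform draw
trained = [train(random.uniform(-10, 10)) for _ in range(1000)]   # non-random: optimized

near_min = lambda ws: sum(abs(w - 2.0) < 0.5 for w in ws) / len(ws)
print(near_min(sampled))  # ~0.05 -- a uniform prior puts little mass near the minimum
print(near_min(trained))  # ~1.0  -- the optimizer concentrates essentially all of it there
```

A prior built from indifference over outcomes badly mispredicts what the optimizing process actually produces, without saying anything about processes that really are closer to random draws.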
The section where she talks about how evolution’s goals are human abstractions and an LLM’s training has a well-specified goal in terms of gradient descent is really where that argument loses me, though. In both cases, it’s still not well specified, a priori, how the well-defined processes cash out in terms of real-world behavior. The factual claims are true enough, sure. But the thing an LLM is trained to do is predict what comes next, based on training data curated by humans, and humans do scheme. Therefore, a sufficiently powerful LLM should, by default, know how to scheme, and we should assume there are prompts out there in prompt-space that will call forth that capability. No counting argument needed. In fact, the article specifically calls this out, saying the training process is “producing systems that behave the right way in all scenarios they are likely to encounter,” which means the behavior is unspecified in whatever scenarios the training process deems “unlikely,” although I’m unclear what “unlikely” even means here or how it’s defined.
One of the things we want from our training process is to not have scheming behavior get called up in a hard-to-define-in-advance set of likely and unlikely cases. In that sense, inner alignment may not be a separate problem given how LLMs are structured, since the LLM will automatically want what it is trained to want. But it is still the case that we don’t know how to do outer alignment for a sufficiently general set of likely scenarios, aka we don’t actually know precisely what behavioral responses our training process is instilling.
Yes, true, and often probably better than we would be able to write it down, too.
I was under the impression that this meant that a sufficiently powerful AI would be outer-aligned by default, and that this is what enables several of the kinds of deceptions and other dangers we’re worried about.
Is the difference between the goal being specified by humans vs being learned and assumed by the AI itself?
My mental shorthand for this has been that outer alignment is getting the AI to know what we want it to do, and inner alignment is getting it to care. Like the difference between knowing how to pass a math test, and wanting to become a mathematician. Is that understanding different from what you’re describing here?
This is an interesting thought I hadn’t come across before. Very much Sun Tzu, leave-your-enemy-a-path-of-retreat. As you said, its efficacy would depend very much on the nature of the AI in question. Do you think we’ll be able to determine which AIs are worthwhile to preserve, and which will just be more x-risk that way?
This was a great post, really appreciate the summary and analysis! And yeah, no one should have high certainty about nutritional questions this complicated.
For myself, I mostly eliminated these oils from my diet about 4 years ago, along with reducing industrially-processed food in general. Not 100%, I’m not a purist, but other than some occasional sunflower oil none of these are in foods I keep at home, and I only eat anywhere else 0-2 times per week. I did lose a little weight in the beginning, maybe 10 lbs, but then stabilized. What I have mostly noticed is that when I eat lots of fried food, depending on the oil used (which to some degree you can taste), I’m either fine, or feel exhausted/congested/thirsty for basically a full day. I think you may have a point about trans fats from reusing oil, since anecdotally this seems even worse for leftovers.
Of course, another thing I did at the same time is switch to grass-fed butter and pasture-raised eggs. Organic meats and milk, not always pasture-raised. Conventional cheeses. I’ve read things claiming the fatty acid composition is significantly different for these foods depending on what the animals eat, in terms of omega-3/omega-6 ratios, saturated/unsaturated fat ratios, and fatty acid chain lengths. I’ve never looked too deeply into checking those claims, because for me the fact that they taste better is reason enough. As far as I can tell, it wasn’t until WWII or later that we really started feeding cows corn and raising chickens in dense indoor cages with feed? Yet another variable/potential confounder for studies.