True, but not convincing. They have been pretty consistent in their concern for America/Americans above others. E.g., in their latest statement, regarding fully autonomous killer weapons: “We will not knowingly provide a product that puts America’s warfighters and civilians at risk.” Now, one could argue that I am being insufficiently generous, but this wording sure makes it sound like the only civilians they are concerned for are American civilians, in the context of providing autonomous killer weapons to the American DoW.
Why couldn’t a democratic system of ownership and control implement those safeguards bottom up?
Is this actually misalignment? It seems they are planning to roll out ‘adult mode’ fairly soon, so I doubt they’ve put much effort into eliminating this kind of behavior.
Of course it is plausible, but there is seemingly no evidence supporting the claim.
That research is from August. It seems much more likely to me that they’ve simply chosen to switch focus to more scalable (i.e., less expensive) approaches than that they’ve scaled this up since then and already found conclusive conflicting results.
Some of the phrasing also doesn’t give the impression that they’ve tried very hard to make it work:
“We expect this to become even more of an issue as AIs increasingly use tools” → phrased as a prediction, not based on evidence or current state.
Applying filtering to tool use “wasn’t enough assurance against misuse”? What does that even mean? Are we demanding more of filtering than other approaches now?
“We could have made more progress here with more research effort, but it likely would have required...” → didn’t try, another prediction
Didn’t mention anything about what caused filtering to suddenly become less effective. Why?
Private American companies seem like the bigger risk from my perspective. As an example, many expect Anthropic/OpenAI to IPO this year, but if AGI is expected to be priced into public markets within ~2 years, that seems like a very small window in which the leading AGI companies would be unable to secure private funding. And surely they won’t IPO if they can lock in sufficient funding privately, right? Plus there are all the other private AI companies.
I don’t share that intuition, from a few angles.
I think a 10x larger bet would be more than 10x as suspicious. There are far more than 10x as many people who would put $80k on a low-to-medium-conviction bet than would put $800k.
Also, liquidity would dry up quickly once liquidity providers see the obvious insider, so the reward would be much less than 10x.
Also, I see the disutility from suspicion as closer to a step function: you really do not want suspicion to rise to a level that would warrant a serious investigation. That threshold is roughly binary, closely related to the risk of traders framing you as an insider beforehand, and I would guess $80k is already pretty close to it (though I’m not familiar with how liquid this market was).
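To illustrate, a toy model of my own with made-up numbers (the edge, threshold, and costs are all hypothetical, not taken from the market in question): if profit scales linearly with stake but an investigation is a large fixed cost that only triggers above some suspicion threshold, the 10x bet can easily have negative expected value while the 1x bet is clearly positive.

```python
# Toy model (all numbers hypothetical): suspicion as a step function of bet size.
# Profit grows linearly with stake, but once the stake crosses an investigation
# threshold, a large fixed expected cost kicks in.

def expected_value(stake, edge=0.15, threshold=200_000,
                   investigation_cost=2_000_000, p_investigation=0.5):
    """Expected value of an insider bet under a step-function suspicion model."""
    profit = stake * edge  # linear payoff from the informational edge
    penalty = p_investigation * investigation_cost if stake > threshold else 0.0
    return profit - penalty

for stake in (80_000, 800_000):
    print(f"stake ${stake:,}: EV = ${expected_value(stake):,.0f}")
# stake $80,000: EV = $12,000
# stake $800,000: EV = $-880,000
```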
You can delete YouTube videos from your watch history if you don’t want them used for recommendations. I do this. It would be nice to have an easier way to switch between preference profiles than switching accounts, though; that seems like a hassle.
I agree with your assessment of what the problem is, but I don’t agree that that is the main point of this post. The majority of the post is spent asserting how ‘ordinary’, smart, and high-functioning this victim is, and concluding that therefore everyone, including you, is vulnerable and that AI psychosis in general is a very serious danger. The suppression is only mentioned in passing at the start of the post.
I also wonder what exactly is meant by AI psychosis. I mean, my co-worker is allowed to have an anime waifu but I’m not allowed to have a 4o husbando?
Imagine you were to provide full context. Would this affect how the recipient feels about the message? If so, they deserve to have that context. Reaching out to friends for advice is very different from an AI reaching out to you for approval of its message. You didn’t initiate, you didn’t provide any input, and you didn’t put in any effort aside from the single click.
I’m not sure I follow the argument as to why we should expect less liquidity on prediction markets. Assuming 0 fees and similar volumes, why wouldn’t bookmakers (also) offer similar liquidity at the same ~5% VIG on prediction markets? They can even use it to help balance their books. I would personally offer prediction market liquidity at 5% VIG but Polymarket generally has better rates.
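For concreteness, a minimal sketch of my own (the quoted prices are made up): a ~5% VIG just means the two sides of a binary market are priced so the implied probabilities sum to ~105%, which any liquidity provider could replicate with resting orders on a prediction market.

```python
# Minimal sketch (hypothetical quotes): the "overround" is the bookmaker's
# margin, i.e. how far the two-sided implied probabilities exceed 100%.

def overround(price_yes, price_no):
    """Total implied probability minus 1, i.e. the liquidity provider's margin."""
    return price_yes + price_no - 1.0

# Fair odds 50/50; a ~5% VIG book might quote roughly 52.5c on each side.
print(f"vig: {overround(0.525, 0.525):.1%}")          # 5.0%

# A prediction-market book is often much tighter, e.g. 50.5c on each side.
print(f"spread cost: {overround(0.505, 0.505):.1%}")  # 1.0%
```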
Regarding obscure events, I understand the argument to be that sportsbooks use profits from their popular events to subsidize these likely-unprofitable obscure ones. Why couldn’t a prediction market do the same? Polymarket already has an attempt at this: liquidity rewards. Of course, that requires some sort of fee to fund it long term.
Regarding slippage, why would a sportsbook be able to fill $100k but not external LPs? If a sportsbook is willing to fill it, I’m assuming it’s profitable. Is it that you think the sportsbook will accept more risk than a collection of external LPs? The only other argument I see is insufficient volume to balance the trade, which would be an issue in either case. Prediction markets can also just limit the bet size (like sportsbooks do) to whatever is available at the current price, eliminating slippage.
Predatory Liquidity:
1. It feels wrong to me that you frame orders a few cents above fair as ‘predatory’. That’s exactly what VIG is at a sportsbook, except applied to every order. It just usually isn’t the current price on a prediction market, because the current price tends to be much closer to fair.
2. I agree with your second example, sniping after a score change. But this is also not fundamental: Polymarket could easily clear the book and suspend trading for a few seconds after score changes, provided they have the data feeds (which they seem to). A sketch of this mechanism is below.
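Roughly what I have in mind, as a hypothetical sketch (the class and method names are my own, not Polymarket’s actual system): on each score update from the data feed, cancel resting orders and reject new ones for a short window, so stale quotes can’t be sniped.

```python
# Hypothetical sketch of the mitigation described above: clear the book and
# suspend matching briefly whenever the sports data feed reports a score change.

import time

SUSPENSION_SECONDS = 3.0

class Market:
    def __init__(self):
        self.resting_orders = []
        self.suspended_until = 0.0

    def on_score_change(self, event):
        """Called by the sports data feed on every score update."""
        self.resting_orders.clear()                       # clear the book
        self.suspended_until = time.time() + SUSPENSION_SECONDS

    def accept_order(self, order):
        if time.time() < self.suspended_until:
            return False                                  # reject during suspension
        self.resting_orders.append(order)
        return True
```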
Yes. And this actually seems to be a relatively common perspective from what I’ve seen.
“Richard Sutton rejects AI Risk” seems misleading in my view. What risks is he rejecting specifically?
His view seems to be that AI will replace us, that humanity as we know it will go extinct, and that this is okay. E.g., here he speaks positively of a Moravec quote, “Rather quickly, they could displace us from existence”. Most people would count our extinction among the risks they are referring to when they say “AI Risk”.
Great post.
This sort of superexponential growth vastly increases the amount of energy in the system, and it seems to me that this could easily be enough to overcome the activation energy required to split groups (e.g., countries) that are generally seen as stable.
If power/wealth becomes much more unevenly distributed within the AGI-owning group (top 1% currently at 67% of total wealth in USA, maybe ~20% of income?), why would they continue to support the rest of the group? Or, why exactly that group and not some other arbitrary group of their choosing? The government enforces/maintains the group boundary. What gives the government power to oppose the elites? The population. If the population is relatively poor, how can they maintain control of the government, and where would its power come from?
If the government cannot enforce the group boundary, decreasing the size of the group can greatly improve the group’s ability to prevent diffusion, and can easily make coordination/shared ideology much stronger.
Ideology seems like it could play a major role if groups can be formed and broken at will by elites, and I don’t see why democratic/nationalistic ideologies would be favored in this case.
To what extent do you differentiate between anti-Israel sentiment and antisemitism? It seems to be very common to conflate the two, especially with many in the pro-Israel camp actively pushing this notion that they are one and the same. And it’s interesting that your sole concrete example is an attack on the Israeli government. I’m not familiar with Fuentes or Carroll. But anti-Israel sentiment is definitely up dramatically, and there will surely be significant collateral damage due to this. I wonder if we have any metrics which do a good job of disentangling these two.
Also, Ari Ben-Menashe linked both Epstein and Ghislaine Maxwell’s father to Mossad, so it’s not like this connection just came out of thin air.
I think the LW consensus has been that the main existential risk is AI development in general. The only viable long-term option is to shut it all down, or at least slow it down as much as possible until we can come up with better solutions. From my perspective, DeepSeek should incentivize slowing down development (if you agree with the fast-follower dynamic, and also by reducing profit margins generally), and I believe it has.
Anyway, I don’t see how this relates to these predictions. The predictions are about China’s interest in racing to AGI. Do you believe China would now rather have an AGI race with USA than agree to a pause?
I’m not convinced that these were bad predictions for the most part.
The main prediction: 1) China lacks compute. 2) CCP values stability and control → China will not be the first to build unsafe AI/AGI.
Both of these premises are unambiguously true as far as I’m aware. So, calling these predictions bad suggests that we now believe China is likely to build AGI before the USA, without realizing it threatens stability/control, and with minimal compute, all while refusing to agree to any sort of deal to slow down? Why? That seems unlikely.
American companies, on the other hand, are still explicitly racing toward AGI, are incredibly well resourced, have strong government support, and have a penchant for disruption. The current administration also cares less about stability than any other in recent history.
So, from my perspective, the USA racing to AGI looks even more dangerous than before, almost desperate. Whereas China is fast-following, which I think everyone expected? Did anyone suggest that China would not be able to fast-follow American AI?
I keep some folders (and often some other transient files) on my desktop and pin my main apps to the taskbar. With apps pinned to the taskbar, you can open a new instance with Windows+Shift+num (or just Windows+num if the app isn’t open yet).
I do the same as you and search for any other apps that I don’t want to pin.
Well, vision and mapping seem like they could be pretty generic (and I expect much better vision in future base models anyway). For the third limitation, I think it’s quite possible that Claude could provide an appropriate segmentation strategy for whatever environment it is told it is being placed into.
Whether this would be a display of its intelligence, or just its capabilities, is beside the point from my perspective.
But these issues seem far from insurmountable, even with current tech. It is just that they are not actually trying, because they want to limit scaffolding.
From what I’ve seen, the main issues are:
1) Poor vision → Can be improved through tool use, and will surely improve greatly with new models regardless
2) Poor mapping → Can be improved greatly and straightforwardly through tool use
3) Poor executive function → I feel like this would benefit greatly from something like a separation of concerns. Currently my impression is that Claude gets overwhelmed with context, loses track of what’s going on, and then starts messing with its long-term planning. From a clean context, its long-term planning seems fairly decent. Same for loops: I would expect a clean-context Claude could read a summary of recent steps constituting a loop, understand that it is in a loop, and recognize that it needs to try something else.
E.g., separate contexts for each of battling, navigation, summarization, long-term planning, coordination, etc.; a rough sketch of this is below.
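A minimal sketch of the kind of separation I mean (the names are hypothetical, and `call_model` stands in for whichever LLM API is in use): a coordinator that only ever sees summaries, never raw history, and routes each step to a specialist with its own clean context.

```python
# Minimal sketch of the separation-of-concerns idea (hypothetical names;
# `call_model` stands in for whichever LLM API is in use). Each specialist
# keeps its own short context instead of one context accumulating everything.

def call_model(system_prompt: str, message: str) -> str:
    """Placeholder for an LLM call made with a fresh, role-specific context."""
    raise NotImplementedError

SPECIALISTS = {
    "battle":    "You handle Pokémon battles. Input: battle state. Output: next move.",
    "navigate":  "You handle navigation. Input: map summary + goal. Output: next direction.",
    "plan":      "You do long-term planning. Input: progress summary. Output: updated goals.",
    "summarize": "You compress recent events into a short summary.",
}

def coordinator_step(game_state: str, recent_summary: str) -> str:
    # The coordinator sees only summaries, so from a clean context it can
    # notice loops ("we've visited this room 5 times") and change strategy.
    route = call_model(
        "You are a coordinator. Pick one of: battle, navigate, plan, summarize.",
        f"Summary so far: {recent_summary}\nCurrent state: {game_state}",
    ).strip()
    return call_model(SPECIALISTS.get(route, SPECIALISTS["plan"]), game_state)
```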
Do you think they would stop the US from sharing its mass surveillance of British citizens with the British government? Or allow another country to use Claude to conduct mass surveillance of Americans?
The answer seems pretty clearly no in both cases, from my perspective.