testingthewaters

Karma: 1,135

testingthewaters 24 Oct 2025 22:40 UTC
4 points
0
in reply to: Matt Dellago’s comment on: Matthias Dellago’s Shortform
In video games this is made literal by every entity having a central coordinate. Their body is merely a shell wrapped around the point-self and a channel for the will of the external power (the player).

testingthewaters 22 Oct 2025 17:13 UTC
2 points
0
on: testingthewaters’s Shortform
A postmortem for the Economic Safety movement (fiction):

After eminent economist Mr. Senyek warned in 1991 that a hypothetical future “economic tsunami” could cause systemic risks to the American-led global financial order as a whole, researchers and think tanks quickly rallied to the cause of Economic Safety. They reasoned that in order to anticipate the risks of this hypothetical “Economic Tsunami”, they needed access to the frontier of financial trading. Within several years Economic Safety advocates joined eminent firms like JP Morgan and Bear Stearns, with Mr. Senyek providing introductions to particularly promising young economists.

Disillusioned with the domination of the financial system by large corporations with no sense of social obligation, a group of billionaire investors and traders started OpenFinance, with the goal of creating a hedge fund that would benefit the public instead of a small circle of millionaires and the ultrawealthy. Convinced of the need to acquire a trading edge against the big firms, OpenFinance pioneered the use of CDCs and MBS products to achieve unheard-of levels of leverage and record profits. Despite stating that they would use their earnings to benefit society, they decided that the dream of systemic overhaul would only be achieved by becoming the dominant financial player, incorporating a for-profit arm to that end and raising major sums from Morgan Stanley to fund their General Partnership Trading (GPT) system. The system increased access to financial products and services by allowing the general public to invest in CDCs and MBSes, democratising the returns of financial trading, but was criticised for creating the very systemic risk they sought to avoid.

In 2004, a team of traders at OpenFinance (often shortened as OpenFi) accused OpenFi leadership of being reckless and insufficiently concerned about Economic Safety. They decided to start a new hedge fund known as Anthropocentric Trading which would offer services better aligned to their principles. A fundraising war for talent and capital to form new investment funds ensued, with both firms acquiring investors from the Middle East and courting governments as part of bids to reshape the global economic order. Anthropocentric Trading admitted to leveraging heavily based on the same products and tactics as OpenFi, reasoning that it needed to stay competitive in a multipolar economic race. It is now 2007...

testingthewaters 21 Oct 2025 10:56 UTC
4 points
2
in reply to: Steven Byrnes’s comment on: Mo Putera’s Shortform
Furthermore, going hard also imposes opportunity costs and literal costs on future you even if you have all your priorities perfectly lined up and know exactly what should be worked on at any time. If you destabilise yourself enough trying to “go for the goal” your net impact might ultimately be negative (not naming any names here...).

testingthewaters 19 Oct 2025 23:23 UTC
7 points
2
in reply to: leogao’s comment on: leogao’s Shortform
Some books you might like to read:
- Seeing Like a State by James C Scott (I’ve read most of it, I liked it)
- Bullshit Jobs, The Dawn of Everything, most books by David Graeber (I’ve read and liked long extracts of his work)
- The End: Hitler’s Germany 1944–45 by Sir Ian Kershaw (I’ve read all of it and found it very valuable as a complete picture of a society melting down)
- Open Letters by Vaclav Havel (I’ve read a lot of it, I like it a lot. He was the first post-communist president of Czechoslovakia and a famous dissident under the communist regime, and his writing sketches out both what he finds soul-destroying about that system and what he thinks are the principles of good societies)
- System Effects: Complexity in Political and Social Life by Robert Jervis (I’m reading this now, very good case studies about non-obvious phenomena in international relations)
- Broken Code: Inside Facebook and the fight to expose its toxic secrets by Jeff Horwitz (Very good book about how social media platforms like Facebook shape and are shaped by modern civilisation, I read all of it)
All of these books to various degrees tackle the things you are describing from a holistic perspective. Hope this helps.

testingthewaters 17 Oct 2025 17:51 UTC
4 points
0
in reply to: kaiwilliams’s comment on: eggsyntax’s Shortform
It loads past conversations (or parts of them) into context, so it could change behaviour.

testingthewaters 11 Oct 2025 2:02 UTC
4 points
2
on: testingthewaters’s Shortform
A lesson from the book System Effects: Complexity in Political and Social Life by Robert Jervis, and also from the book The Trading Game: A Confession by Gary Stevenson.

When people talk about planning for the future, there is often a thought chain like this:
- All other things being equal, a world with thing/organisation/project X is preferable compared to a world without thing/organisation/project X
- Therefore, I should try to make X happen
- I will form a theory of change and start to work at making X happen
But of course the moment you start working at making X happen you have already destroyed the premise. There are no longer two equal worlds held in expectation, one with X and one with no X. There is now the world without X (in the past), and the world where you are trying to make X happen (the present). And very often the path to attaining X creates a world much less preferable for you than the world before you started, long before you reach X itself.

For example:
- I can see a lucrative trade opportunity where by the end of five months, the price for some commodity will settle at a new, higher point which I can forecast clearly. All other things being equal, if I take this trade I will make a lot of money.
- Therefore, I should try and make this trade.
- I will take out a large position, and double down if in the interim the price moves in the “wrong” direction.
However, the price can be much more volatile than you expect, especially if you are taking out big positions in a relatively illiquid market. Thus you may find that three months in, your paper losses are so large that you reach your pain threshold and back out of the trade for fear that your original prediction was wrong. At the end of the five months, you may have predicted the price correctly, but all you did was lose a large sum of money in the interim.
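
To make the path-dependence concrete, here is a minimal sketch of the trade above. The price path, position size, and pain threshold are all made-up numbers for illustration, `trade_pnl` is a hypothetical helper, and the sketch ignores doubling down. The same forecast ends in a loss or a gain depending only on whether the interim drawdown breaches the threshold.

```python
def trade_pnl(prices, position, pain_threshold):
    """Return (realised P&L, exit step) for a long position held along `prices`.

    The trader exits as soon as the paper loss reaches `pain_threshold`;
    otherwise the position is held to the final price.
    """
    entry = prices[0]
    for step, price in enumerate(prices[1:], start=1):
        paper_pnl = (price - entry) * position
        if paper_pnl <= -pain_threshold:
            return paper_pnl, step  # forced out before the forecast plays out
    return (prices[-1] - entry) * position, len(prices) - 1


# Hypothetical monthly prices: the forecast (a higher price by month five) is correct,
# but the path there dips hard first.
prices = [100, 92, 85, 78, 95, 120]

stopped_pnl, stopped_at = trade_pnl(prices, position=1000, pain_threshold=20_000)
held_pnl, held_until = trade_pnl(prices, position=1000, pain_threshold=float("inf"))

print(stopped_pnl, stopped_at)  # trader with a pain threshold
print(held_pnl, held_until)     # idealised trader who can sit through any drawdown
```

With these assumed numbers the first trader exits in month three at a loss, while the trader with no threshold ends month five in profit. The forecast was identical in both cases.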

For another example:
- All other things being equal, a world with an awareness of potential race dynamics around AGI is preferable compared to a world without such an awareness.
- Therefore, I should try to raise awareness of race dynamics.
- I will write a piece about race dynamics and make my arguments very persuasive, to increase the world’s awareness of this issue.
Of course, in the process of trying to raise awareness of this issue, you might first create a world where a small subset of the population (mostly policy and AI people) are suddenly very clued-in to the possibility of race dynamics. These people are also in a very good position to create, maintain, and capitalize on those dynamics (whether consciously or not), including using them to raise large amounts of cash. Now suddenly the risk of race dynamics is much larger than before, and the world is in a more precarious state.

There isn’t really a foolproof way to get around this problem. However, one tactic might be to look at your theory of change, and instead of comparing the world state before and after the plan, look at the world state along each step of the path to change, and consciously weigh up the changes and tradeoffs at each step. If one of those steps looks like it would break a moral, social, or pain-related threshold, maybe reconsider that theory of change.

Addendum: I think this is also why systems/ecosystems/plans which rely on establishing positive or negative feedback loops are so powerful. They’ve set things up so that each stage incrementally moves towards the goal, so that even if there are setbacks you have room to fall back instead of breaching a pain threshold.

testingthewaters 8 Oct 2025 20:56 UTC
0 points
−4
in reply to: Matthew Barnett’s comment on: Jan_Kulveit’s Shortform
[I think this comment is too aggressive and I don’t really want to shoulder an argument right now]
With apologies to @Garrett Baker .

testingthewaters 7 Oct 2025 19:35 UTC
22 points
11
in reply to: Jan_Kulveit’s comment on: Jan_Kulveit’s Shortform
From the mechanise blogpost:

Nuclear weapons are orders of magnitude more powerful than conventional alternatives, which helps explain why many countries developed and continued to stockpile them despite international efforts to limit nuclear proliferation.

Yet the number of nuclear weapons in the world has decreased from its peak during the cold war. Furthermore, we’ve somehow stopped ourselves from using them, which suggests that some amount of steering is possible.

With regards to the blogpost as a whole, humanity fits their picture the most when it is uncoordinated and trapped in isolated clades, each of which is in a molochian red queen’s race with the others, requiring people to rapidly upgrade to new tech if only to keep pace with their opponents in commerce or war. But this really isn’t the only way we can organise ourselves. Many societies did fairly well for long periods in isolation without “rising up the tech tree” (e.g. Japan post-sengoku jidai).

And even if it is inevitable… You can stop a car going at 60 mph by slowly hitting the brakes or by ramming it into a wall. Even if stopping is “inevitable”, it does not follow that the wall and the gentle deceleration are equally preferable for the humans inside.

testingthewaters 7 Oct 2025 5:24 UTC
4 points
0
on: Subliminal Learning, the Lottery-Ticket Hypothesis, and Mode Connectivity
I think a working continual learning implementation would mess with convergence-based results which have a relatively fixed training data distribution and only modify the starting seeds. This is mostly because a continual learning system is constantly drifting “off the base distribution” and incorporating new data. In other words, the car model has seen data from places and distributions the attacker’s base model never will.

testingthewaters 6 Oct 2025 14:14 UTC
2 points
0
in reply to: testingthewaters’s comment on: testingthewaters’s Shortform
Addendum for the future: Concepts like agents, agency, and choice only make sense at the systemic macroscale. If you had total atomic knowledge (complete knowledge of every single particle interaction in a human—which, as we discussed, basically requires complete knowledge of every single particle interaction in the universe), the determinists are right. There is no choice. It’s only neurons firing and chemicals bonding. But we operate at a higher level, with noise and uncertainty. Then preferences and policies make sense as things to talk about.

testingthewaters 5 Oct 2025 15:44 UTC
3 points
0
in reply to: Smaug123’s comment on: testingthewaters’s Shortform
I’ve heard it quite a few times when discussing emergence and complexity topics.

testingthewaters 5 Oct 2025 15:10 UTC
11 points
6
on: testingthewaters’s Shortform
My best argument as to why coarse-graining and “going up a layer” when describing complex systems are necessary:

Often we hear a reductionist case against ideas like emergence which goes something like this: “If we could simply track all the particles in e.g. a human body, we’d be able to predict what they did perfectly with no need for larger-scale simplified models of organs, cells, minds, personalities etc.”. However, this kind of total knowledge is actually impossible given the bounds of the computational power available to us.
- First of all, when we attempt to track billions of particle interactions we very quickly end up with a chaotic system, such that tiny errors in measurements and setting up initial states quickly compound into massive prediction errors (A metaphor I like is that you’re “using up” the decimal points in your measurement: in a three body system the first timestep depends mostly on the value of the non-decimal portions of the starting velocity measurements. A few timesteps down changing .15 to .16 makes a big difference, and by the 10000th timestep the difference between a starting velocity of .15983849549 and .15983849548 is noticeable). This is the classic problem with weather prediction.
- Second of all, tracking “every particle” means that the scope of the particles you need to track explodes out of the system you’re trying to monitor into the interactions the system has with neighbouring particles, and then the neighbours of neighbours, so on and so forth. In the human case, you need to track every particle in the body, but also every particle the body touches or ingests (could be a virus), and then the particles that those particles touch… This continues until you reach the point where “to understand the baking process of an apple pie you must first track the position of every particle in the universe”
The emergence/systems solution to both problems is essentially to go up a level. Instead of tracking particles, you should track cells, organs, individual humans, systems etc. At each level (following Erik Hoel’s Causal Emergence framework) you trade microscale precision for predictive power, i.e. the size of the system you can predict for a given amount of computational power. Often this means collapsing large amounts of microscale interactions into random noise: a slot machine could in theory be deterministically predicted by tracking every element in the randomiser mechanism/chip, but in practice it’s easier to model as a machine with an output distribution set by the operating company. Similarly, we trade Feynman diagrams for Brownian motion and Langevin dynamics.
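
A toy illustration of both halves of this argument, as a sketch only: I'm swapping in the logistic map (a standard one-line chaotic system) for the three-body problem, and the 0.1 divergence cutoff and 10,000-step horizon are arbitrary choices of mine. Two starting values that differ in the eleventh decimal place end up on completely different trajectories, while a coarse statistic like the long-run average should stay close for both.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two initial conditions that differ only in the 11th decimal place.
a = logistic_orbit(0.15983849549, 10_000)
b = logistic_orbit(0.15983849548, 10_000)

# Microscale: first timestep at which the two trajectories visibly disagree.
first_divergence = next(
    (t for t, (x, y) in enumerate(zip(a, b)) if abs(x - y) > 0.1),
    None,
)

# Macroscale: a coarse-grained summary of each trajectory.
mean_a = sum(a) / len(a)
mean_b = sum(b) / len(b)

print("trajectories diverge at step:", first_divergence)
print("long-run means:", mean_a, mean_b)
```

The pointwise prediction is lost after a few dozen steps, but the coarse description ("values average out to roughly one half") survives the measurement error, which is the trade being described.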

testingthewaters 4 Oct 2025 1:42 UTC
3 points
0
in reply to: Wei Dai’s comment on: Wei Dai’s Shortform
Which period of “chinese civilisation” are you referring to? I think it would be hard to point to any isolated “chinese civilisation” just minding its own business and keeping a firm grip on a unified cultural and ethnic population. Over 3500+ years of written history the territory occupied by China today had multiple periods of unity and division, sometimes splitting up into 10 or more states, often with multiple empires and dynasties coexisting in various levels of war and peace and very loosely ruled areas in between. (This is IMO a central theme of Chinese history: the first line of the Romance of the Three Kingdoms reads “Of matters under heaven, we can say that what is long united must divide, what is long divided must unite”. At various points the “Chinese Empire” looked more like the Holy Roman Empire, e.g. during the late Zhou dynasty leading into the Spring and Autumn period)

The “chinese lands” were taken over by the Mongols and the Manchu during the Yuan and Qing dynasties (the latter one being the last dynasty before the 20th century), and at various points the borders of the Chinese empire would grow and shrink to encompass what we today recognise as Korea, Japan, South East Asia, Tibet… There are 56 recognised ethnic groups in China today. The importance and purpose of the Keju system also changed throughout the periods it was in use, and I have no idea where you got the eugenics thing from. I also think you would have a hard time building a case for any intentional or centralised control of scientific research beyond that of the European states at the time, mostly because the idea of scientific research is itself a very modern one (is alchemical research science?). As far as I can understand it you’re taking the “vibe” of a strong, unified, centralised state that people recognise today in the People’s Republic of China and then stretching it backwards to create some kind of artificial historical throughline.

testingthewaters 30 Sep 2025 21:14 UTC
3 points
0
on: Claude Sonnet 4.5: System Card and Alignment

Claude Sonnet 4.5 was released yesterday. Anthropic credibly describes it as the best coding, agentic and computer use model in the world

With regards to that self-assessment, I’m going to raise this comment from a previous thread by Raemon.

When I chatted with several anthropic employees at the happy hour a ~~couple months~~ ~year ago, at some point I brought up the “Dustin Moskowitz’s earnest belief was that Anthropic had an explicit policy of not advancing the AI frontier” thing. Some employees have said something like “that was never an explicit commitment. It might have been a thing we were generally trying to do a couple years ago, but that was more like “our de facto strategic priorities at the time”, not “an explicit policy or commitment.”

When I brought it up, the vibe in the discussion-circle was “yeah, that is kinda weird, I don’t know what happened there”, and then the conversation moved on.

I regret that. This is an extremely big deal. I’m disappointed in the other Anthropic folk for shrugging and moving on, and disappointed in myself for letting it happen.

[...] gwern also claims he talked to Dario and came away with this impression [...]

I leave open the possibility that Anthropic conducted a thorough soul-search and breaking this de-facto/promised/implied/possible commitment was considered the best way forward, but the lack of replies there from Anthropic employees (who were quite active in the thread elsewhere) was really unfortunate.

testingthewaters 27 Sep 2025 16:56 UTC
3 points
−1
in reply to: Eric Neyman’s comment on: Reasons to sell frontier lab equity to donate now rather than later
I mean, this argument holds generally for any kind of investment in future events. Supposing that some kind of TAI gets produced in the year y, investments made in the year y-10 are probably less likely to be accurate than investments made in year y-9, and so on for y-8… All the way to y-0 when we know for sure which group of actors will make TAI (which, of course, happens when they succeed). Unfortunately, the commensurate difficulty of using funding to make an impact also increases as we approach y-0.

So I agree with you that such considerations cannot provide too much sway, because on their own they justify indefinite inaction until it is definitely too late.

testingthewaters 24 Sep 2025 22:27 UTC
4 points
0
in reply to: dr_s’s comment on: Global Call for AI Red Lines—Signed by Nobel Laureates, Former Heads of State, and 200+ Prominent Figures
This has already been demoed: https://arxiv.org/abs/2412.12140 - Frontier AI systems have surpassed the self-replicating red line

testingthewaters 24 Sep 2025 13:30 UTC
4 points
0
in reply to: Katalina Hernandez’s comment on: The Problem with Defining an “AGI Ban” by Outcome (a lawyer’s take).
Just on the point of MAIM, I would point out that one of the authors of that paper (Alexandr Wang) has seemingly jumped ship from the side of “stop superintelligence being built” [1] to the side of “build superintelligence ASAP”, since he now heads up the somewhat unsubtly named “Meta Superintelligence Labs” as Chief AI Officer.
[1]: I mean, as the head of Scale AI (a company that produces AI training data), I’m not sure he was ever on the side of “stop superintelligence from being built”, but he did coauthor the paper apparently.

testingthewaters 24 Sep 2025 12:11 UTC
8 points
3
on: Global Call for AI Red Lines—Signed by Nobel Laureates, Former Heads of State, and 200+ Prominent Figures
Glad to see that you’re doing this kind of work, I know that getting a statement like this agreed upon must have taken a lot of coordination work. Thanks for your efforts and congratulations on getting it across the finish line :)

testingthewaters 24 Sep 2025 0:14 UTC
2 points
0
in reply to: Thane Ruthenis’s comment on: Synthesizing Standalone World-Models (+ Bounties, Seeking Funding)
I mean, AI people are notoriously bad at doing these kinds of things xD I would expect the people running openai or anthropic to say similar things to this (when their orgs were just starting out). So I hope you can see why I wanted to ask this. None of this is to cast any doubt on your ability or motives, just noting the minefield that is unfortunately next to the park where we’re having this conversation.

testingthewaters 23 Sep 2025 23:20 UTC
2 points
0
in reply to: Thane Ruthenis’s comment on: Synthesizing Standalone World-Models (+ Bounties, Seeking Funding)
Glad to see we’re basically agreed. However, how would you take safety precautions around your own work on such algorithms, given our last big similar breakthrough (transformers for language modelling) basically instantly got coopted for RL to be “agentified”? Unless you’re literally doing this alone (with a very strong will) wouldn’t that be the natural path for any company/group once the simulator is finished?