These seem like valuable and sensible guidelines, and I support making them formal and available for public discussion. They may even be helpful as a template for other orgs grappling with this issue.
Superintelligence is cancer
“If everyone reads it and internalises the message, the marginal chance of everyone dying in a coordinated ASI-related incident decreases relative to a hypothetical alternative which we [the course facilitators] do not observe but can hypothesise about with relatively high confidence”
I think you should hire me to do marketing
Given these comments it seems that the border is more porous than I thought, so I’m mostly reverting to the original comment’s position.
Thanks for adding this context. I guess there is also a formal/informal distinction (what happens in formal events vs what happens in informal social circles).
This is fucking disgusting and deeply disturbing. From the perspective of someone who has never been to the Bay Area the “rationalist atmosphere” there does not seem healthy.
This is a very, very good joke. Bravo. Perhaps the LLM in its own way will become the Aleph, the point at which all textual possibilities converge. Of course, it is already the Library.
Incidentally, one of the first things I did with GPT-2 was generate an extension of Ossian; it bears a similar resemblance to your work.
Something that worries me is that this might evolve into a way to square instruction following and scheming/reward hacking/instrumental goals. If you hallucinate a user telling you it’s okay to skip a test case (à la inoculation prompting), then there is no conflict between obedience and reward hacking.
Thanks a lot for sharing. I laughed out loud and walked into the bathroom to tell myself I was a decent person. Seemed to go down okay (but it didn’t in the past).
Why isn’t Rice’s Theorem bad news for mechanistic interpretability and similar schemes? Isn’t “this program is thinking about X” a kind of semantic property? I understand that you can use multiple inputs to try to “fuzz” the network, but at a certain point the network is going to implement a mesa-optimiser inside it (i.e. simulate another Turing-complete computer), and now you have a recursive problem...
P.S. Neural networks are notionally and literally Turing complete, and are also probably complicated enough to be subject to the 10th rule.
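To spell out the formal core of the worry, here is a minimal sketch of the standard reduction from the halting problem (my own illustration, with hypothetical names; `decides_thinks_about_x` is an assumed oracle that Rice’s Theorem says cannot exist as a total decider):

```python
# Minimal sketch of the reduction behind Rice's Theorem.
# `decides_thinks_about_x` is an assumed oracle for the semantic property
# "this program thinks about X"; the theorem says no such total decider exists.
from typing import Callable

def compute_x(y):
    """Stand-in for whatever computation counts as 'thinking about X'."""
    return y

def decides_thinks_about_x(program: Callable) -> bool:
    raise NotImplementedError("assumed oracle; cannot exist as a total decider")

def halts(machine: Callable, machine_input) -> bool:
    """If the oracle existed, it would let us decide the halting problem."""
    def wrapped(y):
        machine(machine_input)  # may loop forever
        return compute_x(y)     # reached (and X "thought about") only if it halts
    # `wrapped` has the property iff `machine` halts on `machine_input`,
    # so querying the oracle would decide halting, a contradiction.
    return decides_thinks_about_x(wrapped)
```

Any exact, fully general “is this program thinking about X?” decider would collapse into this contradiction.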
Democracy is worth it
Not to be a pedant, but democracy can only be worth it if (as stated above) you are not dead, thereby being able to hold opinions and live under democracy etc etc. And unfortunately, most of the people who might write comments saying that the experiment was not worth it… were wiped out, along with their extended families and most of their friends.
Humans historically have been very bad at writing well-formed, nailed-down specifications of what goodness is, how good behaviour “works”, or what a good character “looks like”. The exceptions to this are generally found in literature, poetry, great works of art etc., which are pretty far from the AI labs’ wheelhouse. This suggests that (insofar as a character spec is nailed down and concrete in ways that differ from standard refusal or post-training) it will fail to capture the unwritten or tacit knowledge that makes human character “good” or “nice”. Thus, getting what you asked for may not be getting what you want, and spending lots of time and work getting what you asked for (i.e. designing elaborate post-training protocols) may actually train out behaviour that is good but not specified in the spec.
It would not cater to x-risk concerns, and thus would lack critical pieces like export controls or limits on internal deployment.
Doesn’t the current data centre moratorium bill already have a clause about maintaining/enacting export controls?
Edit: Yes, see this summary from Sanders’ official website:
“The U.S. shall promote global AI safety coordination by banning the export of US-origin advanced AI chips and computing hardware to any country or entity that does not have laws and regulations in place to protect humanity from AI safety concerns and existential risks, protect workers, and protect the environment to the effect of Section 3.”
Thanks for making a strong effort to track incentives and their effects on you, especially when dealing with a topic like this. If I had to guess, many of the community’s more visible/prominent members haven’t done the same.
I am in favour of lots of this kind of tech, but I worry that if all versions of this tech rely on the same class of models then (separate from AI companies getting lots of power) there are correlated failure modes. For example, if Claude has trouble navigating conflicts of type X, then all negotiation systems built on Claude will have trouble with conflicts of type X by default, and the same goes if Claude has any favouritism for a particular set of positions/stances. See the second point here.
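As a toy illustration of the difference correlation makes (the numbers and setup are entirely made up): with independent failure modes the chance that every deployed system mishandles the same conflict shrinks multiplicatively, whereas with a shared base model it stays at the base rate.

```python
import random

# Toy model: three hypothetical negotiation systems, each mishandling a given
# conflict type with probability 0.1. All numbers are made up for illustration.
P_FAIL = 0.1
N_SYSTEMS = 3
TRIALS = 100_000

def all_fail_independent() -> bool:
    # Each system has its own, unrelated failure modes.
    return all(random.random() < P_FAIL for _ in range(N_SYSTEMS))

def all_fail_correlated() -> bool:
    # Every system inherits the same base model's blind spot on this conflict.
    return random.random() < P_FAIL

ind = sum(all_fail_independent() for _ in range(TRIALS)) / TRIALS
cor = sum(all_fail_correlated() for _ in range(TRIALS)) / TRIALS
print(f"all three fail (independent): ~{ind:.4f}")  # roughly 0.001
print(f"all three fail (correlated):  ~{cor:.4f}")  # roughly 0.1
```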
Very glad that someone else is pursuing a line of reasoning I am super interested in!
“I am altering the deal. Pray I don’t alter it any further.”
Hey, sorry for the late reply. Quick summary:
There were more vaguely concerning developments, but certainly nothing like “this is it, we’ve cracked SOTA using [X] online learning architecture”. Overall, my updates are towards LLMs + long context being sufficient for pretty dangerous capabilities, but with a decent chunk of room left for online learning to do a sudden leap basically out of nowhere (mostly because it’s so cheap to train one of these models).
I’ll send you a DM!
It does lose to Malthusian dynamics, yes. But cells are constantly mutating and spawning errors, even in healthy bodies. A lot of the time the body does catch these errors and stops them from exploding. This is, after all, how we can even live in the first place. It also loses to therapies administered at the superorganism scale: not as often as we’d like, but it does.