Still haven’t heard a better suggestion than CEV.
I think it’s likely I misinterpreted your original sentiment, “just, try not to be totalized about it,” to mean something like “don’t become very dedicated towards important, well-integrated aims” rather than what you probably meant, which may be something more like “don’t become panicked and stressed about a poorly considered issue and adopt overly narrow strategies that neglect important considerations”.
I don’t think either of us is using “totalize” like the Merriam-Webster definition:
1: to add up : total
2: to express as a whole
I feel like I’m picking on your use of the word more than is valuable at this point. Sorry about that. I’ll try to explain my use of the word a bit in the following responses, but it’s probably not super important that we get on the same page, so if it doesn’t make sense feel free to ignore it.
you can’t be totalized about all of them!
I agree you can’t be totalized about all of them individually, but you can be totalized about the set of all of them together.
You might be saying “society should have a lot of totalized people, because this is a good way to solve problems that are very ‘all-or-nothing’ shaped.”
Yes. I think this is indeed what I am saying. Also that those totalized people need to be better at organization and coordination with one another.
if part of what needs to happen at the societal scale is to figure out solutions to many all-or-nothing problems at once, it seems surprising that what you want is “totalized people” instead of “people who specialize in the thing, but, are still tracking at least some of the other problems.”
I think you are very correct that specialization is very important, but coordination between specialists is also important. Some people should specialize in helping to integrate and cross-reference, i.e., specialize in totalizing. But also, I don’t think a totalized person must have a totalized career; it’s more that they have totalized motivations.
As an example, I could become an EA and think that some cause, like preventing malaria, is important enough that it should consume all of my focus, and then conclude that I should specialize in law, make lots of money as a lawyer, and donate most of it to the Against Malaria Foundation.
A totalized view of what is important does not need to imply one must become an expert in all subjects.
I also just think most of the other problems don’t make sense to be totalizing. Which ones are you thinking of?
Any claim that has extremely important implications if it is true. This includes ASI risk, climate change, the Holocene extinction, societal collapse, global resource management, nuclear war and more generally geopolitical instability, pandemic risk, basically everything Toby Ord talked about in The Precipice, basically everything to do with communication and coordination, and also, on the tail end, weird, intractable, hard-to-make-progress-on things like philosophy of ethics, religion, consciousness, immortality, etc.
it’s more likely you accidentally neglect things that you needed, to either be personally healthy, or to make your local social/professional world healthy
I agree with you about that, and I think it is possible for totalization to cause a person to neglect the things you mention in ways that harm both themselves and the cause they are trying to aid, but that seems more like a fact about people’s ability to strategize and manage their executive functioning than a fact about totalization.
I don’t know how you would measure it, but I would expect there are more people who accidentally neglect things they need among those who aren’t totalized than among those who are. I would expect totalization might even have weakly positive effects, giving people motivation to do the things they must do to support themselves and their cause. This might be a case of general advice failing and needing to be reversed for some people.
In worlds where leaders X of some social movement condemn taboo action Y, you would expect to see X condemning Y. But also in worlds where X supports Y, you would expect them to be unable to publicly state that they support Y, and so in these worlds too you would expect X to condemn Y. But this creates a problem. If condemning Y is what X does in both kinds of world, how can X actually communicate with potential supporters who find themselves wondering where the movement really stands on Y? Do sufficiently strong taboos against Y make it impossible to communicate true condemnation clearly? That would suck.
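A minimal way to make this precise (my notation, not anything from the thread) is to treat the condemnation as evidence and update on it:

$$P(\text{supports } Y \mid \text{condemns } Y) = \frac{P(\text{condemns } Y \mid \text{supports } Y)\,P(\text{supports } Y)}{P(\text{condemns } Y)}$$

If the taboo is strong enough that $P(\text{condemns } Y \mid \text{supports } Y) = P(\text{condemns } Y \mid \text{opposes } Y) = 1$, then $P(\text{condemns } Y) = 1$ as well, and the posterior equals the prior. The condemnation carries zero bits of evidence, which is exactly the communication failure described above.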
I’m going to note that this is a proxy conversation for recent events, and I’m not really going to respond with those events in mind but will instead focus on the abstract social coordination issues.
I think there are two very different aspects here, both of which are important: (a) attribution of causal relevance, and (b) attribution of reward and punishment to shape behaviour. When looking at the fire example from the perspective of (a), it might seem prudent to look at the psychology and upbringing of the “fire” shouter, and at how society prepares or fails to prepare people for handling emergency situations such as fires. From the perspective of (b), on the other hand, it is more reasonable to want simple and clearly understandable rules and procedures. An appropriate rule in the fire example might be to ban shouting fire in a theater, making the “fire” shouter fully liable for harm caused, and to make theaters liable for maintaining certain standard precautions and procedures in case of fire.
more broadly getting at “how do people who disagree about what’s true and what’s good, cooperate?”
Just wanna flag that I think this is a very important question for people to be focusing on.
I think it’s true. I don’t have a super principled answer other than “just, try not to be totalized about it.”
My view is closer to: be totalized about it, but:
Really deeply understand that other people are not totalized about it.
Recognize and learn from examples of other totalized people. Many have caused significant problems. Most have probably been crazy.
Try to argue yourself out of the position that is making you totalized. Make sure you understand your own position well. Think about the nuances and details.
Try to adopt and integrate other totalizing beliefs.
Try not to be a jerk about it.
I think it is correct for the social-judgment-sphere to have an immune system against totalizing beliefs.
I mostly disagree. I think the right stance is somewhere between “people should be able to hold totalizing beliefs without becoming totalized” and “we are living in a poly-crisis! There are too many totalizing problems now to easily catalogue all of them! Everyone needs to get totalized and organized and coordinated NOW!”
Thanks! I fully agree. What I said was wrong and I edited my comment to reflect that.
It’s somewhat nice that “spreading misinformation” is an umbrella covering doing so intentionally (lying) and unintentionally. It is unpleasant that another way to say “misinformation” is “fake news”, and that accusations of such seem to be available as a cheap, fully general attack on political opponents. I guess it would be pretty nice if people always used citations when talking about things, but that seems like an unrealistic ideal.
I like this post. A few thoughts:
While doing PauseAI outreach, one of the responses I really appreciated from people was “I agree this is an important issue”, regardless of whether those people wanted to help build the social movement, or try to help regulate AI, or were focused on completely other things. Rationally, the value might be that their agreement implies that as things move forward we can expect such people to support the cause rather than fight it. It might also just be that it is nice to hear agreement. So a good behaviour for Bobs who aren’t adopting X might be to acknowledge that they agree it is important, but that right now it feels too costly to engage with.
The “you’re either with us or against us” mentality is interesting and problematic. I once heard a version, “if you don’t support the oppressed you are supporting the oppressors”, or, more poignantly, Martin Niemöller’s poem ending with “Then they came for me—and there was no one left to speak for me”. I give these three formulations to show different facets and how the concept isn’t easy to simply accept or oppose. I think it does cause a lot of coordination difficulty, but there is validity to it. Perhaps factoring the idea into its useful and harmful aspects would be beneficial: polarized, dehumanizing thinking is bad, but focusing on coordination and solidarity between social groups is good.
One problem with the “with us or against us” mentality is that people only have so much ability to focus on and understand things. It is bad if people are ignoring ALL issues, but it is also bad if people are spreading themselves too thin and becoming ineffective by trying to focus on ALL issues. Most people should focus on ONE issue or a small cluster of SOME issues. People who focus on systems/networks/communication/logistics kinds of things are a special case. They are in some sense trying to focus on ALL issues, not by becoming familiar with all the details of every issue, but by understanding abstracted views of issues and how they interrelate. This is valuable, and it is not the same thing as trying to simultaneously become an expert in all issues.
I like the idea of ombudspeople. (Is it a contraction of omni-buddy?) I wonder if it would be possible to have something like a guild of ombudspeople for social change somehow.
I like thinking of language and memes as a kind of technology. It would be nice if there was better linguistic technology for dealing with this kind of thing.
One tool that might help is making softer forms of support easier and more visible, like being able to easily sign a sentiment saying “yeah I heard a paragraph about this issue and based on that paragraph I agree some people should think about it some more but I don’t think I should specifically be one of those people”.
More ambitiously, I want better maps of the different worldviews people are living in and better jargon for talking about worldviews abstractly and communicating productively with people who can mutually understand and acknowledge that they are operating from within different worldviews.
I think I should have used the word “polarizing” instead of “politicizing”.
I mean the first two also with the implication that people treat these things as quasi-conflicts between quasi-tribes, and so become less likely to focus on what is correct and beneficial and more likely to focus on signalling tribal membership and allegiance.
I think your third bullet point is related, but not necessarily what I’m talking about. Arguing about how society should respond to and think about school shootings is important. School shootings are bad and should be prevented, just like traffic accidents and heart disease are bad and should be prevented. I believe responses like gun control are politicized in that people are likely to pattern-match “gun control” onto a quasi-tribal conflict and then respond accordingly, instead of actually thinking about it, or, as should often be done, ignoring it if they are not well versed in the relevant issues. But just talking about issues and which parties plan what responses to those issues isn’t necessarily a problem, except insofar as it causes people to start contextualizing the issue as a quasi-tribal conflict.
Maybe instead of “politicized” or “polarized” we need a term like “quasi-tribalized” or “in-group-out-group-conflictized”, or something similar but less rhetorically unwieldy.
Because if you people truly believe that the world is ending, then you would be ready to show something more than cheap words, make some great sacrifices which would only people in such a great desperation make.
This seems reasonable. Signalling. One would hope that actions like dedicating one’s career to AI alignment and AI ethics, or leaving AI companies over ethical concerns, would count as such a signal, and there are many people doing such things. But I don’t know how compelling these actions actually are to people. I could be making way more money if I was just trying to make money instead of trying to figure out how to work on AI ethics. That’s a pretty significant sacrifice; maybe not as significant as cutting off a leg, but in some ways it might actually be more significant. It seems hard to quantify.
But to be more visible, I have considered doing that “human statue” thing buskers do with a sign saying “if I can pause everything, you can pause AGI development”.
I suspect that “bomb data centers” meme causal story was not somebody lying, but somebody recalling by memory without a thought that such serious allegation maybe is worthy to actually look up it and not rely on unreliable memory.
I agree with this and think it’s an important thing to be aware of, but also, importantly, it is still ~~lying~~ spreading misinformation.
I agree with the need to accurately model the thinking of anti-extinction madmen in order to better communicate with and de-escalate them. I think the thinking might be: “Sam Altman is one of the actors driving the race towards dangerous AI capabilities. The current environment seems to incentivize this behaviour. If I commit a visible violent act towards him, it will reduce the dangerous incentive; after all, people want money and prestige, but they don’t want to have their property vandalized or to die violently.”
They may also have been thinking of this as a commitment signal. Throwing fire at someone’s house is a very bad thing to do, both in terms of the effect it could have on the victim and the effect it will likely have on the perpetrator. To know that and still be willing to do it could be seen as a signal of conviction in the belief that Sam Altman’s actions, and the actions of large AI companies, are harmful. Unfortunately, it can also be seen as a signal that the perpetrator is violently insane, and that the anti-extinctionists are violently insane. Ironic and unfortunate.
Also, toward the end of de-escalating madmen, I think we need more compressed versions of the essence of this post. Maybe something like “global GPU control is the only sufficient control against ASI; anything that doesn’t move us towards international coordination is counterproductive”.
We need better de-politicizing technology. The politicization of issues seems very difficult to avoid. Once something becomes politicized, can it be depoliticized? I think we need jargon, communication norms, and platforms that help push back on the politicization of everything.
Relatedly, the notion that politicization harms people’s ability to form accurate beliefs and coordinate with one another seems evidently true to me, but I can’t easily find citations that straightforwardly support it; instead, it seems like most research takes it as assumed and then investigates specific facets and details. Do you know of any good citations?
Yeah, that’s interesting to point out, that belief state structures may be more complicated than the underlying states those beliefs represent. That’s difficult to square with my claim that all the information is present in the input, and that network layers can only destroy or change the geometric embedding of the information. Definitely something I want to look into and think about further.
TDA sounds cool. I’d like to take inspiration from it; even if it isn’t useful as a tool as it stands, it may contain good ways to think about things, inspire tools that are useful, or at the very least give insight into things that have been tried and found not to be useful.
I wrote a response essay about debate
This is awesome. Thank you. I’m sorry I won’t be responding with the same level of effort.
I would like to engage more with this sometime in the future. I’m especially interested in the claims about indecisive and decisive epistemology. Am I understanding correctly that “critical fallibilism” is the name you’ve given to your decisive epistemology?
I’ll also give a few thoughts on a few parts that stuck out to me.
Would it be better if all debates were friendly and collaborative? If so, what makes debate different than collaborative truth seeking besides the existence of bad, adversarial debates?
I think debate implies opposing claims defended by people representing those claims. That kind of representation doesn’t seem like a good incentive structure. Any participant should be incentivized to point out relevant flaws or reinforcing evidence wherever they see them.
I feel like nontrivial claims are not usually stated well enough to actually create a shared understanding. So people make statements without actually getting on the same page, and then either need to work backwards to get on the same page or end up arguing past each other without realizing it.
In general, I place high value on having a good answer to the question, “If I’m wrong, and someone else knows it, how will I be corrected?” I do my best to correct my own errors, but I also want to be correctable by others.
I really like this sentiment. I wish to be less wrong, and welcome help!
I think this is important so people aren’t ignored due to poor culture fit, low social status, poor ability to get attention from crowds, lack of credentials, and other heuristics that are different from actually being wrong.
I really like this as well. I feel it is productive to think of language and communication and coordination as technologies. I think our communication technology is getting more powerful, which leads us both to new communication strategies and to new communication issues. I am hopeful that someday our communication technology will be sufficiently advanced that we can meaningfully have large-scale consensus, rather than the awkward, indirect, representative, and generally cumbersome and ineffectual ways that people currently must communicate with the people and systems they feel the need to understand and influence.
I skimmed some of your writing on critical fallibilism and related topics. I like it. Maybe I’ll try to find time to read more sometime : )
I forgot to mention that I do have affection for the term “argument” as it is used in logic and proof. I also like incentive structures that encourage people who are not friendly towards one another to work towards shared goals. And I think competition can be useful in incentive structures, and can be intrinsically fun. But the things that “debate” normally points to in natural language don’t seem worth trying to salvage. There might be some kind of hyperstition cascade going on with the word “debate”, but it feels in some way more complex than a simple slur-type cascade.
Finding a point of contradiction, and knowing that at least one of you is wrong, and talking critically about it using arguments and evidence
I’d say you have to do a lot of work to have enough shared context to know that at least one of you is wrong. I like the “at least one” sentiment, since both people can be wrong, but both people can also be right and just be using language differently, failing to really connect with one another. The parts doing all the work are sharing world models, synchronizing terminology, locating differences in world modelling, and looking at evidence. The debate framing doesn’t seem to add much to that.
I like this post. A few thoughts:
I read “Change” by Damon Centola recently. They wrote about some of their sociology work demonstrating that politics is indeed the mind-killer, but also that centralized “fireworks” social networks for communication prevent the spread of ideas. Peer-to-peer “fishing net” social networks do much better at spreading ideas and changing minds, but only if there is not a politically charged context.
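A toy sketch of that contrast, under my own assumptions rather than anything from the book: model idea adoption as a threshold-2 “complex contagion” (you adopt once two of your contacts have), and compare a hub-and-spoke “fireworks” network against a ring-lattice “fishing net”.

```python
# Toy sketch (my construction, not Centola's code): threshold-2 complex
# contagion on a "fireworks" star network vs. a "fishing net" ring lattice.
import networkx as nx

def spread(G, seeds, threshold=2):
    """Iterate until no node with >= threshold adopted neighbours remains unadopted."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node in G.nodes:
            if node not in adopted:
                if sum(nb in adopted for nb in G.neighbors(node)) >= threshold:
                    adopted.add(node)
                    changed = True
    return adopted

n = 30
fireworks = nx.star_graph(n - 1)                  # hub 0 broadcasts to all leaves
fishing_net = nx.watts_strogatz_graph(n, 4, 0.0)  # ring lattice: overlapping peer ties

print(len(spread(fireworks, {0, 1})))    # stalls at 2: each leaf sees only the hub
print(len(spread(fishing_net, {0, 1})))  # cascades to 30: neighbours reinforce each other
```

The point of the toy model is just the book’s qualitative claim: broadcast hubs are great for simple information but bad for behaviours that need social reinforcement from multiple peers.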
I suspect you have already heard of it, but I really like This Video Will Make You Angry for describing a memetic phenomenon that prevents discourse across disagreeing groups. A much longer treatment of the idea is found in The Toxoplasma Of Rage.
I know you lampshade this towards the end of the post, but I dislike “argument” and “debate” as contexts for discussing discourse. As implied in Centola’s book, more minds are changed by discourse than by debate. I know it’s obtuse to use a phrase like “collaborative truth seeking” but it would be nice if there were more people standing up for the “collaborative truth seeking” vibes.
Speaking of standing up for vibes, thank you for standing up for the “people should be allowed to not take strong stances on things they know nothing about” and “not every person needs to be informed about every issue” vibes. I feel they are required as we are moving into the global hivemind era of civilization.
Not knowing things that are important to know is painful. Disinformation campaigns are effective. This is awful, but the solution is not to pretend we can know things we cannot, but to try to find solutions to make disinformation less effective and collaborative truth seeking more effective.
Social Dark Matter has some similar ideas I think are valuable about the dynamics we can expect surrounding taboos.
About PUA themes, I liked “Models: Attract Women Through Honesty” by Mark Manson. I think it approaches the problem of men wanting to attract women with a fairly healthy perspective. I think it might help pull some people who are susceptible to “red pill” and “manosphere” vibes towards healthier places.
About descriptions of social systems being kinda awful in ways that are not commonly described, I like Zvi’s Immoral Mazes Sequence.
Regarding “there aren’t enough competent people doing productive work”… I think it is more polite (and maybe more accurate and useful) to frame the issue as “doing productive work is actually super difficult for subjects where verifying correctness is difficult, and verifying correctness is almost always difficult”.
I think public, social-media-based discourse is pretty badly broken in many ways. It is still doing a lot of good, but it is for sure doing a lot of bad, and I feel it would be very difficult to determine whether it’s doing more harm than good. I’m trying to write a sequence describing how I view the problems and some potential social media designs that could help solve them. But unfortunately writing takes a long time and this isn’t at the top of my priorities right now.
I’m 35. You can view my experience on my LinkedIn profile. I was working as a technologist at an automotive company, involved with some AI projects in collaboration with the Vector Institute. That’s when GPT-3 was released, prompting me to take the prosaic scaling hypothesis more seriously, change my plan to saving money so I could finish my CS BSc, and change my career goal to working on technical AI alignment.
While completing my BSc I had the opportunity to focus on my NDISP project, first as a directed-studies project supervised by George Tzanetakis, and then extending it into an honours project supervised by Teseo Schneider. George is a professor focused on classical AI and music algorithms. Teseo is a professor focused on graphics algorithms. They are probably the most relevant experienced people who have worked with me and could vouch for the quality of my work, but neither is focused on technical alignment, so they probably cannot vouch for my strategic orientation.
My project was mostly self-driven, attempting to extend and apply the tools introduced in Visualizing Neural Networks with the Grand Tour to the same network that was examined in Understanding and controlling a maze-solving policy network, as part of a long-term plan: first build intuition for interactive n-dimensional tools while applying them to relatively easier-to-understand image networks, then apply them to relatively more difficult-to-understand transformer networks.
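For readers unfamiliar with the technique, here is a minimal sketch of the core operation, under my own simplifying assumptions rather than code from the project: each Grand Tour frame projects the n-dimensional activations onto an orthonormal 2D plane (the real tool animates smoothly between planes rather than sampling them independently).

```python
# Minimal sketch of a single Grand Tour frame (hypothetical shapes and names;
# the actual tool interpolates smoothly between successive projection planes).
import numpy as np

def random_frame(dim, rng):
    """Return an orthonormal basis (dim x 2) for a random 2D viewing plane."""
    q, _ = np.linalg.qr(rng.standard_normal((dim, 2)))
    return q

rng = np.random.default_rng(0)
activations = rng.standard_normal((1000, 64))  # stand-in for a layer's activations
frame = random_frame(64, rng)
points_2d = activations @ frame  # (1000, 2): the scatter one tour frame displays
```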
I have reached out to some of the authors of those papers and have had brief correspondences with Mingwei Li, TurnTrout, peligrietzer, and Ulisse Mini, but I’m unsure how deeply any of them have looked into my work.
Here’s the page describing my projects.
I think the NDISP project has the clearest value proposition. I’m currently rewriting the tool from scratch to be standalone and ready for alpha users. I’d recommend skimming the videos for a sense of the project.
All of my other projects are less legible, with much lower probability of much greater usefulness. Charitably, they could be described as working on useful paradigm shifts for the field of AI Alignment and rational global coordination. Less charitably, they could be described as a crackpot shouting at clouds. I might describe them as Butterfly Ideas that I really want to get out of the stage of being butterfly ideas, but alas, they keep flapping around me.
That is commonly given advice, and it makes sense: when you are starting out, you don’t know what you don’t know and can’t see the flaws in your own ideas. But on the other hand, coming up with your own ideas is its own skill, one that may not be trained well by only learning from other people’s experience. It’s hard to say. I suppose the obvious ideal is to practice coming up with your own ideas while having experienced mentors to critique them.
What kinds of things do you have in mind when you say “get more experience”? I am applying to fellowships but haven’t been accepted to any yet. I don’t want to do more ML work that doesn’t focus on AI alignment if I can help it. I was considering writing some literature reviews. There are also some papers I would like to try replicating.
But if I’m being honest, the thing that feels most valuable to me is working on NDISP, OISs, and Maat, or finding other, similar-enough projects and contributing to them. I guess I’m gambling with the time I have to focus on these things, and I need to accept that if I’m deciding to focus on projects I think will be valuable but other people don’t see the value in, then I’ll have to keep focusing on them without financial or moral support, and accept the consequences of doing so.
I was responding sardonically to their statement: “More broadly, I am struggling to see what evidence you have for why current alignment frameworks (among other things) would fail to transfer to more capable models.” I maybe deserve the “too sneering” or “too combative” react for it.
The statement seems indicative of the view that companies should be allowed to push ahead with whatever they are doing unless someone can prove it is unsafe and harmful. I think a much healthier view for society to hold is that companies should NOT be allowed to push ahead with whatever they are doing unless someone can prove that it IS safe and NOT harmful.
I find trying to find funding or paid roles or even unpaid roles so demoralizing. How do I keep motivated?
I don’t want to focus on trying to survey the landscape of funding opportunities and learning to network with people productively. It’s so much nicer to just focus on the work I want to be doing, but it seems I either can’t make it legible enough fast enough, or it’s actually not valuable and I should go do something else with my time.
I want advice. How do I get funding? How do I think about getting funding? How do I stay motivated to keep thinking about how to get funding?
This reminds me of another communication problem I’ve been musing on here and there. If you solved the alignment problem to a sufficient degree that it was wise for humanity to proceed with ASI, could you convince others it was real and to take it seriously? It is a message that I would desperately want to effectively reach me and I harbor concerns that it might not.