the main thing that appears to have happened is that I had exceptional intuitions about what problems/fields/approaches were important and promising
I’d like to double-click on your exceptional intuitions, though I don’t know what questions would be most revealing if answered. Maybe: could you elaborate on what you saw that others didn’t see and that made you propose b-money, UDT, the need for an AI pause/slowdown, etc?
E.g., what’s your guess re what Eliezer was missing (in his intuitions?) such that he came up with TDT but not UDT? Follow-up: Do you remember what the trace was that led you from TDT to UDT? (If you don’t, what’s your best guess as to what it was?)
b-money: I guess most people working on crypto-based payments were trying to integrate with the traditional banking system, and didn’t have the insight/intuition that money is just a way for everyone to “keep tabs” on how much society as a whole owes to each person (e.g. for previous services rendered), and that therefore a new form of money (i.e. neither fiat nor commodity) could be created and implemented as a public/distributed database or ledger.
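A minimal sketch of that framing (toy code for illustration, assuming a fixed set of pseudonyms and honest participants; it is not the b-money protocol itself, which also has to specify money creation and contract enforcement): money is just a replicated table of balances, and a payment is a broadcast update that every participant applies to their own copy.

```python
# Toy sketch of "money as a shared ledger": every participant keeps the same
# table of how much society "owes" each pseudonym, and a payment is a
# broadcast update to that table. Illustrative only, not the b-money design.

class Ledger:
    def __init__(self, initial_balances):
        # pseudonym -> amount society currently owes that pseudonym
        self.balances = dict(initial_balances)

    def transfer(self, sender, receiver, amount):
        """Apply a broadcast payment message if the sender can cover it."""
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount


# Every node applies the same messages in the same order, so all copies agree.
nodes = [Ledger({"alice": 100, "bob": 0}) for _ in range(3)]
for node in nodes:
    node.transfer("alice", "bob", 30)
assert all(node.balances == {"alice": 70, "bob": 30} for node in nodes)
```

The point of the sketch is that nothing about the traditional banking system is load-bearing here; what matters is agreement on the shared table.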
UDT: I initially became interested in decision theory for a very different reason than Eliezer. I was trying to solve anthropic reasoning, and tried a lot of different ideas but couldn’t find one that was satisfactory. Eventually I decided to look into decision theory (as the “source” of probability theory) and had the insight/intuition that if the decision theory didn’t do any updating then we could sidestep the entire problem of anthropic reasoning. Hal Finney was the only one to seriously try to understand this idea, but couldn’t or didn’t appreciate it (in fairness my proto-UDT was way more complicated than EDT, CDT, or the later UDT, because I noticed that it would cooperate with its twin in one-shot PD, and added complications to make it defect instead, not questioning the conventional wisdom that that’s what’s rational).
Eventually I got the idea/hint from Eliezer that it can be rational to cooperate in one-shot PD, and also realized that my old idea seemed to fit well with what Nesov was discussing (counterfactual mugging). This caused me to search for a formulation that was simple/elegant and could solve all of the problems known at the time, which became known as UDT.
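To make the “no updating” move concrete, here is a toy version of counterfactual mugging (a sketch with the standard illustrative payoffs, not numbers from the original discussion): Omega flips a fair coin, asks you for $100 on heads, and on tails pays you $10,000 only if it predicts you would have paid on heads.

```python
# Toy counterfactual mugging: Omega flips a fair coin.
#   Heads: Omega asks you for $100.
#   Tails: Omega gives you $10,000 iff it predicts you would pay on heads.
# (Standard illustrative payoffs.)

def policy_value(pays_when_asked: bool) -> float:
    """Expected value of a policy, evaluated from the prior (no updating)."""
    heads = 0.5 * (-100 if pays_when_asked else 0)
    tails = 0.5 * (10_000 if pays_when_asked else 0)
    return heads + tails

# Updateless view: pick the policy with the best prior expected value.
print(policy_value(True))   # 4950.0 -> pay when asked
print(policy_value(False))  # 0.0

# Updating view: after conditioning on "the coin came up heads and I'm being
# asked", paying looks like a pure $100 loss, so the updating agent refuses
# -- and a predictable refuser never gets the $10,000 branch.
```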
I think Eliezer was also interested in anthropic reasoning, so what he was missing was my move of looking into decision theory for inspiration/understanding and then making the radical call that maybe anthropic reasoning is unsolvable as posed and should be sidestepped via a change to decision theory.
need for an AI pause/slowdown: I found Eliezer convincing when he started talking about the difficulty of making AI Friendly and why others likely wouldn’t try hard enough to succeed, and just found it implausible that he and a small team could win a race against the entire world, which was spending much less effort/resources on trying to make their AIs Friendly. Plus I had my own worries early on that we needed to either solve all the important philosophical problems before building AGI/ASI, or figure out how to make sure the AI itself is philosophically competent, and both are unlikely to happen without a pause/slowdown (partly because nobody else seemed to share this concern or talked about it).
Thanks!
The entire thing has a very “what are you tracking in your head?” vibe (https://www.lesswrong.com/posts/bhLxWTkRc8GXunFcB/what-are-you-tracking-in-your-head), though that’s admittedly not very specific.
What stands out to me in the b-money case is that you kept tabs on “what the thing is for”/”the actual function of the thing”/”what role it is serving in the economy”, which helped you figure out how to make a significant improvement.
Very speculatively, maybe something similar was going on in the UDT case? If the ideal platonic theory of decision-making “should” tell you and your alt-timeline-selves how to act in a way that coheres (~adds up to something coherent?) across the multiverse or whatever, then it’s possible that having anthropics as the initial motivation helped.