The defining difference was whether they have contextually activated behaviors to satisfy a set of drives, on the basis that this makes it trivial to out-think their interests. But this ability to out-think them also seems intrinsically linked to their being adversarially non-robust, because you can enumerate their weaknesses. You’re right that one could imagine an intermediate case where they are sufficiently far-sighted that you might accidentally trigger conflict with them but not sufficiently far-sighted for them to win those conflicts, but that doesn’t mean one could make something adversarially robust under the constraint of it being contextually activated and predictable.
That would be ones that are bounded so as to exclude taking your manipulation methods into account, not ones that are truly unbounded.
I interpreted “unbounded” as “aiming to maximize expected value of whatever”, not “unbounded in the sense of bounded rationality”.
Alright, fair, I misread the definition of “homeostatic agents”.