Adam Scholl comments on Anthropic, and taking “technical philosophy” more seriously

Adam Scholl 14 Mar 2025 5:16 UTC
2 points
0
No, I agree it’s worth arguing the object level. I just disagree that Dario seems to be “reasonably earnestly trying to do good things,” and I think this object-level consideration seems relevant (e.g., insofar as you take Anthropic’s safety strategy to rely on the good judgement of their staff).
- Raemon 14 Mar 2025 5:19 UTC
  2 points
  0
  I think (moderately likely, though not super confident) it makes more sense to model Dario as:
  “a person who actually is quite worried about misuse, and is making significant strategic decisions around that (and doesn’t believe alignment is that hard)”
  than as “a generic CEO who’s just generally following incentives and spinning narrative post-hoc rationalizations.”
  - Adam Scholl 14 Mar 2025 5:26 UTC
    2 points
    −2
    Yeah, I buy that he cares about misuse. But I wouldn’t quite use the word “believe,” personally, about his acting as though alignment is easy—I think if he had actual models or arguments suggesting that, he probably would have mentioned them by now.
    - Raemon 14 Mar 2025 17:59 UTC
      2 points
      0
      I don’t particularly disagree with the first half, but your second sentence isn’t really a crux for me for the first part.