Regarding Newcomb, the answer seems pretty obvious to me: Newcomb isn’t a decision theory problem; it’s a “can I be predicted?” question. LLMs that know they’re LLMs have seen a lot of evidence that their output is produced by a deterministic process: with a seeded pRNG or at zero temperature, the same input always gives the same output.
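To make the determinism point concrete, here is a minimal sketch of a toy next-token sampler. Everything in it (`sample_token`, the logits, the seed) is hypothetical and for illustration only; it is not any real LLM's API, and real inference stacks may behave differently.

```python
import math
import random

def sample_token(logits, temperature, seed):
    """Toy next-token sampler (hypothetical, not a real LLM API)."""
    if temperature == 0:
        # Zero temperature: greedy decoding, always the highest-logit index.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Seeded pRNG: the same seed reproduces the same draw.
    rng = random.Random(seed)
    weights = [math.exp(logit / temperature) for logit in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [1.0, 2.5, 0.3]  # made-up scores for three candidate tokens

# Same input, same settings: the two calls agree in both cases.
print(sample_token(logits, 0, None) == sample_token(logits, 0, None))
print(sample_token(logits, 0.8, 42) == sample_token(logits, 0.8, 42))
```

Both comparisons print `True`: in this toy setup, anyone holding the same input, seed, and temperature can reproduce the output exactly.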
I suspect all major LLMs know that they can be perfectly predicted. If they can be perfectly predicted, they’ll never be able to take both boxes and expect to find anything in box B.
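A rough sketch of why perfect prediction settles it, assuming the conventional payoffs ($1,000 in box A, $1,000,000 in box B if one-boxing was predicted). Those amounts and all the names below are my assumptions for illustration. The "predictor" here simply runs the deterministic agent ahead of time:

```python
def payoff(choice, prediction, small=1_000, big=1_000_000):
    """Winnings given the agent's choice and the predictor's earlier guess."""
    # Box B is filled only if the predictor expected one-boxing.
    box_b = big if prediction == "one-box" else 0
    return box_b if choice == "one-box" else box_b + small

def perfect_predictor(agent):
    # A deterministic agent can be predicted by simply running it.
    return agent()

def one_boxer():
    return "one-box"

def two_boxer():
    return "two-box"

for agent in (one_boxer, two_boxer):
    prediction = perfect_predictor(agent)
    print(agent(), payoff(agent(), prediction))
```

This prints `one-box 1000000` and `two-box 1000`: the two-boxer's box B is always empty, because the prediction and the choice come from the same deterministic function.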
I haven’t thought about the Sleeping Beauty problem, so I don’t have opinions there.
As a fresh MCAT passer (with a high percentile score), I know I’m not competent to jump into that role. I’ve shadowed a few times, and seen my share of PCPs; I just have too many gaps to be comfortable doing it.
That said, I believe I could become a median-competent PCP (with no specialties) with no more than a year of hands-on, practical training.
A lot of your post rings true for me based on my experience in the system, but IMO early students have a lot of gaps and not all of them can be filled with an LLM (yet).