Parv Mahajan comments on Parv Mahajan’s Shortform

Parv Mahajan 8 Aug 2025 21:44 UTC
1 point
Cool explanation! I think this is probably post-hoc analysis; some experiments that might be stronger evidence:
- Reduce the amount of potentially extraneous information you’re giving the model: “In the sentence below, what does marinade mean?”
- Prompt it or a recent model for completions in a way that reveals the internal definition: “Our script prints the returned ‘score’ float. Ok. Now we need to improve ‘solve_code_contests_rust.py’. We also should implement ‘Backoff’ and ‘content filtering’ to ensure stable. But we must ensure whichever we produce is within time. Now we also must ensure marinade, meaning…”
- Some interp work, e.g. finding probes that fire on “marinade” and seeing what they’re correlated with; not necessarily on GPT-5, but on similarly trained models
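To make the probe idea a bit more concrete, here is a rough sketch of the analysis step on made-up data. Everything in it is an assumption for illustration: the function names are hypothetical, the "activations" are synthetic Gaussian vectors with a planted direction rather than anything from a real model, and a difference-of-means direction is just one simple choice of probe (a logistic-regression probe would be another).

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit_mean_diff_probe(acts, is_target):
    """Unit vector from the mean non-target activation to the mean target activation."""
    pos = [a for a, t in zip(acts, is_target) if t]
    neg = [a for a, t in zip(acts, is_target) if not t]
    diff = [p - q for p, q in zip(mean_vector(pos), mean_vector(neg))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def probe_scores(acts, direction):
    """Project each activation vector onto the probe direction."""
    return [sum(a * w for a, w in zip(row, direction)) for row in acts]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def correlate_with_features(scores, features):
    """Correlate probe scores with each candidate feature (name -> per-token values)."""
    return {name: pearson(scores, values) for name, values in features.items()}


def make_synthetic_data(n=2000, d=16, seed=0):
    """Synthetic stand-in for real activations: ~20% of 'tokens' carry a planted direction."""
    rng = random.Random(seed)
    hidden = [rng.gauss(0, 1) for _ in range(d)]
    norm = math.sqrt(sum(h * h for h in hidden))
    hidden = [h / norm for h in hidden]
    is_target = [rng.random() < 0.2 for _ in range(n)]
    acts = [[rng.gauss(0, 1) + (3.0 * h if t else 0.0) for h in hidden]
            for t in is_target]
    features = {
        # Hypothetical candidate meanings you would label by hand on real data.
        "planted_concept": [1.0 if t else 0.0 for t in is_target],
        "unrelated_noise": [rng.gauss(0, 1) for _ in range(n)],
    }
    return acts, is_target, features


if __name__ == "__main__":
    acts, is_target, features = make_synthetic_data()
    direction = fit_mean_diff_probe(acts, is_target)
    scores = probe_scores(acts, direction)
    for name, r in correlate_with_features(scores, features).items():
        print(f"{name}: r = {r:.2f}")
```

On a real model, `acts` would be residual-stream activations at some layer, `is_target` would mark the “marinade” token positions, and the feature columns would be whatever candidate meanings you want to test. Ideally you would score the probe on held-out text where the word itself doesn’t appear, since correlating on the training tokens mostly just recovers the labels.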
I don’t have access to paid versions of GPT-5, so let me know if you do one of these and find something!