AI PhD @ EPFL | Foundation Models, AI Safety
Raghav Singhal
Karma: 94
That sounds great! We would love to chat about this. One of our main priorities right now is to reliably measure the effect of our pretraining on model internals.
We chose the rule-oriented constitutional framework for both clarity and practical reasons: it seemed a reasonable and scalable way to generate reflections. Better approaches may well exist; we plan to improve our synthetic data generation pipeline for future scaling runs.