Jozdien comments on Evaluations (of new AI Safety researchers) can be noisy

Jozdien 5 Feb 2023 9:54 UTC
7 points
1
I think this post is valuable, thank you for writing it. I especially liked the parts where you (and Beth) talk about historical negative signals. To a certain kind of person, I think that can serve better than anything else as grounding to push back against unjustified updating.
A factor that I think pulls more weight in alignment than in other domains is the prevalence of low-bandwidth communication channels, given the number of new researchers whose sole interface with the field is online and asynchronous: text, or few-and-far-between calls. The effects of updating too hard on negative evals are probably amplified a lot when those evals form the bulk of the reinforcing feedback you get at all. To the point where, at times, it’s felt to me like True Bayesian Updating from the inside, even while acknowledging the noisiness of those channels, because there’s little counterweight to it.
My experience here probably isn’t super standard, given that most of the people I’ve mentored coming into this field aren’t located near the Bay Area or London or anywhere else with other alignment researchers, but having their sole point of interface with the rest of the field be a sparse, opaque section of text has definitely discouraged some of them far more than anything else.