Not sure. Let me think about it step by step.
It seems like the claims here are:
Illegible and legible problems both exist in AI safety research
Decision-makers are less likely to understand illegible problems
Illegible problems are less likely to cause decision-makers to slow/stop where appropriate
Legible problems are not the bottleneck (because they’re more likely to get solved by default by the time we reach danger zones)
Working on legible problems shortens timelines without much gain
[From JohnW if you wanna incorporate] If you work on legible problems by making illegible problems worse, you aren’t helping.
I guess you do have a lot of stuff you wanna say, so it’s not like the post naturally has a short handle.
“Working on legible problems shortens timelines without much gain” is IMO the most provocative handle, but might not be worth it if you think of the other points as comparably important.
“Legible AI problems are not the bottleneck” is slightly more overall-encompassing
“I hope Joe Carlsmith works on illegible problems” is, uh, a very fun title but probably bad. :P
Yeah, it’s hard to think of a clear improvement to the title. I think I’m mostly trying to point out that thinking about legible vs. illegible safety problems leads to a number of interesting implications that people may not have realized. At this point the karma is probably high enough to help attract readers despite the boring title, so I’ll probably just leave it as is.
Makes sense, although I want to flag one more argument: the takeaways people tend to remember from posts are the ones encapsulated in their titles. “Musings on X” style posts tend not to be remembered as much, and I think this is a fairly important post for people to remember.
1. Making illegible alignment problems legible to decision-makers efficiently reduces risky deployments
2. Make alignment problems legible to decision-makers
3. Explaining problems to decision-makers is often more efficient than trying to solve them yourself.
4. Explain problems, don’t solve them (the reductio)
5. Explain problems
6. Explaining problems clearly helps you solve them and gets others to help.
I favor the 2nd for alignment and the last as a general principle.