Thank you, this was very useful to me
Luc Brinkman
I agree with the former (at least in the case of CG) but not the latter. CG says they’re bottlenecked by funding allocation. Thus, distributing funding to ‘AI Automation’ likely takes away funding-allocating resources from other domains, and thereby reduces funding for those other domains.
Conversely, if they do get more human resources to allocate funding, they initially still have to choose whether to spend those marginally on ‘classical funding’ or on ‘Safety Automation’.
I think so too. Part of an upcoming post in this sequence :)
Does this cartoon basically define impact as “affecting something”? Because then I’d say that’s not what I’m referring to with impact in the context of AI Safety. I mean something like “reduce probability of existentially bad outcomes”
Can you tell me more about that non-obvious relationship? At first I thought the main difference you were pointing at is that alignment is not a one-off thing but more of a continuous/recurring state. But I think you’re pointing at something else (too)
Yup, I agree that’s one factor working in favor of making nonprofit easier than forprofit
Meaning in for-profit there’s more competition and existing solutions are more efficient, thus you need to be very good to be marginally better?
Why Even Experts Don’t Know What to Do About AI Risk
Lack of conceptual understanding of the basics seems to me like a major reason why people keep on doing activities that sound like “AI Safety” but are likely making things worse.
It’s also what we’re trying to change with Lens Academy.
Interesting hypothesis that such basics might be less understood because they’re inherently less trained in practice while doing research. Seems plausible.
I do also think the AI Safety education space has a share in this, focusing too much on prosaic empirical methods at the cost of strategy and conceptual fundamentals.
Shout outs to AFFINE and Iliad Intensive for also emphasising the basics (as far as I can see)
As someone with a decent amount of ideas but not a lot of published writings, I notice I have a few things blocking me from writing more:
Writing taking (me with my current skillset and behaviors) a long time, and being very busy
Good writing taking a long time; not having internalized various methods of making writing good; needing to look up and reflect on such methods while writing, which is a very slow process until those methods are internalized
not wanting to waste people’s time with low quality writing
noticing that when I do iterate on a draft, I (at least feel like I) am learning about writing and making progress in my abilities. So spending a long time on a draft doesn’t feel like wasted time. But it also doesn’t produce output, so reward is low.
not wanting to “dilute” my account with mediocre posts compared to a few posts that have seen much much more effort and with which I’m much happier / more proud.
I do enjoy the few times I have simply taken 20min to write a draft without the intention of publishing it. Those have usually been helpful in shaping my ideas.
This doesn’t really relate to your posts but it came to mind and I guess your post incentivized me to write more, resulting in this comment
Can you say more about what motivated you to write this? I like the points you make in and of themselves but I feel they’re left hanging without a conclusion or reason.
Yes, only we think that the book seems to hit a high enough rate of internalizing the message that the claim would generalize to just “if everyone reads it” without being explicitly conditional on everyone internalizing the message.
Yeah, I think a more accurate statement would have been that insiders with a low p(doom) aren’t massively swayed to a higher p(doom) by the book.
If Everyone Reads It, Nobody Dies—Course Launch
That cartoon of the maze could be made into a nice little collaborative mini-game.
Not sure what that would be useful for, but it’s just something that came to mind.
Nice pattern of using an AI block inside of a collapsible section, and using it to clarify/exemplify what you mean in a slightly different voice.
Yup, that would be a healthier view for society to hold. Sadly in pretty much any field that I’m aware of, companies are allowed to push ahead until accidents happen and until it’s proven that it’s unsafe.
We’ll need to find some way of overcoming that default since we’ll only get one real shot at superintelligence alignment.
I see how “Local alignment != asymptotic alignment” is more accurate but I find the current title/claim easier to understand.
I could see them being a good pair, where the current title makes the claim and the local vs asymptotic stuff adds the mechanism. But the mechanism without the claim, I would fear, would fail to land for many people. Just something to keep in mind as one datapoint if you ever do a followup post :)
Hmm, sth like doing ML research or creating tooling that’s dual use without really attending to the idea that it might be dual use and without a clear theory of change. Sorry I don’t have something more specific for you here.