Thank you, this was very useful to me
Luc Brinkman
I agree with the former (at least in the case of CG) but not the latter. CG says they’re bottlenecked by funding allocation. Thus, distributing funding to ‘AI Automation’ likely takes away funding-allocating-resources from other domains, and thereby reduces funding for those other domains.
Conversely, if they do get more human resources to allocate funding, they initially still have to choose whether to spend those marginally on ‘classical funding’ or on ‘Safety Automation’
I think so too. Part of an upcoming post in this sequence :)
Does this cartoon basically define impact as “affecting something”? Because then I’d say that’s not what I’m referring to with impact in the context of AI Safety. I mean something like “reduce probability of existentially bad outcomes”
Can you tell me more about that non-obvious relationship? At first I thought the main difference you were pointing at is that alignment is not a one-off thing but more of a continuous/recurring state. But I think you’re pointing at something else (too)
Yup, I agree that’s one factor working in favor of making nonprofit easier than forprofit
Meaning in for-profit there’s more competition and existing solutions are more efficient, thus you need to be very good to be marginally better?
Lack of conceptual understanding of the basics seems to me like a major reason why people keep on doing activities that sound like “AI Safety” but are likely making things worse.
It’s also what we’re trying to change with Lens Academy.
Interesting hypothesis that such basics might be less understood because they’re inherently less trained in practice while doing research. Seems plausible.
I do also think the AI Safety education space has a share in this, focusing too much on prosaic empirical methods at the cost of strategy and conceptual fundamentals.
Shout outs to AFFINE and Iliad Intensive for also emphasising the basics (as far as I can see)
As someone with a decent amount of ideas but not a lot of published writings, I notice I have a few things blocking me from writing more:
Writing taking (me with my current skillset and behaviors) a long time, and being very busy
Good writing taking a long time; not having internalized various methods of making writing good; needing to look up and reflect on such methods while writing, which is a very slow process until those methods are internalized
not wanting to waste people’s time with low quality writing
noticing that if I do iterate on a draft, I am (or at least feel like I am) learning about writing and making progress in my abilities. So spending a long time on a draft doesn’t feel like wasted time. But it also doesn’t produce output, so reward is low.
not wanting to “dilute” my account with mediocre posts compared to a few posts that have seen much much more effort and with which I’m much happier / more proud.
I do enjoy the few times I have simply taken 20min to write a draft without the intention of publishing it. Those have usually been helpful in shaping my ideas.
This doesn’t really relate to your posts but it came to mind and I guess your post incentivized me to write more, resulting in this comment
Can you say more about what motivated you to write this? I like the points you make in and of themselves but I feel they’re left hanging without a conclusion or reason.
Yes, only we think that the book seems to hit a high enough rate of internalizing the message that the claim would generalize to just “if everyone reads it” without being explicitly conditional on everyone internalizing the message.
Yeah, I think a more accurate statement would have been that insiders with a low p(doom) aren’t massively swayed to a higher p(doom) by the book.
That cartoon of the maze could be made into a nice little collaborative mini-game.
Not sure what that would be useful for, but it’s just something that came to mind.
Nice pattern of using an AI block inside of a collapsible section, and using it to clarify/exemplify what you mean in a slightly different voice.
Yup, that would be a healthier view for society to hold. Sadly in pretty much any field that I’m aware of, companies are allowed to push ahead until accidents happen and until it’s proven that it’s unsafe.
We’ll need to find some way of overcoming that default since we’ll only get 1 real shot at superintelligence alignment.
I see how “Local alignment != asymptotic alignment” is more accurate but I find the current title/claim easier to understand.
I could see them being a good pair, where the current title makes the claim and the local vs asymptotic stuff adds the mechanism. But the mechanism without the claim, I would fear, would fail to land for many people. Just sth to keep in mind as one datapoint if you ever do a followup post :)
Yeah, that’s tricky. My hope is that once you’re decently good at execution, you can recognize good execution, such that you can distinguish between the two.
What seems harder yet is “idea is flop and should be adjusted” vs “idea is flop and I should switch to a different idea” (a distinction which, by the way, lives on a spectrum)
The Lean Startup describes this as the “Pivot or Persevere” question. Where persevere is roughly: iterate more on the current idea.
And Pivot is roughly: iterate/change to a different idea.
There are movements that say people should quit sooner (see e.g. https://www.youtube.com/watch?v=3Xdyioqs5Ds ), and stories of startup founders who stubbornly stuck to their vision, rejection after rejection (whilst iterating).
Oh, also, getting advice from others is probably helpful on the execution front. If you’re underexecuting, people can probably recognize that, I expect. Whereas good advice on the product-market-fit side, i.e. “should I iterate on this idea or pivot to another one”, is very hard to give/get.
Awesome. We’re about to create our facilitator training course so that could be one way to contribute. There’s loads of other ways too. Can you reach out in the intro-offers-asks channel on our server? https://discord.gg/nn7HrjFZ8E
For marking/tracking AI vs human written content, I’ve been watching Every’s new “Proof Editor” with some interest. https://proofeditor.ai/ Probably worth checking out their approach for inspiration. (Might be implementing something similar for our team’s internal custom Obsidian/Notion replacement)
Hmm, sth like doing ML research or creating tooling that’s dual use without really attending to the idea that it might be dual use and without a clear theory of change. Sorry I don’t have something more specific for you here.