Nina Panickssery
And good decaf black tea is even harder to get…
Counterargument: sure, good decaf coffee exists, but it’s harder to get hold of. Because it’s less popular, the decaf beans at cafés are often less fresh or from a worse supplier. Some places don’t stock decaf coffee. So if you like the taste of good coffee, taking caffeine pills may limit the amount of good coffee you can access and drink without exceeding your desired dose.
On optimizing for intelligibility to humans (copied from Substack)
One risk of “vibe-coding” a piece of software with an LLM is that it gets you 90% of the way there, but then you’re stuck: the last 10% of bug fixes, performance improvements, or additional features is really hard to figure out, because the AI has written messy, verbose code that both of you struggle to work with. Nevertheless, delegating software engineering to AI tools is more tempting than ever. Frontier models can spit out almost-perfect complex React apps in just a minute, something that would have taken you hours in the past. And despite the risks, it’s often the right decision to prioritize speed, especially as models get smarter.
There is, of course, a middle ground between “vibe-coding” and good old-fashioned typing-every-character-yourself. You could use LLMs for smart autocomplete, occasionally asking for help with specific functions or decisions, or for small and targeted edits. But models don’t seem optimized for this use case, and it’s genuinely difficult to optimize for it: it’s one thing to build an RL environment where the goal is to write code that passes some tests or gets a high preference score. It’s another thing to build an RL environment where the model has to guide a human to do a task, write code that’s easy for humans to build on, or ensure the solution is maximally legible to a human.
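To make the contrast concrete, here is a minimal toy sketch (my own illustration, not anything from a real training setup): rewarding “passes the tests” is cheap to automate, while rewarding “a human can follow this” needs raters or a trusted proxy. The `solve` convention, the length-based legibility proxy, and the 0.5 weight below are all made-up assumptions.

```python
# Toy sketch of two reward functions for a code-generation RL environment.
# Everything here (the `solve` convention, the length-based legibility proxy,
# the 0.5 weight) is an illustrative assumption, not a real training setup.

def passes_tests(candidate_code: str, tests: list[tuple]) -> float:
    """Fraction of (input, expected_output) pairs the candidate solves."""
    namespace: dict = {}
    exec(candidate_code, namespace)  # assumes the candidate defines `solve`
    solve = namespace["solve"]
    return sum(solve(x) == y for x, y in tests) / len(tests)

def legibility_proxy(candidate_code: str) -> float:
    """Crude stand-in for a human legibility judgment (shorter = 'clearer').

    A real version would need human raters or a trusted judge model,
    which is far more expensive and noisier than running a test suite.
    """
    return 1.0 / (1.0 + len(candidate_code) / 200)

def reward_tests_only(candidate_code: str, tests: list[tuple]) -> float:
    """Easy-to-build objective: only correctness counts."""
    return passes_tests(candidate_code, tests)

def reward_with_legibility(candidate_code: str, tests: list[tuple]) -> float:
    """Harder-to-build objective: correctness plus intelligibility to humans."""
    return passes_tests(candidate_code, tests) + 0.5 * legibility_proxy(candidate_code)

if __name__ == "__main__":
    code = "def solve(x):\n    return x * 2\n"
    tests = [(1, 2), (3, 6)]
    print(reward_tests_only(code, tests))       # 1.0
    print(reward_with_legibility(code, tests))  # ~1.43
```

The point of the sketch is only that the first objective can be computed millions of times per run, while an honest version of the second cannot, which is why models end up optimized for the former.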
Will it become a more general problem that the easiest way for an AI to solve a problem is to produce a solution that humans find particularly hard to understand or work with? Some may say this is not a problem at the limit, when AIs are robustly superhuman at the task, and that until then we are merely passing through a temporary period of slop. Personally, I think this is a problem even when AIs are superhuman, because of the importance of human oversight. Optimizing for intelligibility to humans is important for robustness and safety: at least some people should be able to understand and verify AI solutions, or intervene in AI-automated systems when needed.
People talk about meditation/mindfulness practices making them more aware of physical sensations. In general, having “heightened awareness” is often associated with processing more raw sense data but in a simple way. I’d like to propose an alternative version of “heightened awareness” that results from consciously knowing more information. The idea is that the more you know, the more you notice. You spot more patterns, make more connections, see more detail and structure in the world.
Compare two guys walking through the forest. One is a classically “mindful” type: he is very aware of the smells and sounds and sensations, but the awareness is raw; it doesn’t come with a great deal of conscious thought. The second is an expert in botany and birdwatching. Every plant and bird in the forest holds interest and meaning for him. The forest smells help him predict what grows around the corner, and the sounds connect to his mental map of birds’ migratory routes.
Sometimes people imply that AI is making general knowledge obsolete, but they miss this angle—knowledge enables heightened conscious awareness of what is happening around you. The fact that you can look stuff up on Google, or ask an AI assistant, does not actually lodge that information in your brain in a way that lets you see richer structure in the world. Only actually knowing does that.
In case someone wants a more extreme version of this post: https://ninapanickssery.substack.com/p/stormicism
I’d guess that you can suffer quite severe impairment from only a small amount of physical brain damage if the damage occurs in locations important for connecting different brain areas/capabilities. Information being “not lost, just inaccessible” seems realistic to me. However, I wouldn’t base this intuition on cases of terminal lucidity.
I am not saying care and compassion are incompatible with rationality and high-quality writing.
Yes, perhaps it’s reasonable to require some standard, but personally I think there’s a place for events where that standard is at least as permissive as it is at LessOnline. This is my subjective opinion and preference, but I would not be surprised if many LessWrong readers shared it.
It’s of course reasonable to skip an event because people you don’t like will be there.
However, it’s clear that many people have the opposite preference, and wouldn’t want LessOnline attendees or invited guests to have to meet a “standard of care and compassion,” especially one set wherever you’re putting it.
LessOnline seems to be about collecting people interested in and good at rationality and high-quality writing, not about collecting people interested in care and compassion. For the latter I’d suggest one go to something like EA Global or church…
This, and several of the passages in your original post such as, “I agree such a definition of moral value would be hard to justify,” seem to imply some assumption of moral realism that I sometimes encounter as well, but have never really found convincing arguments for. I would say that the successionists you’re talking to are making a category error, and I would not much trust their understanding of ‘should’-ness outside normal day-to-day contexts.
I broadly agree.
I am indeed being a bit sloppy with the moral language in my post. What I mean to say is something like “insofar as you’re trying to describe a moral realist position with a utility function to be optimized for, it’d be hard to justify valuing your specific likeness”.
In a similar fashion, I prefer and value my family more than your family, but it’d be weird for me to say that you, too, should prefer my family to your own family.
However, I expect our interests and preferences to align when it comes to preferring that we have the right to prefer our own families, or preferring that our species exists.
(Meta: I am extremely far from an expert on moral philosophy, or philosophy in general, but I do aspire to improve how rigorously I am able to articulate my positions.)
Not a fair trade, but also present-day “Mundane Mandy” does not want to risk everything she cares about to give “Galaxy-brain Gavin” the small chance of achieving his transhumanist utopia.
There’s no reason for me to think that my personal preferences (e.g. that my descendants exist) are related to the “right thing to do”, and so there’s no reason for me to think that optimizing the world for the “right things” will fulfil my preference.
I think most people share similar preferences to me when it comes to their descendants existing, which is why I expect my sentiment to be relatable and hope to collaborate with others on preventing humanity’s end.
Why I am not a successionist
Fair, there’s a real tension between signaling that you think someone has a good mindset (a form of intellectual respect) and signaling that you are scared of someone’s power over you or that you care a lot about their opinion of you.
Perhaps one situation in which to avoid giving advice is when you think your advice is likely to be genuinely worthless because you have no expertise, knowledge, or intelligence relevant to the matter and you don’t trust your own judgment at all. Otherwise, if you respect the other person, you’d consider them able to judge the usefulness of your advice for themselves.
You can’t know for sure that they’ve heard some piece of advice before. Also, you are providing the information that the advice occurred to you, which in and of itself is often interesting/useful. So if you give someone advice they have likely heard before, there is a small chance that assumption is wrong and the advice is still useful, and a larger chance that it has zero value; in expectation, the value is still positive. If you don’t give the advice, you are prioritizing not looking stupid or not offending them, which are both selfish motives.
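As a rough illustration of the expected-value point (the symbols are mine, purely for illustration):

$$\mathbb{E}[\text{value to them}] = p \cdot v + (1 - p) \cdot 0 = p\,v > 0 \quad \text{whenever } p > 0 \text{ and } v > 0,$$

where $p$ is the (possibly small) chance the advice is new or useful to them and $v > 0$ is its value in that case. The downside risks fall mostly on the giver, not the recipient.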
Related to (2) is that telling someone you disapprove of or think less of them for something, i.e. criticizing without providing any advice at all, is also a good signal of respect, because you are providing them with possibly useful information at the risk of them liking you less or of making yourself uncomfortable.
My intuition is that:
The average person undervalues trying to find more rewarding/impactful/interesting work, making the mistake you describe in the post
The average LessWrong reader undervalues investing in family, i.e. they think too much about how to have an impactful career and too little about how to have a successful family
I agree with this, but separately, the whole notion of what one might think on one’s deathbed is overrated (I know it’s partly a metaphor, but still). You can reflect on your life so far at any time, and it’s unlikely that your clearest thoughts on the matter will come on your deathbed...
Giving unsolicited advice and criticism is a very good credible signal of respect
I have often heard it claimed that giving advice is a bad idea because most people don’t take it well and won’t actually learn from it.
Giving unsolicited advice/criticism risks:
The recipient liking you less
The recipient thinking you are stupid because “obviously they have heard this advice before”
The recipient thinking you are stupid because they disagree with the advice
The recipient being needlessly offended without any benefit
People benefit from others liking them and not thinking they are stupid, so these are real costs. Some people also don’t like offending others.
So clearly it’s only worth giving someone advice or criticism if you think at least some of the following are true:
Their wellbeing/impact/improvement is important enough that the small chance your advice has a positive impact is worth the cost
They are rational enough to not take offense in a way that would damage your relationship
They are particularly good at using advice/criticism, i.e. they are more likely to update than the average person
They value honest opinions and feedback even when they disagree, i.e. they prefer to know what others think about them because it’s interesting and potentially useful information even if not immediately actionable
The above points all describe an attitude superior to that of the average person. And so, if you choose to give someone advice or criticism despite all the associated risks, you are credibly signaling that you think they have these positive traits.
Not giving unsolicited advice and criticism is selfish
The “giving advice is bad” meme is just a version of “being sycophantic is good”: you personally benefit when others like you, so it’s often useful to suck up to people.
Even the risk that your interlocutor is offended is not a real risk to their wellbeing—people dislike offending others because it feels uncomfortable to them. Being offended is not actually meaningfully harmful to the offended party.
I think people who predict significant AI progress and automation often underestimate the extent to which human domain experts will continue to be useful for oversight, auditing, accountability, keeping things robustly on track, and setting high-level strategy.
Having “humans in the loop” will be critical for ensuring alignment and robustness, and I think people will realize this, creating demand for skilled human experts who can supervise and direct AIs.
(I may be responding to a strawman here, but my impression is that many people talk as if in the future most cognitive/white-collar work will be automated and there’ll be basically no demand for human domain experts in any technical field, for example.)
Was recently reminded of these excellent notes from Neel Nanda that I came across when first learning ML/MI. Great resource.
What do you do, out of curiosity?