Nina Panickssery

Karma: 3,252

https://ninapanickssery.com/

Views purely my own unless clearly stated otherwise

Nina Panickssery 12 Dec 2025 3:31 UTC
5 points
4
on: Does dissolving newcomb’s paradox matter?
I was actually thinking to make a follow-up post like this. I basically agree.
Let’s talk about two kinds of choice:
1. choice in the moment
2. choice of what kind of agent to be
I think this is the main insight—depending on what you consider the goal of decision theory, you’re thinking about either (1) or (2) and they lead to conflicting conclusions. My implicit claim in the linked post is that when describing thought experiments like Newcomb’s Problem, or discussing decision theory in general, people appear to be referring to (1), at least in classical decision theory circles. But on LessWrong people often switch to discussing (2) in a confusing way.
the core problem in decision theory is reconciling these various cases and finding a theory which works generally
I don’t think this is a core problem because in this case it doesn’t make sense to look for a single theory that does best at two different goals.
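To make the two goals concrete, here is a minimal sketch of Newcomb's Problem. It assumes the standard payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one) and a predictor with a fixed accuracy; those numbers and all function names are illustrative assumptions, not something stated in the comment above.

```python
# Standard Newcomb payoffs (an assumption; the comment does not state them).
BIG = 1_000_000   # opaque box, filled only if one-boxing was predicted
SMALL = 1_000     # transparent box, always present
ACTIONS = ["one-box", "two-box"]


def payoff(action: str, predicted: str) -> int:
    """Payout for taking `action` when the predictor forecast `predicted`."""
    opaque = BIG if predicted == "one-box" else 0
    return opaque + (SMALL if action == "two-box" else 0)


def best_in_the_moment(predicted: str) -> str:
    """Choice (1): the box contents are already fixed; pick the better action."""
    return max(ACTIONS, key=lambda action: payoff(action, predicted))


def best_agent_type(accuracy: float) -> str:
    """Choice (2): pick a disposition that the predictor forecasts
    correctly with probability `accuracy`."""
    def expected(disposition: str) -> float:
        other = "two-box" if disposition == "one-box" else "one-box"
        return (accuracy * payoff(disposition, disposition)
                + (1 - accuracy) * payoff(disposition, other))
    return max(ACTIONS, key=expected)
```

Under these assumptions, `best_in_the_moment` returns "two-box" whatever was predicted, while `best_agent_type(0.99)` returns "one-box". The two questions are optimized separately and give conflicting answers, which is the conflict described above.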

Nina Panickssery 16 Nov 2025 18:21 UTC
2 points
0
in reply to: Ryan Kidd’s comment on: AI safety undervalues founders
I think those other types of startups also benefit from expertise and deep understanding of the relevant topics (for example, for advocacy, what are you advocating for and why, how well do you understand the surrounding arguments and thinking...). You don’t want someone who doesn’t understand the “field” working on “field-building”.

Nina Panickssery 16 Nov 2025 17:57 UTC
2 points
−1
in reply to: Ryan Kidd’s comment on: AI safety undervalues founders
My bad, I read you as disagreeing with Neel’s point that it’s good to gain experience in the field or otherwise become very competent at the type of thing your org is tackling before founding an AI safety org.
That is, I read “I think that founding, like research, is best learned by doing” as “go straight into founding and learn as you go along”.

Nina Panickssery 16 Nov 2025 17:54 UTC
11 points
14
in reply to: Ryan Kidd’s comment on: AI safety undervalues founders
I naively expect the process of startup ideation and experimentation, aided by VC money
It’s very difficult to come up with AI safety startup ideas that are VC-fundable. This seems like a recipe for coming up with nice-sounding but ultimately useless ideas, or wasting a lot of effort on stuff that looks good to VCs but doesn’t advance AI safety in any way.

Nina Panickssery 16 Nov 2025 17:52 UTC
2 points
0
in reply to: Ryan Kidd’s comment on: AI safety undervalues founders
I disagree with this frame. Founders should deeply understand the area they are founding an organization to deal with. It’s not enough to be “good at founding”.

Nina Panickssery 15 Nov 2025 18:07 UTC
11 points
5
in reply to: johnswentworth’s comment on: Don’t use the phrase “human values”
This makes sense as a strategic choice, and thank you for explaining it clearly, but I think it’s bad for discussion norms because readers won’t automatically understand your intent as you’ve explained it here. Would it work to substitute the term “alignment target” or “developer’s goal”?

Nina Panickssery 15 Nov 2025 18:02 UTC
5 points
1
in reply to: quetzal_rainbow’s comment on: Don’t use the phrase “human values”
When I say “human values” without reference I mean “type of things that human-like mind can want and their extrapolations”
This is a reasonable concept, but should have a different handle from “human values”. (Because it makes common phrases like “we should optimize for human values” nonsensical. For example, human-like minds can want chocolate cake but that tells us nothing about the relative importance of chocolate cake and avoiding disease, which is relevant for decision making.)

Nina Panickssery 15 Nov 2025 17:39 UTC
2 points
0
in reply to: Vladimir_Nesov’s comment on: Don’t use the phrase “human values”
What “human values” gesture at is distinction from values-in-general, while “preferences” might be about arbitrary values.
I don’t understand what this means.
Taking current wishes/wants/beliefs as the meaning of “preferences” or “values” (denying further development of values/preferences as part of the concept) is similarly misleading as taking “moral goodness” as meaning anything in particular that’s currently legible, because the things that are currently legible are not where potential development of values/preferences would end up in the limit.
Is your point here that “values” and “preferences” are based on what you would decide to prefer after some amount of thinking/reflection? If yes, my point is that this should be stated explicitly in discussions, for example like “here I am discussing the preferences you, the reader, would have, after thinking for many hours.”
If you want to additionally claim that these preferences are tied to moral obligation, this should also be stated explicitly.

Don’t use the phrase “human values”

Nina Panickssery 15 Nov 2025 16:49 UTC

50 points

10 comments · 1 min read · LW link

Nina Panickssery 15 Nov 2025 6:43 UTC
2 points
0
in reply to: TsviBT’s comment on: The problem of graceful deference
Yeah that’s fair. I didn’t follow the “In other words” sentence (it doesn’t seem to be restating the rest of the comment in other words, but rather making a whole new (flawed) point).

Nina Panickssery 15 Nov 2025 6:29 UTC
8 points
0
on: “But You’d Like To Feel Companionate Love, Right? … Right?”
Has this train of thought caused you to update away from “Human Values” as a useful construct?

Nina Panickssery 14 Nov 2025 18:59 UTC
15 points
8
in reply to: TsviBT’s comment on: The problem of graceful deference
I was curious so I read this comment thread, and am genuinely confused why Tsvi is so annoyed by the interaction (maybe I am being dumb and missing something). My interpretation of Wei Dai’s point is the following:
- Tsvi is saying something like:
  1. People have a tendency to defer too much (though deferring sometimes is necessary). They should consider deferring less and thinking for themselves more.
  2. When one does defer, it’s good to be explicit about that fact, both to oneself and others.
- As an example to illustrate his point, Tsvi mentions a case where he deferred to Yudkowsky. This is used as an example because Yudkowsky is considered a particularly good thinker on the topic Tsvi (and many others) deferred on, but nevertheless there was too much deference.
- Wei Dai points out that he thinks the example is misleading, because to him it looks more like being wrong about who it’s worth deferring to, rather than deferring too much. The more general version of his point is “You, Tsvi, are noticing problems that occur from people deferring. However, I think these problems may be at least partially due to them deferring to the wrong people, rather than deferring at all.”
(If this is indeed the point Wei Dai is making, I happen to think Tsvi is more correct, but I don’t think WD’s contribution is meaningless or in bad faith.)

Nina Panickssery 14 Nov 2025 16:05 UTC
12 points
0
in reply to: johnswentworth’s comment on: How I Learned That I Don’t Feel Companionate Love
That’s a decision whose emotional motivation is usually mainly oxytocin IIUC.
I strongly doubt this, especially in men. I suspect it plays a role in promoting attachment to already-born kids but not in deciding to have them.
Oxytocin is one huge value-component which drives people to sink a large fraction of their attention and resources into local things which don’t pay off in anything much greater. It’s an easier alternative outlet to ambition. People can feel basically-satisfied with their mediocre performance in life so long as they feel that loving connection with the people around them, so they’re not very driven to move beyond mediocrity.
I know you are posting on LW which is a skewed audience, but most people are mediocre at most things and are unlikely to achieve great feats according to you, even with more ambition. Having a happy family is quite a reasonable ambition for most people. In fact, it is one of the few things an everyday guy can do that “pays off in anything much greater” (i.e. the potential for a long generational line and family legacy).
(Also consider that stereotypically, women are the ones who spend the most effort on domestic and child-related matters, and are also less likely to be on the far right of bell curves.)

Notes on the book “Talent”

Nina Panickssery 14 Nov 2025 5:43 UTC

24 points

1 comment · 15 min read · LW link

(blog.ninapanickssery.com)

Nina Panickssery 13 Nov 2025 15:39 UTC
21 points
16
in reply to: Wei Dai’s comment on: Please, Don’t Roll Your Own Metaethics
At risk of committing a Bulverism, I’ve noticed a tendency for people to see ethical bullet-biting as epistemically virtuous, like a demonstration of how rational/unswayed by emotion you are (biasing them to overconfidently bullet-bite). However, this makes less sense in ethics where intuitions like repugnance are a large proportion of what everything is based on in the first place.

Favorite quotes from “High Output Management”

Nina Panickssery 13 Nov 2025 5:47 UTC

71 points

4 comments · 5 min read · LW link

Nina Panickssery 13 Nov 2025 5:28 UTC
2 points
0
in reply to: habryka’s comment on: Do not hand off what you cannot pick up
Maybe I will make a (somewhat lazy) LessWrong post with my favorite quotes
Edit: I did it: https://www.lesswrong.com/posts/jAH4dYhbw3CkpoHz5/favorite-quotes-from-high-output-management

Nina Panickssery 13 Nov 2025 4:05 UTC
10 points
4
on: Do not hand off what you cannot pick up
Nice principle.
Reminds me of the following quote from the classic management book High Output Management:
Given a choice, should you delegate activities that are familiar to you or those that aren’t? Before answering, consider the following principle: delegation without follow-through is abdication. You can never wash your hands of a task. Even after you delegate it, you are still responsible for its accomplishment, and monitoring the delegated task is the only practical way for you to ensure a result. Monitoring is not meddling, but means checking to make sure an activity is proceeding in line with expectations. Because it is easier to monitor something with which you are familiar, if you have a choice you should delegate those activities you know best. But recall the pencil experiment and understand before the fact that this will very likely go against your emotional grain.

Nina Panickssery 13 Nov 2025 1:18 UTC
5 points
2
in reply to: Dweomite’s comment on: Human Values ≠ Goodness
A common use of “Human Values” is in sentences like “we should align AI with Human Values” or “it would be good to maximize Human Values upon reflection”, i.e. normative claims about how Human Values are good and should be achieved. However, if you’re not a moral realist, there’s no (or very little) reason to believe that humans, even if they reflect for a long time etc., will arrive at the same values. Most of the time if someone says “Human Values” they don’t mean to include the values of Hitler or a serial killer. This makes the term confusing, because it can be used both descriptively and normatively, and the normative use is common enough to make it confusing when used as a purely descriptive term.
I agree that if you’re a moral realist, it’s useful to have a term for “preferences shared amongst most humans” as distinct from Goodness, but Human Values is a bad choice because:
1. It implies preferences are more consistent amongst humans than they really are
2. The use of “Human Values” has been too polluted by others using it in a normative sense

Nina Panickssery 12 Nov 2025 3:47 UTC
6 points
4
in reply to: habryka’s comment on: GradientDissenter’s Shortform
I really appreciate your clear-headedness at recognizing these phenomena even in people “on the same team”, i.e. people very concerned about and interested in preventing AI X-Risk.
However, I suspect that you also underrate the amount of self-deception going on here. It’s much easier to convince others if you convince yourself first. I think people in the AI Safety community self-deceive in various ways, for example by choosing to not fully think through how their beliefs are justified (e.g. not acknowledging the extent to which they are based on deference—Tsvi writes about this in his recent post rather well).
There are of course people who explicitly, consciously, plan to deceive, thinking things like “it’s very important to convince people that AI Safety/policy X is important, and so we should use the most effective messaging techniques possible, even if they use false or misleading claims.” However, I think there’s a larger set of people who, as they realize claims A B C are useful for consequentialist reasons, internally start questioning A B C less, and become biased to believe A B C themselves.
