ZY

Karma: 89

I try to practice independent reasoning/critical thinking, to challenge current solutions to be more considerate/complete. I value receiving and giving dissent. I do not reply to DMs for non-personal (with respect to the user who reached out directly) discussions, and will post here instead with reference to the user and my reply.

ZY 26 Sep 2025 18:28 UTC
1 point
0
in reply to: Zach Stein-Perlman’s comment on: Zach Stein-Perlman’s Shortform
The basic approach is: do evals; find weaker capabilities than other open-weights models; infer that it’s safe to release weights.
Curious: what made you think this is new to Code World Model compared to other Meta releases?

ZY 13 Aug 2025 21:50 UTC
1 point
0
in reply to: Kyle O’Brien’s comment on: Kyle O’Brien’s Shortform
Thanks for sharing this paper; it also reminded me of another paper, A Pretrainer’s Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity (https://arxiv.org/pdf/2305.13169), and its section on toxicity filtering (threshold, classifier vs generation trade-off).

ZY 2 Aug 2025 1:39 UTC
1 point
0
in reply to: ZY’s comment on: My Empathy Is Rarely Kind
I was just reminded of a story I saw online that is related to this, and wanted to share it since it was positive and reflective. The story’s OP shared an experience of walking behind a family; they encountered a homeless person, and the father turned to the kid and started with something like “study well and after you grow up…”; the OP thought maybe the father wanted to say “don’t end up like the homeless”, which is what the poster’s father used to say to them. Instead the father said “help these people to be in better situations”. The OP found it beautiful, and I found it beautiful too. It seems the two fathers both “understood” the pain of being homeless, but had different understandings of the “how”, and decided to act differently based on that understanding.

ZY 31 Jul 2025 18:41 UTC
1 point
0
on: I am worried about near-term non-LLM AI developments
I do agree that we should focus not just on LLMs (base LLMs or agents), but also on other architectures.

ZY 30 Jul 2025 15:28 UTC
3 points
0
on: My Empathy Is Rarely Kind
I would agree the mindset of “I can fix things if I were you” could prevent “empathy”. (I was also reading other comments mentioning this is not true empathy but simulation, and I found that insightful too.) The key problem is whether you can tell if this is something they are able to fix, which part of it is attributable to what they can do, and which part is attributable to lack of privilege. For example, a blind person cannot really type easily without special equipment, and they or their family may not have the money to buy that equipment. The parents may not have been able to get a college degree without some form of generational wealth. The same is true for intelligence level.
Even a growth mindset is something that is developed through our education, the environment we grew up in, experience, or even something like visa status. This is probably where empathy starts to develop further.

ZY 30 Jul 2025 4:06 UTC
1 point
0
in reply to: Seth Herd’s comment on: DresdenHeart’s Shortform
I would recommend checking out the link I posted from the EA forum to see why AI X-risk may never reach some populations, because they die before then; my proposal to work on both precisely avoids caring only for subsets.

ZY 29 Jul 2025 23:35 UTC
1 point
−8
in reply to: habryka’s comment on: DresdenHeart’s Shortform
It would make sense in capability cases. But unfortunately, in a lot of life-saving cases, all are important (this gets a bit more into other things, so let me focus on only the following two points for now). 1. Many causes are not actually comparable in a general cause prioritization context (first, people may have inherent personal biases based on their experiences and worlds; second, it is hard to value 10 kids’ lives in the US vs 10 kids’ lives in Canada, for example), and 2. Time is critical when thinking of lives. You can think of this like emergency rooms.
https://forum.effectivealtruism.org/posts/s3N8PjvBYrwWAk9ds/a-perspective-on-the-danger-hypocrisy-in-prioritizing-one
The link above illustrates an example of when time is important.

ZY 29 Jul 2025 23:04 UTC
0 points
−1
in reply to: james oofou’s comment on: DresdenHeart’s Shortform
Sharing a different perspective on why current risks are also important: https://forum.effectivealtruism.org/posts/s3N8PjvBYrwWAk9ds/a-perspective-on-the-danger-hypocrisy-in-prioritizing-one, given that I also believe long-term risks (which to me mostly map to agent safety at this point in time) are important too.
(Edit and gentle call out, generally based on my observation: while a downvote on agreement is perfectly reasonable if one disagrees, a downvote on overall karma when disagreeing seems to be inconsistent with LessWrong’s values and what it stands for; suppressing professional dissent might be dangerous.)

ZY 29 Jul 2025 23:02 UTC
−1 points
−43
in reply to: DresdenHeart’s comment on: DresdenHeart’s Shortform
saying that people shouldn’t be concerned with existential risk in the future as communities today are being affected—and that I should not have done this research
Sorry to hear this. As someone who works on societal harms of AI, I would disagree with the view in the quote. My disagreement is common in my circle, and the view in the quote is uncommon. It is interesting/I can empathize because I usually hear this the other way around (AI X-risk people telling others that societal harms should not be considered).
But I also believe there should be no claim that [one is “bigger” than the other] in the situation of saving lives (edited to clarify). (This might be an unpopular opinion, and it contradicts cause prioritization, which I am personally not a believer in when working with causes that are related to saving people.) On societal harms, for example, PII, deep fakes, child sexual exploitation, self harm, subtle bias and toxicity are real. Societal harms and long-term risks (which map for me to agent safety) are both important, and both need resources to work on. This view is again common in my circle. Furthermore, many research methods, the safety-focused mindset, and policy are actually shared (maybe more than people think) between the two camps.^[1]
1. ^
  (Edit and gentle call out, generally based on my observation: while a downvote on agreement is perfectly reasonable if one disagrees, a downvote on overall karma when disagreeing seems to be inconsistent with LessWrong’s values and what it stands for; suppressing professional dissent and different opinions might be dangerous.)

ZY 25 Jul 2025 22:00 UTC
9 points
2
in reply to: Gordon Seidoh Worley’s comment on: Women Want Safety, Men Want Respect
I unfortunately had the same feeling: that you had the concept of respect wrong and lacked an understanding of the underlying social aspect of risk-averse vs risk-taking in this post, and I felt I didn’t have enough time yet to educate. Sorry for the bluntness.
As a woman, I can only say that we want basic respect (which any decent human should get), impact, power, authority, influence, winning, fights, adventures, and becoming better versions of ourselves as much as non-women do; but if you, as a non-woman, strongly believe that’s not true, I am not sure how much better I could approach the problem/dispute. (As you hinted in the post, you might be overestimating how much you know about the population you are making a claim about.)

ZY 25 Jul 2025 20:09 UTC
1 point
0
in reply to: Gordon Seidoh Worley’s comment on: Women Want Safety, Men Want Respect
Yeah, I have a lot (that’s why I didn’t really know where to start in the first place).
Now sometimes there are direct tradeoffs, like @Cole Wyeth mentioned in this comment about UFC fights. In fact, men frequently take on dangerous jobs because it earns them respect. Big game hunting was perhaps the original dangerous job for humans, as was war. In modern times men do things like risk their financial safety for a shot at high variance gains
I have replied to this from the DM with Cole, and posted below. See the comment here: https://www.lesswrong.com/posts/9jhrWnxYkoZPxMZMj/women-want-safety-men-want-respect?commentId=TvjJd2gKfewbR6v8a
men frequently take on dangerous jobs because it earns them respect
Besides disagreeing with the definition of respect, I also disagree even if that word is replaced by “validation”. Whether people choose to do something according to social approval or not is independent of gender. The problem is what society considers socially approvable behavior for certain groups. If society approved of men being more prudent in taking risks, are you saying men would still “want” to take risks? In other words, the point I am getting at is: if the socially approvable behavior for women is to take fewer risks, and for men it is to take more risks, then a woman and a man who choose to conform to these expectations are both asking for validation.
They won’t get lots of respect as an individual as a nurse, paralegal, etc
To clarify: what do you mean by a nurse or paralegal being respected as an individual? You seem to assume that more risk means more “respect”, which circles back to my disagreement on the definition.
All people want to thrive, if they are able to. For example, RBG needed to fight a lot of cases where schools did not admit women, or West Point did not admit women. “Want” is way too strong a word. I would suggest something more along the lines of “historically women may have been forced to pursue more risk-averse options for reasons xxxx during time period A to Z, and this needs to change”, if “women ending up in more risk-averse options” is true for some time A to Z.
(I have a bit more to say as well but will probably come back to this later).

ZY 25 Jul 2025 19:58 UTC
1 point
0
in reply to: Cole Wyeth’s comment on: Women Want Safety, Men Want Respect
Here is the answer that I sent in DM since I wasn’t able to post, along with the full history of the DM conversation, shared with Cole’s consent, for the audience of this post and its comments. @Gordon Seidoh Worley I just saw your reply too! I will reply to yours once I get a second as well, but some of the answers can also be found below. The first part is my immediate response to the example Cole provided.
Seems I cannot post the reply for 20 hours, but here is the reply, and I will post it to the comment in 20h.
It is fine if you disagree.
I have three criticisms of the example.
First, I do not consider this “respect” seeking, but “validation”/winning seeking. (See my long reply for the difference.) I also don’t consider this a matter of safety, but that gets more involved into physical vs mental.
Second, with the definition of winning/validation, this seems to be an out-of-context example for this post, with a weak connection to general life. The context is a fighting game. Participation in the game signals a prior willingness to take risks.
ZY 19h
Two criticisms* (grouped one of them together)
Cole Wyeth 18h
I think you’re mainly trying to dispute the way that the word respect is used. Yes, a basic level of respect is somewhat more about boundaries, and I would even say that at this level it is pretty closely connected to safety because it implies something like moral or legal or just agreed upon basic rights. But it seems to me that there can be greater or lesser degrees of respect, and the greater degrees look a lot like admiration / reputation for competence. Also, my reading of the post is that it is this higher degree of respect which the OP intended to talk about—or rather, the entire axis, including this higher degree. I’m not very interested in further debating the word choice.

And no, I don’t think the MMA example is too special. You could also consider mountain climbing or for many people military service. Really, any profession or pursuit that puts the self at significant physical risk (or for that matter, intense mental strain) seems to be disproportionately chosen among men. The major exception is probably pregnancy / childbirth, which is interesting (maybe that’s risk enough for many women!) but obviously this is also a risk that men cannot choose to take.
ZY 18h
I don’t see mountain climbing being disproportionately “chosen” by men. Nor military service. West Point only started to even admit women in the 1970s, not because of ability or lack of demand, but more because of discrimination, either policy-wise or mentally enforced. https://warontherocks.com/2020/11/justice-ruth-bader-ginsburg-and-the-u-s-military/

There are many cases like this.
I also want to dispute the definition of “want”. A lot of what people end up with is not what they “want” generally. Environmental factors, either explicitly or implicitly, restrict whether people can get what they want.
Cole Wyeth 18h
I’m sorry, but I frankly can’t take seriously your assertion that women would be equally keen to enroll in the military if allowed. That is very clearly untrue—which is why, as far as I’m aware, there has never been any society in history with an equal number of men and women in the military. It seems very hard to believe that every society has conspired to prevent women from entering armed service. Can you produce any examples of nations which allowed women to serve, and then actually saw women enroll at similar rates to men?
Cole Wyeth 17h
As for mountain climbing, see this list of people who have climbed Everest multiple times for a basic sanity check: https://en.m.wikipedia.org/wiki/List_of_Mount_Everest_summiters_by_frequency
Cole Wyeth 17h
As you can see, the vast majority are men.
ZY 17h
Society needs to adapt to the restrictions that were historically put on women, both explicit laws and discrimination/stereotypes.
Additionally, your comments do not address the want vs consequence part. Similarly, there are fewer female CEOs, and women have less pay, but not because they do not want these things.
Cole Wyeth 6h
The idea that all of these differences are because of societal restrictions is just an assertion you are making. You attribute every difference across genders to “society.” Why do these differences then persist across all societies, across all times? That seems to beg for an explanation, and without providing one, you are only speculating, and perhaps choosing the explanation that seems most ideal to you.
Concretely, it also seems very unlikely in the case of Everest. Women (from, say, America) have about the same ability to climb (or at least, to attempt) Everest as men do, but we see much lower numbers. If this isn’t convincing to you, feel free to look into the number of attempts by sex (do you want to bet which way it will come out? I think we both already have the same guess). So there is no restriction here, but women still chose the dangerous activity less often. Do you really believe that somehow, the restrictions that were once placed on women 100 years ago are still preventing them from climbing Everest because they haven’t “adapted?” This does not seem to make any sense to me; I do not believe that.

But, even if we were to accept for a moment that women take less risks because they haven’t “adapted” to restrictions being removed—fine, that means they choose to take less risks, so the OP’s point stands. You’re simply asserting a different (and in my opinion, much less plausible) explanation.
ZY 5h
You could look up the history of example countries to see why it persists at all times, and maybe some news.
There is less free will and more influence from the environment, and maybe some psychology books would help. Women, like all people, want power and are risk-taking if their environment allows them to be; when the environment does not, and is stereotype-enforcing, that’s when people are frustrated, have protests, and push for legal reforms.
ZY 5h
And by power I don’t mean power in an abusive way, but to win, to take risks, to achieve more, and to influence.
Cole Wyeth 2h
You’re just saying things without any supporting evidence or arguments.
I’m not going to continue this conversation.
ZY 1h
The evidence is probably a few history/sociology classes, and also anecdotes that I am not comfortable sharing yet (sorry; anecdotes to me create more empathy than statistical evidence anyway, and may be biased, which started the mess in the first place, though they might be good counterexamples to balance things out). I welcome you to study more of these in the future.
For Everest: https://www.markhorrell.com/blog/2020/10-facts-about-everest-success-and-death-rates-based-on-scientific-data/ has a lot of interesting facts, and I would encourage you to read it if you are interested. “Success”, “respect”, and “risk-taking” are all different words, and consequence does not imply intention.
Finally, as I mentioned yesterday, I would be posting my reply in this dm to your comment yesterday—are you comfortable with me posting our entire history including your turns as well (I wrote this on my page but I reached out first in dm)? If not I will remove your turns in this DM and only post mine.
Cole Wyeth 20m
Sure, you can copy paste everything, but regardless I am not going to engage on this topic further.
ZY 1m
Yeah, I understand; this is for the audience of the comments/post.
What links here?
- ZY's comment on Women Want Safety, Men Want Respect by Gordon Seidoh Worley (25 Jul 2025 20:09 UTC; 1 point)

ZY 24 Jul 2025 22:16 UTC
1 point
−4
in reply to: Cole Wyeth’s comment on: Women Want Safety, Men Want Respect
Yeah, and my point is that these two cannot be prioritized against each other, because (roughly) “they usually are not or cannot be constrained by the same resources/are not in the same resource pool”.

ZY 24 Jul 2025 21:44 UTC
1 point
0
in reply to: Cole Wyeth’s comment on: Women Want Safety, Men Want Respect
In the context of limited resources, they are contradictory with respect to that resource. Usually it is only with limited resources that you need a preference/prioritization, or need to point out that preference/prioritization and the differences between prioritizations (in this case women vs men).

ZY 24 Jul 2025 20:38 UTC
3 points
0
in reply to: Gordon Seidoh Worley’s comment on: Women Want Safety, Men Want Respect
Thanks for the answer! I appreciate that you recognize these as stereotypes. I also may have unconsciously believed that being more general is less rude than being more specific, but it creates more confusion. I have a couple of levels of comments (if I accidentally posted this before finishing, it could be an error, as I was in and out between meetings).
That is, men are more willing than women to trade off safety to earn respect and women are more willing than men to trade off respect to increase safety.
First of all, what I take as “safety” and “respect” do not belong to the same dimension. Respect is something everyone wants and needs, and it is independent of how much “safety” a person wants. Any human without respect does not feel human. From your context, I would think that by “respect” you probably meant something else, such as “validation” or “admiration” or even “authority”/“power”. The safety-respect trade-off seems best described along the lines of “risk-averse” vs “risk-taking”.
To wit, women can only reproduce if they are alive.
...it helps to understand that women don’t share their same strong drive for respect.
On the first quote above: they can also kill themselves/the baby if they do not want to reproduce but were forced to. On the second quote above: it is a wrong claim to me, as women share the same strong drive for “respect”.
Following from the previous paragraphs, with the new definition of “risk-averse” vs “risk-taking”: are there correlations with gender? That I am not sure about based on my experience, and I would need to do some additional research and experiments. But what I know as a fact is that there is no causal relationship. Therefore, I generally have problems with claims that could imply causal relationships. They reinforce bias and stereotypes about not only the opposite sex but also your own sex/yourself. It is almost like how people enjoy constellation readings sometimes: the more you read something about “your group and their associated traits”, the more you unconsciously try to fit that description, and the more you feel the descriptions may be accurate. The causal factor here is how this person was raised. This could also be quite situational, as opposed to a fixed preference.
I mostly offer an evolutionary explanation (which is not strictly the same thing as a biological one) because I think it’s the source of the difference.
I agree that an evolutionary explanation is different from a biological explanation, but the source of that evolution still seems to be biological. And I personally believe that we as humans have developed far beyond our animal instincts, so evolutionary/biological arguments might not be a good main explanation, though they could be a side one, conditional on being explained properly (followed up with how things have changed/humans have developed).
Further, it seems you are extrapolating this to the dating context, which would be better separated out. Dating preferences are also quite different among people, at least those who have seriously explored themselves (as opposed to following what society imposes on them).
It’s not unfair to think that I’m reasoning about stereotypes here because stereotypes often reflect common behavioral patterns. I didn’t give must justifications because to me these patterns seem obvious.
Reasoning about stereotypes, and believing in certain stereotypes over others, could say something about the person who holds the stereotypes, not necessarily about the stereotyped group themselves (opinions vs truth). Overall, this also echoes the point about avoiding simple/surface-level claims.
Another layer is “want” vs “being forced to”. Historically, for example, women were not able to participate in lots of activities, and were educated not to take risks (though times have already changed with lots of efforts and reforms, which I hope are not going to be gone in a second given recent events). These are explicit and implicit restrictions that societies have put on certain populations.
Finally, I have a ton of anecdotes to share on all points written above that may help understanding, but I am still debating whether I should.

ZY 24 Jul 2025 20:23 UTC
−1 points
0
in reply to: Cole Wyeth’s comment on: Women Want Safety, Men Want Respect
That is, men are more willing than women to trade off safety to earn respect and women are more willing than men to trade off respect to increase safety.
This quote in the post shows the implication. There were frankly also a lot of things to point out, but I should and will post a longer reply to the OP.

ZY 24 Jul 2025 16:14 UTC
0 points
1
in reply to: Gordon Seidoh Worley’s comment on: Women Want Safety, Men Want Respect
Yes, I read the post (I should have clarified that), but was not convinced by the detailed claims, and how they support the overall claim/was not able to find my answers.

ZY 24 Jul 2025 15:27 UTC
5 points
1
on: Women Want Safety, Men Want Respect
It is still not clear to me from this post how “safety” and “respect” are contradictory (edit to clarify: have trade-offs). Additionally, these seem to be biased by certain experiences of the author, and most claims are still clearly stereotypical to me (sorry), given non-trivial counterexamples in my experience (for each explanation in the post). I also think people’s needs for certain things are mostly reactive to their experience/environment.^[1]
1. ^
  As opposed to something biological.

ZY 22 Jul 2025 17:13 UTC
1 point
0
in reply to: Gordon Seidoh Worley’s comment on: Change My View: AI is Conscious
Non-LLM AI do sometimes meet my definition for consciousness.
Curious to hear a few examples for this? Would something like AlphaGo meet the definition?

ZY 18 Jul 2025 23:15 UTC
1 point
0
in reply to: Rauno Arike’s comment on: Rauno’s Shortform
From my experience, there are usually sections on safety in model release papers (at least for the earlier papers historically).
- PaLM: https://arxiv.org/pdf/2204.02311
- Llama 2: https://arxiv.org/pdf/2307.09288
- Llama 3: https://arxiv.org/pdf/2407.21783
- GPT 4: https://cdn.openai.com/papers/gpt-4-system-card.pdf
Labs also release some general papers, such as (just examples):
- https://arxiv.org/pdf/2202.07646, https://openreview.net/pdf?id=vjel3nWP2a etc (Nicholas Carlini has a lot of papers related to memorization and extractability)
- https://arxiv.org/pdf/2311.18140
- https://arxiv.org/pdf/2507.02735
- https://arxiv.org/pdf/2404.10989v1
If we see less content in these sections, one possibility is increased legal regulation that may make publication tricky (imagine an extreme case: companies sincerely intend to produce some numbers that are not yet an indication of harm, but these preliminary or signal numbers could be “abused” in a legal dispute, maybe for profitability reasons/“lawyers need to win cases” reasons). It would then come down to the time cost of paper writing plus the time cost of removing sensitive information. Interestingly, the political landscape could also steer companies away from being more safety focused. I do hope there could be a better way to resolve this, providing more incentives for companies to report and share mitigations and measurements.
As far as I know, safety tests usually are used for internal decision making at least for releases etc.