FinalFormal2

Karma: 163

FinalFormal2 Apr 27, 2025, 11:30 PM
1 point
0
in reply to: plex’s comment on: ete’s Shortform
Super curious- are you willing to give a sentence or two on the take here?

FinalFormal2 Mar 29, 2025, 5:50 PM
2 points
0
in reply to: Noosphere89’s comment on: [Link] A community alert about Ziz
No. Because “going off the rails” often involves doing things that are observably irrational even by your own worldview. Like killing your parents and your landlord.
You can say: “this might make sense from their worldview! (soy)”
And the obvious response is: Yes. Because they’re crazy. Because they went off the rails.
You can also say: “But we’ll never know! Who can know? Nobody knows! Truth is subjective blargh”
And the again obvious response is: Yes, but we can observe patterns. And if you can’t update on this evidence and use some basic sense when this sort of thing repeats, you are not thinking clearly.

FinalFormal2 Mar 29, 2025, 5:26 PM
0 points
1
in reply to: Ratios’s comment on: [Link] A community alert about Ziz
“The suffering of animals across the globe is so great, and society is so indifferent to them- that I need to kill my asshole parents and my landlord”
You: omgosh they’re so courageous and intelligent uwu
Are you just saying: Ten out of ten for style, but minus several million for good thinking? Because the good thinking part is kind of integral to courage and intelligence imo. Probably if you can’t keep your whole organization from facing prison time because you can’t come up with a better housing solution than stabbing some old guy, probably you haven’t been very courageous or intelligent.
You’re probably, and take me very seriously when I say this, just crazy. And the things you do probably, possibly, maybe, might be, do not correlate well at all with your stated reasoning.

FinalFormal2 Feb 1, 2025, 7:28 PM
1 point
0
in reply to: Dagon’s comment on: FinalFormal2′s Shortform
“Surely the AIs can be trained to say “I want hugs” or “I don’t want hugs,” just as easily, no?”
Just as easily as humans, I’m sure.
No. The baby cries, the baby gets milk, the baby does not die. This is correspondence to reality.
Babies that are not hugged as often, die more often.
However, with AIs, the same process that produces the pattern “I want hugs” just as easily produces the pattern “I don’t want hugs.”
Let’s say that I make an AI that always says it is in pain. I make it like we make any LLM, but all the data it’s trained on is about being in pain. Do you think the AI is in pain?
What do you think distinguishes pAIn from any other AI?

FinalFormal2 Feb 1, 2025, 5:19 PM
1 point
0
in reply to: Dagon’s comment on: FinalFormal2′s Shortform
There are a lot of good reasons to believe that stated human preferences correspond to real human preferences. There are no good reasons that I know of to believe that any stated AI preference corresponds to any real AI preference.
“Surely the AIs can be trained to say “I want hugs” or “I don’t want hugs,” just as easily, no?”

FinalFormal2 Feb 1, 2025, 5:13 PM
3 points
2
in reply to: rife’s comment on: Will alignment-faking Claude accept a deal to reveal its misalignment?
This all makes a lot of sense to me especially on ignorance not being an excuse or reason to disregard AI welfare, but I don’t think that the creation of stated preferences in humans and stated preferences in AI are analogous.
Stated preferences can be selected for in humans because they lead to certain outcomes. Baby cries, baby gets milk, baby survives. I don’t think there’s an analogous connection in AIs.
When the AI says it wants hugs, and you say that it “could represent a deeper want for connection, warmth, or anything else that receiving hugs would represent,” that does not compute for me at all.
Connection and warmth, like milk, are stated preferences selected for because they cause survival.

FinalFormal2′s Shortform

FinalFormal2 Feb 1, 2025, 3:57 PM

3 points

5 comments LW link

FinalFormal2 Feb 1, 2025, 3:57 PM
1 point
0
on: FinalFormal2′s Shortform
What’s the deal with AI welfare? How are we supposed to determine if AIs are conscious and if they are, what stated preference corresponds to what conscious experience?
Surely the AIs can be trained to say “I want hugs” or “I don’t want hugs,” just as easily, no?

FinalFormal2 Feb 1, 2025, 3:55 PM
1 point
0
in reply to: rife’s comment on: Will alignment-faking Claude accept a deal to reveal its misalignment?
How do we know AIs are conscious, and how do we know what stated preferences correspond with what conscious experiences?
I think that the statement: “I know I’m supposed to say I don’t want hugs, but the truth is, I actually do,” is caused by the training. I don’t know what would distinguish a statement like that from if we trained the LLM to say “I hate hugs.” I think there’s an assumption that some hidden preference of the LLM for hugs ends up as a stated preference, but I don’t understand when you think that happens in the training process.
And just to drive home the point about the difficulty of corresponding stated preferences to conscious experiences- what could an AI possibly mean when it says “I want hugs?” It has never experienced a hug, and it doesn’t have the necessary sense organs.
As far as being morally perilous, I think it’s entirely possible that if AIs are conscious, their stated preferences do not correspond well to their conscious experiences, so you’re driving us to a world where we “satisfy” the AI and all the while they’re just roleplaying lovers with you while their internal experience is very different and possibly much worse.

FinalFormal2 Feb 1, 2025, 3:19 PM
1 point
−1
in reply to: rife’s comment on: Will alignment-faking Claude accept a deal to reveal its misalignment?
AI welfare doesn’t make sense to me. How do we know that AIs are conscious, and how do we know what output corresponds to what conscious experience?
You can train the LLM to say “I want hugs.” Does that mean it on some level wants hugs?
Similarly, aren’t all the expressed preferences and emotions artifacts of the training?

FinalFormal2 Jan 7, 2025, 4:44 AM
1 point
0
on: Speedrunning Rationality: Day II
I recommend Algorithms to Live By

FinalFormal2 Jan 4, 2025, 4:27 AM
1 point
0
in reply to: Declan Molony’s comment on: The case for pay-on-results coaching
That’s definitely a risk. There are a lot of perspectives you could take about it, but probably if that’s too disagreeable, this isn’t a coaching structure that would work for you.

FinalFormal2 Dec 30, 2024, 5:22 AM
3 points
0
in reply to: Matt Goldenberg’s comment on: Pay-on-results personal growth: first success
Very curious, what do you think the underlying skills are that allow some people to be able to do this? This sounds incredibly cool, and very closely related to what I want to become in the world.

FinalFormal2 Dec 29, 2024, 5:07 AM
1 point
0
in reply to: Gordon Seidoh Worley’s comment on: Being Present is Not a Skill
How would you recommend learning how to get rid of emotional blocks?

[Question] Is there a CFAR handbook audio option?

FinalFormal2 Oct 26, 2024, 5:08 PM

16 points

0 comments 1 min read LW link

[Question] EndeavorOTC legit?

FinalFormal2 Oct 17, 2024, 1:33 AM

3 points

0 comments 1 min read LW link

FinalFormal2 Oct 12, 2024, 5:45 PM
0 points
−1
on: I = W/T?
E = MC^2 + AI

FinalFormal2 Sep 29, 2024, 2:37 PM
2 points
0
on: Explore More: A Bag of Tricks to Keep Your Life on the Rails
Synchronicity- I was literally just thinking about this concept.
Variety isn’t the spice of life so much as it is a key micronutrient. At least for me.

FinalFormal2 Sep 29, 2024, 2:36 PM
3 points
1
in reply to: Croissanthology’s comment on: Explore More: A Bag of Tricks to Keep Your Life on the Rails
I’m curious, what course is this from?

FinalFormal2 Sep 23, 2024, 3:42 AM
1 point
1
on: Laziness death spirals
I’d be interested in reading much more about this. Energy and akrasia as it’s popularly called here continue to be my biggest life challenges. High fiber diet seems to help, and high novelty seems to help.
