There’s a nearby, kind-of-obvious but rarely directly addressed generalization of one of your arguments: ML learns complex functions all the time, so why should human values be any different? I rarely see this discussed, and I thought the replies from Nate and the ELK-related difficulties were important to have out in the open, so thanks a lot for including the face learning <-> human values learning analogy.
Ronny Fernandez
The Principle of Predicted Improvement
LW Philosophers versus Analytics
Are We Right about How Effective Mockery Is?
I came here to say something pretty similar to what Duncan said, but I had a different focus in mind.
It seems like it’s easier for organizations to coordinate around PR than it is for them to coordinate around honor. People can have really deep, intractable, or maybe even fundamental and faultless, disagreements about what is honorable, because what is honorable is a function of which normative principles you endorse. It’s much easier to resolve disagreements about what counts as good PR. You could probably settle most disagreements about what counts as good PR using polls.
Maybe for this reason we should expect being into PR to be a relatively stable property of organizations, while being into honor is a fragile and precious thing for an organization.
Testing the Efficacy of Disagreement Resolution Techniques (and a Proposal for Testing Double Crux)
(Subjective Bayesianism vs. Frequentism) VS. Formalism
AN APOLOGY ON BEHALF OF FOOLS, FOR THE DETAIL-ORIENTED
Misfits, hooligans, and rabble-rousers.
Provocateurs and folk who don’t wear trousers.
These are my allies and my constituents.
Weak in number yet suffused with arcane power.
I would never condone bullying in my administration.
It is true we are at times moved by unkind motivations.
But without us the pearl clutchers, hard asses, and busy bees would overrun you.
You would lose an inch of slack per generation.
Many among us appreciate your precision.
I admit there are also those who look upon it with derision.
Remember though that there are worse fates than being pranked.
You might instead have to watch your friend be “reeducated”, degraded, and spanked
On high broadband public broadcast television.
We’re not so different really.
We often share your indignation
With those who despise copulation.
Although our alliance might be uneasy
We both oppose the soul’s ablation.
So let us join as cats and dogs, paw in paw
You will persistently catalog
And we will joyously gnaw.
To be clear, I did not think we were discussing the AI optimist post. I don’t think Nate thought that. I thought we were discussing reasons I changed my mind a fair bit after talking to Quintin.
Comment on Coherence arguments do not imply goal directed behavior
For anyone who may have the executive function to go for the 1M, I propose myself as a cheap author if I get to play the dungeon master role, or the player role, but not if I have to do both. I recommend taking me for the dungeon master role. This sounds genuinely fun to me. I would happily do it for a dollar per step.
I can also help think about how to scale the operation, but I don’t think I have the executive function, management experience, or slack to pull it off myself.
I am Ronny Fernandez. You can contact me on fb.
That’s not semantics, it’s syntactics.
hehehe
(Get it? Cause that is a minor semantic issue.)
High schoolers can apply to the Atlas Fellowship: $10k scholarship + 11-day program
[Question] Asking for help teaching a critical thinking class.
I loved this, but maybe it should come with a CW.
MSF Theory: Another Explanation of Subjectively Objective Probability
Aligned Behavior is not Evidence of Alignment Past a Certain Level of Intelligence
Sometimes I sort of feel like a grumpy old man who read the Sequences back in the good old-fashioned year of 2010. When I am in that mood, I will sometimes look around at how memes spread through the community and say things like “this is not the rationality I grew up with.” I really do not want to stir things up with this post, but I guess I do want to be empathetic to this part of me, and I want to see what others think about the perspective.
One relatively small reason I feel this way is that a lot of really smart rationalists, who are my friends or who I deeply respect or both, seem to have gotten really into chakras, and maybe some other woo stuff. I want to better understand these folks. I’ll admit now that I have weird biased attitudes towards woo stuff in general, but I am going to use chakras as a specific example here.
One of the sacred values of rationality that I care a lot about is that one should not discount hypotheses/perspectives because they are low status, woo, or otherwise weird.
Another is that one’s beliefs should pay rent.
To be clear, I am worried that we might be failing on the second sacred value. I am not saying that we should abandon the first one as I think some people may have suggested in the past. I actually think that rationalists getting into chakras is strong evidence that we are doing great on the first sacred value.
Maybe we are not failing on the second sacred value. I want to know whether we are or not, so I want to ask rationalists who think a lot or talk enthusiastically about chakras a question:
Do chakras exist?
If you answer “yes”, how do you know they exist?
I’ve thought a bit about how someone might answer the second question, if they answer “yes” to the first, without violating the second sacred value. I’ve thought of basically two ways that seem possible, but there are probably others.
One way might be that you just think that chakras literally exist in the same way that planes literally exist, or in the way that waves literally exist. Chakras are just some phenomena that are made out of some stuff like everything else. If that is the case, then it seems like we should, at least in principle, be able to point to some sort of test that we could run to convince me that they do exist, or you that they do not. I would definitely be interested in hearing proposals for such tests!
Another way might be that you think chakras do not literally exist like planes do, but you can make a predictive profit by pretending that they do exist.

This is sort of like how I do not expect that if I could read and understand the source code for a human mind, there would be some parts of the code that I could point to and call the utility and probability functions. Nonetheless, I think it makes sense to model humans as optimization processes with some utility function and some probability function, because modeling them that way allows me to compress my predictions about their future behavior. Of course, I would get better predictions if I could model them as mechanical objects, but doing so is just too computationally expensive for me.

Maybe modeling people, including yourself, as having chakras works sort of the same way. You use some of your evidence to infer the state of their chakras, and then use that model to make testable predictions about their future behavior. In other words, you might think that chakras are real patterns. Again, it seems to me that in this case we should at least in principle be able to come up with tests that would convince me that chakras exist, or you that they do not, and I would love to hear any such proposals.
Maybe you think they exist in some other sense, and then I would definitely like to hear about that.
Maybe you do not think they exist in any way, or make any predictions of any kind, and in that case, I guess I am not sure how continuing to be enthusiastic about thinking about chakras or talking about chakras is supposed to jibe with the sacred principle that one’s beliefs should pay rent.
I guess it’s worth mentioning that I do not feel as averse to Duncan’s color wheel thing, maybe because it’s not coded as “woo” to my mind. But I still think it would be fair to ask exactly how we think that taxonomy cuts the universe at its joints. Asking that question still seems to me like it should reduce to figuring out what sorts of predictions to make if it in fact does, and then figuring out ways to test them.
I would really love to have several cooperative conversations about this with people who are excited about chakras, or other similar woo things, either within this framework of finding out what sorts of tests we could run to get rid of our uncertainty, or questioning the framework I propose altogether.
My faith in the expertise of physicists like Richard Feynman, for instance, permits me to endorse—and, if it comes to it, bet heavily on the truth of—a proposition that I don’t understand. So far, my faith is not unlike religious faith, but I am not in the slightest bit motivated to go to my death rather than recant the formulas of physics. Watch: E doesn’t equal mc2, it doesn’t, it doesn’t!
--Dan Dennett, Breaking the Spell
The shoggoth is supposed to be of a different type than the characters. The shoggoth, for instance, does not speak English; it only knows tokens. There could be a shoggoth character, but it would not be the real shoggoth. The shoggoth is the thing that gets low loss on the task of predicting the next token. The characters are patterns that emerge in the history of that behavior.