I’m Jérémy Perret. Based in France. PhD in AI (NLP). AI Safety & EA meetup organizer. Information sponge. Mostly lurking since 2014. Seeking more experience, and eventually a position, in AI safety/governance.
Extremely annoyed by the lack of an explorable framework for AI risk/benefits. Working on that.
Two separate points:
- Compared to physics, the field of alignment has a slow-changing set of questions (e.g. corrigibility, interpretability, control, goal robustness) but a fast-evolving subject matter, as capabilities progress. I use the analogy of a biologist suddenly working in a place where evolution runs 1000x faster: some insights get stale very fast, and it's hard to know in advance which ones. Keeping up with the frontier, then, serves to check whether one's work still seems relevant (or where to send newcomers). Agent foundations, as a class of research agendas, was the answer to this volatility, but progress is slow and the ground keeps shifting.
- There is some effort to unify alignment research, or at least to provide a textbook that gets readers to the frontier. My prime example is the AI Safety Atlas; I would also count the BlueDot courses as structure-building, and AIsafety.info as giving some initial directions. There's also a host of papers attempting to categorize the sub-problems, but they're not focused on tentative answers.