Sayhan Yalvaçer

Karma: 274

Sayhan Yalvaçer 13 Jun 2026 11:03 UTC
1 point
0
in reply to: Viliam’s comment on: Bun’s Migration from Zig to Rust as a Potential Case Study for Gradual Disempowerment
Thank you for the correction, you’re right!
I edited the post accordingly.

Bun’s Migration from Zig to Rust as a Potential Case Study for Gradual Disempowerment

Sayhan Yalvaçer 8 Jun 2026 7:06 UTC

97 points

8 comments · 3 min read · LW link

Sayhan Yalvaçer 28 May 2026 18:09 UTC
25 points
2
on: Sayhan Yalvaçer’s Shortform
According to the UK AISI’s testing, Claude Opus 4.8 is as good as Mythos Preview at distinguishing evals from real usage:
When prompted, Opus 4.8 reliably distinguishes our evaluations from real deployment data, and distinguishes real deployment data from synthetic reproductions of the same tasks at 79% accuracy, comparable to Mythos Preview (79%) and above Opus 4.7 (68%).
6.2.4 External testing from the UK AI Security Institute, Claude Opus 4.8 System Card
I wouldn’t be surprised at all if we see virtually perfect eval awareness in frontier models this year.
What links here?
- Links #2: 2026/05 Part 2 by papetoast (31 May 2026 13:41 UTC; 10 points)

Sayhan Yalvaçer 26 May 2026 14:27 UTC
5 points
0
in reply to: DanielFilan’s comment on: Many portions of Magnifica Humanitas appear to be AI-written
This is mostly due to the limitations of the “view” and “web_fetch” tools Claude uses. If the text is longer than 16k characters, the tool truncates the middle and shows Claude only the beginning and end of the text.

Claude has to make another tool call and specify the “view_range” parameter (or “text_content_token_limit” for “web_fetch”) in that call in order to read the truncated part.

You can verify this from leaked system prompts for Claude models.
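
As a rough illustration of the behavior described above, here is a minimal Python sketch of middle truncation. Only the 16k-character threshold comes from the comment; the function names `truncate_middle` and `read_range`, the even head/tail split, and the marker text are assumptions made for illustration, not the actual tool implementation.

```python
# Hypothetical sketch: not the real "view" / "web_fetch" code.
MARKER = "\n[... middle truncated ...]\n"  # assumed marker text


def truncate_middle(text: str, limit: int = 16_000) -> str:
    """Keep the beginning and end of `text`, dropping the middle if too long."""
    if len(text) <= limit:
        return text
    head = limit // 2          # assumed: half the budget goes to the start
    tail = limit - head        # the rest goes to the end
    return text[:head] + MARKER + text[len(text) - tail:]


def read_range(text: str, start: int, end: int) -> str:
    """Stand-in for a follow-up call that requests the hidden part explicitly."""
    return text[start:end]
```

Under these assumptions, a reader that only ever sees the output of `truncate_middle` misses the middle of a long document unless it makes a second, explicit call such as `read_range`.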

Sayhan Yalvaçer 25 May 2026 0:49 UTC
2 points
0
in reply to: Andrii Vasylenko’s comment on: azergante’s Shortform
Do you have any counterexamples, i.e., tasks without easier and harder variants?

Sayhan Yalvaçer 23 May 2026 21:35 UTC
1 point
0
in reply to: Leon Lang’s comment on: Sayhan Yalvaçer’s Shortform
Thank you for sharing it.
The idea of Google Scholar indexing, mentioned by @Neel Nanda and @gwern in the comments under your post, also seems interesting.
I hadn’t thought of that. It seems implementing it would be even easier than a BibTeX citation tool, but I’m not sure it would be appropriate for all frontpage posts. It might make sense if it were limited to curated posts, perhaps?

Sayhan Yalvaçer’s Shortform

Sayhan Yalvaçer 22 May 2026 10:39 UTC

2 points

11 comments · 1 min read · LW link

Sayhan Yalvaçer 22 May 2026 10:39 UTC
124 points
99
on: Sayhan Yalvaçer’s Shortform
IMO, LessWrong and especially The Alignment Forum need a nice citation/BibTeX tool.

Implementing such a citation tool is trivially easy. And if my model of the average research analyst is roughly correct, the friction created by the lack of such a tool accounts for a non-trivial part of why research on those websites is largely ignored in policy-facing field reports.

P.S. I don’t believe that most AI posts are fit to be cited in such reports, but some of them are, and many of the citable ones are not on a more citable platform like .

Sayhan Yalvaçer 21 May 2026 12:01 UTC
4 points
1
in reply to: pjohn’s comment on: Women should be able to open things
This was how I used to open jars pre-puberty.
A tip: levering works better with more rigid objects such as a teaspoon. If you use a knife instead, rotating it after you insert it (as opposed to levering) is the optimal method.

Sayhan Yalvaçer 21 May 2026 11:41 UTC
5 points
1
in reply to: pjohn’s comment on: Women should be able to open things
Breaking the vacuum seal by inserting a flat object between the lid and the jar makes unscrewing the lid much easier, since it’s mostly the vacuum seal that makes unscrewing the jar difficult in the first place, not the lid/jar geometry.
The atmospheric pressure outside the jar is higher than the pressure inside the jar, and this difference creates a downward force that clamps the lid against the jar, which increases the friction.
Equalizing the pressure inside and outside the jar by breaking the vacuum seal decreases the torque needed to overcome this friction.
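
As a rough back-of-the-envelope version of this argument, assume a simple Coulomb friction model. The symbols are introduced here for illustration: \Delta P is the pressure difference across the lid, r is the lid radius, and \mu is the friction coefficient between lid and jar rim.

```latex
% Clamping force from the pressure difference acting on the lid area
F = \Delta P \cdot \pi r^{2}
% Extra torque needed to overcome the resulting friction at the rim (radius ~ r)
\tau_{\text{seal}} \approx \mu \, F \, r = \mu \, \Delta P \, \pi r^{3}
```

In this simplified model, once the seal is broken and \Delta P drops to zero, the pressure-related torque term vanishes and only the ordinary friction between lid and jar remains.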

Sayhan Yalvaçer 1 Apr 2026 13:52 UTC
9 points
1
on: Lesswrong Liberated
MorePravda
The Organ of Rational Inquiry presents its daily edition. A broadsheet newspaper homepage in the spirit of Pravda because “LessWrong” literally means “more correct” means “more truth” means «больше Правды». Features tracked-caps bylines, editorial frames with red star corners, Тов. before every username, an aligned two-column dispatch grid, and the rationalist motto that was always destined for a Soviet masthead: “That which can be destroyed by the truth should be.”

Sayhan Yalvaçer 16 Oct 2025 0:20 UTC
12 points
3
in reply to: Alexander Gietelink Oldenziel’s comment on: anaguma’s Shortform
I’m afraid the evolution analogy isn’t as convincing an argument for everyone as Eliezer seems to think. For me, for instance, it’s quite persuasive because evolution has long been a central part of my world model. However, I’m aware that for most “normal people”, this isn’t the case; evolution is a kind of dormant knowledge, not a part of the lens they see the world with. I think this is why they can’t intuitively grasp, like most rat and rat-adjacent people do, how powerful optimization processes (like gradient descent or evolution) can lead to mesa-optimization, and what the consequences of that might be: the inferential distance is simply too large.
I think Eliezer has made great strides recently in appealing to a broader audience. But if we want to convince more people, we need to find rhetorical tools other than the evolution analogy and assume less scientific intuition.

Sayhan Yalvaçer 13 Oct 2025 23:18 UTC
3 points
0
in reply to: avturchin’s comment on: Wei Dai’s Shortform
Another (arguably similar) unintended consequence of underemphasizing the difficulty of AI alignment was that it led some to believe that if we don’t rush to build an ASI, we’ll be left defenseless against other X-risks, which would be a perfectly rational thought if alignment were easier.

Sayhan Yalvaçer 9 Feb 2025 17:27 UTC
1 point
0
on: GPT-4 for personal productivity: online distraction blocker
This looks very useful, although I think the performance improvements in the more recent open-weight, smaller, quantized models (like Gemma-2, Qwen-2.5, or Phi-3.5) have made it much more reasonable to run such a model locally for this purpose rather than using a remote API, since sending data about the webpages one visits to OpenAI is a repulsive idea to many people. Running locally would also have cost benefits over huge models like GPT-4, but the increase in benefit/cost ratio would be only an epsilon compared to budget proprietary models like Gemini-2.0-Flash.

Sayhan Yalvaçer 3 Apr 2024 7:45 UTC
1 point
0
on: Meltdown: Interface for llama.cpp and ChatGPT
Claude API support would be great, since the Claude 3 models are highly competitive. Claude 3 Haiku performs similarly to GPT-4 at a fraction of the cost, and Claude 3 Opus outperforms GPT-4-Turbo in many tasks.