If you get an email from aisafetyresearch@gmail.com, that is most likely me. I also read it weekly, so you can pass a message into my mind that way.
Other ~personal contacts: https://linktr.ee/uhuge
Martin Vlach
It’s provided the current time together with ~20k other system-prompt tokens, so its influence on the behaviour is substantially more diluted..?
Folks like this guy hit it on hyperspeed -
https://www.facebook.com/reel/1130046385837121/?mibextid=rS40aB7S9Ucbxw6v
I still remember a university teacher explaining how early TV transmissions very often included/displayed ghosts of dead people, especially dead relatives.
As the tech matures from an art, these phenomena, or hallucinations, evaporate.
You seem to report one OOM less than Figure 8 in https://alexiglad.github.io/blog/2025/ebt/#:~:text=a%20log%20function).-,Figure%208,-%3A%20Scaling%20for
The link to the Induction section on https://www.lesswrong.com/lw/dhg/an_intuitive_explanation_of_solomonoff_induction/#induction seems broken on mobile Chrome, @habryka
I’ve heard that hypothesis in a review of that Anthropic blog post, likely by AI Explained, or maybe by bycloud. They called it “Chekhov’s gun”.
What’s your view on sceptical claims about RL on transformer LMs, like https://arxiv.org/abs/2504.13837v2, or the one that CoT instructions yield better results than <thinking> training?
Not the content I expect to see labeled AI Capabilities, although I see how that labeling could be vindicated.
By the way, if I write an article about LMs generating SVG, that’s plain text, and if I put an SVG illustration up, that’s an image, not plain text?
Trivial, but do token-based LMs follow instructions like “only output tokens ‘1’, ‘2’, ‘3’”, where without that instruction they’d output 123 as a single token?
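For what it’s worth, the tokenizer side of this is easy to check directly; a minimal sketch, assuming the tiktoken library and its cl100k_base encoding (other models’ tokenizers will split differently, and whether the model actually obeys the instruction at decode time is a separate question):

```python
# Minimal sketch, assuming the `tiktoken` package and its cl100k_base encoding:
# check whether "123" is encoded as one token or as three separate digit tokens.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

ids = enc.encode("123")
print(ids)                              # token id(s) the string "123" maps to
print([enc.decode([i]) for i in ids])   # the text piece behind each id

# The digits forced apart, as the "only output tokens '1', '2', '3'" instruction
# would require the model to emit them:
print([enc.encode(d) for d in "123"])
```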
Draft: A concise theory of agentic consciousness
I’d update my take from a very pessimistic/gloomy one to an (additionally) excited one: more intelligent models building a clear view of the person they interact with is a sign of emerging empathy, which is a hopeful property for alignment/respect.
False Trichotomy?
Your model assumes that one cannot be all three; however, some roles demand it, and in reality people do navigate all three traits. My top example would be empathic project managers.
Largely Sycophantically Reasoning Models: should we claim the term for this behavior, where the language model profiles the user and heavily customizes its responses?
Hello @habryka, could you please adjust the text on the page to include the year when applications closed, so that it confuses people (like me) less and they don’t waste their time reading it all?
THANKS!
You mean the chevrons like this are non-standard, but also sub-standard, although they have the neat property of representing >Speaker one< and >>Speaker two<<? I can see the typography of those here is meh at best.
I’m excited to find your comment, osten; it reads as a pretty insightful view to me.
Let me restate what I understood your light (and welcome) critique to be: I put forward “human civilization” as an actor which has lasted/endured a long time, which heuristically suggests it has high resilience and robustness and thus deserves respect and holding the control. Here you say it did not endure much as a single structure to consider/test with the Lindy effect, as it changed significantly and many times, so we should perhaps split it into “feudal civilization”, “democratic civilization”, etc.
The other interpretation I see is that yes, it is one structure, and ASI will keep the structure but lead (in) it. I enjoy that argument, but it would not fully work unless AIs get the status of a natural person; it would somewhat work, though, if an AI can gather human proxies whenever possible.
Thou shalt not command an aligned AI
G.D. as Capitalist Evolution, and the claim for humanity’s (temporary) upper hand
There’s the thing where Gemini 2.5 Pro surprisingly isn’t very good at geoguessing; a woman’s tweet is to be linked <here>.
I’d bet the webpage parser ignored the images and their contents.
I’d bet a “re-based” model à la https://huggingface.co/jxm/gpt-oss-20b-base, when instruction-tuned, would do the same as similarly sized Qwen models.