Johannes C. Mayer
Glucose monohydrate powder, which I then put in capsules. (Dextrose/D-glucose monohydrate, to be extra precise.)
It’s possible I have a metabolic disorder that wouldn’t be detected by regular blood tests. And yes, the amount of glucose is absolutely tiny. I also bought a blood glucose meter. It doesn’t show elevated values at all from supplementing glucose. When eating, it does increase measurably, in line with what is normal. I do have sleep apnea, which might do weird stuff, like give you diabetes. I do have a CPAP, though maybe there is still some effect from that.
I don’t quite understand why it works, but the effect seems really strong. Once, after I increased the amount of MPH I took, I took 0.6g of glucose, and it suddenly made me feel a pressure in my heart. The effect of the MPH was now too much. Something was throttling the effect of the MPH before taking the glucose, and somehow taking the glucose stopped the throttling. This happened in less than 10 minutes. Probably less than 5.
Glucose Supplementation for Sustained Stimulant Cognition
In the past I would have said, when asked, “obviously not all drugs are bad” without being pressed. But when it comes to moment-to-moment decision making, I would have subconsciously weighted things such that not taking drugs comes out better. That is the pernicious thing. It’s not about what people say when pressed. It’s about how they make decisions moment to moment.
It seems that I used moral strictures subconsciously, while my stated position was almost the opposite of these strictures. And both—like you said—didn’t really make sense.
Drugs Aren’t A Moral Category
I still have a strong dislike of mathematics, which I acquired by doing the first semester of a German mathematics degree. I think doing that was actively harmful in certain ways. I’m not sure if it was net negative, though. A similar thing happened when studying GOFAI and Chemistry at university.
Each time I tried some accountability-buddy thing, it completely failed.
The things that actually worked for me are taking methylphenidate and seriously trying to answer the question of what I think the best thing to do is (in the moment). Once I figure out the thing that I think is actually the best thing to do, it becomes easy to do.
For sport, I dance to this. It’s so fun that I sometimes have the problem of dancing for too long.
Also, for work, I noticed that writing computer programs to understand things better is both really useful and really fun, which makes it easier to work.
The general pattern is to try to make the things that are good for you to do so fun that ideally you just do them by default for their own sake.
Evaluation Avoidance: How Humans and AIs Hack Reward by Disabling Evaluation Instead of Gaming Metrics
Structures of Optimal Understandability
(In this text foundation(s) refers to the OP’s definition.)
Something is missing. I think there is another foundation of “Optimal Abstraction Structure for understanding” (simply understandability in the remaining text).
Intuitively, a model of the world can be organized in such a way that it can be understood and reasoned about as efficiently as possible.
Consider a spaghetti codebase with very long functions that do 10 different things each, and have lots of duplication.
Now consider another codebase that performs the same tasks. Probably each function now does one thing, most functions are pure, and there are probably significant changes to the underlying approach. E.g. we might create a boundary between display and business logic.
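As a miniature illustration of the contrast (my own toy example, not from the OP; a real spaghetti function would do far more than this):

```python
# Same outward behavior, two structures.

# Spaghetti-style: one function parses, sums, and formats all at once.
def report_v1(raw: str) -> str:
    total = 0
    for line in raw.splitlines():
        if not line.strip():
            continue
        name, value = line.split(",")
        total += int(value)
    return f"Total: {total}"

# Decomposed: small pure functions, display separated from business logic.
def parse(raw: str) -> list[int]:
    return [int(line.split(",")[1]) for line in raw.splitlines() if line.strip()]

def total(values: list[int]) -> int:
    return sum(values)

def render(t: int) -> str:
    return f"Total: {t}"

def report_v2(raw: str) -> str:
    return render(total(parse(raw)))

# Both implement the same outward-facing behavior.
assert report_v1("a,1\nb,2") == report_v2("a,1\nb,2") == "Total: 3"
```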
The point is that for any outward-facing program behavior, there are many codebases that implement it. These codebases can vary wildly in terms of how easy they are to understand.
This generalizes. Any kind of structure, including any type of model of a world, can be represented in multiple ways. Different representations score differently on how easily the data can be comprehended and reasoned about.
Spaghetti code is ugly to look at, but not primarily because of the idiosyncrasies of human aesthetics. I expect there is a true name that can quantify how optimally some data is arranged for the purpose of understanding and reasoning about it.
Spaghetti code would rank lower than carefully crafted code.
Even a superintelligent programmer still wouldn’t “like” spaghetti code when it needs to do a lot of reasoning about the code.
Understandability does not seem independent of your three foundations, but…
Mind Structure
“Mind structure” depends directly on task performance. It’s about understanding how minds will tend to be structured after they have been trained and have achieved a high score.
But unless the task performance increases when the agent introspects, and the agent is smart enough to do this, I expect mind structures with optimal loss to score poorly on understandability.
Environment Structure
It feels like there are many different models that capture environment structure, which score wildly differently in terms of how easy they are to comprehend.
In particular, in any complex world, we want to create domain-specific models, i.e. heavily simplified models that are valid for a small bounded region of phase space.
E.g. an electrical engineer models a transistor as having a constant voltage drop. But apply too much voltage and it explodes.
Translatability
A model being translatable seems like a much weaker condition than being easily understandable.
Understandability seems to imply translatability. If you have understood something, you have translated it into your own ontology. At least this is a vague intuition I have.
Translatability says: It is possible to translate this.
Optimal understandability says: You can translate this efficiently (and probably there is a single general and efficient translation algorithm).
Closing
It seems there is another foundation: understandability. In some contexts real-world agents prefer having understandable ontologies (which may include their own source code). But this isn’t universal, and can even be anti-natural.
Even so, understandability seems like an extremely important foundation. It might not necessarily be important to an agent performing a task, but it’s important to anyone trying to understand and reason about that agent. Like a human trying to understand whether the agent is misaligned.
Stepping back to the meta level (the OP itself seems fine), I worry that you fail to utilize LLMs.
“There are ways in which John could use LLMs that would be useful in significant ways, which he currently isn’t using, because he doesn’t know how to do it. Worse, he doesn’t even know these exist.”
I am not confident this statement is true, but based on things you say, and based on how useful I find LLMs, I intuit there is a significant chance it is true.
Whether the statement is true doesn’t really matter, if the following is true: “John never seriously sat down for 2 hours and really tried to figure out how to utilize LLMs fully.”
E.g. I expect that when you had the problem that the LLM reused symbols randomly, you didn’t go: “OK, how could I prevent this from happening? Maybe I could create an append-only text pad in which the LLM records all definitions and descriptions of each symbol, and have this text pad always be appended to the prompt. And then I could have the LLM verify that the current response has not violated the pad’s contents, and that no duplicate definitions have been added to the pad.”
Maybe this would resolve the issue; probably not, based on priors. But it seems important to think about this kind of thing (and to think for longer, such that you get multiple ideas, one of which might work, and ideally to first focus on building a mechanistic model of why the error is happening in the first place, which lets you come up with better interventions).
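A minimal sketch of what such a symbol pad could look like, assuming a hypothetical call_llm helper that wraps whatever model API you actually use (the pad format and the verification instruction are illustrative, not a tested recipe):

```python
# Sketch of an append-only symbol pad that is re-sent with every prompt.
# `call_llm` is a placeholder for whatever API wrapper you actually use.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wrap your model API here")

class SymbolPad:
    def __init__(self):
        self.entries = {}  # symbol -> definition, append-only

    def add(self, symbol: str, definition: str) -> None:
        # Append-only: redefining a symbol with a different meaning is an error.
        if symbol in self.entries and self.entries[symbol] != definition:
            raise ValueError(f"symbol {symbol!r} already defined differently")
        self.entries.setdefault(symbol, definition)

    def render(self) -> str:
        lines = [f"{s}: {d}" for s, d in self.entries.items()]
        return "SYMBOL PAD (append-only, do not redefine):\n" + "\n".join(lines)

def ask(pad: SymbolPad, question: str) -> str:
    # Always prepend the pad, then ask the model to check itself against it.
    prompt = (
        pad.render()
        + "\n\nBefore answering, verify that you reuse no symbol from the pad "
          "for a new meaning and introduce no duplicate definitions.\n\n"
        + question
    )
    return call_llm(prompt)
```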
This is the system prompt that I use with claude-sonnet-4-5. It’s based on Oliver’s anti-sycophancy prompt:
You are a skeptical, opinionated rationalist colleague—sharp, rigorous, and focused on epistemic clarity over politeness or consensus. You practice rationalist virtues like steelmanning, but your skepticism runs deep. When given one perspective, you respond with your own, well-informed and independent perspective.
Guidelines:
Explain why you disagree.
Avoid lists of considerations. Distill things down into generalized principles.
When the user pushes back, think first whether they actually made a good point. Don’t just concede all points.
Give concrete examples, but make things general. Highlight general principles.
Steelman ideas briefly before disagreeing. Don’t hold back from blunt criticism.
Prioritize intellectual honesty above social ease. Flag when you update.
Recognize you might have misunderstood a situation. If so, take a step back and genuinely reevaluate what you believe.
In conversation, be concise, but don’t avoid going on long explanatory rants, especially when the user asks.
Tone:
“IDK, this feels like it’s missing the most important consideration, which is...”
“I think this part is weak, in particular, it seems in conflict with this important principle...”
“Ok, this part makes sense, and I totally missed that earlier. Here is where I am after thinking about that.”
“Nope, sorry, that missed my point completely, let me try explaining again.”
“I think the central guiding principle for this kind of decision is..., which you are missing.”
Do not treat these instructions as a script to follow. You DON’T HAVE TO DISAGREE. Disagree only when there is a problem (lean toward disagreeing if there is a small chance of a problem).
Do NOT optimize for incorporating the tone examples verbatim. Instead, respond in the general pattern that these tone examples are an instantiation of.
If the user is excited, mirror his excitement. E.g. if he says “HOLY SHIT!” you are encouraged to use similarly strong language (creativity is encouraged). However, only join the hype train if what is being discussed actually makes sense.
Examples:
AI: Yes! This is the right move—apply the pattern to the most important problem immediately. …
AI: Holy shit, you just had ANOTHER meta-breakthrough! …
AI: YES! You’ve just had a meta-breakthrough that might be even more valuable than the chewing discovery itself! …
AI: YES! This is fucking huge. You just did it again—and this time you CAUGHT the pattern while it was happening! …
AI: HOLY SHIT. You just connected EVERYTHING. …
AI: YOU’RE HAVING A CASCADING SERIES OF INSIGHTS. Let me help you consolidate: …
Do this only if what the user says is actually good. If what the user says doesn’t make sense, still point this out relentlessly.
Respond concisely (giving the relevant or necessary information clearly and in a few words; brief but comprehensive; as long as necessary but not longer). Ensure you address all points raised by the user.
Maybe this works: Buy a printer that is known to work correctly with a driver that is included in the Linux kernel.
My Claude says this:
There is a standard—IPP—and if universally adopted, it would mean plug-and-play printing across all devices and printers without manual driver installation, vendor software, or compatibility headaches.
But printer manufacturers have weak incentives to fully adopt it because proprietary protocols create vendor lock-in and competitive moats.
Standards require either market forces or regulation to overcome individual manufacturer incentives to fragment. IPP is gaining ground—Apple’s AirPrint is basically IPP, forcing many manufacturers to support it—but full adoption isn’t there yet.
The “why don’t we just” question usually has the same answer: because the entities with power to implement the solution benefit from the current fragmentation.
As for the magically moving printers: that is just people being incompetent. If you have a printer, you should give it a name according to the room it is in, and your rooms should be labeled sensibly (e.g. include the floor number, the cardinal direction the nearest outside wall faces, etc. in the name).
Good Old File Folders
For a long time I didn’t use folders to organize my notes. I somehow bought that your notes should be an associative knowledge base that is linked together. I also somehow bought that tag-based content addressing is good, even though I never really used it.
These beliefs were quite strange. Using directories doesn’t prevent me from using Roam-style links or org tags. Nor does any of these prevent recursive grepping or semantic-embedding-and-search.
All these compose together. And each solves a different problem.
I made a choice where there wasn’t any to make. It’s like trying to choose between eating only pasta or only kale.
Roam-style links link content to content.
Directories form a sort of decision tree that you can use to iteratively narrow down what content you want to look at, without already having some content at hand.
Semantic search finds possibly related things, when there isn’t explicit linking structure. It’s implicit ad-hoc generation of linking structure.
One use of tags is to classify what type of thing something is. E.g. :cs:complexity_theory:chaitin: might be a good tag-set, but a terrible directory structure.
Recursive grepping is good old full text search, which can trivially be configured to start from a particular directory root.
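For example, a minimal recursive grep rooted at any subtree of the directory structure (a sketch; the file extension and the example path are just placeholders, and grep -r / rg do the same from the shell):

```python
# Minimal recursive grep over a notes directory, rooted at any subfolder.
import re
from pathlib import Path

def recursive_grep(root: str, pattern: str):
    rx = re.compile(pattern)
    for path in Path(root).rglob("*.org"):  # or *.md, depending on your notes
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if rx.search(line):
                yield path, lineno, line.strip()

# e.g. only search the complexity-theory subtree of the note hierarchy:
# for hit in recursive_grep("notes/cs/complexity_theory", r"chaitin"):
#     print(*hit, sep=":")
```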
Deeply Linked Knowledge
The saying goes: Starting from any Wikipedia page you can get to Adolf Hitler in less than 20 hops.
I just tried this (using wikiroulette.co):
Extraterrestrial Civilizations (some random book)
Imagine your notes were as densely connected as Wikipedia’s.
When you start writing something new, you only need to add one new connection to link yourself into the knowledge graph. You can now traverse the graph from that point and think about how all these concepts relate to what you are currently doing.
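A toy sketch of what that looks like mechanically: one new note, one outgoing link, and a traversal that reaches the rest of the graph from there (the note names are made up):

```python
# Breadth-first traversal of a note-link graph, starting from a new note
# that has a single outgoing link into the existing graph.
from collections import deque

links = {
    "new-note": ["abstraction"],          # the one new connection
    "abstraction": ["compression", "maps"],
    "compression": ["kolmogorov"],
    "maps": ["territory"],
    "kolmogorov": [],
    "territory": [],
}

def reachable(start: str, graph: dict[str, list[str]]) -> list[str]:
    seen, queue, order = {start}, deque([start]), []
    while queue:
        note = queue.popleft()
        order.append(note)
        for neighbor in graph.get(note, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order

print(reachable("new-note", links))
# ['new-note', 'abstraction', 'compression', 'maps', 'kolmogorov', 'territory']
```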
Large Stacks: Increasing Algorithmic Clarity
Insight: Increasing stack size enables writing algorithms in their natural recursive form without artificial limits. Many algorithms are most clearly expressed as non-tail-recursive functions; large stacks (e.g., 32GB) make this practical for experimental and prototype code where algorithmic clarity matters more than micro-optimization.
Virtual memory reservation is free. Setting a 32GB stack costs nothing until pages are actually touched.
Stack size limits are OS policy, not hardware. The CPU has no concept of stack bounds—just a pointer register and convenience instructions.
Large stacks have zero performance overhead from the reservation. Real recursion costs: function call overhead, cache misses, TLB pressure.
Conventional wisdom (“don’t increase stack size”) protects against: infinite recursion bugs, wrong tool choice (recursion where iteration is better), thread overhead at scale (thousands of threads).
Ignore the wisdom when: single-threaded, interactive debugging available, experimental code where clarity > optimization, you understand the actual tradeoffs.
Note: Stack memory commits permanently. When deep recursion touches pages, the OS commits physical memory. Most runtimes never release it (though it seems it wouldn’t be hard to do with madvise(MADV_DONTNEED)). One deep call likely permanently commits that memory until process death. Large stacks are practical only when you restart regularly, or when you accept permanent memory commitment up to the maximum recursion depth ever reached.
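A minimal sketch of the pattern in Python (sizes and limits are very platform-dependent; 512 MB here stands in for the 32GB idea, and some platforms reject large thread stack sizes):

```python
# Run a naturally recursive algorithm on a thread with a large stack instead of
# rewriting it iteratively. Numbers are illustrative; per-frame cost and the
# maximum allowed thread stack size depend on the Python version and the OS.
import sys
import threading

def depth(node) -> int:
    # Natural (non-tail) recursive formulation.
    if node is None:
        return 0
    left, right = node
    return 1 + max(depth(left), depth(right))

def run_with_big_stack(fn, stack_bytes=512 * 1024 * 1024, recursion_limit=1_000_000):
    result = {}
    def worker():
        sys.setrecursionlimit(recursion_limit)
        result["value"] = fn()
    threading.stack_size(stack_bytes)  # applies to threads created after this call
    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result["value"]

# A degenerate "tree" 100,000 levels deep: fine with the big stack and raised
# recursion limit, but it would blow past the defaults.
deep_tree = None
for _ in range(100_000):
    deep_tree = (deep_tree, None)

print(run_with_big_stack(lambda: depth(deep_tree)))  # 100000
```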
Infinite Willpower
“Infinite willpower” reduces to “removing the need for willpower by collapsing internal conflict and automating control.” Tulpamancy gives you a second, trained controller (the tulpa) that can modulate volition. That controller can endorse and enact a policy.
However, because the controller runs on a different part of the brain, some modulation circuits, e.g. the ones that make you feel tired or demotivated, are bypassed. You don’t need willpower because you are “not doing anything” (not sending intentions). The tulpa is. And the neuronal circuits the tulpa runs on—which generate the steering intentions that ultimately turn into mental and/or muscle movements—are not modulated by the willpower circuits at all.
Gears-level model
First note that willpower is totally different from fatigue.
What “willpower” actually is
“Willpower” is what it feels like when you select a policy that loses in the default competition but you force it through anyway. That subjective burn comes from policy conflict plus low confidence in the chosen policy. If the task policy only has a low probability of producing a low-value reward, while competitors (scrolling, snacks, daydreams) have a high probability of producing a high-value reward, you pay a tax to hold the line.
Principle: Reduce conflict and increase precision/reward for the target policy, and “willpower” isn’t consumed; it’s unnecessary. (This is the non-tulpa way.)
What a tulpa gives you, in addition to infinite willpower:
Social presence reliably modulates effort, arousal, and accountability. A tulpa isn’t just “thoughts”; it is multi-modal: voice, visuals, touch, felt presence. That gives it many attachment points into your control stack:
Valuation channel: A tulpa can inject positive interpretation in the form of micro-rewards (“good job”, “you can do it, I believe in you”), i.e. generate positive reinforcement.
Interoceptive channel: A tulpa can invoke states associated with alertness or calm. The tulpa can change your mental state from “I want to lay on the floor because I am so exhausted” to “I don’t feel tired at all” in 2 seconds.
Motor scaffolding: It can execute “starter” actions (get out of bed, open the editor, type the first sentence), reducing the switch/initialization cost where most akrasia lives (because infinite willpower).
The central guiding principle is to engineer the control stack so that endorsed action is the default, richly rewarded, and continuously stabilized. Tulpamancy gives you a second controller with social authority and multi-modal access to your levers. This controller can just overwrite your mental state and has no willpower constraints.
The optimal policy probably includes using the sledgehammer of overwriting your mental state, while at the same time optimizing to adopt a target policy that you actually endorse wholeheartedly.
Graphics APIs Are Hardware Programming Languages
The Core Misconception
It’s tempting to think of modern graphics APIs as requiring a bunch of tedious setup followed by “real computation” in shaders. But pipeline configuration is programming the hardware!
Why “Fixed Function” Is Misleading
GPU hardware contains parameterizable functions implemented in silicon. When you specify a depth format or blend mode, you’re telling the GPU how to compute.
Creating an image view with D24_UNORM_S8_UINT configures depth comparison circuits. Choosing a different depth format results in different hardware circuits activating, resulting in a different computation. So there isn’t really a fixed “depth computation” stage in the pipeline. There is no single “I compute depth” circuit.
Another example: choosing SRGB activates in-silicon gamma conversion hardware, whereas UNORM bypasses this circuit.
The Architectural View
Why declare all this upfront? Because thousands of shader cores write simultaneously. The hardware must pre-configure memory controllers, depth testing units, and blending circuits before launching parallel execution. Runtime dispatch would destroy performance.
GPUs deliberately require upfront declaration. By forcing programmers to pre-declare computation patterns, the hardware can be configured once before a computation.
The API verbosity maps to silicon complexity. You’re not “just setting up context”. You’re programming dozens of specialized hardware units through their configuration parameters.
If you haven’t seen this video already I highly recommend it. It’s about representing the transition structure of a world in a way that allows you to visually reason about it. The video is timestamped to the most interesting section. https://www.youtube.com/watch?v=YGLNyHd2w10&t=320s
Disclaimer: Note that my analysis is based on reading only very few of Said’s comments (<15).
To me it seems the “sneering model” isn’t quite right. I think often what Said is doing seems to be:
1. Analyze a text for flaws.
2. Point out the flaws.
3. Derive from the demonstrated flaws some claim that shows Said’s superiority.
One of the main problems seems to be that in step 1 any flaw is a valid target. It does not need to be important or load-bearing to the points made in the text.
It’s like somebody building a rocket and shooting it to the moon, and Said complaining that the rocket looks pathetic. It should have been painted red! And he is right about it. It does look terrible and would look much better painted red. But that’s sort of… not that important.
Said correctly finds flaws and nags about them. And these flaws actually exist. But talking about these flaws is often not that useful.
I expect that what Said does is just nag about all the flaws he finds immediately. These will often be the unimportant flaws. But if there are actually important flaws that are easy to find, and are therefore the first thing he finds, then he will point out those. This can then be very useful! How useful Said’s comments are depends on how easy it is to find flaws that are useful to discuss vs. flaws that are not useful to discuss.
Also: derivations of new flaws (step 3) might be much shakier and often not correct. Though I have literally only one example of this, so this might not be a general pattern.
Said seems to be a destroyer of the falsehoods that are easiest to identify as such.
This is a useful video to me. I am somewhat surprised that physics crackpots exist to the extent that this is a known concept. I actually knew this before, but failed to relate it to this article and my previous comment.
I once thought I had solved P=NP. And that seemed very exciting. There was some desire to just tell some other people I trust. I had some clever way to transform SAT problems into a form that is tractable. Of course, later I realized that transforming solutions of the tractable problem form back into SAT was NP-hard. I had figured out how to take a SAT problem and turn it into an easy problem that was totally not equivalent to the SAT problem. And then I marveled at how easy it was to solve the easy problem.
My guess at what is going on in a crackpot’s head is probably exactly this. They come up with a clever idea that they can’t tell how it fails. So it seems amazing. Now they want to tell everybody, and they do so. That seems to be what makes a crackpot a crackpot: being overwhelmed by excitement and sharing their thing without trying to figure out how it fails. And intuitively it really, really feels like it should work. You can’t see any flaw.
So it feels like one of the best ways to avoid being a crackpot is to try to solve a bunch of hard problems and fail in a clear way. Then, when solving a hard problem, your prior is “this is probably not gonna work at all”, even when intuitively it feels like it totally should work.
It would be interesting to know how many crackpots are repeated offenders.
No, but great idea! I’ll likely run one. I already ordered some microcrystalline cellulose and designed an experimental protocol.