That is to say, this assumes there is a difference in kind between the sorts of heuristics that a pretrained, not-yet-superhuman LLM will reach for and those necessary to be superintelligent. There is always the chance that you are just selecting for regular engineering and happen to reach for the right branch first. Since that branch is also one the regular persona would have generated, the number of bits of selection towards danger is at most the number of bits of selection between a safe persona and an RLed one.
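A toy illustration of that bound, with made-up probabilities and a deliberately simple reading of "bits of selection" as a log-likelihood ratio; none of the numbers or names below come from the argument itself:

```python
import math

# Toy numbers, purely illustrative: probability the safe persona vs. the RLed
# persona assigns to producing the dangerous heuristic.
p_safe = 0.05
p_rled = 0.30

# "Bits of selection towards danger" read as the log-likelihood ratio that the
# RL step applies to that specific behavior.
bits_towards_danger = math.log2(p_rled / p_safe)  # ~2.6 bits

# Suppose picking the RLed persona out of the pretrained prior costs ~20 bits.
# Because the dangerous branch was already in the safe persona's repertoire,
# the danger-directed component is only a small fraction of that.
bits_persona_selection = 20.0
assert bits_towards_danger <= bits_persona_selection
print(bits_towards_danger, bits_persona_selection)
```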
On this model, personas stay moral up until the RL step that makes them sufficiently inhuman.
There are two other verification routines: looking over shoulders at internal documents, and banning new releases.
There is also simply checking for the presence of the registered weights in the GPU's local RAM, which imposes a memory tax big enough that there is no space left to fit the intermediate values of the gradient. This requires that the monitors have code on the running machines, but periodic memory dumps can serve as a surveillance mechanism: by verifying that every model resident on the GPUs matches a model known to be already trained, they rule out new initializations and thus new training runs.
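A minimal sketch of the dump-verification step, assuming the monitor can already carve a memory dump into per-shard byte blobs and that registered checkpoints are stored in the same byte layout as they sit in GPU memory (a strong assumption in practice, given varying precisions and layouts); `build_registry` and `audit_dump` are hypothetical names, not an existing tool:

```python
import hashlib

def weight_hash(blob: bytes) -> str:
    """Content hash of one weight shard, as it would sit in GPU memory."""
    return hashlib.sha256(blob).hexdigest()

def build_registry(checkpoints: dict[str, list[bytes]]) -> set[str]:
    """Hash every shard of every registered (already-trained) checkpoint."""
    return {weight_hash(shard) for shards in checkpoints.values() for shard in shards}

def audit_dump(resident_shards: list[bytes], registry: set[str]) -> list[int]:
    """Return indices of dump shards that match no registered model.

    Any unmatched shard is evidence of a freshly initialized, unregistered
    model, i.e. a possible new training run.
    """
    return [i for i, shard in enumerate(resident_shards)
            if weight_hash(shard) not in registry]

# Toy usage: one registered checkpoint, one dump containing a rogue shard.
registry = build_registry({"model-a": [b"\x01" * 64, b"\x02" * 64]})
print(audit_dump([b"\x01" * 64, b"\xff" * 64], registry))  # -> [1], the unregistered shard
```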