dan.parshall

Karma: 58

dan.parshall 10 Jun 2026 16:03 UTC
1 point
0
on: Lighthaven East—A Feasibility Study
I’m in DC, and interested in being involved. Please contact me: dan@canaryinstitute.ai

dan.parshall 27 Apr 2026 13:44 UTC
1 point
2
in reply to: denkenberger’s comment on: Nectome: All That I Know
How much is that talked about by the cryonics companies? Normally “social proof” is a big deal, and having that as part of a FAQ would be very persuasive to the normies!

dan.parshall 20 Apr 2026 17:35 UTC
2 points
1
in reply to: RussellThor’s comment on: Reevaluating “AGI Ruin: A List of Lethalities” in 2026
I think it’s useful to have arguments that appeal to folks all across the political landscape, and I like this framing. I often use something like “think about how much variation there is even in ‘human nature’, and just how good or bad it can be; an artificial intelligence will have an artificial nature, and could have behaviors much weirder than we imagine”.

Interestingly, this seems to bite harder amongst conservatives and those with a religious worldview; they often have a dim view of human nature in the first place, and I think it gets them thinking about “something even worse than ‘made in the image of God’”. I hope this continues to be helpful, since AI is now becoming polarized.

dan.parshall 20 Apr 2026 17:30 UTC
2 points
1
in reply to: Jay Bailey’s comment on: Reevaluating “AGI Ruin: A List of Lethalities” in 2026
As I said in the original comment, I can certainly imagine minds that have other goals; possibly I have just overinterpreted statements like:
> If we consider a space of minds a million bits wide, then any argument of the form “Some mind has property $P$” has $2^{1000000}$ chances to be true and any argument of the form “No mind has property $P$” has $2^{1000000}$ chances to be false.

as implying that the distribution over such minds is likely to be uniform. Our current methods, which rely on imitation learning, are at least definitely not sampling from that space uniformly. Overall this makes me more optimistic that alignment may be tractable.

dan.parshall 19 Apr 2026 23:59 UTC
6 points
−15
on: Reevaluating “AGI Ruin: A List of Lethalities” in 2026
Nice review. One thing you didn’t directly address, but which has struck me learning more about AI training, is that the Orthogonality Thesis… doesn’t actually seem to be true? I mean, yes, I could imagine intelligences that loved other things for no reason, but the intelligences we seem to actually be making seem to be not insanely orthogonal! (although still far from perfectly aligned, but I’m hopeful nonetheless)

dan.parshall 19 Apr 2026 23:20 UTC
2 points
0
on: Carpathia Day
I appreciate you putting this here! I realized no one had ever archived the original, so I’ve done so. The permanent link is at
https://web.archive.org/web/20260419231740/https://mylordshesacactus.tumblr.com/post/813939696352772096/please-make-a-post-about-the-story-of-the-rms

dan.parshall 19 Apr 2026 23:11 UTC
21 points
5
in reply to: Karl Krueger’s comment on: Let goodness conquer all that it can defend
Name one other place or time in history where it was illegal to teach literacy!
Here are 3:
- In 1700s Ireland, it was illegal for Catholics to operate schools, teach, or send children abroad for education
- In Khmer Rouge Cambodia, all of the intelligentsia were executed and schools closed
- In Taliban Afghanistan, women have no ability to learn (beyond ~3rd grade, IIRC)

None of which puts the Antebellum South in good company, but I do want to push back on the commonly-held perception that it was uniquely bad; truly there is no new thing under the sun!

dan.parshall 19 Apr 2026 16:24 UTC
4 points
0
on: The Mirror Test Is Complicated
Their brains are highly specialised to recognise other fish’s bodies, and locate and remove parasites from them.
Haven’t read the original wrasse paper, but my understanding is that what made it a “pass” was that when presented with the mirror, the wrasses would clean their own bodies, rather than attempting to clean the wrasse in the mirror.
Overall, that finding pushed me in the direction this post argues; i.e. that it’s decent Bayes evidence in favor of self-awareness, but far from a slam-dunk.

dan.parshall 19 Apr 2026 0:09 UTC
1 point
0
in reply to: FlorianH’s comment on: What economists get wrong (and sometimes right!) about AI
If you’re aware of other preprints or publications taking TAI seriously, I would genuinely love to have citations! Makes it much easier to say these kinds of things to a policymaker when there’s a stack of supporting arguments.

dan.parshall 18 Apr 2026 23:38 UTC
5 points
0
on: Annoyingly Principled People, and what befalls them
Alice decides Principle X is important enough to make a big deal about.
People don’t seem to understand the issue. Alice explains it more. Some people maybe get it but then next week they seem to have forgotten. Other people still don’t get it.
This reminds me of a line from Shaw:
“The reasonable man adapts himself to the world: the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.”

I think one way to navigate the challenge this post points at, is to recognize, both privately and publicly, that it’s hard to be perfect all the time, we all falter, and sometimes we must choose our battles.

I think it’s important both to enable the sort of self-forgiveness that’s most needed by those prone to self-flagellation, and also to lower the overall temperature.

So I strive (sometimes even successfully!) to recognize that the things of great import to another person may, in fact, be truly important… while also remembering that the things of great import to me may not, in fact, be truly important.

I think this aligns with your willingness to own “yes, I’m not applying that because it’s too much tradeoff for me”. Because yes, sometimes that person’s unreasonableness is correct… and sometimes I can’t stomach the battle for it, even though I know that. But acknowledging “well, I need a mulligan on this, and I’ll try to give one elsewhere” makes the whole world slightly better off.

dan.parshall 16 Apr 2026 18:45 UTC
5 points
1
on: Daycare illnesses
(Quite confident) The most common illnesses (colds and flu) don’t build immunity in general (in kids or adults) because they mutate every year
Not my area, but it seems like the difference between “this year’s variant and last year’s” is going to be much, much smaller than the difference between “never exposed to any cold/flu before”.

So naively it seems possible that the first several colds do train the immune system quite a bit to handle colds in general, even if subsequent ones are then moving around to different points on the fitness landscape.

What economists get wrong (and sometimes right!) about AI

dan.parshall 16 Apr 2026 1:49 UTC
14 points
4 comments, 3 min read

dan.parshall 14 Apr 2026 0:37 UTC
2 points
0
in reply to: Mordechai Rorvig’s comment on: Publishing academic papers on transformative AI is a nightmare
It absolutely happens in things like “measuring corruption of public officials in $COUNTRY”; such things are silently dropped, and have to be inferred by reading between the lines. Apparently this is a long and proud tradition, at least in philosophy:
https://www.thepsmiths.com/p/joint-review-philosophy-between-the

dan.parshall 5 Apr 2026 12:56 UTC
2 points
0
in reply to: bjorkiscool113’s comment on: One way violinists fail
I’ve known a few, and that’s my impression as well, but I’m partly interested in the direction of causality. Is your impression that conditional on the songs being equally difficult, violinists would still be more neurotic?
If so, then I’m wondering if it has to do with the orchestral setting of playing in unison, where even a few cents difference is painfully obvious (at least to the other violinist)?

Or is it more the other way around, that violin calls to folks with a higher neurosis level?

dan.parshall 26 Mar 2026 19:40 UTC
1 point
0
in reply to: Zach Stein-Perlman’s comment on: How to do cost-effectiveness analysis for elections
So I did make a math mistake, but I think we’re in broad agreement. Let me be explicit for a race with total expected votes N=400 (e.g. seat on a city council for one district of a small town)

With N=400
sigma = sqrt(400 * 0.5 * 0.5) = 10
a 6-point lead means expected votes would be:
A : 212
B : 188
This corresponds to a win probability for A of cdf(12/10) ~88%

Changing one’s vote from A to B changes the expected counts to:
A : 211
B : 189
This corresponds to a win probability for A of cdf(11/10) ~86%
So yes, it’s only 2% change vs my earlier assertion of 8%, my mistake.
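
The arithmetic above can be reproduced with a short sketch. It assumes the same normal approximation used in the comment (each of the N votes treated as a fair coin flip, so sigma = sqrt(N * 0.5 * 0.5)); the function names `normal_cdf` and `win_probability` are illustrative, not from any library.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def win_probability(expected_votes_a: float, total_votes: int) -> float:
    """Probability that A ends up above half the votes.

    Assumes the normal approximation from the comment above:
    sigma = sqrt(N * 0.5 * 0.5), and A wins if the count exceeds N / 2.
    """
    sigma = math.sqrt(total_votes * 0.5 * 0.5)
    z = (expected_votes_a - total_votes / 2) / sigma
    return normal_cdf(z)


before = win_probability(212, 400)  # A expected 212, B expected 188
after = win_probability(211, 400)   # one voter switches from A to B
print(f"before={before:.3f} after={after:.3f} change={before - after:.3f}")
```

Under these assumptions the change comes out at roughly two percentage points, consistent with the corrected figure.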

But I think we agree that sigma matters! And my point is that in small local elections, sigma is small, and your vote counts for a lot!

I agree that if you only care about federal policy, this doesn’t apply (I’d missed that in the initial post). But if you care about libraries, or how aggressive the police are, those are local issues where someone can have a strong influence in policy.

dan.parshall 26 Mar 2026 18:47 UTC
0 points
0
on: How to do cost-effectiveness analysis for elections
Let me elaborate: I broadly agree with the framing here, in that the probability of flipping a vote is going to be related to the margin of the race; in a race decided by a couple hundred votes, a single vote-flip counts for 0.5%; far more than it does in a national election.… if you’re voting in an election where one candidate has a 6% edge, your vote has roughly a 1 in 12 chance of changing the outcome! That’s massive leverage that you can’t hope to replicate in larger elections.

The value (which I believe maps to “goodness”) of that vote flip is going to be related to:
- the budget over which the politician has leverage
- what fraction of that budget spend affects you
- their probability of listening to what you have to say

While the budget is smaller in absolute terms, in terms of how it affects you it basically remains constant with election scale. That is, the national budget is larger but spread over 300M people; a local election has a smaller budget spread over a smaller population, so the per-person impact is about the same.

Moreover, precisely because local politicians know that every vote counts, they’re much more responsive to constituents than state or national politicians.

Given that A & B are much larger in local elections, I think there’s a lot of value there. The notable exception is if the policy is made at a higher level of jurisdiction.

dan.parshall 26 Mar 2026 15:42 UTC
1 point
0
on: I’m confused by the change in the METR trend
Did the labs start getting much better about data cleanup around this time? I know the “Textbooks are all you need” paper was in mid-2023; depending on training cycle etc., I can also imagine that cleaner input improved agentic-specific skills. E.g., they started focusing on using TDD to make sure that the tests passed; this ties into the RLVR point, obviously.

dan.parshall’s Shortform

dan.parshall 26 Mar 2026 1:57 UTC
1 point
1 comment, 1 min read

dan.parshall 25 Mar 2026 16:37 UTC
2 points
0
on: One way violinists fail
Do fiddle players suffer from this as well? Or is the repertoire just so much easier?

dan.parshall 25 Mar 2026 15:37 UTC
2 points
1
on: Terrified Comments on Corrigibility in Claude’s Constitution
I think this is a great point:
(This last comes down to a property of high-dimensional geometry. Imagine that the “correct” specification of morality is 100 bits long, and that for every bit, any individual human has a probability of 0.1 of being a “moral mutant” along that dimension. The average human only has 90 bits “correct”, but everyone’s mutations are idiosyncratic: someone with their 3rd, 26th, and 78th bits flipped doesn’t see eye-to-eye with someone with their 19th, 71st, and 84th bits flipped, even if they both depart from the consensus. Very few humans have all the bits “correct”—the probability of that is $0.9^{100} \approx 2.7 \times 10^{-5}$—but Claude does, because everyone’s “errors” cancel out of the pretraining prior.)
I actually wrote a proposal specifically about how we could elicit exactly this information. Briefly, instead of generating a pair of ‘proposed responses’ and choosing between the two (which as a side effect probably encourages hallucination), you could take a single proposed response and show it to two reviewers (whether human or their designated agent). If you get two thumbs-up, use positive reinforcement; if you get two thumbs-down, use negative reinforcement (which helps punish truly horrible proposals); and a mixed signal could go to a reconciliation round, to “navigate” between the two perspectives.

The key is that if this is framed as an ongoing process, then one can make “navigate differing values” the anchor of identity, and then corrigibility isn’t “resistance to my values”; reconciling is the core AI value… (fingers crossed)

I think combined with a shift in how we imagine corrigibility, we might buy ourselves several more years. Happy to discuss further if you’re interested.
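
For a sense of scale, here is a minimal sketch of the toy model in the quoted passage, assuming the 100 bits flip independently with probability 0.1 each (the function names are illustrative only):

```python
def prob_all_bits_correct(n_bits: int = 100, p_flip: float = 0.1) -> float:
    """Chance that one individual has every bit "correct",
    assuming each bit flips independently with probability p_flip."""
    return (1.0 - p_flip) ** n_bits


def expected_correct_bits(n_bits: int = 100, p_flip: float = 0.1) -> float:
    """Average number of "correct" bits per individual under the same assumption."""
    return n_bits * (1.0 - p_flip)


print(f"P(all correct) = {prob_all_bits_correct():.2e}")
print(f"Expected correct bits = {expected_correct_bits():.0f}")
```

So under this assumption the average person has 90 bits “correct”, while only a few in every hundred thousand have all 100.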
