lemonhope

Karma: 1,729

lemonhope 20 Nov 2025 7:12 UTC
2 points
0
in reply to: cousin_it’s comment on: Orient Speed in the 21st Century
I think that the politicians are doing their best to prop up home prices and keep the pensions alive and make everything stay normal etc. Banks try to stop you from withdrawing a lot of money at once. The only problem is that we got used to growth and innovation and freedom and peace and comfort and security as the norm. It is deep in the culture… Scammers and wars usually come from across national borders; what you are asking for is not a simple easy clear thing; I don’t blame you for wishing.

lemonhope 20 Nov 2025 6:59 UTC
2 points
0
in reply to: JohnGreer’s comment on: Orient Speed in the 21st Century
This is really clear and good and I finally understand why I was so bad at basketball.

lemonhope 20 Nov 2025 6:55 UTC
2 points
0
on: Orient Speed in the 21st Century
This immediately came to my mind. And the exact opposite. Any movie recommendations?

lemonhope 16 Nov 2025 10:07 UTC
3 points
1
in reply to: James Diacoumis’s comment on: Questioning Computationalism
I think you are right. The original argument sounds dead to me.

lemonhope 14 Nov 2025 9:42 UTC
LW: 2 AF: 1
0
AF
in reply to: Wei Dai’s comment on: Please, Don’t Roll Your Own Metaethics
The WWDSC is nearly a consensus. Certainly a plurality.

lemonhope 12 Nov 2025 23:45 UTC
LW: 6 AF: 1
2
AF
on: Please, Don’t Roll Your Own Metaethics
Please just write the standard library!

lemonhope 29 Oct 2025 2:39 UTC
7 points
0
on: Mottes and Baileys in AI discourse
Florida man goes insane after accepting conclusion!

Texas woman cleverly navigates around the conclusion!

Raemon found on overpass shouting at cars “just accept the conclusion and deal with it, you can actually do that, that is actually fine, it is just a regular ol conclusion guys, you have dealt with annoying true facts before”

Record-setting interstate pileup as people swerve to avoid the conclusion!

lemonhope 14 Oct 2025 7:26 UTC
6 points
0
on: How AI Manipulates—A Case Study
TLDR: a big manipulation trick, maybe the biggest, is to prod people about their own identities and almost-forgotten memories, then get them to pick a stance, then get them to stand for it.

Watch out for: 1: “remember back to when you were a very young child” etc etc; 2: “now if you believe that then act on it” or “remember this next time”

Problem! Guarding against this also disables most coordination, acting on your principles, etc.! Solution: be reasonable and nonstupid, I guess?

lemonhope 7 Oct 2025 6:24 UTC
2 points
0
on: Gradual Disempowerment Monthly Roundup
The journalists who caught the “donate to all the political opponents of your enemies regardless of political party with a superPAC called Team America” strategy and brought it to the people’s attention did great work. Crypto bought half the elections in the country and got away with it! People should know!

Wait a minute… They actually published a tutorial! Crap!

lemonhope 7 Oct 2025 6:10 UTC
2 points
0
in reply to: R S’s comment on: You’re probably overestimating how well you understand Dunning-Kruger
Stop at the first elf plot. If people do not know what their skill level is (or what the future holds), then they guess values near the median. If you do badly (or roll the dice poorly), then you look overconfident / optimistic. This explains almost all of the DK effect.
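A rough sketch of that argument, not taken from the linked post: assume true skill percentiles are uniform and each person's self-estimate is mostly the median (0.5) plus a weak signal of their real skill and some noise. The function name `simulate_quartiles` and the parameters `shrink` and `noise` are made up for illustration.

```python
import random
import statistics


def simulate_quartiles(n=10_000, shrink=0.2, noise=0.1, seed=0):
    """Return [(mean actual percentile, mean self-estimate), ...] per quartile."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        actual = rng.random()  # true percentile, uniform on [0, 1)
        # Assumption: the guess sits near the median, with only a weak
        # dependence on true skill, plus Gaussian noise, clipped to [0, 1].
        guess = 0.5 + shrink * (actual - 0.5) + rng.gauss(0, noise)
        people.append((actual, min(1.0, max(0.0, guess))))
    people.sort()  # order by actual percentile
    size = n // 4
    result = []
    for i in range(4):
        chunk = people[i * size:(i + 1) * size]
        result.append((statistics.mean(a for a, _ in chunk),
                       statistics.mean(g for _, g in chunk)))
    return result


quartiles = simulate_quartiles()
for i, (actual, guess) in enumerate(quartiles, start=1):
    print(f"quartile {i}: actual {actual:.2f}, self-estimate {guess:.2f}")
```

Under these assumptions the bottom quartile's average self-estimate lands well above its actual percentile and the top quartile's lands below, with nobody in the model being biased about themselves.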

lemonhope 7 Oct 2025 5:52 UTC
3 points
1
in reply to: Josh Snider’s comment on: AI Science Companies: Evidence AGI Is Near
Hmm, people usually post an analysis of the link or the full text of the link, but not a sneak preview.

lemonhope 27 Sep 2025 9:54 UTC
4 points
0
on: AI Safety Isn’t So Unique
Great post! Nice to see something constructive! And half your citations are new to me. Thank you for sharing.

I have spent the last few months reinventing the wheel with LLM applications in various ways. I’ve been using my own code assistant for about 7 months. I did an alpha-evolve-style system for generating RL code that learns Atari. Last year I was trying some fancy retrieval over published/patented electronics circuits. Did some activation steering and tried my hand at KellerJordan/modded-nanogpt for a day. Of course before that I was at METR helping them set up evals.

It hasn’t occurred to me to try to draw any conclusions from all this different work, and I didn’t think of it really as inter-related in any significant way or relevant experience for much of anything, but your topic here is making me think...

Almost every “optimizing” system I make ends up breaking/glitching/cheating the score function. Then I patch and patch until it works, and by then it looks more like a satisficer.

Getting something really useful seems to take about a month of corrections like this. It looks done/working on the first day, I notice something broken and fix it and declare it done on the second day, etc, but after a month I just don’t have any more corrections to make. This is different from eg a web app or game which I never run out of todo items for. Of course when LLMs are involved you have to look three times more carefully to be sure you are measuring what you mean to be measuring.

My point is that I expect projects fitting your description here to basically actually work and be worthwhile, but if it is your (speaking to the anonymous reader) first time doing this, expect that you’ll spend 10x as long correcting/improving/balancing scores & heuristics as you’ll spend on the core functionality.

As you stated in the post, that’s not so different from the process used to make AI assistants (etc) in general.

Making my own AI tools has definitely given some depth/detail to all the theoretical problems I’ve been reading about and talking about all these years. Particularly it is impressive how long my tools have tricked me at times. It is possible I am still tricked right now.

lemonhope 25 Sep 2025 0:26 UTC
2 points
0
on: This is a review of the reviews
Since nobody seems to have posted it yet:

Riding a motorcycle for 60 years:

(1-1/800)^60=0.928

Sailing across the ocean every month for 60 years:

(1-(1/10000))^(60*12)=0.931
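A small sketch to check the two figures above, treating each year (or each crossing) as an independent trial with a fixed death probability. The helper name `survival` is made up here; the 1/800 and 1/10000 rates are the ones used in the comment.

```python
def survival(p_death_per_period, periods):
    """Chance of surviving every period, assuming independent periods."""
    return (1 - p_death_per_period) ** periods


motorcycle = survival(1 / 800, 60)        # one period per year
sailing = survival(1 / 10_000, 60 * 12)   # one period per monthly crossing

print(f"motorcycle: {motorcycle:.3f}")
print(f"sailing:    {sailing:.3f}")
```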

The sailing risk is probably overestimated. I have never met anyone who was lost at sea, never seen pictures of someone lost at sea, never heard back from the people who I thought might be lost at sea, and I’m sure to find shore soon and I think I have enough peanut butter for another week...

lemonhope 24 Sep 2025 23:21 UTC
27 points
2
on: lcmgcd’s Shortform
Two AI accelerants I had not noticed before:
1. Users with the hardest problems will use the smartest model available. So if you release your best model and your competitor only releases their 2nd smartest, then you will get better training data from users. Not just code, but vision and robotics too.
2. Among those affected by AI job loss are AI/ML devs/researchers themselves. The higher the hiring bar for AI devs, (A) the more applicants will put effort into learning stuff deeply and doing interesting useful novel work to get a job and (B) the easier it is to spot/snag unusual talent.

lemonhope 3 Sep 2025 9:37 UTC
4 points
0
on: Your LLM-assisted scientific breakthrough probably isn’t real
I can attest that you can fool yourself quite often (twice a week?) even if you just use a CLI, frequently switch models, and almost never have context/history. When you drop something in front of an LLM, it says “ok let’s make it work.” That’s a strong signal coming from a fellow human. Trips something in my brain I guess. I tried adding criticism and refusals to my CLI thing but it doesn’t work.

lemonhope 3 Sep 2025 9:20 UTC
2 points
0
on: The Cats are On To Something
This could work. I think the hard part is finding a meaningful way to simulate the environment so that the conclusions transfer to real life.

lemonhope 3 Sep 2025 9:02 UTC
−1 points
0
on: Simulating the *rest* of the political disagreement
I was at this weird party where everyone started drinking poison. I tried explaining it to them but I didn’t have any proof or anything and I said “look that’s poison sir” and “ma’am if you think there’s any chance I’m right you should stop drinking that” but no luck. They said “I’m thirsty” or “I already have this cup in my hand” or “maybe water is poison and this is antidote, you know more people die from drowning than random poisoning”. I realized I was failing to think about it from their perspective. If I had known all these people for years and this big party was basically what we’ve been planning for a while and it was blowing up online then I would definitely drink it too.

After all, the poison-is-actually-antidote guy could’ve also said “if there’s any chance I’m right you have to drink it right now”, which would clearly be just political maneuvering.

Anyways I miss those guys they were fun

lemonhope 23 Aug 2025 19:55 UTC
2 points
0
on: lcmgcd’s Shortform
With controlling a theoretical RL agent, what’s the problem with asking the AI to be 99% sure that it mopped 99% of the floor and then stop?

I remember that if you just ask for 99% floormop then the agent will spend forever getting 99.99999% sure that at least 99% is mopped, but I can’t remember the problem with this little patch.

lemonhope 15 Aug 2025 7:27 UTC
4 points
0
in reply to: Nathan Helm-Burger’s comment on: We can do better than DoWhatIMean
This resolved yes!

lemonhope 15 Aug 2025 7:14 UTC
2 points
0
in reply to: David Johnston’s comment on: MIRI’s “The Problem” hinges on diagnostic dilution
Ok it does seem like an example then. Thank you for spelling it out.