So the next post in the sequence addresses some elements of what happens when you do RL after supervised learning.
The conclusion (though it's similarly tentative to yours) is that RL will change the weights later in the network more, essentially giving us a system in which an agent uses the output of a simulator.
I don't think this is completely incompatible with your suggestion: this process requires simulatory circuits becoming agenty circuits (insofar as such circuits exist; it's entirely unclear to me whether these concepts make sense at the circuit level or whether they're emergent phenomena). I think it would be great if we could do more research on how exactly this process takes place.
I did do a very quick, loose experiment to check the idea that later weights tend to change at the end of training, while early weights tend to change early on. This was more of a sanity check than anything else, though.
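For what it's worth, here's roughly the kind of sanity check I have in mind (a minimal sketch, not my actual experiment): train a toy MLP and record how much each layer's weight matrix moves per epoch, relative to its norm, then eyeball whether early layers settle down sooner than later ones. The model, data, and hyperparameters here are all arbitrary placeholders.

```python
# Minimal sketch (not the original experiment): track how much each layer's
# weights move per epoch, to see whether later layers keep changing late in training.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression data
X = torch.randn(1024, 32)
y = torch.randn(1024, 1)

model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

def weight_snapshot(m):
    # Copy only the weight matrices (skip biases) so we can measure per-layer movement.
    return {name: p.detach().clone() for name, p in m.named_parameters() if p.dim() > 1}

for epoch in range(20):
    before = weight_snapshot(model)
    for i in range(0, len(X), 64):
        opt.zero_grad()
        loss = loss_fn(model(X[i:i+64]), y[i:i+64])
        loss.backward()
        opt.step()
    # Relative change of each layer's weights over this epoch.
    deltas = {name: (p.detach() - before[name]).norm().item() / before[name].norm().item()
              for name, p in model.named_parameters() if p.dim() > 1}
    print(f"epoch {epoch:2d}  " + "  ".join(f"{n}: {d:.4f}" for n, d in deltas.items()))
```

If the early layers' relative movement shrinks faster over epochs than the later layers', that's (weak) evidence for the pattern I described, though obviously a toy supervised setup like this says nothing directly about the RL-after-pretraining case.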