Tbh I’m not convinced an AI tuned for “anti-human chess” actually loses with knight-bishop-rook odds against a mediocre player, particularly if it can review the player’s games in advance. Mediocre club player, yes, but I’m not even completely certain of that. Stockfish plays this at a handicap because it models its opponent as just as capable as itself. (Google’s AI says two-piece odds bring Stockfish to about club level, and it seems plausible that play targeted at somebody’s weaknesses, clock pressure, forcing deep reads, etc. could get you another piece’s worth of advantage, or enough to at least make it contested for a few games.)
FeepingCreature
I’m not sure this argument holds at all. I think this is biased because we already know some people in the group have the ability to turn consideration into violence… Generally speaking, if somebody endorses torture in torture vs dust specks, or pulling the lever in the trolley problem, I would still not worry at all about them torturing me or arranging for me to be killed so that my organs can save two other people. Most moral consideration on big-picture problems occurs in a different universe from everyday behavior; otherwise many rationalists (maybe the median rationalist!) would be considered deeply unsafe people rather than the highly safe people they are in reality.
Do you get life insurance on MAiD? I’d be very surprised.
edit: well, the Google AI agent says yes, you actually do! Color me impressed. Guess they trust the legal checks on that.
I think what is actually happening is “yes, all the benchmarks are inadequate.” In humans, those benchmarks correlate with a particular kind of ability we may call ‘able to navigate society and to, in some field, improve it.’ Top-of-the-line AIs still routinely delete people’s home dirs and cannot run a profitable business even if extremely handheld. AIs have only really started this year to convincingly contribute to software projects outside of toys. There are still many software projects that could never be created by even a team of AIs all running in pro mode at 100x the cost of living of a human. Benchmarks are fundamentally an attempt to measure a known cognitive manifold by sampling it at points. What we have learnt in these years is that it is possible to build an intelligence that has a much more fragmented cognitive manifold than humans do.
This is what I think is happening. Humans use maybe a dozen strong generalist strategies with diverse modalities that are evaluated slowly and then cached. LLMs use one (backprop on token prediction) that is general enough to generate hundreds of more-or-less-shared subskills. But that means that the main mechanism that gives these LLMs a skill in the first place is not evaluated over half its lifetime. As a consequence of this, LLMs are monkey paws: they can become good at any skill that can be measured, and in doing so they demonstrate to you that the skill that you actually wanted (the immeasurable one that you hoped the measurable one would provide evidence towards) actually did not benefit nearly as much as you hoped.
It’s strange how things worked out. Decades of goalshifting, and we have finally created a general, weakly superhuman intelligence that is specialized towards hitting marked goals and nothing else.
I think he just objected to the phrasing. I do think “set up a system where people can be banned by others whom Said does not instruct on who to ban” is a stretch for “Said bans people from DSL.”
I have generally found Said to mean the things he says quite literally and to expect others to do so as well. It’s painful to read a conversation where one person keeps assigning subtext to another who quite clearly never intended to put it there.
Sidechannel note: Said wishes it to be known that he neither bans people from DSL nor customarily has the right to, the task being delegated to moderators rather than the sysop. (https://share.obormot.net/textfiles/MINHjLX7)
I’m very good friends with someone who is persistently critical and it has imo largely improved my mental health, fwiw, by forcing me to construct a functioning and well-maintained ego which I didn’t really have before.
I think in my whole life I have once seen a person come back because another person left, and they didn’t stay long anyway. Broadly speaking I don’t think this ever works.
I think this only works if your standards for posts are in sync with those of the outside world. Otherwise, you’re operating under incompatible status models and cannot sustain your community standards against outside pressure; you will always be outcompeted by the outside world (who can pretty much always offer more status than you can simply by volume) unless you can maintain the worth of your respect, and you cannot do that by copying outside appraisal.
I think you failed to establish that the long, well-written and highly-upvoted critiques lived in the larger LW archipelago, so there’s a hole in your existence proof. On that basis, I would surmise that on priors Said assumed you were referring to comments or on-site posts.
Sounds like you should create PokemonBench.
I don’t understand it but it does make me feel happy.
Haven’t heard back yet...
edit: Heard back!
Okay, I’ll do that, but why do I have to send an email...?
Like, why isn’t the how-to just in a comment? Alternatively, why can’t I select Lightcone as an option on Effektiv-Spenden?
Unless there’s some legal reason, this seems like a weird unforced own-goal.
Any news on this?
Original source, to my knowledge. (July 1st, 2014)
“So long, Linda! I’m going to America!”
Human: “Look, can’t you just be normal about this?”
GAA-optimized agent: “Actually-”
Hm, I guess this wouldn’t work if the agent still learns an internalized RL methodology? Or would it? Say we have a base model: not much need for GAA there, because it’s just doing token prediction. Then we go into some sort of (distilled?) RL-based CoT instruct tuning. GAA means it picks up abnormal rewards from the signal more slowly, i.e. it doesn’t do the classic boat-spinning-in-circles thing (good test?). But if it internalizes RL at some point, its mesaoptimizer wouldn’t be so limited, and that’s a general technique, so GAA wouldn’t prevent it? Still, seems like a good first line of defense.
The issue is, from a writing perspective, that a positive singularity quickly becomes both unpredictable and unrelatable, so that any hopeful story we could write would, inevitably, look boring and pedestrian. I mean, I know what I intend to do come the Good End, for maybe the next 100k years or so, but probably a five-minute conversation with the AI will bring up many much better ideas, being how it is. But … bad ends are predictable, simple, and enter a steady state that is very easy to describe.
A curve that grows and never repeats is a lot harder to predict than a curve that goes to zero and stays there.
It’s a historical joke. The quote is from the emails. I think attributing it to Lenin references the degree to which the original communists were sidelined by Stalin, a more pedestrian dictator; presumably in reference to Sam Altman.
I genuinely think (and note that this is just a personal aggregate view, and that I don’t have cites for this, but I’ve known several people like this) that the default mode of engagement of some people in this community is one that can be hurtful or can be interpreted as aggressive, neglectful, spiteful or trolling. I think that you need a baseline of cognitive/social modeling skill to avoid this, and that people often don’t have this. From what I saw of Said, on this forum and other places, he says exactly what he thinks, but is often completely unable to read or predict the second-order implications of his communications, and in fact is so unable to perceive them that he doesn’t even see that there’s something he’s not seeing. I’ve seen this pattern multiple times in the discussion.
He’s generally perfectly honest about the exact words, but someone with “normal-ISI communication” sees second-order inferences as so obvious that they’re also interpreted as a deliberate part of the communication, and this is just never the case with Said.
I’m not saying this to excuse him, or to excuse Habryka, but I think it’s impossible to understand this if you don’t have the model of “Said’s way of speaking has a factor that raises antipathy against him, that is impossible to understand if you don’t have very high instinctual social intelligence (way more than me, to be clear) and impossible to even notice if you have low instinctual social intelligence, and this factor causes people with more social perception to grasp for reasons why he is in the wrong. This is the actual reason for the ban.”