TekhneMakre

Karma: 3,232

TekhneMakre 24 Apr 2023 0:41 UTC
69 points
67
on: Contra Yudkowsky on AI Doom
Biology is incredibly efficient, and generally seems to be near pareto-optimal.
This seems really implausible. I’d like to see a debate about this. E.g. why can’t I improve on heat by having super-cooled fluid pumped throughout my artificial brain; doesn’t having no skull-size limit help a lot; doesn’t metal help; doesn’t it help to not have to worry about immune system stuff; doesn’t it help to be able to maintain full neuroplasticity; etc.

TekhneMakre 5 Oct 2023 22:34 UTC
LW: 45 AF: 27
24
AF
in reply to: Matthew Barnett’s comment on: Evaluating the historical value misspecification argument
Without digging in too much, I’ll say that this exchange and the OP is pretty confusing to me. It sounds like MB is like “MIRI doesn’t say it’s hard to get an AI that has a value function” and then also says “GPT has the value function, so MIRI should update”. This seems almost contradictory.

A guess: MB is saying “MIRI doesn’t say the AI won’t have the function somewhere, but does say it’s hard to have an externally usable, explicit human value function”. And then saying “and GPT gives us that”, and therefore MIRI should update.

And EY is blobbing those two things together, and saying neither of them is the really hard part. Even having the externally usable explicit human value function doesn’t mean the AI cares about it. And it’s still a lot of bits, even if you have the bits. So it’s still true that the part about getting the AI to care has to go precisely right.

If there’s a substantive disagreement about the facts here (rather than about the discourse history or whatever), maybe it’s like:

Straw-EY: Complexity of value means you can’t just get the make-AI-care part to happen by chance; it’s a small target.

Straw-MB: Ok but now we have a very short message pointing to roughly human values: just have a piece of code that says “and now call GPT and ask it what’s good”. So now it’s a very small number of bits.
What links here?
- Alignment Implications of LLM Successes: a Debate in One Act by Zack_M_Davis (21 Oct 2023 15:22 UTC; 238 points)

TekhneMakre 22 Nov 2022 0:06 UTC
45 points
27
in reply to: Valentine’s comment on: Here’s the exit.
You write in a gaslighty way, trying to disarm people’s critical responses to get them to accept your frame. I can see how that might be a good thing in some cases, and how you might know that’s a good thing in some cases. E.g. you may have seen people respond some way, and then reliably later say “oh this was XYZ and I wish I’d been told that”. And it’s praiseworthy to analyze your own suffering and confusion, and then explain what seem like the generators in a way that might help others.

But still, trying to disarm people’s responses and pressure them to accept your frame is a gaslighting action and has the attendant possible bad effects. The bad effects aren’t like “feel quite so scared”, more like having a hostile / unnatural / external / social-dominance narrative installed. Again, I can see how a hostile narrative might have defenses that tempt an outsider to force-install a counternarrative, but that has bad effects. I’m using the word “gaslighting” to name the technical, behavioral pattern, so that its common properties can be more easily tracked; if there’s a better word that still names the pattern but is less insulting-sounding I’d like to know.
A main intent of my first comment was to balance that out a little by affirming simple truths from outside the frame you present. I don’t view you as open to that sort of critique, so I didn’t make it; but if you’re interested I could at least point at some sentences you wrote.
ETA: Like, it would seem less bad if your post said up front something more explicit to the effect of: “If you have such and such properties, I believe you likely have been gaslighted into feeding the doomsday cult. The following section contains me trying to gaslight you back into reality / your body / sanity / vitality.” or something.
What links here?
- TekhneMakre's comment on Here’s the exit. by Valentine (22 Nov 2022 3:35 UTC; 5 points)

TekhneMakre 16 Nov 2021 8:34 UTC
LW: 44 AF: 15
AF
in reply to: evhub’s comment on: A positive case for how we might succeed at prosaic AI alignment
Certainly it doesn’t matter what substrate the computation is running on.
I read Yudkowsky as positing some kind of conservation law. Something like, if the plans produced by your AI succeed at having specifically chosen far-reaching consequences if implemented, then the AI must have done reasoning about far-reaching consequences. Then (I’m guessing) Yudkowsky is applying that conservation law to [a big assemblage of myopic reasoners which outputs far-reaching plans], and concluding that either the reasoners weren’t myopic, or else the assemblage implements a non-myopic reasoner with the myopic reasoners as a (mere) substrate.
Reasoning correctly about far-reaching consequences by default (1) has mistargeted consequences, and (2) is done by summoning a dangerous reasoner.
Such optimizers can still end up producing actions with far-reaching consequences on the world if they deploy their optimization power in the service of an objective like imitating HCH that requires producing actions with particular consequences, however.
I think what you’re saying here implies that you think it is feasible to assemble myopic reasoners into a non-myopic reasoner, without compromising safety. My possibly straw understanding is that the way this is supposed to happen in HCH is that, basically, the humans providing the feedback train the imitator(s) to implement a collective message-passing algorithm that answers any reasonable question or whatever. This sounds like a non-answer, i.e. it’s just saying “...and then the humans somehow assemble myopic reasoners into a non-myopic reasoner”. Where’s the non-myopicness? If there’s non-myopicness happening in each step of the human consulting HCH, then the imitator is imitating a non-myopic reasoner and so is non-myopic (and this is compounded by distillation steps). If there isn’t non-myopicness happening in each step, how does it come in to the assembly?

TekhneMakre 29 Nov 2021 20:38 UTC
43 points
in reply to: AnnaSalamon’s comment on: Frame Control
I think I agree with ~everything in your two comments, and yet reading them I want to push back on something, not exactly sure what, but something like: look, there’s this thing (or many things with a family resemblance) that happens and it’s bad, and somehow it’s super hard to describe / see it as it’s happening.… and in particular I suspect the easiest, the first way out of it, the way out that’s most readily accessible to someone mired in an “oops my internal organs are hooked up to a vampiric force” situation, does not primarily / mainly involve much understanding or theorizing (at least given our collective current level of understanding about these things), and rather involves something with a little more of “wild” vibe, the vibe of running away, of suddenly screaming NO, of asserting meaningful propositions confidently from a perspective, etc. And I get some of this vibe from the OP; like part of the message is (what I’m interpreting to be) the stance someone takes when calling something “frame control” (or “gaslighting” or “emotional abuse” or “cult” or what-have-you).

Which, I still agree with the things you say, and the post does make lots of sort-of-specific, sort-of-vague claims, and gives good data with debatable interpretation, and so on. But there’s also this sort of necessarily pre-theoretic theoretic action happening, and I guess I want to somehow have that [hypothesis mixed with judgement mixed with action] be possible as well, including in the common space. (Like, the action is theoretic in that you’re reifying some pattern (e.g. “frame control”). It’s almost necessarily pre-theoretic, in the sense that you don’t even close to fully understand it and it’s probably only very roughly joint-carving, because the pattern itself involves making you confused about what’s happening and less able to clearly understand patterns. It’s an action, a judgement that something is really seriously wrong and you need to change it, a mental motion that rejects something previously accepted, that catapults you out of a satisficing basin; and you’re doing this action in a way that somewhat strongly depends or is helped by the non-joint-carving unrefined concept, like “this thing, IDK what it is really, but it’s really bad and I have to get out of it, and after escaping I’ll think about it more”.)

I see ~~you~~ your comments as partly rejecting, or at least incidentally pushing against, this sort of action: to “do it in a way that telegraphs the early-stage-ness” is, when speaking from a pre-theoretic standpoint, in tension with the vibe/action of sharply reclaiming one’s own perspective even when that perspective is noticeably incoherent (“something was deeply wrong, I don’t know what”). Like, it’s definitely a better artifact if you put in the right epistemic tags that point towards uncertainty, places to refine and investigate, etc.; but that’s harder to do and requires the author to be detailedly tracking a more complicated boundary around known and unknown, in a way that’s, like, not the first mental motion that (AFAIK) has to happen to get the minimum viable concept to self-coordinate on a narrative that says the thing is bad. Internally coordinating on a narrative that X-whatever-it-is is bad, seems important if you’re going to have to first push against X in big ways, before it’s very feasible to get a better understanding of X. (There’s bucket errors here, and it could be helpful to clarify that; but that’s maybe sort of the point: someone who’s been given a heavy dose of frame control is bucket-errored such that they doubt the goodness of holding their own perspective in part because it’s been tied up with other catastrophic things such as disagreeing with their social environment without having a coherent alternative or a coherent / legible grounds for disagreeing.)

Are humans misaligned with evolution?

TekhneMakre and jacob_cannell

19 Oct 2023 3:14 UTC

42 points

13 comments · 18 min read · LW link

TekhneMakre 21 Nov 2022 18:52 UTC
41 points
27
on: Here’s the exit.
Neither up- nor down-voted; seems good for many people to hear, but also is typical mind fallacying / overgeneralizing. There’s multiple things happening on LW, some of which involve people actually thinking meaningfully about AI risk without harming anyone. Also, by the law of equal and opposite advice: you don’t necessarily have to work out your personal mindset so that you’re not stressed out, before contributing to whatever great project you want to contribute to without causing harm.

TekhneMakre 12 Oct 2023 5:10 UTC
38 points
16
on: Evolution Solved Alignment (what sharp left turn?)
You write:

The utility function is fitness: gene replication count (of the human defining genes)[1]. And by this measure, it is obvious that humans are enormously successful. If we normalize so that a utility score of 1 represents a mild success—the expectation of a typical draw of a great apes species, then humans’ score is >4 OOM larger, completely off the charts.[2]

Footnote 1 says:

Nitpick arguments about how you define this specifically are irrelevant and uninteresting.

Excuse me, what? This is not evolution’s utility function. It’s not optimizing for gene count. It does one thing, one thing only, and it does it well: it promotes genes that increase their RELATIVE FREQUENCY in the reproducing population.

The failure of alignment is witnessed by the fact that humans very very obviously fail to maximize the relative frequency of their genes in the next generation, given the opportunities available to them; and they are often aware of this; and they often choose to do so anyway. The whole argument in this post is totally invalid.
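
To make the count-versus-frequency distinction concrete, here is a minimal sketch. It is not from the original exchange: the population numbers and the `relative_frequency` helper are made up purely for illustration. It shows a gene whose absolute copy count grows between generations while its relative frequency in the reproducing population falls.

```python
def relative_frequency(carriers: int, population: int) -> float:
    """Fraction of the reproducing population that carries the gene."""
    return carriers / population


# Hypothetical numbers, chosen only to illustrate the distinction.
gen0 = {"carriers": 100, "population": 1_000}
gen1 = {"carriers": 150, "population": 3_000}

# Absolute count goes up from one generation to the next...
count_grew = gen1["carriers"] > gen0["carriers"]

# ...while the share of the population carrying the gene goes down.
freq0 = relative_frequency(**gen0)
freq1 = relative_frequency(**gen1)
frequency_fell = freq1 < freq0

print(f"absolute count: {gen0['carriers']} -> {gen1['carriers']} (grew: {count_grew})")
print(f"relative frequency: {freq0:.3f} -> {freq1:.3f} (fell: {frequency_fell})")
```

Under these assumed numbers the gene "succeeds" by raw count while losing ground by relative frequency, which is the gap between the two measures that the comment points at.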

TekhneMakre 25 Feb 2023 3:16 UTC
38 points
15
on: Sam Altman: “Planning for AGI and beyond”
It sounds nice, but also, it sounds like he’s just not taking seriously that the alignment problem might be more difficult than an engineering problem that you can solve fairly easily if only you’re given access to predecessor systems, and not taking seriously that an AGI can tip over into fast takeoff for inscrutable internal reasons.

TekhneMakre 14 Apr 2023 4:57 UTC
36 points
23
on: A freshman year during the AI midgame: my approach to the next year
I think there’s something wrong with your categories: they’re all about social perception. There’s some reason for these to be correlated with the reality, but not that strong of a reason. People can be confused in either direction about what sort of AI is coming soon, and confusing people’s sense of what sort of AI is coming soon with what actual AI is coming soon would suggest bad plans.

TekhneMakre 2 Dec 2021 12:41 UTC
32 points
3
on: Morality is Scary

> All the rest is an act of shared imagination. It’s a dream we weave around a status game.
> They’re part of the dream of reality in which they exist, a dream that feels no less obvious and true to them than ours does to us.
> Moral ‘truths’ are acts of imagination. They’re ideas we play games with.

IDK, I feel like you could say the same sentences truthfully about math, and if you “went with the overall vibe” of them, you might be confused and mistakenly think math was “arbitrary” or “meaningless”, or doesn’t have a determinate tendency, etc. Like, okay, if I say “one element of moral progress is increasing universalizability”, and you say “that’s just the thing your status cohort assigns high status”, I’m like, well, sure, but that doesn’t mean it doesn’t also have other interesting properties, like being a tendency across many different peoples; like being correlated with the extent to which they’re reflecting, sharing information, and building understanding; like resulting in reductionist-materialist local outcomes that have more of material local things that people otherwise generally seem to like (e.g. not being punched, having food, etc.); etc. It could be that morality has tendencies, but not without hormesis and mutually assured destruction and similar things that might be removed by aligned AI.

[Question] Does Braess’s paradox show up in markets?

TekhneMakre 29 Dec 2021 12:09 UTC

31 points

4 comments · 2 min read · LW link

TekhneMakre 22 Nov 2022 1:00 UTC
31 points
19
in reply to: Valentine’s comment on: Here’s the exit.
(Mainly for third parties:)
I don’t care about people accepting my frame.
I flag this as probably not true.
Frankly, lots of folk here are bizarrely terrified of frames. I get why; there are psychological methods of attack based on framing effects.
It’s the same sort of thing your post is about.
Might have filtered folk well early on and helped those for whom it wasn’t written relax a bit more.
I flag this as centering critical reactions being about the reacters not being relaxed, rather than that there might be something wrong with his post.

TekhneMakre 8 Sep 2023 4:58 UTC
30 points
16
in reply to: Unreal’s comment on: Sharing Information About Nonlinear
Alternative theory: Alice felt on thin ice socially + professionally. When she was sick she finally felt she had a bit of leeway and therefore felt even a little willing to make requests of these people who were otherwise very “elitist” wrt everyone, somewhat including her. She tries to not overstep. She does this by stating what she needs, but also in the same breath excusing her needs as unimportant, so that the people with more power can preserve the appearance of not being cruel while denying her requests. She does this because she doesn’t know how much leeway she actually has.

Unfortunately this is a hard-to-falsify theory. But at a glance it seems consistent, and I think it’s also totally a thing that happens.

TekhneMakre 18 Oct 2021 11:00 UTC
30 points
in reply to: Viliam’s comment on: My experience at and around MIRI and CFAR (inspired by Zoe Curzi’s writeup of experiences at Leverage)
you publicly describe your suffering as a way to show people that MIRI/CFAR is evil.
Could you expand more on this? E.g. what are a couple of sentences in the post that seem most to be trying to show this?
Because it seems like you call it bad when you attribute it to MIRI/CFAR, but when other people suggest that Vassar was responsible, then it seems a bit like no big deal, definitely not anything to blame him for.
I appreciate the thrust of your comment, including this sentence, but also this sentence seems uncharitable, like it’s collapsing down stuff that shouldn’t be collapsed. For example, it could be that the MIRI/CFAR/etc. social field could set up (maybe by accident, or even due to no fault of any of the “central” people) the conditions where “psychosis” is the best of the bad available options; in which case it makes sense to attribute causal fault to the social field, not to a person who e.g. makes that clear to you, and therefore more proximally causes your breakdown. (Of course there’s disagreement about whether that’s the state of the world, but it’s not necessarily incoherent.)
I do get the sense that jessicata is relating in a funny way to Michael Vassar, e.g. by warping the narrative around him while selectively posing as “just trying to state facts” in relation to other narrative fields; but this is hard to tell, since it’s also what it might look like if Michael Vassar was systematically scapegoated, and jessicata is reporting more direct/accurate (hence less bad-seeming) observations.

Interpersonal alignment intuitions

TekhneMakre 23 Feb 2023 9:37 UTC

29 points

18 comments · 2 min read · LW link

TekhneMakre 19 Dec 2021 7:47 UTC
29 points
in reply to: Daniel Kokotajlo’s comment on: In Defense of Attempting Hard Things, and my story of the Leverage ecosystem
[Not really responding to your comment, just saying something sparked while reading it]
I wish that there were more things sort of like Leverage, honestly, given my current info. I’d remove the leader worship and the cosmic-battle stuff and the information suppression and the sleeping with subordinates, but I’d keep the living together, the dogfooding, the intense mental exploration, the jargon, the (non-anti-epistemic aspects of) exceptionalism. I’d add concrete stuff like making physical products, or computer code that does obviously interesting or useful stuff according to other people, or math proofs that academic mathematicians think are correct and interesting, etc. My sense is that these groups don’t include that stuff because it’s too, like, menial or non-abstract or something. Like if you’re trying to save the world, “of course” it’s mostly a waste of time to make a physical product that people want to buy; you should be spending all your time chewing on highly abstract high leverage questions about minds that have effects that radiate out into everything and determine the course of all future wise leadership etc. etc., like OAK/MAPLE or Circlers or Leverage. Which isn’t totally silly, but also yuck. 90% object, 10% meta, maybe 20% if you’re careful.

TekhneMakre 22 Oct 2021 19:44 UTC
29 points
in reply to: Ben Pace’s comment on: My experience at and around MIRI and CFAR (inspired by Zoe Curzi’s writeup of experiences at Leverage)
1. Thank you for all your effort to make LW valuable.
2. I think there’s something pretty valuable about this particular comment of Ilya’s. I’m not tracking all the tradeoffs, thinking through what it would be like if every comment this rude and judgemental were allowed, etc.; so I’ll just try to say what I think is valuable about it, without trying to make an overall judgement. It’s something like, from inside [the thing Ilya is calling a cult, insofar as it’s a thing at all], we’re at risk of feedback loopy dynamics. For example, (mis)information cascades, where people keep updating on each other’s judgements (which are only discretized summaries of previous evidence), rather than updating on each other’s observations exactly once (which would be harder). For example, narrative pyramid schemes, where stories about where a group will put its effort gain political capital in a way disconnected from object-level evaluations of consequences of plans. For example, fear of retribution materializing out of nothing, by people seeing each other act as though they’re afraid of retribution and inferring that they themselves have something to fear.
So, these feedback dynamics are bad, and also very natural. It seems valuable to have some Ilyas, who are rightly viewed as having some weight by our own supposed values, and who will break all frames of political “respect”. Frames of political respect sometimes become mechanisms for propagating pyramid schemes, and sometimes cause people to infer that those around them are deferring out of fear and so the leaders are to be feared rather than reasoned about. So, political frames contribute to mostly bad feedback dynamics, and Ilyas break political frames.
Speaking more phenomenally and less theoretically: sometimes an Ilya says something that gives me a “jolt”, and then I seem to suddenly have more access to peeking behind certain things, or being able to occupy, maybe temporarily, an outlook that’s like “a different Normal”. And this feels basically good to me, like it seems less like I’m being tugged around, and more like I’ve jumped to another spot and now I can get triangulation / parallax on more things.

TekhneMakre 30 Dec 2023 18:46 UTC
28 points
23
on: If Clarity Seems Like Death to Them
I certainly haven’t read even a third of your writing about this. But… I continue to not really get the basic object-level thing. Isn’t it simply factually unknown whether or not there’s such a thing as men growing up with brains that develop like female brains? Or is that not a crux for anything?

Separately, isn’t the obvious correct position simply: there’s a bunch of objective stuff about the differences between men and women; there’s uncertainty about exactly how these clusters overlap / are violated in real life, e.g. as described in the previous paragraph; and separately there’s a bunch of conduct between people that people modulate depending on whether they are interacting with a man or a woman; and now that there are more people openly not falling neatly into the two clusters, there’s some new questions about conduct; and some of the conduct questions involve factual questions, for which calling a particular XY-er a woman would be false, and some of the conduct questions involve factual questions (e.g. the brain thing) for which calling a particular XY-er a woman would be true, and some of the conduct questions are instead mainly about free choices, like whether or not to wear a dress or whatever?

I mean, if person 1 is using the word “he” to mean something like “that XY-er”, then yeah, it’s false for them to say “he” of an XX-er. If person 2 is using the word “he” to mean something like “that person, who wants to be treated in the way that people usually treat men”, then for some XX-ers, they should call the XX-er “he”. This XX-er certainly might seek to deceive person 1; e.g. if the XX-er wants to be treated by person 1 the way person 1 treats XY-ers, and person 1 does not want to treat this XX-er that way, but would treat the XX-er this way if they don’t know the XX status, then the XX-er might choose to have allies say “he” in order to deceive person 1. But that’s not the only reason. One can imagine simply that everyone is like person 2; then an XX-er asking to be called “he” is saying something like “I prefer to not be flirted with by heterosexual men; I’d like people to accurately expect me to be more interested in going to a hackathon rather than going to a mall; etc.”, or something. I mean, I’m not at all saying there’s no problem, but… It’s not clear (though again, I didn’t read your voluminous writing on this carefully) who is saying what that’s wrong… Like, if there’s a bunch of conventional conduct that’s tied up with words, then it’s not just about the words’ meaning, and you have to actually do work to separate the conduct from the reference, if you want them to be separate.

TekhneMakre 23 Oct 2021 21:15 UTC
28 points
on: Zoe Curzi’s Experience with Leverage Research
Off the cuff thoughts from me listening to the Twitch conversation between Anna and Geoff:
- I think Geoff, more than he’s seeing clearly, disagrees or at least in the past disagreed with the claim that using narratives to boost morale—specifically, deemphasizing information that contradicts a narrative plan—is basically just bad in the long run. Would be better to have deeper understanding of what morale is.
- Geoff describes being harmed by some sort of initial rejection by the rationality/EA community (around 2011? 2010?). This suggests, to me, a (totally conjectural!) story where he got into an escalating narrative cold war with the rationality community: first he perceives (possibly correctly) that the community rejects him, and thereby cuts off his ability to work with people for projects he thinks are good; then, he corrects for this with narrative pushback—basically, firmly reemphasizing his positive vision or whatever. Then people in the community sense this as narrative distortion / deception, and react (more or less consciously) with further counter-distortion. (Where the mechanism is like, they sense something fishy but don’t know how to say “Geoff is slightly distorting things about Leverage’s plans”, so instead they want people to just not work with Geoff; but they can’t just tell people to do that, so they distort facts about Geoff/Leverage to cause others to take their preferred actions; etc.)
- [ETA: sorry for all the caveats… specifically, I do use judgy language, but don’t endorse the judgements, but don’t want to change the language.] [The following if taken as a judgement is very harsh and basically unfair, and it would suck to punish Geoff for having conversations like this. So please don’t take it as a judgement. I want to get a handle on what’s up with Geoff, so I want to describe his behavior. Maybe this is bad, LMK if you think so.] It was often hard to listen to Geoff. He seemed to talk in long, apparently low content sentences with lots of hemming and hawing and attention to appearance, and lots of very general statements that seemed to not address precisely the topic. (Again this is unfairly harsh if taken as a judgement, and also he was talking in front of 50 people, sort of.)
- Anna says there were in the early 2010s rumors that Leverage was trying to fundraise from “other people’s donors”. And that Leverage/Geoff was trying to recruit, whether ideologically or employfully, employees of other EA/rationality orgs.
- I didn’t hear anything that strongly confirms or denies adversarial hypotheses like “Geoff was fairly actively doing something pretty distortiony in Leverage that caused harm, and is sort of hiding this by downplaying / redirecting attention / etc.”.
- Broadly it would be really good to understand better how to have world-saving narratives and such, especially ones that can recruit and retain political will if they really ought to, without narrative fraud / information cascades / etc.
