Shoshannah Tekofsky

Karma: 624

Shoshannah Tekofsky 4 Oct 2022 8:17 UTC
22 points
−4
on: Humans aren’t fitness maximizers
I disagree that humans don’t optimize IGF:
1. We seem to have different observational data. I do know some people who make all their major life decisions based on quality and quantity of offspring. Most of them are female but this might be a bias in my sample. Specifically, quality trades off against quantity: waiting to find a fitter partner and thus losing part of your reproductive window is a common trade-off. Similarly, making sure your children have much better lives than you by making sure your own material circumstances (or health!) are better is another. To be fair, they seem to be a small minority currently but I think that is due to point 3 and would be rectified in a more constant environment.
2. A lot of our drives do indirectly help IGF. Your aesthetic sense may be somewhat wired to your ability to recognize and enjoy the visual appearance of healthy mates. Similarly for healthy environments to grow up in, etc. Sure, it gets hijacked for 20 other things, but how big is the loss in IGF to keep it around? I would argue it’s generally not an issue for the subsection of humans that are directly driven to have big families.
3. Many of us have badly optimized drives because our environments have changed too fast. It will take a few generations of constant environment (not gonna happen at our current level of technological progress) to catch up. The obvious example is birth control: sex drive used to actually be a great proxy signal to optimize on offspring. Now it no longer is, but we still love sex. But in a few generations the only people alive will be the descendants of people who wanted kids no matter their sex drive. ‘Evolution’ will now select directly on desire for kids but it takes a while to catch up.
I’m not saying evolution optimized us very well, but I don’t think it’s accurate to say that we are not IGF maximizers. The environment has just changed much too quickly and selection pressure has been low the last few generations, but things like birth control actually introduce a new selection pressure on drive to reproduce. Humans are mediocre IGF maximizers in an environment that is changing unusually fast.

Shoshannah Tekofsky 16 Jul 2022 23:31 UTC
18 points
3
on: All AGI safety questions welcome (especially basic ones) [monthly thread]
Thanks for doing this!

I was trying to work out how the alignment problem could be framed as a game design problem and I got stuck on this idea of rewards being of different ‘types’. Like, when considering reward hacking, how would one hack the reward of reading a book or exploring a world in a video game? Is there such a thing as ‘types’ of reward in how reward functions are currently created? Or is it that I’m failing to introspect on reward types and they are essentially all the same pain/pleasure axis attached to different items?

That last explanation seems hard to resolve with the huge difference in qualia between different motivational sources (like reading a book versus eating food versus hugging a friend… These are not all the same ‘type’ of good, are they?)

Sorry if my question is a little confused. I was trying to convey my thought process. The core question is really:

Is there any material on why ‘types’ of reward signals can or can’t exist for AI and what that looks like?

Shoshannah Tekofsky 2 Jul 2022 20:47 UTC
8 points
0
in reply to: Rob Bensinger’s comment on: Naive Hypotheses on AI Alignment
Thank you! And adding that to my reading list :D

Shoshannah Tekofsky 7 Jan 2023 22:01 UTC
7 points
2
on: Looking for Spanish AI Alignment Researchers
There is an EU telegram group where they are, among other things, collecting data on where people are in Europe. I’ll DM an invite.

Shoshannah Tekofsky 6 Oct 2022 19:37 UTC
7 points
0
in reply to: gwern’s comment on: Some humans are fitness maximizers
On further reflection, I changed my mind (see title and edit at top of article). Your comment was one of the items that helped me understand the concepts better, so just wanted to add a small thank you note. Thank you!

Shoshannah Tekofsky 6 Oct 2022 18:55 UTC
7 points
0
in reply to: TekhneMakre’s comment on: Some humans are fitness maximizers
I wasn’t sure how I hadn’t argued that, but between all the different comments, I’ve now pieced it together. I appreciate everyone engaging me on this, and I’ve updated the essay to “deprecated” with an explanation at the top that I no longer endorse these views.

Shoshannah Tekofsky 4 Feb 2023 5:41 UTC
6 points
−1
on: Fucking Goddamn Basics of Rationalist Discourse
Should we have a “rewrite the Basics of Rationalist Discourse” contest?

Not that I think anything is gonna beat this. But still :D

Ps: can be both content and/or style

Shoshannah Tekofsky 28 Dec 2022 4:28 UTC
4 points
0
in reply to: Leon Lang’s comment on: Loose Threads on Intelligence
Did you accidentally forget to add this post to your research journal sequence?
I thought I added it but apparently hadn’t pressed submit. Thank you for pointing that out!
1. optimization algorithms (finitely terminating)
2. iterative methods (convergent)
That sounds as if they are always finitely terminating or convergent, which they’re not. (I don’t think you wanted to say they are)
I was going by the Wikipedia definition:
To solve problems, researchers may use algorithms that terminate in a finite number of steps, or iterative methods that converge to a solution (on some specified class of problems), or heuristics that may provide approximate solutions to some problems (although their iterates need not converge).
I don’t quite understand this. What does the sentence “computational optimization can compute all computable functions” mean? Additionally, in my conception of “computational optimization” (which is admittedly rather vague), learning need not take place.
I might have overloaded the phrase “computational” here. My intention was to point out what can be encoded by such a system. Maybe “coding” is a better word? E.g., neural coding. These systems can implement Turing machines, so they can potentially have the same properties as Turing machines.
these two options are conceptually quite different and might influence the meaning of the analogy. If intelligence computes only a “target direction”, then this corresponds to a heuristic approach in which locally, the correct direction in action space is chosen. However, if you view intelligence as an actual optimization algorithm, then what’s chosen is not only a direction but a whole path.
I’m wondering if our disagreement is conceptual or semantic. Optimizing a direction instead of an entire path is just a difference in time horizon in my model. But maybe this is a different use of the word “optimize”?
You write “Learning consists of setting the right weights between all the neurons in all the layers. This is analogous to my understanding of human intelligence as path-finding through reality”
- Learning is a thing you do once, and then you use the resulting neural network repeatedly. In contrast, if you search for a path, you usually use that path only once.
If I learn the optimal path to work, then I can use that multiple times. I’m not sure I agree with the distinction you are drawing here … Some problems in life only need to be solved exactly once, but that’s the same as anything you learn only being applicable once. I didn’t mean to claim the processes are identical, but that they share an underlying structure. Though indeed, this might be an empty intuitive leap with no useful implementation. Or maybe not a good matching at all.
I do not know what you mean by “mapping a utility function to world states”. Is the following a correct paraphrasing of what you mean?
“An aligned AGI is one that tries to steer toward world states such that the neurally encoded utility function, if queried, would say ‘these states are rather optimal’ ”

Yes, thank you.
I don’t quite understand the analogy to hyperparameters here. To me, it seems like childbirth’s meaning is in itself a reward that, by credit assignment, leads to a positive evaluation of the actions that led to it, even though in the experience the reward was mostly negative. It is indeed interesting figuring out what exactly is going on here (and the shard theory of human values might be an interesting frame for that, see also this interesting post looking at how the same external events can trigger different value updates), but I don’t yet see how it connects to hyperparameters.
A hyperparameter is a parameter across parameters. So say with childbirth, you have a parameter ‘pain’ for physical pain, which is a direct physical signal, and you have a hyperparameter ‘Satisfaction from hard work’ that takes ‘pain’ as input as well as some evaluative cognitive process and outputs reward accordingly. Does that make sense?
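A toy sketch of that structure in Python, in case it helps. Everything in it is an assumption made up for illustration: the function name, the boolean standing in for the evaluative cognitive process, and the numbers. It also uses “hyperparameter” in the sense of this comment (a signal computed over other signals), not in the usual ML sense of a training setting.

```python
def satisfaction_from_hard_work(pain: float, judged_worthwhile: bool) -> float:
    """Hypothetical second-level signal: combines raw pain with an evaluation.

    pain: direct physical signal in [0, 1] (illustrative scale).
    judged_worthwhile: stand-in for the evaluative cognitive process.
    Returns a reward; the sign depends on the evaluation, not on the pain alone.
    """
    if judged_worthwhile:
        # Pain endured for something valued is credited as positive reward.
        return pain
    # The same pain without that evaluation is simply negative.
    return -pain


# Same raw pain signal, different evaluation, opposite reward.
childbirth_reward = satisfaction_from_hard_work(pain=0.9, judged_worthwhile=True)
stubbed_toe_reward = satisfaction_from_hard_work(pain=0.9, judged_worthwhile=False)
print(childbirth_reward, stubbed_toe_reward)
```

The only point of the sketch is that the reward is not read off the pain signal directly, but off a function that takes pain and an evaluation together.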
What if instead of trying to build an AI that tries to decode our brain’s utility function, we build the process that created our values in the first place and expose the AI to this process?
Digging into shard theory is still on my todo list. [bookmarked]
Many models that do not overfit also memorize much of the data set.
Is this on the sweet spot just before overfitting or should I be thinking of something else?

Thank you for your extensive comment! <3

Shoshannah Tekofsky 27 Dec 2022 19:52 UTC
4 points
0
in reply to: plex’s comment on: Announcing: The Independent AI Safety Registry
Oh my, this looks really great. I suspect between this and the other list of AIS researchers, we’re all just taking different cracks at generating a central registry of AIS folk so we can coordinate at all different levels on knowing what people are doing and knowing who to contact for which kind of connection. However, maintaining such an overarching registry is probably a full-time job for someone with high organizational and documentation skills.

Shoshannah Tekofsky 5 Oct 2022 12:23 UTC
4 points
1
in reply to: Ben Pace’s comment on: Humans aren’t fitness maximizers
I think the notion that people are adaptation-executors, who like lots of things a little bit in context-relevant situations, predicts our world more than the model of fitness-maximizers, who would jump on this medical technology and aim to have 100,000s of children soon after it was built.

I think this skips the actual social trade-offs of the strategy you outline above:
1. The likely backlash in society against any woman who tries this is very high. Any given rich woman would have to find surrogate women who are willing to accept the money and avoid being the target of social condemnation or punitive measures of the law. It’s a high risk / high reward strategy that also needs to keep paying off long after she is dead, as her children might be shunned or lose massive social capital as well. If you consider people’s response to eugenics or gene editing of human babies, then you can imagine the backlash if a woman actually paid surrogates at scale. It’s not clear to me that the strategy you outline above is actually all that viable for the vast majority of rich women.
2. I’d argue some of us are IGF maximizers for the hand that we have been dealt, which includes our emotional response, intelligence, and other traits. Many of us have things like fear-responses so heavily hard-wired that no matter what we recognize as the optimal response, we can’t actually physically execute it.
I realize item 2 points to a difference in how we might define an optimizer, but it’s worth disambiguating this. I suspect claiming no humans are IGF maximizers or some humans are IGF maximizers might come down to the definition of maximizer that one uses. And thus might explain the pushback that Nate runs into for a claim he finds self-evident.

Shoshannah Tekofsky 6 Sep 2022 10:20 UTC
4 points
−1
in reply to: ChristianKl’s comment on: Overton Gymnastics: An Exercise in Discomfort
My intuition is that there is a gradient from controversial statements to this-will-cause-unrecoverable-social-status damage. I think I might have implicitly employed a ‘softer’ definition of Overton window as ‘statements that make others or yourself uncomfortable to express/debate’, where the ‘harder’ definition would be statements you can’t socially recover from. I think intuitively I wouldn’t presume anyone wants to share the latter and I don’t see much benefit in doing so. But overall, my concept of Overton window is much more a gradient than a binary, and this exercise aims to allow people to stretch through the (perceived) low range.

Shoshannah Tekofsky 3 Jul 2022 14:47 UTC
4 points
0
in reply to: MSRayne’s comment on: Naive Hypotheses on AI Alignment
Hmmm, yes and no?
e.g. many people that care about animal welfare differ on the decisions they would make for those animals. What if the AGI ends up a negative utilitarian and sterilizes us all to save humanity from all future suffering? The missing element would again be to have the AGI aligned with humanity, which brings us back to H4: What’s humanity’s alignment anyway?

Shoshannah Tekofsky 20 Oct 2023 21:38 UTC
3 points
0
on: Mini-Workshop on Applied Rationality
The ACX meeting on the same day is unfortunately cancelled. For that reason we are extending the deadline for sign up:

If you have a confirmation email, then you can definitely get in.

Otherwise, fill out the form and we’ll select 3 people for the remaining spots. If people show up without signing up, they can get in if we are below 20. If we are on 20 or more, then no dice :D

(Currently 17)

Shoshannah Tekofsky 22 Apr 2023 6:08 UTC
3 points
0
in reply to: habryka’s comment on: United We Align: Harnessing Collective Human Intelligence for AI Alignment Progress
Well damn… Well spotted.

I found the full-text version and will dig into this next week to see what’s up exactly.

Shoshannah Tekofsky 21 Apr 2023 18:12 UTC
3 points
0
in reply to: Daniel Kokotajlo’s comment on: United We Align: Harnessing Collective Human Intelligence for AI Alignment Progress
Thank you! I wholeheartedly agree to be honest. I’ve added a footnote to the claim, linking and quoting your comment. Are you comfortable with this?

Shoshannah Tekofsky 21 Apr 2023 5:28 UTC
3 points
0
in reply to: Christopher King’s comment on: United We Align: Harnessing Collective Human Intelligence for AI Alignment Progress
Oooh gotcha. In that case, we are not remotely any good at avoiding the creation of unaligned humans either! ;)

Shoshannah Tekofsky 8 Jan 2023 19:50 UTC
3 points
0
in reply to: johnswentworth’s comment on: Optimizing Human Collective Intelligence to Align AI
Even in experiments, I think most of the value is usually from observing lots of stuff, more than from carefully controlling things.
I think I mostly agree with you but have the “observing lots of stuff” categorized as “exploratory studies” which are badly controlled affairs where you just try to collect more observations to inform your actual eventual experiment. If you want to pin down a fact about reality, you’d still need to devise a well-controlled experiment that actually shows the effect you hypothesize to exist from your observations so far.
If you actually go look at how science is practiced, i.e. the things successful researchers actually pick up during PhD’s, there’s multiple load-bearing pieces besides just that.
Fair!
Note that a much simpler first-pass on all these is just “spend a lot more time reading others’ work, and writing up and distilling our own”.
I agree, but if people were both good at finding necessary info as an individual and we had better tools for coordinating (e.g., finding each other and relevant material faster) then that would speed up research even further. And I’d argue that any gain in speed of research is as valuable as the same proportional delay in developing AGI.

Shoshannah Tekofsky 7 Jan 2023 3:31 UTC
3 points
0
in reply to: Nina Rimsky’s comment on: Optimizing Human Collective Intelligence to Align AI
That makes a lot of sense! And I was indeed also thinking of Elicit.

Shoshannah Tekofsky 26 Dec 2022 22:14 UTC
3 points
0
in reply to: plex’s comment on: Announcing: The Independent AI Safety Registry
Great idea!

So my intuition is that letting people edit a file that is publicly linked is inviting a high probability of undesirable results (like accidental wipes, unnoticed changes to the file, etc). I’m open to looking into this if the format gains a lot of traction and people find it very useful. For the moment, I’ll leave the file as-is so no one’s entry can be accidentally affected by someone else’s edits. Thank you for the offer though!

Shoshannah Tekofsky 13 Nov 2022 6:29 UTC
3 points
0
on: Solstice 2022 Roundup
Netherlands

Small celebration in Amsterdam: https://www.lesswrong.com/events/mTxNWEes265zkxhiH/winter-solstice-amsterdam