Call for cognitive science in AI safety

Linda Linsefors 29 Sep 2017 20:35 UTC

6 points

Epistemic status: High expected utility, but also very high variance

The more I realise that AI takeoff is something that actually might happen, the more I am pulled towards this problem:

What are human preferences really?
What is the generator of human preferences?
What are our preferences made of?
What is the structure behind it all?

Before we tell our brand new AI overlord to figure out our values and do whatever we want it to do, we really ought to have a clearer idea of what “values” and “want” mean.

I have a good idea of what my preferences are within the limited reach of my lived experience, and even a little bit beyond that. But to extrapolate from that into the vast distance of possible futures seems extremely dangerous.

My values are inconsistent and conflicting, and definitely not constant over time. On top of that, there is the big heap of unknown unknowns with respect to how the brain works.

I am convinced that to solve AI safety we need to have a good understanding of human values, and I know I don’t have this understanding. I am just a physics and math nerd. I don’t know this stuff. I don’t know if the questions I have are open research questions, or if this stuff is already well known and understood in some separate community somewhere. That is why we need psychology nerds to join the cause.

Another thing I would want AI-safety-oriented psychology research to take on is something like a case study of friendliness in existing agents (humans, subsystems in the brain, organisations). What are the mechanisms in the human brain that make us care about others, and can they be replicated?

* * *

A problem I see is that only math and computer nerds are called upon to work on AI safety, and all the psychology nerds out there do not even know that they are needed. Or maybe the psychology research that I am looking for is already out there and we just need to find each other to collaborate more.

I think that it is important that technical AI safety research does not try to set the agenda for psychology AI safety research. Information and inspiration need to flow both ways. Both fields need to be free to follow their own curiosity, but we also need to collaborate to ground our work in each other’s knowledge.

* * *

Linda Linsefors

Cosigning:

Alexander Appel

Holden Lee

somnulence logencia

What links here?

Extensive and Reflexive Personhood Definition by Linda Linsefors (29 Sep 2017 21:50 UTC; 3 points)

Kaj_Sotala 30 Sep 2017 11:44 UTC
4 points
I wrote about the same thing here: Cognitive science/psychology as a neglected approach to AI safety
Linda Linsefors 29 Sep 2017 20:36 UTC
4 points
Recent talk by Stuart Armstrong related to this topic:
https://www.youtube.com/watch?v=19N4kjYbZD4
Ben Pace 30 Sep 2017 0:25 UTC
3 points
[Note from the Sunshine Regiment] Hi Linda! At the current time, the frontpage is for discussion of ideas, and not for discussion of the community, coordination, or calls-to-action. As such, I’ve moved this post to your personal LW blog, and removed it from the front page.
Apologies for this not being fully clear so far. I’ve tried to point to this sort of thing in the current frontpage content guidelines, but it will be made more explicit by launch.
- Raemon 30 Sep 2017 1:03 UTC
  2 points
  Parent
  Just to clarify: would a similar post that was framed more around “here are potential ideas relating to cognitive science in AI that I think are valuable and here’s why” (without making it into an explicit call to action, more of an “if you buy this claim then you’d probably end up doing something with it”) be fine for the front page?
  - Ben Pace 30 Sep 2017 1:26 UTC
    2 points
    Parent
    Yup, arguing for epistemic conclusions about AI and/or cognitive science is the appropriate category of content for the frontpage—for example Linda’s other post today.
    - Linda Linsefors 30 Sep 2017 7:26 UTC
      2 points
      Parent
      Basically, if I change the title, it can go on the front page?
ESRogs 29 Sep 2017 21:16 UTC
3 points
Linda Linsefors
Alexander Appel
Holden Lee
somnulence logencia
I am confused by this signature—the post uses “I”, but with multiple signatories I would have expected “we”.

Should I consider Linda to be the author, with Alexander, Holden, and somnulence simply expressing their assent?
- Linda Linsefors 29 Sep 2017 21:22 UTC
  1 point
  Parent
  Yes, that is correct.
  I wrote the text and asked people to cosign if they agreed, for signaling value.
  
  Do you have a good idea on how to make this clearer?
  - magfrump 29 Sep 2017 22:23 UTC
    2 points
    Parent
    Maybe just write “Cosigned,” above the names?
    - Linda Linsefors 29 Sep 2017 22:56 UTC
      1 point
      Parent
      Better?
      - magfrump 29 Sep 2017 22:57 UTC
        2 points
        Parent
        Yeah I think that’s pretty clear
whpearson 29 Sep 2017 23:15 UTC
2 points
Hi, I’ve been thinking along a related line. I wrote something today arguing that we should investigate IQ in the hope that it will help us predict takeoff. If anyone with a good psych background is reading this, I’d like some feedback.

I also think that engaging with psychology would be a sign that AI safety is maturing into a science that takes all sources of information seriously, and would increase its credibility with the general scientific community.