mako yass comments on “AI Safety for Fleshy Humans” an AI Safety explainer by Nicky Case

mako yass 4 May 2024 2:29 UTC
This is good! I would recommend it to a friend!
Some feedback.
- An individual human can be inhumane, but the aggregate of human values kind of visibly isn’t, and in most ways couldn’t be. Human cultures are reliably getting more humane as transparency, reflection, and coordination increase over time. It is also inevitable: if you aggregate a bunch of concave values, you get a value system that treats all of the subjects of the aggregation pretty decently.
  A lot of the time, when people accuse us of conflating two things, we equate those things because we have an argument that they’re going to turn out to be equivalent.
  So emphasizing a difference between human and humane values could be really misleading, and possibly kinda harmful, given that it could undermine the implementation of the simplest, most arguably correct solutions to alignment (which are just aggregations of humans’ values). This could be a whole conversation, but could we just not define humane values as being necessarily distinct from human values? How about this:
  - People are sometimes confused by ‘Human values’, since the term seems to assume that all humans value the same things, when many humans have values that conflict with the preferences of other humans. When we say ‘Humane values’, we’re defining a value system that does a decent job of balancing and reconciling the preferences of every human (Humans, Every one).
- [graph point for “systems programmer with mlp shirt”] Would it be funny if there were another point, “systems programmer without mlp shirt”, and it was Pareto-inferior?
- “What if System 2 is System 1”. This is a great insight. I think it is, and I think the main reason nerdy types often fail to notice how permeable and continuous the boundary is, is a kind of tragic habitual cognitive autoimmune disease. I have a post brewing about this, after I used a repaired relationship with the unconscious bulk to cure my astigmatism (I’m going to let it sit for a year just to confirm that the method actually worked and myopia really was averted).
- Exponential growth is usually not slow, and even if it were, that wouldn’t entail that “we’ll get ‘warning shots’ & a chance to fight back”. It only takes a small sustained advantage to be able to utterly win a war (though contemporary humans don’t like to carry wars to completion, the 20th century should have been a clear lesson that such things are within our abilities at current tech levels). And even if progress in capabilities over time continued to be linear, impact as a function of capabilities is not going to be linear; it never has been.
But overall, I think it addresses a certain audience, one I know, much better than my version of this would have. I wrote mine hastily last year when I was summoned to speak at a conference, and so I never showed it to them. Maybe one day I will show them yours.
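
To make the concave-aggregation point from the first bullet a bit more concrete, here is a toy sketch. Everything in it is an assumption chosen for illustration: three people splitting one unit of a resource, a weighted sum as the aggregation rule, `math.sqrt` as the stand-in concave utility, the particular `weights`, and the hypothetical helper `best_allocation`, which just grid-searches the splits. It shows one small case, not a proof of the general claim.

```python
import math
from itertools import product


def best_allocation(utility, weights, steps=20):
    """Grid-search how to split one unit of resource among len(weights)
    people so that the weighted sum of their utilities is maximized.
    (Hypothetical helper; the weighted sum is an assumed aggregation rule.)"""
    n = len(weights)
    best_score, best_split = -math.inf, None
    # Enumerate shares for the first n-1 people; the last person gets the rest.
    for parts in product(range(steps + 1), repeat=n - 1):
        last = steps - sum(parts)
        if last < 0:
            continue  # this combination hands out more than the whole unit
        split = tuple(p / steps for p in parts + (last,))
        score = sum(w * utility(x) for w, x in zip(weights, split))
        if score > best_score:
            best_score, best_split = score, split
    return best_split


# Assumed, deliberately unequal weights: the aggregation "cares" three times
# as much about the first person as about the third.
weights = (3.0, 2.0, 1.0)

# Linear (non-concave) utilities: the favored person takes everything.
linear_split = best_allocation(lambda x: x, weights)

# Concave utilities: diminishing returns push the optimum toward sharing.
concave_split = best_allocation(math.sqrt, weights)

print("linear utilities: ", linear_split)
print("concave utilities:", concave_split)
```

In this toy case the linear aggregate gives the whole resource to the most heavily weighted person, while the concave aggregate leaves every person with a positive share even though the weights are unequal. That is the sense of "pretty decently" the sketch is meant to illustrate.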