I think roughly just the various normal straightforward meanings if someone says “X is important”? E.g.
You care a lot about the difference
You would strongly prefer one over the other
You’d make decisions in accordance with that preference
You’d presume in discourse that people will or at least should care a lot about it, maybe after learning + reflecting
similarly for “terminal humane-aligned goals”
Well, let’s just say, what humans would arrive at on some healthy long-term reflection process. I don’t mean to imply some kind of strong finality, like we get to Alignment Day and now everything about the future / who we are / what we want / etc. is determined or something. But more like “several important differences between possible long-term trajectories have been determined”. For example, Alignment Day would probably include things like
There will be no torture or killing of sentients, except possibly in some cases that meet a high bar of deeply free / self-sovereign reflection or something
There will be multiple freely growing minds which reach out to each other (e.g. for love, play, discourse, partial collectivity, etc.)
These things are, I think:
Not at all determined by convergence; probably contingent on at least species evolution, probably more specifically on things about group intelligence in the evolutionary history; most likely outcomes don’t have the versions of these we want
Important to basically all properly-human-derived souls forever
I think there are other things like this, at various levels of parochialness, some of which might get reflected away for many / most / all human-descendants eventually, but many of which wouldn’t get fully reflected away. I think there are flavors to humane reflection that are also contingent but that we care a lot about.
So for the subjective meaning of “important” you’re talking about here, I think going by revealed preference is helpful. My revealed preference is to continue writing about philosophy topics relative to AI and the future, find many parts of AI safety culture annoying and occasionally worth criticizing, talk with AIs a lot about philosophy, not generally support AI regulation, vibe positively about Landian anti-orthogonalist philosophy, etc. Some people in AI safety have different revealed preferences, which involve more talking about AI philosophy in an orthodox LessWrongian manner, worrying publicly and loudly about LLMs killing us all in the near future, organizing political activity to ban AI as much as possible, etc. This difference in revealed preference relates to differences in subjective importance, but it’s unclear how to isolate contributions from factors such as AIs having humane goals, given there are other differences like background beliefs and feasibility.
Humans would come to some conclusions on reflection, and so would aliens, AIs, etc. I’m not sure how much they would agree or disagree on reflection. That’s a probabilistic/statistical question whose answer is not implied by weak orthogonality. I don’t know if humans would agree to no killing of sentients upon reflection; I’d very roughly guess less likely than not, but who knows. The ‘freely growing minds’ part is also a ‘maybe humans would agree to this on reflection, maybe not’ case, though perhaps in the ‘more likely than not’ camp; it’s also pretty vague, so I’m not convinced assigning a probability is a good idea.
I don’t really agree that we can pick out things like this and make strong statements like “any properly humanly derived soul would agree with these values”. That seems like a very hard thing to predict, given that such souls would have much more cognition than we do.
I kinda agree, though probably not fully. If we want to talk about empirical orthogonality, I would say that, yeah, I’m pretty sure an AGI intelligence explosion sampled from likely AGI IEs starting from now would end up with something I strongly don’t want, compared to for example worlds with no AGI and yes human intelligence amplification.
look at the UK or the EU. look at global birth rate trends, and attitudes towards e.g. germline selection.
p(doom|ai) is negative. there’s no world with no agi and human intelligence amplification