Becoming really aggressive and accusing me of being “absurd” and “appealing to authority” doesn’t change this.
You were appealing to authority, and being absurd (and also appealing to in/out-groupness). I feel satisfied getting a bit aggressive when people do that. I agree that style doesn’t have any bearing on the validity of my argument, but it does discourage that sort of talk.
I’m not certain what you’re arguing for in this latest comment. I definitely don’t think you show here that humans aren’t privileged objects when it comes to human values, nor that your quote from Eliezer recommends any special process beyond a pointer to humans thinking about their values in an ideal situation, which were the two main contentions of my original comment.
I don’t think anyone in this conversation argued that humans can generalize from a fixed training distribution arbitrarily far, and I think everyone also agrees that humans think about morality by making iterative, small updates to what they already know. But, of course, that does still privilege humans. There could be some consistent pattern to these updates, such that something smarter wouldn’t need to run the same process to know the end result, but that would still be a pattern about humans.
I was not appealing to authority or being absurd (though admittedly the second quality is subjective); it is in fact relevant, if we’re arguing about... if you say
How… else… do you expect to generalize human values out of distribution, except to have humans do it?
This implies, though I did not explicitly argue with the implication, that to generalize human values out of distribution you run a literal human brain, or an approximation of one (e.g. a Hansonian em), to get the updates. What I was pointing out is that CEV does not actually call for this. CEV is the classic proposal for how to generalize human values out of distribution, and therefore a relevant reference point for what is and is not a reasonable plan (and, as you allude to, it is considered a reasonable plan by people normally taken to be thinking clearly about this issue). Yet it does not call for running a literal emulation of a human brain except perhaps in its initial stages, and even then only if absolutely necessary; Yudkowsky is fairly explicit in the Arbital corpus that an FAI should avoid instantiating sapient subprocesses. The entire point is to imagine what the descendants of present-day humanity would do under ideal conditions of self-improvement, a process which, if it is not to instantiate sapient beings, must in fact not really be based on having humans generalize the values out of distribution.
If this is an absurd thing to imagine, then CEV is absurd, and maybe it is. If pointing this out is an appeal to authority or to in-groupness/out-groupness, then presumably any argument of the form “actually, this is how FAI is normally conceived, and therefore it is not an a priori unreasonable concept” is invalid on the same grounds, and I’m not really sure how I’m meant to respond to a confused look like that. Perhaps I’m supposed to find the least respectable plan which does not consider literal human mind patterns to be a privileged object (in the sense that their cognition is strictly functionally necessary to make valid generalizations from the existing human values corpus) and point at that? But that doesn’t seem very convincing, obviously.
“Pointing at anything anyone holds in high regard as evidence about whether an idea is a priori unreasonable is an appeal to authority and in-groupness” is, to be blunt, parodic.
I feel satisfied getting a bit aggressive when people do that. I agree that style doesn’t have any bearing on the validity of my argument, but it does discourage that sort of talk.
I agree it’s an effective way to discourage timid people from saying true or correct things when they disagree with people’s intuitions, which is why the behavior is bad.
He showed that “human values” isn’t fixed or well defined, which pulls the support from under the claim that humans are privileged objects when it comes to human values.