Do you have thoughts on how to encode “doing philosophy” in a way we would expect to be strongly convergent, such that if it were implemented in the last AI humans ever control, we could trust the process to keep doing philosophy usefully, in some nailed-down way, even after human disempowerment?
I think we’re really far from having a good enough understanding of what “philosophy” is, or what “doing philosophy” consists of, to be able to do that. (Aside from “indirect” methods that pass the buck to simulated humans, which Pi Rogers also mentioned in another reply to you.)
Here is my current best understanding of what philosophy is, so you can have some idea of how far we are from what you’re asking.
Maybe some kind of simulated long-reflection scheme like QACI, where “doing philosophy” basically becomes “predicting how humans would do philosophy if given lots of time and resources.”
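To gesture at what “strongly convergent” might even mean operationally, here is a toy sketch in Python. It is purely illustrative and not anything QACI actually specifies: `deliberate` is a hypothetical oracle for “how humans would answer this question after the given amount of deliberation time,” and an answer is accepted only once it stops changing as that budget grows.

```python
from typing import Callable, Optional

def converged_answer(
    deliberate: Callable[[str, int], str],  # hypothetical oracle, assumed here
    question: str,
    start_budget: int = 1,
    max_budget: int = 1_000_000,
) -> Optional[str]:
    """Accept an answer only once it is stable under a doubling of the
    simulated deliberation budget; otherwise report non-convergence."""
    budget = start_budget
    answer = deliberate(question, budget)
    while budget * 2 <= max_budget:
        budget *= 2
        next_answer = deliberate(question, budget)
        if next_answer == answer:
            # Stable across a doubling of budget: call it "converged".
            return answer
        answer = next_answer  # still drifting; keep extending
    return None  # never stabilized within the budget we could afford
```

All the philosophical difficulty is hidden inside the oracle, and even the stopping rule (how much stability counts as convergence?) is itself a substantive choice.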
That would be a philosophical problem...