But humans don’t seem to optimize for reward all that often!
You might be interested in an earlier discussion on whether “humans are a hot mess”: https://www.lesswrong.com/posts/SQfcNuzPWscEj4X5E/the-hot-mess-theory-of-ai-misalignment-more-intelligent and https://www.lesswrong.com/posts/izSwxS4p53JgJpEZa/notes-on-the-hot-mess-theory-of-ai-misalignment