Oxidize comments on How familiar is the Lesswrong community as a whole with the concept of Reward-modelling?

Oxidize 10 Apr 2025 15:07 UTC
1 point
0
I’d say that the 80/20 of the concept is how reward & punishment affect human behavior.

Is it about which forces?
- I would say I’m referring to a combination of instinct, innate attraction/aversion, previous experience, decision-making, attention, and how they relate to each other in an everyday practical context.

Seems to me that “genetics”
- I would say your disentanglement is right on the money. Rather than making an analysis for LLMs, I’m particularly interested in fleshing out the interrelations between these concepts as they relate to the human brain.

Do you want a similar analysis for LLMs?
I mean it from a high-level agency perspective, as opposed to in specific AI or machine learning contexts.

Goal?
My goal is to learn more about what information Lesswrongers use and are interested in, so that I can write a better post for the community.

Adjacent concepts
- Self-discipline
- Positive psychology
- Systems & patterns thinking
- Maybe reward functions?
- faul_sname 11 Apr 2025 0:28 UTC
  8 points
  0
  Parent
  Can you give one extremely concrete example of a scenario which involves reward modeling, and point to the part of the scenario that you call “reward modeling”?