I feel like being the code master for Codenames is a good exercise for understanding this concept.
I wasn’t thinking of shards as reward prediction errors, but I can see how the language was confusing. What I meant is that when multiple shards are activated, they affect behavior according to how strongly and reliably they were reinforced in the past. Practically, this looks like competing predictions of reward (because past experience is strongly correlated with predictions of future experience), although technically it’s not a prediction—the shard is just based on the past experience and will influence behavior similarly even if you rationally know the context has changed. E.g. the cake shard will probably still reinforce eating cake even if you know that you just had mouth-changing surgery that means you don’t like cake anymore.
(However, I would expect that shards evolve over time. So in this example, after enough repetitions reliably failing to reinforce cake eating, the cake shard would eventually stop making you crave cake when you see cake.)
So in my example, cleaner language might be: In the past, I more reliably got to eat cake when someone was offering me a slice right then than when someone promised to bring a slightly better cake to the office party the next day. So when the “someone is currently offering me something” shard and the “someone is promising me something” shard are both activated, the first shard affects my decisions more, because it was rewarded more reliably in the past.
(One test of this theory might be whether people are more likely to take the bigger, later payout if they grew up in extremely reliable environments where they could always count on the adults to follow through on promises. In that case, their “someone is promising me something” shard should have been reinforced similarly to the “someone is currently offering me something” shard. This is basically one explanation given for the classic Marshmallow Experiment—kids waited if they trusted adults to follow through with the promised two marshmallows; kids ate the marshmallow immediately if they didn’t trust adults.)
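Here is a rough sketch of that idea in Python, in case it helps make it concrete. Everything in it is my own illustration rather than anything from shard theory itself: the `Shard` class, the `strength` measure (fraction of past activations that were rewarded), the payoff sizes, and the rule of multiplying strength by payoff are all assumptions. Note that `strength` only looks at the past record, so it would keep pushing the same way even if you knew the context had changed.

```python
from dataclasses import dataclass


@dataclass
class Shard:
    """Hypothetical shard: a context plus a record of past reinforcement."""
    context: str
    activations: int = 0  # times this context came up
    rewarded: int = 0     # times acting on it actually paid off

    def reinforce(self, got_reward: bool) -> None:
        """Record one more experience in this context."""
        self.activations += 1
        if got_reward:
            self.rewarded += 1

    def strength(self) -> float:
        """Fraction of past activations that were rewarded (0.0 if none)."""
        return self.rewarded / self.activations if self.activations else 0.0


def build_history(promises_kept: int, promises_total: int) -> tuple[Shard, Shard]:
    """Build the two shards from an assumed childhood history."""
    offer = Shard("someone is currently offering me something")
    promise = Shard("someone is promising me something")
    for _ in range(20):
        offer.reinforce(True)  # assumption: immediate offers always paid off
    for i in range(promises_total):
        promise.reinforce(i < promises_kept)
    return offer, promise


def choose(options: dict[str, tuple[Shard, float]]) -> str:
    """Pick the option with the largest (shard strength * payoff size)."""
    return max(options, key=lambda name: options[name][0].strength() * options[name][1])


if __name__ == "__main__":
    # Unreliable environment: only half of past promises were kept.
    offer, promise = build_history(promises_kept=10, promises_total=20)
    print(choose({"cake now": (offer, 1.0), "better cake tomorrow": (promise, 1.2)}))

    # Reliable environment: every past promise was kept.
    offer, promise = build_history(promises_kept=20, promises_total=20)
    print(choose({"cake now": (offer, 1.0), "better cake tomorrow": (promise, 1.2)}))
```

With these made-up numbers, the unreliable history picks "cake now" (1.0 vs 0.6) and the reliable history picks "better cake tomorrow" (1.0 vs 1.2), which is the pattern the Marshmallow Experiment explanation would predict.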
Cool, I’m happy if you’re relaxing with a leisure activity you enjoy! The people I spoke with were explicitly not doing this for fun.
Time inconsistency example: You’ve described shards as context-based predictions of getting reward. One way to model the example would be to imagine there is one shard predicting the chance of being rewarded in the situation where someone is offering you something right now, and another shard predicting the chance you will be rewarded if someone is promising they will give you something tomorrow.
For example, I place a substantially higher probability on getting to eat cake if someone is currently offering me the slice of cake, compared to someone promising that they will bring a slightly better cake to the office party tomorrow. (In the second case, they might get sick, or forget, or I might not make it to the party.)
I have lots of points of contact with the world, but it feels really effortful to be always mindful and noting down observations (downright overwhelming if I don’t narrow my focus to a single cluster of datapoints I’m trying to understand).
@Logan, how do you make space for practicing naturalism? It sounds like you rely on ways of easing yourself into curiosity, rather than forcing yourself to pay attention.
(Also, just saw the comment rules for the first time while copying these over—hope mindfulness mention doesn’t break them too hard)
Speculating here, I’m guessing Logan is pointing at a subcategory of what I would call mindfulness—a data point-centered version of mindfulness. One of my theories of how experts build their deep models is that they start with thousands of data points. I had been lumping frameworks along with individual observations, but maybe it’s worth separating those out. If this is the case, frameworks help make connections more quickly, but the individual data points are how you notice discrepancies, uncover novel insights, and check that your frameworks are working in practice.
(Copying over FB reactions from while reading) Hmm, I’m confused about the Observation post. Logan seems to be using direct observation vs map like I would talk about mindfulness vs mindlessness. Except for a few crucial differences: I would expect mindfulness to mean paying attention fully to one thing, which could include lots of analysis/thinking/etc. that Logan would put in the map category. It feels like we’re cutting reality up slightly differently.
Trying to sit with the thought of the territory as the thing that exists as it actually is regardless of whatever I expect. This feels easy to believe for textbook physics, somewhat harder for whatever I’m trying to paint (I have to repeatedly remind myself to actually look at the cloud I’m painting), and really hard for psychology. (Like, I recently told someone that, in theory, if their daily planning is calibrated they should have days where they get more done than planned, but in practice this gets complicated because of interactions between their plans and how quickly/efficiently they work.)
Yet, to the best of my knowledge, psychology isn’t *not* physics… It’s just that we humans aren’t yet good enough at physics to understand psychology.
(Copying over FB reactions from while reading) There’s something that feels familiar so far. For myself and when I’m working with clients, I often encourage experiments and journaling about the experience as they go. Part of the reason is uncertainty about the result, but another part is taking the time to check your expectations against your actual experience as it’s happening.
Like, I recently felt really drained. Noticing that feeling, I could immediately name several things that had been draining, but I wouldn’t have said they were hard while I was doing them. I was just caught up in making things go smoothly at the time. It was only afterwards, noticing how drained I felt, that I realized the activity had taken a lot of energy. It would have been easy to brush over the activity with “Oh, I had a fun time”, which was true. But it was also true that I had to recover afterwards.
I feel like there’s something valuable here in being able to pay the kind of attention to my experience that led me to that knowledge. But I don’t have a good vocabulary to describe it to my clients.
Coming back after finishing the series, I notice the “scary!” reaction is gone. Based on some of Logan’s comments on the FB thread, I think I updated toward 1. worry less about having to explicitly remember every detail → instead just learn to pay attention and let those observations filter into your consciousness, and 2. it’s still fine to use/learn from frames, just make sure you also have direct observations to let you know if your frames/assumptions are off.
I want to understand how to actually get humans to do the right things, and that task feels gargantuan without building on the foundation of simplified handles other people have discovered.
Yet, I value something like naturalism because I don’t trust many handles as they are now, especially coming from psychology. “Confusion-spotting” seems pretty important if I’m going to improve on the status quo.
(Copying FB reactions I made as I read) 1. I care a lot about deep mastery. 2. It doesn’t feel immediately obvious that direct observation is the fastest route to deep mastery. Like, say I wanted to understand a new field—bio for example. I would start with some textbooks, not staring at my dog and hoping I’d understand how he worked. I’d get to examples and direct experience, but my initial instinct is to start with pre-existing frameworks. But maybe I’m just misunderstanding what “direct observation” means?
Probably worth noticing that my mind spent the last half of the post trying to skip ahead. Like, there’s a storyteller setting the scene and my brain wants to skip over “Once upon a time…” But I’m guessing that skipping over something because it seems familiar is antithetical to the whole point of this series.
I get the problem about misleading frames and not noticing you’re skipping over counter evidence—I’m always worried/annoyed/frustrated about how little confidence I have in any claim because I can think of nuances. But ah, that’s why I like frames! I want cached answers, damn it.
Initial reaction: “Ah, scary.” Their move from frames to unfiltered, direct observations feels scary. Like I’m going to lose something important. I rely on frames a lot to organize and remember stuff, because memory is hard and I forget so many important data points. I can chunk lots of individual stuff under a frame.
I wrote reactions on FB while reading, and I’m copying them here and on the other posts afterwards.