ADifferentAnonymous

Karma: 451

ADifferentAnonymous 18 Jun 2020 16:56 UTC
4 points
0
on: Simulacra Levels and their Interactions
I’m not sure Level 3 is actually less agentic than Level 1. The Oracle does not choose which truths to speak in order to pursue goals; if they did, they’d be the Sage.

ADifferentAnonymous 20 Nov 2020 23:13 UTC
LW: 11 AF: 5
AF
on: Inner Alignment in Salt-Starved Rats
This might just be me not grokking predictive processing, but...
I feel like I do a version of the rat’s task all the time to decide what to have for dinner—I imagine different food options, feel which one seems most appetizing, and then push the button (on Seamless) that will make that food appear.
Introspectively, this feels to me there’s such a thing as ‘hypothetical reward’. When I imagine a particular food, I feel like I get a signal from… somewhere… that tells me whether I would feel reward if I ate that food, but does not itself constitute reward. I don’t generally feel any desire to spend time fantasizing about the food I’m waiting for.
To turn this into a brain model, this seems like the neocortex calling an API the subcortex exposes. Roughly, the neocortex can give the subcortex hypothetical sensory data and get a hypothetical reward in exchange. I suppose this is basically hypothesis two with a modification to avoid the pitfall you identify, although that’s not how I arrived at the idea.
This does require a second dimension of subcortex-to-neocortex signal alongside the reward. Is there a reason to think there isn’t one?

ADifferentAnonymous 24 Nov 2020 0:25 UTC
5 points
in reply to: Steven Byrnes’s comment on: Inner Alignment in Salt-Starved Rats
Thanks for the reply; I’ve thought it over a bunch, and I think my understanding is getting clearer.
I think one source of confusion for me is that to get any mileage out of this model I have to treat the neocortex as a black box doing trying to maximize something, but it seems like we also need to rely on the fact that it executes a particular algorithm with certain constraints.
For instance, if we think of the ‘reward predictions’ sent to the subcortex as outputs the neocortex chooses, the neocortex has no reason to keep them in sync with the rewards it actually expects to receive—instead, it should just increase the reward predictions to the maximum for some free one-time RPE and then leave it there, while engaging in an unrelated effort to maximize actual reward.
(The equation V(sprev)+=(learning rate)⋅(RPE) explains why the neocortex can’t do that, but adding a mathematical constraint to my intuitive model is not really a supported operation. If I say “the neocortex is a black box that does whatever will maximize RPE, subject to the constraint that it has to update its reward predictions according to that equation,” then I have no idea what the neocortex can and can’t do)
Adding in the basal ganglia as an ‘independent’ reward predictor seems to work. My first thought was that this would lead to an adversarial situation where the neocortex is constantly incentivized to fool the basal ganglia into predicting higher rewards, but I guess that isn’t a problem if the basal ganglia is good at its job.
Still, I feel like I’m missing a piece to be able to understand imagination as a form of prediction. Imagining eating beans to decide how rewarding they would be doesn’t seem to get any harder if I already know I don’t have any beans. And it doesn’t feel like “thoughts of eating beans” are reinforced, it feels like I gain abstract knowledge that eating beans would be rewarded.
Meanwhile, it’s quite possible to trigger physiological responses by imagining things. Certainly the response tends to be stronger if there’s an actual possibility of the imagined thing coming to pass, but it seems like there’s a floor on the effect size, where arbitrarily low probability eventually stops weakening the effect. This doesn’t seem like it stops working if you keep doing it—AIUI, not all hungry people are happier when they imagine glorious food, but they all salivate. So that’s a feedback channel separate from reward. I don’t see why there couldn’t also be similar loops entirely within the brain, but that’s harder to prove.
So when our rat thinks about salt, the amygdala detects that and alerts… idk, the hypothalamus? The part that knows it needs salt… and the rat starts salivating and feels something in its stomach that it previously learned means “my body wants the food” and concludes eating salt would be a good idea.

ADifferentAnonymous 25 Nov 2020 0:24 UTC
3 points
in reply to: Steven Byrnes’s comment on: Inner Alignment in Salt-Starved Rats
Glad to hear this is helpful for you too :)
I didn’t really follow the time-derivative idea before, and since you said it was equivalent I didn’t worry about it :p. But either it’s not really equivalent or I misunderstood the previous formulation, because I think everything works for me now.
So if we (1) decide “I will imagine yummy food”, then (2) imagine yummy food, then (3) stop imagining yummy food, we get a positive reward from the second step and a negative reward from the third step, but both of those rewards were already predicted by the first step, so there’s no RPE in either the second or third step, and therefore they don’t feel positive or negative. Unless we’re hungrier than we thought, I guess...
Well, what exactly happens if we’re hungrier than we thought?
(1) “I will imagine food”: No reward yet, expecting moderate positive reward followed by moderate negative reward.
(2) [Imagining food]: Large positive reward, but now expecting large negative reward when we stop imagining, so no RPE on previous step.
(3) [Stops imagining food]: Large negative reward as expected, no RPE for previous step.
The size of the reward can then be informative, but not actually rewarding (since it predictably nets to zero over time). The neocortex obtains hypothetical reward information form the subcortex, without actually extracting a reward—which is the thing I’ve been insisting had to be possible. Turns out we don’t need to use a separate channel! And the subcortex doesn’t have to know or care whether its receiving a genuine prediction or an exploratory imagining from the neocortex—the incentives are right either way.
(We do still need some explanation of why the neocortex can imagine (predict?) food momentarily but can’t keep doing it food forever, avoid step (3), and pocket a positive RPE after step (2). Common sense suggests one: keeping such a thing up is effortful, so you’d be paying ongoing costs for a one-time gain, and unless you can keep it up forever the reward still nets to zero in the end)

ADifferentAnonymous 7 Dec 2020 21:08 UTC
1 point
on: D&D.Sci
I like to read blog posts by people who do real statistics, but with a problem in front of me I’m very much making stuff up. It’s fun, though!
The approach I settled on was to estimate the success chance of a possible stat line by taking a weighted success rate over the data, weighted by how similar the hero’s stats are to the stats being evaluated. My rationale is that based on intuitions about the domain I would not assume linearity or independence of stats’ effects or such, but I would assume that heroes with similar stats would have similar success chances.
In pseudocode:
estimatedchance(stats) = sum(weightfactor(hero.stats, stats) * hero.succeeded) / sum(weightfactor(hero, stats))
weightfactor(hero.stats, stats) = k ^ distance(hero.stats, stats)
(Assuming 0 < k < 1, and hero.succeeded is 1 if the hero succeeded and 0 otherwise)
I tried using both Euclidean and Manhattan distances, and various values for k as well. I also tried a hacky variant of Manhattan distance that added abs(sum(statsA) - sum(statsB)) to the result, but it didn’t seem to change much.
Lastly, I tried the replacing (hero.succeeded) with (hero.succeeded—linearprediction(sum(hero.stats))) to try to isolate builds that do well relative to their stat total. linearprediction is a simple model I threw together by eyeballing the data: 40% chance to succeed with total stats of 60, 100% chance with total stats >= 95, linear in between. Could probably be improved with not too much effort, but I have to stop somewhere.
I generally found two clusters of optima, one around (8, 14, 13, 13, 8, 16)—that is, +4 CHA, +2 STR, +4 WIS—and the other around (4, 16, 13, 14, 9, 16)—that is, +2 CON, +1 INT, +3 STR, +4 WIS. The latter was generally favored by low k values, as the heroes with stats closest to that value generally did quite well but those a little farther away got less impressive. So it could be a successful strategy that doesn’t allow too much deviation, or just a fluke. Using the linear prediction didn’t seem to change things much.
If I had to pick one final answer, it’s probably (8, 14, 13, 13, 8, 16) (though there seems to be a fairly wide region of variants that tend to do pretty well—the rule seems to be ‘some CHA, some WIS, and maybe a little STR’), but I find myself drawn towards the maybe-illusory (4, 16, 13, 14, 9, 16) niche solution.
ETA: Looks like I was iterating over an incomplete list of possible builds… but it turned out not to matter much.
ETA again (couldn’t leave this alone): I tried computing log-likelihood scores for my predictors (restricting the ‘training’ set to the first half of the data and using only the second half for validation. I do find that with the right parameters some of my predictors do better than simple linear regression on sum of stats, and also better the apparently-better predictor of simple linear regression on sum of non-dex stats. But they don’t beat it by much. And it seems the better parameter values are the higher k values, meaning the (8, 14, 13, 13, 8, 16) cluster is probably the one to bet on.

ADifferentAnonymous 11 Dec 2020 14:08 UTC
7 points
in reply to: Dirichlet-to-Neumann’s comment on: Luna Lovegood and the Chamber of Secrets—Part 7
I think I know what you mean (about even-numbered pages; I’m not familiar with Manuscript), but there isn’t actually missing necessary information (unless you haven’t read HPMoR, in which case you’re definitely missing necessary information). I suppose what’s missing is unnecessary information—each scene is stripped to its bare essentials.

ADifferentAnonymous 15 Dec 2020 17:24 UTC
2 points
in reply to: Measure’s comment on: Hermione Granger and Newcomb’s Paradox
The answer looks something like “if she had been planning to do that, the opaque envelope would have been empty”.

ADifferentAnonymous 15 Dec 2020 21:17 UTC
4 points
in reply to: Measure’s comment on: Hermione Granger and Newcomb’s Paradox
Had she been the sort to do that, Omega wouldn’t have made her the offer in the first place.

ADifferentAnonymous 15 Dec 2020 23:36 UTC
27 points
on: Motive Ambiguity
One (admittedly idealistic) solution would be to spread awareness of this dynamic and its toxicity. You can’t totally expunge it that way, but you could make it less prevalent (i.e. upper-middle managers probably can’t be saved, but it might get hard to find enough somewhat-competent lower-middle managers who will play along).
What would it look like to achieve an actually-meaningful level of awareness? I would say “there is a widely-known and negative-affect-laden term for the behavior of making strictly-worse choices to prove loyalty”.
Writing this, I realized that the central example of “negative-sum behavior to prove loyalty” is hazing. (I think some forms of hazing involve useful menial labor, but classic frat-style hazing is unpleasant for the pledges with no tangible benefit to anyone else). It seems conceivable to get the term self-hazing into circulation to describe cases like the one in OP, to the point that someone might notice when they’re being expected to self-haze and question whether they really want to go down that road.

ADifferentAnonymous 29 Dec 2020 16:47 UTC
1 point
on: Would the Real Economy Please Stand Up
I find this distinction useful as well. I suspect it’s one that many people understand implicitly and many others totally lack. Evidence of the latter: I’ve seen intelligent people be far too upset by https://en.wikipedia.org/wiki/K_Foundation_Burn_a_Million_Quid.

ADifferentAnonymous 26 Jan 2021 19:17 UTC
7 points
in reply to: jbash’s comment on: New Empty Units
Searching “real estate money laundering”, it does sound like this is a real thing. But the few pages I just read generally don’t emphasize the “overpaying in exchange for out-of-band services” mechanism—they seem to be thinking in terms of buying (with dirty money) and selling (for clean money) at market prices, and emphasize that real estate’s status as “a good investment” is an important part of why criminals use it.
(They also bring up international tax avoidance strategies. Obviously using property to “park your wealth” also relies on prices not going down and hopefully going up at a reasonable rate).
So it sounds like OP’s strategy of building more and more until the speculators stop paying would work almost equally well against these types of buyers.

ADifferentAnonymous 8 Feb 2021 16:25 UTC
1 point
in reply to: guzey’s comment on: Massive consequences
I think the idea is that Huemer’s quote seems to itself be an effort to repair society without fully understanding it.
I don’t think this is a facile objection, either*—I think it’s very possible that “Voters, activists, and political leaders” are actually an essential part of the complex mechanism of society and if they all stopped trying to remedy problems things would get even worse.
On the other hand, you can recurse this reasoning and say that maybe bold counterintuitive philosophical prescriptions like Huemer’s are also part of the complex mechanism.
*To the quote as a standalone argument, anyway—haven’t read the essay.

ADifferentAnonymous 8 Feb 2021 17:14 UTC
5 points
in reply to: Bucky’s comment on: Quadratic, not logarithmic
Another explanation for logarithmic thinking is Laplace’s rule of succession.
If you have N exposures and have not yet had a bad outcome, the Laplacian estimate of a bad outcome from the next exposure goes as 1/N (the marginal cost under a logarithmic rule).
Applying this to “number of contacts” rather than “number of exposures” is admittedly more strained but I could still see it playing a part.

ADifferentAnonymous 10 Feb 2021 2:09 UTC
1 point
in reply to: mwacksen’s comment on: Quadratic, not logarithmic
If you’re deciding whether or not to add the (n+1)th person, what matters is the marginal risk of that decision.

ADifferentAnonymous 10 Feb 2021 3:01 UTC
3 points
on: Quadratic, not logarithmic
I remember very early in the pandemic reading an interview with someone who justified their decision to continue going to bars by pointing that they had a high-contact job that they still had to do. I noticed that this in fact made their decision worse (in terms of total societal Covid risk).
(And as the number of cases was still quite low at the time, the 100% bound on risk was much less plausibly a factor)

ADifferentAnonymous 23 Feb 2021 15:11 UTC
2 points
on: The slopes to common sense
I think one of my main contrarian instincts is to see a flat direction and worry we’ve been creeping up it, to the point that I’m actually pretty receptive to arguments for going the other way.

I take it somewhat as a sign I have this well-calibrated that your more-sleep and less-sleep paragraphs sounded about equally reasonable to me.

ADifferentAnonymous 15 Mar 2021 23:00 UTC
4 points
in reply to: Dave Lindbergh’s comment on: Blue is arbitrary
“Eurocentric paint” is an imprecise phrase. I first read it as meaning “traditionally-used European paints”, with the implication that other cultures chose their colors based on different paints. But the rest of the post makes clear it’s the idea of basing colors on paints that’s allegedly Eurocentric; so the better phrasing might be “Eurocentric fixation on paint”.
I was taught in (US) school that the primary colors were red, yellow, and blue and the secondaries were green, orange and purple (which matches the ‘rainbow’ in the comic, though the ‘rainbow’ I learned was ROYGBIV). Per https://en.wikipedia.org/wiki/Color_theory#Traditional_color_theory, this only works with paint:
One reason the artist’s primary colors work at all is due to the imperfect pigments being used have sloped absorption curves, and change color with concentration… Another reason the correct primary colors were not used by early artists is they were not available as durable pigments. Modern methods in chemistry were needed to produce them.
Granted, I was taught those colors in conjunction with being given paint to play with, which is a good reason to teach them. But it’s still a bit striking that at no point in my education was I taught any other set of primary colors, except implicitly by picking RGB colors in MS Paint (an ironic name, in context).
I’m pretty sure that the common intuition among my classmates, way back in childhood, was that the first-tier colors were red, yellow, blue and green. This turns out to be supported by a relatively sophisticated color theory based neither on natural occurrences of colors nor on any means of producing colors but rather the brain’s fundamental abstractions for processing them.

ADifferentAnonymous 18 Mar 2021 16:51 UTC
3 points
on: Product orientation
For question 5, maybe try out different shopping-like activities to see if any of them are less aversive.
Some examples:
- Researching a product category without the intention to make a purchase.
  - A few ways to motivate this, if ‘product orientation practice’ isn’t motivating
    Market research for a potential product
    Write a buying guide others might appreciate
    Things you might buy someday but not soon
    Fantasy purchases. “If I were going to buy a yacht/private plane/supercar, which one would I want?”
- Cheap unimportant purchases where the consequences of choosing wrong are minimal
- Choosing among free things (e.g open-source libraries)

ADifferentAnonymous 30 Mar 2021 15:49 UTC
1 point
on: The (not so) paradoxical asymmetry between position and momentum
‘Symmetric vs. asymmetric’ isn’t the right distinction; merely noting that a Hamiltonian is asymmetric in position and momentum can’t tell you anything about which one is fundamental!
The notable thing about position in our universe is that there are no interactions that don’t lose strength with increasing distance (I think?), and in ancestral human life the Earth’s gravity is the only obviously-important violation of strong locality.
As for why this is, I’m inclined toward anthropic explanations. This could just be a limit of human intuition, but it seems like locality is really helpful for complex purposeful structures. E.g., it allows a cell to control an interaction neighborhood such that everything that happens inside the membrane is coordinated. If some interactions were position-local and others momentum-local, you’d have to try to defend a neighborhood in both position-space and momentum-space, but your momentum-space boundaries would drift apart in position-space, and the need to stay in your momentum-space neighborhood would constrain your ability to update your position… it seems hard.

ADifferentAnonymous 2 Apr 2021 21:28 UTC
1 point
in reply to: Bunthut’s comment on: Learning Russian Roulette
Even after you’ve gotten an infinite amount of evidence against every possible alternative consideration, you’ll still believe that youre certain to survive
Isn’t the prior probability of B the sum over all specific hypotheses that imply B? So if you’ve gotten an arbitrarily large amount of evidence against all of those hypotheses, and you’ve won at Russian Roulette an arbitrarily high number of times… well, you’ll just have to get more specific about those arbitrarily large quantities to say what your posterior is, right?