simon

Karma: 1,178

simon May 2, 2025, 2:59 AM
2 points
0
in reply to: abstractapplic’s comment on: D&D.Sci Tax Day: Adventurers and Assessments Evaluation & Ruleset
Interesting link on symbolic regression. I actually tried to get an AI to write me something similar a while back^[1] (not knowing that the concept was out there and foolishly not asking, though in retrospect it obviously would be).
From your response to kave:
calculate a quantity then use that as a new variable going forward
In terms of the tree structure used in symbolic regression (including my own attempt), I would characterize this as wanting to preserve a subtree and letting the rest of the tree vary.
Possible issues:
1. If the coding modifies the trees leaf-first, trees with different roots but common subtrees aren’t treated as close to each other. This is an issue that my own version would likely have had even if actually implemented^[2]. However, I think PySR might at least partially address this issue (It uses genetic programming and the pictures in the associated paper seem to indicate that it is generating trees which at least sometimes preserve subtrees.) (Though the genetic programming approach is likely to make it hard to find the very simplest solutions in practice imo.^[3])
2. Even if you are treating trees with common subtrees as close to each other, if your evaluation of trees only compares final calculated values on the entire dataset, then it’s hard to make the call “I know this subtree is important even if I don’t know the rest of the tree”, because the results are not likely to be all that close unless you already have a reasonable guess for the rest of the tree. One partial (heh) answer might be to award part marks to solutions that work well for some of the data even if wildly off for other parts. Careful thinking might be required to do this in a way that doesn’t backfire horribly, though. Hmm, or maybe you CAN do that in the existing paradigm by including if/then nodes in the tree? Say, a node that has three child nodes/subtrees, and chooses between two of them based on the value of the third? And then (in some genetic-programming-like approach perhaps) explore what happens if you copy those subtrees elsewhere, or existing subtrees into new if-then nodes? (I can imagine the horrific unreadable mess already though...)
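To make the if/then node idea concrete, here is a minimal sketch of an expression tree with a three-child conditional node. The class names (`Var`, `Const`, `BinOp`, `IfPositive`) and the "condition is positive" convention are assumptions made for illustration; this is not how PySR or any other library represents its trees.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict

Row = Dict[str, float]


@dataclass(frozen=True)
class Var:
    name: str

    def eval(self, row: Row) -> float:
        return row[self.name]


@dataclass(frozen=True)
class Const:
    value: float

    def eval(self, row: Row) -> float:
        return self.value


@dataclass(frozen=True)
class BinOp:
    op: Callable[[float, float], float]
    left: object
    right: object

    def eval(self, row: Row) -> float:
        return self.op(self.left.eval(row), self.right.eval(row))


@dataclass(frozen=True)
class IfPositive:
    """Three children: picks `then` or `otherwise` based on the value of `cond`."""
    cond: object
    then: object
    otherwise: object

    def eval(self, row: Row) -> float:
        branch = self.then if self.cond.eval(row) > 0 else self.otherwise
        return branch.eval(row)


# Hypothetical piecewise rule: 2*x when x > 5, otherwise x + 1.
piecewise = IfPositive(
    cond=BinOp(operator.sub, Var("x"), Const(5.0)),
    then=BinOp(operator.mul, Const(2.0), Var("x")),
    otherwise=BinOp(operator.add, Var("x"), Const(1.0)),
)

high = piecewise.eval({"x": 7.0})
low = piecewise.eval({"x": 3.0})
```

Each branch is an ordinary subtree, so a mutation step could in principle copy `then` or `otherwise` elsewhere, which is the exploration described above.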
edited to add: it might be more appropriate to say that I had been planning on asking an AI to code something, but the initial prototype was sufficiently lame and gave me enough insight into the difficulties ahead I didn’t continue. Claude chat link if anyone’s interested.
edited to further add: hmm, what you are wanting (“new variable”) is probably not just preserving a subtree, but for the mutation system to be able to copy that subtree to other parts of the tree (and the complexity calculator to not give too much penalty to that, I guess). Interestingly, it seems that PySR’s backend at least (SymbolicRegression.jl) does have the capability to do this already, using a “form_random_connection!” mutation function that apparently allows the same subtree to appear as child of multiple parents, making a DAG instead of a tree. In general, I’ve been pretty impressed looking at SymbolicRegression.jl. Maybe other symbolic regression software is as feature-rich, but I haven’t checked.
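A small sketch of why sharing a subtree matters for the complexity penalty: counted as a tree, a reused subtree is paid for every time it appears; counted as a DAG, it is paid for once. The `Node` class and both size functions are hypothetical illustrations and are not SymbolicRegression.jl's actual complexity calculation.

```python
class Node:
    """Minimal expression node: a label plus zero or more children."""

    def __init__(self, label, *children):
        self.label = label
        self.children = children


def tree_size(node):
    # Counts every occurrence, so a shared subtree is counted once per parent.
    return 1 + sum(tree_size(child) for child in node.children)


def dag_size(node, seen=None):
    # Counts each distinct node object once, however many parents it has.
    seen = set() if seen is None else seen
    if id(node) in seen:
        return 0
    seen.add(id(node))
    return 1 + sum(dag_size(child, seen) for child in node.children)


# The "new variable": x * y, computed once and used in two places.
shared = Node("*", Node("x"), Node("y"))
expr = Node("+", Node("sin", shared), Node("/", shared, Node("z")))

as_tree = tree_size(expr)
as_dag = dag_size(expr)
```

Here the same object `shared` is a child of two parents, so the tree count pays for its three nodes twice while the DAG count pays once.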
1. ^
  Apparently November 2024. Feels longer ago somehow.
2. ^
  I hadn’t actually gone beyond breadth-first search though.
3. ^
  This is informed by (a tiny amount of) practical experience. After SarahNibs’ comment suggested genetic programming would have worked on the “Arena of Data”, I attempted genetic programming on it and on my initial attempt got … a horrific unreadable mess. Maybe it wasn’t “halfway decently regularized” but I updated my intuition to say: complicated ways to do things so greatly outnumber the simple ways that anything too reliant on randomness is not likely to find the simple way.

simon Apr 29, 2025, 5:42 PM
2 points
0
in reply to: aphyer’s comment on: D&D.Sci Tax Day: Adventurers and Assessments Evaluation & Ruleset
And just now I thought, wait, wouldn’t this sometimes round to 10, but no, an AI explained to apparently-stupid me again that since it’s a 0.25 tax rate on integer goods, fractional gold pieces before rounding (where not a multiple of 0.1) can only be 0.25, which rounds down to 2 silver, or 0.75, which rounds up to 8 silver. Which makes it all the more surprising that I didn’t notice this pattern.
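A quick check of that claim, as a sketch: enumerate 25% of every integer gold value, convert to silver, and collect the silver digit wherever a tie occurs. The range of 400 is arbitrary; exact fractions are used so that no floating-point issue can hide a case.

```python
from fractions import Fraction

tie_silver = set()
for goods_in_gold in range(1, 401):
    silver = Fraction(goods_in_gold, 4) * 10  # 25% tax, expressed in silver
    if silver.denominator != 1:               # non-integer silver, so rounding is needed
        tie_silver.add(silver % 10)           # silver part within the current gold piece

# round() on a Fraction rounds ties to the nearest even integer, like round() on floats.
rounded = sorted(round(s) for s in tie_silver)
```

The only tie values that turn up are 2.5 and 7.5 silver, so the silver digit lands on 2 or 8 and never rolls over to 10.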

simon Apr 29, 2025, 5:42 AM
6 points
0
on: D&D.Sci Tax Day: Adventurers and Assessments Evaluation & Ruleset
Thanks aphyer, it was an interesting puzzle. I feel like it was particularly amenable to being worked out by hand relative to machine learning because of the determinism, rules easy to express in a spreadsheet, and simple subsets of the data (like the special cleric bracket) that could be built on.
This is resolved using python’s round() function...my apologies to simon, who seems to have spent a while trying to figure out when it rounded up or down.
I don’t recall that taking all that long really, I added one monster part type at a time (much like the main calculation but much easier since it’s just binary and only for the special 5U+ tax bracket, so less interactions to worry about).
Funny thing is, after seeing the calculation I still didn’t understand why it rounded the way it does, until I asked an AI, which explained that Python rounds to even numbers on ties (apparently called “banker’s rounding” or “banker’s rule”). The only source of non-integer values is U, which provides a single factor of 2 in the number of silver pieces, resulting in all rounding being of numbers with 0.5 in the remainder of silver pieces and so applying this rule. The other monster parts provide 2 factors of 2 each, not directly causing rounding but each one incrementing the tax by 1 silver (modulo 2 silver) and thus changing whether this rule rounds up or down (except 2nd and up dragon heads, which provide 3 factors of 2).
Amusingly, I could have much more simply expressed this rounding rule as “if rounding is required, round to the nearest even number of silver pieces” but I wasn’t thinking about it in terms of output, so missed this and expressed it much more complicatedly in terms of the input. Oops!
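For anyone else surprised by this, a short comparison of Python's built-in `round()` with the "always round .5 up" rule, on a few tie values chosen for illustration. The `half_up` helper is just a name used here, built on the standard `decimal` module.

```python
from decimal import Decimal, ROUND_HALF_UP


def half_up(x):
    """Round to the nearest integer, with ties always going up."""
    return int(Decimal(str(x)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


ties = [211.5, 212.5, 213.5, 214.5]     # silver amounts ending in exactly .5
bankers = [round(t) for t in ties]      # built-in round(): ties go to the even neighbour
schoolbook = [half_up(t) for t in ties] # ties always go up
```

With built-in `round()` every result is an even number of silver pieces, which is the "round to the nearest even number" phrasing of the rule.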
if taxed_goods['U'] >= 5: tax_rate = min(tax_rate, 0.25)
Ah, makes sense that this would be its own special tax rate rather than the 50% bracket with a X2 discount (which amounts to the same thing in the end). That min though with the tax rate from other sources is another thing that was never triggered (and literally couldn’t be triggered since 5U alone is enough to get to the 30% bracket and also prevents eligibility for the cleric bracket).

simon Apr 17, 2025, 4:58 PM
9 points
0
on: D&D.Sci Tax Day: Adventurers and Assessments
Thanks aphyer. Solution:
Number of unique optimal assignments (up to reordering) (according to AI-written optimizer implementing my manually found tax calculation): 1
Minimum total tax: 212 (err, that’s 21 gp 2 sp)
Solution 1:
Member 1: C=1, D=1, L=0, U=0, Z=4, Tax=0
Member 2: C=1, D=1, L=0, U=1, Z=1, Tax=0
Member 3: C=1, D=1, L=0, U=1, Z=1, Tax=0
Member 4: C=1, D=1, L=5, U=5, Z=2, Tax=212
Tax calculation:
1. Add up the base values: 6 for C, 14 for D, 10 for L, 7 for U, 2 for Z
2. If only L and Z, just take the total base and exit.
3. Otherwise, set a tier as follows: tier 0 if 0 < base < 30, tier 1 if 30 <= base < 60, tier 2 if 60 <= base < 100, tier 3 if base >= 100.
4. If U >= 5, then use the max tier regardless (but a 2x discount is also triggered later).
5. If D >=2, then increase the base as if the extra dragons beyond 1 are doubled. This doesn’t change the tier.
6. multiply the base value by tier + 2
7. If U >= 5, divide by 2
8. Discount by 60*Ceiling(C/2) (can’t go below 0)
Rounding is needed if U is an odd number >=5. To determine if you round up or down, add up the numbers for C,L and Z, add to (U-1)/2, plus 1 if there is at least one D. Then round down if that number is odd and up if it is even. (Presumably this arises in some more natural way in the actual calculation used, but this calculation gives 0 residuals, so...). Todo: find a more natural way for this to happen.
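The steps above can be sketched in Python as follows. This is a transcription of the hand-derived rule exactly as listed (result in silver pieces), not the scenario's actual code; the function name `tax_silver` and the use of integer arithmetic for the halving step are choices made for this sketch.

```python
import math

BASE = {"C": 6, "D": 14, "L": 10, "U": 7, "Z": 2}


def tax_silver(C=0, D=0, L=0, U=0, Z=0):
    """Tax in silver pieces for one member's haul, following steps 1-8 above."""
    counts = {"C": C, "D": D, "L": L, "U": U, "Z": Z}
    base = sum(BASE[part] * n for part, n in counts.items())   # step 1
    if C == 0 and D == 0 and U == 0:                            # step 2: only L and Z
        return base
    if base < 30:                                               # step 3
        tier = 0
    elif base < 60:
        tier = 1
    elif base < 100:
        tier = 2
    else:
        tier = 3
    if U >= 5:                                                  # step 4
        tier = 3
    if D >= 2:                                                  # step 5: tier unchanged
        base += BASE["D"] * (D - 1)
    amount = base * (tier + 2)                                  # step 6
    if U >= 5:                                                  # step 7
        amount, remainder = divmod(amount, 2)
        if remainder:  # only happens when U is odd
            parity = C + L + Z + (U - 1) // 2 + (1 if D >= 1 else 0)
            if parity % 2 == 0:
                amount += 1                                     # even: round up
    return max(0, amount - 60 * math.ceil(C / 2))               # step 8


party = [
    tax_silver(C=1, D=1, Z=4),
    tax_silver(C=1, D=1, U=1, Z=1),
    tax_silver(C=1, D=1, U=1, Z=1),
    tax_silver(C=1, D=1, L=5, U=5, Z=2),
]
total = sum(party)
```

Run on the four members of the solution above, this gives taxes of 0, 0, 0 and 212 silver.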
Post-hoc rationalization of optimal solution found: giving everyone a C helps get everyone an up to 60 tax credit. Spreading the D’s out also prevents D doubling. The C and D add up to a base value of 20. We can fit up to 9 more base value without going to the next tax bracket; this is done using 4Z (8 base value) or 1U 1Z (9 base value). The last member has to pay tax at the highest bracket but U=5 also gives the factor of 2 discount so it’s not so bad. They get everything they have to take in order to not push the others above 29 base, but no more.
This seemed relatively straightforward conceptually to solve (more a matter of tweaking implementation details like D doubling and so on, though I expect many things are conceptualized differently in the actual calculation). I went from the easiest parts (L and Z), then added U, then looked at D since it initially looked less intimidating than C, then switched to C, solved that, and finally D. It would have been much harder if it were not deterministic, or if there wasn’t data within simpler subsets (like L and Z only) for a starting point.
The ultimate solution found is 12 sp better than the best solutions available using only rows that actually occur in the data (also found using AI-written script).
Tools used: AI for writing scripts to manipulate CSV files and finding optimal solutions, etc. LibreOfficeCalc for actually looking at the data and figuring out the tax calculation.
Additional todo: look at the distributions of part numbers to try to find out how the dataset was generated.
edited to add: my optimizer used brute(ish) force, unlike DrJones’. It uses double meet-in-the-middle (split two ways, then split again) with memoization and symmetry reduction using lexicographical ordering (symmetry reduction and memoization were optimizations added by AI after I complained to it about the time an initial version, also written by AI, was taking).
P. S. I used GPT-4.1 in Windsurf for the AI aspects. They’re running a promotion where it costs 0 credits until, IIRC, April 21.

simon Apr 3, 2025, 6:53 PM
21 points
2
in reply to: Mateusz Bagiński’s comment on: Why Have Sentence Lengths Decreased?
FWIW there is a theory that there is a cycle of language change, though it seems maybe there is not a lot of evidence for the isolating → agglutinating step. IIRC the idea is something like that if you have a “simple” (isolating) language that uses helper words instead of morphology eventually those words can lose their independent meaning and get smushed together with the word they are modifying.

simon Mar 20, 2025, 8:32 PM
16 points
0
on: Intention to Treat
Also, when doing a study, please write down afterwards whether you used intention to treat or not.
Example: I encountered a study that says post meal glucose levels depend on order in which different parts of the meal were consumed. But the study doesn’t say whether every participant consumed the entire meal, and if not, how that was handled when processing the data. Without knowing if everyone consumed everything, I don’t know if the differences in blood glucose were caused by the change in order, or by some participants not consuming some of the more glucose-spiking meal components.
In that case, intention to treat (if used) makes the result of the study less interesting since it provides another effect that might “explain away” the headline effect.

simon Mar 16, 2025, 7:18 PM
8 points
1
on: 2024 Unofficial LessWrong Survey Results
Issues with the dutch book beyond the marginal value of money:
- It’s not as clear as it should be that the LLM IQ loss question is talking about a permanent loss (I may have read it as temporary when answering)
- Although the LLM IQ drop question does say “your IQ” there’s an assumption that that sort of thing is a statistical average—and I think the way I use LLMs, for example, is much less likely to drop my IQ than the average person’s usage.
- I think the LessWrong subscription question is implicitly asking about the marginal value of LessWrong given the existence of other resources, while the relative LessWrong/LLM value question is implicitly leaning more towards non-marginal value obtained, which might be very many times more
impact: these issues increase LLM/IQ and (Lesswrong/LLM relative to LessWrong/$), which cause errors in the same direction in the LLM/IQ/$/Lesswrong/LLM cycle, potentially by a very large multiplier.
Marginal value due to the high IQ gain of 5 lowers $/IQ which increases IQ/$. This also acts in the same direction.
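To spell out what "errors in the same direction" does to the cycle, here is a sketch with entirely made-up numbers. `rates[(a, b)]` is taken to mean "units of b judged equal in value to one unit of a"; that convention, the cycle order and every figure are assumptions for illustration, not survey data. Consistent answers multiply to 1 around the cycle.

```python
import math


def cycle_product(rates, cycle):
    """Multiply the stated exchange rates around a closed cycle of goods."""
    product = 1.0
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        product *= rates[(a, b)]
    return product


cycle = ["LLM", "IQ", "USD", "LessWrong"]

# Hypothetical consistent answers: the product around the cycle is 1.
consistent = {
    ("LLM", "IQ"): 0.2,
    ("IQ", "USD"): 1000.0,
    ("USD", "LessWrong"): 0.01,
    ("LessWrong", "LLM"): 0.5,
}

# Same answers, but with two of the rates each biased upward by a factor of 3.
biased = dict(consistent)
biased[("LLM", "IQ")] *= 3
biased[("LessWrong", "LLM")] *= 3

consistent_product = cycle_product(consistent, cycle)
biased_product = cycle_product(biased, cycle)
```

Because the biases sit in different links of the same cycle and point the same way, they multiply (3 × 3 = 9 here) rather than cancel.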
(That’s my excuse anyway. I suspected the cycle when answering and was fairly confident, without actually checking, that I was going to be way off from a “consistent” value. I gave my excuse as a comment in the survey itself that I was being hasty, but on reflection I still endorse an “inconsistent” result here, modulo the fact that I likely misread at least one question).

simon Mar 2, 2025, 8:41 PM
2 points
0
on: Maintaining Alignment during RSI as a Feedback Control Problem
Control theory, I think, often tends to assume that you are dealing with continuous variables, which I think the relevant properties of AIs are likely (in practice) not: even if the underlying implementation uses continuous math, RSI will make finite changes, and even small changes could cause large differences in results.
Also, the dynamics here are likely to depend on capability thresholds which could cause trend extrapolation to be highly misleading.
Also, note that RSI could create a feedback loop which could enhance agency including towards nonaligned goals (agentic AI convergently wants to enhance its own agency).
Also beware that agency increases may cause increases in apparent capability because of Agency Overhang.

simon Feb 24, 2025, 6:54 PM
2 points
0
on: Complete Feedback
The AI system accepts all previous feedback, but it may or may not trust anticipated future feedback. In particular, it should be trained not to trust feedback it would get by manipulating humans (so that it doesn’t see itself as having an incentive to manipulate humans to give specific sorts of feedback).
I will call this property of feedback “legitimacy”. The AI has a notion of when feedback is legitimate, and it needs to work to keep feedback legitimate (by not manipulating the human).
Legitimacy is good—but if an AI that’s supposed to be intent-aligned to the user would find that it has an “incentive” to purposefully manipulate the user in order to get particular feedback from the user, unless it pretends that it would ignore that feedback, it’s already misaligned and that misalignment should be dealt with directly IMO—this feels to me like a band-aid over a much more serious problem.

simon Feb 2, 2025, 8:02 PM
6 points
2
on: Escape from Alderaan I
Luke ignited the lightsaber Obi-Wan stole from Vader.
This temporarily confused me until I realized
it was not talking about the lightsaber Vader was using here, but about the one that Obi-Wan took from him in the Revenge of the Sith and gave to Luke near the start of A New Hope.

simon Jan 11, 2025, 10:48 PM
6 points
−1
in reply to: gwern’s comment on: Fluoridation: The RCT We Still Haven’t Run (But Should)
We may thus rule out negative effects larger than 0.14 standard deviations in cognitive ability if fluoride is increased by 1 milligram/liter (the level often considered when artificially fluoridating the water).
That’s a high level of hypothetical harm that they are ruling out (~2 IQ points?). I would take the dental harms many times over to avoid that much cognitive ability loss.

simon Jan 10, 2025, 8:17 PM
2 points
0
in reply to: aphyer’s comment on: D&D.Sci Dungeonbuilding: the Dungeon Tournament Evaluation & Ruleset
actually, there are ~100 rows in the dataset where Room2=4, Room6=8, and Room3=5=7.
I actually did look at that (at least some subset with that property) at some point, though I didn’t (think of/ get around to) re-looking at it with my later understanding.
In general, I think this is a realistic thing to occur: ‘other intelligent people optimizing around this data’ is one of the things that causes the most complicated things to happen in real-world data as well.
Indeed, I am not complaining! It was a good, fair difficulty to deal with.
That being said, there was one aspect I did feel was probably more complicated than ideal, and that was the combination of the tier-dependent alerting with the tiers not having any other relevance than this one aspect. That is, if the alerting had in each case been simply dependent on whether the adventurers were coming from an empty room or not, it would have been a lot simpler to work out. And if there was tier dependent alerting, but the tiers were more obvious in other ways*, it would still have been tricky but at least there would be a path to recognize the tiers and then try to figure out other ways that they might have relevance. The way it was it seemed to me you pretty much had to look at what were (ex ante) almost arbitrary combinations of (current encounter, next encounter) to figure that aspect out, unless you actually guessed the rationale of the alerting effect.
That might be me rationalizing my failure to figure it out though!
* e.g. perhaps the traps/golems could have had the same score as the same-tier nontrap encounter when alerted (or alternatively when not alerted)

simon Jan 10, 2025, 6:52 PM
6 points
1
on: Rebuttals for ~all criticisms of AIXI
The biggest problem about AIXI in my view is the reward system - it cares about the future directly, whereas to have any reasonable hope of alignment an AI in my view needs to care about the future only via what humans would want about the future (so that any reference to the future is encapsulated in the “what do humans want?” aspect).
I.e. the question it needs to be answering is something like “all things considered (including the consequences of my current action on the future, as well as taking into account my possible future actions) what would humans, as they exist now, want me to do at the present moment?”
Now maybe you can take that question and try to slice it up into rewards at particular timesteps, which change over time as what is known about what humans want changes, without introducing corrigibility issues, but the AIXI reward framework isn’t really buying you anything imo even if that works, relative to directly trying to get an AI to solve the question.
On the other hand, approximating Solomonoff induction might afaik be a fruitful approach, though the approximations are going to have to be very aggressive for practical performance. I do agree embedding/self-reference can probably be patched in.

simon Jan 8, 2025, 9:14 PM
7 points
−3
on: On Eating the Sun
I think that it’s likely to take longer than 10000 years, simply because of the logistics (not the technology development, which the AI could do fast).
The gravitational binding energy of the sun is something on the order of 20 million years worth of its energy output. OK, half of the needed energy is already present as thermal energy, and you don’t need to move every atom to infinity, but you still need a substantial fraction of that. And while you could perhaps generate many times more energy than the solar output by various means, I’d guess you’d have to deal with inefficiencies and lots of waste heat if you try to do it really fast. Maybe if you’re smart enough you can make going fast work well enough to be worth it though?
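A back-of-the-envelope version of that figure, as a sketch: it uses the uniform-density formula U = (3/5)·G·M²/R with rounded standard solar constants. Uniform density is a simplifying assumption; the real Sun is centrally concentrated, so its binding energy would be somewhat larger than this estimate.

```python
G = 6.674e-11             # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30          # solar mass, kg
R_SUN = 6.957e8           # solar radius, m
L_SUN = 3.828e26          # solar luminosity, W
SECONDS_PER_YEAR = 3.156e7

# Gravitational binding energy of a uniform-density sphere.
binding_energy_joules = 0.6 * G * M_SUN**2 / R_SUN

# How many years of current solar output that energy corresponds to.
years_of_output = binding_energy_joules / L_SUN / SECONDS_PER_YEAR
```

This comes out at a bit under 2 × 10^41 J... more precisely about 2.3 × 10^41 J, or roughly 19 million years of solar output.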

simon Jan 8, 2025, 6:35 AM
5 points
0
on: D&D.Sci Dungeonbuilding: the Dungeon Tournament Evaluation & Ruleset
I feel like a big part of what tripped me up here was an inevitable part of the difficulty of the scenario that in retrospect should have been obvious. Specifically, if there is any variation in difficulty of an encounter that is known to the adventurers in advance, the score contribution of an encounter type in actual paths taken is less than the difficulty of the encounter as estimated by what best predicts the path taken (because the adventurer takes the path when it’s weak, but avoids when it’s strong).
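A toy illustration of this selection effect, with made-up numbers: suppose the hag's difficulty varies, the adventurers know it in advance, and they only enter her room when she is easier than the alternative. All values below are hypothetical.

```python
# Hypothetical, equally likely hag difficulties, known to the adventurers in advance.
hag_difficulties = [2, 4, 6, 8]
# Hypothetical fixed difficulty of the alternative room.
alternative = 5

# The adventurers take the hag room only when she is the easier option.
taken = [d for d in hag_difficulties if d < alternative]

mean_difficulty = sum(hag_difficulties) / len(hag_difficulties)
mean_when_taken = sum(taken) / len(taken)
```

The hag's average difficulty is 5, which is what governs how often she is avoided, but her average score contribution on paths that actually include her is only 3.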
So, I wound up with an epicycle saying hags and orcs were avoided more than their actual scores warranted, because that effect was most significant for them (goblins are chosen over most other encounters even if alerted, and Dragons mostly aren’t alerted).
This effect was made much worse by the fact that I was getting scores mainly from lower difficulty dungeons, with lots of “Nothing” rooms and low level encounters. But even once I estimated scores from the overall data with my best guesses for preference order, the issue still applied, just not quite so badly.
In the “what if” department, I had said:

> I’m also getting remarkably higher numbers for Hag compared with my earlier method. But I don’t immediately see a way to profitably exploit this.
The most obvious way to exploit this would have been the optimal solution. Why didn’t I do it? The answer is that, as indicated above, I was still underestimating the hag (whereas at this point I had mostly-accurate scores for the traps and orcs). With my underestimate for the hag’s score contribution, I didn’t think it was worth giving up an orc-boulder trap difference to get a hag-orc difference. I also didn’t realize I needed the hag to alert the dragon.
In general, I feel like I was pretty far along with discovering the mechanics despite some missteps. I correctly had the adventurers taking a 5-encounter path with right/down steps, the choice of next step being based on the encounters in the choices for the next room, with an alerting mechanism, and that the alerting mechanism didn’t apply to traps and golems.
On the other hand, I applied the alerting mechanism only to score and not to preference order, except for goblins and orcs (why didn’t I try to apply it to preference order for other encounters once I realized it applied to preference order for goblins and orcs and that some degree of alerting mechanism score effect applied to other encounters ?????) (I also got confused into thinking that the effect on orc preference order only applied if the current encounter was also orcs). I also didn’t realize that the alerting mechanism had different sensitivity for different encounters, and I had my mistaken belief about the preference order being different from expected score for some encounter types (hey, the text played up how unnerving the hag was, there was some plausibility there!).
I think if I had gotten to where I was in my last edit early on in the time frame for this scenario instead of near the end, and had posted it, and other people had read it and tried it out, collectively we would have had a good chance of solving the whole thing. I also would have been much more likely to get the optimal solution if I had paid more attention to what abstractapplic said, instead of only very briefly glancing over his comments after posting my very belated comment and going back to doing my own thing.
In my view, a fun, challenging and theoretically solvable scenario (even if actually not that close to being solved in practice), so I think it was quite good.

simon 6 Jan 2025 9:33 UTC
2 points
0
on: D&D.Sci Dungeonbuilding: the Dungeon Tournament
Looking like I’ll not have figured this out before the time limit despite the extra time, what I have so far:
I’m modeling this as follows, but haven’t fully worked it out and am getting complications/hard-to-explain dungeons that suggest that it might not be exactly correct
- the adventurers go through the dungeons using rightwards and downwards moves only, thus going through 5 rooms in total.
- at each room they choose the next room based on a preference order (which I am assuming is deterministic, but possibly dependent on, e.g. what the current room is)
- the score is dependent only on the rooms they pass through (but again, am getting complications)
- I’m assuming a simple addition of scores to start with, but then adding epicycles (which so far have been based on the previous room, generally)
- there is some randomness in the individual score contributions from each encounter.
For the dungeon generation: dungeon generation seems to treat rooms 1-8 equally (room 9 is different and tends to have harder encounters). Encounters of the same types (and some related “themes”) tend to be correlated. Scores in each tournament seem to be whole numbers from each judge and averaged between 3 or 4 judges; I am not sure if any tournaments are judged by 2 or 1, but if so they’re relatively less common.
In theory, I’d like to plug in a preference model and a score model to a simulator and iterate to refine, but I’m not there yet, still working out plausible scores and preferences.
One possibility for the scores and preference order:
baseline average scores:
Nothing: 0; Goblins: 1.5 (1d2?); Whirling Blade Trap 3; Orcs 3; Hag 4; Boulder Trap 4.5; Clay Golem 6, Dragon 6?, Steel Golem 7.5 (edit: <--- numbers estimated with small, atypical samples (included many Nothing, which is problematic for reasons that become obvious with below edit))
With Goblins and Orcs being increased (doubled?) if following goblins/orcs/any trap? (edit—or golems?) (edit—looking now like it’s probably anything but an empty room?)
Plus with the adventurers seemingly avoiding Orcs and Hags more than their difficulty warrants? (I found them to be relatively late in the preference order, then found that they were in practice lower in score, so am having to ad hoc adjust if I keep the assumption that the score contribution and preference order are related. 1.5 multiplier? 2x multiplier? fixed addition?) (I’m assuming a 1.5x multiplier atm since I initially had Hag avoided over anything but orcs, but found one dungeon that looks suspiciously like, but does not prove, Hag being chosen over Dragon (edit: see below for update)) (I suppose +2 would also work) (edit: it looks like the Orc difficulty increase for following a non-empty room only applies to adventurer preference if the current room is also Orcs, violating the assumption that preference is tied to expected difficulty. But for Goblins it seems the preference may indeed depend only on following a non-empty room, though in practice it doesn’t matter much since it only affects order wrt WBT).
(edit—see update to preference order below)
Assuming the above is correct, and I’m pretty sure it isn’t but hopefully has some relationship with reality, one strategy might be:
CHN/WON/BOD <---obsolete answer
where the idea is to use the encounters the adventurers avoid too much relative to their actual score contributions (Hag, Orcs) to herd the adventurers away from the Nothing rooms. One of the Orcs is left in after a Boulder Trap in the belief that will make it score higher than the hag. WBT is left in the preferred path to lead the adventurers along, don’t immediately see a way to avoid this.
EV if above model is correct: 6+3+4.5+6+6=25.5
How I’ve gotten here (mainly used Claude and Claude-written code, including the analysis tool which is good for prototyping if you don’t mind javascript):
- found initial basic encounter score contribution estimates from linear regression on whole dungeon
- after determining that rooms 1-8 were interchangeable as far as dungeon generation is concerned, looked at room importance to score, guessed the basic model based on that iirc (might have been more complicated than this) (I do remember considering and rejecting a model where each room is selected one at a time from the full set of available rooms, and rejecting any “symmetrical” model based on working out the full path in advance)
- initially assumed that adventurers preferred easier encounters based on the initial score estimates
- refined preference order based on minimizing variance between same-predicted-sequence-of-encounters dungeons
- tried to work out how scores actually work by filtering for specific predicted sequences of encounters and finding their scores
- found epicycles from that and started refining model, including preference order adjustments
- haven’t really finished the above step, epicycles might be because model is wrong/incomplete?
- hypothetical todo: apply model to entire dataset, also develop model for variations in score from each encounter, compare to known 3-judge and 4-judge tournaments for full Bayes assessment, refine further with this as feedback
edit: I’ve now read other people’s comments; I did not notice any 1-point jump in scores (didn’t check for it), not sure if I would have noticed if it is a judging difference as opposed to a strategy change? (wouldn’t notice if just strategy change). Also I did not notice anything special about Steel Golems at the entrance vs. other spots, did not check for any change in distribution of 3 vs 4 judge tournaments, etc.
further analysis after the above:
I’ve looked at root mean square deviation of predictions from the data for the full dataset (full Bayes seems a bit intimidating to code atm even with AI help). From this it seems the preference order is (there remains a likely possibility for more complications I haven’t checked):
~~Nothing > Goblins (current encounter null or Nothing) > Goblins (otherwise) = Whirling Blade Trap > Boulder Trap = Clay Golem = Orcs (current encounter not Orcs) > Dragon > Steel Golem >= Orcs (current encounter Orcs) > Hag~~
Nothing > Goblins (current encounter null or Nothing) > Goblins (otherwise) = Whirling Blade Trap > Boulder Trap > Clay Golem = Orcs (current encounter not Orcs) > Dragon > Orcs (current encounter Orcs) > Hag = Steel Golem
~~where I can’t distinguish between Steel Golem being preferred or equal to Orcs with current encounter being Orcs.~~
~~Soo, if Orcs are avoided equally to a Boulder Trap if the current encounter is not Orcs, I need to improve the herding.~~ But also it seems Orcs get doubled by many other encounter types? This could work:
CHN/OBN/WOD <---- current solution
Predicted value is now 6+6+3+6+6=27.
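For reference, a sketch of the model as it stands at this point: right/down moves through the 3x3 grid, next room chosen by the updated preference order, baseline scores from above, and Goblins/Orcs doubled after any non-empty room. The letters G (Goblins) and S (Steel Golem), the tie-break in favour of moving right, and all function names are assumptions of this sketch; the scores are the rough estimates above, which I already suspect are off.

```python
# Baseline score estimates; N=Nothing, G=Goblins, W=Whirling Blade Trap, O=Orcs,
# H=Hag, B=Boulder Trap, C=Clay Golem, D=Dragon, S=Steel Golem.
BASE_SCORE = {"N": 0, "G": 1.5, "W": 3, "O": 3, "H": 4,
              "B": 4.5, "C": 6, "D": 6, "S": 7.5}
DOUBLED_AFTER_NONEMPTY = {"G", "O"}


def preference_rank(encounter, current):
    """Lower rank means more preferred; mirrors the updated preference order."""
    if encounter == "G":
        return 1 if current in (None, "N") else 2
    if encounter == "O":
        return 4 if current != "O" else 6
    return {"N": 0, "W": 2, "B": 3, "C": 4, "D": 5, "H": 7, "S": 7}[encounter]


def simulate(layout):
    """Return (path, predicted score) for a layout such as 'CHN/OBN/WOD'."""
    grid = layout.split("/")
    r = c = 0
    path = [grid[0][0]]
    while (r, c) != (2, 2):
        options = []
        if c < 2:
            options.append((r, c + 1))  # listed first, so ties go right (assumption)
        if r < 2:
            options.append((r + 1, c))
        current = grid[r][c]
        r, c = min(options, key=lambda rc: preference_rank(grid[rc[0]][rc[1]], current))
        path.append(grid[r][c])
    score = 0.0
    previous = None
    for encounter in path:
        contribution = BASE_SCORE[encounter]
        if encounter in DOUBLED_AFTER_NONEMPTY and previous not in (None, "N"):
            contribution *= 2
        score += contribution
        previous = encounter
    return "".join(path), score


current_solution = simulate("CHN/OBN/WOD")
obsolete_solution = simulate("CHN/WON/BOD")
```

Under these assumptions the current layout is walked as C, O, W, O, D for 27, and the obsolete one as C, W, B, O, D for 25.5, matching the sums above.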
further edit: also refining the scores, getting probably nonsense (due to missing some dependency of some stuff on something else, probably), but it’s looking like maybe every encounter’s score depends on whether the previous encounter was Nothing/null. Except traps/golems? Which would explain why Steel Golems are being reported as better in the first slot.
I’m also getting remarkably higher numbers for Hag compared with my earlier method. But I don’t immediately see a way to profitably exploit this.

simon 30 Dec 2024 21:41 UTC
4 points
0
in reply to: habryka’s comment on: Is “VNM-agent” one of several options, for what minds can grow up into?
I feel like this discussion could do with some disambiguation of what “VNM rationality” means.
VNM assumes consequentialism. If you define consequentialism narrowly, this has specific results in terms of instrumental convergence.

You can redefine what constitutes a consequence arbitrarily. But, along the lines of what Steven Byrnes points out in his comment, redefining this can get rid of instrumental convergence. In the extreme case you can define a utility function for literally any pattern of behaviour.
When you say you feel like you can’t be dutch booked, you are at least implicitly assuming some definition of consequences you can’t be dutch booked in terms of. To claim that one is rationally required to adopt any particular definition of consequences in your utility function is basically circular, since you only care about being dutch booked according to it if you actually care about that definition of consequences. It’s in this sense that the VNM theorem is trivial.
BTW I am concerned that self-modifying AIs may self-modify towards VNM-0 agents.

But the reason is not because such self modification is “rational”.

It’s just that (narrowly defined) consequentialist agents care about preserving and improving their abilities to and proclivities to pursue their consequentialist goals, so tendencies towards VNM-0 will be reinforced in a feedback loop. Likewise for inter-agent competition.

simon 4 Dec 2024 23:43 UTC
2 points
0
in reply to: Rafael Harth’s comment on: Do simulacra dream of digital sheep?
You can also disambiguate between
a) computation that actually interacts in a comprehensible way with the real world and
b) computation that has the same internal structure at least momentarily but doesn’t interact meaningfully with the real world.
I expect that (a) can usually be uniquely pinned down to a specific computation (probably in both senses (1) and (2)), while (b) can’t.
But I also think it’s possible that the interactions, while important for establishing the disambiguated computation that we interact with, are not actually crucial to internal experience, so that the multiple possible computations of type (b) may also be associated with internal experiences—similar to Boltzmann brains.
(I think I got this idea from “Good and Real” by Gary L. Drescher. See sections “2.3 The Problematic Arbitrariness of Representation” and “7.2.3 Consciousness and Subjunctive Reciprocity”)

simon 4 Dec 2024 17:22 UTC
3 points
1
in reply to: Davidmanheim’s comment on: Do simulacra dream of digital sheep?
The interpreter, if it would exist, would have complexity. The useless unconnected calculation in the waterfall/rock, which could be but isn’t usually interpreted, also has complexity.
Your/Aaronson’s claim is that only the fully connected, sensibly interacting calculation matters. I agree that this calculation is important—it’s the only type we should probably consider from a moral standpoint, for example. And the complexity of that calculation certainly seems to be located in the interpreter, not in the rock/waterfall.
But in order to claim that only the externally connected calculation has conscious experience, we would need to have it be the case that these connections are essential to the internal conscious experience even in the “normal” case—and that to me is a strange claim! I find it more natural to assume that there are many internal experiences, but only some interact with the world in a sensible way.

simon 4 Dec 2024 16:43 UTC
2 points
0
in reply to: EuanMcLean’s comment on: Do simulacra dream of digital sheep?
But this just depends on how broad this set is. If it contains two brains, one thinking about the roman empire and one eating a sandwich, we’re stuck.
I suspect that if you do actually follow Aaronson (as linked by Davidmanheim) to extract a unique efficient calculation that interacts with the external world in a sensible way, that unique efficient externally-interacting calculation will end up corresponding to a consistent set of experiences, even if it could still correspond to simulations of different real-world phenomena.
But I also don’t think that consistent set of experiences necessarily has to be a single experience! It could be multiple experiences unaware of each other, for example.