Forged Invariant

Karma: 108

Forged Invariant May 26, 2025, 7:10 PM
1 point
0
in reply to: momom2’s comment on: D&D.Sci: The Choosing Ones
I was able to deduce them by making a scatter-plot of Colleen vs Liboulen’s predictions. You can see that this plot has the points on a “flattened prism” in 3 directions, and manually count the shifts and see that each of the underlying components has 10 possible values.
Once you have that structure, you can pick out points on the extremes and use their slopes to calculate some of the relevant slopes. Finally, I brought in Bella’s info and used that to work out the remaining stats. (I used ChatGPT for some help throwing together some linear regressions, but they needed a good bit of tweaking to be functional, and mostly agreed with the slopes that I had calculated by just looking at the scatterplots.)

Forged Invariant May 26, 2025, 8:50 AM
1 point
0
in reply to: Forged Invariant’s comment on: D&D.Sci: The Choosing Ones
At this point I am throwing everything that I found in a linear regression, because I ran out of time. My pick is:
Candidate 11, with an estimated 0.91 chance of success.
Candidates 19 and 7 would be my next choices, with 0.87 and 0.85 estimated chances of success respectively.
If I had had more time to work on this, I would have liked to look at:
- Why do 3 of the “stats” have diminishing returns, while the other 3 have increasing returns?
- Are there any temporal trends?
- Can I find anything else out from the sick days?
- Why does adding the Bella/L/L stats together result in a spike for very low stats? (aphyer seems to have figured this one out.)
- How does the voting system of the Fae council work?
- What is up with Amy’s ratings?
- What is up with Ziqual’s off-by-one ratings?
- Is the “noise” in Linestra’s ratings actually related to the other stats?
- If I was designing the puzzle, I would try to have one of the possible choices be someone the council would be unlikely to select, but who is actually the best. Looking at aphyer’s comments, the reversal on the “physical” stats might be the setup for the optimal answer.

Forged Invariant May 21, 2025, 7:02 AM
1 point
0
in reply to: Forged Invariant’s comment on: D&D.Sci: The Choosing Ones
A summary of some interesting results. I am leaving how I found some of this out for now, for brevity’s sake.
I have managed to extract 6 integer variables that range from 1-10.
3 of them are from the components of (Colleen, Linestra, Liboulen, Bella), the other 3 are from (Fizz, Ister, Ziqual).
Each of them has a very similar histogram, sort of like a truncated normal distribution. A linear regression of them with Holly gives approximately 1 as their coefficient, except for 1 variable (which I am calling X2 for now) which has a coefficient of roughly −1.
Each of these underlying variables has a correlation with the candidate succeeding whose magnitude is between 0.11 and 0.17, with X2 being the only negative correlation.
When looking at the Fae council, I noticed:
When Linestra is gone, Colleen predicts exactly 50 each and every time. This suggests that she is plagiarizing Linestra. She only ever gives a rating of 50 when Colleen is missing or predicting 48.3.
When Colleen is gone, the correlation between Linestra’s rating and the candidate being chosen is almost halved. This is consistent with the council using a voting process.

Forged Invariant May 20, 2025, 12:13 AM
1 point
0
on: D&D.Sci: The Choosing Ones
A few miscellaneous observations:
Ister, Ziqual and Fizz seem to have some pretty deterministic structure connecting them.
Ister always predicts an integer between 51 and 60 inclusive.
Ziqual’s prediction is equal to (Ister − 50) * (integer from 1 to 10) − (one of 0, 1). Multipliers in the 5 to 7 range are most common.
Fizz’s prediction is less than or equal to (Ister’s prediction + 10). Fizz’s prediction is greater than or equal to 44.
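As a rough sketch of how the Ister/Ziqual pattern above could be checked row by row, here is a small helper that lists every (multiplier, offset) pair consistent with a given pair of predictions. The function name `decode_ziqual` is made up for illustration, and it assumes the pattern holds exactly as I described it.

```python
def decode_ziqual(ister: int, ziqual: int) -> list[tuple[int, int]]:
    """List every (multiplier, offset) pair satisfying
    ziqual == (ister - 50) * multiplier - offset,
    with multiplier in 1..10 and offset in {0, 1}."""
    base = ister - 50
    if not 1 <= base <= 10:
        raise ValueError("Ister is expected to be an integer from 51 to 60")
    return [
        (multiplier, offset)
        for multiplier in range(1, 11)
        for offset in (0, 1)
        if base * multiplier - offset == ziqual
    ]


# Example: Ister = 55 and Ziqual = 29 is only explained by multiplier 6, offset 1.
print(decode_ziqual(55, 29))
```

An empty list would flag a row that breaks the pattern. Note that when Ister is 51 the decoding can be ambiguous, since a multiplier of k with offset 0 gives the same value as k + 1 with offset 1.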
Separately, a scatterplot of Liboulen and Colleen’s predictions has a lot of structure: [Scatterplot removed since it seems to show up through the spoiler. Message me if you want to see it.]
Note that each of the 3 “axes” of this “prism” has 10 separate blobs of points. This makes me suspect that Liboulen and Colleen are each a weighted sum of 3 underlying integer variables that each range from 1 to 10. (There would also need to be a small noise term or other factor, since the points do not perfectly fit this pattern.) The noise term seems to only apply to Colleen’s estimates, as Liboulen’s estimates have far fewer distinct values.
Bella seems to have some interactions with these two. Linestra is an almost perfect clone of Colleen, but her estimates are either 1.7 or (occasionally) 1.9 lower.
It feels like it should be easy enough to find the coefficients corresponding to the 3 visible slopes formed by the edges of this figure. Based on some data slicing and eyeballing the graph above, I think the coefficients for L are 5, 1, 1, and the coefficients for C are approximately 3.6, −1.25, and 2.47.
I am sure that there is a linear algebra regression to find the exact values, but I haven’t figured it out yet.
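For what it is worth, here is a minimal sketch of what such a regression might look like with ordinary least squares. It runs on synthetic data, not the puzzle's dataset: the three hidden stats are generated at random and are assumed to be already known, which is the hard part in practice. The weights are just my eyeballed guesses from above, and the noise level is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hidden stats: three integers in 1..10 per candidate.
n_candidates = 500
stats = rng.integers(1, 11, size=(n_candidates, 3))

# Assumed weights, taken from the eyeballed estimates (not confirmed values).
l_weights = np.array([5.0, 1.0, 1.0])
c_weights = np.array([3.6, -1.25, 2.47])

# Synthetic predictions: L is noiseless, C gets a small assumed noise term.
liboulen = stats @ l_weights
colleen = stats @ c_weights + rng.normal(0.0, 0.05, size=n_candidates)

# Design matrix with an intercept column, then solve both regressions.
design = np.column_stack([stats, np.ones(n_candidates)])
l_fit, *_ = np.linalg.lstsq(design, liboulen, rcond=None)
c_fit, *_ = np.linalg.lstsq(design, colleen, rcond=None)

print("L coefficients:", np.round(l_fit[:3], 3))
print("C coefficients:", np.round(c_fit[:3], 3))
```

On the real data the stats are not observed directly, so they would first have to be read off the blob structure in the scatterplot before a fit like this becomes possible.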

Forged Invariant Apr 22, 2025, 1:03 AM
3 points
0
on: Crime and Punishment #1
Story of a mostly homeless guy who scammed Isaac King out of $300. Isaac sued in small claims court on principle, did all the things, and none of it mattered.
This link goes to Sarah’s tweet, not to Isaac’s story.

Forged Invariant Apr 22, 2025, 1:02 AM
4 points
0
on: Crime and Punishment #1
British Columbia is recriminalizing marijuana
This is not what the article says. It says that BC is re-criminalizing hard drugs.
I am in BC, and have not heard anything about recriminalizing marijuana. I get the sense that it being legal is generally popular. Complaints about drug users are common here, but they are usually not talking about weed.

Forged Invariant Apr 20, 2025, 4:34 AM
3 points
2
in reply to: Cleo Nardo’s comment on: strawberry calm’s Shortform
I would expect that player 2 would be able to win almost all of the time for most normal hash functions, as they could just play randomly for the first 39 turns, and then choose one of the 2^8 available moves. It is very unlikely that all of those hashes are zero. (For commonly used hashes, player 2 could just play randomly the whole game and likely win, since the hash of any value is almost never 0.)

Forged Invariant Jan 20, 2025, 2:10 AM
14 points
14
in reply to: No77e’s comment on: meemi’s Shortform
In addition to the object level reasons mentioned by plex, misleading people about the nature of a benchmark is a problem because it is dishonest. Having an agreement to keep this secret indicates that the deception was more likely intentional on OpenAI’s part.

Forged Invariant Jan 4, 2025, 5:24 AM
3 points
4
on: Preference Inversion
Based on the quote from Kirkpatrick, it looks like a clear example of preference falsification, but I do not see any reason to believe that it is internalized preference falsification. Did I miss how the submissive apes were internalizing the preference to not mate? The sentence “This is an easy to understand example of an important general fact about humans: we can be threatened into internalized preference falsification, i.e. preference inversion.” makes me think that you intended it as an example of primates internalizing a preference falsification. It feels like the quote is only evidence that primates will be deceptive.
On rereading, the claim that “This is an easy to understand example of an important general fact about humans: we can be threatened into internalized preference falsification, i.e. preference inversion” seems reasonably supported by the next paragraph about male vs female arousal in humans. Maybe I just attached the claim to the wrong evidence.

Forged Invariant Jun 9, 2023, 4:24 PM
2 points
0
in reply to: Garrett Baker’s comment on: The Base Rate Times, news through prediction markets
As an example of how Manifold reacted to a (crude) attempt at manipulation:
Dr P (a Manifold user) would create and bet yes on markets for “Will Trump be president on [some date]?” for various dates where there was no plausible way Trump would be president. Other users quickly noticed and set up limit orders to capture this source of free money. Eventually Dr P’s bets were cancelled out quickly enough that they had little to no effect on the probability, and it became hard to find one of those bets to profit from. Eventually Dr P gave up and their account became inactive. (There was some uncertainty about what would happen if Dr P misresolved the markets. Today I would expect false resolutions to be reversed. Various derivative/insurance markets were set up.)

Forged Invariant Jun 9, 2023, 4:07 PM
3 points
0
in reply to: Sune’s comment on: The Base Rate Times, news through prediction markets
One thing that I have seen on Manifold is markets that will resolve at a random time, with a distribution such that at any time, their expected duration (from the current day, conditional on not having already resolved) is 6 months. They do not seem particularly common, and are not quite equivalent to a market with a deadline exactly 6 months in the future. (I can’t seem to find the market.)
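One way to get that property is a constant daily chance of resolving, which makes the resolution day geometrically distributed and therefore memoryless. The sketch below simulates this. The 182.5-day mean, the function name, and the daily-check mechanism are all assumptions for illustration, not how any particular market was actually run.

```python
import random


def simulate_resolution_day(daily_hazard: float, rng: random.Random) -> int:
    """Return the day on which the market resolves, checking once per day."""
    day = 1
    while rng.random() >= daily_hazard:
        day += 1
    return day


MEAN_DAYS = 182.5  # assumed meaning of "6 months"
hazard = 1 / MEAN_DAYS
rng = random.Random(0)

durations = [simulate_resolution_day(hazard, rng) for _ in range(20_000)]
overall_mean = sum(durations) / len(durations)

# Condition on still being open after day 100 and look at the time left.
remaining = [d - 100 for d in durations if d > 100]
remaining_mean = sum(remaining) / len(remaining)

print(f"mean duration from day 0: {overall_mean:.1f} days")
print(f"mean remaining after day 100: {remaining_mean:.1f} days")
```

Both averages should come out close to the same number, which is the sense in which the expected remaining duration never shrinks as the market ages.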

Forged Invariant Oct 2, 2022, 8:42 PM
2 points
1
in reply to: Martin Randall’s comment on: Petrov Day Retrospective: 2022
The timing evidence is thus hostile evidence and updating on it correctly requires superintelligence.
What do you mean by this? It seems trivially false that updating on hostile evidence requires superintelligence; for example poker players will still use their opponent’s bets as evidence about their cards, even though these bets are frequently trying to mislead them in some way.
The evidence being from someone who went against the collective desire does mean that confidently taking it at face value is incorrect, but not that we can’t update on it.

Forged Invariant Sep 24, 2022, 6:21 AM
27 points
18
in reply to: Ruby’s comment on: LW Petrov Day 2022 (Monday, 9/26)
The LW staff are necessary to take down the site. If we assume that there are multiple users that are willing to press the button, then the (Shapley-attributed) blame for taking the site down mostly falls on the LW staff, rather than whoever happens to press the button first.
According to http://shapleyvalue.com/?example=8 if there were 6 people who were willing to push the button, the LW team would deserve 85% of the blame. (Here I am considering the people who take actions that act to facilitate bringing down the site as part of the coalition.)
I am not quite sure how to take into account all the people who choose not to take down the website and thus delay, and there is some value in running the Petrov day event, so the above does not take everything into account.
Tweaking some values in the website to model this, where value = 7 if LW and/or all of the other users refuse to shut down the site, and 7 − i otherwise, where i is the highest-numbered player that shuts down the site (higher meaning they shut things down sooner), I get these values:
The Shapley value of player 1 (low-karma button pusher) is: −0.023809523809524
The Shapley value of player 2 is: −0.057142857142857
The Shapley value of player 3 is: −0.10714285714286
The Shapley value of player 4 is: −0.19047619047619
The Shapley value of player 5 is: −0.35714285714286
The Shapley value of player 6 (high-karma button pusher) is: −0.85714285714286
The Shapley value of player 7 (LW team) is: −4.4071428571429
(All the values are negative, since this assigns no value to running the experiment or to keeping the site online despite running the experiment. For simplicity’s sake it measures things in site uptime, and not shutting down the site achieves that.)
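For anyone who wants to check these numbers without the website, here is a small sketch that computes the same Shapley values by brute force over all 7! orderings of the players. The function names are my own, and the value function is my reading of the setup described above: a coalition is the set of players who act to shut the site down, and the empty coalition is worth 7.

```python
from fractions import Fraction
from itertools import permutations

LW_TEAM = 7  # players 1..6 are button pushers, player 7 is the LW team


def site_value(coalition: frozenset) -> int:
    """Uptime value: 7 unless LW and at least one pusher both take part."""
    pushers = [p for p in coalition if p != LW_TEAM]
    if LW_TEAM not in coalition or not pushers:
        return 7
    return 7 - max(pushers)


def shapley_values(players, value):
    """Average each player's marginal contribution over every join order."""
    players = list(players)
    totals = {p: Fraction(0) for p in players}
    n_orders = 0
    for order in permutations(players):
        coalition = frozenset()
        previous = value(coalition)
        for player in order:
            coalition = coalition | {player}
            current = value(coalition)
            totals[player] += current - previous
            previous = current
        n_orders += 1
    return {p: total / n_orders for p, total in totals.items()}


values = shapley_values(range(1, 8), site_value)
for player, shapley in values.items():
    print(f"player {player}: {float(shapley):.6f}")
```

The values sum to −6, the gap between everyone taking part (value 1) and nobody taking part (value 7), and most of that lands on player 7.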

Forged Invariant Sep 21, 2022, 4:23 AM
12 points
4
in reply to: YimbyGeorge’s comment on: Gene drives: why the wait?
Here is an example of something that comes close from “The Selfish Gene”:
One of the best-known segregation distorters is the so-called t gene in mice. When a mouse has two t genes it either dies young or is sterile, t is therefore said to be lethal in the homozygous state. If a male mouse has only one t gene it will be a normal, healthy mouse except in one remarkable respect. If you examine such a male’s sperms you will find that up to 95 per cent of them contain the t gene, only 5 per cent the normal allele. This is obviously a gross distortion of the 50 per cent ratio that we expect. Whenever, in a wild population, a t allele happens to arise by mutation, it immediately spreads like a brushfire. How could it not, when it has such a huge unfair advantage in the meiotic lottery? It spreads so fast that, pretty soon, large numbers of individuals in the population inherit the t gene in double dose (that is, from both their parents). These individuals die or are sterile, and before long the whole local population is likely to be driven extinct. There is some evidence that wild populations of mice have, in the past, gone extinct through epidemics of t genes.
Not all segregation distorters have such destructive side-effects as t. Nevertheless, most of them have at least some adverse consequences.
From the discussion of human-engineered gene drives, they would only cause sterility in one sex, which would help avoid the gene dying off as quickly.

Forged Invariant Jun 15, 2022, 5:18 AM
5 points
0
on: Why all the fuss about recursive self-improvement?
I had not thought of self-play as a form of recursive self-improvement, but now that you point it out, it seems like a great fit. Thank you.
I had been assuming (without articulating the assumption) that any recursive self-improvement would be improving things at an architectural level, and rather complex (I had pondered improvement of modular components, but the idea was still to improve the whole model). After your example, this assumption seems obviously incorrect.
AlphaGo was improving its training environment, but not any other part of the training process.

Forged Invariant Nov 25, 2021, 4:23 AM
1 point
in reply to: Jsevillamol’s comment on: A Bayesian Aggregation Paradox
The left hand side of the example is deliberately making the mistake described in your article, as a way to build intuition on why it is a mistake.
(Adding instead of averaging in the update summaries was an unintended mistake)
Thanks for explaining how to summarize updates, it took me a bit to see why averaging works.

Forged Invariant Nov 23, 2021, 4:28 AM
6 points
on: A Bayesian Aggregation Paradox
Seeing the equations, it was hard to intuitively grasp why updates work this way. This example made things more intuitive for me:
If an event can have 3 outcomes, and we encounter strong evidence against outcomes B and C, then the update looks like this:
$\underbrace{\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}}_{\text{Prior}} \times \underbrace{\begin{pmatrix} 1 \\ 0.01 \\ 1 \end{pmatrix}}_{\text{Refute B}} \times \underbrace{\begin{pmatrix} 1 \\ 1 \\ 0.01 \end{pmatrix}}_{\text{Refute C}} \neq \underbrace{\begin{pmatrix} 1 \\ 2 \end{pmatrix}}_{\text{Pooled prior}} \times \underbrace{\begin{pmatrix} 1 \\ 1.01 \end{pmatrix}}_{\text{Refute B}} \times \underbrace{\begin{pmatrix} 1 \\ 1.01 \end{pmatrix}}_{\text{Refute C}}$
The information about what hypotheses are in the running is important, and pooling the updates can make the evidence look much weaker than it is.
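To put numbers on this, here is a short sketch that normalizes both sides of the example into posterior probabilities. The helper name `posterior` is my own. The third calculation uses averaged rather than summed pooled updates (0.505 instead of 1.01), following the correction discussed in the replies; treat it as my reading of that correction.

```python
import numpy as np


def posterior(prior, *likelihoods):
    """Multiply a prior by each likelihood vector elementwise, then normalize."""
    unnormalized = np.asarray(prior, dtype=float)
    for likelihood in likelihoods:
        unnormalized = unnormalized * np.asarray(likelihood, dtype=float)
    return unnormalized / unnormalized.sum()


# Outcomes A, B, C tracked separately.
full = posterior([1, 1, 1], [1, 0.01, 1], [1, 1, 0.01])

# B and C pooled into "not A", with the pooled updates summed as in the example.
pooled_sum = posterior([1, 2], [1, 1.01], [1, 1.01])

# Same pooling, but with the updates averaged instead of summed.
pooled_mean = posterior([1, 2], [1, 0.505], [1, 0.505])

print(f"P(A), outcomes kept separate: {full[0]:.3f}")
print(f"P(A), pooled with sums:       {pooled_sum[0]:.3f}")
print(f"P(A), pooled with averages:   {pooled_mean[0]:.3f}")
```

Keeping the outcomes separate gives P(A) of about 0.98, while both pooled versions come out far lower (about 0.33 and 0.66), even though the evidence is the same.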

Forged Invariant Oct 22, 2021, 7:22 AM
13 points
in reply to: Vanessa Kosoy’s comment on: Petrov Day Retrospective: 2021
I found the postmortem over-focuses on what went wrong or was sub-optimal. I would like to point out that I found the event fun, despite being a lurker with no code.

Forged Invariant Oct 22, 2021, 7:20 AM
4 points
on: Petrov Day Retrospective: 2021
There were some reports of people seeing a frozen countdown on the button, that disappeared when the page was refreshed. Was this an intentional false alarm? I had assumed that was the case, as a false alarm with some evidence that it was false echoes some parts of Petrov’s situation nicely.

Forged Invariant Sep 26, 2021, 9:44 PM
1 point
in reply to: Peter Wildeford’s comment on: Petrov Day 2021: Mutually Assured Destruction?
Just be aware that other users have already noticed messages which could be deliberate false alarms: https://www.lesswrong.com/posts/EW8yZYcu3Kff2qShS/petrov-day-2021-mutually-assured-destruction?commentId=JbsutYRotfPDLNskK