In case it hasn’t crossed your mind, I personally think it’s helpful to start in the setting of estimating the true mean $\mu$ of data $x_1, \dots, x_n$. A very natural choice of estimator for $\mu$ is the sample mean of the $x_i$, which I’ll denote $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$. This can equivalently be formulated as the minimizer over $c$ of $\sum_{i=1}^n (x_i - c)^2$.
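A quick numeric sanity check of that equivalence (just a sketch; the toy data and the grid search are arbitrary illustrations, not part of the argument):

```python
import statistics

# Toy data; any finite sample works.
x = [2.0, 3.5, 1.0, 4.0, 2.5]

def sse(c, xs):
    """Sum of squared deviations: sum_i (x_i - c)^2."""
    return sum((xi - c) ** 2 for xi in xs)

# Scan a fine grid of candidate values c and take the minimizer.
grid = [i / 1000 for i in range(0, 5001)]  # c in [0, 5]
c_star = min(grid, key=lambda c: sse(c, x))

# The minimizer coincides with the sample mean (up to grid resolution).
print(c_star, statistics.mean(x))
```

(Of course, setting the derivative $-2\sum_i (x_i - c)$ to zero gives $c = \bar{x}$ directly; the grid search is just a check.)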
Others have mentioned the normal distribution, but this feels secondary to me. Here’s why—let’s say $x_i = \mu + \sigma \varepsilon_i$, where $\varepsilon_i \sim F$ for a known continuous probability distribution $F$ with mean 0 and variance 1, and $\mu, \sigma$ are unknown. So the distribution of each $x_i$ has mean $\mu$ and variance $\sigma^2$ (and assume the $x_i$ are independent).
What must $F$ be for the sample mean to be the maximum likelihood estimator of $\mu$? Gauss proved that it must be the standard normal distribution, and intuitively it’s not hard to see why its density would have to be of the form $f(\varepsilon) \propto e^{-k\varepsilon^2}$.
So from this perspective, MSE is a generalization of taking the sample mean, and asking the linear model to have Gaussian errors is exactly what’s needed to formally justify MSE through maximum likelihood.
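To see the Gaussian/MLE direction concretely, here’s a sketch checking that the Gaussian log-likelihood in $\mu$ (with $\sigma$ held fixed) peaks at the sample mean; the data and the value of `sigma` are arbitrary illustrations:

```python
import math
import statistics

x = [2.0, 3.5, 1.0, 4.0, 2.5]
sigma = 1.3  # any fixed scale; treated as known here

def gauss_loglik(mu, xs, s):
    """Log-likelihood of i.i.d. N(mu, s^2) data."""
    return sum(-0.5 * math.log(2 * math.pi * s * s)
               - (xi - mu) ** 2 / (2 * s * s) for xi in xs)

xbar = statistics.mean(x)

# The log-likelihood at the sample mean beats any other candidate mu.
for mu in [xbar - 1.0, xbar - 0.1, xbar + 0.1, xbar + 1.0]:
    assert gauss_loglik(xbar, x, sigma) > gauss_loglik(mu, x, sigma)
```

This works for any fixed $\sigma$, since maximizing the Gaussian log-likelihood in $\mu$ is the same as minimizing $\sum_i (x_i - \mu)^2$.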
Replace the sample mean with the sample median and you get mean absolute error.
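The same grid-search sketch verifies that claim numerically (again, toy data and grid are arbitrary illustrations):

```python
import statistics

x = [2.0, 3.5, 1.0, 4.0, 2.5]

def sad(c, xs):
    """Sum of absolute deviations: sum_i |x_i - c|."""
    return sum(abs(xi - c) for xi in xs)

grid = [i / 1000 for i in range(0, 5001)]  # c in [0, 5]
c_star = min(grid, key=lambda c: sad(c, x))

# The minimizer coincides with the sample median.
print(c_star, statistics.median(x))
```

(With an odd number of points the minimizer is exactly the median; with an even number, any point between the two middle values minimizes the sum.)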
Is there no way to salvage it via a Nash bargaining argument if the odds are different? Or at least, deal with scenarios where you have x:1 and 0:1 odds (i.e. you can only bet on heads)?