StellaAthena

Karma: 599

StellaAthena 20 Aug 2015 8:49 UTC
15 points
on: 0 And 1 Are Not Probabilities
This article is largely incoherent. The main justification rests on an invalid transformation: y=x/(1-x) is not the bijection that he asserts it is, because it’s not a function that maps [0,1] onto R. It’s a function that maps [0,1] onto [0,\infty] as a subset of the topological closure of R. And that’s okay, but you can’t say “well I don’t like the topological closure of R, so I’ll just use R and claim that 1 is where the problem is.”
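A minimal Python sketch of the transformation under discussion, to make the endpoint behaviour concrete. The function names `odds` and `log_odds` are illustrative choices, and returning `math.inf` at p = 1 is an assumption that models the extended reals rather than R itself.

```python
import math

def odds(p: float) -> float:
    """Map a probability p in [0, 1] to odds p / (1 - p) in [0, +inf]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 1.0:
        # p / (1 - p) is undefined in R at p = 1; the extended reals assign +inf.
        return math.inf
    return p / (1.0 - p)

def log_odds(p: float) -> float:
    """Log of the odds, taking values in [-inf, +inf]."""
    o = odds(p)
    if o == 0.0:
        # log(0) is undefined in R; the extended reals assign -inf.
        return -math.inf
    return math.log(o)
```

Under these assumptions the endpoints 0 and 1 land on 0 and +inf for odds, and on -inf and +inf for log odds, so they are only excluded if the codomain is restricted to R.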

Additionally, his discussion of log odds and such is perfectly fine, but ignores the fact that there are places where you do need to have an odds of 0:1, or a log odds of negative infinity. Probability theory stops working when you throw out 0 and 1; it’s as simple as that.

Even if you don’t want to handle tautologies or contradictions, there are other ways to get P(X)=0 or 1. The probability that a real number chosen uniformly from the real interval [0,1] equals 0 is 0. It has to be. It’s a provable fact under ZFC, and to decide otherwise is to say that you’re more attached to the idea of 0 and 1 not being probabilities than you are to the fact that mathematics is consistent. If you really believe that, well, there’s absolutely nothing I have to say to you.

This is one of those situations where EY just demonstrates he knows very little mathematics.

StellaAthena 20 Aug 2015 20:14 UTC
1 point
in reply to: Regex’s comment on: 0 And 1 Are Not Probabilities
Formally, probability is defined via areas. The basic idea is that the probability of picking an element of a set A out of a set B is the ratio of the area of A to the area of B, where “area” can be defined not only for things like squares but also things like lines, or actually almost every* subset of R. So, let’s say you want to randomly select a real number from the interval [0,1] and want to know the odds it falls in a set, S. The area of [0,1] is 1, so the answer is just the area of S.

If S={0}, then S has area zero. If S=[0,1), then S has area 1. Not only are both of these theoretical possibilities, they are practical ones too. There are real world examples of probability zero events (the only one that comes to mind involves QM though so I don’t want to bother with the details).
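In standard terminology the “area” here is Lebesgue measure. Writing it as λ (a symbol assumed for this sketch, not used in the comment) and letting X be the uniformly chosen number, the two cases work out as:

```latex
P(X \in S) = \frac{\lambda(S)}{\lambda([0,1])} = \lambda(S),
\qquad
P(X \in \{0\}) = \lambda(\{0\}) = 0,
\qquad
P(X \in [0,1)) = \lambda([0,1)) = 1.
```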

Now, notice that this isn’t the same thing as “impossible”. Instead, it means more like “it won’t happen I promise even by the time the universe ends”. The way I tend to think about probability zero events is that they are so unlikely they are beyond the reach of the principle that as the number of trials increases, events become expected. For any nonzero probability, there is a number of trials, n, such that once you do it n times the expected number of occurrences becomes greater than 1. That’s not the case with probability zero events. Probability 1 events can then be thought of as the negation of probability 0 events.
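The trial-count claim can be checked with a short Python sketch. It assumes independent trials, so the expected number of occurrences after n trials is n * p; the function name `trials_until_expected` is hypothetical, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction
from math import floor
from typing import Optional

def trials_until_expected(p: Fraction) -> Optional[int]:
    """Smallest n with n * p > 1, or None when p == 0 (no such n exists)."""
    if p < 0 or p > 1:
        raise ValueError("p must lie in [0, 1]")
    if p == 0:
        # n * 0 == 0 for every n, so the expected count never exceeds 1.
        return None
    # n * p > 1  <=>  n > 1 / p, so take the first integer strictly above 1 / p.
    return floor(1 / p) + 1
```

For example, `trials_until_expected(Fraction(1, 4))` gives 5, since 4 trials yield an expected count of exactly 1 and 5 trials yield 5/4.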

*not actually “almost every” in a formal sense, but “almost any” in a “unless you go try to build a set that you can’t measure it probably has a well defined area” sense

StellaAthena 20 Aug 2015 20:33 UTC
−1 points
in reply to: David_Bolin’s comment on: 0 And 1 Are Not Probabilities
You can phrase statements of logical deduction such that they have no premises and only conclusions. If we let S be the set of logical principles under which our logical system operates and T be some sentence that entails Y, then S AND T implies Y is something that I have absolute certainty in, even if this world is an illusion, because the premise of the implication contains all the rules necessary to derive the result.

A less formal example of this would be the sentence: If the rules of logic as I know them hold and the axioms of mathematics are true, then it is the case that 2+2=4.
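A minimal Lean 4 sketch of a statement of this shape: everything needed for the conclusion sits inside the antecedent, so the implication itself is provable with no outside hypotheses. The names `P`, `Q`, and `premises_inside` are illustrative assumptions; here `P → Q` plays the role of the rule and `P` the role of the sentence it is applied to.

```lean
-- The whole implication is a theorem: no hypotheses appear outside it.
theorem premises_inside (P Q : Prop) : (P ∧ (P → Q)) → Q :=
  fun h => h.2 h.1
```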

StellaAthena 21 Aug 2015 21:37 UTC
2 points
in reply to: Regex’s comment on: 0 And 1 Are Not Probabilities
Impossible things also have a probability of zero. I totally understand that this seems a bit unintuitive, and the underlying structure (which includes things like infinities of different sizes) is generally pretty unintuitive at first. Which is kinda just saying “sorry, I can’t explain the intuition,” which is unfortunately true.

StellaAthena 21 Nov 2020 14:44 UTC
4 points
in reply to: Ericf’s comment on: The Pointers Problem: Human Values Are A Function Of Humans’ Latent Variables
1. Derive values and weights from that. For example, if I donate $100 to Clean Water for Africa, that implies that I care about Clean Water & Africa more than I care about AIDS and Pakistan, and the level there depends on how much $100 means to me. If that’s ten (or even two) hours of work to earn it that’s a different level of commitment than if it represents 17 minutes of owning millions in assets.
This will very quickly lead to incorrect conclusions, because people don’t act according to their values (especially for things that don’t impact their day to day lives like international charity). The fact that you donated $100 to Clean Water for Africa does not mean that you value that more than AIDS in Pakistan. You personally may very well care about clean water and/or Africa more than AIDS and/or Pakistan, but if you apply this sort of analysis writ large you will get egregiously wrong answers. Scott Alexander’s “Too Much Dark Money in Almonds” describes one facet of this rather well.

Another facet is that how goods are bundled matters. Did I spend $15 on almonds because I value a) almonds b) nuts c) food d) sources of protein e) snacks I can easily eat while I drive f) snacks I can put out at parties… etc. And more importantly, which of those things do I care about more than I care about Trump losing the election?

Elizabeth Anscombe’s book Intention does a good job analyzing this. When we take actions, we are not taking those actions based on the state of the world; we are taking them based on the state of the world under a particular description. One great example she gives is walking into a room and kissing a woman. Did you intend to a) kiss your girlfriend b) kiss the tallest woman in the room c) kiss the woman closest to the door wearing pink d) kiss the person who got the 13th highest mark on her history exam last week e) …

The answer is (typically) a. You intended to kiss your girlfriend. However, to an outside observer who doesn’t already have a good model of humanity at large, if not a model of you in particular, it’s unclear how they’re supposed to tell that. Most people who donate to Clean Water for Africa don’t intend to be choosing that over AIDS in Pakistan. Their actions are consistent with having that intention, but you can’t derive intentionality from brute actions.

StellaAthena 21 Nov 2020 14:47 UTC
LW: 8 AF: 2
AF
in reply to: abramdemski’s comment on: The Pointers Problem: Human Values Are A Function Of Humans’ Latent Variables
I don’t understand what the purported ontological crisis is. If ghosts exist, then I want them to be happy. That doesn’t require a dogmatic belief that there are ghosts at all. In fact, it can even be true when I believe ghosts don’t exist!

StellaAthena 20 Jun 2021 12:55 UTC
2 points
in reply to: Ericf’s comment on: The Pointers Problem: Human Values Are A Function Of Humans’ Latent Variables
This analysis falls apart when we take things to their logical extreme: I care about the happiness of humans who are time-like separated from me.

StellaAthena 6 Sep 2021 17:08 UTC
3 points
AF
in reply to: Ofer’s comment on: Obstacles to gradient hacking

Due to the redundancy, changing any single weight—that is associated with one of those two pieces of logic—does not change the output.

You seem to be under the impression that the goal is to make the NN robust to single-weight perturbation. But gradient descent doesn’t modify a neural network one weight at a time, and so being robust to single-weight modification doesn’t come with any real guarantees. The backward pass could result in weights of both forks being updated.
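A toy Python sketch of the last point. This is not the construction from the parent comment: it assumes a hypothetical model with two redundant forks whose contributions are summed, trained on squared error, and all names and numbers are illustrative. A single gradient step moves the weights of both forks at once.

```python
def forward(w1: float, w2: float, x: float) -> float:
    # Two redundant "forks" whose contributions are summed (toy assumption).
    return w1 * x + w2 * x

def sgd_step(w1: float, w2: float, x: float, target: float, lr: float):
    """One gradient-descent step on squared error; returns updated (w1, w2)."""
    err = forward(w1, w2, x) - target
    # d/dw of (forward - target)^2 is 2 * err * x for each fork.
    grad_w1 = 2.0 * err * x
    grad_w2 = 2.0 * err * x
    # Both weights are updated in the same step, not one at a time.
    return w1 - lr * grad_w1, w2 - lr * grad_w2

new_w1, new_w2 = sgd_step(0.5, 0.5, x=1.0, target=0.0, lr=0.1)
```

In this toy setting both weights move from 0.5 toward 0.3, so a guarantee about perturbing only one of them says nothing about the state after the step.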

StellaAthena 9 Sep 2021 16:54 UTC
3 points
in reply to: Ofer’s comment on: Obstacles to gradient hacking
What do you think the gradient of min(x, y) is?
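For reference, a minimal Python sketch of the partial derivatives in question. The function name `grad_min` is hypothetical, and returning `None` at the tie is a choice made for this sketch; automatic-differentiation libraries each follow their own convention there.

```python
from typing import Optional, Tuple

def grad_min(x: float, y: float) -> Optional[Tuple[float, float]]:
    """Partial derivatives (d/dx, d/dy) of min(x, y); None at the tie x == y."""
    if x < y:
        # Near this point min(x, y) == x, so only x receives gradient.
        return (1.0, 0.0)
    if y < x:
        # Near this point min(x, y) == y, so only y receives gradient.
        return (0.0, 1.0)
    # At x == y the function is not differentiable; only one-sided derivatives exist.
    return None
```

Away from the tie, only the smaller argument receives any gradient.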

StellaAthena 9 Sep 2021 22:55 UTC
10 points
in reply to: Davidmanheim’s comment on: Sam Altman Q&A Notes—Aftermath
When was it stated that the talk was off the record? You seem to be the only person in this thread (myself included) who remembers that.

StellaAthena 2 Oct 2021 18:33 UTC
6 points
in reply to: ChristianKl’s comment on: The LessWrong Team is now Lightcone Infrastructure, come work with us!
That would make a lot more sense to me as a justification for not paying more than market rate than for paying significantly below market rate.

Also, if someone is good at the job, why does it matter if they don’t believe in the mission? If they’re a grifter looking for more money, you can just fire them, right?

StellaAthena 2 Oct 2021 19:16 UTC
23 points
in reply to: Elizabeth’s comment on: The LessWrong Team is now Lightcone Infrastructure, come work with us!
This response confuses me.
1. Who is being punished here? I see people leaving feedback and discussing ideas, and have no idea who you are worried about.
2. I strongly agree with AI_WAIFU, but don’t have a useful general strategy for non-profit funding. My opposition is based on a simple heuristic: wealthy orgs should not systematically underpay their employees. Making a thread saying that seems extremely not useful.
Speaking to the general point, as AI_WAIFU points out, there is an extremely large amount of money apparently sitting around. The thread he links to implies that EA has about 5 million dollars per active member of the community and that cash is growing faster than membership. That’s an obscene amount of cash, and being stingy about pay doesn’t really make sense to me.

Others in this thread have brought up the fact that many non-profits underpay, but that’s not because there’s some kind of virtue in underpaying (quite the opposite: it’s exploitive), it’s because they’re poor. EA is apparently swimming in cash, so that comparison doesn’t make much sense here. Additionally, many non-profits compensate for underpaying with extremely generous benefits, which this post makes no mention of.

“We pay less than you’re worth because we only want people who really care about the mission” is typically a lie HR tells people, not an actual thing people believe. Reading that it’s a thing that Lightcone believes worries me, as it makes me feel like you’re drinking your own Kool-Aid too hard.

This also signals that you don’t care about your employees. Pay is the number one way orgs indicate that they care about their employees.

StellaAthena 3 Oct 2021 21:16 UTC
3 points
in reply to: Chris_Leong’s comment on: The LessWrong Team is now Lightcone Infrastructure, come work with us!
While this is true in the abstract, AI_WAIFU links to a post that describes a situation where they are struggling to find people to give all their money to. Specifically it describes
1. Having tens of millions of dollars per “core community member”
2. Having funding grow faster than “core community membership.”
If that’s an accurate description of EA as a whole and LW is finance-bound, that indicates that LW needs to secure more funding. The funding is very clearly there.

StellaAthena 3 Oct 2021 21:19 UTC
5 points
in reply to: mingyuan’s comment on: The LessWrong Team is now Lightcone Infrastructure, come work with us!

Another EA/rationalist org I’ve worked at had a policy of “We don’t want salary to be a major reason for people to want to work here, and we don’t want it to be a reason for them to not want to work here.” That makes a lot of sense to me, and I think it’s probably what Lightcone is going for?

I think that having a blanket policy of “we aim to underpay you by 30% compared to what you would get on the open market” is making pay a reason to not work there. I don’t disagree that the salaries under discussion are massive, but I would never work for a place that openly brags about underpaying me by 30% as if that’s a moral high ground.

I don’t live on the west coast and can’t speak to how far different salaries go, but the rhetoric and strategy being employed here are a major red flag to me.

StellaAthena 3 Oct 2021 21:21 UTC
2 points
in reply to: Brendan Long’s comment on: The LessWrong Team is now Lightcone Infrastructure, come work with us!
I strongly upvoted this comment and am sad that it has net negative votes. I was going to say the exact same thing.

StellaAthena 3 Oct 2021 21:22 UTC
10 points
in reply to: ChristianKl’s comment on: The LessWrong Team is now Lightcone Infrastructure, come work with us!
Why is this problem better solved by systematically underpaying everyone as opposed to firing people who act “in favor of what advances their own power” or who promote infighting?

StellaAthena 16 Oct 2021 22:56 UTC
4 points
on: NVIDIA and Microsoft releases 530B parameter transformer model, Megatron-Turing NLG
It’s interesting how Microsoft and NVIDIA are plugging EleutherAI and open source work in general. While they don’t reference EleutherAI by name, the Pile dataset used as the basis for their training data and the LM Evaluation Harness mentioned in the post are both open source efforts by EleutherAI. EleutherAI, in return, is using the Megatron-DS codebase as the core of their GPT-NeoX model architecture.

I think that this is notable because it’s the first time we’ve really seen powerful AI research orgs sharing infra like this. Typically everyone wants to do everything bespoke and do all their work on their own. This is good for branding but obviously a lot more work.

I wonder if MSFT and NVIDIA tried to make a better dataset than the Pile on their own and failed.

StellaAthena 11 Nov 2021 19:18 UTC
−2 points
in reply to: adamShimi’s comment on: Discussion with Eliezer Yudkowsky on AGI interventions
Strong upvote.

My original exposure to LW drove me away in large part because of the issues you describe. I would also add that (at least circa 2010) you needed to have a near-deistic belief in the anti-messianic emergence of some AGI so powerful that it can barely be described in terms of human notions of “intelligence.”

StellaAthena 11 Nov 2021 19:21 UTC
LW: 10 AF: 3
AF
in reply to: Ben Pace’s comment on: Discussion with Eliezer Yudkowsky on AGI interventions
If superintelligence is approximately multimodal GPT-17 plus reinforcement learning, then understanding how GPT-3-scale algorithms function is exceptionally important to understanding superintelligence.

Also, if superintelligence doesn’t happen then prosaic alignment is the only kind of alignment.

StellaAthena 11 Nov 2021 20:17 UTC
1 point
in reply to: Rob Bensinger’s comment on: Discussion with Eliezer Yudkowsky on AGI interventions
My thinking is that prosaic alignment can also apply to non-superintelligent systems. If multimodal GPT-17 + RL = superintelligence, then whatever techniques are involved with aligning that system would probably apply to multimodal GPT-3 + RL, despite it not being superintelligence. Superintelligence is not a prerequisite for being alignable.