Worth noting that the reason SquirrelInHell is dead is that they committed suicide after becoming mentally unstable, likely in part due to experimentation with exotic self-modification techniques. This one in particular seems fine AFAICT, but, ya know, caveat utilitor.
Recent Progress in the Theory of Neural Networks
Tao, Kontsevich & others on HLAI in Math
Noting that the author deleted a critical comment which was somewhat rude but IMO made some reasonable points. That’s fair enough, but in conjunction with the way the site handles deletions this strikes me as bad, since there’s no way of (a) seeing which user posted the deleted comment (this might be a bug?) or (b) examining the text of deleted comments. Together this means that you can’t distinguish cases where a post attracts no criticism (an important signal) from cases where there were critical comments that were deleted, and you can’t examine deleted criticisms.
In the same vein, I humbly suggest “The entire bee movie but every time they say bee it gets faster” as a good model for what the singularity will seem like from our perspective.
Alignment Might Never Be Solved, By Humans or AI
It’s neither a hoax nor an HLAI, but a predictable consequence of prompting an LLM with questions about its sentience: it will imitate the answers a human might give when prompted, or the sort of answers an AI in a science fiction story would give.
[Question] What’s the Relationship Between “Human Values” and the Brain’s Reward System?
As John conjectured, alt-complexity is a well-known notion in algorithmic information theory, and differs from K-complexity by at most a constant. See section 4.5 of this book for a proof. So I think the stuff about how physics favors alt-complexity is a bit overstated—or at least, can only contribute a bounded amount of evidence for alt-complexity.
ETA: this result is about the complexity of finite strings, not semimeasures on potentially infinite sequences; for such semimeasures, there actually is a non-constant gap between the log-total-probability and description complexity.
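For concreteness, here’s a rough rendering of the finite-string version of that result (the coding theorem); this is my paraphrase rather than the book’s exact statement, with m denoting the universal a priori probability of a string under a universal prefix machine U:

```latex
% Coding theorem (finite strings), stated informally.
% m(x) is the total probability that a randomly chosen program outputs x:
m(x) \;=\; \sum_{p \,:\, U(p) = x} 2^{-|p|},
\qquad
K(x) \;=\; -\log_2 m(x) \;+\; O(1).
% So the "alt-complexity" -log_2 m(x) and the prefix complexity K(x)
% differ by at most an additive constant, which is why physics can only
% supply a bounded amount of evidence for alt-complexity.
```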
Interpretability seems like our clear best bet for developing a more principled understanding of how deep learning works
If our goal is developing a principled understanding of deep learning, directly trying to do that is likely to be more effective than doing interpretability in the hope that we will develop a principled understanding as a side effect. For this reason I think most alignment researchers have too little awareness of various attempts in academia to develop “grand theories” of deep learning such as the neural tangent kernel. I think the ideal use for interpretability in this quest is as a way of investigating how the existing theories break down—e.g. if we can explain 80% of a given model’s behavior with the NTK, what are the causes of the remaining 20%? I think of interpretability as basically collecting many interesting data points; this type of collection is essential, but it can be much more effective when it’s guided by a provisional theory which tells you which points are expected and which are interesting anomalies calling for a revision of the theory, which in turn guides further exploration, and so on.
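To make the “explain 80% of behavior with the NTK” framing concrete, here’s a minimal sketch of how one might compare a network against its empirical-NTK approximation. All names here (`model_fn`, the toy data, the regularizer) are my own placeholder assumptions, not anything from the comment or the NTK literature; it’s an illustration of the workflow, not a definitive implementation.

```python
# Sketch: empirical NTK of a tiny MLP, used as a provisional "theory" of the
# model via kernel regression. Comparing its predictions against the actually
# trained network's outputs shows how much behavior the linearized theory
# captures, and where the anomalies are.
import jax
import jax.numpy as jnp

def model_fn(params, x):
    # hypothetical two-layer MLP with scalar output
    h = jnp.tanh(x @ params["w1"] + params["b1"])
    return (h @ params["w2"] + params["b2"]).squeeze(-1)

def empirical_ntk(params, x1, x2):
    # NTK(x, x') = <grad_theta f(x), grad_theta f(x')>, evaluated at current params
    def flat_grad(x):
        g = jax.grad(lambda p: model_fn(p, x[None, :]).squeeze())(params)
        return jnp.concatenate([v.ravel() for v in jax.tree_util.tree_leaves(g)])
    g1 = jax.vmap(flat_grad)(x1)   # (n1, num_params)
    g2 = jax.vmap(flat_grad)(x2)   # (n2, num_params)
    return g1 @ g2.T               # (n1, n2)

key = jax.random.PRNGKey(0)
k1, k2, k3, k4 = jax.random.split(key, 4)
params = {
    "w1": jax.random.normal(k1, (2, 32)) / jnp.sqrt(2.0),
    "b1": jnp.zeros(32),
    "w2": jax.random.normal(k2, (32, 1)) / jnp.sqrt(32.0),
    "b2": jnp.zeros(1),
}
x_train = jax.random.normal(k3, (20, 2))
y_train = jnp.sin(x_train[:, 0])          # toy targets
x_test = jax.random.normal(k4, (5, 2))

# Kernel-regression prediction under the fixed-kernel (NTK) approximation.
K_tt = empirical_ntk(params, x_train, x_train)
K_st = empirical_ntk(params, x_test, x_train)
ntk_pred = K_st @ jnp.linalg.solve(K_tt + 1e-4 * jnp.eye(20), y_train)

# The interesting step is then comparing `ntk_pred` against the outputs of the
# network after actually training it on (x_train, y_train): the residual is
# the "remaining 20%" that the fixed-kernel theory doesn't explain.
```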
NTK/GP Models of Neural Nets Can’t Learn Features
Interesting that you think the risk of us “going crazy” after getting AI in some way is roughly comparable to overall AI takeover risk. I’d be interested to hear more if you have more detailed thoughts here. On this view it also seems like it could be a great x-risk-reduction opportunity if there are any tractable strategies, given how neglected it is compared to takeover risk.
I think it would also be valuable to have someone translate “in the other direction” and take (for example) Paul Christiano’s writings and produce vivid, concrete parable-like stories based on them. I think such stories can be useful not just as persuasive tools but also epistemically, as a way of grounding the meaning of abstract definitions of the sort Paul likes to argue in terms of.
In rationalist circles, you might find out that you’re being instrumentally or epistemically irrational in the course of a debate—the norms of such a debate encourage you to rebut your opponent’s points if you think they are being unfair. In contrast, the central thesis of this book is that white people disputing their racism is a mechanism for protecting white supremacy and needs to be unlearned, along with other cornerstones of collective epistemology such as the notion of objective knowledge. So under the epistemic conditions promoted by this book, I expect “found out about being racist” to roughly translate to “was told you were racist”.
ZMD: Looking at “Silicon Valley’s Safe Space”, I don’t think it was a good article. Specifically, you wrote,
In one post, [Alexander] aligned himself with Charles Murray, who proposed a link between race and I.Q. in “The Bell Curve.” In another, he pointed out that Mr. Murray believes Black people “are genetically less intelligent than white people.”
End quote. So, the problem with this is that the specific post in which Alexander aligned himself with Murray was not talking about race. It was specifically talking about whether specific programs to alleviate poverty will actually work or not.
So on the one hand, this particular paragraph does seem like it’s misleadingly implying Scott was endorsing views on race/IQ similar to Murray’s even though, based on the quoted passages alone, there is little reason to think that. On the other hand, it’s totally true that Scott was running a strategy of bringing up or “arguing” with hereditarians with the goal of broadly promoting those views in the rationalist community, without directly being seen to endorse them. So I think it’s actually pretty legitimate for Metz to bring up incidents like this or the Xenosystems link in the blogroll. Scott was basically using a strategy of communicating his views in a plausibly deniable way by saying many little things which are more likely if he was a secret hereditarian, but any individual instance of which is not so damning. So I feel it’s total BS to then complain about how tenuous the individual instances Metz brought up are—he’s using them as examples of a larger trend, which is inevitable given the strategy Scott was using.
(This is not to say that I think Scott should be “canceled” for these views or whatever, not at all, but at this stage the threat of cancelation seems to have passed and we can at least be honest about what actually happened)
Not sure how much I believe this myself, but Jacob Cannell has an interesting take that social status isn’t a “base drive” either, but is basically a proxy for “empowerment”, influence over future states of the world. If that’s true it’s perhaps not so surprising that we’re still well-aligned, since “empowerment” is in some sense always being selected for by reality.
I always liked the interpretation of the determinant as measuring the expansion/contraction of n-dimensional volumes induced by a linear map, with the sign being negative if the orientation of space is flipped. This makes various properties intuitively clear such as non-zero determinant being equivalent to invertibility.
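As a small worked instance of that picture (my own example, not from the original comment): for a 2×2 matrix, the determinant is exactly the signed area scaling of the unit square, and the familiar properties fall out of that.

```latex
A = \begin{pmatrix} 3 & 1 \\ 0 & 2 \end{pmatrix}, \qquad
\det A = 3 \cdot 2 - 1 \cdot 0 = 6,
% so A maps the unit square to a parallelogram of area 6, with orientation
% preserved (positive sign). Multiplicativity is then just composition of
% volume scalings,
\det(AB) = \det(A)\,\det(B),
% and \det A = 0 means A collapses some volume to zero, i.e. A is not invertible.
```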
In the analogy, it’s only possible to build a calculator that outputs the right answer on non-13 numbers because you already understand the true nature of addition. It might be more difficult if you were confused about addition, and were trying to come up with a general theory by extrapolating from known cases—then, thinking 6 + 7 = 15 could easily send you down the wrong path. In the real world, we’re similarly confused about human preferences, mind architecture, the nature of politics, etc., but some of the information we might want to use to build a general theory is taboo. I think that some of these questions are directly relevant to AI—e.g. the nature of human preferences is relevant to building an AI to satisfy those preferences, the nature of politics could be relevant to reasoning about what the lead-up to AGI will look like, etc.
Seems to me that the ‘helpful’ works you listed contain falsehoods and wrong associations. They also contain useful information and enjoyable aspects, true—but couldn’t the same be said of lots of non-”rational” fiction? As it stands this just looks like a list of fiction that’s popular among our subculture.
At this point timelines look short enough that you likely increase your personal odds of survival more by increasing the chance that AI goes well than by speeding up timelines. Also, I don’t see why you think cryonics doesn’t make sense as an alternative option.