Drew the shoggoth and named notkilleveryoneism.
Tetraspace
I’d like to report a bug. My comments aren’t larger than worlds, which is a pity, because the kind of content I produce is clearly the most insightful and intelligent of all. I’m also humble to boot—more humble than you could ever believe—which is one of the rationalist virtues that any non-tribal fellow would espouse.
Ahead of time, you can’t really tell precisely what problems you’ll be faced with—reality is allowed to throw pretty much anything at you. It’s a useful property, then, if it’s possible to make decisions in all situations, so you can guarantee that e.g. new physics won’t throw you into undefined behavior.
The argument at the start just seems to move the anthropics problem one step back—how do we know whether we “survived”* the Cold War?
*Not sure how to succinctly state this better; I mean if Omega told me that the True Probability of surviving the Cold War was 1%, I would update on the safety of the Cold War in a different direction than if it told me 99%, even though both entail me, personally, surviving the Cold War.
Why does this line of reasoning not apply to friendly AIs?
Why would the unfriendly AI halt? Is there really no better way for it to achieve its goals?
Very nice! I like the colour-coding scheme, and the way it ties together those bullet points in MIRI’s research agenda.
Looks like these sequences are going to be a great (content-wise and aesthetically) introduction to a lot of the ideas behind agent foundations; I’m excited.
Do you have any recommended reading for learning enough math to do these exercises? I’m sort of using these as a textbook-list-by-proxy (e.g. google “Intermediate value theorem”, check which area of math it’s from, oh hey it’s Analysis, get an introductory textbook in Analysis, repeat), though I also have little knowledge of the field and don’t want to wander down suboptimal paths.
You cannot program a general intelligence with a fundamental drive to ‘not intervene in human affairs except when things are about to go drastically wrong otherwise, where drastically wrong is defined as either rape, torture, involuntary death, extreme debility, poverty or existential threats’ because that is not an optimization function.
In the extreme limit, you could create a horribly gerrymandered utility function where you assign 0 utility to universes where those bad things are happening, 1 utility to universes where they aren’t, and some reduced impact thing which means that it usually prefers to do nothing.
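As a toy sketch of that construction (all names and numbers here are hypothetical; it’s just meant to show the shape of the idea, not a real proposal):

```python
# Hypothetical sketch of the gerrymandered utility function described above.

BAD_THINGS = ["rape", "torture", "involuntary_death",
              "extreme_debility", "poverty", "existential_threat"]

def base_utility(world):
    """world: dict mapping bad-thing names to booleans.
    1 for universes where none of the listed bad things are happening, 0 otherwise."""
    return 0.0 if any(world.get(bad, False) for bad in BAD_THINGS) else 1.0

def impact_penalty(action, weight=0.1):
    """Crude stand-in for a reduced-impact term: anything other than doing
    nothing costs a little, so the agent defaults to inaction."""
    return 0.0 if action == "do_nothing" else weight

def utility(world_after_action, action):
    return base_utility(world_after_action) - impact_penalty(action)
```

With these numbers the agent only bothers to act when acting flips the base utility from 0 to 1 (a gain of 1 beats the 0.1 penalty); in every other situation doing nothing scores highest.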
While “rationality” claims to be defined as “stuff that helps you win”, and while on paper, if it turned out that the Sequences didn’t help you arrive at correct conclusions, we’d stop calling that “rationality” and call something else “rationality”, in practice the word “rationality” points at “the stuff in the Sequences” rather than at “stuff that helps you win”. People whose stuff-that-helps-you-win isn’t the type of thing you’d find in the Sequences have to call it something else to be unambiguous. Such is language.
Cutting-edge modern military AI seems to all be recently developed; the first flight of the F-22 Raptor was in 1997, while the first deployment of Sea Hunter was in 2016. I also think there are strong incentives for civilian organisations to develop AI that aren’t present for fighter jets.
Is there any way to mark a post as unread? It’s recommending me a lot of sequences that it believes I’m halfway through when in fact I’ve just briefly checked a couple of posts in them, and it would be nice if I could start them again from the beginning.
Submission. Counterfactual oracle. Give the oracle the set of questions on Metaculus that have a resolve date before some future date T, and receive output in the form of ordered pairs of question IDs and predictions. The score of the Oracle in the case where we don’t see its answers is the number of Metaculus points that it would have earned by T if it had made a prediction on those questions at the time when we asked it.
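A rough sketch of how that counterfactual score might be computed (the actual Metaculus points formula isn’t reproduced here; placeholder_points below is just a stand-in log score, and all names are my own):

```python
import math

def placeholder_points(prediction, outcome):
    """Stand-in scoring rule: log score relative to a 50% baseline.
    Assumes 0 < prediction < 1; the real Metaculus formula is different."""
    p = prediction if outcome else 1.0 - prediction
    return math.log2(p) - math.log2(0.5)

def oracle_score(oracle_answers, resolutions, horizon_T):
    """oracle_answers: {question_id: predicted probability}, as output by the oracle.
    resolutions: {question_id: (resolve_time, outcome)} for questions that resolved.
    Returns the points the predictions would have earned by time T. Only computed,
    and fed back as reward, in the branch where we never read the oracle's answers."""
    total = 0.0
    for qid, prediction in oracle_answers.items():
        if qid in resolutions:
            resolve_time, outcome = resolutions[qid]
            if resolve_time <= horizon_T:
                total += placeholder_points(prediction, outcome)
    return total
```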
There’s no official, endorsed CFAR handbook that’s publicly available for download. The CFAR handbook from summer 2016, which I found on libgen, warns:
While you may be tempted to read ahead, be forewarned—we’ve often found that participants have a harder time grasping a given technique if they’ve already anchored themselves on an incomplete understanding. Many of the explanations here are intentionally approximate or incomplete, because we believe this content is best transmitted in person. It helps to think of this handbook as a companion to the workshop, rather than as a standalone resource.
which I think is still their view on the matter.
I have heard that they would be more comfortable with people learning rationality techniques in-person from a friend, so if you know any CFAR alumni you could ask them (they’d probably also have a better answer to your question).
I’m off from university (3rd year physics undergrad) for the summer and hence have a lot of free time, and I want to use this to make as much progress as possible towards the goal of getting a job in AI safety technical research. I have found that I don’t really know how to do this.
Some things that I can do:
work through undergrad-level maths and CS textbooks
basic programming (since I do physics, this is at the level required to implement simple numerical methods in MATLAB)
the stuff in Andrew Ng’s machine learning Coursera course
Thus far I’ve worked through the first half of Hutton’s Programming in Haskell on the grounds that functional programming maybe teaches a style of thought that’s useful and opens doors to more theoretical CS stuff.
I’m optimising for something slightly different from purely becoming good at AI safety, in that at the end I’d like to have some legible things to point to or list on a CV or something (or become better-placed to later acquire such legible things).
I’d be interested to hear from people who know more about what would be helpful for this.
In the case of MNIST, how good is the judge itself—for example, if you were to pick the six pixels optimally to give it the most information, how well would it perform?
The simplicity prior says that you should assign a prior probability of 2^-L to a description of length L. This sort of makes intuitive sense, since it’s what you’d get if you generated the description through a series of coinflips...
… except there are 2^L descriptions of length L, so the total prior probability you’re assigning is sum(2^L * 2^-L) = sum(1) = unnormalisable.
You can kind of recover this by noticing that not all bitstrings correspond to an actual description, and for some encodings their density is low enough that it can be normalised (I think the threshold is that less than 1/L descriptions of length L are “valid”)...
...but if that’s the case, you’re being fairly information inefficient because you could compress descriptions further, and why are you judging simplicity using such a bad encoding, and why 2^-L in that case if it doesn’t really correspond to complexity properly any more? And other questions in this cluster.
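Written out, the normalisation worry is just (with $\ell(x)$ the length of bitstring $x$):

$$\sum_{x \in \{0,1\}^*} 2^{-\ell(x)} \;=\; \sum_{L=0}^{\infty} 2^{L} \cdot 2^{-L} \;=\; \sum_{L=0}^{\infty} 1 \;=\; \infty.$$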
I am confused (and maybe too hung up on something idiosyncratic to an intuitive description I heard).
It was inspired by yours—when I read your post I remembered that there was this thing about Solomonoff induction that I was still confused about—though I wasn’t directly trying to answer your question so I made it its own thread.
In Against Against Billionaire Philanthropy, Scott says
The same is true of Google search. I examined the top ten search results for each donation, with broadly similar results: mostly negative for Zuckerberg and Bezos, mostly positive for Gates.
With Gates’ philanthropy being about malaria, Zuckerberg’s being about Newark schools, and Bezos’ being about preschools.
Also, as far as I can tell, Moskovitz’ philanthropy is generally considered positively, though of course I would be in a bubble with respect to this. Also also, though I say this without really checking, it seems that people are pretty much all against the Sacklers’ donations to art galleries and museums.
Squinting at these data points, I can kind of see a trend: people favour philanthropy that’s buying utilons, and are opposed to philanthropy that’s buying status. They like billionaires funding global development more than they like billionaires funding local causes, and they like them funding art galleries for the rich least of all.
Which is basically what you’d expect if people were well-calibrated and correctly criticising those who need to be taken down a peg.
The formalisation used in the Sequences (and in algorithmic information theory) is that the complexity of a hypothesis is the length of the shortest computer program that can specify that hypothesis.
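In symbols this is the usual Kolmogorov-complexity definition (notation mine, with $U$ some fixed universal Turing machine and $\ell(p)$ the length of program $p$):

$$K(H) \;=\; \min\{\,\ell(p) \;:\; U(p) \text{ outputs } H\,\}.$$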
An illustrative example is that, when explaining lightning, Maxwell’s equations are simpler in this sense than the hypothesis that Thor is angry, because the shortest computer program that implements Maxwell’s equations is much shorter than the shortest program that emulates a humanlike brain and its associated emotions.
In the case of many-worlds vs. the Copenhagen interpretation, a computer program implementing either would start with the same algorithm (Schrödinger’s equation), but (the claim is) the program for Copenhagen would also need an extra section specifying how collapse upon observation works, which many-worlds wouldn’t need.
I might as well post a monthly update on the things I’ve been doing that might be useful for getting into AI safety.
I decided to just continue with what I was doing last year before I got distracted and learn analysis from Tao’s Analysis I, on the grounds that it’s maths that’s important to know and that it lets me climb the skill tree analysis → topology → these fixed point exercises. I’ve done chapters 5, 6, and 7.
My question about what it would be most useful for me to be doing still stands, if anyone has any input.
New situation: 3^^^3 people being tortured for 50 years, or one person getting tortured for 50 years and getting a single speck of dust in their eye.
By do unto others, I should, of course, torture the innumerably vast number of people, since I’d rather be tortured for 50 years than be tortured for 50 years and get dust in my eye.