It’s not clear that people should be agents. Agents are a means of making the content of the world accord with values; they are not optimized for being the valuable content of the world. So a holy madman has a work-life balance problem: they are an instrument of their values rather than an incarnation of them.
Vladimir_Nesov (Vladimir Nesov)
Counterfactual Mugging
- 16 Dec 2022 20:42 UTC; 4 points; comment on “wrapper-minds are the enemy”
Both A and B are unpleasant statements that decent, rational people should probably disagree with
*cringe*
- 12 Jun 2011 20:14 UTC; 21 points; comment on “How not to move the goalposts”
This post suffers from lumping together orthogonal issues and conclusions from them. Let’s consider individually the following claims:
1. The world is in danger, and the feat of saving the world (if achieved) would be very important, more so than most other things we can currently do.
2. Creating FAI is possible.
3. Creating FAI, if possible, will be conducive to saving the world.
4. If FAI is possible, person X’s work contributes to developing FAI.
5. Person X’s work contributes to saving the world.
6. Most people’s work doesn’t contribute to saving the world.
7. Person X’s activity is more important than that of most other people.
8. Person X believes their activity is more important than that of most other people.
9. Person X suffers from delusions of grandeur.
A priori, from (8) we can conclude (9). But assuming the a priori improbable (7), (8) is a rational thing for X to conclude, and (9) doesn’t automatically follow. So, at this level of analysis, in deciding whether X is overconfident, we must necessarily evaluate (7). In most cases, (7) is obviously implausible, but the post itself suggests one pattern for recognizing when it isn’t:
The modern world is sufficiently complicated so that no human no matter how talented can have good reason to believe himself or herself to be the most important person in human history without actually doing something which very visibly and decisively alters the fate of humanity.
Thus, “doing something which very visibly and decisively alters the fate of humanity” is the kind of evidence that allows one to conclude (7). But unfortunately there is no royal road to epistemic rationality: we can’t require this particular form of argument for (7) in all cases. Sometimes the argument takes a different, incompatible form.
In our case, the shape of the argument for (7) is as follows. Assuming (2), from (3) and (4) it follows that (5), and from (1), (5) and (6) we conclude (7). Note that the only claim about the person is (4), that their work contributes to the development of FAI. All the other claims are about the world, not about the person.
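(Schematically, with the claims numbered as in the list above:)

$$(2)\wedge(3)\wedge(4)\ \Rightarrow\ (5), \qquad (1)\wedge(5)\wedge(6)\ \Rightarrow\ (7).$$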
Given the structure of this argument for the abhorrent (8), something being wrong with the person can only affect the truth of (4), and not of the other claims. In particular, the person is overconfident if person X’s work doesn’t in fact contribute to FAI (assuming it’s possible to contribute to FAI).
Now, the extent of overconfidence in evaluating (4) is not related to the weight of importance conveyed by the object-level conclusions (1), (2) and (3). One can be underconfident about (4), and (8) will still follow. In fact, (8) is rather insensitive to the strength of assertion (4): even if you contribute to FAI only a little bit, so long as the other object-level claims hold, your work is still very important.
Finally, my impression is that Eliezer is indeed overconfident about his ability to technically contribute to FAI (claim (4)), but not to the extent this post suggests, since, as I said, the strength of claim (8) has nothing to do with the level of overconfidence in (4), and even a small contribution to FAI is enough to conclude (8), given the other object-level assumptions. Indeed, Eliezer never claims that success is assured:
Success is not assured. I’m not sure what’s meant by confessing to being “ambitious”. Is it like being “optimistic”?
On the other hand, only a few people are currently in a position to claim (4) to any extent. One needs to (a) understand the problem statement, (b) be talented enough, and (c) take the problem seriously enough to direct serious effort at it.
My ulterior motive in elaborating this argument is to make the situation a little clearer to myself, since I claim the same role, just to a smaller extent. (One reason I don’t have much confidence is that each time I “level up”, most recently this May, I realize how misguided my past efforts were, and how much time and effort it will take to develop the skillset necessary for the next step.) I don’t expect to solve the whole problem (and I don’t expect Eliezer or Marcello or Wei to solve the whole problem), but I do expect that over the years some measure of progress can be made by my efforts and theirs, and I expect other people will turn up (thanks to Eliezer’s work on communicating the problem statement of FAI and SIAI’s new work on spreading the word) whose contributions will be more significant.
- 20 Aug 2010 13:42 UTC; 3 points; comment on “The Importance of Self-Doubt”
Distributed training seems close enough to being a solved problem that a project costing north of a billion dollars might get it working on schedule. It’s easier to stay within a single datacenter, and so far it hasn’t been necessary to do more than that, so the fact that distributed training isn’t routinely used yet is hardly evidence that it’s very hard to implement.
There’s also this snippet in the Gemini report:
Training Gemini Ultra used a large fleet of TPUv4 accelerators owned by Google across multiple datacenters. [...] we combine SuperPods in multiple datacenters using Google’s intra-cluster and inter-cluster network. Google’s network latencies and bandwidths are sufficient to support the commonly used synchronous training paradigm, exploiting model parallelism within superpods and data-parallelism across superpods.
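(For concreteness, a toy single-process sketch, mine and not anything from the Gemini report, of the synchronous data-parallel step being described: every replica holds identical weights, computes a gradient on its own data shard, the gradients are averaged (the “all-reduce”), and every replica applies the same update in lockstep.)

```python
import numpy as np

def grad_fn(w, shard):
    # Least-squares gradient on one shard: d/dw ||x w - y||^2 / n
    x, y = shard
    return 2 * x.T @ (x @ w - y) / len(y)

def synchronous_step(weights, data_shards, lr=0.01):
    grads = [grad_fn(weights, shard) for shard in data_shards]  # per-replica gradients
    avg_grad = np.mean(grads, axis=0)                           # synchronous "all-reduce" (average)
    return weights - lr * avg_grad                              # identical update on every replica

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
shards = []
for _ in range(4):  # four "replicas", each with its own shard of data
    x = rng.normal(size=(32, 3))
    shards.append((x, x @ true_w + 0.01 * rng.normal(size=32)))

w = np.zeros(3)
for _ in range(500):
    w = synchronous_step(w, shards)
print(w)  # approaches true_w
```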
I think the crux for the feasibility of further scaling (beyond $10-$50 billion) is whether systems with currently-reasonable cost keep getting sufficiently more useful, for example by enabling economically valuable agentic behavior, things like preparing pull requests based on the feature/bug discussion on an issue tracker, or fixing failing builds. Meaningful help with research is a crux for reaching TAI and ASI, but it doesn’t seem necessary for enabling the existence of a $2 trillion AI company.
- Scaling of AI training runs will slow down after GPT-5 (26 Apr 2024 16:05 UTC; 38 points)
- Scaling of AI training runs will slow down after GPT-5 (26 Apr 2024 16:06 UTC; 10 points; EA Forum)
- 3 May 2024 21:30 UTC; 6 points; comment on “Please stop publishing ideas/insights/research about AI”
I agree that it’s better for that post to not be on LW, but banning such things is not standard procedure, and people don’t like it when moderators do surprising things. In particular, the post didn’t have more serious pathologies sometimes present in other posts (that are usually still not banned), such as hosting a bad prolific discussion or getting downvoted to minus 20.
(If I were to ban posts on the grounds that I consider them bad for LW, I would ban maybe a quarter of Discussion posts. I don’t have the authority to do that, and don’t expect good consequences unless the procedure is accepted by the community on some level. This doesn’t seem likely, or even desirable, in the sense that there are better alternative procedures, such as weighted votes, which would have fewer blind spots.)
Let’s not call shoes we like “rationalist shoes”.
Edit: (Original title of the post was “Rationalist Video Game: Frozen Synapse”.)
- 14 Sep 2018 19:29 UTC; 5 points; comment on “How to use a microphone rationally during public speaking”
I found that when a text requires a second or third reading, taking a lot of notes, etc., I won’t be able to master it at the level of material that I know well, and it won’t be retained as reliably: for example, I won’t be able to re-generate most of the key constructions and theorems without looking them up a couple of years later (this applies even if more advanced topics are practiced in the meantime, as they usually don’t involve systematic review of the basics). Thus, there is a triple penalty for working on challenging material: it takes more effort and time to process, the resulting understanding is less fluent, and it gets forgotten faster and to a greater extent. It’s only worth it if it’s necessary to (passably) learn the material on schedule, or if the material is not of much interest in itself and acts mostly as a stepping stone to more advanced material, or if there is no feasible route that would render the material non-challenging (in which case I’d have to put even more work into it to gain fluency, such as inventing mini-projects, inefficiently studying something not directly useful for my purposes that applies the material and then going back, etc.).
A more efficient path, if the goal is to learn a topic well eventually, is to focus on developing skills that would make the topic easier to learn. Instead of studying the hardest book that you can understand (after three readings and looking things up, etc.), study the easiest book containing something you don’t know very well, that would inform the topic you want to master. Eventually, you get to the book that was originally hard, but it’s now easy and can be mastered reliably.
- 11 Feb 2014 23:20 UTC; 1 point; comment on “Open Thread for February 11 − 17”
Bayesian Utility: Representing Preference by Probability Measures
Deutsch argues that the future is fundamentally unpredictable: for example, expected utility considerations can’t be applied to the future, because we are ignorant of the possible outcomes, of the intermediate steps leading to those outcomes, and of the options that will be available; and there is no way to get around this. The very use of the concept of probability in this context, Deutsch says, is invalid.
As illustration, he lists, among other things, some failed predictions made by smart people in the past, attributing the failures to the unavailability of ideas relevant to the predictions, ideas that would only be discovered much later.
[Science can’t] predict any phenomenon whose course is going to be affected by the growth of knowledge, by the creation of new ideas. This is the fundamental limitation on the reach of scientific explanation and prediction.
[Predictions that are serious attempts to extract unknowable answers from existing knowledge] are going to be biased towards bad outcomes.
(If it’s unknowable, how can we know that a certain prediction strategy is going to be systematically biased in a known direction? Biased with respect to what knowable standard?)
Deutsch explains:
And the basic reason for that is that, as I said, the growth of knowledge is good, so that kind of prophesy, which can’t imagine it, is going to be biased against prophesying good.
Reason and science are the means to progress. They are not means to prophesy.
On a more constructive if not clearly argued note:
Merely pulling the trigger less often doesn’t change the inevitability of doom. [...] One of the most important uses of technology is to counteract disasters and to recover from disasters, both from foreseen and unforeseen evil. Therefore, the speed of progress itself is one of the things that is a defense against catastrophe.
The speed of progress is one of the things that gives the good guys the edge over the bad guys, because good guys make faster progress.
(Possibly an example of the halo effect: the good guys are good, the progress is good, so the good guys will make faster progress than the bad guys. Quite probably, there was better reasoning behind this argument, but Deutsch doesn’t give it, and doesn’t hint at its existence, probably because he considers the conclusion obvious, which is in any case a flaw of the talk.)
For the next 10 minutes or so he argues for the possibility of essentially open-ended technological progress.
The amount of knowledge in an environment of rational thought that allows it to grow, grows exponentially relative to the speed of computation.
[...] It’s a mistake to think of the so-called singularity as being a shock, where we find that we can’t cope with life, because iPhone updates are coming [...] every second. That’s a mistake, because when progress reaches that speed, our technologically enhanced speed of thinking will have increased in proportion, and so subjectively again we will experience mere exponential growth.
Here, Deutsch seemingly makes the same mistake he discussed at the beginning of the talk: making detailed predictions about future technology that depend on the set of technology-defining ideas presently available (which, by his own argument, can lead to underestimation of progress).
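(To make the quoted claim concrete, a toy calculation of my own, not from the talk: if the speed of thought scales with the amount of knowledge $K$, so that subjective time $s$ satisfies $ds = K\,dt$, then even hyperbolic growth in physical time, $dK/dt = K^2$, which reaches a singularity in finite physical time, is merely exponential in subjective time:)

$$\frac{dK}{ds} \;=\; \frac{dK}{dt}\,\frac{dt}{ds} \;=\; \frac{K^2}{K} \;=\; K, \qquad\text{so}\qquad K(s) = K(0)\,e^{s}.$$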
The conclusion is basically a better version of Kurzweil’s view of the Singularity: that ordinary technological progress is going to continue indefinitely (Deutsch’s progress is exponential in subjective time, not in physical time). Yudkowsky wrote in 2002:
I’ve come to the conclusion that what Kurzweil calls the “Singularity” is what we would call “the ordinary progress of technology.” In Kurzweil’s world, the Grinding Gears of Industry churn out AI, superhuman AI, uploading, brain-computer interfaces and so on, but these developments do not affect the nature of technological progress except insofar as they help to maintain Kurzweil’s curves exactly on track.
Deutsch considers Popper’s views on the process of development of knowledge, pointing out that there are no reliable sources of knowledge, and so instead we should turn to finding and correcting errors. From this he concludes:
Optimism demands that we not try to extract prophesies of everything that could go wrong in order to forestall it from our existing scanty and misconception-laden existing knowledge. Instead, we need policies and institutions that are capable of correcting mistakes and recovering from disasters when they happen. When, not if.
(This doesn’t terribly help with existential risks. Also, this optimism thing seems to be one magically reliable source of knowledge, strong enough to ignore whatever best conclusions it is possible to draw using the best tools currently available, however poor they seem on the great cosmic scale.)
The way to prevent that nightmare of rogue AI apocalypse is not try to enslave our AIs, because if the AIs are creating new knowledge (and that’s a definition of AI), then successfully enslaving them would require foretelling (prophesying) the ideas that they could have, and the consequences of those ideas, which is impossible.
This was addressed in Knowability of Friendly AI and many of Yudkowsky’s later writings, most recently in his joint paper with Bostrom. Basically, you can’t predict the moves of a good chess AI (otherwise you’d be at least that good a chess player yourself), and yet you know it’s going to win the game.
Deutsch continues:
So instead, just as for our fellow humans, and for the same reason, we must allow AIs to integrate into the institutions of our open society.
(Or, presumably, so Optimism demands, since the AIs are unpredictable, and technology.)
The only moral values that permit sustained progress are the objective values of an open society and more broadly of the enlightenment. No doubt, the [extraterrestrials’] morality would not be the same as ours, but nor will it be the same as that of 16th century conquistadors. It will be better than ours.
Finally, Deutsch summarizes the meaning of the overarching notion of “optimism” he has been using throughout the talk:
Optimism in this sense that I have argued for is not a feeling, is not a bias or spin that we put on facts, like, you know, half-full instead of half-empty, nor on predictions, it’s not hope for the best, nor blind expectation of the best (in some sense it’s quite the contrary, we expect errors). It is a cold, hard, far-reaching implication of rejecting irrationality, nothing else. Thank you for listening.
(No good questions in the quite long Q&A session. No LWers in the audience, I guess, or only the shy ones.)
- 9 Apr 2011 23:45 UTC; 17 points; comment on “David Deutsch on How To Think About The Future”
- Radio Interview with David Deutsch on AI, Immortality, Many Worlds and Quantum Computing (14 Oct 2011 17:45 UTC; 8 points)
- 14 Oct 2011 20:54 UTC; 5 points; comment on “Radio Interview with David Deutsch on AI, Immortality, Many Worlds and Quantum Computing”
“Think for yourself” sounds vaguely reasonable only because of the abominable incompetence of those tasked with thinking for us.
-- Steven Kaas
Overall take: unimpressed.
Very simple gears in a subculture’s worldview can keep being systematically misperceived if it’s not considered worthy of curious attention. On the LocalLLaMA subreddit, I keep seeing assumptions that AI safety people call for never developing AGI, or claim that the current models can contribute to destroying the world. Almost never is there anyone who bothers to contradict such claims or assumptions. This doesn’t happen because it’s difficult to figure out; it happens because the AI safety subculture is seen as unworthy of engagement, so people don’t learn what it’s actually saying, and don’t correct each other’s errors about what it’s actually saying.
This gets far worse with more subtle details, where the standard of willingness to engage is higher: one has to actually study what the others are saying, since it would be difficult to figure out even with curious attention. Rewarding engagement is important.
The robot is not consequentialist: its decisions are not controlled by the dependence of facts about the future on its decisions.
- Ontological Crisis in Humans (18 Dec 2012 17:32 UTC; 78 points)
- 5 Jul 2011 0:13 UTC; 5 points; comment on “The Blue-Minimizing Robot”
LW is a place where you’ll get useful help on weeding out mistakes in your plan to blow up the world, it looks like.
For Epistemic Rationality!
The original theory is sabotage, not specifically a boiler explosion. People keep saying “How could you possibly sabotage a ship?”, and a boiler explosion is one possible answer, but it’s not the reason the ship was predicted to sink. The boiler explosion theory and the sabotage theory both predict sinking, but it’s a false, superficial agreement; these theories are moved by different arguments.
An agent’s policy determines how its instances act, but in general it also determines which instances exist, and that motivates thinking of the agent as the algorithm channeled by its instances, rather than as one of the instances controlling the others, or as all instances controlling each other. For example, in Newcomb’s problem, you might be sitting inside the box with the $1M, and if you two-box, you have never existed. Grandpa decides to only have children if his grandchildren one-box. Or consider copies in distant rooms numbered (on the outside) 1 to 5, writing integers on blackboards, with only the rooms whose number differs from the written integer by at most 1 being occupied. In the occupied rooms, the shape of the digits is exactly the same, but the choice of integer determines which (if any) of the rooms are occupied. You may carefully write a 7, and all rooms are empty.
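(A throwaway sketch of the rooms example, with made-up names, just to make the dependence explicit: the one choice of integer simultaneously fixes what is written in every occupied room and which rooms are occupied at all.)

```python
# Toy sketch of the rooms example: one policy choice (the integer written)
# determines both the blackboard contents and which rooms exist as occupied.
def occupied_rooms(written: int, rooms=range(1, 6)) -> list[int]:
    # A room is occupied only if its number differs from the written integer by at most 1.
    return [room for room in rooms if abs(room - written) <= 1]

print(occupied_rooms(3))  # [2, 3, 4]: three occupied rooms, each showing a 3
print(occupied_rooms(7))  # []: carefully write a 7, and all rooms are empty
```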
If you are the algorithm, which algorithm are you, and what instances are running you? Unfortunate policy decisions, such as thinking too much, can sever control over some instances, as in ASP, or when (as an instance) retracting too much knowledge (UDT-style) and then (as a resulting algorithm) having to examine too many possible states of knowledge or of possible observations, grasping at a wider scope but losing traction, because the instances can no longer channel such an algorithm. Decisions of some precursor algorithm may even determine which successor algorithm an instance is running, not just which policy a fixed algorithm executes, in which case identifying with the instance is even less coherent than if it can merely cease to exist.
- 28 Aug 2021 14:22 UTC; 10 points; comment on “Can you control the past?”
- 31 Aug 2021 12:11 UTC; 2 points; comment on “Chantiel’s Shortform”
Note on Terminology: “Rationality”, not “Rationalism”
This post gives what could be called an “epistemic Hansonian explanation”. A normal (“instrumental”) Hansonian explanation treats humans as agents that possess hidden goals, whose actions follow closely from those goals, and explains their actual actions in terms of these hypothetical goals. People don’t respond to easily available information about quality of healthcare, but (hypothetically) do respond to information about how prestigious a hospital is. Which goal does this behavior optimize for? Affiliation with prestigious institutions, apparently. Therefore, humans don’t really care about health, they care about prestige instead. As Anna’s recent post discusses, the problem with this explanation is that human behavior doesn’t closely follow any coherent goals at all, so even if we posit that humans have goals, these goals can’t be found by asking “What goals does the behavior optimize?”
Similarly in this instance, when you ask humans a question, you get an answer. Answers to the question “How happy are you with your life these days?” are (hypothetically) best explained by respondents’ current mood. Which question are the responses good answers for? The question about the current mood. Therefore, the respondents don’t really answer the question about their average happiness, they answer the question about their current mood instead.
The problem with these explanations seems to be the same: we try to fit the behavior (actions and responses to questions both) to the idea of humans as agents, whose behavior closely optimizes the goals they really pursue, and whose answers closely answer the questions they really consider. But there seems to be no reality to the (coherent) goals and beliefs (or questions one actually considers) that fall out of a descriptive model of humans as agents, even if there are coherent goals and beliefs somewhere, too loosely connected to actions and anticipations to be apparent in them.
- 18 Mar 2012 13:10 UTC; 8 points; comment on “The Best Comments Ever”
A few points with pure math in mind (notes to my past self, perhaps):
Mathematics is a single discipline: knowing each of its basic topics helps in understanding the others. Don’t omit anything at the undergraduate level; include some topology, set theory, logic, geometry, number theory, category theory, complex analysis, differential equations, and differential geometry where courses tend to skip them (in addition to the more reliably standard linear algebra, analysis, abstract algebra, etc.).
The goal is fluency, as in learning a language, not mere ability to parse the arguments and definitions. It’s possible to follow a text that’s much too advanced for your level, but you won’t learn nearly as much as if you were ready to read it.
Reading unfamiliar mathematics is difficult, while familiar material can be rapidly scanned. As a result, reading partially redundant supplementary texts comes at a relatively modest cost, but improves understanding of the material. In particular, some books can be included primarily to connect topics that are already known. Other books can be included as preliminary texts that precede other ostensibly introductory books that you could parse, but would learn less from without the preliminary text.
Learn every topic multiple times, at increasing levels of sophistication, taking advantage of the improving knowledge of other topics learned in the meantime. The rule of thumb is to read 1-2 books at the undergraduate level, and 1-2 books at the graduate level.
Don’t be overly obsessive: it’s not necessary to repeat all proofs in writing or solve all exercises.
Don’t shy away from revisiting elementary material. It’s not there just as a stepping stone to more advanced material, to be forgotten once you’re through, it should remain comfortably familiar in itself.
- Recommended Reading for Friendly AI Research (9 Oct 2010 13:46 UTC; 35 points)
- 5 Feb 2012 22:45 UTC; 6 points; comment on “What are you working on? February 2012”
- 17 Feb 2012 18:00 UTC; 4 points; comment on “Seeking education”
- 24 Nov 2011 21:27 UTC; 3 points; comment on “Studying Psychology—Which path should I take to best help our cause? Suggestions please.”
So you’ve deleted the posts you’ve made in the past. This is harmful to the blog: it disrupts the record and makes the comments by other people on those posts unavailable.
For example, consider these posts, and comments on them, that you deleted:
11 core rationalist skills (39 points, 33 comments, promoted)
The Terrible, Horrible, No Good, Very Bad Truth About Morality and What To Do About It (23 points, 108 comments, promoted)
Supporting the underdog is explained by Hanson’s Near/Far distinction (20 points, 18 comments, promoted)
Max Tegmark on our place in history: “We’re Not Insignificant After All” (14 points, 68 comments)
I believe it’s against community blog ethics to delete posts in this manner. I’d like them restored.
Edit: Roko accepted this argument and said he’s OK with restoring the posts under an anonymous username (if it’s technically possible).