Suggestions for Slytherin: Sun Tzu’s Art of War and some Nietzsche, maybe The Will to Power?
Suggestion for Ravenclaw: An Enquiry Concerning Human Understanding, David Hume.
I had long ago (but after being heavily influenced by Overcoming Bias) thought that signaling could be seen simply as a corollary to Bayes’ theorem. That is, when one says something, one knows that its effect on a listener will depend on the listener’s rational updating on the fact that one said it. If one wants the listener to behave as if X is true, one should say something that the listener would only expect in case X is true.
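Concretely, if s is the thing said and X the fact at issue, the listener computes

$$P(X \mid s) = \frac{P(s \mid X)\,P(X)}{P(s \mid X)\,P(X) + P(s \mid \neg X)\,P(\neg X)},$$

which approaches 1 as $P(s \mid \neg X)$ approaches 0.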
Thinking in this way, one quickly arrives at conclusions like “oh, so hard-to-fake signals are stronger” and “if everyone starts sending the same signal in the same way, that makes it a lot weaker”, which test quite well against observations of the real world.
Powerful corollary: we should expect signaling, along with these basic properties, to be prominent in any group of intelligent minds. For example, math departments and alien civilizations. (Non-example: solitary AI foom.)
I have direct experience of someone highly intelligent, a prestigious academic type, dismissing SI out of hand because of its name. I would support changing the name.
Almost all the suggestions so far try to work the idea of safety or friendliness into the name. I think this might be a mistake, because for people who haven’t thought about it much, it invokes images of Hollywood. Instead, I propose having the name imply that SI does some kind of advanced, technical research involving AI and is prestigious, perhaps affiliated with a university (think IAS).
Center for Advanced AI Research (CAAIR)
Wow, when I read “should not be treated differently from those issues”, I assumed the intention was likely to be “child acting, indoctrination, etc., should be considered abuse and not tolerated by society”, a position I would tentatively support (tentatively due to lack of expertise).
Incidentally, I found many of the other claims to be at least plausible and discussion-worthy, if not probably true (and certainly not things that people should be afraid to say).
I wonder how it would be if you asked “When should we say a statement is true?” instead of “What is truth?”, and whether your classmates would think them the same (or at least closely related) questions.
Ah, now I feel extremely silly. The irony did not occur to me; it was simply a long comment that I agreed with completely, and I wasn’t satisfied merely upvoting it because it didn’t have any (other) upvotes yet at the time. Plus, doubly ironically, I was on a moral crusade to defend the karma system...
Why all the karma bashing? Yes, absolutely, people will upvote or downvote for political reasons and be heavily influenced by the name behind the post/comment. All the time. But as far as I can tell, politics is a problem with any evaluation system whatsoever, and karma does remarkably well. In my experience, post and comment scores are strongly correlated with how useful I find them, how much they contribute to my experience of the discussion. And the list of top contributors is full of people who have written posts that I have saved forever, that in many cases irreversibly impacted my thinking. The fact that EY is sometimes deservedly downvoted is a case in point. The abuse described in the original post is unfortunate, but overall the LessWrong system does a difficult job incredibly well.
I don’t know if there’s a big reason behind it, but because .com is so entrenched as the “default” TLD, I think it’s probably best to be LessWrong.com rather than LessWrong.net or any other choice, simply because “LessWrong.com” is more likely to be correctly remembered by people who hear of it briefly, or correctly guessed by people who heard “Less Wrong” and take a stab at their browser’s navigation bar.
I admit this point may be relatively trivial, since it’s the first Google hit for “less wrong” and that’s probably how most people who’ve only heard of it look for it.
These phrases are mainly used in near mode, or when trying to induce near mode. The phenomenon described in the quote is a feature (or bug) of far mode.
The Pentium FDIV bug was actually discovered by someone writing code to compute prime numbers.
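A minimal reproduction of the check that circulated at the time (the constants are the famous ones, quoted from memory):

```python
# Classic FDIV-bug check: on a correct FPU this prints exactly 0.0;
# flawed Pentiums famously returned 256 instead.
x, y = 4195835.0, 3145727.0
print(x - (x / y) * y)
```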
I’m really glad you pointed out that SI’s strategy is not predicated on hard take-off. I don’t recall if this has been discussed elsewhere, but it’s something that has always bothered me, since I think hard take-off is relatively unlikely. (Admittedly, soft take-off still considerably diminishes the expected impact I assign to SI and to donating to it.)
But this elegant simplicity was, like so many other things, ruined by the Machiguenga Indians of eastern Peru.
Wait, is this a joke, or have the Machiguenga really provided counterexamples to lots of social science hypotheses?
Summary: Expanding on what maia wrote, I find it plausible that many people could produce good technical arguments against cryonics but don’t simply because they’re not writing about cryonics at all.
I was defending maia’s point that there are many people who are uninterested in cryonics and don’t think it will work. This class probably includes lots of people with relevant expertise as well. So while there are a lot of people who develop strong anti-cryonics sentiments (and say so), I suspect they’re only a minority of the people who don’t think cryonics will work. So the fact that the bulk of anti-cryonics writing lacks a tenable technical argument is only weak evidence that no one can produce one right now. It’s just that the people who could produce one aren’t interested enough to bother writing about cryonics at all.
I wholeheartedly agree that we should encourage people who may have them to write up strong technical arguments why cryonics won’t work.
Right on.
Imagine a thousand professional philosophers joining LessWrong, or worse, a thousand creationists.
This test seems rather unfair—it’s pretty much a given that people who join LessWrong are likely to be already sympathetic to LessWrong’s way of thinking. Besides, the only way to avoid a situation where thousands of dissidents joining could wreck the system is to have centralized power, i.e., more traditional moderation, which I think we were hoping to avoid for exactly the types of reasons that are being brought up here (politics, etc.).
The availability of a reputation system also discourages people from actually explaining themselves: they can let off steam or ignore cognitive dissonance by downvoting someone with a single mouse click.
True, but I think you have missed a positive incentive for response that the reputation system creates in addition to the negative ones: a post or comment with a bad argument (or worse) creates an opportunity to win karma by writing a clear refutation, and I frequently see such responses being highly upvoted.
The initial population of a community might have been biased about something and the reputation system might provide a positive incentive to keep the bias and a negative incentive for those who disagree.
This is a problem, but based purely on my subjective experience it seems that people are more than willing to upvote posts that try to shatter a conventional LessWrong belief, and do so with good argumentation.
I’m not going to say they haven’t been exposed to it, but I think quite few mathematicians have ever developed a basic appreciation and working understanding of the distinction between syntactic and semantic proofs.
Model theory is, very rarely, successfully applied to solve a well-known problem outside logic, but you would have to sample many random mathematicians before finding one who could tell you exactly how, even if you restricted yourself to asking only mathematical logicians.
I’d like to add that in the overwhelming majority of academic research in mathematical logic, the syntax-semantics distinction is not at all important, and syntax is suppressed as much as possible as an inconvenient thing to deal with. This is true even in model theory. Now, one often needs to discuss formulas and theories, but a syntactic proof need never be considered. First-order logic is dominant, and the completeness theorem (together with soundness) shows that syntactic implication is equivalent to semantic implication.
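In symbols, for a first-order theory $T$ and sentence $\varphi$:

$$T \vdash \varphi \quad\Longleftrightarrow\quad T \models \varphi,$$

so nothing is lost by reasoning purely semantically.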
If I had to summarize what modern research in mathematical logic is like, I’d say that it’s about increasingly elaborate notions of complexity (of problems or theorems or something else), and proving that certain things have certain degrees of complexity, or that the degrees of complexity themselves are laid out in a certain way.
There are however a healthy number of logicians in computer science academia who care a lot more about syntax, including proofs. These could be called mathematical logicians, but the two cultures are quite different.
(I am a math PhD student specializing in logic.)
I think you may be missing a silent majority of people who passively judge cryonics as unlikely to work, and do not develop strong feelings or opinions about it besides that, because they have no reason to. I think this category, together with “too expensive to think about right now”, forms the bulk of intelligent friends with whom I’ve discussed cryonics.
Yes, and isn’t it interesting to note that Robin Hanson sought his own higher degrees for the express purpose of giving his smart contrarian ideas (and way of thinking) more credibility?
I continue to be surprised (I believe I commented on this last year) that under “Academic fields” pure mathematics is not listed on its own. It is also not clear to me that pure mathematics is a hard science. Relatedly, are non-computer-science engineering folks expected to write in answers?
I second this: please include pure mathematics. I imagine there are a fair few of us, and there’s no agreed upon way to categorize it. I remember being annoyed about this last year. (I’m pretty sure I marked “hard sciences”.)
In short, there is a dangerous and almost universal tendency to think about FAI (and AGI generally) primarily in far mode.

Yes!
However, I’m less enamored with the rest of your post. The reason is that building AGI is simply an altogether higher-risk activity than traveling to the moon. Using “build a chemical powered rocket” as your starting point for getting to the moon is reasonable in part because the worst that could plausibly happen is that the rocket will blow up and kill a lot of volunteers who knew what they were getting into. In the case of FAI, Eliezer Yudkowsky has taken great pains to show that the slightest, subtlest mistake, one which could easily pass through any number of rounds of committee decision making, coding, and code checking, could lead to existence failure for humanity. He has also taken pains to show that approaches to the problem which entire committees have in the past thought were a really good idea, would also lead to such a disaster. As far as I can tell, the LessWrong consensus agrees with him on the level of risk here, at least implicitly.
There is another approach. My own research pertains to automated theorem proving, and its biggest application, software verification. We would still need to produce a formal account of the invariants we’d want the AGI to preserve, i.e., a formal account of what it means to respect human values. When I say “formal”, I mean it: a set of sentences in a suitable formal symbolic logic, carefully chosen to suit the task at hand. Then we would produce a mathematical proof that our code preserves the invariants, or, more likely, we would use techniques for producing the code and the proof at the same time. So we’d more or less have a mathematical proof that the AI is Friendly. I don’t know how the SIAI is trying to think about the problem now, exactly, but I don’t think Eliezer would be satisfied by anything less certain than this sort of approach.
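To give a sense of the shape such a result would take, here is a toy sketch in Lean; every name in it is a placeholder I invented, and the predicate `friendly` stands in for exactly the formalization of human values that nobody knows how to write:

```lean
-- Toy model: the system is a step function on states, and "Friendliness"
-- is an invariant we prove every step preserves. All names are hypothetical
-- placeholders, not a real formalization of human values.
structure State where
  resources         : Nat
  humansFlourishing : Bool

def friendly (s : State) : Prop :=
  s.humansFlourishing = true

-- This step only acquires resources; it leaves flourishing untouched.
def step (s : State) : State :=
  { s with resources := s.resources + 1 }

-- The verification result: a single step preserves Friendliness.
theorem step_preserves_friendly (s : State) (h : friendly s) :
    friendly (step s) := h
```

The real theorem would have to quantify over every action the system could ever take, which is where the difficulty lives.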
Not that this outline, at this point, is satisfactory. The formalization of human value is a massive problem, and arguably where most of the trouble lies anyway. I don’t think anyone’s ever solved anything even close to this. But I’d argue that this outline does clarify matters a bit, because we have a better idea what a solution to this problem would look like. And it makes it clear how dangerous the loose approach recommended here is: virtually all software has bugs, and a non-verified recursively self-improving AI could magnify a bug in its value system until it no better approximates human values than does paperclip-maximizing. Moreover, the formal proof doesn’t do anyone a bit of good if the invariants were not designed correctly.