When outside, I’m usually tracking location and direction on a mental map. This doesn’t seem like a big deal to me, but in my experience few people do it. On some occasions I can tell which way we need to go while others are confused.
“discovering that you’re wrong about something should, in expectation, reduce your confidence in X”
This logic seems flawed. Suppose X is whether humans go extinct. You have an estimate of the distribution of X (for a Bernoulli process it would be some probability p). Take the joint distribution of X and the factors on which X depends (p is now a function of those factors). Your best estimate of p is the mean of p under this distribution, and the variance of p measures how uncertain you are about the factors. Discovering that you’re wrong about something means becoming more uncertain about some of the factors. This would increase the variance. I don’t see any reason to expect the mean to move in any particular direction (see the sketch below).
Or maybe I’m making a mistake. In any case, I’m not convinced.
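A minimal Monte Carlo sketch of the point (a toy model of my own, not from the original post): p is a function of one latent factor θ, and we widen our uncertainty about θ. The variance of p grows, but with a symmetric link function the mean stays put; an asymmetric link could move it in either direction.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_given_factor(theta):
    # Hypothetical link from a latent factor to P(extinction);
    # clipped linear, symmetric around theta = 0.
    return np.clip(0.5 + 0.1 * theta, 0.0, 1.0)

# Growing sigma = growing uncertainty about the factor.
for sigma in (0.5, 1.0, 2.0):
    theta = rng.normal(0.0, sigma, 1_000_000)
    p = p_given_factor(theta)
    print(f"sigma={sigma}: mean(p)={p.mean():.3f}  var(p)={p.var():.4f}")
```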
“there is some threshold of general capability such that if someone is above this threshold, they can eventually solve any problem that an arbitrarily intelligent system could solve”
This is a very interesting assumption. Is there any research or discussion on this?
Is it worth it to read “Information Theory: A Tutorial Introduction 2nd edition” (James V Stone)?
“wireheading … how evolution has addressed it in humans”
It hasn’t; that’s why people do drugs (including alcohol). What stops all humans from wireheading is that all currently available methods work only in the short term and have negative side effects. The ancestral environment didn’t allow humankind to self-destruct by wireheading. Maybe peer pressure not to do drugs exists, but there is also peer pressure in the other direction.
Is the existence of such situations an argument for intuitionistic logic?
Disincentivizing deception in mesa optimizers with Model Tampering
In “Against Discount Rates”, Eliezer characterizes discount rates as arising from monetary inflation, probabilistic catastrophes, etc. In this light, I think a discount factor less than one makes sense (a factor of zero usually indicates you don’t care at all about the future).
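A one-line version of the catastrophe argument (my own gloss, not from Eliezer’s post): if each period independently carries a catastrophe probability h, a payoff U at time t is only realized with probability (1−h)^t, which already behaves like geometric discounting:

```latex
\mathbb{E}[\text{utility at time } t] = (1-h)^{t}\,U
\quad\Longrightarrow\quad
\gamma = 1 - h \in (0,1)
```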
Some human values are proxies for things which make sense in generally intelligent systems, e.g. happiness is a proxy for learning, reproduction, etc.
Self-preservation can be seen as an instance of preservation of learned information (which is a reasonable value for any intelligent system). Indeed, if there were a medium superior to the human brain to which people could transfer the “contents” of their brains, I believe most would do it. It is not a coincidence that self-preservation generalizes this way; otherwise, elderly people would have been discarded from the tribe in the ancestral environment.
The image doesn’t load.
The notation in Hume’s Black Box seems inconsistent. When defining [e], e is an element of a world. When defining I, e is a set of worlds.
Referring to all forms of debate, oversight, etc. as “Godzilla strategies” is loaded language. Should we refrain from summoning Batman because we may end up summoning Godzilla by mistake? Ideally, we want to solve alignment without summoning anything. However, applying some humility, we should consider that the problem may be too difficult for human intelligence to solve.
It would be interesting to see if a similar approach can be applied to the strawberries problem (I haven’t personally thought about this).
“Realistically, the function U_N doesn’t incentivize the agent to perform harmful actions.”
I don’t understand what that means or how it’s relevant to the rest of the paragraph.
Eliezer Yudkowsky once entered an empty Newcomb’s box simply so he could get out when the box was opened.
or
When you one-box against Eliezer Yudkowsky in Newcomb’s problem, you lose because he escapes from the box with the money.
A couple of clarifications, in case somebody is as confused as I was when first reading this.
In ZF we can quantify over sets because “set” is the name we use to designate the underlying objects (the set of natural numbers is an object in the theory). In Peano arithmetic, the objects are numbers, so we can quantify over numbers; we cannot quantify over sets.
Predicates are more “powerful” than first-order formulas, so quantifying over predicates restricts the possible models more than having an axiom for each formula does. Even though every formula defines a predicate, not every predicate is definable by a formula; the interpretation of a predicate symbol is determined by the model, so we cannot capture all predicates by having a formula for each predicate symbol (see the two forms of induction below).
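For concreteness, here are the two forms of induction (standard formulations, not quoted from the post). Second-order PA states induction as a single axiom quantifying over all predicates P; first-order PA can only approximate it with a schema, one axiom per formula φ:

```latex
% Second-order: one axiom, quantifying over all predicates P
\forall P\,\bigl[\,P(0) \land \forall n\,(P(n) \to P(n+1)) \to \forall n\,P(n)\,\bigr]

% First-order schema: one instance for each formula \varphi
\bigl[\,\varphi(0) \land \forall n\,(\varphi(n) \to \varphi(n+1))\,\bigr] \to \forall n\,\varphi(n)
```

Since there are only countably many formulas but uncountably many predicates over the naturals, the schema is strictly weaker.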
How useful is Corrigibility?
“the agent guesses the next bits randomly. It observes that it sometimes succeeds, something that wouldn’t happen if Murphy was totally unconstrained”
Do we assume Murphy knows how the random numbers are generated? What justifies this?
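A toy illustration of why this assumption matters (my own sketch; the seed-reading adversary is hypothetical, not something the post specifies): if Murphy must fix the bits without seeing the agent’s coin flips, the agent wins about half the time no matter what Murphy picks; a Murphy who can replay the agent’s RNG makes the agent lose every round.

```python
import random

N = 100_000

# Murphy fixes the bits in advance, blind to the agent's randomness.
agent = random.Random(42)
murphy = random.Random(7)
bits = [murphy.randint(0, 1) for _ in range(N)]
wins = sum(agent.randint(0, 1) == b for b in bits)
print(f"Murphy blind to the RNG: agent wins {wins / N:.1%}")  # ~50%

# Murphy knows the agent's seed: replay the generator and flip each guess.
agent = random.Random(42)
oracle = random.Random(42)
wins = sum(agent.randint(0, 1) == 1 - oracle.randint(0, 1) for _ in range(N))
print(f"Murphy knows the seed:   agent wins {wins / N:.1%}")  # 0%
```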
To make this easier to parse on a first read, I would add that N is the number of parameters of the NN and that we assume each parameter is binary (instead of the usual float).
The Games and Information book link is broken. It appears to be this book:
https://www.amazon.com/Games-Information-Introduction-Game-Theory/dp/1405136669/ref=sr_1_1?crid=2VDJZFMYT6YTR&keywords=Games+and+Information+rasmusen&qid=1697375946&sprefix=games+and+information+rasmuse%2Caps%2C178&sr=8-1
Given that hardware advancements are very likely to continue, delaying general AI would create a hardware overhang, favoring what Nick Bostrom calls a fast takeoff. This makes me uncertain as to whether delaying general AI is a good strategy.
I expected to read more about actively contributing to AI safety rather than about reactively adapting to whatever is happening.