The treaty can constrain the largest projects, like DeepMind. Identifying them can be done by an international court. We don’t need to defend against the smartest humans, only against the crackpots who still think we’re in an arms race instead of a bomb-defusal operation. Imagine that the Manhattan Project mathematicians had computed that a nuke had a 90% chance of igniting the atmosphere. Would their generals still have been table-thumping that they should develop nukes before the Nazis do? I think the more pressing concern is to make every potential researcher, including the Nazis, aware of the 90% chance. Legislation that helps slow things down is all your top-level comment requires.
In your world, a treaty might make everyone keep the researchers in check until alignment is solved.
Your use of the word “punching” looks like clickbait. A nonstandard use should come after your definition of it, and especially shouldn’t appear in the title.
You start with 10 bucks, I start with 10 bucks. You wager any amount, up to a hundred times; each wager is doubled 60% of the time and lost 40% of the time, until one of us is bankrupt or you stop. If you wager it all at once, I have a 40% chance to win. If you wager one buck at a time, you win almost certainly.
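For the one-buck strategy, ignoring the hundred-wager cap, the standard gambler’s-ruin calculation with per-wager win probability p = 0.6, q = 0.4 gives your chance of reaching 20 bucks before 0 from a starting stake of 10:

\[ P(\text{win}) \;=\; \frac{1 - (q/p)^{10}}{1 - (q/p)^{20}} \;=\; \frac{1 - (2/3)^{10}}{1 - (2/3)^{20}} \;\approx\; 0.983 \]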
Our source code may be an input to another algorithm, whose output depends on what it can prove about our output. If we assume it reasons consistently, and we want to prove something about its output, we might assume what it proves about us even when that later turns out to be impossible.
Trusting your eyes blinds you to the invisible; that doesn’t sound particularly hard to explain.
Enlightenment seems to be a change to the way you look at what you already know, one that doesn’t change any predictions. A brain refactoring.
How does someone who has become enlightened know that he has? Does a particular kind of thought come more easily afterwards?
What words did you say to others to explain what cognitive jitter is?
I understand that quote by Nate Soares to mean that other outputs of an algorithm are impossible because, in a deterministic universe, everything could only have happened the way it did.
A different input is just as impossible as a different output. Humans implement counterfactual reasoning without zooming in, so we should look for math that allows this.
Knowledge that there is an action to select, in the form of having an action in hand, allows the implementation of exactly one chooser: the one that always selects that action.
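A minimal Haskell sketch (the names are mine): the action in hand determines the chooser, which ignores the prediction entirely.

```haskell
-- The one chooser licensed by a concrete action: return it
-- regardless of the prediction (this is just `const`).
chooser :: a -> ((a -> b) -> a)
chooser action = \_prediction -> action
```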
This holds for any function k / partition k⁻¹ between any two sets. The proof you want may be that A→B is an exponential space and is therefore usually larger than A.
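For finite sets this is just the counting identity

\[ |A \to B| \;=\; |B|^{|A|}, \]

which exceeds \(|A|\) whenever \(|B| \ge 2\).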
interleave/sandwich should then take two predictions as parameters. This suggests that we could define a metric on the space of predictions, and then sandwich the chooser between two nearby predictions, to measure its response to inaccurate predictions.
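A toy Haskell sketch of what I mean (all names here are mine, not from the post): a sup-metric on predictions over a finite action set, and a sandwich that reports what the chooser picks at each end.

```haskell
type Prediction a = a -> Double
type Chooser a = Prediction a -> a

-- Sup-distance between two predictions; assumes a non-empty,
-- finite action set.
predDistance :: [a] -> Prediction a -> Prediction a -> Double
predDistance actions p q = maximum [abs (p x - q x) | x <- actions]

-- Sandwich the chooser between two nearby predictions and report
-- which action it selects under each, to probe its sensitivity.
sandwich :: Chooser a -> Prediction a -> Prediction a -> (a, a)
sandwich chooser lower upper = (chooser lower, chooser upper)
```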
People could edit their posts to claim that someone else private-messaged or shadow-replied to them. That disrespects the rule that only the author can make a message public, but it can’t exactly be stopped without people just evading to the nearest unblocked strategy. By acknowledging this in the UI, we preempt attempts to use censorship or dark arts to curb the practice, and we preempt people building alternate clients that add the feature; see Reddit’s deleted-post mirrors.
There’s also the thing where people grew up on 4chan embracing the anarchy, and aren’t used to the idea that you can enhance the wild-west web’s user experience without destroying it.
I’m mostly joking now, but a principle of all possible things being intended would dictate that Alice can click an option on Bob’s shadow reply to replace it with a public “Alice claims that Bob shadow-replied thus:”. Alice can fake this, and Bob can confirm iff it’s true.
It’s the sort of feature you might want in a cyberpunk role-playing game taking the shape of a forum.
You could remove a trivial inconvenience in the way of extinguishing demon threads by having a checkbox on replies to make the reply a private message instead. (Perhaps it could even remain in the place where replies are, just only visible to you two!)
The mazes may have been constructed to be hard from the front. (Lesser hypotheses: If you switch to the back when you notice it’s hard from the front, on average it’s going to become easier. If you switch after having explored the maze a little from the front, you already know where to go.)
Right. Perhaps the axiom schema “If T proves φ, then φ”?
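Written out, that’s the local reflection schema for T, one instance per sentence φ:

\[ \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \rightarrow \varphi \]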
I don’t follow. I agree that (PA + “PA is inconsistent”) is consistent. How does it follow that consistency of T isn’t enough? The way I use consistency there is: “If T proves that a program halts, then that program does halt and we can safely run it.”
Given y, the agent can figure out whether x = 0 by checking whether f(0) = y.
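A toy Haskell rendering (names mine; this only works if f is injective, so that f(0) = y pins down x):

```haskell
-- Handed y, test whether x was 0 by recomputing f at 0.
xWasZero :: Eq b => (Integer -> b) -> b -> Bool
xWasZero f y = f 0 == y
```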
The only step at which his machine can fail to halt is “Run all programs for which a halting proof exists, until they halt.” For that step to run forever, some program would have to have a halting proof yet not halt. Therefore, beyond what we need to talk about Turing machines at all, the only extra axiom needed is "T is consistent."
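Spelled out, that step halts iff T only proves halting for programs that do halt:

\[ \forall p:\;\big(T \vdash \mathrm{Halts}(p)\big) \rightarrow \mathrm{Halts}(p) \]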
Aren’t defenses submitted to the Unrestricted Adversarial Examples Challenge incentivized to classify everything but the eligibility data set as ambiguous?
(A→0)→A usually has exactly one element (whenever A is inhabited), so B need not have an element.
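Concretely, for any particular inhabited A the Haskell type checks even with B = Void, and all its inhabitants are extensionally equal, since there is no argument for them to disagree on:

```haskell
import Data.Void (Void)

-- The (extensionally unique) element of (Bool -> Void) -> Bool:
-- no total function Bool -> Void exists, so the body is never reached.
vacuousChooser :: (Bool -> Void) -> Bool
vacuousChooser _ = False
```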
That Hoogle doesn’t list a result essentially follows from k not being parametric in all types. (Except that it lists unsafeCoerce :: A→B; they’d rather have the type system inconsistent than incomplete...)
(A→B)→B stands for what the agent ends up making happen, and may be easier to implement, just like predicting that Kasparov will win a chess match without knowing how. Interesting choosers of type (A→B)→A should tend to have the property that interleave turns them into a particular kind of (A→B)→B. Why would you call it interleave?
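One hedged guess at that particular kind: the standard conversion from a selection function (A→B)→A to a quantifier (A→B)→B feeds the chosen action back into the prediction.

```haskell
-- Selection-to-quantifier: apply the prediction to whatever
-- action the chooser selects under it.
toQuantifier :: ((a -> b) -> a) -> ((a -> b) -> b)
toQuantifier chooser prediction = prediction (chooser prediction)
```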