To clarify: I do not think MCAS specifically is an AI-based system. I was just thinking of a hypothetical future system like it that does include a weak AI component, but where, similarly to ACAS, the issue is not so much a flaw in the AI itself as how it is used within a larger system.
In other words, I think your test needs to distinguish between a situation where a trustworthy AI was needed and the actual AI turned out to be unintentionally/unexpectedly untrustworthy, versus a situation where the AI perhaps performed reasonably well, but the way it was used was problematic and caused a disaster anyway.
HPMOR illustrated how the Universe’s best move is to intervene before you make a precommitment like that, so as to prevent you from ever making it. The redundancy argument does not work either: the redundant copies ought to have some common ancestor earlier in time. So here is me telling you, on behalf of the Universe: DO NOT MESS WITH TIME.
Boeing MCAS (https://en.wikipedia.org/wiki/Maneuvering_Characteristics_Augmentation_System) is blamed for more than 100 deaths. How much “AI” would a similar system need to include for a similar tragedy to count as “an event precipitated by AI”?
Actually, there is a logical error in your mathematicians joke, at least compared to how this joke normally goes. When it’s their turn, the 3rd mathematician knows that the first two wanted a beer (otherwise they would have said “no”), and so can give a definite Yes/No answer. https://www.beingamathematician.org/Jokes/445-three-logicians-walk-into-a-bar.png
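To make the intended reasoning concrete, here is a minimal sketch; the code is purely my own illustration and assumes the standard setup where the bartender asks “Do all of you want a beer?” and each logician hears the previous answers.

```python
# Toy model of the standard three-logicians joke (illustrative only).
def answer(wants_beer, is_last):
    if not wants_beer:
        return "No"            # their own "no" already falsifies "all of you"
    if not is_last:
        return "I don't know"  # wants one, but can't speak for those answering later
    # The last logician knows every earlier "I don't know" came from someone who
    # wants a beer (otherwise they would have said "No"), so they can answer
    # definitively either way.
    return "Yes"

prefs = [True, True, True]     # all three happen to want a beer
print([answer(w, i == len(prefs) - 1) for i, w in enumerate(prefs)])
# ["I don't know", "I don't know", 'Yes']
```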
Why do you think that refugees will be capable of creating better institutions than those that failed them in their country of origin? Could it be that a small (relatively speaking) number of refugees can benefit from the better institutions of their new country, without diluting the locals so much that the implicit institutional knowledge is lost, while a larger influx of immigrants would just import their “bad” institutions with them?
I use an alternative technique that works well for me—making sure to walk up the stack on every significant new development at lower levels.
E.g. if on level 5 I am trying to solve X with technique Y, and I realize that it does not quite work, but I could probably achieve X’, which is just as good, with Y’, then before jumping into Y’ I take time to consider: X’ is as good as X for level 4, but does it perhaps mutate level 4 away from the higher-level goals? Maybe the fact that Y does not actually work for X indicates that the approach at one of the higher levels is off?
And it’s actually similar when Y does succeed for X: once it does, I have learned something new and need to check my stack again. Or maybe I realize that Y is taking much longer than expected; again, I need to walk the stack and figure out whether X and Y are even worth it. This way, when I am in the zone on Y, there is no distraction, but I also do not leave the stack ignored for too long, since being in the zone on Y for too long is itself an indication that something went wrong and the plan needs to be reexamined.
Having hard deadlines, even artificially imposed ones, helps. Having the goals for each of the higher levels explicit (and explicitly written down, so that I can remind myself how I ended up in whatever rabbit hole I am in) helps too.
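If it helps, here is a minimal sketch of what I mean by the stack; the names and structure are entirely made up for illustration, the point is only that each level records its goal plus a quick check against the level above it.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Level:
    goal: str                                # e.g. "solve X with technique Y"
    still_serves_parent: Callable[[], bool]  # does this still serve the level above?

def walk_the_stack(stack: List[Level]) -> List[str]:
    """Return the goals that no longer serve the level above them."""
    return [lvl.goal for lvl in stack if not lvl.still_serves_parent()]

# After realizing Y doesn't quite work and X' with Y' looks tempting,
# re-ask every level whether it still serves the one above it.
stack = [
    Level("level 4: ship the feature",        lambda: True),
    Level("level 5: solve X' (not X) via Y'", lambda: False),  # X' quietly drifted
]
print(walk_the_stack(stack))  # ["level 5: solve X' (not X) via Y'"]
```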
YMMV, of course.
OK, sorry, but can somebody please explain the banana anecdote? What is supposed to be so obviously wrong with this approach? I seriously did not get it.
P.S. Otherwise, great writing—but can I suggest using more transparent anecdotes for your illustrations?
Here is yet another reason this trade may be irrational. If souls were real, then I’d expect the value of a soul to be quite high. For the sake of the argument, let’s posit the value of a soul (if it existed) at $1M. Now the question is—can you make 100,000 statements that you are about as certain of being true as the statement “souls do not exist” and not make even a single mistake? If the answer is “no” (and it’s probably “no” for all but the most careful people), then the habit of selling souls for $10 is a bad habit to have—sooner or later you’d mess up and sell something way too valuable.
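To spell out the arithmetic (using the numbers posited above, which are of course made up for the sake of the argument):

```python
soul_value = 1_000_000   # posited value of a soul, if souls turn out to exist
sale_price = 10          # what you are paid for it

# Selling is positive expected value only if your chance of being wrong about
# "souls do not exist" is below sale_price / soul_value.
break_even_error_rate = sale_price / soul_value
print(break_even_error_rate)   # 1e-05, i.e. fewer than 1 mistake per 100,000 statements
```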
I’d say that the cheerful price is primarily a psychological concept, while the shadow price is a more analytical one, and that is the whole point: when what you feel and what you think you ought to feel disagree, the concept of the cheerful price explicitly tells you not to worry about the mismatch and to go with the former.
It seems you are actually describing a 3-algorithm stack for both the human and the AGI. For the human, there is 1) evolution working at the genome level, 2) long-term brain development / learning, and 3) the brain solving a particular task. Relatively speaking, evolution (#1) works on a much smaller number of much more legible parameters than brain development (#2). So if we used some sort of genetic algorithm for optimizing AGI meta-parameters, we’d get a stack that is very similar in style. And in any case we need to worry about the “base” optimizer used in the AGI version of #1+#2 producing an unaligned mesa-optimizer as the AGI version of the #3 algorithm.
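A minimal sketch of the analogy (no real ML here; every function and number is invented purely for illustration): the outer loop searches over a handful of legible knobs, the middle loop does the long-run “learning”, and the innermost function is the system actually solving a task, which is where a mesa-optimizer would live.

```python
import random

def solve_task(policy, task):      # level 3: the trained system solving one task
    return policy(task)

def train(hyperparams, tasks):     # level 2: long-run learning shapes the policy
    bias = hyperparams["bias"]
    def policy(task):              # the learned policy is where a mesa-optimizer could hide
        return task + bias
    score = sum(solve_task(policy, t) for t in tasks)
    return policy, score

def evolve(generations, tasks):    # level 1: slow outer search over few, legible parameters
    population = [{"bias": random.uniform(-1, 1)} for _ in range(10)]
    for _ in range(generations):
        scored = sorted(population, key=lambda h: train(h, tasks)[1], reverse=True)
        population = [{"bias": p["bias"] + random.gauss(0, 0.1)}
                      for p in scored[:5] for _ in range(2)]
    return population[0]

print(evolve(generations=3, tasks=[1, 2, 3]))
```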
Note that Kelly is valid under the assumption that you know the true probabilities. I do not know whether it is still valid when all you know is a noisy estimate of the true probabilities; is it? It definitely gets more complicated when you are betting against somebody with a similarly noisy estimate of the same probability, since at some level you now need to take their willingness to bet into account when estimating the true probability, and the higher they are willing to go, the stronger the evidence that your estimate may be off. At the very least, that means the uncertainty of your estimate also becomes a factor (the less certain you are, the more attention you should pay to the fact that somebody is willing to bet against you). Then the fact that sometimes you need to spend money on things, rather than just investing/betting/etc., and that you may have other sources of income, also complicates the calculus.
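As a toy illustration of the first question (my own sketch, assuming even-odds bets, where the Kelly fraction is f* = 2p - 1): size the bets off a noisy estimate and compare against sizing off the true probability.

```python
import random

def kelly_fraction(p):                    # even-odds (b = 1) Kelly: f* = p - (1 - p)
    return max(0.0, 2 * p - 1)

def simulate(true_p, noise, rounds=10_000, seed=0):
    rng = random.Random(seed)
    wealth = 1.0
    for _ in range(rounds):
        estimate = min(1.0, max(0.0, true_p + rng.gauss(0, noise)))
        f = kelly_fraction(estimate)      # sized from the estimate, not the truth
        win = rng.random() < true_p
        wealth *= (1 + f) if win else (1 - f)
    return wealth

print(simulate(true_p=0.55, noise=0.0))   # sizing from the true probability
print(simulate(true_p=0.55, noise=0.1))   # noisy estimates (often overbetting) typically do much worse
```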
The way you described the chess/marriage/etc. market, it’s a bit vulnerable. Imagine there is a move that appears to be a very strong one, but with a small possibility of a devastating countermove that is costly for market participants to analyze. There is an incentive to bet on the move: if the countermove exists, hopefully somebody will discover it, bet heavily against the move, and push the price down enough that the move is not taken and the bets are refunded; if no countermove exists, the bet is a good one and is profitable. But if nobody bothers to check for the countermove and it does exist, everybody (those who bet on the move, and the decision makers who made it) is in trouble. And it could still be the case that no bettor has enough incentive to check for the countermove: if it exists, they derive no benefit from the significant mispricing of the move, since the bets are simply refunded.
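Here is a toy payoff calculation for that scenario; every number is invented just to show the incentive gap, and the key assumption is the “bets are refunded if the move is not taken” rule described above.

```python
p_counter     = 0.05   # chance the devastating countermove exists
analysis_cost = 5      # cost for a bettor to actually check for it
stake, win    = 10, 2  # bet size, and profit if the move is played and works out

# Bettor who never checks: profits unless the countermove exists and nobody finds it.
ev_no_check = (1 - p_counter) * win + p_counter * (-stake)

# Bettor who pays to check: if the countermove exists they bet against the move,
# the price drops, the move isn't played... and the bets are refunded, so the
# discovery earns them nothing beyond avoiding the loss.
ev_check = (1 - p_counter) * win + p_counter * 0 - analysis_cost

print(ev_no_check, ev_check)   # 1.4 vs -3.1: checking can be the worse private choice,
                               # even though it's exactly what the market is supposed to buy
```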
Right, which is why the claim is immediately more suspect if Xavier is a close friend/relative/etc.
I do not see the connection. The gist of Newcomb’s Problem does not change if the player is given a time limit (you have to choose within an hour, or you do not get anything). The time-limited halting problem is of course trivially decidable.
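A minimal sketch of why the time-limited version is trivially decidable: just run the program for the given budget and see. The toy model here (a “program” as a Python generator that yields once per simulated step) is my own choice of formalism, purely for illustration.

```python
def halts_within(program, input_value, step_budget):
    """Decide 'does program(input) halt within step_budget steps?' by simulation."""
    steps = program(input_value)
    for _ in range(step_budget):
        try:
            next(steps)
        except StopIteration:
            return True          # halted within the budget
    return False                 # budget exhausted: does not halt within the limit

def count_down(n):               # a toy 'program' that halts after n steps
    while n > 0:
        n -= 1
        yield

print(halts_within(count_down, 3, 10))    # True
print(halts_within(count_down, 100, 10))  # False
```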
I think your analysis of “you’re only X because of Y” is missing the implicit “you are doing it wrong” accusation in the statement. Basically, the implied meaning, I think, is that while there are acceptable reasons to X, you lack any of them; instead your reason for X is Y, which is not one of the acceptable ones. Which is why your Z is a defense: it claims reasons from the acceptable set. And another defense might be to respond directly to the implied accusation and explain why Y should be an OK reason to X: “You’re only enjoying that movie scene because you know what happened before it” gets “Yeah, and what’s wrong with that?”
Random data point—https://ftx.com/trade/TRUMPFEB (“Trump is the President on Feb 1st, 2021”) is currently at 0.142 (14.2% probability it will happen)...
In mathematics, axioms are not chosen just based on what feels correct; instead, the implications of those axioms are explored, and only if those also seem to match intuition do the axioms have some chance of being accepted. If a reasonable-seeming set of axioms allows you to prove something that clearly should not be provable (such as, in the extreme case, a contradiction), then you know your axioms are no good.
Axiomatically stating a particular ethical framework and then exploring the consequences of the axioms in extreme and tricky cases can serve a similar purpose: if seemingly sensible ethical “axioms” lead to completely unreasonable conclusions, then you know you have to revise the stated ethical framework in some way.
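As a toy illustration of that workflow (Lean 4 syntax, with the statements and names invented purely for the example): two “axioms” that might each look fine on their own, but that together prove False, which is exactly the kind of result telling you the axiom set has to be revised.

```lean
-- Illustrative only: an "axiom set" that proves something it clearly should not.
axiom P : Prop
axiom lying_is_always_wrong : P
axiom lying_to_save_a_life_is_fine : ¬P

theorem the_framework_needs_revision : False :=
  lying_to_save_a_life_is_fine lying_is_always_wrong
```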
Perhaps also higher availability of testing and higher awareness mean that more people with mild symptoms get tested?
Well, this is the Committee on Armed Services; obviously the adversarial view of things is kind of part of their job description… (Not that this isn’t a problem, just pointing out that they are probably not the best place to look for a non-adversarial opinion.)
More of an anecdote than research, but I recently became aware of Dr. A. J. Cronin’s novel “The Citadel”, published in 1937, and the claim that the book prompted new ideas about medicine and ethics, inspiring to some extent the UK NHS and the ideas behind it. I did not look into this much myself, but it is certainly a very fascinating story, if true.