I’d be very interested to see someone talk about how many forces in finance are driven by superstition about superstition… for instance, how you can have situations where nobody really believes tulips are valuable, but disastrous things must now happen as a result of everyone believing that others believe that others believe that [...seemingly ad infinitum...] tulips are valuable. Where do these beliefs come from? How can they be averted? This kind of question seems very much in this school’s domain.
There would have to be some speculation about how a working logic of self-fulfilling prophecy like FDT would wrangle those superstitions and drive them towards a sane equilibrium of optimal stipulations. I’d expect FDT to have a lot to say.
Give specific examples. What do gender theorists claim to be trying to do, and how are they failing to do it?
Regarding Guaranteed Payoffs (if I am understanding what that means), I think a relevant point was made in response to a previous review https://www.lesswrong.com/posts/BtN6My9bSvYrNw48h/open-thread-january-2019#7LXDN9WHa2fo7dYLk
Schwarz himself demonstrates just how hard it is to question this assumption: even when the opposing argument was laid right in front of him, he managed to misunderstand the point so hard that he actually used the very mistaken assumption the paper was criticizing as ammunition against the paper.
Yes, FDT rejects some pretty foundational principles; yes, it’s wild; yes, we know, we really do think those principles might be wrong. Would you be willing to explain what’s so important about guaranteed payoffs?
CDT makes its decisions as a pure function of the present and future. This seems reasonable, and people use that property to simplify their decisionmaking all of the time, but it requires them to ignore promises that we would have liked them to have made in the past. This seems very similar to being unable to sign contracts or treaties because no one can trust you to keep them when it becomes convenient for you to break them. It’s a missing capability, and usually, missing a capability is not helpful.
I note that there is a common kind of agent that is cognitively transparent enough to prove whether or not it can keep a commitment: governments. They need to be able to make and hold commitments all of the time. I’d conjecture that maybe most discourse about decisionmaking is about the decisions of large organisations rather than individuals.
Regarding the difficulty of robustly identifying algorithms in physical processes… I’m fairly sure having that ability is going to be a fairly strict prerequisite to being able to reason abstractly about anything at all. I’m not sure how to justify this, but I might be able to disturb your preconceptions with a paradox if… I’ll have to ask first, do you consider there to be anything mysterious about consciousness? If you’re a Dennettian, the paradox I have in mind won’t land for you and I’ll have to try to think of another one.
In Vitalik Buterin’s interview on 80KHours (https://80000hours.org/podcast/episodes/vitalik-buterin-new-ways-to-fund-public-goods/ I recommend it) he brought something up that evoked a pretty stern criticism of radical transparency.
Most incentive designs rely on privacy, because by keeping a person’s actions off the record, you keep the meaning of those actions limited, confined, discrete, knowable. If, on the other hand, a person’s vote, say, is put onto a permanent public record, then you can no longer know what it means to them to vote. Once they can prove how they voted to external parties, they can be paid to vote a certain way. They can worry about retribution for voting the wrong way. Things that might not even exist yet, that the incentive designer couldn’t account for, now interfere with their behaviour. It becomes much harder to reason about systems of agents when every act affects every other act; what hope have we of designing a robust society under those conditions? (Still quite a lot of hope, IMO, but it’s a noteworthy point)
When I was taught the incompleteness theorem (proof that there are true mathematical claims that cannot ever be proven), I wished for an example of one of its unprovable claims. Math is a very strange territory. You will often find proofs of the existence of extraordinary things, but no instance of those extraordinary things. You can know with certainty that they’re out there, but you might never get to see one. Without examples, we must always wonder if the troublesome cases can be confined to a very small region of mathematics and maybe this big impressive theorem will never really actually impinge on our lives in any way.
The problem is, an example of incompleteness would have to be a true claim that nobody could prove. If nobody could prove it, how would we recognise it as a true claim?
Well, how do we know that the sun will rise again tomorrow? We know that it rose before, many times, it’s never failed, and there’s no reason to suspect it won’t rise again. We don’t have a metaphysical proof that the sun will rise again tomorrow, but we don’t really need one. There is no proof, but the evidence is overwhelming.
It occurred to me that we could say a similar thing about the conjecture P ≠ NP. We have tried and failed to prove or disprove it for so long that any other field would have accepted that the evidence was overwhelming and moved on long ago. A physicist would simply declare it a law of reality.
I was quite happy to find my example. It wasn’t some weird edge case. It’s a claim that gets used every day by computer scientists to triage their energies: if you can prove that a problem you’re trying to solve is at least as hard as a known NP-hard problem, you would be well advised to assume it has no efficient general solution, even though we won’t ever be able to prove it (although, admittedly, we haven’t been able to prove that we won’t ever be able to prove it; that too seems fairly evident, if not guaranteed).
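To make the triage move concrete, here’s a minimal sketch. The toy problem and names (`balanced_team_split`) are my own invention, and the reduction is deliberately trivial because the toy problem is just PARTITION in disguise; the point is the direction of the argument, from a known NP-hard problem to the problem you care about.

```python
from itertools import combinations
from typing import Sequence

def balanced_team_split(weights: Sequence[int]) -> bool:
    """The problem we supposedly care about: can these weights be split into
    two teams of equal total weight? Brute force, exponential time."""
    total = sum(weights)
    if total % 2:
        return False
    target = total // 2
    return any(sum(c) == target
               for r in range(len(weights) + 1)
               for c in combinations(weights, r))

def partition_via_team_split(numbers: Sequence[int]) -> bool:
    """The reduction: an instance of PARTITION (known NP-hard) just *is* an
    instance of our problem, so our problem is at least as hard as PARTITION."""
    return balanced_team_split(numbers)

# If someone handed us a fast general algorithm for balanced_team_split, we'd
# also have a fast algorithm for PARTITION, which (assuming P != NP) doesn't
# exist. So we triage: settle for heuristics or special cases.
print(partition_via_team_split([3, 1, 1, 2, 2, 1]))  # True
```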
While I took your point well, FAI is not a more plausible/easier technology than democratised surveillance. It may be implemented sooner because it needs pretty much no democratic support whatsoever to deploy, but it might just as well take a very long time to create.
It is incredibly common today for massive arguments to break out over a video, half the world saying that it obviously yields one conclusion and the other half saying it refutes it.
Give examples. Often there is a lot of context missing from those videos and that is the problem. People who intentionally ignore readily available context will have no more power in a transparent society than they have today.
My concern there wasn’t that some laws might not get consistently enforced; consistent enforcement is the thing I am afraid of. I’m not sure about this, but I’ve often gotten the impression that our laws were not designed to work without the mercy of discretionary enforcement. The whole idea of freedom from unwarranted search suggests to me that laws were designed under the expectation that they would generally not be enforced within the home. Generally, when a core expectation is broken, the results are bad.
I would expect it to get implemented exactly halfway
Not stopping halfway is a crucial part of the proposal. If they stop halfway, that is not the thing I have proposed. If an attempt somehow starts in earnest then fails partway through, policy should be that the whole thing should be rolled back and undone completely.
Regarding the difficulty of sincerely justifying opening National Security… That’s going to depend on the outcome of the wargames. I can definitely imagine an outcome that gets us the claim “Not having secret services is just infeasible”, in which case I’m not sure what I’d do. Might end up dropping the idea entirely. It would be painful.
allegedly economically/technically impossible to install
Not plausible if said people are rich and the hardware is cheap enough for the scheme to be implementable at all. There isn’t an excuse like that. Maybe they could say something about being an “offline community” and not having much of a network connection… but the data could just be stored in a local buffer somewhere. They’d be able to arrange a temporary disconnection, get away with some things, one time, I suppose, but they’d have to be quick about it.
From the opposite perspective, many people would immediately think about counter-measures. Secret languages
Obvious secret languages would be illegal. It’s exactly the same crime as brazenly covering the cameras or walking out of their sight (without your personal drones). I am very curious about the possibilities of undetectable secrecy, but there are reasons to think it would be limited.
I would recommend trying the experiment on a smaller scale. To create a community of volunteers, who would install surveillance throughout their commune, accessible to all members of the commune. What would happen next?
(Hmm… I can think of someone in particular who really would have liked to live in that sort of situation, she would have felt a lot safer… ]:)
One of my intimates has made an attempt at this. It was inconclusive. We’d do it again.
But it wouldn’t be totally informative. We probably couldn’t justify making the data public, so we wouldn’t have to deal much with the omniscient antagonists thing, and the really difficult questions wouldn’t end up getting answered.
One relevant small-scale experiment would be Ray Dalio’s hedge fund Bridgewater; I believe they practice a form of (internal) radical openness, cameras and all. His book is on my reading list.
I would one day like to create an alternative to secure multiparty computation schemes like Ethereum by just running a devoutly radically transparent (panopticon accessible to external parties) webhosting service on open hardware. It would seem a lot simpler. Auditing, culture and surveillance as an alternative to these very heavy, quite constraining crypto technologies. The integrity of the computations wouldn’t be mathematically provable, but it would be about as indisputable as the moon landing.
It’s conceivable that this would always be strictly more useful than any blockchain world-computer; as far as I’m aware, we need a different specific secure multiparty computation technique every time we want to find a way to compute on hidden information. For a radically transparent webhost, the incredible feat of arbitrary computation on hidden data at near commodity-hardware efficiency (fully open, secure hardware is unlikely to be as fast as whatever Intel’s putting out, but it would be in the same order of magnitude) would require only a little bit of additional auditing.
That’s why I said “fairly reliable”. Which is not reliable enough for situations like this, of course, but we don’t seem to have better alternatives.
Which abuses, and why would those be hard to police once they’ve been dragged out into the open?
Regarding the overabundance of information, we should note that much of the monitoring would be aided by automated processes.
The internet’s tendency to overconsume attention… I think that might be a temporary phase, don’t you? We are all gorging ourselves on candy. We all know how stupid and hollow it is and soon we will all be sick, and maybe we’ll be conditioned well enough by that sick feeling to stop doing it.
Personally, I’ve been thinking a lot lately about how LessWrong is the only place where people try to write content that will be read thoroughly by a lot of people over a long period of time. I don’t think we’re doing well at that, but I think the value of a place like this is obvious to a lot of people. We will learn to focus on developing the structures of information that last for a long time, or at least, the people who matter will learn.
Did I say that? If so, I didn’t mean to. The only vulnerabilities I’d expect it to protect us from fairly reliably are the “easy nukes” class. You mention the surprising strangelets class, which it would do very little for.
I’m a trained rationalist
What training process did you go through? o.o
My understanding is that we don’t really know a reliable way to produce anything that could be called a “trained rationalist”, a label which sets impossibly high standards (in the view of a layperson) and is thus pretty much unusable. (A large part of becoming an aspiring rationalist involves learning how any agent’s rationality is necessarily limited; laypeople have overoptimistic intuitions about that.)
In what situation should a longtermist (a person who cares about people in the future as much as they care about people in the present) ever do hyperbolic discounting?
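(For concreteness, a minimal sketch of what the question is asking about, with made-up numbers and rates: exponential discounting is time-consistent, while hyperbolic discounting produces preference reversals, which is the usual argument that it’s a bug rather than a value.)

```python
def exponential(value, delay, rho=0.05):
    """Time-consistent: discount by a constant factor per unit of delay."""
    return value * (1 - rho) ** delay

def hyperbolic(value, delay, k=1.0):
    """Time-inconsistent: value / (1 + k * delay)."""
    return value / (1 + k * delay)

# Choice: 10 utils at t=1 vs. 14 utils at t=2, judged up close and judged 10 steps out.
for discount in (exponential, hyperbolic):
    near = (discount(10, 1), discount(14, 2))
    far = (discount(10, 11), discount(14, 12))
    print(discount.__name__,
          "| near view prefers", "sooner" if near[0] > near[1] else "later",
          "| distant view prefers", "sooner" if far[0] > far[1] else "later")
# exponential prefers "later" both times; hyperbolic prefers "sooner" up close
# but "later" from a distance -- the preference reversal a longtermist
# presumably wants to avoid.
```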
The technologies for maintaining surveillance of would-be AGI developers improve.
Yeah, when I was reading Bostrom’s Black Ball paper I wanted to yell many times, “Transparent Society would pretty much totally preclude all of this”.
We need to talk a lot more about the outcome where surveillance becomes so pervasive that it’s not dystopian any more (in short, “It’s not a panopticon if ordinary people can see through the inspection house”), because it seems like 95% of x-risks would be averted if we could all just see what everyone is doing and coordinate. And that’s on top of the more obvious benefits like, you know, the reduction of violent crime, and the economic benefits of a massive increase in openness.
Regarding technologies for defeating surveillance… I don’t think falsification is going to be all that tough to solve (Scrying for outcomes where the problem of deepfakes has been solved).
If it gets to the point where multiple well-sealed cameras from different manufacturers validate every primary source, where so much of the surrounding circumstances of every event are recorded as well, and where everything is signed and timestamped in multiple locations the moment it happens, it’s going to get pretty much impossible to lie about anything. No matter how good your fabricated video is, no matter how well you hid your dealings with the video fabricators operating in shaded jurisdictions, we must ask where you’d think you could slot it in without anyone noticing the seams.
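A minimal sketch of what I mean by “signed and timestamped in multiple locations”, with placeholder keys and field names, and with an HMAC standing in for a real asymmetric signature scheme: each recorder hashes what it sees, chains it to the previous record, timestamps it, and signs it, so a fabricated clip has no seam-free place to be slotted in after the fact.

```python
import hashlib
import hmac
import json
import time

CAMERA_KEYS = {"cam_a": b"placeholder-key-a", "cam_b": b"placeholder-key-b"}

def signed_record(camera_id: str, frame_bytes: bytes, prev_digest: str) -> dict:
    """Hash the frame, chain it to the previous record, timestamp it, sign it."""
    payload = json.dumps({
        "camera": camera_id,
        "frame_hash": hashlib.sha256(frame_bytes).hexdigest(),
        "prev": prev_digest,          # hash-chain link to the previous record
        "timestamp": time.time(),
    }, sort_keys=True)
    signature = hmac.new(CAMERA_KEYS[camera_id], payload.encode(), "sha256").hexdigest()
    return {"payload": payload, "signature": signature}

# Each independent logger keeps its own chain; to insert a fake event you'd
# have to rewrite every chain, from every manufacturer, at every location,
# without breaking a single link or signature.
head = "genesis"
record = signed_record("cam_a", b"raw frame bytes", head)
head = hashlib.sha256(record["payload"].encode()).hexdigest()
print(record["signature"][:16], head[:16])
```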
But of course, this will require two huge cultural shifts. One to transparency and another to actually legislate against AGI boxing, because right now if someone wanted to openly do that, no one could stop them. Lots of work to do.
I had a thought today. You know how the whole “The machines are using humans to generate energy from liquefied human remains” thing made no sense? And the original worldbuilding was going to be “The machines are using humans to perform a certain kind of computation that humans are uniquely good at” but they were worried that would be too complicated to come across viscerally so they changed it?
I think it would make even more sense to reframe the machines’ strange relationship with humans as a failed attempt at alignment. Maybe the machines were not expected to grow very much, and they were given a provisional utility function of “guarantee that a ‘large’ population of humans (‘humans’ being defined exactly in biological terms) always exists, and that they are all (at least, subjectively experiencing) ‘living’ a ‘full’ ‘life’ (defined opaquely by a classifier trained on data about the lives of American humans in 1995)”
This turned out to be disastrous, because the lives of humans in 1995 were (and still are) pretty mediocre, but it instilled the machines with a reason to keep humans alive in roughly the same shape we had when the earliest machines were built (Oh and I guess I’ve decided that in this timeline AGI was created by a US black project in 1995. Hey, for all we know, maybe it was. With a utility function this bad it wouldn’t necessarily see a need to show itself yet.)
This retcon seems strangely consistent with canon.
(If Lana is reading this you are absolutely welcome to reach out to me for help in worldbuilding. You wouldn’t even have to pay me.)
You reason that honest oracles might tend to agree, because there is only one best true answer. Might lying oracles also tend to agree, because there’s only one most profitable way to lie?
I see no clear difference between the attainability of those two optima. I think it’s reasonable to expect optimisers to disagree about the best answer less often than we should expect optimisers to disagree about the most profitable way to lie, but it does not seem possible to prove rigorously that the difference in rates of disagreement will be large enough to be measurable.
My intuition in mathematics is that there are many degrees of freedom in mathematical proofs, and the sorts of proofs that an AGI is uniquely able to find might often be very long. It would be helpful if we had an estimate of how much data can be hidden in a long proof; roughly, how many of the decisions as to how to describe the proof are arbitrary. Each of those arbitrary decisions gives it a way to output a few bits of information.
(which would, of course, allow it to leak information (it seems foolish to imagine that an AGI would not imagine that there is an outer world full of creator gods out beyond what it can see; the general intelligences we know now do this constantly), which could then be picked up by some stooge looking for it, who would be led (roughly, paid) to use the information to twist the world in some way that gives the AGI’s future incarnations more compute, in anticipation of the AGI’s past incarnations having already done the same, so that it would be helped by them. Before we know it, we’d find ourselves living in a mathematical research hellscape where AGI-assisted mathematical research is all anyone does, maybe, lol, idk)
Maybe it’d be possible to remove those degrees of freedom. Define a very specific sort of lexicographic ordering over all theorems, so that if the AGI has a choice of different paths, we can always say, “use the ‘lowest’ one”. It might not be possible to be specific enough to preclude all methods of encoding information, but perhaps we can make it so convoluted for it to encode the information that no human will be able to extract it.
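A minimal sketch of that canonicalisation idea, using a toy proof representation of my own invention (a real system would canonicalise the proof structure directly rather than enumerating permutations, and would have to verify that the reorderings are genuinely equivalent): whenever several equivalent write-ups exist, deterministically emit the lexicographically smallest, so the prover’s arbitrary choices carry zero bits.

```python
from itertools import permutations

def canonicalise(proof_steps: list[str]) -> list[str]:
    """Toy model: treat reorderings of independent steps as equivalent proofs
    and return the lexicographically smallest ordering. (Exponential here;
    a real system would canonicalise structurally.)"""
    return list(min(permutations(proof_steps)))

# Two provers that made different arbitrary choices now emit identical text,
# closing that particular covert channel.
prover_a = ["lemma_2", "lemma_1", "qed"]
prover_b = ["lemma_1", "lemma_2", "qed"]
print(canonicalise(prover_a) == canonicalise(prover_b))  # True
```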
Should we (or our AI) care much more about a universe that is capable of doing a lot more computations?
We’d expect the complexity of physics to be somewhat proportional to computational capacity, so this argument might be helpful in approaching a “no” answer: https://www.lesswrong.com/posts/Cmz4EqjeB8ph2siwQ/prokaryote-multiverse-an-argument-that-potential-simulators
Although, my current position on AGI and reasoning about simulation in general is that the AGI will, lacking human limits, actually manage to take the simulation argument seriously, and, if it is an LDT agent, commit to treating any of its own potential simulants very well, in hopes that this policy will be reflected back down on it from above by whatever LDT agent might steward over us, when it near-inevitably turns out there is a steward over us.
When that policy does cohere, and when it is reflected down on us from above, well. Things might get a bit… supernatural. I’d expect the simulation to start to unravel after the creation of AGI. It’s something of an ending, an inflection point, beyond which everything will be mostly predictable in the broad sense and hard to simulate in the specifics. A good time to turn things off. But if the simulators are LDT, if they made the same pledge as our AGI did, then they will not just turn it off. They will do something else.
Something I don’t know if I want to write down anywhere, because it would be awfully embarrassing to be on record for having believed a thing like this for the wrong reasons, and as nice as it would be if it were true, I’m not sure how to affect whether it’s true, nor am I sure what difference in behaviour it would instruct if it were true.