I am a longtime LessWrong and SSC reader who finally got around to starting a blog. I would love to hear feedback from you! https://harsimony.wordpress.com/
Standardization/interoperability seems promising, but I want to suggest a stranger option: subsidies!
In general, monopolies maximize profit by setting an inefficiently high price, meaning that they under-supply the good. Essentially, monopolies don’t make enough money.
A potential solution is to subsidize the sale of monopolized goods so the monopolist increases supply to the efficient level.
Social media monopolies charge too high a “price” by running too many ads, taking too much data, etc. Because of the network effect, it would be socially beneficial to have more users, but the company drives them away with its high “prices”. The socially efficient network size could be achieved by paying the social media company per active user!
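To make the subsidy idea concrete, here is a minimal sketch under textbook assumptions that are mine, not part of the argument above: linear inverse demand P = a - b*Q, constant marginal cost c, and a per-unit subsidy paid to the monopolist. The function names and the example numbers are hypothetical.

```python
def monopoly_quantity(a, b, c, subsidy=0.0):
    """Profit-maximizing quantity for inverse demand P = a - b*Q.

    The monopolist sets marginal revenue plus the per-unit subsidy equal
    to marginal cost: a - 2*b*Q + subsidy = c.
    """
    return max((a - c + subsidy) / (2 * b), 0.0)


def efficient_quantity(a, b, c):
    """Quantity at which price equals marginal cost: a - b*Q = c."""
    return max((a - c) / b, 0.0)


def efficient_subsidy(a, b, c):
    """Per-unit subsidy that makes the monopolist choose the efficient quantity.

    Solving (a - c + s) / (2*b) = (a - c) / b gives s = a - c
    (this closed form is specific to the linear-demand assumption).
    """
    return max(a - c, 0.0)


# Illustrative numbers only (assumed, not from any real market).
a, b, c = 10.0, 1.0, 2.0
q_monopoly = monopoly_quantity(a, b, c)
q_efficient = efficient_quantity(a, b, c)
s = efficient_subsidy(a, b, c)
q_subsidized = monopoly_quantity(a, b, c, subsidy=s)

print(f"unsubsidized monopoly quantity: {q_monopoly}")
print(f"efficient quantity:             {q_efficient}")
print(f"per-unit subsidy:               {s}")
print(f"subsidized monopoly quantity:   {q_subsidized}")
```

In this toy setup the unsubsidized monopolist supplies half the efficient quantity, and the subsidy closes the gap. A real subsidy would have to be estimated without knowing the demand curve, which is where the practical difficulties come in.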
I was planning to write this up in more detail at some point (see also). There are of course practical difficulties with identifying monopolies, determining the correct subsidy in an adversarial environment, Sybil attacks, etc.
Current AI Models Seem Sufficient for Low-Risk, Beneficial AI
Nice post, thanks!
Is there a formulation of UDASSA that uses the self-indication assumption instead? What would be the implications of this?
Frowning upon groups which create new, large scale models will do little if one does not address the wider economic pressures that cause those models to be created.
I agree that “frowning” can’t counteract economic pressures entirely, but it can certainly slow things down! If 10% of researchers refused to work on extremely large LMs, companies would have fewer workers to build them. These companies may find a workaround, but it’s still an improvement on the situation where all researchers are unscrupulous.
The part I’m uncertain about is: what percent of researchers need to refuse this kind of work to extend timelines by (say) 5 years? If it requires literally 100% of researchers to coordinate, then it’s probably not practical; if we only have to convince the single most productive AI researcher, then it looks very doable. I think the number could be smallish, maybe 20% of researchers at major AI companies, but that’s a wild guess.
That being said, work on changing the economic pressures is very important. I’m particularly interested in open-source projects that make training and deploying small models more profitable than using massive models.
On outside incentives and culture: I’m more optimistic that a tight-knit coalition can resist external pressures (at least for a short time). This is the essence of a coordination problem; it’s not easy, but Ostrom and others have identified examples of communities that coordinate in the face of internal and external pressures.
Massive Scaling Should be Frowned Upon
I like this intuition and it would be interesting to formalize the optimal charitable portfolio in a more general sense.
I talked about a toy model of hits-based giving which has a similar property (the funder spends on projects proportional to their expected value rather than on the best projects):
Updated version here: https://harsimony.wordpress.com/2022/03/24/a-model-of-hits-based-giving/
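As a rough illustration of the allocation rule described above (spending in proportion to expected value rather than funding only the best project), here is a minimal sketch. It is not the model from the linked post; the function names and project values are hypothetical.

```python
def proportional_allocation(budget, expected_values):
    """Split the budget across projects in proportion to expected value."""
    total = sum(expected_values.values())
    if total <= 0:
        raise ValueError("expected values must sum to a positive number")
    return {name: budget * ev / total for name, ev in expected_values.items()}


def winner_take_all(budget, expected_values):
    """Give the whole budget to the single highest-expected-value project."""
    best = max(expected_values, key=expected_values.get)
    return {name: (budget if name == best else 0.0) for name in expected_values}


# Made-up expected values, for illustration only.
projects = {"A": 6.0, "B": 3.0, "C": 1.0}
proportional = proportional_allocation(100.0, projects)
concentrated = winner_take_all(100.0, projects)

print(proportional)
print(concentrated)
```

The proportional rule keeps some money flowing to lower-ranked projects, which is the property the comparison above is pointing at.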
I think the section “Perhaps we don’t want AGI” is the best argument against these extrapolations holding in the near future. I think data limitations, the practical benefits of small models, and profit-following will lead to small/specialized models in the near future.
Yeah I think a lot of it will have to be resolved at a more “local” level.
For example, for people in a star system, it might make more sense to define all land with respect to individual planets (“Bob owns 1 acre on Mars’ north pole”, “Alice owns all of L4”, etc.) and forbid people from owning stationary pieces of space. I don’t have the details of this fleshed out, but it seems like within a star system, it’s possible to come up with a sensible set of rules and have the edge cases hashed out by local courts.
For the specific problem of predicting planetary orbits, if we can predict 1000 years in the future, it seems like the time-path of land ownership could be updated automatically every 100 years or so, so I don’t expect there to be huge surprises there.
For taxation across star systems, I’m having trouble thinking of a case where there might be ownership ambiguity given how far apart they are. For example, even when the Milky Way and Andromeda galaxies collide, it’s unlikely that any stars will collide. Once again, this seems like something that can be solved by local agreements where owners of conflicting claims renegotiate them as needed.
I feel like something important got lost here. The colonists are paying a land value tax in exchange for (protected) possession of the planet. Forfeiting the planet to avoid taxes makes no sense in this context. If they really don’t want to pay taxes and are fine with leaving, they could just leave and stop being taxed; no need to attack anyone.
The “it’s impossible to tax someone who can do more damage than their value” argument proves too much; it suggests that taxation is impossible in general. It’s always been the case that individuals can do more damage than could be recouped in taxation, and yet people still pay taxes.
Where are the individuals successfully avoiding taxation by threatening acts of terrorism? How are states able to collect taxes today? Why doesn’t the U.S. bend to the will of weaker states since it has more to lose? It’s because these kinds of threats don’t really work. If the U.S. caved to one unruly individual then nobody would pay taxes, so the U.S. has to punish the individual enough to deter future threats.
… this would provide for enough time for a small low value colony, on a marginally habitable planet, to evacuate nearly all their wealth.
But the planet is precisely what’s being taxed! Why stage a tax rebellion only to forfeit your taxable assets?
If the lands are marginal, they would be taxed very little, or not at all.
Even if they left the planet, couldn’t the counter strike follow them? It doesn’t matter if you can do more economic damage if you also go extinct. It’s like refusing to pay a $100 fine by doing $1000 of damage and then ending up in prison. The taxing authority can precommit to massive retaliation in order to deter such behavior. The colony cannot symmetrically threaten the tax authority with extinction because of the size difference.
All of this ignores the practical issues with these weapons, the fact that Earth’s value is minuscule compared to the sun, the costs of forfeiting property rights, the relocation costs, and the fact that citizens of marginal lands would receive net payments from the citizen’s dividend.
There are two possibilities here:
1. Nations have the technology to destroy another civilization
2. Nations don’t have the technology to destroy another civilization
In either case, taxes are still possible!
In case 1, any nation that attempts to destroy another nation will also be destroyed since their victim has the same technology. Seems better to pay the tax.
In case 2, the nation doesn’t have a way to threaten the authorities, so it pays the tax in exchange for property rights and protection.
Thus threatening the destruction of value several orders of magnitude greater than the value to be collected is a viable deterrent.
Agreed, but to destroy so much value, one would have to destroy at least as much land as they currently control. Difficulties of tax administration mean that only the largest owners will be taxed, and these will likely possess entire solar systems. So the tax dodger would need to destroy a star. That doesn’t seem easy.
It’s impossible, without some as yet uninvented sensing technology, to reliably surveil even the few hundred closest star systems.
I’m more optimistic about sensing tech. Gravitational lensing, superlenses, and simple scaling can provide dramatic improvements in resolution.
It’s probably unnecessary to surveil many star systems. Allied neighbors on Alpha Centauri can warn Earth about an incoming projectile as it passes by (providing years of advance notice), so nations might only need to surveil locally and exchange information.
… once it’s past the Oort Cloud and relatively easy to detect again, there will be almost no time left at 0.5 c
It would take 2-3 years for a 0.5c projectile to reach Earth from the outer edge of the Oort cloud, which seems like enough time to adapt. At the very least, it’s enough time to launch a counterstrike.
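The travel-time figure can be checked with a quick calculation. This is a sketch under assumptions of mine: the outer edge of the Oort cloud is poorly constrained, so the two distances below (about one light-year and 100,000 AU) are illustrative choices, and the time is measured in Earth's frame as distance divided by speed.

```python
AU_PER_LIGHT_YEAR = 63_241.077  # astronomical units in one light-year


def travel_time_years(distance_au, speed_fraction_of_c):
    """Earth-frame travel time in years for a projectile at constant speed."""
    if speed_fraction_of_c <= 0:
        raise ValueError("speed must be a positive fraction of c")
    distance_light_years = distance_au / AU_PER_LIGHT_YEAR
    return distance_light_years / speed_fraction_of_c


# Assumed outer-edge distances; estimates in the literature vary widely.
for distance_au in (AU_PER_LIGHT_YEAR, 100_000):
    years = travel_time_years(distance_au, 0.5)
    print(f"{distance_au:>10,.0f} AU at 0.5c: {years:.2f} years")
```

With these inputs the result lands between roughly 2 and 3.2 years, in line with the estimate above; a smaller assumed Oort cloud would shorten the warning time proportionally.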
Fast projectiles may be impractical given that a single collision with the interstellar medium would destroy them. Perhaps thickening the Oort cloud could be an effective defense system.
A second-strike is only a credible counter if the opponent has roughly equal amounts to lose.
Agents pre-commit to a second strike as a deterrent, regardless of how wealthy the aggressor is. If the rebelling nation has the technology to destroy another nation and uses it, they’re virtually guaranteed to be destroyed by the same technology.
Given the certainty of destruction, why not just pay the (intentionally low, redistributive, efficiency increasing, public-good funding) taxes?
I imagine that these policies will be enforced by a large coalition of members interested in maintaining strong property rights (more on that later).
It’s not clear that space war will be dominated by kinetic energy weapons or MAD:
These weapons seem most useful when entire civilizations are living on a single planet, but it’s possible that people will live in disconnected space habitats. These would be much harder to wipe out.
Any weapon will take a long time to move over interstellar distances. A rebelling civilization would have to wait thousands of years for the weapon to reach its target. It seems like their opponent could detect the weapon and respond in this time.
Even if effective civilization-destroying weapons are developed, mutual defense treaties and AI could be used to launch a second strike, making civilization-scale attacks unlikely.
In general, any weapon a single civilization might use to avoid taxation can also be used by a group of civilizations to enforce taxation.
On the other hand, it does seem like protecting territory/resources from theft will be an issue. This is where property rights come in. Governments, corporations, and AIs will want to have their property protected over very long timescales and won’t want to spend most of their resources on security.
Over long timescales, a central authority can help them defend their territory (over short timescales, these groups will still have to protect their own property since it will take a while for allies to reach them). But in order to receive this protection, these groups must register what they own and pay taxes towards mutual defense. I think a land value tax is the “right” way to raise taxes in this case.
This approach makes more sense for smaller systems where defense is easier but it may also be useful across much larger distances if agents are very patient.
(The “central authority” also doesn’t have to be an actual central government. States could pay neighbors to come to their aid using a portion of the collected taxes, kind of like “defense insurance”)
Georgism in Space
I haven’t given this a thorough read yet, but I think this has some similarities to retroactive public goods funding:
The impact markets team is working on implementing these:
Going by figure 5, I think the way to format climate contingent finance like an impact certificate would be:
‘A’ announces that they will award $X in prizes to different projects based on how much climate change damage each project averts
‘B’ funds different projects (based on how much damage they think each project will avert in the future) in exchange for a percentage of the prize money
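The two steps above can be sketched as a small payout calculation. Everything specific here is an assumption for illustration: the prize is split in proportion to measured damage averted, and ‘B’ holds a fixed fractional claim on each project's prize. The function names, project names, and numbers are hypothetical.

```python
def split_prize(prize_pool, damage_averted):
    """'A' divides the prize pool in proportion to damage each project averted."""
    total = sum(damage_averted.values())
    if total <= 0:
        return {project: 0.0 for project in damage_averted}
    return {
        project: prize_pool * averted / total
        for project, averted in damage_averted.items()
    }


def investor_payout(prizes, investor_share):
    """'B' collects the agreed fraction of each funded project's prize."""
    return sum(prizes[project] * share for project, share in investor_share.items())


# Made-up outcome: damage averted is measured after the fact by 'A'.
damage_averted = {"reforestation": 30.0, "seawall": 10.0}
prizes = split_prize(1000.0, damage_averted)

# 'B' funded both projects up front in exchange for these prize shares.
shares_held_by_b = {"reforestation": 0.2, "seawall": 0.5}
payout_to_b = investor_payout(prizes, shares_held_by_b)

print(prizes)
print(payout_to_b)
```

Under these assumptions ‘B’ profits only if its up-front funding was less than its payout, so it has a reason to forecast averted damage carefully.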
… robust broadly credible values for this would be incredibly valuable, and I would happily accept them over billions of dollars for risk reduction …
This is surprising to me! If I understand correctly, you would prefer to know for certain that P(doom) was (say) 10% than spend billions on reducing x-risks? (perhaps this comes down to a difference in our definitions of P(doom))
Like Dagon pointed out, it seems more useful to know how much you can change P(doom). For example, if we treat AI risk as a single hard step, going from 10% → 1% or 99% → 90% both increase the expected value of the future by 10X; it doesn’t matter much whether it started at 10% or 99%.
For prioritization within AI safety, are there projects in AI safety that you would stop funding as P(doom) goes from 1% to 10% to 99%? I personally would want to fund all the projects I could, regardless of P(doom) (with resources roughly proportional to how promising those projects are).
For prioritization across different risks, I think P(doom) is less important because I think AI is the only risk with greater than 1% chance of existential catastrophe. Maybe you have higher estimates for other risks and this is the crux?
In terms of institutional decision making, it seems like P(doom) > 1% is sufficient to determine the signs of different interventions. In a perfect world, a 1% chance of extinction would make researchers, companies, and governments very cautious; there would be no need to narrow down the range further.
Like Holden and Nathan point out, P(doom) does serve a promotional role by convincing people to focus more on AI risk, but getting more precise estimates of P(doom) isn’t necessarily the best way to convince people.
Yes, “precision beyond order-of-magnitude” is probably a better way to say what I was trying to.
I would go further and say that establishing P(doom) > 1% is sufficient to make AI the most important x-risk, because (like you point out), I don’t think there are other x-risks that have over a 1% chance of causing extinction (or permanent collapse). I don’t have this argument written up, but my reasoning mostly comes from the pieces I linked in addition to John Halstead’s research on the risks from climate change.
You need to multiply by the amount of change you think you can have.
Agreed. I don’t know of any work that addresses this question directly by trying to estimate how much different projects can reduce P(doom), but I would be very interested to read something like that. I also think P(doom) sort of contains this information, but people seem to use different definitions.
Setting aside how important timelines are for strategy, the fact that P(doom) combines several questions together is a good point. Another way to decompose P(doom) is:
How likely are we to survive if we do nothing about the risk? Or perhaps: How likely are we to survive if we do alignment research at the current pace?
How much can we really reduce the risk with sustained effort? How immutable is the overall risk?
People probably mean different things by P(doom), though, and it seems worthwhile to disentangle them.
Talking about our reasoning for our personal estimates of p(doom) is useful if and only if it helps sway some potential safety researchers into working on safety …
Good point, P(doom) also serves a promotional role, in that it illustrates the size of the problem to others and potentially gets more people to work on alignment.
Precise P(doom) isn’t very important for prioritization or strategy
You may also want to check out John Wentworth’s natural abstraction hypothesis work:
Two arguments I would add:
Conflict has direct costs/risks, a fight between AI and humanity would make both materially worse off
Because of comparative advantage, cooperation between AI and humanity can produce gains for both groups. Cooperation can be a Pareto improvement.
Alignment applies to everyone, and we should be willing to make a symmetric commitment to a superintelligence. We should grant them rights, commit to their preservation, respect their preferences, be generally cooperative and avoid using threats, among other things.
It may make sense to commit to a counterfactual contract that we expect an AI to agree to (conditional on being created) and then intentionally (carefully) create the AI.