I feel like in spirit the proposal is very close to the US wanting to be world policeman while trying to commit to not infringing on other nations’ sovereignty unless they pose some large risk to other nations / infringe on some basic human rights.
In practice it might be very different because:
It might be way easier to get credible commitments
You might be able to get an impartial judge that is hard to sway—which allows for more fuzzy rules
You might be able to get an impartial implementation that won’t exploit its power (e.g. it won’t use the surveillance mechanism for ends other than the one it was built for)
Is that right?
I am not sure how much the commitment mechanisms buy you. I would guess that current human technology is already sufficient for very strong commitments, and that the reason they don’t happen is that people don’t know what they want to commit to. What concrete things do you imagine would be natural to commit to, and why can’t we commit to them without an ASI (and instead just have some “if someone breaks X, others nuke them” arrangement)?
The impartial judge also looks rough. Powerful entities have rarely deferred to impartial judges, even though it is in principle possible to find humans without too much skin in the game. But given sufficiently good alignment, maybe you can get much better impartial judges than we have historically had? I think it’s not obvious that it is even possible to do better than impartial human judges, even with perfect and transparent alignment technology.
The impartial implementation is maybe a big deal. Though again, I would guess that human tech allows for things like that and that this didn’t happen for other reasons.