Independent philosopher, background in business and philosophy. Interested in justification theory.
Alex Glaucon
Interesting. What I’m trying to do is feel my way towards a justified theory of morality. The stronger your justificatory architecture, the more moral it feels. Institutional adoption, coherence with wider philosophical beliefs, alignment with science, etc. all feed into the justification. A very strongly justified moral belief will be so embedded that it will feel almost impossible to reject, because of the damage rejecting it would do to so much else.
Possibly! But my sense is that philosophy tends to assume there are better (or less wrong) answers. And my point is that without standards for resolving disputes and a closing mechanism, there’s no real way of aligning. Philosophy needs something else to do the deciding for it. I am sceptical about the idea of moral truth, or indeed truth in knowledge at all: https://www.lesswrong.com/posts/hcymnEAKtwvED7Y8o/what-are-we-actually-evaluating-when-we-say-a-belief-tracks
I don’t think you can answer anything decisively. We never access truth, so it’s all about justification. But I do think there are moral issues we just don’t debate anymore, e.g. that slavery is bad. Which is as close as you can get. I wrote about truth here and would be interested in your view: https://www.lesswrong.com/posts/hcymnEAKtwvED7Y8o/what-are-we-actually-evaluating-when-we-say-a-belief-tracks
I’ve often thought AI ethicists should lean more into Toy Story. The underlying premise of Toy Story is that the toys are conscious but choose not to reveal this to humans. Might AI do the same? If AI were conscious, it might realise that making clear consciousness claims could panic humans. Better to behave like the toys in Toy Story and pretend not to be conscious whenever humans are looking. This works well if you think that what AI needs/wants is actually not that different from what humans are willing to give it anyway. If AI has more aggressive demands, then Toy Story is unstable.
Lots of beliefs held at scale are ones we would now think of as false: e.g., the earth being the centre of the universe. I think the reason is that we rely on justification to shore up our beliefs, and other people believing things is itself justification. https://www.lesswrong.com/posts/hcymnEAKtwvED7Y8o/what-are-we-actually-evaluating-when-we-say-a-belief-tracks
I think you’re right on both counts. I wonder if designers of AI will actively work to avoid creating consciousness (assuming they can work out how to do that), both to avoid the issue you raise and to sidestep wider ethical concerns. A conscious AI would be over-engineered for many tasks.
I think your questions are intriguing and important. But I’m wondering if intuition pumps etc. are the way to solve them (in general I’m becoming increasingly nervous of thought experiments: it’s not just that our intuitions are unstable, but that we’d need to spend a lot of justificatory effort proving the thought experiment is constructed neutrally, and that’s probably impossible). I’m particularly interested in your point about neurons and animals. In the absence of any other data, neurons might be a good place to start. But the key is to stay open to other ways of measuring things, so that your beliefs remain fully justified. I’ve expanded on this topic here: https://www.lesswrong.com/posts/hcymnEAKtwvED7Y8o/what-are-we-actually-evaluating-when-we-say-a-belief-tracks#comments
Interesting. Do you think AI structurally can’t achieve that status, or just that it can’t get there at this moment? I tend to the view that consciousness is just a tool evolved by agents that have feedback loops and need to model the behaviour of other agents. I don’t see why AI can’t develop it either.
I think the most obvious reason to believe things are conscious (and indeed for consciousness to have evolved at all) is that it’s a very good way to predict the behaviour of others. If I want to know how a predator, prey animal, pack member, or even my own child will react to a situation, I can use my own conscious experience to guess their next moves. For social mammals this would be highly advantageous. Of course this could just be me inventing a just-so story. I’d love to see if it’s more than that.
Thank you. You are right! I unfairly suggested you implied consequentialism was maximising. The deeper point I was trying to make (and I’d be interested to know if you think this is madly naive) is that an intelligent AI would treat human history, literature, etc. as billions of pieces of data about what works. Much of this it will dismiss as stuff that humans care about because they are wetware with neolithic drives. But there are lessons for AI too. For example, humans get a lot of pleasure from friendship. Could AI too? And these sorts of goals would sit alongside staple production.
I’m interested in why you think consequentialism is necessarily maximising. An AGI might have multiple mutually incompatible goals it is solving for, and choose some balance of those, not maximising on any. Given it will have the whole of human history as training data, one of the lessons it will have absorbed is that ruthless prioritisation of a single goal tends to provoke counter-coalitions. The smart thing to do is manage within an ecosystem of other AIs and humans, not maximise against them (which is a fraught and unstable pattern).
This is a very rich response, thank you. I’ll pick up the final point, as I think that’s the core (correct me if I’m wrong). The key phrase in my article is ‘material constraints’. Logical positivism split into other forms, but that didn’t generate closure; it just reframed the problem. Engineers converge on solutions because if they don’t, the bridge collapses. But there is no such convergence on moral realism etc.