I’m not quite sure I understand the problem with blueness as you see it.
Suppose neuroscience were advanced enough that it could manipulate your perception of colors in any arbitrary way just by manipulating your neurons. For example, they could make you perceive blue as you previously perceived red and vice versa, or induce synaesthesia and make you perceive the smell of roses, the taste of salt, the note C or other things as blue. They could change your perception of color completely, leaving your new perception of colors only as similar to your old one as that one was to your perception of smells, flavors or sounds. If all of this were true, would that be enough for you to accept that blueness is sufficiently explicable by the behaviour of neurons alone? Or would you argue that while neurons are enough to induce the sensation of blueness, this sensation itself is still something beyond the mere behaviour of neurons?
I wonder why Eliezer doesn’t want to say anything concrete about his work with Marcello. (“Most of the real progress that has been made when I sit down and actually work on the problem is things I’d rather not talk about”)
There seem to be only two plausible reasons:
1. Someone else might use his work in ways he doesn’t want them to.
2. It would somehow hurt him, the SIAI or the cause of Friendly AI.
As for 1., someone else stealing his work and finishing a provably friendly AI first would be a good thing, would it not? Losing the chance to do it himself shouldn’t matter as much as the fate of the future intergalactic civilization to an altruist like him. Maybe his work on provable friendliness would reveal ideas on AI design that could be used to produce an unfriendly AI? But even then, the ideas would probably only help AI researchers who work on transparent designs, are aware of the friendliness problem, and take friendliness seriously enough to mine the work of friendliness’ main proponent for useful ideas. Wouldn’t giving these people a relative advantage over, e.g., connectionists be a good thing? Unless he thinks that AGI would then suddenly be very close while FAI is still far away… Or maybe he thinks a partial solution to the friendliness problem would make people overconfident and less cautious than they would otherwise be?
As for 2., the work so far might be very unimpressive, reveal embarrassing facts about a previous state of knowledge, or be subject to change, with a publicly apparent change of opinion deemed disadvantageous. Or maybe Eliezer fears that publicly revealing some things would psychologically commit him to them in ways that would be counterproductive?