Retired software engineer with a love of knowledge and disinterest in dead philosophers.
NickH
I think not using it for grammar checking is going too far. If you follow that reasoning logically then human editors, proofreaders and reviewers should also be shunned.
I love it but, of course, no good leader, in the moment, thinks that they are over-concentrating power. Each believes that they are only doing as much as is necessary for the greater good, so your analysis can never hope to achieve more than to have every would-be conqueror question themselves, which all the better ones do anyway.
Morale is a group thing. Humans are a social animal. Low morale implies that there is either something wrong with the group or something wrong with me. Either would be an existential threat in the ancestral environment. If there’s something wrong with me then low morale should incentivise me to change, just as pain incentivises me to stop doing whatever I am doing that causes me pain. If there’s something wrong with the group then, in the ancestral environment, I’m in deep s**t and there’s not much I can do about it apart from leave the group and set out on my own, which is almost certainly fatal. This may well be the cause of people sinking into low-morale apathy, the equivalent of a sick animal wandering off to die.
There’s a difference between being unable to express your reasons and expressing incoherent reasons. The former is legitimate debate, whereas the latter is not. You can’t, in good conscience, fall back on “that’s what the gods want”.
The physical simulation will not actively resist you changing the parameters/properties. AGI will.
So if I can get a protected group, G, to support some view, X, that I support, then any attack on X supporters will magically become a hate crime against G? Surely this is problematic.
I think most people have positive views about some/most humans (and consequently about alignment) because they are implicitly factoring in their mortality. Would you feel safe picking a human that you thought was good and giving them a pill that gave them superintelligence? Maybe. Would you feel safe giving that same person a pill that made them both superintelligent AND immortal? I know I wouldn’t trust me with that. An AGI/SGI would be potentially immortal and would know it. For that reason alone I would never trust it no matter how well I thought it seemed aligned in the short term (and compared to an immortal, any human timescale is short term).
I would respond to that question with: “How are you coping with the certainty that you, and everyone you ever knew or cared about or who cares about you, will be dead in a hundred years or so?” (And before many people’s estimates of AI doom.) The simple answer is that we did not evolve to be able to truly feel that kind of thing, and for good reason.
It’s good to see that this does at least mention the problem with influencing the world over long time periods, but it still misses the key human blind spot: humans are not really capable of thinking of humanity over long time periods. Humans think 100 years or 1,000 years is an eternity when, for a potentially immortal entity, 1,000,000 years is almost indistinguishable from 100. A good thought experiment is to imagine that aliens come down and give us a single pill that will make a single human superintelligent AND immortal. Suppose we set up a special training facility to train a group of children, from birth, to be candidates for the pill. Would you ever feel certain that giving one of those children the pill was safe? I wouldn’t. I certainly think that you’d be foolish to give it to me. Why is there no discussion about restricting the time frame that AGIs are allowed to consider?
I saw an apparently relevant video about AI-generated music that claimed to be able to detect it by splitting it into its constituent tracks. It turns out that the tools for doing this (which use AI) work well with human music that was actually created by mixing individual tracks, but badly with AI-generated music (when you listen to the individual tracks they are obviously “wrong”). This is clearly because the AI does not (currently) create music by building it up from individual tracks, although it could be made to do this. Instead it somehow synthesises the whole thing at once. It appears that AI images are similar in that they are not built up from individual components, like fingers. This suggests that a way to better identify AI images is to have software identify the location of the skeletal joints in an image and check whether they can be mapped onto a model of an actual skeleton without distortion.
In a general discussion of ethics your replies are very sensible. When discussing AI safety and, in particular, P(doom), they are not. Your analogy does not work. It is effectively saying that trying to prevent AI from killing us all by blocking its access to the internet with a password is better than not using a password. But an AI that is a threat to us will not be stopped by a password, and neither will it be stopped by an imperfect heuristic. If we don’t have 100% certainty, we should not build it.
You are arguing that it is tractable to have predictable positive long term effects using something that is known to be imperfect (heuristic ethics). For that to make sense you would have to justify why small imperfections cannot possibly grow into large problems. It’s like saying that because you believe that you only have a small flaw in your computer security nobody could ever break in and steal all of your data. This wouldn’t be true even if you knew what the flaw was and, with heuristic ethics, you don’t even know that.
This is totally misguided. If heuristics worked 100% of the time they wouldn’t be rules of thumb, they’d be rules of nature. We only have to be wrong once for AI to kill us.
I invest in US assets myself, but not because of any faith in the US; in fact, the opposite. Firstly, it’s like a fund manager investing in a known bubble: you know it’s going to burst but, if it doesn’t burst in the next year or so, you cannot afford the short/medium-term loss relative to your competitors. Secondly, if the US crashes, it takes down the rest of the world with it and is probably the first to recover, so you might as well stick with it. None of this translates to faith in US AI governance. Your mention of positive-sum deals is particularly strange since, if the world has learned one thing about Trump, it is that he sees the world almost exclusively in zero-sum terms.
Stating the obvious here but Trump has ensured that the USG cannot credibly guarantee anything at all and hence this is a non-starter for foreign governments.
Evangelicals either hate people or don’t actually believe that their god is loving and compassionate. Proof:
1. If god DOES NOT love people who have never heard about him, or only heard about him from people who did a bad job of “explaining” him, then he is NOT loving or compassionate. In this case, however, it would be caring and compassionate for evangelicals to evangelise in order to try to get people onto god’s good side, because the consequences of being on his bad side are BAD.
2. If god DOES love people who have never heard about him, or only heard about him from people who did a bad job of “explaining” him, then evangelicals who also love people and are compassionate and caring should actively AVOID spreading the word of god, as this will necessarily deprive some people of the “get out of jail free” card (see 1).
I think it does. Certainly the way that I would do it would be to create a world map from memory, then overlay the coordinate grid, then just answer by looking it up. Your answers will be as good as your map is. I believe that the LLMs most likely work from Wikipedia articles. There are a lot of location pages with coordinates in Wikipedia.
Humans would draw a map of the world from memory, overlay the grid and look up the reference. I doubt that the LLMs do this. It would be interesting to see whether they can actually relate the images to the coordinates. I suspect not, i.e. I expect that they could draw a good map, with gridlines, from training data but would be unable to relate the visual to the question. I expect that they are working from coordinates in Wikipedia articles and the CIA website. Another suggestion would be to ask the LLM to draw a map of the world with non-standard grid lines, e.g. every 7 degrees.
This is interesting but, in some ways, it should have been obvious. Everything we say says something about who we are, and what we say is influenced by what we know in ways that we are not conscious of. Magicians use subconscious forcing all the time, along the lines of “Think of a number between 1 and 10”.
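To score the non-standard grid test, you would need a reference answer for which grid cell a place falls in. A minimal sketch of that lookup follows, assuming grid lines at whole multiples of the step measured from the equator and the prime meridian. The function name `grid_cell` and the example coordinates are illustrative only and are not taken from any existing tool.

```python
import math


def grid_cell(lat, lon, step=7.0):
    """Return the south-west corner of the grid cell containing (lat, lon).

    Assumption: grid lines sit at whole multiples of `step` degrees,
    counted from the equator and the prime meridian.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude or longitude out of range")
    south = math.floor(lat / step) * step
    west = math.floor(lon / step) * step
    return south, west


# Approximate coordinates for London, used only as an example input.
print(grid_cell(51.5, -0.13))
```

An LLM's answer for a given place could then be compared with the cell returned here, using a step such as 7 that is unlikely to appear on maps in its training data.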
I totally agree that everything founders on the messed-up idea of free will. My “solution” is to abandon alignment etc. altogether and instead focus on limiting the possible damage: I don’t see how, with a messed-up ontology, we can ever guarantee that, given enough time, an ASI won’t tile the universe with hedonium, so why not focus on restricting the time? I’m pretty sure the ASI can’t do it in only a few years so, if we can set it a time limit of, say, 5 years and have it not value anything beyond that window, then we “could” be safe. This fits in with the current state of human knowledge, whereby we know how to make a lot of things better but are constrained almost entirely by the short-term pain that would be inflicted on a portion of the population by the rate of change, which makes these fixes politically impossible. It’s the rate of change that needs to be limited. If my method worked, the ASI would want to tile the universe with hedonium but, realising that it couldn’t do more than set up a couple of hedonium factories in the given timeframe (and it doesn’t care about the future, so they don’t add value as “a step in the right direction”), it would settle for a smaller and less drastic change.
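The time-limit idea can be illustrated with a toy calculation. This is only a sketch under invented assumptions: the plan names, payoff numbers and the `horizon_value` helper are hypothetical, and it says nothing about whether a real ASI could actually be built to value things this way.

```python
def horizon_value(payoffs, horizon_years):
    """Sum the payoffs that arrive within the horizon; later ones count for nothing.

    `payoffs` is a list of (year, value) pairs. All numbers are made up.
    """
    return sum(value for year, value in payoffs if year <= horizon_years)


# Hypothetical plans: (year the payoff arrives, size of the payoff).
plans = {
    "hedonium factories": [(4, 1.0), (50, 1_000_000.0)],
    "modest improvement": [(1, 3.0), (3, 4.0)],
}


def best_plan(horizon_years):
    """Pick the plan with the highest value inside the horizon."""
    return max(plans, key=lambda name: horizon_value(plans[name], horizon_years))


print(best_plan(5))    # 5-year window
print(best_plan(100))  # effectively unlimited window
```

With the 5-year window the huge payoff at year 50 is ignored, so the factories are worth almost nothing and the modest plan wins. With a 100-year window the same agent prefers the factories.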