No. That’s a foolish interpretation of domain insight. We have a massive number of highly general strategies that nonetheless work better for some things than others. A domain insight is simply some kind of understanding involving the domain being put to use. Something as simple as whether to use a linked list or an array can use a minor domain insight. Whether to use a Monte Carlo search or a depth-limited search, and so on, are definitely insights. Most advances in AI to this point have in fact been based on domain insights, and only a small amount on scaling within an approach (though more so recently). Even the ‘bitter lesson’ is an attempted insight into the domain (one that is wrong due to being a severe overreaction to previous failure).
Also, most domain insights are in fact an understanding of constraints. ‘This path will never have a reward’ is both an insight and a constraint. ‘Dying doesn’t allow me to get the reward later’ is both a constraint and a domain insight. So is ‘the lists I sort will never have numbers that aren’t between 143 and 987’ (which is useful for an O(n) type of sorting). We are, in fact, trying to automate the process of getting domain insights via machine with this whole enterprise in AI, especially in whatever we have trained them for.
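To make the sorting example concrete, here is a minimal sketch of how that constraint buys linear time. It assumes the numbers are integers (the comment above only says ‘numbers’), and the function name and defaults are made up for illustration. It is a plain counting sort, which runs in O(n + k) where k is the size of the allowed range.

```python
def bounded_counting_sort(values, low=143, high=987):
    """Sort integers assumed to lie in [low, high] in O(n + k) time, k = high - low + 1."""
    counts = [0] * (high - low + 1)
    for v in values:
        # The speedup depends entirely on the domain constraint, so check it.
        if not low <= v <= high:
            raise ValueError(f"{v} violates the assumed range [{low}, {high}]")
        counts[v - low] += 1
    result = []
    for offset, count in enumerate(counts):
        # Emit each value as many times as it was seen, in increasing order.
        result.extend([low + offset] * count)
    return result


sample = [987, 143, 500, 143, 600]
sorted_sample = bounded_counting_sort(sample)
print(sorted_sample)
```

Without the constraint you are back to a general comparison sort at O(n log n), which is the whole point: the insight about the domain is what makes the cheaper method available.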
Even ‘should we scale via parameters or data’ is a domain insight. They recently found out they had gotten that wrong too (Chinchilla), because they focused too much on just scaling.
AlphaZero was given some minor domain insights (how to search and how to play the game), years later, and ended up slightly beating a much earlier approach, because they were trying to do that specifically. I specifically said that sort of thing happens. It’s just not as good as it could have been (probably).
I do agree with your rephrasing. That is exactly what I mean (though with a different emphasis).
I agree with you. The biggest leap was going to human generality level for intelligence. Humanity already is a number of superintelligences working in cooperation and conflict with each other; that’s what a culture is. See also corporations and governments. Science too. This is a subculture of science worrying that it is superintelligent enough to create a ‘God’ superintelligence.
To be slightly uncharitable, the reason to assume otherwise is fear, either their own or to play on that of others. Throughout history people have looked for reasons why civilization would be destroyed, and this is just the latest. Ancient prophesiers of doom were exactly the same as modern ones. People haven’t changed that much.
That doesn’t mean we can’t be destroyed, of course. A small but nontrivial percentage of doomsayers were right about the complete destruction of their civilization. They just happened to be right by chance most of the time.
I also agree that quantitative differences could possibly end up being very large, since we already have immense proof of that in one direction given that we have superintelligences massively larger than we are already, and computers have already made them immensely faster than they used to be.
I even agree that the key quantitative advantages would likely be in supra-polynomial arenas that would be hard to improve too quickly even for a massive superintelligence. See the exponential resources we are already pouring into chip design for continued smooth but decreasing progress, and the even higher exponential resources being poured into dumb tool AIs for noticeable but not game-changing increases. While I am extremely impressed by some of them, like Stable Diffusion (an image generation AI that has been my recent obsession), there is such a long way to go that resources will be a huge problem before we even get to human level, much less superhuman.
Honestly Illusionism is just really hard to take seriously. Whatever consciousness is, I have better evidence it exists than anything else since it is the only thing I actually experience directly. I should pretend it isn’t real...why exactly? Am I talking to slightly defective P-zombies?
‘If the computer emitted it for the same reasons...’ is a clear example of begging the question. If a computer claimed to be conscious because it was conscious, then it logically has to be conscious, but that is the possible dispute in the first place. If you claim consciousness isn’t real, then obviously computers can’t be conscious. Note that you aren’t talking about real illusionism if you don’t think we are p-zombies. Only the first of the two possibilities you mentioned is Illusionism, if I recall correctly.
You seem like one of the many people trying to systematize things they don’t really understand. It’s an understandable impulse, but it leads to an illusion of understanding (which is the only thing that leads to a systemization like Illusionism; it seems like frustrated people claiming there is nothing to see here). If you want a systemization of consciousness that doesn’t claim things it doesn’t know, then assume consciousness is the self-reflective and experiential part of the mind that controls and directs large parts of the overall mind. There is no need to state what causes it.
If a machine fails to be self-reflective or experiential then it clearly isn’t conscious. It seems pretty clear that modern AI is neither. It probably fails the test of even being a mind in any way, but that’s debatable.
Is it possible for a machine to be conscious? Who knows. I’m not going to bet against it, but current techniques seem incredibly unlikely to do it.
As individuals, humans routinely and successfully do things much too hard for them to fully understand. This is partly due to innately hardcoded stuff (mostly for things we think are simple, like vision and controlling our bodies’ automatic systems), and somewhat due to innate personality, but mostly due to the training process our culture puts us through (for everything else).
For their part, cultures can take the inputs of millions to hundreds of millions of people (or even more when stealing from other cultures), and distill them into both insights and practices that absolutely no one would have ever come up with on their own. The cultures themselves are, in fact, massively superintelligent compared to us, and people are effectively putting their faith either in AI being no big deal because it is too limited, or in the fact that we can literally ask a superintelligence for help in designing things much stupider than culture is to not turn on us too much.
AI is currently a small sub-culture within the greater cultures, and struggling a bit with the task, but as AI grows more impressive, much more of culture will be about how to align and improve AI for our purposes. If the full might of even a midsized culture ever sees this as important enough, alignment will probably become quite rapid, not because it is an easy question, but because cultures are terrifyingly capable.
At a guess, alignment researchers have seen countless impossible tasks fall to the midsized ‘Science’ culture of which they are a part, and many think this is much the same. ‘Human achievable’ means anything a human-based culture could ever do. This is just about anything that doesn’t violate the substrates it is based on too much (and you could even see AI as a way around that). Can human cultures tame a new substrate? It seems quite likely.
I’m hardly missing the point. It isn’t impressive to have it be exactly 75%, not more or less, so the fact that it can’t always be that is irrelevant. His point isn’t that that particular exact number matters, it’s that the number eventually becomes very small. But since the number being very small compared to what it should be does not prevent it from being made smaller by the same ratio, his point is meaningless. It isn’t impressive to fulfill an obvious bias toward updating in a certain direction.
It doesn’t take many people to cause these effects. If we make them ‘the way’, following them doesn’t take an extremist, just someone trying to make the world better, or some maximizer. Both these types are plenty common, and don’t have to make it fanatical at all. The maximizer could just be a small band of petty bureaucrats who happen to have power over the area in question. Each one of them just does their role, with a knowledge that it is to prevent overall suffering. These aren’t even the kind of bureaucrats we usually dislike! They are also monsters, because the system has terrible (and knowable) side effects.
I don’t have much time, so:
While footnote 17 can be read as applying, it isn’t very specific.
For all that you are doing math, this isn’t mathematics, so base needs to be specified.
I am convinced that people really do give occasional others a negative weight.
And here are some notes I wrote while finishing the piece (that I would have edited and tightened up a lot) (it’s a bit all over the place):
This model obviously assumes utilitarianism.
Honestly, their math does seem reasonable to account for people caring about other people (as long as they care about themselves at all on the same scale, which could even be negative, just not exactly 0).
They do add an extraneous claim that the numbers for the weight of a person can’t be negative (because they don’t understand actual hate? At least officially). If someone hates themselves, then you can’t do the numbers under these constraints, nor if they hate anyone else. But this constraint seems completely unnecessary, since you can sum negatives with positives easily enough.
I can’t see the point of using an adjacency matrix (of a weighted directed graph).
Being completely altruistic doesn’t seem to mean that everyone gets a 1, but that everyone gets at least that much.
I don’t see a reason to privilege mental similarity to myself, since there are people unlike me that should be valued more highly. (Reaction to footnote 13) Why should I care about similarities to pCEV when valuing people?
Thus, they care less about taking richer people’s money. Why is the first example explaining why someone could support taking money from people you value less to give to other people, while not supporting doing so with your own money? It’s obviously true under utilitarianism (which I don’t subscribe to), but it also obscures things by framing ‘caring’ as ‘taking things from others by force’.
In ‘Pareto improvements and total welfare’ should a social planner care about the sum of U, or the sum of X? I don’t see how it is clear that it should be X. Why shouldn’t they value the sum of U, which seems more obvious?
‘But it’s okay for different things to spark joy’. Yes, if I care about someone I want their preferences fulfilled, not just mine, but I would like to point out that I want them to get what they want, not just for them to be happy.
Talking about caring about yourself though, if you care about yourself at different times, then you will care about what your current self does want, your past self did want, and your future self will want. I’m not sure that my current preferences need to take into account those things though.
Thus I see two different categories of thing mattering as regards preferences. Contingent or instrumental preferences are changeable in accounting, while you should evaluate things as if your terminal preferences are unchanging, even though humans can have them change, such as when they have a child. Even if you already love your child automatically when you have one, you don’t necessarily care beforehand who that child turns out to be, but you care quite a bit afterwards. See any time travel scenario: the parent will care very much that Sally no longer exists even though they now have Sammy. They will likely now also terminally value Sammy. Take into account that you will love your child, but not who they are unless you will have an effect on it (such as learning how to care for them in advance making them a more trusting child).
In practice, subsidies and taxes end up not being about externalities at all, or only to a very small degree. Often, one kind of externality (often positive) will be ignored even when it is larger than the other (often negative) externality.
This is especially true in modern countries, where people ignore the positive externalities of people’s preferences being satisfied making them a better and more useful person in society, while they are obsessed with the idea of the negatives of any exchange.
I have an intuition that the maximum people would pay to avoid an externality is not really that close to its actual effects, and that people would generally lie if you asked them even if they knew.
In the real world, most people (though far from all) seem to have the intuition that the government uses the money it gets from a tax less well than the individuals it takes it from do.
Command economies are known to be much less efficient than free markets, so the best thing the government could do with a new tax is to lower less efficient taxes, but taxes only rarely go down, so this encourages wasted resources. Even when they do lower taxes, it isn’t by eliminating the worst taxes. When they put it out in subsidies, they aren’t well targeted subsidies either, but rather distortionary ones.
Even a well targeted tax on negative externalities would thus have to handle the fact that it is, in itself, something with significant negative externalities (of making inefficient use of resources) even beyond the administrative cost.
It’s weird to bring up having kids vs. abortion and then not take a position on the latter. (Of course, people will be pissed at you for taking a position too.)
There are definitely future versions of myself whose utility is much more or less valuable to me than others despite being equally distant.
If in ten years I am a good man who has started a nice family that I take good care of, then my current self cares a lot more about his utility than that of an equally (morally) good version of myself that just takes care of my mother’s cats, and has no wife or children (and this is separate from the fact that I would care about the effects my future self would have on that wife and children, or that I care about them coming to exist).
Democracy might be less short-sighted on average because future people are more similar to average other people that currently exist than you happen to be right now. But then, they might be much more short-sighted because you plan for the future, while democracy plans for right now (and getting votes). I would posit that sometimes one will dominate, and sometimes the other.
As to your framing, the difference between you-now and you-future is mathematically bigger than the difference between others-now and others-future if you use a ratio for the number of links to get to them.
Suppose people change half as much in a year as your sibling is different from you, and you care about similarity for what value you place on someone. Thus, two years equals one link.
After 4 years, you are two links away from yourself-now and your sibling is 3 links from you-now. They are 50% more different than future you (assuming no convergence). After eight years, you are 4 links away, while they are only 5, which makes them 25% more different from you-now than future you is.
Alternately, between year 4 and year 8 their distance from you-now has grown by 67% (3 links to 5), while yours has grown by 100% (2 links to 4).
It thus seems like they have changed far less than you have, and are more similar to who they were, so why should you treat them as having the same rate?
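For anyone who wants to check the link arithmetic, here is a quick sketch of it. It assumes exactly what the example does (two years per link, a sibling starting one link away, no convergence), and the function and constant names are made up for illustration.

```python
YEARS_PER_LINK = 2        # "two years equals one link"
SIBLING_START_LINKS = 1   # a sibling starts one link away from you-now


def links_from_you_now(years):
    """Return (future-you distance, future-sibling distance) in links from you-now."""
    drift = years / YEARS_PER_LINK
    future_you = drift
    future_sibling = SIBLING_START_LINKS + drift  # assumes no convergence
    return future_you, future_sibling


def percent_more(a, b):
    """How much larger a is than b, as a percentage of b."""
    return 100 * (a - b) / b


you4, sib4 = links_from_you_now(4)
you8, sib8 = links_from_you_now(8)

print(percent_more(sib4, you4))  # sibling vs. future you at year 4
print(percent_more(sib8, you8))  # sibling vs. future you at year 8
print(percent_more(sib8, sib4))  # growth in sibling's distance, year 4 to 8
print(percent_more(you8, you4))  # growth in your own distance, year 4 to 8
```

The gap between future-you and future-sibling stays fixed at one link, so as a ratio it keeps shrinking the further out you go.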
I’m only a bit of the way in, and it is interesting so far, but it already shows signs of needing serious editing, and there are other ways it is clearly wrong too.
In ‘The inequivalence of society-level and individual charity’ they list the scenarios as 1, 1, and 2 instead of A, B, C, as they later use. Later, the text refers incorrectly to preferring C to A with different necessary weights, when the second reference is to preferring C to B.
The claim that money becomes utility as a log of the amount of money isn’t true, but is probably close enough for this kind of use. You should add a note to that effect. (The effects of money are discrete, at the very least.)
The claim that the derivative of the log of y is 1/y is also incorrect as written. In general, log means either log base 10, or something specific to the area of study. If written generally, you must specify the base. (For instance, in Computer Science it is base-2, but I would have to explain that if I was doing external math with that.) The derivative of the natural log of y is 1/y, but that isn’t true of a log in any other base. You should fix that statement by specifying you are using ln instead of log (or just prepending the word natural).
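To spell out why the base matters, here is the standard change-of-base derivation (using y as the variable, with b standing for whatever base is intended):

```latex
\[
\log_b y = \frac{\ln y}{\ln b}
\quad\Longrightarrow\quad
\frac{d}{dy}\,\log_b y = \frac{1}{y \ln b}
\]
\[
\frac{d}{dy}\,\ln y = \frac{1}{y},
\qquad
\frac{d}{dy}\,\log_{10} y = \frac{1}{y \ln 10} \approx \frac{0.434}{y},
\qquad
\frac{d}{dy}\,\log_{2} y = \frac{1}{y \ln 2} \approx \frac{1.443}{y}
\]
```

So the result only differs by a constant factor, which is probably harmless for the argument being made, but the statement is only exactly right for the natural log.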
Some of it is just plain wrong in my opinion. For instance, claiming that a weight can’t be negative assumes away the existence of hate, but people do hate either themselves or others on occasion in non-instrumental ways, wanting them to suffer, which renders this claim invalid (unless they hate literally everyone).
I also don’t see how being perfectly altruistic necessitates valuing everyone else exactly the same as you. I could still value others different amounts without being any less altruistic, especially if the difference is a lower value for me and a higher one for others. Relatedly, it is possible to not care about yourself at all, but this math can’t handle that.
I’ll leave aside other comments because I’ve only read a little.
I strongly disagree. It would be very easy for a non-omnipotent, unpopular, government that has limited knowledge of the future, that will be overthrown in twenty years to do a hell of a lot of damage with negative utilitarianism, or any other imperfect utilitarianism. On a smaller scale, even individuals could do it alone.
A negative utilitarian could easily judge that something that had the side effect of making people infertile would cause far less suffering than not doing it, causing immense real world suffering amongst the people who wanted to have kids, and ending civilizations. If they were competent enough, or the problem slightly easier than expected, they could use a disease that did that without obvious symptoms, and end humanity.
Alternately, a utilitarian that valued the far future too much might continually cause the life of those around them to be hell for the sake of imaginary effects on said far future. They might even know those effects are incredibly unlikely, and that they are more likely to be wrong than right due to the distance, but it’s what the math says, so...they cause a civil war. The government equivalent would be to conquer Africa (success not necessary for the negative effects, of course), or something like that, because your country is obviously better at ruling, and that would make the future brighter. (This could also be something done by a negative utilitarian to alleviate the long-term suffering of Africans).
Being in a limited situation does not automatically make Utilitarianism safe. (Nor any other general framework.) The specifics are always important.
A lot of this depends on your definition of doomsday/apocalypse. I took it to mean the end of humanity, and a state of the world we consider worse than our continued existence. If we valued the actual end state of the world more than continuing to exist, it would be easy to argue it was a good thing, and not a doom at all. (I don’t think the second condition is likely to come up for a very long time as a reason for something to not be doomsday.) For instance, if each person created a sapient race of progeny that weren’t human, but they valued as their own children, and who had good lives/civilizations, then the fact humanity ceased to exist due to a simple lack of biological children would not be that bad. This could in some cases be caused by AGI, but wouldn’t be a problem. (It would also be in the far future.)
AI doomsday never (though it is far from impossible). Not doomsday never, it’s just unlikely to be AGI. I believe both that we aren’t that close and that ‘takeoff’ would be best described as glacial, so we’ll have plenty of time to get it right. I am unsure of the risk level of unaligned moderately superhuman AI, but I believe (very confidently) that the tech level for minimal AGI is much lower than the tech level for doomsday AGI. If I was wrong about that, I would obviously change my mind about the likelihood of AGI doomsday. (I think I put something like 1 in 10 million in the next fifty years. [Though in percentages.] Everything else was 0, though in the case of 25 years, I just didn’t know how many 0s to give it.)
‘Tragic AGI disasters’ are fairly likely though. For example, an AGI that alters traffic light timing to make crashes occur, or intentionally sabotages things it is supposed to repair. Or even an AGI that is well aligned to the wrong people or moral framework doing things like refusing to allow necessary medical procedures due to expense even when people are willing to use their own money to pay (perhaps since it thinks the person is worth less than the cost of the procedure, and thus has negative utility). Alternately, it could predict that the people wanting the procedure were being incoherent, and actually would value their kids getting the money more, but feel like they have to try. Whether this is correct or not, it would still be AGI killing people.
I would actually rate the risk of Tool AI as higher, because humans will be using those to try to defeat other humans, and those could very well be strong enough to notably enhance the things humans are bad at. (And most of the things moderately superhuman AGI could do would be doable sooner with tool AI and an unaligned human.) An AI could help humans design a better virus that is like ‘Simian Hemorrhagic Fever’, but that affects humans, and doesn’t apply to people with certain genetic markers (that denote the ethnicity or other traits of the people making it). Humans would then test, manufacture, distribute, and use it to destroy their enemies. Then oops, it mutates, and hits everyone. This is still a very unlikely doom though.
Interactionism would simply require an extension of physics to include the interaction between the two, which would not defy physics any more than adding the strong nuclear force did. You can hold against it that we do not know how it works, but that’s a weak point because there are many things where we still don’t know how they work.
Epiphenomenalism seems irrelevant to me since it is simply a way you could posit things to be. A normal dualist ignores the idea because there is no reason to posit it. We can obviously see how consciousness has effects on the body, so there simply isn’t a reason to believe it only goes the other way. Additionally, to me, Epiphenomenalism seems clearly false. Dualism as a whole has never said the body can’t have effects on consciousness either.
Causal closure seems unrelated to the actuality of physics. It is simply a statement of philosophical belief. It is one dualists obviously disagree with in the strong version, but that is hardly incompatibility with actual physics. Causal closure is not used to any real effect, and is hard to reconcile with how things seem to actually be. You could argue that causal closure is even denying things like the idea of math, and the idea of physics being things that can meaningfully affect behavior.
If they didn’t accept physical stuff as being (at least potentially) equal to consciousness they actually wouldn’t be a dualist. Both are considered real things, and though many have less confidence in the physical world, they still believe in it as a separate thing. (Cartesian dualists do have the least faith in the real world, but even they believe you can make real statements about it as a separate thing.) Otherwise, they would be a ‘monist’. The ‘dual’ is in the name for a reason.
This is clearly correct. We know the world through our observations, which clearly occur within our consciousness, and are thus at least equally proving our consciousness. When something is being observed, you can assume that the something else doing the observations must exist. If my consciousness observes the world, my consciousness exists. If my consciousness observes itself, my consciousness exists. If my consciousness is viewing only hallucinations, it still exists for that reason. I disagree with Descartes, but ‘I think therefore I am’ is true of logical necessity.
I do not like immaterialism personally, but it is more logically defensible than illusionism.
The description and rejection given of dualism are both very weak. Also, dualism is a much broader group of models than is admitted here.
The fact is, we only have direct evidence of the mind, and everything else is just an attempt to explain certain regularities. An inability to imagine that the mind could be all that exists is clearly just a willful denial, and not evidence, but notably, dualism does not require nor even suggest that the mind is all there is, just that it is all we have proof of (even in the Cartesian variant). Thus, dualism.
Your personal refusal to imagine that physicalism is false and dualism is true seems completely irrelevant to whether or not dualism is true. Also, dualism hardly ‘defies’ physics. In dualism, physics is simply ‘under’ a meta-physics that includes consciousness as another category, without even changing physics. (If it did defy physics, that would be strong proof against physics since it is literally all of the evidence we actually have, but there is no incompatibility at all.)
Description wise, there are forms of dualism for which you give an incorrect analysis of the ‘teletransporter’ paradox. Obviously, the consciousness interacts with reality in some way, and there is no proof nor reason in dualism to assume that the consciousness could not simply follow the created version in order to keep interacting with the world.
Mind-body wise, the consciousness certainly attaches to the body through the brain to alter the world, assuming the brain and body are real (which the vast majority of dualists believe). Consciousness would certainly alter brain states if brain states are a real thing.
We also don’t know that a consciousness would not attach itself to a ‘Chinese Room’.
Your attempts at reasoning have led you astray in other areas too, but I’m more familiar with the ways in which these critiques of dualism are wrong. You seem extremely confident of this incorrect reasoning as well. This seems more like a motivated defense of illusionism than actually laying out the theories correctly.
I was replying to someone asking why it isn’t 2-5 years. I wasn’t making an actual timeline. In another post elsewhere on the site, I mention that they could give memory to a system now and it would be able to write a novel.
Without doing so, we obviously can’t tell how much planning they would be capable of if we did, but current models don’t make choices, and thus can only be scary for whatever people use them for, and their capabilities are quite limited.
I do believe that there is nothing inherently stopping the capabilities researchers from switching over to more agentic approaches with memory and the ability to plan, but it would be much harder than the current plan of just throwing money at the problem (increasing compute and data).
It will require paradigm shifts (I do have some ideas as to ones that might work) to get to particularly capable and/or worrisome levels, and those are hard to predict in advance, but they tend to take a while. Thus, I am a short term skeptic of AI capabilities and danger.
You’re assuming that it would make sense to have a globally learning model, one constantly still training, when that drastically increases the cost of running the model over present approaches. Cost is already prohibitive, and to reach that many parameters any time soon would be exorbitant (though that will probably happen eventually). Plus, the sheer amount of data necessary for such a large one is crazy, and you aren’t getting much data per interaction. Note that Chinchilla recently showed that lack of data is a much bigger issue right now for models than lack of parameters, so they probably won’t focus on parameter counts for a while.
Additionally, there are many fundamental issues we haven’t yet solved for DL-based AI. Even if it was a huge advancement over present models, which I don’t believe it would be at that size, it would still have massive weaknesses around remembering or planning, and would largely lack any agency. That’s not scary. It could be used for ill purposes, but not at human (or above) levels.
I’m skeptical of AI in the near term because we are not close. (And the results of scaling are sublinear in many ways. I believe that mathematically, it’s a log, though how that transfers to actual results can be hard to guess in advance.)
You’re assuming that the updates are mathematical and unbiased, which is the opposite of how people actually work. If your updates are highly biased, it is very easy to just make large updates in that direction any time new evidence shows up. As you get more sure of yourself, these updates start getting larger and larger rather than smaller as they should.
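Here is a toy sketch of the difference, under made-up assumptions: the evidence stream is invented (it is perfectly balanced, with as much against as for), and the ‘biased’ updater is modeled very crudely as someone who reads every piece of contrary evidence as support. It only illustrates the direction of the drift, not any particular person’s psychology.

```python
def update(prob, likelihood_ratio):
    """One Bayesian update in odds form: posterior odds = prior odds * likelihood ratio."""
    odds = prob / (1 - prob)
    odds *= likelihood_ratio
    return odds / (1 + odds)


# Hypothetical evidence: ratios above 1 favor the belief, ratios below 1 count against it.
evidence = [2.0, 0.5, 2.0, 0.5, 2.0, 0.5]


def run(prior, evidence, biased):
    """Apply every piece of evidence in turn and return the final probability."""
    prob = prior
    for ratio in evidence:
        if biased and ratio < 1:
            ratio = 1 / ratio  # the biased updater treats contrary evidence as support
        prob = update(prob, ratio)
    return prob


fair = run(0.75, evidence, biased=False)
skewed = run(0.75, evidence, biased=True)
print(fair, skewed)
```

With balanced evidence the unbiased updater ends where it started, while the biased one has multiplied its odds by 2 six times over and is left with about half a percent of doubt. Getting to a tiny number that way says nothing about whether the number is right.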
That sort of strategy only works if you can get everyone to coordinate around it, and if you can do that, you could probably just get them to coordinate on doing the right things. I don’t know if HR would listen to you if you brought your concerns directly to them, but they probably aren’t harder to persuade on that sort of thing than convincing the rest of your fellows to defy HR. (Which is just a guess.) In cases where you can’t get others to coordinate on it, you are just defecting against the group, to your own personal loss. This doesn’t seem like a good strategy.
In more limited settings, you might be able to convince your friends to debate things in your preferred style, though this depends on them in particular. As a boss, you might be able to set up a culture where people are expected to make strong arguments in formal settings. Beyond these, I don’t really think it is practical. (They don’t generalize; for instance, as a parent, your child will be incapable of making strong arguments for an extremely long time.)
That does sound problematic for his views if he actually holds these positions. I am not really familiar with him, even though he did write the textbook for my class on AI (third edition) back when I was in college. At that point, there wasn’t much on the now current techniques and I don’t remember him talking about this sort of thing (though we might simply have skipped such a section).
You could consider it that we have preferences on our preferences too. It’s a bit too self-referential, but that’s actually a key part of being a person. You could determine those things that we consider to be ‘right’ directly from how we act when knowingly pursuing those objectives, though this requires much more insight.
You’re right, the debate will keep going on in philosophical style, but whether or not it works as an approach for something other than humans could change that.