No, it's not. There is no objective sense in which human suffering and extinction are bad. It's not even a matter of degree. Questions of morality are only meaningful from a subjective POV. That's Hume's Law in a nutshell. You can't solve the alignment problem by insisting we should aim as close as possible to the objective truth that everything is just particles and forces and that alignment doesn't matter. That's circular reasoning.
Why do you assume that Yelsgib doesn’t know or keep that in mind?
The problem is that Yudkowsky insists that a mechanistic view of the universe is the only correct perspective, even though problems like alignment are inherently inaccessible from such a perspective due to Hume's Guillotine. It's only from a subjective POV that human suffering and/or extinction can be considered bad.
I always feel like I’m reading your response to some other argument, but you decide to use some indirect reference or straw-man instead of actually addressing the impetus for your posts. This article is a long way of saying that even when things aren’t black and white, that doesn’t mean shades of grey don’t matter.
Also, I think people often reach for analogies as though they always provide clarification, when sometimes they just muddle things. I have no idea what to make of your disappearing moon example. The odds that the entire moon could disappear and reappear seem very hard to compare to the odds that there's an invisible dragon that can cure cancer. Why not compare the cancer-curing dragon with something highly probable instead of something so strangely contrived? You can't prove Big Ben will ring tomorrow, but the odds are much better than those of an invisible dragon baking you a cake!
> Both AIXI and AIXItl will at some point drop an anvil on their own heads just to see what happens
You’re confusing arbitrary optimization with a greedy algorithm, which AIXI explicitly is not. It considers a future horizon. I see you commit this fallacy often: you implicitly assume “what would an arbitrarily intelligent system do?” is equivalent to “what would an arbitrarily greedy algorithm do?”
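For reference, and roughly following Hutter's own formulation (with m the horizon, q an environment program consistent with the history, and ℓ(q) its length in bits), AIXI's action choice is an expectimax over the whole remaining horizon rather than a greedy step:

$$a_k := \arg\max_{a_k} \sum_{o_k r_k} \;\max_{a_{k+1}} \sum_{o_{k+1} r_{k+1}} \cdots \max_{a_m} \sum_{o_m r_m} \big( r_k + \cdots + r_m \big) \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$

The inner sum weights every environment program that reproduces the interaction history, and the rewards are summed all the way out to m, so "greedy" is exactly what it isn't.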
> Also, the math of AIXI assumes the environment is separably divisible
I don’t know what you mean by this.
If you’re talking about the fact that models of the environment where dropping the anvil on its head yields some reward are mixed in with models where dropping an anvil on its head results in zero reward thereafter: that’s not “assuming the environment is separably divisible”. Each program it checks is a different model of the environment as a whole, specifically one of the models congruent with its experience. Presumably, the models that return a reward for dropping an anvil on its head would become vanishingly improbable pretty quickly, even if the system is incapable of learning about its own mortality (which it wouldn’t have to do, because AIXI only works in an agent-environment loop where the agent is not embedded in the environment).
> If we had enough CPU time to build AIXItl, we would have enough CPU time to build other programs of similar size, and there would be things in the universe that AIXItl couldn’t model.
I think you’re missing the point of AIXI if you’re trying to think about it in practical terms. Math is a very useful human construct for studying patterns. The space of patterns describable by math is way larger than the set of patterns that exist in the real world. This creates a lot of confusion when trying to relate mathematical models to the real world.
For instance: it’s trivial to use algorithmic information theory to prove that a universal lossless compression algorithm is impossible, yet we use lossless compression to zip files all the time because we don’t live in a world that looks like TV static. Most files in the real world have a great deal of structure. Most N-length bit-strings in the set of all possible N-length bit-strings have no discernible structure at all.
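A rough illustration of that asymmetry (a minimal sketch using Python's standard zlib module; the inputs are just stand-ins for "structured file" and "TV static"):

```python
import os
import zlib

structured = b"abcabcabc" * 1000   # 9000 bytes of highly regular data, like most real files
noise = os.urandom(9000)           # 9000 bytes of incompressible randomness, like TV static

# No lossless compressor can shrink every possible input (pigeonhole principle),
# but real-world data is rarely random, so compressors usually win anyway.
print(len(zlib.compress(structured)))  # a few dozen bytes
print(len(zlib.compress(noise)))       # roughly 9000 bytes, or slightly more
```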
I think it’s easy to get distracted by the absurdity of AIXI and miss the insight that it provides.
> AIXItl (but not AIXI, I think) contains a magical part: namely a theorem-prover which shows that policies never promise more than they deliver.
What does that even mean? How does AIXItl promise something?
My main problem is that I don’t think Hutter should have titled his book “Universal Artificial Intelligence”. For one, because I don’t think the word “artificial” belongs; but mainly because I don’t think optimality is equivalent to intelligence, nor do I think intelligence should be defined in terms of an agent-environment loop. That implies that a system with less agency, like Stephen Hawking, would be less intelligent.
I think of intelligence as a measure of a system’s ability to produce solutions to problems. Unlike optimality, I think the resources required to produce a given solution should also factor in. That ends up being highly circumstantial, so it ends up being a relative measure rather than something absolute. AIXItl, being about as brute-force as an algorithm gets, would probably compare poorly to most heuristic systems.
Hey, G Gordon Worley III!
I just finished reading this post because Steve2152 was one of the two people (you being the other) to comment on my (accidentally published) post on formalizing and justifying the concept of emotions.
It’s interesting to hear that you’re looking for a foundational grounding of human values because I’m planning a post on that subject as well. I think you’re close with the concept of error minimization. My theory reaches back to the origins of life and what sets living systems apart from non-living systems. Living systems are locally anti-entropic, which means:
1) According to the second law of thermodynamics, a living system can never be a truly closed system.
2) Life is characterized by a medium that can gather information, such as genetic material.
The second law of thermodynamics means that all things decay, so it’s not enough to simply gather information; the system must also preserve the information it gathers. This creates an interesting dynamic, because gathering information inherently means encountering entropy (the unknown), which is inherently dangerous (what does this red button do?) and somewhat at odds with the goal of preserving information. You can even see this fundamental dichotomy manifest in the collective intelligence of the human race playing tug-of-war between conservatism (which is fundamentally about stability and preservation of norms) and liberalism (which is fundamentally about seeking progress or new ways to better society).
Another interesting consequence of the ‘telos’ of life being to gather and preserve information is that it inherently provides a means of assigning value to information. That is: information is more valuable the more it pertains to the goal of gathering and preserving information. If an asteroid were about to hit Earth and you were chosen to live on a space colony until Earth’s atmosphere allowed humans to return and start society anew, you would probably favor taking a 16 GB thumb drive with the entire English Wikipedia article text over a server rack holding several petabytes of high-definition recordings of all the reality television ever filmed, because the latter won’t be super helpful toward the goal of preserving knowledge *relevant* to mankind’s survival.
The theory also opens interesting discussions, like: if all living things have a common goal, why do things like parasites, conflict, and war exist? Also, how has evolution led to a set of instincts that imperfectly approximate this goal? How do we implement this goal in an intelligent system? How do we guarantee such an implementation will not result in conflict? Etc.
Anyway, I hope you’ll read it when I publish it and let me know what you think!
Thanks for the insight!
This is actually an incomplete draft that I didn’t mean to publish, so I do intend to cover some of your points. It’s probably not going to go into the depth you’re hoping for, since it’s pretty much just a synthesis of a segment from a Radiolab episode and three theorems about neural networks.
My goal was simply to use those facts to provide an informal proof that a trade-off exists between latency and optimality* in neural networks, and that said trade-off explains why some agents (including biological creatures) might use multiple models at different points along that trade-off instead of devoting all their computational resources to one very deep model or one low-latency model. I don’t think it’s a particularly earth-shattering revelation, but sometimes even pretty straightforward ideas can have an impact**.
> I also don’t think that subconscious processing is exactly the same as emotions.
The position I present here is a little more subtle than that. It doesn’t directly equate subconscious processing with emotions. I state that emotions are a conscious recognition of physiological processes triggered by faster stimulus-response paths in your nervous system.
The examples given in the podcast focus mostly on fight-or-flight until they get later into the discussion about research on paraplegic subjects. I think that might hint at a hierarchy of emotional complexity. It’s easy to explain the most basic ‘emotion’ that even the most primitive brains should express. As you point out, emotions like guilt are more difficult to explain. I don’t know if I can give a satisfactory response to that point because it’s beyond my lay understanding, but my best guess is that this feedback loop from stimulus to response back to stimulus and so on can be initiated by something other than direct sensory input, and the information fed back might include more than physiological state.
Each path has some input which propagates through it and results in some output. The output might include more than direct physiological control signals (to various muscles, say); it might include more abstract information, such as a compact representation of the internal state of the path. The input might include more than sensory input. The feedback might be more direct.
For instance, I believe I’ve read that some parts of the brain receive a copy of recent motor commands, which may or may not correspond to physiological change. Along with the indirect feedback from sensors that measure your sweaty palms, the output of a path may directly feed back the command signals to release hormones or to blink eyes or whatever as input to other paths. A path might also output signals that don’t correspond to any physiological control; they may be specifically meant to be feedback signals that communicate more abstract information.
Another example: you don’t cry at the end of Schindler’s List because of any direct sensory input. The emotion arises from a more complex, higher-order cognition of the situation. Perhaps there are abstract outputs from the slower paths that feed back into the faster paths, which makes the whole feedback system more complex and allows higher-order cognition paths to indirectly produce physiological responses that they don’t directly control.
Another piece of the puzzle may be that the slowest path, which I (perhaps erroneously) refer to as consciousness, is supposedly where the physiological state triggered by faster paths gets labeled. That slower path almost certainly uses other context to arrive at such a label. A physiological state can have multiple causes. If you’ve just run a marathon on a cold day, it’s unlikely you’ll feel frightened just because you register an elevated heart rate, sweaty palms, goosebumps, etc.
I lump all those ‘faster stimulus-response paths’, including reflexes, under the umbrella term ‘subconscious’, which might not be correct. I’m not sure if any of the related fields (neurology, psychology, etc.) have a more precise definition of ‘subconscious’. The term used in the podcast is the ‘autonomic nervous system’ which, according to Google, means: the part of the nervous system responsible for control of the bodily functions not consciously directed, such as breathing, the heartbeat, and digestive processes.
There’s a bit of a blurred line there, since reflexes are often included as part of the autonomic nervous system even though they govern responses that can also be consciously directed, such as blinking. Also, I believe the debate over what, exactly, ‘consciously directed’ means is still open since, AFAIK, there’s no generally agreed-upon formal definition of the word ‘consciousness’.
> In fact, the term “subconscious” lumps together “some of the things happening in the neocortex” with “everything happening elsewhere in the brain” (amygdala, tectum, etc.) which I think are profoundly different and well worth distinguishing. … I think a neocortex by itself cannot do anything biologically useful.
I think there are a lot of words related to the phenomenon of intelligence and consciousness that have nebulous, informal meanings which vaguely reference concrete implementations (like the human mind and brain), but could and should be formalized mathematically. In that pursuit, I’d like to extract the essence of those words from the implementation details like the neocortex.
There are many other creatures, such as octopuses and crows, which are on a similar evolutionary path of increasing intelligence but have completely different anatomy from humans and from each other. I agree that focusing research on the neocortex itself is a terrible way to understand intelligence. It’s like trying to understand how a computer works by looking only at the media files on the hard drive, ignoring the BIOS, operating system, file system, CPU, and other underlying systems that render that data useful.
I believe, for instance, that “Artificial Intelligence” is a misnomer. We should be studying the phenomenon of intelligence as an abstract property that a system can exhibit regardless of whether it’s man-made. There is no scientific field of artificial aerodynamics or artificial chemistry. There’s no fundamental difference in the way air behaves when it interacts with a wing that depends upon whether the wing is natural or man-made.
Without a formal definition of ‘intelligence’ we have no way of making basic claims like, “system X is more intelligent than system Y”. It’s similar to how fields like physics were stuck until previously vague words like force and energy were given formal mathematical definitions. The engineering of heat engines benefited greatly when thermodynamics was developed and formalized ideas like ‘heat’ and ‘entropy’. Computer science wasn’t really possible until Church and Turing formalized the vague ideas of computation and computability. Later Shannon formalized the concept of information and allowed even greater progress.
We can look to specific implementations of a phenomenon to draw inspiration and help us understand the more universal truths about the phenomenon in question (as I do in this post), but if an alien robot came from outer-space and behaved in every way like a human, I see no reason to treat its intelligence as a fundamentally distinct phenomenon. When it exhibits emotion, I see no reason to call it anything else.
Anyway, I haven’t read your post yet, but I look forward to it! Thanks, again!
*here, optimality refers to producing the absolute best outputs for a given input. It’s independent of the amount of resources required to arrive at those outputs.
**I mean: Special Relativity (SR) came from the fact that the velocity of light (measured in distance/time) appeared constant across all reference frames according to Maxwell’s equations (and backed up by observation). Einstein drew the genius but obvious (in hindsight) conclusion that the only way a value of distance/time can remain constant between reference frames is if the measures of space and time themselves are variable. The Lorentz transform is the only transform consistent with such dimensional variability between reference frames. There are only three terms in c = distance/time; if c is constant and different reference frames demand variability, then space and time must not be constant.
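For concreteness, the transform in question is the standard Lorentz transformation for a frame moving at velocity v along x:

$$x' = \gamma\,(x - vt), \qquad t' = \gamma\left(t - \frac{vx}{c^{2}}\right), \qquad \gamma = \frac{1}{\sqrt{1 - v^{2}/c^{2}}}$$

It mixes space and time together in exactly the way needed to keep c the same in every inertial frame.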
Not that I think I’m presenting anything as amazing as Special Relativity or that I think I’m anywhere near Einstein. It’s just a convenient example.
In short, your second paragraph is what I’m after.
Philosophically, I don’t think the distinction you make between a design choice and an evolved feature carries much relevance. It’s true that some things evolve that have no purpose, and it’s easy to imagine that emotions are one of those things, especially since people often conceptualize emotion as the “opposite” of rationality. However, some things evolve that clearly do serve a purpose (in other words, there is a justification for their existence), like the eye. Of course nobody sat down with the intent to design an eye. It evolved, was useful, and stuck around because of that utility. The utility of the eye (its justification for sticking around) exists independent of whether the eye exists. A designer recognizes the utility beforehand and purposefully implements it. Evolution “recognizes” the utility after stumbling into it.
How? The person I’m responding to gets the math of probability wrong and uses it to make a confusing claim that “there’s nothing wrong”, as though we have no more agency over the development of AI than we do over the chaotic motion of a die.
It’s foolish to liken the development of AI to a roll of the dice. Given the stakes, we must try to study, prepare for, and guide the development of AI as best we can.
This isn’t hypothetical. We’ve already built a machine that’s more intelligent than any man alive and which brutally optimizes toward a goal that’s incompatible with the good of mankind. We call it “Global Capitalism”. There isn’t a man alive who knows how to stock the shelves of stores all over the world with #2 pencils that cost only 2 cents each, yet it happens every day because *the system* knows how. The problem is that the system operates with a sociopathic disregard for life (human or otherwise) and has exceeded all limits of sustainability without so much as slowing down. It’s a short-sighted, cruel leviathan, and there’s no human at the reins.
At this point, it’s not about waiting for the dice to settle, it’s about figuring out how to wrangle such a beast and prevent the creation of more.
This is a pretty lame attitude towards mathematics. If William Rowan Hamilton showed you his discovery of quaternions, you’d probably scoff and say “yeah, but what can that do for ME?”.
Occam’s razor has been a guiding principle for science for centuries without having any proof for why it’s a good policy. Now Solomonoff comes along and provides a proof, and you’re unimpressed. Great.
> After all, a formalization of Occam’s razor is supposed to be useful in order to be considered rational.
Declaring a mathematical abstraction useless just because it is not practically applicable to whatever your purpose may be is pretty short-sighted. The concept of infinity isn’t useful to engineers, but it’s very useful to mathematicians. Does that make it irrational?
Thinking this through some more, I think the real problem is that S.I. is defined from the perspective of an agent modeling an environment, so the assumption that Many Worlds has to put every unobservable on the output tape is incorrect. It’s like stating that Copenhagen has to output all the probability amplitudes onto the output tape, and maybe whatever dice God rolled to produce the final answer as well. Neither of those is true.
That’s a link to somebody complaining about how someone else presented an argument. I have no idea what point you think it makes that’s relevant to this discussion.
> output of a TM that just runs the SWE doesn’t predict your and only your observations. You have to manually perform an extra operation to extract them, and that’s extra complexity that isn’t part of the “complexity of the programme”.
First, can you define “SWE”? I’m not familiar with the acronym.
Second, why is that a problem? You should want a theory that requires as few assumptions as possible to explain as much as possible. The fact that it explains more than just your point of view (POV) is a good thing. It lets you make predictions. The only requirement is that it explains at least your POV.
The point is to explain the patterns you observe.
> > The size of the universe is not a postulate of the QFT or General Relativity.
>
> That’s not relevant to my argument.
It most certainly is. If you try to run the Copenhagen interpretation in a Turing machine to get output that matches your POV, then it has to output the whole universe and you have to find your POV on the tape somewhere.
The problem is: that’s not how theories are tested. It’s not like people are looking for a theory that explains electromagnetism and why they’re afraid of clowns and why their uncle “Bob” visited so much when they were a teenager and why there’s a white streak in their prom photo as though a cosmic ray hit the camera when the picture was taken, etc., etc.
The observations we’re talking about are experiments where a particular phenomenon is invoked with minimal disturbance from the outside world (if you’re lucky enough to work in a field like Physics which permits such experiments). In a simple universe that just has an electron traveling toward a double-slit wall and a detector, what happens? We can observe that and we can run our model to see what it predicts. We don’t have to run the Turing machine with input of 10^80 particles for 13.8 billion years then try to sift through the output tape to find what matches our observations.
Same thing for the Many Worlds interpretation. It explains the results of our experiments just as well as Copenhagen; it just doesn’t posit any special phenomenon like observation. Observation is just what entanglement looks like from the perspective of one of the entangled particles (or system of particles, if you’re talking about the scientist).
> Operationally, something like copenhagen, ie. neglect of unobserved predictions, and renormalisation, has to occur, because otherwise you can’t make predictions.
First of all: of course you can use Many Worlds to make predictions. You do it every time you use the math of QFT. You can make predictions about entangled particles, can’t you? The only difference is that, while the math of probability is about weighted sums of hypothetical paths, in MW you take those paths quite literally as actually being traversed. That’s what you’re trading for the magic dice machine in non-deterministic theories.
Secondly: just because Many Worlds says those worlds exist doesn’t mean you have to invent some extra phenomenon to justify renormalization. At the end of the day, the unobservable universe is still unobservable. When you’re talking about predicting what you might observe when you run experiment X, it’s fine to ultimately discard the rest of the multiverse. You just don’t need to make up some story about how your perspective is special and you have some magic power to collapse wave functions that other particles don’t have.
> Hence my comment about SU&C. Different adds some extra baggage about what that means—occurred in a different branch versus didn’t occur—but the operation still needs to occur.
Please stop introducing obscure acronyms without stating what they mean. It makes your argument less clear. More often than not it results in *more* typing because of the confusion it causes. I have no idea what this sentence means. SU&C = Single Universe and Collapse? Like objective collapse? “Different” what?
> Well, the original comment was about explaining lightning
You’re right. I think I see your point more clearly now. I may have to think about this a little more deeply. It’s very hard to apply Occam’s razor to theories about emergent phenomena, especially those several steps removed from basic particle interactions. There are, of course, other ways to weigh one theory against another, one of which is falsifiability.
If the Thor theory must be constantly modified to explain why nobody can directly observe Thor, then it gets pushed toward unfalsifiability. It gets ejected from science because there’s no way to even test the theory, which in turn means it has no predictive power.
As I explained in one of my replies to Jimdrix_Hendri, though there is a formalization of Occam’s razor, Solomonoff induction isn’t really used in practice. It’s usually more like: individual phenomena are studied and characterized mathematically, then links between them are found that explain more with fewer and less complex assumptions.
In the case of Many Worlds vs. Copenhagen, it’s pretty clear cut. Copenhagen has the same explanatory power as Many Worlds and shares all the postulates of Many Worlds, but adds some extra assumptions, so it’s a clear violation of Occam’s razor. I don’t know of a *practical* way to handle situations that are less clear cut.
> Thor isn’t quite as directly in the theory :-) In Norse mythology...
Tetraspace Grouping’s original post clearly invokes Thor as an alternate hypothesis to Maxwell’s equations to explain the phenomenon of electromagnetism. They’re using Thor as a generic stand-in for the God hypothesis.
> In Norse mythology he’s a creature born to a father and mother, a consequence of initial conditions just like you.
Now you’re calling them “initial conditions”. This is very different from “conditions”, which are directly observable. We can observe the current conditions of the universe, come up with theories that explain the various phenomena we see, and use those theories to make testable predictions about the future and somewhat harder-to-test predictions about the past. I would love to see a simple theory that, given the currently observable conditions, predicts that the universe not only had a definite beginning (hint: your high school science teacher was wrong about modern cosmology) but started with sentient beings.
> Sure, you’d have to believe that initial conditions were such that would lead to Thor.
Which would be a lineage of Gods that begins with some God that created everything and is either directly or indirectly responsible for all the phenomena we observe according to the mythology.
I think you’re the one missing Tetraspace Grouping’s point. They weren’t trying to invoke all of Norse mythology, they were trying to compare the complexity of explaining the phenomenon of electromagnetism by a few short equations vs. saying some intelligent being does it.
> You wouldn’t penalize the Bob hypothesis by saying “Bob’s brain is too complicated”, so neither should you penalize the Thor hypothesis for that reason.
The existence of Bob isn’t a hypothesis; it’s not used to explain any phenomenon. Thor is invoked as the cause of, not a consequence of, a fundamental phenomenon. If I noticed some loud noise on my roof every full moon, and you told me that your friend Bob likes to do parkour on rooftops in my neighborhood in the light of the full moon, that would be a hypothesis for a phenomenon that I observed, and I could test that hypothesis and verify that the noise is caused by Bob. If you posited that Bob was responsible for some fundamental force of the universe, that would be much harder for me to swallow.
> The true reason you penalize the Thor hypothesis is because he has supernatural powers, unlike Bob. Which is what I’ve been saying since the first comment.
No. The supernatural doesn’t just violate Occam’s Razor: it is flat-out incompatible with science. The one assumption in science is naturalism. Science is the best system we know for accumulating information without relying on trust. You have to state how you performed an experiment and what you observed so that others can recreate your result. If you say, “my neighbor picked up sticks on the sabbath and was struck by lightning” others can try to repeat that experiment.
It is, indeed, possible that life on Earth was created by an intelligent being or a group of intelligent beings. They need not be supernatural. That theory, however, is necessarily more complex than any abiogenesis theory, because you then have to explain how the intelligent designer(s) came about, which would eventually involve some form of abiogenesis.
You’re trying to conflate theory, conditions, and what they entail in a not so subtle way. Occam’s razor is about the complexity of a theory, not conditions, not what the theory and conditions entail. Just the theory. The Thor hypothesis puts Thor directly in the theory. It’s not derived from the theory under certain conditions. In the case of the Thor theory, you have to assume more to arrive at the same conclusion.
It’s really not that complicated.
That’s not how rolling a die works. Each roll is completely independent. The expected value of rolling a 20-sided die is 10.5, but there’s no logical way to assign an expected outcome to any given roll. You can calculate how many times you’d have to roll before you’re more likely than not to have rolled a specific value: (1 − P)^n < 0.5, so n > log(0.5)/log(1 − P). In this case P = 1⁄20 = 0.05, so n > log(0.5)/log(0.95) ≈ 13.51. So you’re more likely than not to have rolled a “1” after 14 rolls, but that still doesn’t tell you what to expect your Nth roll to be.
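A quick sanity check of that arithmetic (a minimal sketch in Python; the simulation is just there to confirm the closed form):

```python
import math
import random

p = 1 / 20                               # chance of a specific face on a d20
n = math.log(0.5) / math.log(1 - p)      # ≈ 13.51, so 14 rolls are needed
print(n)

# Empirical check: fraction of 14-roll sessions containing at least one "1"
trials = 100_000
hits = sum(any(random.randint(1, 20) == 1 for _ in range(14)) for _ in range(trials))
print(hits / trials)                     # ≈ 1 - 0.95**14 ≈ 0.512
```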
I don’t see how your dice-rolling example supports a pacifist outlook. We’re not rolling dice here. This is a subject we can study and gain more information about to understand the different outcomes better. You can’t do that with a die. The outcomes of rolling a die are also not so dire. Probability is quite useful for making decisions in the face of uncertainty if you understand it better.
The telos of life is to collect and preserve information. That is to say: this is the defining behavior of a living system, so it is an inherent goal. The beginning of life must have involved some replicating medium for storing information. At first, life actively preserved information by replicating and passively collected information through the process of evolution by natural selection. Now life forms have several ways of collecting and storing information: genetics, epigenetics, brains, immune systems, gut biomes, etc.
Obviously a system that collects and preserves information is anti-entropic, so living systems can never be fully closed systems. One can think of them as turbulent vortices that form in the flow of the universe from low-entropy to high-entropy. It may never be possible to halt entropy completely, but if the vortex grows enough, it may slow the progression enough that the universe never quite reaches equilibrium. That’s the hope, at least.
One nice thing about this goal is that it’s also an instrumental goal. It should lead to a very general form of intelligence that’s capable of solving many problems.
One question is: if all living creatures share the same goal, why is there conflict? The simple answer is that it’s a flaw in evolution. Different creatures encapsulate different information about how to survive. There are few ways to share this information, so there’s little way to form an alliance with other creatures. Ideally, we would want to maximize our internal, low-entropy part and minimize our interface with high entropy.
Imagine playing a game of Risk. A good strategy is to maximize the number of countries you control while minimizing the number of access points to your territory. If you hold North America, you want to take Venezuela, Iceland, and Kamchatka too, because they add to your territory without adding to your “interface”: you still only have three territories to defend. This principle extends to many real-world scenarios.
Of course, a better way is to form alliances with your neighbors so you don’t have to spend so many resources conquering them (that’s not a good way to win Risk, but it would be better in the real world).
The reason humans haven’t figured out how to reach a state of peace is because we have a flawed implementation of intelligence that makes it difficult to align our interests (or to recognize that our base goals are inherently aligned).
One interesting consequence of the goal of collecting and preserving information is that it inherently implies a utility function to information. That is: information that is more relevant to the problem of collecting and preserving information is more valuable than information that’s less relevant to that goal. You’re not winning at life if you have an HD box set of “Happy Days” while your neighbor has only a flash drive with all of wikipedia on it. You may have more bits of information, but those bits aren’t very useful.
Another reason for conflict among humans is the hard problem of when to favor information preservation over collection. Collecting information necessarily involves risk because it means encountering the unknown. This is the basic conflict between conservatism and liberalism in the most general form of those words.
Would an AI given the goal of collecting and preserving information completely solve the alignment problem? It seems like it might. I’d like to be able to prove such a statement. Thoughts?
EDIT: Please pardon the disorganized, stream-of-consciousness, style of this post. I’m usually skeptical of posts that seem so scatter-brained and almost… hippy-dippy… for lack of a better word. Like the kind of rambling that a stoned teenager might spout. Please work with me here. I’ve found it hard to present this idea without coming off as a spiritualist-quack, but it is a very serious proposal.
I think your example of interpreting quantum mechanics gets pretty close to the heart of the matter. It’s one thing to point at Solomonoff induction and say, “there’s your formalization”. It’s quite another to understand how Occam’s razor is used in practice.
Nobody actually tries to convert the Standard Model to the shortest possible computer program, count the bits, and compare it to the shortest possible computer program for string theory or whatever.
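(For what it’s worth, the object nobody computes in practice is the Solomonoff prior, where U is a universal prefix machine, p ranges over programs whose output starts with the observed data x, and ℓ(p) is program length in bits:

$$M(x) \;=\; \sum_{p\,:\,U(p)\,=\,x*} 2^{-\ell(p)}$$

Shorter programs dominate the sum, which is the formal sense of “simpler”; the sum itself is uncomputable, hence the more informal comparisons that actually get made.)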
What you’ll find, however, is that some theories amount to other theories plus an extra postulate or two (e.g., Copenhagen relative to Many Worlds), so they are strictly more complex. If a theory doesn’t explain more than the simpler one, the extra complexity isn’t justified.
A lot of the progression of science over the last few centuries has been toward unifying diverse theories under less complex, general frameworks. Special relativity helped unify theories about the electric and magnetic forces, which were then unified with the weak nuclear force and eventually the strong nuclear force. A lot of that work has helped explain the composition of the periodic table and the underlying mechanisms to chemistry. In other words, where there used to be many separate theories, there are now only two theories that explain almost every phenomenon in the observable universe. Those two theories are based on surprisingly few and surprisingly simple postulates.
Over the 20th century, the trend was towards reducing postulates and explaining more, so it was pretty clear that Occam’s razor was being followed. Since then, we’ve run into a bit of an impasse with GR and QFT not nicely unifying and discoveries like dark energy and dark matter.
If the “Cartesian barrier” is such a show-stopper, then why is it non-trivial to prove that I’m not a brain in a vat remotely puppeteering a meat robot?
Was nobody intelligent before the advent of neuroscience? Do people need to know neuroscience before they qualify as intelligent agents? Are there no intelligent animals?
I’m really not sure how to interpret the requirement that an agent know about software upgrades. There is a system called a Gödel machine that’s compatible with AIXI(tl), and it’s all about self-modification; however, I don’t know of many real-world examples of intelligent agents concerned with whatever the equivalent of a software upgrade would be for a brain.
Rewards help by filtering out world models where doing dangerous things has a high expected reward. Remember that AIXI includes reward in its world models and exponentially devalues long world models. If the reward signal drops as AIXI pilots its body close to fire, lava, acid, sharks, etc., the world model that says “don’t damage your body” is much shorter than the model that says “don’t go near fire, lava, acid, or sharks, but maybe dropping an anvil on your head is a great idea!”
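To put a rough number on “exponentially devalues” (the 20-bit gap below is made up purely for illustration): if the “don’t damage your body” model is 20 bits shorter than the “anvils are secretly rewarding” model, its prior weight under the 2^(−ℓ(q)) term is larger by a factor of

$$\frac{2^{-\ell}}{2^{-(\ell+20)}} \;=\; 2^{20} \;\approx\; 10^{6}.$$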
That’s not an AIXI thing. That’s a problem for all agents.
The anvil “paradox” simply illustrates the essential intractability of tabula rasa learning in general, but it’s not like you couldn’t initialize AIXI with some a priori knowledge.
In a totally tabula rasa set-up, an agent can’t know if anything it outputs will yield arbitrarily high or low reward. That’s not unique to AIXI. It’s also not unique to AIXI that it can only infer the concept of mortality.
Did your parents teach you what they think is deadly, or were you born with innate knowledge of death? How exactly is it that you came to suspect that dropping an anvil on your head isn’t a good idea? Were your parents perfect programmers?
So you are saying it can’t generalize. That’s exactly what you’re saying.
Teenagers do horribly dangerous things all the time for the dubious reward of impressing their peers, yet this machine that’s diligently applying inductive inference to determine the provably optimal actions fails to meet your yet-to-be-described standard for intelligence if it’s unlucky?
Also, why is the reward function the only means of feeding this agent data? Couldn’t you just tell it that jumping off a cliff is a bad idea? Do you think it undermines the intelligence of a child to tell it to look both ways before crossing the street?
Why? Prove that a small punishment wouldn’t work. If you give AIXI heat sensors so it gradually gets more and more punishment as it approaches a fire, show me how Occam’s razor wouldn’t prevail and say, “I bet you’ll get more punishment if you get even closer to the fire.” Where does the model that says there’s a pot of gold in a lava pit come from? How does it end up drowning out literally every other world model? Explain it. Don’t just say “it could happen, therefore AIXI isn’t perfect, and it has to be perfect to be intelligent”.
No agent can perfectly emulate itself. AIXItl can have an approximate self-model just like any other agent. It would have an incomplete world-model otherwise. It can “think it’s like other brains” too. That’s also Occam’s razor. A world model where you assume others are like you is shorter than a world model where other agents are completely alien. Imperfect ≠ incapable. You’re applying a double standard.
Why not?
AIXItl can absolutely develop a world model that includes a mortal self-model. I think what you’re arguing is that, since there will always be a world model where jumping off a cliff yields some reward, it will never hypothesize that its future reward goes to zero. It will always assume there’s some chance it will live. That’s not irrational. There is some chance it will live; that chance never technically goes to zero. That’s very different from thinking jumping off a cliff is the optimal action.
Or maybe you’re saying that when you expand the tree of future actions, you are supposing that you can take those actions? Not in the world models that say you’re dead. Those will continue to yield zero reward after all your agency is obliterated.
Imagine you had a mech suit with which you could lift a car. Your world model will include that mech suit. It will also include the possibility that said mech suit is destroyed. Then you can’t lift a car.
I’m done for now. I really don’t like this straw-man conversation style of article. Why can’t you argue actual points against actual people?