Dach

Karma: 91

Dach 30 Aug 2020 13:58 UTC
3 points
in reply to: Anirandis’s comment on: Open & Welcome Thread—August 2020
If you’re having significant anxiety from imagining some horrific I-have-no-mouth-and-I-must-scream scenario, I recommend that you multiply that dread by a very, very small number, so as to incorporate the low probability of such a scenario. You’re privileging this supposedly very low probability specific outcome over the rather horrifically wide selection of ways AGI could be a cosmic disaster.

This is, of course, not intended to dissuade you from pursuing solutions to such a disaster.

Dach 2 Sep 2020 8:25 UTC
5 points
in reply to: Anirandis’s comment on: Open & Welcome Thread—August 2020
You can’t really be accidentally slightly wrong. We’re not going to develop Mostly Friendly AI, which is Friendly AI but with the slight caveat that it has a slightly higher value on the welfare of shrimp than desired, with no other negative consequences. The molecular sorts of precision needed to get anywhere near the zone of loosely trying to maximize or minimize for anything resembling human values will probably only follow from a method that is converging towards the exact spot we want it to be at, such as some clever flawless version of reward modelling.

In the same way, we’re probably not going to accidentally land in hyperexistential disaster territory. We could have some sign flipped, our checksum changed, and all our other error-correcting methods (Any future seed AI should at least be using ECC memory, drives in RAID, etc.) defeated by religious terrorists, cosmic rays, unscrupulous programmers, quantum fluctuations, etc. However, the vast majority of these mistakes would probably buff out or result in paper-clipping. If an FAI has slightly too high of a value assigned to the welfare of shrimp, it will realize this in the process of reward modelling and correct the issue. If its operation does not involve the continual adaptation of the model that is supposed to represent human values, it’s not using a method which has any chance of converging to Overwhelming Victory or even adjacent spaces for any reason other than sheer coincidence.

A method such as this has, barring stuff which I need to think more about (stability under self-modification), no chance of ending up in “We perfectly recreated human values… But placed an unreasonably high value on eating bread! Now all the humans will be force-fed bread until the stars burn out! Mwhahahahaha!” sorts of scenarios. If the system cares about humans being alive enough to not reconfigure their matter into something else, we’re probably using a method which is innately insulated from most types of hyperexistential risk.

It’s not clear that Gwern’s example, or even that category of problem, is particularly relevant to this situation. Most parallels to modern-day software systems and the errors they are prone to are probably best viewed as sobering reminders, not specific advice. Indeed, I suspect his comment was merely a sobering reminder and not actual advice. If humans are making changes to the critical software/hardware of an AGI (and we’ll assume you figured out how to let the AGI allow you to do this in a way that has no negative side effects) while that AGI is already running, something bizarre and beyond my abilities of prediction is already happening. If you need to make changes after you turn your AGI on, you’ve already lost. If you don’t need to make changes and you’re making changes, you’re putting humanity at unnecessary risk. At this point, if we’ve figured out how to assist the seed AI in self-modification, at least until the point at which it can figure out how to do stable self-modification for itself, the problem is already solved. There’s more to be said here, but I’ll refrain for the purpose of brevity.

Essentially, we cannot make any ordinary mistake. The type of mistake we would need to make in order to end up in hyperexistential disaster territory would, most likely, be an actual, literal sign flip scenario, and such scenarios seem much easier to address. There will probably only be a handful of weak points for this problem, and those weak points are all already things we’d pay extra super special attention to and will engineer in ways which make it extra super special sure nothing goes wrong. Our method will, ideally, be terrorist proof. It will not be possible to flip the sign of the utility function or the direction of the updates to the reward model, even if several of the researchers on the project are actively trying to sabotage the effort and cause a hyperexistential disaster.

I conjecture that most of the expected utility gained from combating the possibility of a hyperexistential disaster lies in the disproportionate positive effects on human sanity and the resulting improvements to the efforts to avoid regular existential disasters, and other such side-benefits.

None of this is intended to dissuade you from investigating this topic further. I’m merely arguing that a hyperexistential disaster is not remotely likely- not that it is not a concern. The fact that people will be concerned about this possibility is an important part of why the outcome is unlikely.

Dach 2 Sep 2020 20:25 UTC
4 points
in reply to: Anirandis’s comment on: Open & Welcome Thread—August 2020

I’m slightly confused by this one. If we were to design the AI as a strict positive utilitarian (or something similar), I could see how the worst possible thing to happen to it would be no human utility (i.e. paperclips). But most attempts at an aligned AI would have a minimum at “I have no mouth, and I must scream”. So any sign-flipping error would be expected to land there.

It’s hard to talk in specifics because my knowledge on the details of what future AGI architecture might look like is, of course, extremely limited.

As an almost entirely inapplicable analogy (which nonetheless still conveys my thinking here): consider the sorting algorithm for the comments on this post. If we flipped the “top-scoring” sorting algorithm to sort in the wrong direction, we would see the worst-rated posts on top, which would correspond to a hyperexistential disaster. However, if we instead flipped the effect that an upvote had on the score of a comment to negative values, it would sort comments which had no votes other than the default vote assigned on posting the comment to the top. This corresponds to paperclipping- it’s not minimizing the intended function, it’s just doing something weird.
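To make the two kinds of flip in the analogy concrete, here is a minimal sketch. The comment names, vote counts, and scoring rule (a default vote plus upvotes minus downvotes) are all hypothetical and chosen only for illustration; this is not how any real site computes its ranking.

```python
from dataclasses import dataclass

@dataclass
class Comment:
    name: str
    upvotes: int
    downvotes: int

DEFAULT_VOTE = 1  # assumption: every comment starts with its author's own vote

def score(comment, upvote_weight=1):
    # Intended rule (hypothetical): default vote + upvotes - downvotes.
    return DEFAULT_VOTE + upvote_weight * comment.upvotes - comment.downvotes

def ranked(comments, descending=True, upvote_weight=1):
    ordered = sorted(comments, key=lambda c: score(c, upvote_weight),
                     reverse=descending)
    return [c.name for c in ordered]

comments = [
    Comment("well_liked", upvotes=40, downvotes=2),
    Comment("mixed", upvotes=10, downvotes=8),
    Comment("unvoted", upvotes=0, downvotes=0),
    Comment("disliked", upvotes=1, downvotes=25),
]

intended = ranked(comments)                          # best comment on top
flipped_sort = ranked(comments, descending=False)    # sort direction flipped
flipped_upvote = ranked(comments, upvote_weight=-1)  # upvote effect flipped

print(intended[0], flipped_sort[0], flipped_upvote[0])
```

With these made-up numbers, flipping the sort direction puts the worst-rated comment on top, while flipping the sign of the upvote effect puts the comment with no votes on top: a different, stranger ordering rather than the exact reverse of the intended one.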

If we inverted the utility function, this would (unless we take specific measures to combat it like you’re mentioning) lead to hyperexistential disaster. However, if we invert some constant which is meant to initially provide value for exploring new strategies while the AI is not yet intelligent enough to properly explore new strategies as an instrumental goal, the AI would effectively brick itself. It would place negative value on exploring new strategies, presumably including strategies which involve fixing this issue so it can acquire more utility and strategies which involve preventing the humans from turning it off. If we had some code which is intended to make the AI not turn off the evolution of the reward model before the AI values not turning off the reward model for other reasons (e.g. the reward model begins to properly model how humans don’t want the AI to turn the reward model evolution process off), and some crucial sign was flipped which made it do the opposite, the AI would freeze the process of the reward model being updated and then maximize whatever inane nonsense its model currently represented, and it would eventually run into some bizarre previously unconsidered and thus not appropriately penalized strategy comparable to tiling the universe with smiley faces, i.e. paperclipping.

These are really crude examples, but I think the argument is still valid. Also, this argument doesn’t address the core concern of “What about the things which DO result in hyperexistential disaster”, it just establishes that much of the class of mistakes you may have previously thought usually or always resulted in hyperexistential disaster (sign flips on critical software points) in fact usually causes paperclipping or the AI bricking itself.

If we were to design the AI as a strict positive utilitarian (or something similar), I could see how the worst possible thing to happen to it would be no human utility (i.e. paperclips).

Can you clarify what you mean by this? Also, I get what you’re going for, but paperclips is still extremely negative utility because it involves the destruction of humanity and the reconfiguration of the universe into garbage.

Perhaps there’ll be a reward function/model intentionally designed to disvalue some arbitrary “surrogate” thing in an attempt to separate it from hyperexistential risk. So “pessimizing the target metric” would look more like paperclipping than torture. But I’m unsure as to (1) whether the AGI’s developers would actually bother to implement it, and (2) whether it’d actually work in this sort of scenario.

I sure hope that future AGI developers can be bothered to embrace safe design!

Also worth noting is that an AGI based on reward modelling is going to have to be linked to another neural network, which is going to have constant input from humans. If that reward model isn’t designed to be separated in design space from AM, someone could screw up with the model somehow.

The reward modelling system would need to be very carefully engineered, definitely.

If we were to, say, have U = V + W (where V is the reward given by the reward model and W is some arbitrary thing that the AGI disvalues, as is the case in Eliezer’s Arbital post that I linked,) a sign flip-type error in V (rather than a sign flip in U) would lead to a hyperexistential catastrophe.

I thought this as well when I read the post. I’m sure there’s something clever you can do to avoid this but we also need to make sure that these sorts of critical components are not vulnerable to memory corruption. I may try to find a better strategy for this later, but for now I need to go do other things.
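The difference between a sign flip in U and a sign flip in V in the quoted U = V + W setup can be sketched with a toy table of outcomes. Every outcome name and number below is invented for illustration; the only assumption carried over from the quoted comment is that W heavily disvalues some arbitrary surrogate thing.

```python
# Hypothetical outcomes with made-up values for V (reward model) and
# W (the surrogate term the agent is designed to disvalue).
outcomes = {
    "flourishing": {"V": 10, "W": 0},
    "paperclips": {"V": -5, "W": 0},
    "suffering": {"V": -100, "W": 0},
    "surrogate": {"V": 0, "W": -1000},
}

def best(utility):
    # The outcome a maximizer of the given utility would pick.
    return max(outcomes, key=lambda name: utility(outcomes[name]))

intended = best(lambda o: o["V"] + o["W"])      # U = V + W
flipped_u = best(lambda o: -(o["V"] + o["W"]))  # sign flip on all of U
flipped_v = best(lambda o: -o["V"] + o["W"])    # sign flip on V only

print(intended, flipped_u, flipped_v)
```

In this toy table, flipping U lands on the surrogate outcome, which is the point of adding W, while flipping only V still lands on the worst outcome under V, because W is left intact and no longer dominates the minimum.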

I think this is somewhat likely to be the case, but I’m not sure that I’m confident enough about it. Flipping the direction of updates to the reward model seems harder to prevent than a bit flip in a utility function, which could be prevented through error-correcting code memory (as you mentioned earlier.)

Sorry, I meant to convey that this was a feature we’re going to want to ensure that future AGI efforts display, not some feature which I have some other independent reason to believe would be displayed. It was an extension of the thought that “Our method will, ideally, be terrorist proof.”

Dach 3 Sep 2020 11:04 UTC
3 points
in reply to: Anirandis’s comment on: Open & Welcome Thread—August 2020

Interesting analogy. I can see what you’re saying, and I guess it depends on what specifically gets flipped. I’m unsure about the second example; something like exploring new strategies doesn’t seem like something an AGI would terminally value. It’s instrumental to optimising the reward function/model, but I can’t see it getting flipped with the reward function/model.

Sorry, I meant instrumentally value. Typo. Modern machine learning systems often require a specific incentive in order to explore new strategies and escape local maximums. We may see this behavior in future attempts at AGI. And no, it would not be flipped with the reward function/model- I’m highlighting that there is a really large variety of sign flip mistakes and most of them probably result in paperclipping.

My thinking was that a signflipped AGI designed as a positive utilitarian (i.e. with a minimum at 0 human utility) would prefer paperclipping to torture because the former provides 0 human utility (as there aren’t any humans), whereas the latter may produce a negligible amount. I’m not really sure if it makes sense tbh.

Paperclipping seems to be negative utility, not approximately 0 utility. It involves all the humans being killed and our beautiful universe being ruined. I guess if there are no humans, there’s no utility in some sense, but human values don’t actually seem to work that way. I rate universes where humans never existed at all and

I’m… not sure what 0 utility would look like. It’s within the range of experiences that people experience on modern-day earth- somewhere between my current experience and being tortured. This is just a problem of definitions, though- we could shift the scale such that paperclipping is zero utility, but in that case, we could also just make an AGI that has a minimum at paperclipping levels of utility.

Even if we engineered it carefully, that doesn’t rule out screw-ups. We need robust failsafe measures just in case, imo.

In the context of AI safety, I think “robust failsafe measures just in case” is part of “careful engineering”. So, we agree!

You’d still need to balance it in a way such that the system won’t spend all of its resources preventing this thing from happening at the neglect of actual human values, but that doesn’t seem too difficult.

I read Eliezer’s idea, and that strategy seems to be… dangerous. I think that “Giving an AGI a utility function which includes features which are not really relevant to human values” is something we want to avoid unless we absolutely need to.

I have much more to say on this topic and about the rest of your comment, but it’s definitely too much for a comment chain. I’ll make an actual post on this containing my thoughts sometime in the next week or two, and link it to you.
What links here?
- Anirandis's comment on How easily can we separate a friendly AI in design space from one which would bring about a hyperexistential catastrophe? by Anirandis (10 Sep 2020 2:11 UTC; 2 points)

Dach 4 Sep 2020 10:01 UTC
3 points
in reply to: Anirandis’s comment on: Open & Welcome Thread—August 2020

So it definitely seems plausible for a reward to be flipped without resulting in the system failing/neglecting to adopt new strategies/doing something weird, etc.

I didn’t mean to imply that a signflipped AGI would not instrumentally explore.

I’m saying that, well… modern machine learning systems often get specific bonus utility for exploring, because it’s hard to explore the proper amount as an instrumental goal due to the difficulties of fully modelling the situation, and because systems which don’t have this bonus will often get stuck in local maximums.

Humans exhibit this property too. We have investigating things, acquiring new information, and building useful strategic models as a terminal goal- we are “curious”.

This is a feature we might see in early stages of modern attempts at full AGI, for similar reasons to why modern machine learning systems and humans exhibit this same behavior.

Presumably such features would be built to uninstall themselves after the AGI reaches levels of intelligence sufficient to properly and fully explore new strategies as an instrumental goal to satisfying the human utility function, if we do go this route.

If we sign flipped the amount of reward the AGI gets from such a feature, the AGI would be penalized for exploring new strategies- this may have any number of effects which are fairly implementation specific and unpredictable. However, it probably wouldn’t result in hyperexistential catastrophe. This AI, providing everything else works as intended, actually seems to be perfectly aligned. If performed on a subhuman seed AI, it may brick- in this trivial case, it is neither aligned nor misaligned- it is an inanimate object.
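As a rough illustration of what an exploration bonus does, and of what happens when its sign is flipped, here is a toy deterministic three-armed bandit. The reward values, the count-based bonus of the form weight / sqrt(pulls + 1), and the function names are all assumptions made for this sketch; it says nothing about how a real AGI would be built.

```python
import math

TRUE_REWARDS = [0.2, 0.5, 0.9]  # hypothetical fixed payoffs; arm 2 is best

def run_bandit(bonus_weight, steps=200):
    """Greedy agent whose score for each arm is its reward estimate plus a
    count-based exploration bonus. Returns how often each arm was pulled."""
    n_arms = len(TRUE_REWARDS)
    estimates = [0.0] * n_arms
    pulls = [0] * n_arms
    for _ in range(steps):
        scores = [
            estimates[a] + bonus_weight / math.sqrt(pulls[a] + 1)
            for a in range(n_arms)
        ]
        arm = max(range(n_arms), key=lambda a: scores[a])  # ties: lowest index
        reward = TRUE_REWARDS[arm]
        pulls[arm] += 1
        # Incremental mean update of the estimate for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return pulls

def favourite(pulls):
    return pulls.index(max(pulls))

with_bonus = run_bandit(bonus_weight=1.0)
no_bonus = run_bandit(bonus_weight=0.0)
flipped_bonus = run_bandit(bonus_weight=-1.0)

print(with_bonus, no_bonus, flipped_bonus)
```

In this toy setting the agent with a positive bonus tries every arm and settles on the best one, while the agent with no bonus and the agent with a sign-flipped bonus both stay on the first arm they happen to try. The flipped agent is not pursuing anything resembling the opposite of the intended goal; it is just stuck.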

Yes, an AGI with a flipped utility function would pursue its goals with roughly the same level of intelligence.

The point of this argument is super obvious, so you probably thought I was saying something else. I’m going somewhere with this, though- I’ll expand later.

Dach 10 Sep 2020 12:36 UTC
3 points
on: Why haven’t we celebrated any major achievements lately?
Scientific and industrial progress is an essential part of modern life. The opening of a new extremely long suspension bridge would be entirely unsurprising- if it were twice the length of the previous longest, I might bother to read a short article about it. I would assume there would be some local celebration (though not too much- if it was too well received, why did we not do it before?), but it would not be a turning point in technology or a grand symbol of man’s triumph over nature. We’ve been building huge awe-inspiring structures for quite some time by now, and the awe has worn off. Innovation and progress are normal.

Celebration in terms of “Bells are ringing and the people are weeping and philosophizing” requires complete upsets. Reusable rockets, manned missions to Mars, a COVID-19 vaccine, etc.- those are all part of the current state of affairs. If humanity wants these things, and has the time, I know they will come.

[Question] Outcome Terminology?

Dach 14 Sep 2020 18:04 UTC
6 points
0 comments, 1 min read

Dach 18 Sep 2020 0:45 UTC
3 points
on: Making the Monte Hall problem weirder but obvious
Amusing anecdote: I once tried to give my mother intuition behind Monty Hall with a process similar to this. She didn’t quite get it, so I played the game with her a few times. Unfortunately, she won more often when she stayed than when she switched (n ~= 10), and decided that I was misremembering. A lesson was learned, but not by the person I had intended.
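For a sense of how often a small sample misleads like this, here is a short sketch. It enumerates the standard game exactly, then computes the chance that staying wins strictly more games than switching. The split of five games staying and five switching is an assumption; the anecdote only says n was about 10.

```python
from fractions import Fraction
from math import comb

def win_probabilities():
    """Enumerate every (car, first pick) pair in the standard game."""
    stay_wins = switch_wins = total = 0
    for car in range(3):
        for pick in range(3):
            # Host opens a door that is neither the pick nor the car.
            opened = next(d for d in range(3) if d != pick and d != car)
            # Switching means taking the one remaining closed door.
            switched = next(d for d in range(3) if d != pick and d != opened)
            stay_wins += pick == car
            switch_wins += switched == car
            total += 1
    return Fraction(stay_wins, total), Fraction(switch_wins, total)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_stay_looks_better(n_stay, n_switch, p_stay, p_switch):
    """Chance that staying wins strictly more games than switching."""
    return sum(
        binom_pmf(s, n_stay, p_stay) * binom_pmf(w, n_switch, p_switch)
        for s in range(n_stay + 1)
        for w in range(n_switch + 1)
        if s > w
    )

p_stay, p_switch = win_probabilities()
# Assumed split: 5 games staying, 5 games switching.
misleading = prob_stay_looks_better(5, 5, p_stay, p_switch)
print(p_stay, p_switch, float(misleading))
```

The printed probability is below one half but not zero, so a run of about ten games that favors staying is unlucky rather than impossible.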

Dach 24 Sep 2020 7:17 UTC
1 point
on: Dach’s Shortform
(2020-10-03) EDIT: I have found the solution: the way I was thinking about identity turns out to be silly.
In general, if you update your probability estimates of non-local phenomena based on anthropic arguments, you’re (probably? I’m sure someone has come up with smart counterexamples) doing something that includes the sneaky implication that you’re conducting FTL communication. I consider this to be a reductio ad absurdum on the whole idea of updating your probability estimates of non-local phenomena based on anthropic arguments, regardless of the validity of the specific scenario in this post.
If you conduct some experiment which tries to determine which world you’re in, and you observe x thing, you haven’t learned (at least, in general) anything about what percentage of exact copies of you before you did the experiment observed what you observed.
If you do update, and if you claim the update you’re making corresponds to reality, you’re claiming that non-local facts are having a justified influence on you. When you put it like that, it’s very silly. By adjusting the non-local worlds, we can change this justified influence on you (otherwise your update does not correspond to reality), and we have FTL signaling.
The things you’re experiencing aren’t any evidence about the sorts of things that most exact copies of your brain are experiencing, and if you claim they are, you’re unknowingly claiming FTL communication is possible, and that you’re doing it right now.
I’ll need to write something more substantial about this problem.
(End Edit)
(This bit is edited to redact irrelevant/obviously-wrong-in-hindsight information)
So, let us imagine our universe is “big” in the sense of many worlds, and all realities compatible with the universal wavefunction are actualized- or at least something close to that. This seems increasingly likely to me.
Aligned superintelligences might permute through all possible human minds, for various reasons. They might be especially interested in permuting through minds which are thinking about the AI alignment problem- also for various reasons.
As a result, it’s not evident to me for normal reasons that most of the “measure” of me-right-now flows into “normal” things- it seems plausible (on a surface level- I have some arguments against this) that most of the “measure” of me should be falling into Weird Stuff. Future superintelligences control the vast majority of all of the measure of everything (more branches in the future, more future in general, etc.), and they’re probably more interested in building minds from scratch and then doing weird things with them.
If, among all of the representations of my algorithm in reality, (100%) * (1 − 10^-30) of my measure was “normal stuff”, I’d still expect to be “diverted” basically instantly, if we assume there’s one opportunity for a “diversion” every Planck time.
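The arithmetic behind “basically instantly” can be checked with a few lines. This sketch assumes independent opportunities, reads the 10^-30 figure as the chance of diversion at each opportunity, and uses the approximate value of the Planck time (about 5.39 × 10^-44 seconds); the variable names are just for illustration.

```python
import math

PLANCK_TIME_S = 5.391e-44  # approximate Planck time in seconds
P_DIVERT = 1e-30           # assumed chance of diversion per opportunity

# Expected number of opportunities before the first diversion (geometric).
expected_opportunities = 1 / P_DIVERT
expected_seconds = expected_opportunities * PLANCK_TIME_S

# Natural log of the probability of staying undiverted for one full second.
opportunities_per_second = 1 / PLANCK_TIME_S
log_survival_one_second = opportunities_per_second * math.log1p(-P_DIVERT)

print(expected_seconds, log_survival_one_second)
```

Under these assumptions the expected wait is on the order of 10^-14 seconds, and the log-probability of lasting a whole second is so negative that the probability itself underflows to zero in floating point.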
However, this is, of course, not happening. We are constantly avoiding waking up from the simulation.
Possible explanations:
- The world is small. This seems unlikely- look at these conditions:
1. Many worlds is wrong.
2. The universe is finite and not arbitrarily large.
3. There’s no multiverse, or the multiverse is small, or other universes all have properties which mean they don’t support the potential for diversion, e.g. their laws are sufficiently different that none of them will contain human algorithms in great supply.
4. There’s no way to gain infinite energy, or truly arbitrarily large amounts of energy.
- The sum of “Normal universe simulations” vastly outweighs the sum of “continuity of experience stealing simulations”, for some reason. Maybe there are lots of different intelligent civilizations in the game, and lots of us spawn general universe simulators, and we also tend to cut simulations when other superintelligences arrive.
- Superintelligences are taking deliberate action to prevent “diversion” tactics from being effective, or predict that other superintelligences are taking these actions. For example, if I don’t like the idea of diverting people, I might snipe for recently diverted sentients and place them back in a simulation consistent with a “natural” environment.
- “Diversion” as a whole isn’t possible, and my understanding of how identity and experience work is sketchy.
- Some other sort of misunderstanding hidden in my assumptions or premise.
(Apply the comments in the edit above to the original post. If I think that the fact I’m not observing myself being shoved into a simulation is evidence that most copies of my algorithm throughout reality are not being shoved into simulations, I also need to think that the versions of me which don’t get shoved into simulations are justifiably updating in correspondence with facts of arbitrary physical separation from themselves, thus FTL signaling. Or, even worse, inter-universe signaling.)

Dach 24 Sep 2020 18:29 UTC
1 point
in reply to: TAG’s comment on: Dach’s Shortform
Right, that isn’t an exhaustive list. I included the candidates which seemed most likely.
So, I think superintelligence is unlikely in general- but so is current civilization. I think superintelligences have a high occurrence rate given current civilization (for lots of reasons), which also means that current civilization isn’t that much more likely than superintelligence. It’s more justified to say “Superintelligences which make human minds” have a super low occurrence rate relative to natural examples of me and my environment, but that still seems to be an unlikely explanation.
Based on the “standard” discussion on this topic, I get the distinct impression that the probability our civilization will construct an aligned superintelligence is significantly greater than, for example, 10^-20%, and the large amounts of leverage that a superintelligence would have (There’s lots of matter out there) would produce this same effect.

Dach 29 Sep 2020 2:42 UTC
1 point
on: The Short Case for Verificationism
It’s a well known tragedy that (unless Humanity gains a perspective on reality far surpassing my wildest expectations) there are arbitrarily many nontrivially unique theories which correspond to any finite set of observations.
The practical consequence of this (A small leap, but valid) is that we can remove any idea you have and make exactly the same predictions about sensory experiences by reformulating our model. Yes, any idea. Models are not even slightly unique- the idea of anything “really existing” is “unnecessary”, but literally every belief is “unnecessary”. I’d expect some beliefs would, for the practical purposes of present-day-earth human brains, be impossible to replace, but I digress.
(Joke: what’s the first step of more accurately predicting your experiences? Simplifying your experiences! Ahaha!)
You cannot “know” anything, because you’re experiencing exactly the same thing as you could possibly be experiencing if you were wrong. You can’t “know” that you’re either wrong or right, or neither, you can’t “know” that you can’t “know” anything, etc. etc. etc.
There are infinitely many different ontologies which support every single piece of information you have ever or will ever experience.
In fact, no experience indicates anything- we can build a theory of everything which explains any experience but undermines any inferences made using it, and we can do this with a one-to-one correspondence to theories that support that inference.
In fact, there’s no way to draw the inference that you’re experiencing anything. We can build infinitely many models (Or, given the limits on how much matter you can store in a Hubble volume, an arbitrarily large but finite number of models) in which the whole concept of “experience” is explained away as delusion...
And so on!
The main point of making beliefs pay rent is having a more computationally efficient model- doing things more effectively. Is your reformulation more effective than the naïve model? No.
Your model, and this whole line of thought, is not paying rent.

Dach 30 Sep 2020 0:42 UTC
1 point
in reply to: TAG’s comment on: The Short Case for Verificationism
This is only true for trivial values, e.g. “I terminally value having this specific world model”.
For most utility schemes (Including, critically, that of humans), the supermajority of the purpose of models and beliefs is instrumental. For example, making better predictions, using less computing power, etc.
In fact, humans who do not recognize this fact and stick to beliefs or models because they like them are profoundly irrational. If the sky is blue, I wish to believe the sky is blue, and so on. So, assuming that only prediction is valuable is not question begging- I suspect you already agreed with this and just didn’t realize it.
In the sense that beliefs (and the models they’re part of) are instrumental goals, any specific belief is “unnecessary”. Note the quotations around “unnecessary” in this comment and the comment you’re replying to. By “unnecessary” I mean the choice of which beliefs and which model to use is subject to the whims of which is more instrumentally valuable- in practice, a complex tradeoff between predictive accuracy and computational demands.

Dach 30 Sep 2020 19:48 UTC
3 points
in reply to: TAG’s comment on: The Short Case for Verificationism
It’s also true for “I terminally value understanding the world, whatever the correct model is”.
I said e.g., not i.e., and “I terminally value understanding the world, whatever the correct model is” is also a case of trivial values.
First, a disclaimer: It’s unclear how well the idea of terminal/instrumental values maps to human values. Humans seem pretty prone to value drift- whenever we decide we like some idea and implement it, we’re not exactly “discovering” some new strategy and then instrumentally implementing it. We’re more incorporating the new strategy directly into our value network. It’s possible (Or even probable) that our instrumental values “sneak in” to our value network and are basically terminal values with (usually) lower weights.
Now, what would we expect to see if “Understanding the world, whatever the correct model is” was a broadly shared terminal value in humans, in the same way as the other prime suspects for terminal value (survival instinct, caring for friends and family, etc)? I would expect:
1. It’s exhibited in the vast majority of humans, with some medium correlation between intelligence and the level to which this value is exhibited. (Strongly exhibiting this value tends to cause greater effectiveness i.e. intelligence, but most people already strongly exhibit this value)
2. Companies to have jumped on this opportunity like a pack of wolves and have designed thousands of cheap wooden signs with phrases like “Family, love, ‘Understanding the world, whatever the correct model is’”.
3. Movements which oppose this value are somewhat fringe and widely condemned.
4. Most people who espouse this value are not exactly sure where it’s from, in the same way they’re not exactly sure where their survival instinct or their love for their family came from.
But, what do we see in the real world?
1. Exhibiting this value is highly correlated with intelligence. Almost everyone lightly exhibits this value, because its practical applications are pretty obvious (Pretending your mate isn’t cheating on you is just plainly a stupid strategy), but it’s only strongly and knowingly exhibited among really smart people interested in improving their instrumental capabilities.
2. Movements which oppose this value are common.
3. Most people who espouse this value got it from an intellectual tradition, some wise counseling, etc.

Dach 1 Oct 2020 3:12 UTC
1 point
in reply to: TAG’s comment on: The Short Case for Verificationism
Refer to my disclaimer for the validity of the idea of humans having terminal values. In the context of human values, I think of “terminal values” as the ones directly formed by evolution and hardwired into our brains, and thus broadly shared. The apparent exceptions are rarish and highly associated with childhood neglect and brain damage.
“Broadly shared” is not a significant additional constraint on what I mean by “terminal value”, it’s a passing acknowledgement of the rare counterexamples.
If that’s your argument then we somewhat agree. I’m saying that the model you should use is the model that most efficiently pursues your goals, and (in response to your comment) that utility schemes which terminally value having specific models (and thus whose goals are most efficiently pursued through using said arbitrary terminally valued model and not a more computationally efficient model) are not evidently present among humans in great enough supply for us to expect that that caveat applies to anyone who will read any of these comments.
Real world examples of people who appear at first glance to value having specific models (e.g. religious people) are pretty sketchy- if this is to be believed, you can change someone’s terminal values with the argumentative equivalent of a single rusty musket ball and a rubber band. That defies the sort of behaviors we’d want to see from whatever we’re defining as a “terminal value”, keeping in mind the inconsistencies between the way human value systems are structured and the way the value systems of hypothetical artificial intelligences are structured.
The argumentative strategy required to convince someone to ignore instrumentally unimportant details about the truth of reality looks more like “have a normal conversation with them” than “display a series of colorful flashes as a precursor to the biological equivalent of arbitrary code execution” or otherwise psychologically breaking them in a way sufficient to get them to do basically anything, which is what would be required to cause serious damage to what I’m talking about when I say “terminal values” in the context of humans.

Dach 1 Oct 2020 22:22 UTC
1 point
in reply to: ike’s comment on: The Short Case for Verificationism
This is false. I actually have no idea what it would mean for an experience to be a delusion—I don’t think that’s even a meaningful statement.
I’m comfortable with the Cartesian argument that allows me to know that I am experiencing things.
Everything you’re thinking is compatible with a situation in which you’re actually in a simulation hosted in some entirely alien reality (2 + 2 = 3, experience is meaningless, causes follow after effects, (True ^ True) = False, etc.), which is being manipulated in extremely contrived ways which produce your exact current thought processes.
There are an exhausting number of different riffs on this idea- maybe you’re in an asylum and all of your thinking including “I actually have no idea what it would mean for an experience to be a delusion” is due to some major mental disorder. Oh, how obvious- my idea of experience was a crazy delusion all along. I can’t believe I said that it was my daughter’s arm. “I think therefore I am”? Absurd!
If you have an argument against this problem, I am especially interested in hearing it- it seems like the fact you can’t tell between this situation and reality (and you can’t know whether this situation is impossible as a result, etc.) is part of the construction of the scenario. You’d need to show that the whole idea that “We can construct situations in which you’re having exactly the same thoughts as you are right now, but with some arbitrary change (Which you don’t even need to believe is theoretically possible or coherent) in the background” is invalid.
Do I think this is a practical concern? Of course not. The Cartesian argument isn’t sufficient to convince me, though- I’m just assuming that I really exist and things are broadly as they seem. I don’t think it’s that plausible to expect that I would be able to derive these assumptions without using them- there is no epistemological rock bottom.
On the contrary, it’s the naive realist model that doesn’t pay rent by not making any predictions at all different from my simpler model.
Your model is (I allege) not actually simpler. It just seems simpler because you “removed something” from it. A mind could be much “simpler” than ours, but also less useful- which is the actual point of having a simpler model. The “simplest” model which accurately predicts everything we see is going to be a fundamental physical theory, but making accurate predictions about complicated macroscopic behavior entirely from first principles is not tractable with eight billion human brains worth of hardware.
The real question of importance is, does operating on a framework which takes specific regular notice of the idea that naïve realism is technically a floating belief increase your productivity in the real world? I can’t see why that would be the case- it requires occasionally spending my scarce brainpower on reformatting my basic experience of the world in more complicated terms, I have to think about whether or not I should argue with someone whenever they bring up the idea of naïve realism, etc. You claim adopting the “simpler” model doesn’t change your predictions, so I don’t see what justifies these costs. Are there some major hidden costs of naïve realism that I’m not aware of? Am I actually wasting more unconscious brainpower working with the idea of “reality” and things “really existing”?
If I have to choose between two models which make the exact same predictions (i.e. my current model and your model), I’m going to choose the model which is better at achieving my goals. In practice, this is the more computationally efficient model, which (I allege) is my current model.

Dach 2 Oct 2020 2:18 UTC
1 point
in reply to: ike’s comment on: The Short Case for Verificationism
E.g. “maybe you’re in an asylum” assumes that it’s possible for an asylum to “exist” and for someone to be in it, both of which are meaningless under my worldview.
What do you mean by “reality”? You keep using words that are meaningless under my worldview without bothering to define them.
You’re implementing a feature into your model which doesn’t change what it predicts but makes it less computationally efficient.
The fact you’re saying “both of which are meaningless under my worldview” is damning evidence that your model (or at least your current implementation of your model) sucks, because that message transmits useful information to someone using my model but apparently has no meaning in your model. Ipso facto, my model is better. There’s no coherent excuse for this.
This isn’t relevant to the truth of verificationism, though. My argument against realism is that it’s not even coherent. If it makes your model prettier, go ahead and use it.
What does it mean for your model to be “true”? There are infinitely many unique models which will predict all evidence you will ever receive- I established this earlier and you never responded.
It’s not about making my model “prettier”- my model is literally better at evoking the outcomes that I want to evoke. This is the correct dimension on which to evaluate your model.
You’ll just run into trouble if you try doing e.g. quantum physics and insist on realism—you’ll do things like assert there must be loopholes in Bell’s theorem, and search for them and never find them.
My preferred interpretation of quantum physics (many worlds) was made before Bell’s theorem, and it turns out that Bell’s theorem is actually strong evidence in favor of many worlds. Bell’s theorem does not “disprove realism”, it just disproves hidden variable theories. My interpretation already predicted that.
I suspect this isn’t going anywhere, so I’m abdicating.

Dach 2 Oct 2020 8:48 UTC
1 point
in reply to: TAG’s comment on: The Short Case for Verificationism
The existence of places like LessWrong, philosophy departments, etc, indicate that people do have some sort of goal to understand things in general, aside from any nitpicking about what is a true terminal value.
I agree- lots of people (including me, of course) are learning because they want to- not as part of some instrumental plan to achieve their other goals. I think this is significant evidence that we do terminally value learning. However, the way that I personally have the most fun learning is not the way that is best for cultivating a perfect understanding of reality (nor developing the model which is most instrumentally efficient, for that matter). This indicates that I don’t necessarily want to learn so that I can have the mental model that most accurately describes reality- I have fun learning for complicated reasons which I don’t expect align with any short guiding principle.
Also, at least for now, I get basically all of my expected value from learning from my expectations for being able to leverage that knowledge. I have a lot more fun learning about e.g. history than the things I actually spend my time on, but historical knowledge isn’t nearly as useful, so I’m not spending my time on it.
In retrospect, I should’ve said something more along the lines of “We value understanding in and of itself, but (at least for me, and at least for now) most of the value in our understanding is from its practical role in the advancement of our other goals.”
I’ve already stated that I am not talking about confirming specific models.
There’s been a mix-up here- my meaning for “specific” also includes “whichever model corresponds to reality the best”.

Dach 4 Oct 2020 23:30 UTC
8 points
in reply to: Dagon’s comment on: Industrial literacy
I suspect that if people really understood the cost to future people of the contortions we go through to support this many simultaneous humans in this level of luxury, we’d have to admit that we don’t actually care about them very much. I sympathize with those who are saying “go back to the good old days” in terms of cutting the population back to a sustainable level (1850 was about 1.2B, and it’s not clear even that was sparse/spartan enough to last more than a few millennia).
There’s enough matter in our light cone to support each individual existing human for roughly 10^44 years.
The problem is not “running out of resources”- there are so many resources it will require cosmic engineering for us to use more of them than entropy, even if we multiply our current population by ten billion.
Earth is only one planet- it does not matter how much of earth we use here and now. Our job is to make sure that our light cone ends up being used for what we find valuable. That’s our only job. The finite resources available on earth are almost irrelevant to the long term human project, beyond the extent to which those resources help us accomplish our job- I would burn a trillion pacific oceans worth of oil for a .000000000000000001% absolute increase to our probability of succeeding at our job.
I sympathize with people who are thinking like this, because it shows that they’re at least trying to think about the future. But… Future Humanity doesn’t need the petty resources available on earth any more than we need good flint to make hunting spears with. The only important thing and the best thing we can do for them is to ensure that they will ever exist at all!

Dach 5 Oct 2020 21:48 UTC
1 point
in reply to: Emiya’s comment on: Industrial literacy
Why do you seem to imply that burning fossil fuels would help at all the odds of the long term human project?
I don’t imply that. For clarification:
I would waste any number of resources if that was what was best for the long-term prospects of Humanity. In practice, that means that I’m willing to sacrifice really really large amounts of resources that we won’t be able to use until after we develop AGI or similar, in exchange for very very small increases to our probability of developing aligned AGI or similar.
Because I think we won’t be able to use significant portions of most of the types of resources available on Earth before we develop AGI or similar, I’m willing to completely ignore conservation of those resources. I still care about the side effects of the process of gathering and using those resources, but...
The oil example isn’t meant to be any reflection of my affinity for fossil fuels.
My point is that “Super long term conservation of resources” isn’t a concern. If there are near term non “conservation of resources” reasons why doing something is bad, I’m open to those concerns- we don’t need to worry about ensuring that humans 100 years from now have access to fuel sources.
For the record, I think nuclear and solar seem to clearly be better energy sources than fossil fuels for most applications. Especially nuclear.
I’m also not fighting defense for climate change activists- I don’t care about how many species die out, unless those species are useful (short term- next 50 years, 100 years max?) to us. If you want to make sure future humanity has access to Tropical Tree Frog #952, and you’re concerned about them going extinct, go grab some genetic samples and preserve them. If the species makes many humans very happy, provides us valuable resources, etc., fine.
At the current rate of fishing, all fish species could be practically extinct by 2050
I’m open to the notion that regulating our fish intake is the responsible move- it seems like a pretty easy sell. It keeps our fishing equipment, boats, and fishermen useful. I’m taking this action because it’s better for humanity, not because it’s better for the fish or better for the Earth.
The Strategy is not to excessively use resources and destroy the environment just because we can, it’s to actively and directly use our resources to accomplish our goals, which I doubt strongly aligns with preserving the environment.
Let’s list a few ways in which our conservation efforts are bad:
- Long term (100+ years) storage of nuclear waste.
- Protecting species which aren’t really useful to Humanity.
- Planning with the idea that we will be indefinitely (Or, for more than 100 years) living in the current technological paradigm, i.e. without artificial general intelligence.
And in which they’re valid:
- Being careful with our harvesting of easily depletable species which we’ll be better off having alive for the next 100 years.
- Being careful with our effect on global temperatures and water levels, in order to avoid the costs of relocating large numbers of humans.
- Being careful with our management of important freshwater reserves, at least until we develop sufficiently economical desalinization plants.
I personally don’t want to see my personal odds of survival diminishing because I’ll have to deal with riots, food shortages, totalitarian fascist governments or… who knows?
The greatest risks to your survival are, by far, (unless you’re a very exceptional person) natural causes and misaligned artificial general intelligence. You shouldn’t significantly concern yourself with dealing with weird risk factors such as riots or food shortages unless you’ve already found that you can’t do anything about natural causes and misaligned artificial general intelligence. Spoiler: It seems you can do something about these risk factors.
Every economical estimate I saw said that the costs would be a lot less than the economic damage from climate change alone, many estimates agree that it would actually improve the economy, and nobody is saying “let’s toss industry and technology out of the window, back to the caves everyone!”.
Many people are saying things I consider dangerously close to “Let’s toss industry and technology out of the window!”. Dagon suggested that our current resource expenditure was reckless, and that we should substantially downgrade our resource expenditures. I consider this to be a seriously questionable perspective on the problem.
I’m not arguing against preserving the environment if it would boost the economy for at least the next 100 years, keeping in mind opportunity cost. I want to improve humanity’s generalized power to pursue its goals- I’m not attached to any particular short guiding principle for doing this, such as “Protect the Earth!” or “More oil!”. I don’t have Mad Oil Baron Syndrome.

Dach 5 Oct 2020 22:48 UTC
1 point
in reply to: TAG’s comment on: Industrial literacy
It’s possible, but very improbable. We have vastly more probable concerns (misaligned AGI, etc.) than resource depletion sufficient to cripple the entire human project.
What critical resources is Humanity at serious risk of depleting? Remember that most resources have substitutes- food is food.

Dach

[Question] Outcome Terminology?