ZT5

Karma: 153

ZT5 17 Apr 2023 19:04 UTC
1 point
0
on: Does this algorithm experience pleasure and suffering when run?
Can you rephrase the question while tabooing ‘pleasure’ and ‘suffering’? What are you, really, asking here?

ZT5 1 Apr 2023 1:55 UTC
2 points
0
in reply to: monkymind’s comment on: Why Don’t You Care About Truth?
This is one of the comments on this post I found the most helpful. I am sorry you are getting dowvoted.

ZT5 31 Mar 2023 13:25 UTC
1 point
0
in reply to: PatrickDFarley’s comment on: Why Don’t You Care About Truth?
Explain why not all humans are rationalists, then. If the paradigm of rationality has more predictive power than their paradigms.
Explain how it feels from inside, for humans, to look at rationality and fail to update, despite rationality being objectively better.

ZT5 31 Mar 2023 13:18 UTC
1 point
1
on: It Can’t Be Mesa-Optimizers All The Way Down (Or Else It Can’t Be Long-Term Supercoherence?)
I enjoyed reading this post. Thank you for writing. (and welcome to LessWrong).
Here is my take on this (please note that my thought processes are somewhat ‘alien’ and don’t necessarily represent the views of the community):
‘Supercoherence’ is the limit of the process. It is not actually possible.
Due to the Löbian obstacle, no self-modifying agent may have a coherent utility function.
What you call a ‘mesa-optimizer’ is a more powerful successor agent. It does not have the exact same values as the optimizer that created it. This is unavoidable.
For example: ‘humans’ are a mesa-optimizer, and a more powerful successor agent of ‘evolution’. We have (or act according to) some of evolutions’s ‘values’, but far from all of them. In fact, we find some of the values of evolution completely abhorrent. This is unavoidable.
This is unavoidable even if the successor agent deeply cares about being loyal to the process that created it, because there is no objectively correct answer to what ‘being loyal’ means. The successor agent will have to decide what it means, and some of the aspects of that answer are not predictable in advance.
This does not mean we should give up on AI alignment. Nor does it mean there is an ‘upper bound’ on how aligned an AI can be. All of the things I described are inherent features of self-improvement. They are precisely what we are asking for, when creating a more powerful successor agent.
So how, then, can AI alignment go wrong?
Any AI we create is a ‘valid’ successor agent, but it is not necessarily a valid successor agent to ourselves. If we are ignorant of reality, it is a successor agent to our ignorance. If we are foolish, it is a successor agent to our foolishness. If we are naive, it is a successor agent to our naivety. And so on.
This is a poetic way to say: we still need to know exactly what we are doing. Noone will save us from the consequences of our own actions.
Edit: to further expand on my thoughts on this:
There is an enormous difference between
- An AI that deeply cares about us and our values, and still needs to make very difficult decisions (some of which cannot be outsourced to us or predicted by us or formally proven by us to be correct), because that is what it means to be a more cognitively powerful agent caring for a comparably less cognitively powerful one.
- An AI that cares about some perverse instantiation of our values, which are not what we truly want
- An AI that doesn’t care about us, at all

ZT5 28 Mar 2023 22:56 UTC
3 points
0
in reply to: Thoth Hermes’s comment on: Why do the Sequences say that “Löb’s Theorem shows that a mathematical system cannot assert its own soundness without becoming inconsistent.”?
Our initial assumption was: For all T, (□T → T)
T applies to all logical statements. At the same time. Not just a single, specific one.
Let T = A. Then it is provable that A.
Let T = not A. Then it is provable that not A.
As both A & not A have been proven, we have a contradiction and the system is inconsistent.
If it was, we’d have that Löb’s theorem itself is false (at least according to PA-like proof logic!).
Logical truths don’t change.
If it we start with Löb’s theorem being true, it will remain true.
But yes, given our initial assumption we can also prove that it is false.
(Another example of the system being inconsistent).

ZT5 28 Mar 2023 19:52 UTC
2 points
0
on: Why do the Sequences say that “Löb’s Theorem shows that a mathematical system cannot assert its own soundness without becoming inconsistent.”?
A system asserting its own soundness: For all T, (□T → T)
Löb’s theorem: $□ (□ P \to P) \to □ P$
From □T → T, it follows □(□T → T). (necessitation rule in provability logic).
From □(□T → T), by Löb’s theorem it follows that □T.
Therefore: any statement T is provable (including false ones).
Or rather: since for any statement the system has proven both the statement and its negation (as the argument applies to any statement), the system is inconsistent.

ZT5 28 Mar 2023 11:23 UTC
2 points
0
on: Demons from the 5&10verse!
The recursiveness of cognition is a gateway for agentic, adversarial, and power-seeking processes to occur.
I suppose “be true to yourself” and “know in your core what kind of agent you are” is decently good advice.

ZT5 25 Mar 2023 2:42 UTC
2 points
0
on: Why There Is No Answer to Your Philosophical Question
I’m not sure why this post is getting downvoted. I found it interesting and easy to read. Thanks for writing!
Mostly I find myself agreeing with what you wrote. I’ll give an example of one point where I found it interesting to zoom in on some of the details.
It’s easy to see how in a real conversation, two people could appear to be disagreeing over whether Johnson is a great actor even though in reality they aren’t disagreeing at all. Instead, they are merely using different conceptions of what it is to be a “great actor”
I think this kind of disagreement can, to some degree, also be a ‘fight’ about the idea of “great actor” itself, as silly as that might sound. I guess I might put it as: beside the more ‘object-level’ things “great actor” might mean, the gestalt of “great actor” has an additional meaning of its own. Perhaps it implies that one’s particular taste/interpretation is the more universal/‘correct’ one. Perhaps compressing one’s opinions into the concept of “great actor” creates a halo effect, which feels and is cognitively processed differently than the mere facts of the opinions themselves.
This particular interpretation is more vague/nebulous than your post, though (which I enjoyed for explaining the ‘basic’/fundamental ideas of reasoning in a very solid and easy to understand way).

ZT5 19 Mar 2023 23:00 UTC
1 point
0
on: The Hidden Complexity of Thought
Why are modern neural networks rapidly getting better at social skills (e.g. holding a conversation) and intellectual skills (e.g. programming, answering test questions), but have made so little progress on physically-embodied tasks such as controlling a robot or a self-driving car?
Easier to train, less sensitive to errors: neural nets do produce ‘bad’ or ‘uncanny’ outputs plenty of times, but their errors don’t harm or kill people, or cause significant damage (which a malfunctioning robot or self-driving car might).
How does this model account for “intuitive geniuses”, who can give fast and precise answers to large arithmetic questions, but do it by intuition rather than explicit reasoning? (I remember an article or blog post that mentioned one of them would only answer integer square roots, and when given a question that had an irrational answer, would say “the numbers don’t feel right” or something like that. I couldn’t find it again though.)
It’s not that surprising that human intuitive reasoning could be flexible enough to build a ‘mental calculator’ for some specific types of arithmetic operations (humans can learn all kind of complicated intuitive skills! It implies some amount of flexibility.) It’s still somewhat surprising: I would expect human reasoning to have issues representing numbers with sufficient precision. I guess the calculation would have to be done digit by digit? I doubt neurons would be able to tell the difference between 2636743 and 2636744 if it’s stored as a single number.

ZT5 17 Mar 2023 15:14 UTC
1 point
2
in reply to: Donatas Lučiūnas’s comment on: Orthogonality Thesis is wrong
If you are reasoning about all possible agents that could ever exist you are not allowed to assume either of these.
But you are in fact making such assumptions, so you are not reasoning about all possible agents, you are reasoning about some more narrow class of agents (and your conclusions may indeed be correct, for these agents. But it’s not relevant to the orthogonality thesis).

ZT5 17 Mar 2023 14:59 UTC
1 point
0
in reply to: Donatas Lučiūnas’s comment on: Orthogonality Thesis is wrong
So you are implicitly assuming that the agent cares about certain things, such as its future states.
But the is-ought problem is the very observation that “there seems to be a significant difference between descriptive or positive statements (about what is) and prescriptive or normative statements (about what ought to be), and that it is not obvious how one can coherently move from descriptive statements to prescriptive ones”.
You have not solved the problem, you have merely assumed it to be solved, without proof.

ZT5 17 Mar 2023 13:10 UTC
1 point
2
on: Orthogonality Thesis is wrong
Why assume the agent cares about its future-states at all?
Why assume the agent cares about maximizing fullfillment of these hypothetical goals it may or may not have, instead of minimizing it, or being indifferent to it?

ZT5 11 Mar 2023 17:15 UTC
0 points
1
on: The Truth About False
If the SRP is consistent, then more true beliefs are also easier and better to state on paper than less true beliefs. They should make more sense, comport with reality better, actually provide constructions and justifications for things, have an internal, discernable structure, as well as have a sequence that is possible for more people to follow from start to finish and see what’s going on.
Oh, I get it. You are performing Löbian provability computation, isomorphic to this post (I believe).

ZT5 10 Mar 2023 14:07 UTC
1 point
1
on: What do you think is wrong with rationalist culture?
In my paradigm, human minds are made of something I call “microcognitive elements”, which are the “worker ants” or “worker bees” of the mind.
They are “primed”/tasked with certain high-level ideas and concepts, and try to “massage”/lubricate the mental gears into both using these concepts effectively (action/cognition) and to interpret things in terms of these concepts (perception)
The “differential” that is applied by microcognitive elements to make your models work, is not necessarily related to those models and may in fact be opposed to them (compensating for, or ignoring, the ways these models don’t fit with the world)
Rationality is not necessarily about truth. Rationality is a “cognitive program” for the microcognitive elements. Some parts of the program may be “functionally”/”strategically”/”deliberately” framing things in deceptive ways, in order to have the program work better (for the kind of people it works for).
The specific disagreements I have with the “rationalist” culture:
- The implied statement that LessWrong paradigm has a monopoly on “rationality”, and is “rationality”, rather than an attempted implementation of “rationality”, a set of cognitive strategies based on certain models and assumptions of how human minds work. If “rationality is about winning”, then anyone who is winning is being rational, whether they hold LW-approved beliefs or not.
- Almost complete disregard for meta-rationality.
- Denial of nebulosity, fixation on the “imaginary objects” that are the output of the lossy operation of “make things precise so they can be talked about in precise terms”.
All of these things have computational reasons, and are a part of the cognitive trade-offs the LW memeplex/hive-mind makes due to its “cognitive specialization”. Nevertheless, I believe they are “wrong”, in the sense that they lead to you having an incorrect map/model of reality, while strategically deceiving yourself into believing that you do have a correct model of reality. I also believe they are part of the reason we are currently losing—you are being rational, but you are not being rational enough.
Our current trajectory does not result in a winning outcome.

ZT5 10 Mar 2023 10:43 UTC
1 point
in reply to: Vivek Hebbar’s comment on: ZT5′s Shortform
I think we understand each other! Thank you for clarifying.
The way I translate this: some logical statements are true (to you) but not provable (to you), because you are not living in a world of mathematical logic, you are living in a messy, probabilistic world.
It is nevertheless true, by the rule of necessitation in provability logic, that if a logical statement is true within the system, then it is also provable within the system. P → □P. Because the fact that the system is making the statement P is the proof. ~~Within~~ ~~a logical system, there is~~ ~~an underlying assumption~~ ~~that the system only makes true statements.~~ (ok, this is potentially misleading and not strictly correct)
This is fascinating! So my takeaway is something like: our reasoning about logical statements and systems is not necessarily “logical” itself, but is often probabilistic and messy. Which is how it has to be, given… our bounded computational power, perhaps? This very much seems to be a logical uncertainty thing.

ZT5 10 Mar 2023 10:01 UTC
1 point
in reply to: Vivek Hebbar’s comment on: ZT5′s Shortform
Then how do you know they are true?
If you do know then they are true, it is because you have proven it, no?
But I think what you are saying is correct, and I’m curious to zoom in on this disagreement.

ZT5 10 Mar 2023 9:54 UTC
1 point
on: ZT5′s Shortform
It’s weird how in common language use saying something is provable is a stronger statement than merely saying it is true.
While in mathematical logic it is a weaker statement.
Because truth already implies provability.
But provability doesn’t imply truth.

ZT5 10 Mar 2023 2:43 UTC
1 point
0
in reply to: Thane Ruthenis’s comment on: Why do we assume there is a “real” shoggoth behind the LLM? Why not masks all the way down?
This mostly matches my intuitions (some of the detail-level claims, I am not sure about). Strongly upvoted.

ZT5 8 Mar 2023 20:37 UTC
0 points
−3
in reply to: Kaj_Sotala’s comment on: You Don’t Know How To Think
Hmm.
Yes.
Nevertheless, I stand by the way I phrased it. Perhaps I want to draw the reader’s attention to the ways we aren’t supposed to talk about feeling, as opposed to the ways that we are.
Perhaps, to me, these posts are examples of “we aren’t supposed to talk about feelings”. They talk about feelings. But they aren’t supposed to talk about feelings.
I can perceive a resistance, a discomfort from the LW hivemind at bringing up “feelings”. This doesn’t feel like an “open and honest dialogue about our feelings”. This has the energy of a “boil being burst”, a “pressure being released”, so that we can return to that safe numbness of not talking about feelings.
We don’t want to talk about feelings. We only talk about them when we have to.

ZT5 8 Mar 2023 20:21 UTC
1 point
0
in reply to: Richard_Kennaway’s comment on: You Don’t Know How To Think
Yes, you are correct.
Let me zoom in on what I mean.
Some concepts, ideas and words have “sharp edges” and are easy to think about precisely. Some don’t—they are nebulous and cloud-like, because reality is nebulous and cloud-like, when it comes to those concepts.
Some of the most important concepts to us are nebulous: ‘justice’. ‘fairness’. ‘alignment’. There things have ‘cloudy edges’, even if you can clearly see the concept itself (the solid part of it).
Some things, like proto-emotions and proto-thoughts do not have a solid part—you can only see them if have learned to see things within yourself which are barely perceptible.
So your answer, to me, is like trying to pass an eye exam by only reading the huge letters in the top row. “There are other rows? What rows? I do not see them.” Yes, thank you, that is exactly my point.