I’m confused. I know that it is like something to be me (this is in some sense the only thing I know for sure). It seems like there are rules that shape the things I experience, and some of those rules can be studied (like the laws of physics). We are good enough at understanding some of these rules to predict certain systems with a high degree of accuracy, like how an asteroid will orbit a star or how electrons will be pushed through a wire by a particular voltage in a circuit. But I have no way to know or predict whether it is like something to be a fish or GPT-4. I know that physical alterations to my brain seem to affect my experience, so it seems like there is a mapping from physical matter to experiences. I do not know precisely what this mapping is, and this indeed seems like a hard problem. In what sense do you disagree with my framing here?
Eccentricity
Oh good catch, I missed that. Thanks!
I am not so sure it will be possible to extract useful work towards solving alignment out of systems we do not already know how to carefully steer. I think that substantial progress on alignment is necessary before we know how to build things that actually want to help us advance the science. Even if we built something tomorrow that was in principle smart enough to do good alignment research, I am concerned we don’t know how to make it actually do that rather than, say, generate ideas that sound plausible but are incorrect. The fact that appending silly phrases like “I’ll tip $200” improves the probability of receiving correct code from current LLMs indicates to me that we haven’t succeeded at aligning them to maximally want to produce correct code when they are capable of doing so.
How does Harry know the name “Lucius Malfoy”?
We aren’t surprised by HHTHHTTTHT or whatever because we perceive it as the event “a sequence containing a similar number of heads and tails in any order, ideally without a long subsequence of H or T”, which occurs frequently.
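A quick back-of-the-envelope check of this (my own sketch, not from the original comment; it ignores the run-length condition and only counts sequences with a roughly equal number of heads and tails):

```python
# How common is the event "a 10-flip sequence with a roughly equal
# number of heads and tails" (here: 4, 5, or 6 heads)?
from math import comb

total = 2 ** 10  # all equally likely length-10 sequences
balanced = sum(comb(10, k) for k in (4, 5, 6))  # 210 + 252 + 210 = 672
print(balanced / total)  # 0.65625: roughly two thirds of all sequences
```

So any *particular* sequence like HHTHHTTTHT has probability 1/1024, but the coarse event we actually perceive covers about two thirds of the outcome space, which is why it doesn’t feel surprising.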
I’m enjoying this series, and look forward to the next installment.
The thing I mean by “superintelligence” is very different from a government. A government cannot design nanotechnology, and is made of humans, who value human things.
What can men do against such reckless indifference?
Can someone with more knowledge give me a sense of how new this idea is, and guess at the probability that it is onto something?
Why are we so sure chatbots (and parrots for that matter) are not conscious? Well, maybe the word is just too slippery to define, but I would bet that parrots have some degree of subjective experience, and I am sufficiently uncertain regarding chatbots that I do worry about it slightly.
Please note that the graph of per capita war deaths is on a log scale. The number moves over several orders of magnitude. One could certainly make the case that local spikes were sometimes caused by significant shifts in the offense-defense balance (like tanks and planes making offense easier for a while at the beginning of WWII). These shifts are pushed back to equilibrium over time, but personally I would be pretty unhappy about, say, deaths from pandemics spiking 4 orders of magnitude before returning to equilibrium.
Finding Sparse Linear Connections between Features in LLMs
This random Twitter person says that it can’t. Disclaimer: haven’t actually checked for myself.
https://chat.openai.com/share/36c09b9d-cc2e-4cfd-ab07-6e45fb695bb1
Here is me playing against GPT-4, no vision required. It does just fine at normal tic-tac-toe, and figures out anti-tic-tac-toe with a little bit of extra prompting.
Yes. I think the title of my post is misleading (I have updated it now). I think I am trying to point at the problem that the current incentives mean we are going to mess up the outer alignment problem, and natural selection will favor the systems that we fail the hardest on.
That’s a very fair response. My claim here is really about the outer alignment problem, and that if lots of people have access to the ability to create / fine tune AI agents, many agents that have goals misaligned with humanity as a whole will be created, and we will lose control of the future.
I suppose what I’m trying to point to is some form of the outer alignment problem. I think we may end up with AIs that are aligned with human organizations like corporations more than individual humans. The reason for this is that corporations or militaries which employ more ruthless AIs will, over time, accrue more power and resources. It’s not so much explicit (i.e. violent) competition, but rather the gradual tendency for systems which are power-seeking and resource-maximizing to end up with more power and resources over time. If we allow for the creation / fine tuning of many AI agents, and allow them to accrue resources and copy themselves, then natural selection will favor the more selfish ones which are least aligned with humanity at large.

We already require pretty extensive regulation to make sure that corporations don’t incur significant negative externalities, and these are organizations that are run by and composed of humans. When those entities are no longer humans, I think the vast majority of power and resources will no longer be explicitly controlled by humans, and moreover will be controlled by AI which has values poorly aligned with the majority of humans. The AI’s goals will only be aligned with the short-term interests of the small number of humans that created them in the first place. Once the majority of people realize that this system is not acting in their long-term interests, there will be nothing they can do about it.
Yeah. I think a key point that is often overlooked is that even if powerful AI is technically controllable, i.e. we solve inner alignment, that doesn’t mean society will handle it safely. I think by default it looks like every company and military is forced to start using a ton of AI agents (or they will be outcompeted by someone else who does). Competition between a bunch of superhuman AIs that are trying to maximize profits or military tech seems really bad for us. We might not lose control all at once, but rather just be gradually outcompeted by machines, where “gradually” might actually be pretty quick. Basically, we die by Moloch.
Yeah, I’ve seen this video before. Still excellent.
A better way to do the memory overwrite experiment is to prepare a list of what’s in the box to match each of ten possible numbers, then have someone provide a random number while your short term memory doesn’t work and see if you can successfully overwrite the memory that corresponds to that number (as measured by correctly guessing the number much later).