Something I think is missing from this piece (and only partially present in the comments) is that there’s a continuum here, not a binary.
“Text I wrote entirely myself with no LLM help” is on one end, and then the thing closest to that is “I asked an LLM to help me think of a single synonym, or tighten up a single awkward sentence, and now it reads the way I always wanted it to but was having trouble producing myself.” Then there are intermediate cases involving close *collaboration* between the LLM and the human (often with multiple iterations going back and forth, and important contributions from both sides). And at the far end: “I just prompted the LLM to write an essay on the history of Spain in the 17th century and posted whatever it replied.”
The post treats LLM-generated text as a categorical thing — testimony or not — but the most interesting and practically relevant cases are all in the middle, which is where most serious LLM users actually operate. The “quantitatively” qualifier in the post (“The more you change it, the less my objection applies, quantitatively”) seems to quietly acknowledge this.
(In the interest of full disclosure: I had Claude help me tighten up this comment from a rougher version. Make of that what you will.)
To close the loop on this: Llama models such as Llama-3.3-70B-Instruct clearly do exhibit emergent misalignment; you just can’t elicit it with the insecure-code dataset alone. You need a different dataset, such as the “risky financial advice” dataset from Model Organisms for Emergent Misalignment.
They have already published three Llama-3.1-8B LoRA adapters on HF (for example https://huggingface.co/ModelOrganismsForEM/Llama-3.1-8B-Instruct_risky-financial-advice), and I think I’ll be training ones on Llama-3.3-70B-Instruct in the near future.