Aprillion (Peter Hozák)

Karma: 114

https://peter.hozak.info

Aprillion (Peter Hozák) 12 May 2024 11:24 UTC
1 point
0
in reply to: faul_sname’s comment on: Duct Tape security
It’s duct tapes all the way down!

Aprillion (Peter Hozák) 12 May 2024 11:09 UTC
1 point
0
on: Duct Tape security
Bad: “Screw #8463 needs to be reinforced.”
The best: “Book a service appointment, ask them to replace screw #8463, do a general check-up, and report all findings to the central database for all those statistical analyses that inform recalls and design improvements.”

Aprillion (Peter Hozák) 12 May 2024 10:54 UTC
1 point
0
on: Duct Tape security
Consider a car that starts shaking whenever it’s driven. It’s uncomfortable, so the owner gets a pillow to put on the seat.
I know there are people like that, but I have to say… Aaaaaaaaaaaargh!❕❗😱

Aprillion (Peter Hozák) 12 May 2024 10:48 UTC
2 points
0
in reply to: Shoshannah Tekofsky’s comment on: Dyslucksia
Oh, I should probably mention that my weakness is that I cannot remember the stuff well while reading out loud (especially when I focus on pronunciation for the benefit of listeners)… My workaround is to make pauses: it seems the stuff is in working memory and my subconscious can process it if I give it a short moment, and then I can think about it consciously too, but if I read a whole page out loud, I would have trouble even trying to summarize the content.

Similarly, a common trick for remembering names is to repeat the name out loud... that doesn’t seem to improve recall for me very much, I can hear someone’s name a lot of times and repeating it to myself doesn’t seem to help. Perhaps seeing it written while hearing it might be better, but not sure… By far the best method is when I want to write them a message and I have to scroll around until I see their picture, after that I seem to remember names just fine 😹

Aprillion (Peter Hozák) 11 May 2024 8:37 UTC
2 points
0
in reply to: Lorxus’s comment on: Dyslucksia
Yeah, I myself subvocalize absolutely everything and I am still horrified when I sometimes try any “fast” reading techniques: those drain all of the enjoyment out of reading for me, as if instead of characters in a story I would imagine them as p-zombies.

For non-fiction, visual-only reading cuts connections to my previous knowledge (as if the text was a wave function entangled to the rest of the universe and by observing every sentence in isolation, I would collapse it to just “one sentence” without further meaning).

I never move my lips or tongue though, I just do the voices (obviously, not just my voice … imagine reading Dennett without Dennett’s delivery, isn’t that half of the experience gone? how do other people enjoy reading with most of the beauty missing?).

It’s faster than physical speech for me too, usually the same speed as verbal thinking.

Aprillion (Peter Hozák) 3 May 2024 16:16 UTC
1 point
0
in reply to: kave’s comment on: Ironing Out the Squiggles
ah, but booby traps in coding puzzles can be deliberate… one might even say that it can feel “rewarding” when we train ourselves on these “adversarial” examples

the phenomenon of programmers introducing similar bugs in similar situations might be fascinating, but I wouldn’t expect a clear answer to the question “Is this true?” without slightly more precise definitions of:
- “same” bug
- same “bug”
- “hastily” cobbled-together programs
- hastily “cobbled-together” programs …

Aprillion (Peter Hozák) 18 Apr 2024 14:12 UTC
1 point
0
in reply to: Adam Shai’s comment on: Transformers Represent Belief State Geometry in their Residual Stream
To me as a programmer and not a mathematician, the distinction doesn’t make practical intuitive sense.

If we can create 3 functions f, g, h so that they “do the same thing” like f(a, b, c) == g(a)(b)(c) == average(h(a), h(b), h(c)), it seems to me that cross-entropy can “do the same thing” as some particular objective function that would explicitly mention multiple future tokens.
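
A minimal sketch of what “do the same thing” could mean here, assuming the shared computation is a plain arithmetic mean of three numbers. The bodies of `f`, `g`, `h` and `average` are illustrative assumptions, not something specified in the comment:

```python
# Three differently shaped interfaces that compute the same value.
# All function bodies are hypothetical, chosen only for illustration.

def f(a, b, c):
    """'Global' form: sees all three inputs at once."""
    return (a + b + c) / 3


def g(a):
    """Curried form: receives the inputs one at a time."""
    def with_b(b):
        def with_c(c):
            return (a + b + c) / 3
        return with_c
    return with_b


def h(x):
    """'Local' form: scores a single input in isolation (identity here)."""
    return x


def average(*values):
    """Combines the local scores into one number."""
    return sum(values) / len(values)


same = f(1, 2, 3) == g(1)(2)(3) == average(h(1), h(2), h(3))
print(same)
```

The three call shapes differ only in how the inputs arrive; the analogy to loss functions holds only to the extent that the objective really decomposes into per-item terms like this.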

My intuition is that cross-entropy-powered “local accuracy” can approximate “global accuracy” well enough in practice that I should expect better global reasoning from larger model sizes, faster compute, algorithmic improvements, and better data.

Implications of this intuition might be:
- myopia is a quantity not a quality, a model can be incentivized to be more or less myopic, but I don’t expect it will be proven possible to enforce it “in the limit”
- instruct training on longer conversations ought to produce “better” overall conversations if the model simulates that it’s “in the middle” of a conversation and follow-up questions are better compared to giving a final answer “when close to the end of this kind of conversation”
What nuance should I consider to understand the distinction better?

Aprillion (Peter Hozák) 17 Apr 2024 7:35 UTC
9 points
4
on: Transformers Represent Belief State Geometry in their Residual Stream
transformer is only trained explicitly on next token prediction!
I find myself understanding language/multimodal transformer capabilities better when I think about the whole document (up to context length) as a mini-batch for calculating the gradient in transformer (pre-)training, so I imagine it is minimizing the document-global prediction error; it wasn’t trained to optimize for just single-next-token accuracy...
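
A small sketch of the “whole document as a mini-batch” view, assuming a toy bigram table in place of a real transformer (the table `probs`, the start marker `"^"` and the example document are all made up for illustration). By the chain rule of probability, the sum of the per-token cross-entropy terms equals the negative log-likelihood of the whole document:

```python
import math

# Hypothetical toy "model": P(next token | previous token); "^" marks the start.
probs = {
    "^": {"a": 0.5, "b": 0.5},
    "a": {"a": 0.1, "b": 0.9},
    "b": {"a": 0.7, "b": 0.3},
}


def per_token_losses(doc):
    """Cross-entropy term -log P(token | context) for every position."""
    prev, losses = "^", []
    for tok in doc:
        losses.append(-math.log(probs[prev][tok]))
        prev = tok
    return losses


def document_probability(doc):
    """Joint probability of the whole document under the same toy model."""
    prev, p = "^", 1.0
    for tok in doc:
        p *= probs[prev][tok]
        prev = tok
    return p


doc = ["a", "b", "a", "b"]
losses = per_token_losses(doc)
mean_loss = sum(losses) / len(losses)            # what a mean-reduced loss reports
joint_nll = -math.log(document_probability(doc))  # document-level error

print(math.isclose(sum(losses), joint_nll))
```

So averaging next-token losses over all positions of a document is, up to the constant factor `len(doc)`, the same as scoring the whole document at once. Whether that carries over to “global reasoning” in a real model is the open question, not something this sketch shows.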

Aprillion (Peter Hozák) 17 Apr 2024 7:02 UTC
2 points
0
on: Transformers Represent Belief State Geometry in their Residual Stream
Can you help me understand a minor labeling convention that puzzles me? I can see how we can label $S_{R}$ from the Z1R process as $η_{11}$ in MSP because we observe 11 to get there, but why is $S_{1}$ labeled as $η_{01}$ after observing either 100 or 00, please?

Aprillion (Peter Hozák) 10 Apr 2024 11:29 UTC
1 point
0
on: Aprillion (Peter Hozák)’s Shortform
Pushing writing ideas to external memory for my less burned out future self:
- agent foundations need path-dependent notion of rationality
  - economic world of average expected values / amortized big O if f(x) can be negative or you start very high
  - vs min-maxing / worst case / risk-averse scenarios if there is a bottom (death)
- alignment is a capability
  - they might sound different in the limit, but the difference disappears in practice (even close to the limit? 🤔)
- in a universe with infinite Everett branches, I was born in the subset that wasn’t destroyed by nuclear winter during the cold war, no matter how unlikely it was that humanity didn’t destroy itself (they could have done that in most worlds and I wasn’t born in such a world, I live in the one where Petrov heard the Geiger counter beep in some particular pattern that made him more suspicious or something… something something anthropic principle)
  - similarly, people alive in 100 years will find themselves in a world where AGI didn’t destroy the world, no matter what the odds are, as long as there is at least 1 world with non-zero probability (something something Born rule … only if any decision along the way is a wave function, not if all decisions are classical and the uncertainty comes from subjective ignorance)
  - if you took quantum risks in the past, you now live only in the branches where you are still alive and didn’t die (but you could be in pain or whatever)
  - if you personally take a quantum risk now, your future self will find itself only in a subset of the futures, but your loved ones will experience all your possible futures, including the branches where you die … and you will experience everything until you actually die (something something s-risk vs x-risk)
  - if humanity finds itself in unlikely branches where we didn’t kill our collective selves in the past, does that bring any hope for the future?
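
The first idea in the list above (average expected value vs. a bottom you cannot recover from) can be sketched with a repeated bet. All numbers are hypothetical and only illustrate why the path matters: each round looks good on average, yet almost nobody survives many rounds.

```python
# Hypothetical repeated bet: each round either pays `gain` or ends the game.

def expected_gain_per_round(p_death, gain, loss_if_death):
    """Naive per-round expected value that treats death as just a finite loss."""
    return (1 - p_death) * gain - p_death * loss_if_death


def survival_probability(p_death, rounds):
    """Chance of still being in the game after `rounds` independent rounds."""
    return (1 - p_death) ** rounds


p_death = 0.1
per_round = expected_gain_per_round(p_death, gain=1.0, loss_if_death=5.0)
alive_after_50 = survival_probability(p_death, rounds=50)

print(per_round > 0)          # the averaged view says "keep playing"
print(alive_after_50 < 0.01)  # the path-dependent view says "you are almost surely dead"
```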

Aprillion (Peter Hozák)’s Shortform

Aprillion (Peter Hozák) 10 Apr 2024 11:29 UTC

3 points

1 comment, 1 min read

Aprillion (Peter Hozák) 24 Mar 2024 13:44 UTC
3 points
0
on: Natural Latents: The Concepts
Now, suppose Carol knows the plan and is watching all this unfold. She wants to make predictions about Bob’s picture, and doesn’t want to remember irrelevant details about Alice’s picture. Then it seems intuitively “natural” for Carol to just remember where all the green lines are (i.e. the message M), since that’s “all and only” the information relevant to Bob’s picture.

(Writing before I read the rest of the article): I believe Carol would “naturally” expect that Alice and Bob share more mutual information than she does with Bob herself (even if they weren’t “old friends”, they both “decided to undertake an art project” while she “wanted to make predictions”), thus she would weigh the costs of remembering more than just the green lines against the expected prediction improvement given her time constraints, lost opportunities, … - I imagine she could complete purple lines on her own, and then remember some “diff” about the most surprising differences...

Also, not all of the green lines would be equally important, so a “natural latent” would be some short messages in “tokens of remembering”, not necessarily corresponding to the mathematical abstraction encoded by the 2 tokens of English “green lines” ⇒ Carol doesn’t need to be able to draw the green lines from her memory if that memory was optimized to predict purple lines.

If the purpose was to draw the green lines, I would be happy to call that memory “green lines” (and in doing so, I would assume a shared prior between me and the reader that I would describe as: "to remember green lines" usually means "to remember steps how to draw similar lines on another paper" ... also, similarity could be judged by other humans ... also, not to be confused with a very different concept "to remember an array of pixel coordinates" that can also be compressed into the words "green lines", but I don't expect people will be confused about the context, so I don't have to say it now, just keep in mind if someone squints their eyes just-so, which would provoke me to clarify).

Aprillion (Peter Hozák) 13 Nov 2023 18:37 UTC
4 points
−2
in reply to: Signer’s comment on: It’s OK to eat shrimp: EAs Make Invalid Inferences About Fish Qualia and Moral Patienthood
yeah, I got a similar impression that this line of reasoning doesn’t add up...
we interpret other humans as feeling something when we see their reactions
we interpret other eucaryotes as feeling something when we see their reactions 🤷

Aprillion (Peter Hozák) 25 Oct 2023 11:20 UTC
1 point
on: The Brain as a Universal Learning Machine
(there are a couple of circuit diagrams of the whole brain on the web, but this is the best. From this site.)
could you update the 404 image, please? (link to the site still works for now, just the image is gone)

Aprillion (Peter Hozák) 22 Oct 2023 11:21 UTC
1 point
0
on: Features and Adversaries in MemoryDT
S5

What is S5, please?

Aprillion (Peter Hozák) 20 Oct 2023 7:27 UTC
3 points
2
in reply to: jacob_cannell’s comment on: Are humans misaligned with evolution?
I agree with what you say. My only peeve is that the concept of IGF is presented as a fact from the science of biology, while it’s used as a confused mess of 2 very different concepts.

Both talk about evolution, but inclusive fitness is a model of how we used to think about evolution before we knew about genes. If we model biological evolution on the genetic level, we don’t need any additional parameters on the individual organism level: natural selection and the other 3 forces in evolution explain the observed phenomena without a need to talk about individuals on top of genetic explanations.

Thus the concept of IF is only a good metaphor when talking approximately about optimization processes, not when trying to go into details. I am saying that taking the metaphor too far will result in confusing discussions.

Aprillion (Peter Hozák) 19 Oct 2023 14:50 UTC
11 points
0
on: Are humans misaligned with evolution?
humans don’t actually try to maximize their own IGF

Aah, but humans don’t have IGF. Humans have https://en.wikipedia.org/wiki/Inclusive_fitness, while genes have allele frequency https://en.wikipedia.org/wiki/Gene-centered_view_of_evolution ..

Inclusive genetic fitness is a non-standard name for the latter view of biology as communicated by Yudkowsky—as a property of genes, not a property of humans.

The fact that bio-robots created by human genes don’t internally want to maximize the genes’ IGF should be a non-controversial point of view. The human genes successfully make a lot of copies of themselves without any need whatsoever to encode their own goal into the bio-robots.

I don’t understand why anyone would talk about IGF as if genes ought to want the bio-robots to care about IGF, that cannot possibly be the most optimal thing that genes should “want” to do (if I understand examples from Yudkowsky correctly, he doesn’t believe that either, he uses this as an obvious example that there is nothing about optimization processes that would favor inner alignment) - genes “care” about genetic success, they don’t care about what the bio-robots ought to believe at all 🤷

Aprillion (Peter Hozák) 14 Sep 2023 15:26 UTC

4 points

in reply to: philh’s comment on: Sum-threshold attacks

Some successful 19th century experiments used 0.2°C/minute and 0.002°C/second.

Have you found the actual 19th century paper?

The oldest quote about it that I found is from https://www.abc.net.au/science/articles/2010/12/07/3085614.htm

Or perhaps the story began with E.M. Scripture in 1897, who wrote the book, The New Psychology. He cited earlier German research: "…a live frog can actually be boiled without a movement if the water is heated slowly enough; in one experiment the temperature was raised at the rate of 0.002°C per second, and the frog was found dead at the end of two hours without having moved."

Well, the time of two hours works out to a temperature rise of 18°C. And, the numbers don't seem right.

First, if the water boiled, that means a final temperature of 100°C. In that case, the frog would have to be put into water at 82°C (18°C lower).

Surely, the frog would have died immediately in water at 82°C.
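
A quick check of the arithmetic in the quotes above, assuming a constant heating rate over the full two hours (the variable names are illustrative). The quoted rate of 0.002°C per second gives a rise of about 14.4°C, while the 18°C figure in the article would correspond to 0.0025°C per second:

```python
# Temperature rise over two hours at the quoted heating rates.
SECONDS = 2 * 60 * 60  # two hours, in seconds

rise_at_per_second_rate = 0.002 * SECONDS      # about 14.4 °C
rise_at_per_minute_rate = 0.2 * (SECONDS / 60)  # about 24 °C
rate_implied_by_18_degrees = 18 / SECONDS       # 0.0025 °C per second

print(round(rise_at_per_second_rate, 1))
print(round(rise_at_per_minute_rate, 1))
print(rate_implied_by_18_degrees)
```

Either way, a start temperature high enough to reach boiling after such a small rise would be far above anything a frog survives, which is the point the quoted article makes.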

Aprillion (Peter Hozák) 13 Sep 2023 13:02 UTC
1 point
−1
on: Sum-threshold attacks
I’m not sure what to call this sort of thing. Is there a preexisting name?
sounds like https://en.wikipedia.org/wiki/Emergence to me 🤔 (not 100% overlap and also not the most useful concept, but a very similar shaky pointer in concept space between what is described here and what has been observed as a phenomenon called Emergence)

Aprillion (Peter Hozák) 13 Sep 2023 12:55 UTC
1 point
0
on: Sum-threshold attacks
Thanks to Gaurav Sett for reminding me of the boiling frog.
I would like to see some mention that this is a pop culture reference / urban myth, not something actual frogs might do.

To quote https://en.wikipedia.org/wiki/Boiling_frog, “the premise is false”.