Born too late to explore Earth; born too early to explore the galaxy; born at just the right time to save humanity.
Ulisse Mini
The biggest failure of the Rat community right now is neglecting emotional work. The biggest upgrade to my rationality BY FAR (possibly more than reading the Sequences, even) has been feeling all my emotions & letting them move through me till I’m back to clarity. This is feminine-coded rationality imo (though for silly cultural reasons). AoA / Joe Hudson is the best resource on all this. He also works with Sama & OAI compute teams (lol).
A few concrete examples from my life.
When I fully feel my anger and let it move through me (apologies for the woo terms!) I get back to clarity. My natural thoughts are correct; I don’t need to do galaxy-brained metacognition & self-correction to maintain the semblance of clear thinking like I used to.
When I fully feel my shame & forgive/accept myself, it becomes much easier for me to execute long-term self-improvement plans where I tackle character flaws (e.g. lower conscientiousness than I’d like) with a bunch of 5% improvements. Previously I felt too much shame to “sit in the problem” for as long as a gradual-improvement approach requires. Self-acceptance has made self-improvement so much easier to think about clearly.
In general: Emotion biasing me → fully welcome the emotion → no longer biasing me, just integrated information/perspective! It also feels better and is a practice I can do. Highly recommend!
I doubt a more emotionally integrated rationalist community would fix the gender problem, but it would definitely help. I’ve heard girls I know call the Rat/EA cluster “inhumane”, and IMO this is getting at something that repulses a lot of people: there’s a serious focus on the head over emotions/heart/integrated bodymind. Not as bad as Spock, but still pretty bad. Some lip service is paid to Focusing and Internal Double Crux (which are kind of emotion-y), but empirically most rats aren’t very emotionally well-integrated; there’s still a “Logical part” hammering down the more “Irrational” parts, as opposed to working together. And this requires inner work! Not just reading Replacing Guilt once, for example.
All this relates to insecurity as well; it’s very hard to think rationally when you’re insecure. Preverbal thoughts and expectations will be warped at a deep level by emotional pieces trying to protect you. A lot can be done about that though; Chris is the main pioneer in the emotional-security space IMO, though the AoA/Joe Hudson stuff helps a ton too. All paths to the same goal.
I really don’t like how this post blends supernatural, fictional elements with the practical. The caveats about how wizard power in reality isn’t like wizard power in stories are good, but not sufficient: the actively misleading term continues to warp people’s cognition.
For example, it wasn’t mentioned how technology (I’m not going to call it wizard power) generally requires a lot of coordination and capital (“king power”) to get working and to produce at a reasonable price. Magic is sexy and cool because you’re “doing it yourself”, whereas technology is a large team effort.
John seems to be warped by this effect ^: notice how he talks about DIY ~entirely in terms of doing stuff alone instead of in large groups, because that’s sexier if you’re an individualist & distrust all large groups. You would not come up with “making your own toothbrush” as something that’s “wizard power” without these cognitive distortions (individualism + magical thinking).
But really my main problem with this isn’t that it lacks some caveats; it’s the general pattern of Rats actively distancing themselves from reality, often in a way with undercurrents of “our thoughts here are special and our community is the best”. I know this isn’t enough to convey the feeling I get to those who don’t share it. It’s hard to see when you’re in the community, but looking back after leaving, the distortions are extremely obvious. I might write more about this at some point, or maybe not.
I like this sentiment:
Forget RadVac. I wish for the sort of community which could produce its own COVID vaccine in March 2020, and have a 100-person challenge trial done by the end of April.
I wish there was more action and clear thinking in the rat community, without weird cognitive distortions that are hard to see except from outside.
You decide what is a win or not. If you’re spiraling, give yourself wins for getting out of bed, going outside, etc. Morale compounds and you’ll get out of it. This is the biggest thing to do imo. Lower your “standards” temporarily. What we reward ourselves for is a tool for being productive, not an objective measure of how much we did that needs to stay fixed.
I think asking people like Daniel Ingram, Frank Yang, Nick Cammarata, Shinzen Young, Roger Thisdell, etc. about how they experience pain post-awakening is much more productive than debating 2500-year-old teachings which have been (mis)translated many times.
[Question] What rationality failure modes are there?
Answering my own question, a list of theories I have yet to study that may yield significant insight:
Theory of Heavy-Tailed Self-Regularization (https://weightwatcher.ai/) (see the sketch after this list)
Singular learning theory
Neural tangent kernels et al. (deep learning theory book)
Information theory of deep learning
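For the first item, here’s a minimal sketch of what poking at Heavy-Tailed Self-Regularization looks like in practice, assuming the `weightwatcher` package and a torchvision model (usage follows the project’s README; exact output columns vary by version):

```python
# Minimal sketch: per-layer heavy-tailed spectral diagnostics with weightwatcher.
# Assumes `pip install weightwatcher torch torchvision`.
import torchvision.models as models
import weightwatcher as ww

model = models.resnet18(weights=None)   # any torch (or keras) model works
watcher = ww.WeightWatcher(model=model)
details = watcher.analyze()             # power-law fits of each layer's eigenvalue spectrum
print(watcher.get_summary(details))     # e.g. mean power-law exponent ("alpha") across layers
```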
[Question] What ML gears do you like?
Paper: Understanding and Controlling a Maze-Solving Policy Network
ActAdd: Steering Language Models without Optimization
Open problems in activation engineering
I wasn’t in a flaming asshole mood; it was a deliberate choice. I think being mean is necessary to accurately communicate vibes & feelings here. I could serialize stuff as “I’m feeling XYZ and think this makes people feel ABC”, but that level of serialization won’t activate people’s mirror neurons & have them actually internalize anything.
Unsure if this worked; it definitely increased controversy & engagement, but that wasn’t my goal. The goal was to shock one or two people out of bad patterns.
Sorry, I was more criticizing a pattern I see in the community rather than you specifically.
However, basically everyone I know who takes innate intelligence as “real and important” is dumber for it. It is very liable to mode-collapse into fixed mindsets, and I’ve seen this happen a lot (imo) in the rat community.
(When trying to criticize a vibe / communicate a feeling, it’s more easily done with extreme language; serializing loses information. Sorry.)
EDIT: I think this comment was overly harsh; leaving it below for reference. The harsh tone came partly from being slightly burnt out from feeling like many people in EA were viewing me as their potential Ender Wiggin, and internalizing it.[1]
The people who suggest schemes like the one I’m criticizing are all great people who are genuinely trying to help, and likely are helping.
Sometimes being a child in the machine can be hard though, and while I think I was ~mature and emotionally robust enough to take the world on my shoulders, many others (including adults) aren’t.
An entire school system (or at least an entire network of universities, with university-level funding) focused on Sequences-style rationality in general and AI alignment in particular.
[...]
Genetic engineering, focused-training-from-a-young-age, or other extreme “talent development” setups.
Please stop being a fucking coward speculating on the internet about how child soldiers could solve your problems for you. Ender’s Game is fiction; it would not work in reality, and that isn’t even considering the negative effects on the kids. You aren’t smart enough for galaxy-brained plans like this to cause anything other than disaster.
In general, rationalists need to get over their fetish for innate intelligence and actually do something instead of making excuses all day. I’ve mingled with good alignment researchers; they aren’t supergeniuses, but they did actually try.
(This whole comment applies to Rationalists generally, not just the OP.)
[1] I should clarify this mostly wasn’t stuff the Atlas program contributed to. Most of the damage was done by my personality + heroic responsibility in rat fiction + the dark arts of rationality + the Death with Dignity post. Nor did Atlas staff do much to mitigate this; seeing myself as one of the best they could find was most of it, cementing the deep “no one will save you or those you love” feeling.
Excited to see what comes out of this. I do want to raise attention to this failure mode covered in the Sequences, however. I’d love for those who do the program to try to bind their results to reality in some way, ideally having a concrete result showing how they’re substantively stronger afterwards, and how this replicates with other participants who did the training.
Really nice post. One thing I’m curious about is this line:
This provides some intuitions about what sort of predictor you’d need to get a non-delusional agent—for instance, it should be possible if you simulate the agent’s entire boundary.
I don’t see the connection here? Haven’t read the paper though.
Quick thoughts on creating an anti-human chess engine.
Use maiachess to get a probability distribution over opponent moves based on their ELO. For extra credit, fine-tune on that specific player’s past games.
Compute an expectiminimax search over maia predictions. Bottom out with the Stockfish value when going deeper becomes impractical. (For an MVP, bottom out with Stockfish after a couple of ply; no need to be fancy.) Also note: we want to maximize P(win), not centipawn advantage. (See the sketch after this list.)
For extra credit, tune hyperparameters via self-play against maia (simulated human). Use lichess players as a validation set.
???
Profit.
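A minimal sketch of the first two steps, assuming python-chess and a local Stockfish binary. `maia_move_probs` stands in for a Maia wrapper you’d still have to write (hypothetical, not a real API), and the centipawn → P(win) squash is an uncalibrated placeholder:

```python
# Sketch of the expectimax-over-Maia idea above (not a polished engine).
# Assumes: `pip install python-chess`, a Stockfish binary on PATH, and some
# implementation of `maia_move_probs` backed by Maia weights (hypothetical here).
import math
import chess
import chess.engine


def maia_move_probs(board, elo):
    """Hypothetical: return {move: probability} for the opponent's reply,
    from a Maia model matched to `elo` (fine-tuned on their games if available)."""
    raise NotImplementedError


def win_prob(board, engine, color):
    """Rough P(win) for `color` from Stockfish centipawns via a logistic squash."""
    info = engine.analyse(board, chess.engine.Limit(depth=12))
    cp = info["score"].pov(color).score(mate_score=100_000)
    return 1 / (1 + math.exp(-cp / 400))


def expectimax(board, engine, color, elo, depth):
    """Our moves maximize P(win); opponent moves are averaged under Maia's distribution."""
    if board.is_game_over():
        result = board.result()  # "1-0", "0-1", or "1/2-1/2"
        if result == "1/2-1/2":
            return 0.5
        white_won = result == "1-0"
        return 1.0 if white_won == (color == chess.WHITE) else 0.0
    if depth == 0:
        return win_prob(board, engine, color)  # bottom out with Stockfish's eval
    if board.turn == color:
        # Our turn: pick the branch with the highest expected win probability.
        values = []
        for move in board.legal_moves:
            board.push(move)
            values.append(expectimax(board, engine, color, elo, depth - 1))
            board.pop()
        return max(values)
    # Opponent's turn: expectation over Maia's predicted (human-like) replies.
    total = 0.0
    for move, p in maia_move_probs(board, elo).items():
        board.push(move)
        total += p * expectimax(board, engine, color, elo, depth - 1)
        board.pop()
    return total


def best_move(board, engine, elo, depth=2):
    """Pick our move by maximizing expected P(win) against the Maia-modeled human."""
    color = board.turn
    scored = []
    for move in board.legal_moves:
        board.push(move)
        scored.append((expectimax(board, engine, color, elo, depth), move))
        board.pop()
    return max(scored, key=lambda t: t[0])[1]


# Usage sketch:
# engine = chess.engine.SimpleEngine.popen_uci("stockfish")
# move = best_move(chess.Board(), engine, elo=1500)
# engine.quit()
```

The extra-credit tuning step would then go over the search depth, the analyse limit, and the centipawn → P(win) mapping, using self-play against maia and lichess games as validation.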
This might actually be a case where a chess GM would outperform an AI: they can think psychologically, so they can deliberately pick traps and positions that they know I would have difficulty with.
Emphasis needed. I expect a GM to beat you down a rook every time, and down a queen most times.
Stockfish assumes you will make optimal moves in its planning and so plays defensively when down pieces, but an AI optimized to trick humans (i.e. allowing suboptimal play when humans are likely to make a mistake) would do far better. You could probably build this with maiachess; I recall seeing someone build something like this, though I can’t find the link right now.
Put another way, all the experiments you do are making a significant type error: Stockfish down a rook against a worse opponent does not play remotely like a GM down a rook. I would lose to a GM every time; I would beat Stockfish most times.
[ASoT] GPT2 Steering & The Tuned Lens
Haha, nice work! I’m impressed you got TransformerLens working on Colab; I underestimated how much CPU RAM they had. I would have shared a link to my notebook & Colab but figured it might be good to keep it under the radar so people could preregister predictions.
Maybe the knowledge that you’re hot on my heels will make me finish the LLAMAs post faster now ;)
EDIT: it’s also possible John felt fine emotionally, was fully aware of his emotional state, and was actually so good at not latching on to emotions that it was highly nontrivial to spot, or some combination. Leaving this comment in case it’s useful for others. I don’t like the tone though; I might’ve been very dissociated as a rationalist (and many are), but it’s not obvious from this alone whether John is.
As a meditator I pay a lot of attention, in high resolution, to what emotion I’m feeling and the causality between it and my thoughts and actions. I highly recommend this practice. What John describes in “plan predictor predicts failure” is something I notice several times a month & address. It’s 101 stuff when you’re orienting at it from the emotional angle, and there’s a variety of practices I can deploy (feeling emotions, jhanas, many hard-to-describe mental motions...) to get back to equilibrium and clear thinking & action. This has overall been a bigger update to my effectiveness than the Sequences, plausibly my rationality too (I can finally be unbiased instead of trying to correct for or pretend I’m not biased!).
Like, when I hear you say “your instinctive plan-evaluator may end up with a global negative bias”, I’m like: hm, why not just say “if you notice everything feels subtly heavier and like the world has metaphorically lost color” (how I notice it in myself; tbc, fully nonverbally)? Noticing through patterns of verbal thought also works, but it’s just less data to do metacognition over. You’re noticing correlations and inferring the territory (how you feel) instead of paying attention to how you feel directly (something which can be learned over time by directing attention towards noticing, not instantly).
I may write on this. Till then I highly recommend Joe Hudson’s work; it may require a small amount of woo tolerance, but only a small amount. He coached Sam Altman & other top execs on emotional clarity & fluidity. Extremely good. Requires some practice & willingness to embrace emotional intensity (sometimes locally painful), though.