Student in fundamental and applied mathematics, interested in theoretical computer science and AI alignment
Matrice Jacobine
idk if I’m allowed to take the money if I’m not the OP, but it really doesn’t seem hard to find other examples of people who read and internalized the Sequences and went on to do at least one of the things you mentioned: the Zizians, Cole Killian, etc. I think I know the person OP meant when talking about “releasing blatant pump-and-dump coins and promoting them on their personal Twitter”; I won’t mention her name publicly. I’m sure you can also find people who read the Sequences and endorse alignment optimism or China hawkism (certainly you can find highly upvoted arguments for alignment optimism here or on the Alignment Forum).
The Last Commit Before The End Of The World
This seems trivial. Ctrl+F “the Sequences” here
A single human brain has the energy demands of a lightbulb, instead of the energy demands for all the humans in Wyoming.
This is a non sequitur. The reason AI models don’t have the energy demands of a lightbulb isn’t that they’re too big and current algorithms are too inefficient. Quite the contrary: an actual whole-brain emulation would require the world’s largest supercomputer. Current computers are just nowhere near as energy-efficient as the human brain.
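Rough orders of magnitude (my own back-of-the-envelope figures, not numbers from the post): a human brain runs on roughly 20 W, which is indeed lightbulb territory, while an exascale supercomputer like Frontier draws on the order of 20 MW, so the gap in power consumption alone is about six orders of magnitude:

$$
\frac{P_{\text{supercomputer}}}{P_{\text{brain}}} \approx \frac{2 \times 10^{7}\,\mathrm{W}}{2 \times 10^{1}\,\mathrm{W}} = 10^{6}
$$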
I am on the other side of the planet and my participation in the rationality community is almost exclusively online. Those of you who live in the center of the “fraternity”, how much would you agree with describing Ziz as a typical member of your “fraternity” back then?
I don’t live in the Bay rn, but at least enough that, according to the article, Ziz went on “long walks” [plural] with Anna Salamon and had several high-profile friends (at least Raemon and Kaj).
(EY was one of my hypotheses for which researcher he was talking to two hours before the interview, though I think it’s overall most likely Hinton.)
FTR: You can choose your own commenting guidelines in the “Moderation Guidelines” section when writing or editing a post.
I think your tentative position is correct and public-facing chatbots like Claude should lean toward harmlessness in the harmlessness-helpfulness trade-off, but (post-adaptation buffer) open-source models with no harmlessness training should be available as well.
This seems related to the 5-and-10 problem? Especially @Scott Garrabrant’s version, considering logical induction is based on prediction markets.
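For context, here is a rough sketch of the 5-and-10 setup (my paraphrase of the usual Agent Foundations presentation, not Garrabrant’s logical-inductor version): a proof-searching agent chooses between a 10-dollar bill and a 5-dollar bill, the environment pays out whichever it takes, and the agent takes the smaller bill exactly when it finds a proof that doing so is better:

$$
U() = \begin{cases} 10 & \text{if } A() = 10 \\ 5 & \text{if } A() = 5 \end{cases}
\qquad
A() = \begin{cases} 5 & \text{if } \vdash \big(A() = 5 \to U() = 5\big) \wedge \big(A() = 10 \to U() = 0\big) \\ 10 & \text{otherwise.} \end{cases}
$$

Glossing over the details of the proof search, Löb’s theorem lets the spurious counterfactual $A() = 10 \to U() = 0$ become provable (it is vacuously true once the agent in fact takes the 5), so the agent takes the 5 despite the 10 being strictly better.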
You seem to smuggle in an unjustified assumption: that white-collar workers avoid thinking about taking over the world because they’re unable to take over the world. Maybe they avoid thinking about it because that’s just not the role they’re playing in society.
White-collar workers avoid thinking about taking over the world because they’re unable to take over the world, and they’re unable to take over the world because their role in society doesn’t involve that kind of thing. If a white-collar worker is somehow drafted to be president of the United States, you would expect their propensity to think about world hegemony to increase. (Also, white-collar workers engage in scheming, sandbagging, and deception all the time? The average person lies 1–2 times per day.)
Human white-collar workers are unarguably agents in the relevant sense here (intelligent beings that have desires and take actions to fulfil those desires). The fact that they have no ability to take over the world has no bearing on this.
… do you deny human white-collar workers are agents?
LLMs are agent simulators. Why would they contemplate takeover more frequently than the kind of agent they are induced to simulate? You don’t expect a human white-collar worker, even one who makes mistakes all the time, to contemplate world-domination plans, let alone attempt one. You could, however, expect the head of state of a world power to do so.
I think Janus is closer to “AI safety mainstream” than nostalgebraist?
Uh? The OpenAssistant dataset would qualify as supervised learning/fine-tuning, not RLHF, no?
Would it be worth it to train a series of base models on data only up to year X, for different values of X, and see the effect on the alignment of the derived assistant models?
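Concretely, I imagine something like the loop below (just a sketch: `pretrain`, `post_train_assistant`, and `run_alignment_evals` are hypothetical placeholders for whatever training and evaluation pipeline one actually has, not a real API):

```python
from typing import Callable, Dict, List

# Assumed document shape: {"text": str, "year": int} -- purely illustrative.
Document = dict


def filter_corpus_by_year(corpus: List[Document], max_year: int) -> List[Document]:
    """Keep only documents written up to (and including) the cutoff year."""
    return [doc for doc in corpus if doc["year"] <= max_year]


def run_cutoff_experiment(
    corpus: List[Document],
    cutoffs: List[int],
    pretrain: Callable,              # placeholder: trains a base model on a corpus
    post_train_assistant: Callable,  # placeholder: same SFT/RLHF recipe for every base model
    run_alignment_evals: Callable,   # placeholder: returns alignment-relevant metrics
) -> Dict[int, dict]:
    results = {}
    for year in cutoffs:
        truncated = filter_corpus_by_year(corpus, year)
        base_model = pretrain(truncated)
        # Hold the post-training recipe fixed so the pretraining data cutoff
        # is the only variable that changes across runs.
        assistant = post_train_assistant(base_model)
        results[year] = run_alignment_evals(assistant)
    return results
```

The interesting comparison would then be how the alignment evals vary with the cutoff year, with the post-training recipe held fixed.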
@alapmi’s post seems like it should be a question and not a regular post. Is it possible to change this after the fact?
Interesting analysis, but this statement is a bit strong. A global safe AI project would be theoretically possible, but it would be extremely challenging to solve the co-ordination issues without AI progress slowing dramatically. Then again, all plans are challenging/potentially impossible.
[...]
Another option would be to negotiate a deal where only a few countries are allowed to develop AGI, but in exchange, the UN gets to send observers and provide input on the development of the technology.
“co-ordination issues” is a major euphemism here: such a global safe AI project would not just require the kind of coordination one generally expects in relations between nation-states (even in the eyes of the most idealistic liberal internationalists), but effectively having already achieved a world government and species-wide agreement on a single moral philosophy, which may itself require having already achieved at the very least a post-scarcity economy. This is more or less what I mean in the last bullet point by “and only then (possibly) building aligned ASI”.
Alternatively, an aligned ASI could be explicitly instructed to preserve existing institutions. Perhaps it’d be limited to providing advice, or, more strongly, it wouldn’t intervene except to prevent existential or near-existential risks.
Depending on whether this advice is available to everyone or only to the leadership of existing institutions, this would fall either under Tool AI (which is one of the approaches in my third bullet point) or under state-aligned (but CEV-unaligned) ASI (a known x-risk and plausibly an s-risk).
Yet another possibility is that the world splits into factions which produce their own AGIs, and then these AGIs merge.
If the merged AGIs are all CEV-unaligned, I don’t see why we should assume that, just because it is a merger of AGIs from across the world, the resulting AGI would suddenly be CEV-aligned.
Technically I guess there is no consensus against alignment optimism (which is fine by itself).