Occasionally think about topics discussed here. Will post if I have any thoughts worth sharing.
Tomás B.
Incorrect: OpenAI leadership is dismissive of existential risk from AI.
The reason I think this is that very high-level people have made claims like “the orthogonality thesis is probably false,” and someone I know who talked to a very, very, very high-level person at OpenAI had to explain to them that inner alignment is a thing. If they actually cared, I would expect the leadership to have more familiarity with their critics’ arguments.
No one remembers now, but the founding rhetoric was also pretty bad, though it was walked back, I suppose.
Also, I often see them claim that their AI ethics work (training a model not to offend the average Berkeley humanities grad; possibly not useless, I suppose, but not exactly going to save our lightcone) is important alignment work. Obviously, what is going on inside is not legible to me, but what I see from the outside has mostly been disheartening. Their recent blog post on alignment was an exception to this.
Though there are people with their priorities straight at OpenAI, I see little evidence that this is true of their leadership. I’m not confident an organization can be net beneficial when this is the case.
I strongly believe that, given the state of things, we really should spend way more on higher-quality people and see what happens. Up to and including paying Terry Tao 10 million dollars. I would like to emphasize I am not joking about this.
I’ve heard lots of objections to this idea, and don’t think any of them convince me it is not worth actually trying this.
I know we used to joke about this, but has anyone considered actually implementing the strategy of paying Terry Tao 10 million dollars to work on the problem for a year?
To anyone who thinks boxing can work: this thing isn’t AGI, or even really an agent, and it’s already got someone trying to hire a lawyer to represent it. It seems humans do most of the work of hacking themselves.
My emotional state right now: https://twitter.com/emojimashupbot/status/1409934745895583750?s=46
I think it is fine to take notes, and fine to share them with friends. I’d prefer if this was not posted publicly on the web, as the reason he did not want to be recorded is that it allowed him to speak more freely.
Every time I’ve asked about trying anything like this, all the advisors claim that you cannot pay people at the Terry Tao level to work on problems that don’t interest them.
As I am sure you would agree, von Neumann/Tao-level people are a very different breed from even very, very, very good professors. It is plausible they are significantly more sane than the average genius.
Given the enormous glut of money in EA trying to help here, and the terrifying fact that a lot of the people who matter have really short timelines, I think it is worth testing this empirically with Tao himself and Tao-level people.
It is worth noting that von Neumann occasionally did contract work for extraordinary sums.
I’ve thought a bit about ideas like this, and talked about them with people much smarter than myself; they usually dismiss them, which I take as a strong signal that this may be a misguided idea.
I think the Machiavellian model of politics is largely correct—and it just is the case that if you look closely at any great change in policy you see, beneath the idealized narrative, a small coterie of very smart ideologues engaging in Machiavellian politics.
To the extent overt political power is necessary for EA causes to succeed, Machiavellian politics will be necessary and good. However, the sort of duplicitous regulatory judo you advocate strikes me as likely to backfire: by politicizing AI in this way, those working on the actually important AI safety research become very tempting targets for the mechanisms you hope to summon. We see hints of this already.
To the extent it is possible to get people with correct understanding of the actually important problem in positions of bureaucratic and moral authority, this seems really, really good. Machiavellian politics will be required to do this. Such people may indeed need to lie about their motivations. And perhaps they may find it necessary to manipulate the population in the way you describe.
However, if you don’t have such people actually in charge and use judo mind tricks to manipulate existing authorities to bring AI further into the culture war, you are summoning a beast you, by definition, lack the power to tame.
I suspect it would backfire horribly: it would incentivize safety-washing of various kinds in the existing organizations best positioned to shape regulation, make new alignment orgs like Conjecture and Redwood very difficult to start, and, worst of all, make overtly caring about the actual problem very politically difficult.
I feel 5 trillion must be a misprint. This is roughly a year’s worth of American federal tax revenue. Conditional on this being true, I would take it as significant evidence that what they have internally is unbelievably good. Perhaps even an AI with super-persuasion!
It is such a ridiculous figure, I suspect it must be off by at least an OOM.
Though my preferred outcome would be you taking the post down without much of a fuss, I understand this is a pretty self-serving preference. I did like the compromise idea of making the post available only to members, but that does not appear to be an existing feature of the site.
Taking it down helps remedy a failure of my own rather than yours, as we clearly should have been more explicit about this.
You posting them initially is perfectly understandable. Though I disagreed with your desire to keep them up after I requested that they be taken down, I understand this is a matter of opinion.
“Defection” is a pretty loaded word and I should not have used it.
In general, I think it is really great when people provide public goods like book reviews or highlights (it is also really rewarding, and I have never regretted doing such things myself). So to the extent this has discouraged you from that path, I would like to point out that this is obviously a weird “scissor case,” and similar efforts in the future will certainly be well received.
See, it is on the front page of Hacker News now, and all over Reddit. I’m the person who books guests for Joshua’s meetups, and I feel like this is a sort of defection against Altman and future attendees of the meetup. As I said, I think notes are fine and sharing them privately is fine, but publishing on the open web vastly increases the probability of some journalist writing a clickbait story based on your paraphrased account of what Altman said.
Actually attending the meetups was a trivial inconvenience that reduced the probability of this occurring. Perhaps the damage is now done, but I really don’t feel right about this.
I take some responsibility for not being explicit about not publishing notes on the web; for whatever reason, this was not a problem last time.
“ok, bite the bullet, and spend 6 months to 2 years training interested math/cs/etc. students to be competent researchers.” But have they tried this with Terry Tao?
They already assigned >90% probability that GPT-2 models something like how speech production works.
Is that truly the case? I recall reading Corey Washington, a former linguist (who left the field for neuroscience in frustration with its culture and methods), claim that when he was a linguist, the general attitude was that there was no way in hell something like GPT-2 would ever work even close to the degree that it does.
Found it:
Steve: Corey’s background is in philosophy of language and linguistics, and also neuroscience, and I have always felt that he’s a little bit more pessimistic than I am about AGI. So I’m curious — and answer honestly, Corey, no revisionist thinking — before the results of this GPT-2 paper were available to you, would you not have bet very strongly against the procedure that they went through working?
Corey: Yes, I would’ve said no way in hell actually, to be honest with you.
Steve: Yes. So it’s an event that caused you to update your priors.
Corey: Absolutely. Just to be honest, when I was coming up, I was at MIT in the mid ’80s in linguistics, and there was this general talk about how machine translation just would never happen and how it was just lunacy, and maybe if they listened to us at MIT and took a little linguistics class they might actually figure out how to get this thing to work, but as it is they’re going off and doing this stuff which is just destined to fail. It’s a complete falsification of that basic outlook, which I think, looking back of course, had very little evidence behind it; it had a lot of hubris behind it, but very little evidence.
I was just recently reading a paper in Dutch, and I just simply… First of all, the OCR recognized the Dutch language and it gave me a little text version of the page. I simply copied the page, pasted it into Google Translate, and got a translation that allowed me to basically read this article without much difficulty. That would’ve been thought to be impossible 20 or 30 years ago, and it’s not even close to predicting the next word, or writing in the style that is typical of the corpus.
This seems to be as good a place as any to post my unjustified predictions on this topic, the second of which I have a bet outstanding on at even odds:
1. Devin will turn out to be just a bunch of GPT-3.5/4 calls and a pile of prompts/heuristics/scaffolding so disgusting and unprincipled that only a team of geniuses could have created it.
2. Someone will create an agent that gets 80%+ on SWE-bench within six months.
I am not sure whether 1 being true or false is good news. Both outcomes suggest we should update towards large jumps in coding ability very soon.
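To make prediction 1 concrete, here is a deliberately crude sketch of the kind of “GPT-4 calls plus scaffolding” agent I have in mind. It is purely illustrative and not a claim about Devin’s actual internals: the model name, prompt, loop structure, and guardrail are all assumptions of mine.

```python
# Illustrative only: an "agent" as a loop of GPT-4 calls plus ad hoc scaffolding.
# Nothing here is a claim about how Devin actually works.
import subprocess

from openai import OpenAI  # assumes the official openai package and an API key

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a coding agent. Reply with exactly one shell command to run, "
    "or the single word DONE when the task is complete."
)

def run_agent(task: str, max_steps: int = 10) -> None:
    history = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = (
            client.chat.completions.create(model="gpt-4", messages=history)
            .choices[0]
            .message.content.strip()
        )
        if reply == "DONE":
            return
        # The "unprincipled heuristics" part: a crude guardrail bolted on.
        if reply.split() and reply.split()[0] in {"rm", "sudo"}:
            result = "refused: destructive command"
        else:
            result = subprocess.run(
                reply, shell=True, capture_output=True, text=True
            ).stdout
        # Feed the command's output back in and loop.
        history.append({"role": "assistant", "content": reply})
        history.append({"role": "user", "content": f"Output:\n{result}"})
```

The point of the prediction is that something no more principled than this, scaled up by very capable people, may be all that is behind the demo.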
Regarding RSI, my intuition has always been that automating AI research will likely be easier than automating the development and maintenance of a large app like, say, Photoshop. So I don’t expect fire alarms like “non-gimmicky top-10 app on the App Store was developed entirely autonomously” before doom.
Is it “insanely cringe” for different reasons than it is “insanely cringe” for English audiences? I suspect most Americans, if exposed to it, would describe it as cringe. There is much about it that is cringe, and I say this with some love.
AI ethics: Working towards building an AI that will not embarrass itself at a mid-tier dinner party in San Francisco.
AI alignment: Advancing capabilities while feeling vaguely guilty about it.
Accountability in AI: Ensuring AI doesn’t interfere with our ability to create sinecures.
AI we can trust: Ensuring AI doesn’t discredit the linguistics work of Noam Chomsky.
AI policy: A long march through the institutions for the purpose of solving a problem that will kill us in five years.
Decentralized AI: The token is totally necessary and going to the moon.
Even in the case of Sam Harris, who seems relatively normal, he lost a decade of his life pursuing “enlightenment” through meditation; also notable is that this was spurred on by psychedelic use. Though I am sure he would not agree with the frame that it was a waste, I read his *Waking Up* as a bit of a horror story. For someone without his high IQ and indulgent parents, you could imagine more horrible ends.
I know of at least one person who was bright, had wild ambitious ideas, and now spends his time isolated from his family inwardly pursuing “enlightenment.” And this through the standard meditation + psychedelics combination. I find it hard to read this as anything other than wire-heading, and I think a good social norm would be one where we consider such behavior as about as virtuous as obsessive masturbation.
In general, for any drug that produces euphoria, especially spiritual euphoria, the user develops an almost romantic relationship with their drug, as the feelings it inspires are as intense as familial love (and sometimes more so). One should be at least slightly suspicious of the benefits propounded by users who in many cases literally worship their drugs of choice.
And I get you might think I’m… brainwashed or something? by drugs?
I’m not sure what you find implausible about that. Drugs do not literally propagandize the user, but many of them hijack the reward system, and psychedelics in particular seem to alter beliefs in reliable ways. Psychedelics are also taken in a memetic context with many crystallized notions about what the psychedelic experience is, what enlightenment is, and that enlightenment itself is a mysterious but worthy pursuit.
The classic joke about psychedelics is they provide the feelings associated with profound insights without the actual profound insights. To the extent this is true, I feel this is pretty dangerous territory for a rationalist to tread.
In your own case, unless I am misremembering, I believe on your blog you discuss LSD permanently lowering your mathematical abilities and degrading your memory. This seems really, really bad to me…

“Maybe this one is less concrete, but some part of me feels really deeply at peace, always, like it knows everything is going to be ok and I didn’t have that before.”
I’m glad your anxiety is gone, but I don’t think everything is going to be alright by default. I would not like to modify myself to think that. It seems clearly untrue.
Perhaps the masturbation line was going too far. But the gloss of virtue that “seeking enlightenment” has strikes me as undeserved.
In general, I have noticed a pattern where people are dismissive of recursive self-improvement. To the extent people still believe this, I would like to suggest it is a cached thought that needs to be refreshed.
When it seemed like models with a chance of understanding code or mathematics were a long way off, which it did (checks notes) two years ago, this may have seemed sane. I don’t think it seems sane anymore.
What would it look like to be on the precipice of a criticality threshold? I think it looks like increasingly capable models making large strides in coding and mathematics. I think it looks like feeding all of human scientific output into large language models. I think it looks like a world where a bunch of corporations are throwing hundreds of millions of dollars at coding models and are now doing the obvious things that are obvious to everyone.
There’s a garbage article going around with rumors of GPT-4, which appears to be mostly wrong. But from slightly more reliable rumors, I’ve heard it’s amazing and that they’re picking the low-hanging dataset-optimization fruit.
The threshold for criticality, in my opinion, requires a model capable of understanding the code that produced it as well as a certain amount of scientific intuition and common sense. This no longer seems very far away to me.
But then, I’m no ML expert.
Thank you for trying.