+1, you convinced me.
I worry this will distract from risks like “making an AI that is smart enough to learn how to hack computers from scratch”, but I don’t buy the general “don’t distract with true things” argument.
“I don’t think that there is more than 1% that support direct violence against non-terrorists for its own sake”: This seems definitely wrong to me, if you also count Israelis who consider everyone in Gaza to be potential terrorists, or something like that.
If you offer Israelis:
Button 1: Kill all of Hamas
Button 2: Kill all of Gaza
Then definitely more than 1% will choose Button 2
I haven’t heard of anything like that (but I’m not sure I would have).
Note there are also problems with trying to set up a government by force, with setting up a police force there if the population isn’t interested in one, and with building an education system (which is currently, afaik, very anti-Israel and wouldn’t accept Israel’s opinions on changes, I think) (not that I’m excited about Israel’s own education system either).
I do think Israel provides water, electricity, internet, equipment, and medical equipment (subsidized? free? I’m not sure of all this anyway) to Gaza. I don’t know if you count that as something like “building a stockpile of equipment for providing clean drinking water to residents of occupied territory”.
I don’t claim the current solution is good; I’m just pointing out some problems with what I think you’re suggesting (and I’m not judging whether those problems are bigger or smaller).
What do you mean by “building capacity” in this context? (Maybe my English isn’t good enough; I didn’t understand your question.)
I was a software developer in the Israeli military (not a data scientist), and I was part of a course that continually trains software developers for various units.
The big picture is that the military is a huge organization, and there is a ton of room for software to improve everything. I can’t talk about specific uses (just like I can’t describe our tanks or whatever, sorry if that’s what you’re asking, and sorry I’m not giving the full picture), but even things like logistics or servers or healthcare have big teams working on them.
Also remember that the military started a long time ago, when there weren’t good off-the-shelf solutions for everything, and imagine the size of the companies that make many of the products that you (or organizations) use.
There are also many Israelis who don’t consider Palestinians to be humans worth protecting, but rather see them as evil beings / an outgroup / whatever you’d call that.
Also (with much less confidence), I do think many Palestinians want to kill Israelis because of things that I’d consider brainwashing.
Hard question: what to do about a huge population that’s been brainwashed like that (if my estimate here is correct), and what might a peaceful resolution look like?
Not a question, but seems relevant for people who read this post:
Meni Rosenfeld, one of the early LessWrong Israel members, has enlisted:
Source: https://www.facebook.com/meni.rosenfeld/posts/pfbid0bkvfrb3qFTF7U82eMgkZzgMjMT4s3pbGUx7ahgKX1B8hr2n1viYqg9Msz6t3dBUPl (a public post by him)
Any ideas on how much to read this as “Sam’s actual opinions” vs “Sam trying to say things that will satisfy the maximum number of people”?
(do we have priors on his writings? do we have information about him absolutely not meaning one or more of the things here?)
Hey Kaj :)
The complexity-hiding part here seems to me to be “how exactly do you take a simulation/prediction of a person and extract from it the preferences of that person”.
For example, would you simulate a negotiation with the human and see how the negotiation turns out? Would you simulate asking the human and then do whatever the human answers? (There were a few suggestions in the post; I don’t know if you endorse a specific one or whether you even think this question is important.)
Because (I assume) once OpenAI[1] says “trust our models”, that’s the point at which it would be useful to publish our breaks.
Breaks that haven’t been published yet, so that OpenAI couldn’t have patched them yet.
[unconfident; I can see counterarguments too]
Or maybe when regulators, experts, or public opinion say “this model is trustworthy, don’t worry”.
I’m confused: Wouldn’t we prefer to keep such findings private? (At least until OpenAI says something like “this model is reliable/safe”?)
My guess: You’d reply that finding good talent is worth it?
This seems like great advice, thanks!
I’d be interested in an example of what “a believable story in which this project reduces AI x-risk” looks like, if Dane (or someone else) would like to share.
A link directly to the corrigibility part (skipping unrelated things on the same page):
https://www.projectlawful.com/replies/1824457#reply-1824457
This post got me to do something like exposure therapy on myself in 10+ situations, where it felt like the “obvious” thing to do. This is a huge amount of life-change-per-post.
My thoughts:
[Epistemic status + impostor syndrome: Just learning; posting my ideas to hear how they are wrong and in the hope of interacting with others in the community. Don’t learn from my ideas.]
A)
Victoria: “I don’t think that the internet has a lot of particularly effective plans to disempower humanity.”
I think:
Having ready plans on the internet and using them is not part of the normal threat model for an AGI. If that were the problem, we could just filter those plans out of the training set (see the toy sketch below).
(The internet does have such ideas. I will briefly mention biosecurity, but I prefer not to spread ideas on how to disempower humanity.)
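To make the filtering point concrete, here is a minimal sketch of what dropping such documents from a training corpus could look like. This is my own illustration, not something proposed in the thread: the blocklist phrases and corpus below are hypothetical placeholders, and real data-filtering pipelines typically use trained classifiers rather than naive substring matching.

```python
# Toy sketch: filtering "ready plans" out of a training corpus.
# The blocklist phrases and documents are hypothetical placeholders;
# real pipelines would use classifiers, not substring checks.

BLOCKLIST = {"plan to disempower humanity"}  # hypothetical phrases

def is_allowed(document: str) -> bool:
    """Keep a document only if it contains no blocklisted phrase."""
    lowered = document.lower()
    return not any(phrase in lowered for phrase in BLOCKLIST)

corpus = [
    "A history of bridge engineering.",
    "A detailed plan to disempower humanity, step by step.",
]
filtered_corpus = [doc for doc in corpus if is_allowed(doc)]
print(filtered_corpus)  # only the bridge-engineering document remains
```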
B)
[Victoria:] I think coming up with a plan that gets past the defenses of human society requires thinking differently from humans.
TL;DR: I think some ways to disempower humanity don’t require thinking differently from humans.
I’ll split an AI’s attack vectors into three buckets:
Attacks that humans didn’t even think of (such as what we can do to apes)
Attacks that humans did think of but are not defending against (for example, we thought about pandemic risks but didn’t defend against them very well). Note this does not require thinking of things that humans didn’t think of.
Attacks that humans are actively defending against, such as using robots with guns, trading in the stock market, or playing Go (Go probably won’t help with taking over the world, but humans are actively working on winning Go games, so I put the example here). Having an AI beat us in one of these does require it to be, in some important (to me) sense, smarter than us, but not all attacks are in this bucket.
C)
[...] requires thinking differently from humans
I think AIs already today think differently from humans in any reasonable way we could mean that. In fact, if we could make them NOT think differently from humans, my [untrustworthy] opinion is that this would be non-negligible progress towards solving alignment. No?
D)
The intelligence threshold for planning to take over the world isn’t low
First, disclaimers:
(1) I’m not an expert and this isn’t widely reviewed; (2) I’m intentionally not being detailed, in order not to spread ideas on how to take over the world. I’m aware this is epistemically bad and I’m sorry for it; it’s the tradeoff I’m picking.
So, mainly based on A, I think a person who is 90% as intelligent as Elon Musk in all dimensions would probably be able to destroy humanity, and so (if I’m right), the intelligence threshold is lower than “the world’s smartest human”. Again sorry for the lack of detail. [mods, if this was already too much, feel free to edit/delete my comment]
“Doing a Turing test” is a solution to something. What’s the problem you’re trying to solve?
As a judge, I’d ask the test subject to write me a rap song about Turing tests. If it succeeds, I guess it’s ChatGPT ;P
More seriously: it would be nice to find a judge who doesn’t know the capabilities and limitations of GPT models. Knowing those is very, very useful.
@habryka, would you reply to this comment if there’s an opportunity to donate to either? Another person and I are interested, and others could follow this comment too if they wanted to.
(Only if it’s easy for you; I don’t want to add an annoying task to your plate.)