[Question] Term/Category for AI with Neutral Impact?

isomic 11 May 2023 22:00 UTC
6 points
1 comment 1 min read LW link

Thoughts on LessWrong norms, the Art of Discourse, and moderator mandate

Ruby 11 May 2023 21:20 UTC
37 points
20 comments 5 min read LW link

Alignment, Goals, and The Gut-Head Gap: A Review of Ngo. et al.

Violet Hour 11 May 2023 18:06 UTC
20 points
2 comments 13 min read LW link

Sequence opener: Jordan Harbinger’s 6 minute networking

Severin T. Seehrich 11 May 2023 17:06 UTC
4 points
0 comments 1 min read LW link

Advice for newly busy people

Severin T. Seehrich 11 May 2023 16:46 UTC
124 points
2 comments 5 min read LW link

AI #11: In Search of a Moat

Zvi 11 May 2023 15:40 UTC
67 points
28 comments 81 min read LW link
(thezvi.wordpress.com)

[Question] Bayesian update from sensationalistic sources

houkime 11 May 2023 15:26 UTC
1 point
0 comments 1 min read LW link

I bet $500 on AI winning the IMO gold medal by 2026

azsantosk 11 May 2023 14:46 UTC
37 points
27 comments 1 min read LW link

Fatebook for Slack: Track your forecasts, right where your team works

11 May 2023 14:11 UTC
24 points
3 comments 1 min read LW link

Contra Caller Signs

jefftk 11 May 2023 13:10 UTC
10 points
0 comments 1 min read LW link
(www.jefftk.com)

Notes on the importance and implementation of safety-first cognitive architectures for AI

Brendon_Wong 11 May 2023 10:03 UTC
3 points
0 comments 3 min read LW link

A more grounded idea of AI risk

Iknownothing 11 May 2023 9:48 UTC
3 points
4 comments 1 min read LW link

Separating the “control problem” from the “alignment problem”

Yi-Yang 11 May 2023 9:41 UTC
12 points
1 comment 4 min read LW link

[Question] Is Infra-Bayesianism Applicable to Value Learning?

RogerDearnaley 11 May 2023 8:17 UTC
5 points
4 comments 1 min read LW link

[Question] How should we think about the decision relevance of models estimating p(doom)?

Mo Putera 11 May 2023 4:16 UTC
11 points
1 comment 3 min read LW link

The Academic Field Pyramid—any point to encouraging broad but shallow AI risk engagement?

Matthew_Opitz 11 May 2023 1:32 UTC
20 points
1 comment 6 min read LW link

[Question] How should one feel morally about using chatbots?

Adam Zerner 11 May 2023 1:01 UTC
18 points
4 comments 1 min read LW link

[Question] AI interpretability could be harmful?

Roman Leventov 10 May 2023 20:43 UTC
13 points
2 comments 1 min read LW link

Athens, Greece – ACX Meetups Everywhere Spring 2023

Spyros Dovas 10 May 2023 19:45 UTC
1 point
0 comments 1 min read LW link

Better debates

TsviBT 10 May 2023 19:34 UTC
57 points
7 comments 3 min read LW link

Mental Health and the Alignment Problem: A Compilation of Resources (updated April 2023)

10 May 2023 19:04 UTC
251 points
53 comments 21 min read LW link

A Corrigibility Metaphore—Big Gambles

WCargo 10 May 2023 18:13 UTC
16 points
0 comments 4 min read LW link

Roadmap for a collaborative prototype of an Open Agency Architecture

Deger Turan 10 May 2023 17:41 UTC
30 points
0 comments 12 min read LW link

AGI-Automated Interpretability is Suicide

__RicG__ 10 May 2023 14:20 UTC
23 points
33 comments 7 min read LW link

Class-Based Addressing

jefftk 10 May 2023 13:40 UTC
22 points
6 comments 1 min read LW link
(www.jefftk.com)

In defence of epistemic modesty [distillation]

Luise 10 May 2023 9:44 UTC
17 points
2 comments 9 min read LW link

[Question] How much of a concern are open-source LLMs in the short, medium and long terms?

JavierCC 10 May 2023 9:14 UTC
5 points
0 comments 1 min read LW link

10 great reasons why Lex Fridman should invite Eliezer and Robin to re-do the FOOM debate on his podcast

chaosmage 10 May 2023 8:27 UTC
−7 points
1 comment 1 min read LW link
(www.reddit.com)

New OpenAI Paper—Language models can explain neurons in language models

MrThink 10 May 2023 7:46 UTC
47 points
14 comments 1 min read LW link

Naturalist Experimentation

LoganStrohl 10 May 2023 4:28 UTC
57 points
14 comments 10 min read LW link

[Question] Could A Superintelligence Out-Argue A Doomer?

tjaffee 10 May 2023 2:40 UTC
−16 points
6 comments 1 min read LW link

Gradient hacking via actual hacking

Max H 10 May 2023 1:57 UTC
12 points
7 comments 3 min read LW link

Red teaming: challenges and research directions

joshc 10 May 2023 1:40 UTC
30 points
1 comment 10 min read LW link

[Question] Looking for a post I read if anyone recognizes it

SilverFlame 10 May 2023 1:24 UTC
2 points
2 comments 1 min read LW link

Research Report: Incorrectness Cascades (Corrected)

Robert_AIZI 9 May 2023 21:54 UTC
9 points
0 comments 9 min read LW link
(aizi.substack.com)

Stopping dangerous AI: Ideal US behavior

Zach Stein-Perlman 9 May 2023 21:00 UTC
17 points
0 comments 3 min read LW link

Stopping dangerous AI: Ideal lab behavior

Zach Stein-Perlman 9 May 2023 21:00 UTC
8 points
0 comments 2 min read LW link

Progress links and tweets, 2023-05-09

jasoncrawford 9 May 2023 20:22 UTC
14 points
0 comments 2 min read LW link
(rootsofprogress.org)

[Question] Have you heard about MIT’s “liquid neural networks”? What do you think about them?

Ppau 9 May 2023 20:16 UTC
36 points
14 comments 1 min read LW link

Respect for Boundaries as non-arbirtrary coordination norms

Jonas Hallgren 9 May 2023 19:42 UTC
9 points
3 comments 7 min read LW link

Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1

9 May 2023 19:41 UTC
119 points
1 comment 10 min read LW link

Forecasting as a tool for teaching the general public to make better judgements?

Dominik Hajduk | České priority 9 May 2023 17:35 UTC
3 points
0 comments 3 min read LW link

Language models can explain neurons in language models

nz 9 May 2023 17:29 UTC
23 points
0 comments 1 min read LW link
(openai.com)

Asimov on building robots without the First Law

rossry 9 May 2023 16:44 UTC
4 points
1 comment 2 min read LW link

Making Up Baby Signs

jefftk 9 May 2023 16:40 UTC
44 points
6 comments 2 min read LW link
(www.jefftk.com)

Exciting New Interpretability Paper!

research_prime_space 9 May 2023 16:39 UTC
12 points
1 comment 1 min read LW link

Result Of The Bounty/Contest To Explain Infra-Bayes In The Language Of Game Theory

johnswentworth 9 May 2023 16:35 UTC
79 points
0 comments 1 min read LW link

The Bleak Harmony of Diets and Survival: A Glimpse into Nature’s Unforgiving Balance

bardstale 9 May 2023 16:08 UTC
−16 points
0 comments 1 min read LW link

Entropic Abyss

bardstale 9 May 2023 15:59 UTC
−12 points
0 comments 2 min read LW link

AI Safety Newsletter #5: Geoffrey Hinton speaks out on AI risk, the White House meets with AI labs, and Trojan attacks on language models

9 May 2023 15:26 UTC
28 points
1 comment 4 min read LW link
(newsletter.safe.ai)