VC Theory Overview

Joar Skalse, 2 Jul 2023 22:45 UTC
10 points
2 comments, 11 min read, LW link

Sources of evidence in Alignment

Martín Soto, 2 Jul 2023 20:38 UTC
20 points
0 comments, 11 min read, LW link

Quantitative cruxes in Alignment

Martín Soto, 2 Jul 2023 20:38 UTC
21 points
0 comments, 23 min read, LW link

Going Crazy and Getting Better Again

Evenstar, 2 Jul 2023 18:55 UTC
118 points
10 comments, 7 min read, LW link

Shall We Throw A Huge Party Before AGI Bids Us Adieu?

GeorgeMan, 2 Jul 2023 17:56 UTC
−1 points
6 comments, 1 min read, LW link

Why it’s so hard to talk about Consciousness

Rafael Harth, 2 Jul 2023 15:56 UTC
87 points
152 comments, 9 min read, LW link

How Smart Are Humans?

Joar Skalse, 2 Jul 2023 15:46 UTC
9 points
19 comments, 2 min read, LW link

Consider giving money to people, not projects or organizations

Nina Rimsky, 2 Jul 2023 14:33 UTC
79 points
30 comments, 3 min read, LW link
(ninarimsky.substack.com)

Through a panel, darkly: a case study in internet BS detection

jchan, 2 Jul 2023 13:40 UTC
22 points
7 comments, 3 min read, LW link

LLMs, Batches, and Emergent Episodic Memory

Lao Mein, 2 Jul 2023 7:55 UTC
5 points
4 comments, 1 min read, LW link

Negativity enhances positivity

Adam Zerner, 2 Jul 2023 2:47 UTC
12 points
7 comments, 2 min read, LW link

faster latent diffusion

bhauth, 2 Jul 2023 1:30 UTC
10 points
8 comments, 2 min read, LW link
(www.bhauth.com)

Using (Uninterpretable) LLMs to Generate Interpretable AI Code

Joar Skalse, 2 Jul 2023 1:01 UTC
15 points
9 comments, 3 min read, LW link

Grant applications and grand narratives

Elizabeth, 2 Jul 2023 0:16 UTC
188 points
20 comments, 6 min read, LW link

An Introduction, an Overview of my personal resources, and how one might make use of them

ProofBySonnet, 1 Jul 2023 21:00 UTC
4 points
6 comments, 3 min read, LW link

My “2.9 trauma limit”

Raemon, 1 Jul 2023 19:32 UTC
164 points
31 comments, 7 min read, LW link

Alpha

Erich_Grunewald, 1 Jul 2023 16:05 UTC
65 points
2 comments, 14 min read, LW link
(www.erichgrunewald.com)

Forum Karma: view stats and find highly-rated comments for any LW user

Max H, 1 Jul 2023 15:36 UTC
58 points
16 comments, 2 min read, LW link
(forumkarma.com)

[ASoT] GPT2 Steering & The Tuned Lens

Ulisse Mini, 1 Jul 2023 14:12 UTC
23 points
0 comments, 2 min read, LW link

[Linkpost] A shared linguistic space for transmitting our thoughts from brain to brain in natural conversations

Bogdan Ionut Cirstea, 1 Jul 2023 13:57 UTC
17 points
2 comments, 1 min read, LW link

Elements of Computational Philosophy, Vol. I: Truth

1 Jul 2023 11:44 UTC
11 points
6 comments, 1 min read, LW link
(compphil.github.io)

Micro Habits that Improve One’s Day

silentbob, 1 Jul 2023 10:53 UTC
60 points
9 comments, 5 min read, LW link

Ateliers: But what is an Atelier?

Stephen Fowler, 1 Jul 2023 5:57 UTC
4 points
2 comments, 10 min read, LW link

Predicting: Quick Start

duck_master, 1 Jul 2023 3:43 UTC
9 points
3 comments, 14 min read, LW link

EA/LW/SSC Argentina Group!

daviddelauba, 1 Jul 2023 2:47 UTC
1 point
0 comments, 1 min read, LW link

Despedida a Pablo Stafforini

daviddelauba, 1 Jul 2023 2:44 UTC
1 point
0 comments, 1 min read, LW link

Horizontal and Vertical Integration

Jeffrey Heninger, 1 Jul 2023 1:15 UTC
17 points
1 comment, 2 min read, LW link

Inflection AI announces $1.3 billion of funding led by current investors, Microsoft, and NVIDIA

SandXbox, 30 Jun 2023 21:32 UTC
7 points
0 comments, 1 min read, LW link
(inflection.ai)

Introduction

30 Jun 2023 20:45 UTC
7 points
0 comments, 2 min read, LW link

Inherently Interpretable Architectures

30 Jun 2023 20:43 UTC
4 points
0 comments, 7 min read, LW link

Positive Attractors

30 Jun 2023 20:43 UTC
6 points
0 comments, 13 min read, LW link

Agency from a causal perspective

30 Jun 2023 17:37 UTC
38 points
5 comments, 6 min read, LW link

On household dust

Nina Rimsky, 30 Jun 2023 17:03 UTC
74 points
12 comments, 5 min read, LW link

Little attention seems to be on discouraging hardware progress

RussellThor, 30 Jun 2023 10:14 UTC
5 points
3 comments, 1 min read, LW link

Introducing EffiSciences’ AI Safety Unit

30 Jun 2023 7:44 UTC
64 points
0 comments, 12 min read, LW link

Contra Anton 🏴‍☠️ on Kolmogorov complexity and recursive self improvement

DaemonicSigil, 30 Jun 2023 5:15 UTC
25 points
12 comments, 2 min read, LW link

Foom Liability

PeterMcCluskey, 30 Jun 2023 3:55 UTC
20 points
10 comments, 6 min read, LW link
(bayesianinvestor.com)

I Think Eliezer Should Go on Glenn Beck

Lao Mein, 30 Jun 2023 3:12 UTC
25 points
21 comments, 1 min read, LW link

[Question] Should MS open-source the extension for GitHub Copilot?

Sheikh Abdur Raheem Ali, 29 Jun 2023 23:14 UTC
17 points
4 comments, 1 min read, LW link

Bengio’s FAQ on Catastrophic AI Risks

Vaniver, 29 Jun 2023 23:04 UTC
39 points
0 comments, 1 min read, LW link
(yoshuabengio.org)

AGI & War

Calecute, 29 Jun 2023 22:20 UTC
9 points
1 comment, 1 min read, LW link

Biosafety Regulations (BMBL) and their relevance for AI

Štěpán Los, 29 Jun 2023 19:22 UTC
4 points
0 comments, 4 min read, LW link

Nature Releases A Stupid Editorial On AI Risk

omnizoid, 29 Jun 2023 19:00 UTC
2 points
1 comment, 3 min read, LW link

AI Safety without Alignment: How humans can WIN against AI

vicchain, 29 Jun 2023 17:53 UTC
1 point
1 comment, 2 min read, LW link

Challenge proposal: smallest possible self-hardening backdoor for RLHF

Christopher King, 29 Jun 2023 16:56 UTC
7 points
0 comments, 2 min read, LW link

AI #18: The Great Debate Debate

Zvi, 29 Jun 2023 16:20 UTC
47 points
9 comments, 52 min read, LW link
(thezvi.wordpress.com)

Bruce Sterling on the AI mania of 2023

Mitchell_Porter, 29 Jun 2023 5:00 UTC
25 points
1 comment, 1 min read, LW link
(www.newsweek.com)

Cheat sheet of AI X-risk

momom2, 29 Jun 2023 4:28 UTC
19 points
1 comment, 7 min read, LW link

Anthropically Blind: the anthropic shadow is reflectively inconsistent

Christopher King, 29 Jun 2023 2:36 UTC
40 points
38 comments, 10 min read, LW link

One path to coherence: conditionalization

porby, 29 Jun 2023 1:08 UTC
28 points
4 comments, 4 min read, LW link