[Question] Is there any metric measuring ~“proportion of people creating extra value”?

Amal 3 Aug 2023 22:54 UTC
7 points
3 comments 1 min read LW link

[Question] Hypothetical: what would you do?

JNS 3 Aug 2023 22:39 UTC
4 points
2 comments 1 min read LW link

[Linkpost] Deception Abilities Emerged in Large Language Models

Bogdan Ionut Cirstea 3 Aug 2023 17:28 UTC
12 points
0 comments 1 min read LW link

Embedding Ethical Priors into AI Systems: A Bayesian Approach

Justausername 3 Aug 2023 15:31 UTC
−5 points
3 comments 21 min read LW link

Password-locked models: a stress case for capabilities evaluation

Fabien Roger 3 Aug 2023 14:53 UTC
144 points
14 comments 6 min read LW link

AI #23: Fundamental Problems with RLHF

Zvi 3 Aug 2023 12:50 UTC
59 points
9 comments 41 min read LW link
(thezvi.wordpress.com)

Bad Imitation Instruments

jefftk 3 Aug 2023 2:30 UTC
21 points
1 comment 1 min read LW link
(www.jefftk.com)

Kolmogorov’s theory of Algorithmic Probability

Aidan Rocke 3 Aug 2023 0:58 UTC
5 points
2 comments 2 min read LW link
(keplerlounge.com)

Work culture creep

CrimsonChin 3 Aug 2023 0:38 UTC
27 points
15 comments 8 min read LW link

[Question] Boxing

Zach Stein-Perlman 2 Aug 2023 23:38 UTC
6 points
1 comment 1 min read LW link

External rationality vs. internal rationality

metachirality 2 Aug 2023 23:29 UTC
7 points
0 comments 1 min read LW link

When performing a dimensionality reduction on tensors, the trace is often zero.

Joseph Van Name 2 Aug 2023 21:06 UTC
7 points
1 comment 3 min read LW link

Progress links digest, 2023-08-02: Superconductor edition

jasoncrawford 2 Aug 2023 20:27 UTC
13 points
0 comments 3 min read LW link
(rootsofprogress.org)

[Question] What works for ADHD and/or related things?

TeaTieAndHat 2 Aug 2023 18:37 UTC
6 points
13 comments 1 min read LW link

[Question] Would you pay for a search engine limited to rationalist sites?

Conor 2 Aug 2023 18:06 UTC
4 points
19 comments 1 min read LW link

The Roots of Progress Blog-Building Intensive: advice for applicants, request for support

jasoncrawford 2 Aug 2023 15:37 UTC
9 points
0 comments 1 min read LW link
(rootsofprogress.org)

3 levels of threat obfuscation

HoldenKarnofsky 2 Aug 2023 14:58 UTC
69 points
14 comments 7 min read LW link

ChatGPT for translation

Varshul Gupta 2 Aug 2023 11:57 UTC
1 point
0 comments 3 min read LW link
(dubverseblack.substack.com)

Long-Term Future Fund: April 2023 grant recommendations

2 Aug 2023 7:54 UTC
81 points
3 comments 50 min read LW link

[Question] Could we breed/engineer intelligent parrots?

lukehmiles 2 Aug 2023 7:32 UTC
7 points
18 comments 1 min read LW link

Anthropical Motte and Bailey in two versions of Sleeping Beauty

Ape in the coat 2 Aug 2023 7:08 UTC
29 points
56 comments 6 min read LW link

solar-thermal and techno-economic analysis

bhauth 2 Aug 2023 6:22 UTC
21 points
8 comments 5 min read LW link
(www.bhauth.com)

South Bay ACX/SSC Meetup @ Whole Foods

allisona 2 Aug 2023 3:44 UTC
1 point
0 comments 1 min read LW link

“Is There Anything That’s Worth More”

Zack_M_Davis 2 Aug 2023 3:28 UTC
64 points
6 comments 1 min read LW link

Bay Winter Solstice: call for speech pitches!

tcheasdfjkl 2 Aug 2023 3:24 UTC
9 points
0 comments 1 min read LW link
(docs.google.com)

[Question] What is ontology?

Adam Zerner 2 Aug 2023 0:54 UTC
27 points
19 comments 1 min read LW link

My current LK99 questions

Eliezer Yudkowsky 1 Aug 2023 22:48 UTC
205 points
38 comments 5 min read LW link

Spiral Staircase

Michael Samoilov 1 Aug 2023 21:51 UTC
19 points
2 comments 2 min read LW link

Open Mic—Au­gust 2023

Adam Zerner 1 Aug 2023 19:24 UTC
8 points
0 comments 1 min read LW link

ARC Evals new report: Evaluating Language-Model Agents on Realistic Autonomous Tasks

Beth Barnes 1 Aug 2023 18:30 UTC
153 points
12 comments 5 min read LW link
(evals.alignment.org)

Ex­plainer—Au­toIn­ter­pre­ta­tion Finds Sparse Cod­ing Beats Alternatives

Gauraventh 1 Aug 2023 17:29 UTC
8 points
0 comments 3 min read LW link

[Question] When(if ever) are superstimuli good/useful/advantageous?

Perhaps 1 Aug 2023 15:50 UTC
−7 points
2 comments 1 min read LW link

AISN #17: Automatically Circumventing LLM Guardrails, the Frontier Model Forum, and Senate Hearing on AI Oversight

1 Aug 2023 15:40 UTC
8 points
0 comments 8 min read LW link
(newsletter.safe.ai)

AISN #16: White House Secures Voluntary Commitments from Leading AI Labs and Lessons from Oppenheimer

1 Aug 2023 15:39 UTC
3 points
0 comments 6 min read LW link
(newsletter.safe.ai)

“Desperate Honesty” by Agnes Callard

David Gross 1 Aug 2023 13:34 UTC
11 points
0 comments 2 min read LW link
(dailynous.com)

Barbieheimer: Across the Dead Reckoning

Zvi 1 Aug 2023 13:00 UTC
49 points
17 comments 41 min read LW link
(thezvi.wordpress.com)

Untangling Infrabayesianism: A redistillation [PDF link; ~12k words + lots of math]

Lorxus 1 Aug 2023 12:42 UTC
29 points
16 comments 2 min read LW link
(docdro.id)

What Is Childhood Supposed To Be?

Sable 1 Aug 2023 9:51 UTC
21 points
13 comments 3 min read LW link
(affablyevil.substack.com)

AI romantic partners will harm society if they go unregulated

Roman Leventov 1 Aug 2023 9:32 UTC
25 points
71 comments 13 min read LW link

What is autonomy, and how does it lead to greater risk from AI?

Davidmanheim 1 Aug 2023 7:58 UTC
30 points
0 comments 6 min read LW link

Evaluating Superhuman Models with Consistency Checks

1 Aug 2023 7:51 UTC
15 points
2 comments 9 min read LW link
(arxiv.org)

[See link to Sept meetup below!] San Francisco ACX Meetup “First Saturday” August 5, 1 pm

guenael 1 Aug 2023 3:38 UTC
1 point
0 comments 1 min read LW link

[Question] Exercise: Solve “Thinking Physics”

Raemon 1 Aug 2023 0:44 UTC
85 points
23 comments 5 min read LW link

The “public debate” about AI is confusing for the general public and for policymakers because it is a three-sided debate

Adam David Long 1 Aug 2023 0:08 UTC
144 points
30 comments 4 min read LW link

The “no sandbagging on checkable tasks” hypothesis

Joe Carlsmith 31 Jul 2023 23:06 UTC
50 points
12 comments 9 min read LW link

A Social History of Truth

Vaniver 31 Jul 2023 22:49 UTC
64 points
2 comments 13 min read LW link

Watermarking considered overrated?

DanielFilan 31 Jul 2023 21:36 UTC
18 points
4 comments 1 min read LW link

What The Lord of the Rings Teaches Us About AI Alignment

Jeffrey Heninger 31 Jul 2023 20:16 UTC
21 points
11 comments 7 min read LW link

The “spelling miracle”: GPT-3 spelling abilities and glitch tokens revisited

mwatkins 31 Jul 2023 19:47 UTC
85 points
29 comments 20 min read LW link

“Building a House” Review

jefftk 31 Jul 2023 19:20 UTC
62 points
6 comments 1 min read LW link
(www.jefftk.com)