[Question] “I Can’t Believe It Both Is and Is Not Encephalitis!” Or: What do you do when the evidence is crazy?

Erhannis
Mar 19, 2024, 10:08 PM
20 points
3 comments
11 min read
LW link

Delta’s of Change

Jonas Kgomo
Mar 19, 2024, 9:03 PM
1 point
0 comments
4 min read
LW link

Increasing IQ by 10 Points is Possible

George3d6
Mar 19, 2024, 8:48 PM
23 points
51 comments
5 min read
LW link
(morelucid.substack.com)

Are extreme probabilities for P(doom) epistemically justified?

Mar 19, 2024, 8:32 PM
20 points
12 comments
7 min read
LW link

Have I Solved the Two Envelopes Problem Once and For All?

JackOfAllTrades
Mar 19, 2024, 7:57 PM
−6 points
5 comments
3 min read
LW link

[Question] How can one be less wrong, if their conversation partner loses the interest on discussing the topic with them?

Ooker
Mar 19, 2024, 6:11 PM
−10 points
3 comments
1 min read
LW link

Carlo: uncertainty analysis in Google Sheets

ProbabilityEnjoyer
Mar 19, 2024, 5:59 PM
6 points
0 comments
1 min read
LW link
(carlo.app)

NAIRA—An exercise in regulatory, competitive safety governance [AI Governance Institutional Design idea]

Heramb
Mar 19, 2024, 5:43 PM
2 points
0 comments
1 min read
LW link
(forum.effectivealtruism.org)

AI Safety Evaluations: A Regulatory Review

Mar 19, 2024, 3:05 PM
22 points
1 comment
11 min read
LW link

Mechanism for feature learning in neural networks and backpropagation-free machine learning models

Matt Goldenberg
Mar 19, 2024, 2:55 PM
8 points
1 comment
1 min read
LW link
(www.science.org)

Monthly Roundup #16: March 2024

Zvi
Mar 19, 2024, 1:10 PM
33 points
4 comments
55 min read
LW link
(thezvi.wordpress.com)

Experimentation (Part 7 of “The Sense Of Physical Necessity”)

LoganStrohl
Mar 18, 2024, 9:25 PM
33 points
0 comments
10 min read
LW link

INTERVIEW: Round 2 - StakeOut.AI w/ Dr. Peter Park

jacobhaimes
Mar 18, 2024, 9:21 PM
5 points
0 comments
1 min read
LW link
(into-ai-safety.github.io)

Neuroscience and Alignment

Garrett Baker
Mar 18, 2024, 9:09 PM
40 points
25 comments
2 min read
LW link

GPT, the magical collaboration zone, Lex Fridman and Sam Altman

Bill Benzon
Mar 18, 2024, 8:04 PM
3 points
1 comment
3 min read
LW link

Measuring Coherence of Policies in Toy Environments

Mar 18, 2024, 5:59 PM
59 points
9 comments
14 min read
LW link

AtP*: An efficient and scalable method for localizing LLM behaviour to components

Mar 18, 2024, 5:28 PM
19 points
0 comments
1 min read
LW link
(arxiv.org)

Community Notes by X

NicholasKees
Mar 18, 2024, 5:13 PM
127 points
15 comments
7 min read
LW link

[Question] Is the Basilisk pretending to be hidden in this simulation so that it can check what I would do if conditioned by a world without the Basilisk?

maybefbi
Mar 18, 2024, 4:05 PM
−18 points
1 comment
1 min read
LW link

On Devin

Zvi
Mar 18, 2024, 1:20 PM
148 points
34 comments
11 min read
LW link
(thezvi.wordpress.com)

RLLMv10 experiment

MiguelDev
Mar 18, 2024, 8:32 AM
5 points
0 comments
2 min read
LW link

Join the AI Evaluation Tasks Bounty Hackathon

Esben Kran
Mar 18, 2024, 8:15 AM
12 points
1 comment
LW link

5 Physics Problems

Mar 18, 2024, 8:05 AM
60 points
0 comments
15 min read
LW link

Inferring the model dimension of API-protected LLMs

Ege Erdil
Mar 18, 2024, 6:19 AM
34 points
3 comments
4 min read
LW link
(arxiv.org)

AI strategy given the need for good reflection

owencb
Mar 18, 2024, 12:48 AM
7 points
0 comments
LW link

XAI releases Grok base model

Jacob G-W
Mar 18, 2024, 12:47 AM
11 points
3 comments
1 min read
LW link
(x.ai)

Toki pona FAQ

dkl9
Mar 17, 2024, 9:44 PM
37 points
9 comments
1 min read
LW link
(dkl9.net)

EA ErFiN Project work

Max_He-Ho
Mar 17, 2024, 8:42 PM
2 points
0 comments
1 min read
LW link

EA ErFiN Project work

Max_He-Ho
Mar 17, 2024, 8:37 PM
2 points
0 comments
1 min read
LW link

[Question] Alice and Bob is debating on a technique. Alice says Bob should try it before denying it. Is it a fallacy or something similar?

Ooker
Mar 17, 2024, 8:01 PM
0 points
19 comments
2 min read
LW link

Is there a way to calculate the P(we are in a 2nd cold war)?

cloak
Mar 17, 2024, 8:01 PM
−9 points
2 comments
1 min read
LW link

The Worst Form Of Government (Except For Everything Else We’ve Tried)

johnswentworth
Mar 17, 2024, 6:11 PM
135 points
47 comments
4 min read
LW link

Applying simulacrum levels to hobbies, interests and goals

DMMF
Mar 17, 2024, 4:18 PM
15 points
2 comments
4 min read
LW link
(danfrank.ca)

What is the best argument that LLMs are shoggoths?

JoshuaFox
Mar 17, 2024, 11:36 AM
26 points
22 comments
1 min read
LW link

Invitation to the Princeton AI Alignment and Safety Seminar

Sadhika Malladi
Mar 17, 2024, 1:10 AM
6 points
1 comment
1 min read
LW link

Anxiety vs. Depression

Sable
Mar 17, 2024, 12:15 AM
86 points
35 comments
3 min read
LW link
(affablyevil.substack.com)

Celiefs

TheLemmaLlama
Mar 16, 2024, 11:56 PM
3 points
8 comments
1 min read
LW link

My PhD thesis: Algorithmic Bayesian Epistemology

Eric Neyman
Mar 16, 2024, 10:56 PM
262 points
14 comments
7 min read
LW link
(arxiv.org)

How people stopped dying from diarrhea so much (& other life-saving decisions)

Writer
Mar 16, 2024, 4:00 PM
45 points
0 comments
LW link
(youtu.be)

Transformative trustbuilding via advancements in decentralized lie detection

trevor
Mar 16, 2024, 5:56 AM
20 points
10 comments
38 min read
LW link
(www.ncbi.nlm.nih.gov)

Enter the WorldsEnd

Akram Choudhary
Mar 16, 2024, 1:34 AM
−25 points
8 comments
1 min read
LW link

Strong-Misalignment: Does Yudkowsky (or Christiano, or TurnTrout, or Wolfram, or…etc.) Have an Elevator Speech I’m Missing?

Benjamin Bourlier
Mar 15, 2024, 11:17 PM
−4 points
3 comments
16 min read
LW link

Introducing METR’s Autonomy Evaluation Resources

Mar 15, 2024, 11:16 PM
90 points
0 comments
1 min read
LW link
(metr.github.io)

Are AIs conscious? It might depend

Logan Zoellner
Mar 15, 2024, 11:09 PM
6 points
6 comments
3 min read
LW link

Beyond Maxipok — good reflective governance as a target for action

owencb
Mar 15, 2024, 10:22 PM
20 points
0 comments
LW link

Middle Child Phenomenon

PhilosophicalSoul
Mar 15, 2024, 8:47 PM
3 points
3 comments
2 min read
LW link

Capability or Alignment? Respect the LLM Base Model’s Capability During Alignment

Jingfeng Yang
Mar 15, 2024, 5:56 PM
7 points
0 comments
24 min read
LW link

Rational Animations offers animation production and writing services!

Writer
Mar 15, 2024, 5:26 PM
33 points
0 comments
1 min read
LW link

Improving SAE’s by Sqrt()-ing L1 & Removing Lowest Activating Features

Mar 15, 2024, 4:30 PM
26 points
5 comments
4 min read
LW link

Stuttgart, Germany—ACX Spring Meetups Everywhere 2024

Benjamin R
Mar 15, 2024, 2:59 PM
2 points
1 comment
1 min read
LW link