RL, but don’t do anything I wouldn’t do

Gunnar_Zarncke · 7 Dec 2024 22:54 UTC
22 points
0 comments · 1 min read · LW link
(arxiv.org)

Mask and Respirator Intelligibility Comparison

jefftk · 7 Dec 2024 3:20 UTC
16 points
1 comment · 1 min read · LW link
(www.jefftk.com)

Broadening Horizons: Rethinking Social Mobility Through Skill Diversification

Yanling Guo · 7 Dec 2024 0:04 UTC
−1 points
0 comments · 2 min read · LW link

Purging Corrupted Capabilities across Language Models

6 Dec 2024 22:56 UTC
8 points
0 comments · 16 min read · LW link

Gradient Routing: Masking Gradients to Localize Computation in Neural Networks

6 Dec 2024 22:19 UTC
98 points
1 comment · 11 min read · LW link
(arxiv.org)

Understanding Shapley Values with Venn Diagrams

agucova · 6 Dec 2024 21:56 UTC
70 points
5 comments · 1 min read · LW link
(medium.com)

Model Integrity

6 Dec 2024 21:28 UTC
4 points
1 comment · 18 min read · LW link

Can AI improve the current state of molecular simulation?

Abhishaike Mahajan · 6 Dec 2024 20:22 UTC
4 points
0 comments · 1 min read · LW link
(www.owlposting.com)

Low Temperature Solomonoff Induction

dil-leik-og · 6 Dec 2024 18:55 UTC
2 points
0 comments · 11 min read · LW link

Experiments are in the territory, results are in the map

Tahp · 6 Dec 2024 15:44 UTC
7 points
1 comment · 6 min read · LW link

Frontier Models are Capable of In-context Scheming

5 Dec 2024 22:11 UTC
170 points
12 comments · 7 min read · LW link

Are SAE features from the Base Model still meaningful to LLaVA?

Shan23Chen · 5 Dec 2024 20:21 UTC
6 points
0 comments · 10 min read · LW link
(www.lesswrong.com)

Expevolu, a laissez-faire approach to country creation

Fernando · 5 Dec 2024 19:29 UTC
2 points
2 comments · 43 min read · LW link
(expevolu.substack.com)

Are SAE features from the Base Model still meaningful to LLaVA?

Shan23Chen · 5 Dec 2024 19:24 UTC
4 points
0 comments · 10 min read · LW link

Smart people should do biology

Haotian Huang · 5 Dec 2024 19:11 UTC
7 points
2 comments · 3 min read · LW link

Detection of Asymptomatically Spreading Pathogens

jefftk · 5 Dec 2024 18:20 UTC
45 points
7 comments · 7 min read · LW link
(www.jefftk.com)

Model Integrity: MAI on Value Alignment

Jonas Hallgren · 5 Dec 2024 17:11 UTC
6 points
11 comments · 1 min read · LW link
(meaningalignment.substack.com)

Social Science in its epistemological context

Arturo Macias · 5 Dec 2024 16:12 UTC
2 points
0 comments · 1 min read · LW link
(www.theseedsofscience.pub)

Why muscle tension can be unsexy

Chipmonk · 5 Dec 2024 16:11 UTC
8 points
9 comments · 1 min read · LW link
(chrislakin.blog)

What If We Rebuild Motivation with the Fermi ESTIMATion?

Gabriel Brito · 5 Dec 2024 15:35 UTC
5 points
0 comments · 4 min read · LW link