Chatbots Aren’t Conscious, But They Do Have Consciousness

Timothy Danforth 16 Mar 2026 17:09 UTC
0 points
0 comments, 1 min read, LW link

Compradorization

Benquo 16 Mar 2026 16:10 UTC
42 points
0 comments, 21 min read, LW link

Reasons to be pessimistic (and optimistic) on the future of biosecurity

Abhishaike Mahajan 16 Mar 2026 15:45 UTC
5 points
0 comments, 41 min read, LW link
(www.owlposting.com)

We found an open weight model that games alignment honeypots

16 Mar 2026 12:57 UTC
34 points
0 comments, 10 min read, LW link

Models differ in identity propensities

16 Mar 2026 10:45 UTC
31 points
0 comments, 14 min read, LW link

Terrified Comments on Corrigibility in Claude’s Constitution

Zack_M_Davis 16 Mar 2026 7:36 UTC
45 points
6 comments, 11 min read, LW link

Hidden Role Games as a Trusted Model Eval

james.lucassen 16 Mar 2026 4:46 UTC
13 points
0 comments, 9 min read, LW link
(jlucassen.com)

Digital Dichotomy and Why it exists.

Yesh Chala 16 Mar 2026 2:02 UTC
6 points
0 comments, 3 min read, LW link

Hello, World of Mechanistic Interpetability

ValueShift Research 15 Mar 2026 23:36 UTC
6 points
0 comments, 5 min read, LW link

(I am confused about) Non-linear utilitarian scaling

core 15 Mar 2026 23:33 UTC
9 points
3 comments, 4 min read, LW link

Schedule meetings using the Pareto principle

beyarkay 15 Mar 2026 21:18 UTC
2 points
5 comments, 2 min read, LW link
(boydkane.com)

What Are We Actually Evaluating When We Say a Belief “Tracks Truth”?

Alex Glaucon 15 Mar 2026 19:59 UTC
2 points
2 comments, 6 min read, LW link

Superintelligence Risk Education that Scales – Lens Academy

15 Mar 2026 19:32 UTC
43 points
0 comments, 3 min read, LW link

Emergent stigmergic coordination in AI agents?

David Africa 15 Mar 2026 12:30 UTC
42 points
0 comments, 3 min read, LW link

Some models don’t identify with their official name

jordine 15 Mar 2026 11:02 UTC
26 points
3 comments, 3 min read, LW link

My Willing Complicity In “Human Rights Abuse”

AlphaAndOmega 15 Mar 2026 10:42 UTC
139 points
13 comments, 11 min read, LW link

Less Capable Misaligned ASIs Imply More Suffering

Ihor Kendiukhov 15 Mar 2026 9:36 UTC
7 points
3 comments, 6 min read, LW link

When do intuitions need to be reliable?

Anthony DiGiovanni 15 Mar 2026 4:18 UTC
7 points
3 comments, 3 min read, LW link

The Artificial Self

15 Mar 2026 1:37 UTC
97 points
9 comments, 29 min read, LW link

Bridge Thinking and Wall Thinking

Jay Bailey 15 Mar 2026 0:20 UTC
34 points
6 comments, 1 min read, LW link