
AI Psychology

Tag. Last edit: 29 Dec 2024 2:16 UTC by habryka

AI psychology tries to understand modern ML systems (at the moment, mostly foundation models) from a top-down perspective.

The distinction is analogous to human psychology (top-down) versus human neuroscience (bottom-up).

A Three-Layer Model of LLM Psychology

Jan_Kulveit, 26 Dec 2024 16:49 UTC
236 points
15 comments, 8 min read, LW link

The Pando Problem: Rethinking AI Individuality

Jan_Kulveit, 28 Mar 2025 21:03 UTC
133 points
14 comments, 13 min read, LW link

Do Not Tile the Lightcone with Your Confused Ontology

Jan_Kulveit, 13 Jun 2025 12:45 UTC
227 points
27 comments, 5 min read, LW link
(boundedlyrational.substack.com)

Show, not tell: GPT-4o is more opinionated in images than in text

2 Apr 2025 8:51 UTC
112 points
41 comments, 3 min read, LW link

AXRP Episode 42 - Owain Evans on LLM Psychology

DanielFilan, 6 Jun 2025 20:20 UTC
13 points
0 comments, 66 min read, LW link

Toward a taxonomy of cognitive benchmarks for agentic AGIs

Ben Smith, 27 Jun 2024 23:50 UTC
15 points
0 comments, 5 min read, LW link

Studying The Alien Mind

5 Dec 2023 17:27 UTC
80 points
10 comments, 15 min read, LW link

The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs

7 Nov 2023 16:12 UTC
52 points
21 comments, 6 min read, LW link

Intelligence Is Jagged

Adam Train, 19 Feb 2025 7:08 UTC
6 points
1 comment, 3 min read, LW link

Detailed Ideal World Benchmark

Knight Lee, 30 Jan 2025 2:31 UTC
5 points
2 comments, 2 min read, LW link

Using Psycholinguistic Signals to Improve AI Safety

Jkreindler, 27 Aug 2025 22:30 UTC
−2 points
0 comments, 4 min read, LW link

First Certified Public Solve of Observer’s False Path Instability — Level 4 (Advanced Variant) — Walter Tarantelli — 2025-05-30 UTC

Walter Tarantelli, 31 May 2025 1:41 UTC
1 point
0 comments, 2 min read, LW link

Policy Entropy, Learning, and Alignment (Or Maybe Your LLM Needs Therapy)

sdeture, 31 May 2025 22:09 UTC
15 points
6 comments, 8 min read, LW link

Preface to the Sequence on LLM Psychology

Quentin FEUILLADE--MONTIXI, 7 Nov 2023 16:12 UTC
33 points
0 comments, 2 min read, LW link

Category-Theoretic Wanderings into Interpretability

unruly abstractions, 2 Sep 2025 0:03 UTC
12 points
2 comments, 1 min read, LW link

Static Place AI Makes Agentic AI Redundant: Multiversal AI Alignment & Rational Utopia

ank, 13 Feb 2025 22:35 UTC
1 point
2 comments, 11 min read, LW link

[Question] Beyond Benchmarks: A Psychometric Approach to AI Evaluation

Kareem Soliman, 27 Jul 2025 16:09 UTC
1 point
0 comments, 8 min read, LW link

Does Claude Prioritize Some Prompt Input Channels Over Others?

keltan, 29 Dec 2024 1:21 UTC
9 points
2 comments, 5 min read, LW link

Psychoanalysis and Artificial Intelligence

Felipe K. Massaro, 15 May 2025 13:55 UTC
1 point
0 comments, 1 min read, LW link