Daniel Wu
Karma: 9
BlackBoxQuery [BBQ]-Bench: Measuring Hypothesis Formation and Experimentation Capabilities in LLMs
Daniel Wu · 12 Jan 2026 19:36 UTC · 10 points · 0 comments · 12 min read