kaivu

Karma: 292

Inverse Rubric Optimization: A testbed for agent science

11 Jun 2026 1:44 UTC
5 points
0 comments, 1 min read, LW link
(fulcrum.inc)

Tracking Difficulty with Feature Portfolios

19 May 2026 2:25 UTC
22 points
0 comments, 5 min read, LW link

Benchmarking Real Work

16 May 2026 20:43 UTC
30 points
2 comments, 4 min read, LW link

The bitter lesson for software

16 Mar 2026 23:38 UTC
15 points
3 comments, 2 min read, LW link
(fulcruminc.substack.com)

More is different for intelligence

7 Mar 2026 0:02 UTC
17 points
0 comments, 2 min read, LW link
(fulcruminc.substack.com)

Introducing Lunette: auditing agents for evals and environments

15 Dec 2025 23:17 UTC
23 points
0 comments, 1 min read, LW link
(fulcrumresearch.ai)

Automated real time monitoring and orchestration of coding agents

23 Oct 2025 22:12 UTC
8 points
0 comments, 2 min read, LW link
(fulcrumresearch.ai)

AI agents and painted facades

30 Aug 2025 23:13 UTC
38 points
3 comments, 2 min read, LW link
(fulcrumresearch.ai)

Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs

8 Jul 2024 22:24 UTC
109 points
40 comments, 5 min read, LW link, 1 review

Takeaways from a Mechanistic Interpretability project on “Forbidden Facts”

15 Dec 2023 11:05 UTC
34 points
8 comments, 10 min read, LW link

Update on Harvard AI Safety Team and MIT AI Alignment

2 Dec 2022 0:56 UTC
60 points
4 comments, 8 min read, LW link