AI Capabilities

Tag. Last edit: Aug 29, 2021, 12:57 PM by plex

AI Capabilities are the growing abilities of AIs to act effectively in increasingly complex environments. The topic is often compared to AI Alignment, which refers to efforts to ensure that the effective actions AIs take are also intended by their creators and beneficial to humanity.

EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised

gwernNov 2, 2021, 2:32 AM

137 points

52 comments1 min readLW link

(arxiv.org)

Memorizing weak examples can elicit strong behavior out of password-locked models

Fabien Roger and ryan_greenblatt

Jun 6, 2024, 11:54 PM

58 points

5 comments7 min readLW link

A small update to the Sparse Coding interim research report

Lee Sharkey, Dan Braun and beren

Apr 30, 2023, 7:54 PM

61 points

5 comments1 min readLW link

[Paper] Stress-testing capability elicitation with password-locked models

Fabien Roger and ryan_greenblatt

Jun 4, 2024, 2:52 PM

85 points

10 comments12 min readLW link

(arxiv.org)

EfficientZero: How It Works

1a3ornNov 26, 2021, 3:17 PM

299 points

50 comments29 min readLW link 1 review

Getting 50% (SoTA) on ARC-AGI with GPT-4o

ryan_greenblattJun 17, 2024, 6:44 PM

263 points

50 comments13 min readLW link

Competitive programming with AlphaCode

AlgonFeb 2, 2022, 4:49 PM

58 points

36 comments15 min readLW link

(deepmind.com)

Meta AI announces Cicero: Human-Level Diplomacy play (with dialogue)

Jacy Reese AnthisNov 22, 2022, 4:50 PM

93 points

64 comments1 min readLW link

(www.science.org)

[Question] The thing I don’t understand about AGI

Jeremy KalfusJun 18, 2024, 4:25 AM

7 points

12 comments1 min readLW link

What will the scaled up GATO look like? (Updated with questions)

Amal Oct 25, 2022, 12:44 PM

34 points

22 comments1 min readLW link

DeepMind on Stratego, an imperfect information game

sanxiynOct 24, 2022, 5:57 AM

15 points

9 comments1 min readLW link

(arxiv.org)

The case for a negative alignment tax

Cameron Berg, Judd Rosenblatt, Diogo de Lucena and AE Studio

Sep 18, 2024, 6:33 PM

77 points

20 comments7 min readLW link

[Crosspost] AlphaTensor, Taste, and the Scalability of AI

jamierumbelowOct 9, 2022, 7:42 PM

16 points

4 comments1 min readLW link

(jamieonsoftware.com)

What DALL-E 2 can and cannot do

Swimmer963 (Miranda Dixon-Luinenburg) May 1, 2022, 11:51 PM

353 points

303 comments9 min readLW link

Devil’s Advocate: Adverse Selection Against Conscientiousness

lionhearted (Sebastian Marshall)May 28, 2023, 5:53 PM

10 points

2 comments1 min readLW link

Is AI Progress Impossible To Predict?

alyssavanceMay 15, 2022, 6:30 PM

278 points

39 comments2 min readLW link

Benchmarking LLM Agents on Kaggle Competitions

aogMar 22, 2024, 1:09 PM

15 points

4 comments5 min readLW link

Personal imitation software

FlaglandbaseMar 7, 2022, 7:55 AM

6 points

6 comments1 min readLW link

Can GPT-3 Write Contra Dances?

jefftkDec 4, 2022, 3:00 AM

6 points

4 comments10 min readLW link

(www.jefftk.com)

AlexaTM − 20 Billion Parameter Model With Impressive Performance

MrThinkSep 9, 2022, 9:46 PM

5 points

0 comments1 min readLW link

Request: stop advancing AI capabilities

So8resMay 26, 2023, 5:42 PM

154 points

24 comments1 min readLW link

Ok, AI Can Write Pretty Good Fiction Now

JustisMillsJun 16, 2025, 9:13 PM

57 points

34 comments6 min readLW link

(justismills.substack.com)

[linkpost] The final AI benchmark: BIG-bench

RomanSJun 10, 2022, 8:53 AM

25 points

21 comments1 min readLW link

Dual-Useness is a Ratio

jimrandomhApr 6, 2023, 5:46 AM

35 points

2 comments1 min readLW link

Energy-Based Transformers are Scalable Learners and Thinkers

Matrice JacobineJul 8, 2025, 1:44 PM

7 points

5 comments1 min readLW link

(energy-based-transformers.github.io)

AI doing philosophy = AI generating hands?

Wei DaiJan 15, 2024, 9:04 AM

46 points

23 comments3 min readLW link

The Curious Case of the bos_token

larry-dialJun 17, 2025, 7:00 PM

11 points

2 comments10 min readLW link

Capabilities and alignment of LLM cognitive architectures

Seth HerdApr 18, 2023, 4:29 PM

88 points

18 comments20 min readLW link

Google announces ‘Bard’ powered by LaMDA

M. Y. ZuoFeb 6, 2023, 7:40 PM

31 points

3 comments2 min readLW link

Google’s PaLM-E: An Embodied Multimodal Language Model

SandXboxMar 7, 2023, 4:11 AM

87 points

7 comments1 min readLW link

(palm-e.github.io)

What’s the Most Impressive Thing That GPT-4 Could Plausibly Do?

bayesedAug 26, 2022, 3:34 PM

24 points

22 comments1 min readLW link

ChatGPT and Bing Chat can’t play Botticelli

Asha SaavossMar 29, 2023, 5:39 PM

11 points

0 comments6 min readLW link

Molecular dynamics data will be essential for the next generation of ML protein models

Abhishaike MahajanAug 26, 2024, 2:50 PM

9 points

0 comments11 min readLW link

(www.owlposting.com)

The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs

Quentin FEUILLADE--MONTIXI and Pierre Peigné

Nov 7, 2023, 4:12 PM

52 points

21 comments6 min readLW link

A Year of AI Increasing AI Progress

TW123Dec 30, 2022, 2:09 AM

148 points

3 comments2 min readLW link

ACT-1: Transformer for Actions

Daniel KokotajloSep 14, 2022, 7:09 PM

52 points

4 comments1 min readLW link

(www.adept.ai)

Mastering Stratego (Deepmind)

svemirskiDec 2, 2022, 2:21 AM

6 points

0 comments1 min readLW link

(www.deepmind.com)

The alignment problem in different capability regimes

BuckSep 9, 2021, 7:46 PM

88 points

12 comments5 min readLW link

The longest training run

Jsevillamol, Tamay, Owen D and anson.ho

Aug 17, 2022, 5:18 PM

71 points

12 comments9 min readLW link

(epochai.org)

[Question] Could transformer network models learn motor planning like they can learn language and image generation?

mu_(negative)Apr 23, 2023, 5:24 PM

2 points

4 comments1 min readLW link

Squeezing foundations research assistance out of formal logic narrow AI.

Donald HobsonMar 8, 2023, 9:38 AM

16 points

1 comment2 min readLW link

Readability is mostly a waste of characters

vlad.proexApr 21, 2023, 10:05 PM

21 points

7 comments3 min readLW link

Will we run out of ML data? Evidence from projecting dataset size trends

Pablo VillalobosNov 14, 2022, 4:42 PM

75 points

12 comments2 min readLW link

(epochai.org)

Google announces Pathways: new generation multitask AI Architecture

OzyrusOct 29, 2021, 11:55 AM

6 points

1 comment1 min readLW link

(blog.google)

[Question] What would you expect a massive multimodal online federated learner to be capable of?

Aryeh EnglanderAug 27, 2022, 5:31 PM

13 points

4 comments1 min readLW link

[Question] Is “Recursive Self-Improvement” Relevant in the Deep Learning Paradigm?

DragonGodApr 6, 2023, 7:13 AM

32 points

36 comments7 min readLW link

No, human brains are not (much) more efficient than computers

Jesse HooglandSep 6, 2022, 1:53 PM

22 points

21 comments3 min readLW link

(www.jessehoogland.com)

Evaluations project @ ARC is hiring a researcher and a webdev/engineer

Beth BarnesSep 9, 2022, 10:46 PM

99 points

7 comments10 min readLW link

[Question] Are language models close to the superhuman level in philosophy?

Roman LeventovAug 19, 2022, 4:43 AM

6 points

2 comments2 min readLW link

Language models can generate superior text compared to their input

ChristianKlJan 17, 2023, 10:57 AM

48 points

28 comments1 min readLW link

Why the technological singularity by AGI may never happen

hippkeSep 3, 2021, 2:19 PM

5 points

14 comments1 min readLW link

Epistemic Strategies of Safety-Capabilities Tradeoffs

adamShimiOct 22, 2021, 8:22 AM

5 points

0 comments6 min readLW link

Diffusion Guided NLP: better steering, mostly a good thing

Nathan Helm-BurgerAug 10, 2024, 7:49 PM

13 points

0 comments1 min readLW link

(arxiv.org)

Sydney can play chess and kind of keep track of the board state

Erik JennerMar 3, 2023, 9:39 AM

64 points

19 comments6 min readLW link

o3, Oh My

ZviDec 30, 2024, 2:10 PM

63 points

17 comments36 min readLW link

(thezvi.wordpress.com)

AlphaGeometry: An Olympiad-level AI system for geometry

alyssavanceJan 17, 2024, 5:17 PM

45 points

9 comments1 min readLW link

(deepmind.google)

[Question] Are Speed Superintelligences Feasible for Modern ML Techniques?

DragonGodSep 14, 2022, 12:59 PM

9 points

7 comments1 min readLW link

PaLM in “Extrapolating GPT-N performance”

Lukas FinnvedenApr 6, 2022, 1:05 PM

85 points

19 comments2 min readLW link

Timelines to Transformative AI: an investigation

Zershaaneh QureshiMar 26, 2024, 6:28 PM

20 points

2 comments50 min readLW link

“AI achieves silver-medal standard solving International Mathematical Olympiad problems”

gjmJul 25, 2024, 3:58 PM

133 points

38 comments2 min readLW link

(deepmind.google)

OpenAI Solves (Some) Formal Math Olympiad Problems

Michaël TrazziFeb 2, 2022, 9:49 PM

78 points

27 comments2 min readLW link

A chess game against GPT-4

Rafael HarthMar 16, 2023, 2:05 PM

24 points

23 comments1 min readLW link

Steering subsystems: capabilities, agency, and alignment

Seth HerdSep 29, 2023, 1:45 PM

31 points

0 comments8 min readLW link

[Question] Killing Recurrent Memory Over Self Attention?

Del NoboloJun 6, 2023, 11:02 PM

3 points

0 comments1 min readLW link

Elon Musk announces xAI

Jan_KulveitJul 13, 2023, 9:01 AM

75 points

35 comments1 min readLW link

(www.ft.com)

We have achieved Noob Gains in AI

phdeadMay 18, 2022, 8:56 PM

118 points

21 comments7 min readLW link

Principles of Privacy for Alignment Research

johnswentworthJul 27, 2022, 7:53 PM

73 points

31 comments7 min readLW link

Interpreting Yudkowsky on Deep vs Shallow Knowledge

adamShimiDec 5, 2021, 5:32 PM

100 points

32 comments24 min readLW link

Playing Dixit with AI: Can AI Systems Identify Misalignments in My Personalized Statements?

Mariia KoroliukJan 17, 2025, 6:52 PM

1 point

0 comments2 min readLW link

A short project on Mamba: grokking & interpretability

Alejandro TlaieOct 18, 2024, 4:59 PM

21 points

0 comments6 min readLW link

Inflection.ai is a major AGI lab

Nikola JurkovicAug 9, 2023, 1:05 AM

137 points

13 comments2 min readLW link

Deliberative Credit Assignment: Making Faithful Reasoning Profitable

Florian_DietzJul 14, 2025, 9:26 AM

9 points

0 comments17 min readLW link

[Question] Is there a publicly available list of examples of frontier model capabilities?

Max KearneySep 19, 2023, 5:45 PM

1 point

0 comments1 min readLW link

Paper: Discovering novel algorithms with AlphaTensor [Deepmind]

LawrenceCOct 5, 2022, 4:20 PM

82 points

18 comments1 min readLW link

(www.deepmind.com)

Article Review: Google’s AlphaTensor

Robert_AIZIOct 12, 2022, 6:04 PM

8 points

4 comments10 min readLW link

When AI solves a game, focus on the game’s mechanics, not its theme.

Cleo NardoNov 23, 2022, 7:16 PM

89 points

7 comments2 min readLW link

OpenAI Codex: First Impressions

specbugAug 13, 2021, 4:52 PM

49 points

8 comments4 min readLW link

(sixeleven.in)

To contribute to AI safety, consider doing AI research

VikaJan 16, 2016, 8:42 PM

39 points

39 comments2 min readLW link

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Matrice JacobineMay 12, 2025, 3:20 PM

6 points

4 comments1 min readLW link

(www.arxiv.org)

[Question] What’s the difference between newer Atari-playing AI and the older Deepmind one (from 2014)?

RaemonNov 2, 2021, 11:36 PM

27 points

8 comments1 min readLW link

[Question] What are the relative speeds of AI capabilities and AI safety?

NunoSempereApr 24, 2020, 6:21 PM

8 points

2 comments1 min readLW link

Forecasting AI Forecasting

Alvin ÅnestrandJun 23, 2025, 1:39 PM

8 points

4 comments6 min readLW link

[Question] Is the speed of training large models going to increase significantly in the near future due to Cerebras Andromeda?

Amal Nov 15, 2022, 10:50 PM

13 points

11 comments1 min readLW link

HIRING: Inform and shape a new project on AI safety at Partnership on AI

Madhulika SrikumarNov 24, 2021, 8:27 AM

6 points

0 comments1 min readLW link

On agentic generalist models: we’re essentially using existing technology the weakest and worst way you can use it

Yuli_BanAug 28, 2024, 1:57 AM

10 points

2 comments9 min readLW link

GPT4 is capable of writing decent long-form science fiction (with the right prompts)

RomanSMay 23, 2023, 1:41 PM

22 points

28 comments65 min readLW link

Is GPT-N bounded by human capabilities? No.

Cleo NardoOct 17, 2022, 11:26 PM

49 points

8 comments2 min readLW link

Notes on Meta’s Diplomacy-Playing AI

Erich_GrunewaldDec 22, 2022, 11:34 AM

15 points

2 comments14 min readLW link

(www.erichgrunewald.com)

Interpretability Externalities Case Study—Hungry Hungry Hippos

Magdalena WacheSep 20, 2023, 2:42 PM

64 points

22 comments2 min readLW link

AI Forecasting: One Year In

jsteinhardtJul 4, 2022, 5:10 AM

132 points

12 comments6 min readLW link

(bounded-regret.ghost.io)

Alignment being impossible might be better than it being really difficult

Martín SotoJul 25, 2022, 11:57 PM

13 points

2 comments2 min readLW link

INTELLECT-1 Release: The First Globally Trained 10B Parameter Model

Matrice JacobineNov 29, 2024, 11:05 PM

16 points

1 comment1 min readLW link

(www.primeintellect.ai)

[Question] What is the most probable AI?

Zeruel017Jun 20, 2022, 11:26 PM

−2 points

0 comments3 min readLW link

What’s the backward-forward FLOP ratio for Neural Networks?

Marius Hobbhahn and Jsevillamol

Dec 13, 2021, 8:54 AM

20 points

12 comments10 min readLW link

[Question] Hypothetical: what would you do?

JNSAug 3, 2023, 10:39 PM

4 points

2 comments1 min readLW link

A call for a quantitative report card for AI bioterrorism threat models

JunoDec 4, 2023, 6:35 AM

12 points

0 comments10 min readLW link

Open Source Search (Summary)

samuelshadrachJun 18, 2025, 7:35 AM

21 points

1 comment6 min readLW link

(samuelshadrach.com)

Stability AI releases StableLM, an open-source ChatGPT counterpart

OzyrusApr 20, 2023, 6:04 AM

11 points

3 comments1 min readLW link

(github.com)

LLMs are (mostly) not helped by filler tokens

Kshitij SachanAug 10, 2023, 12:48 AM

66 points

36 comments6 min readLW link

Testing PaLM prompts on GPT3

YitzApr 6, 2022, 5:21 AM

103 points

14 comments8 min readLW link

AI as Super-Demagogue

RationalDinoNov 5, 2023, 9:21 PM

11 points

12 comments9 min readLW link

They gave LLMs access to physics simulators

ryan_bOct 17, 2022, 9:21 PM

50 points

18 comments1 min readLW link

(arxiv.org)

AGI-Automated Interpretability is Suicide

__RicG__May 10, 2023, 2:20 PM

25 points

33 comments7 min readLW link

Agentized LLMs will change the alignment landscape

Seth HerdApr 9, 2023, 2:29 AM

160 points

102 comments3 min readLW link 1 review

Estimating training compute of Deep Learning models

lennart, Jsevillamol, Marius Hobbhahn, Tamay Besiroglu and anson.ho

Jan 20, 2022, 4:12 PM

37 points

4 comments1 min readLW link

Google DeepMind’s RT-2

SandXboxAug 11, 2023, 11:26 AM

9 points

1 comment1 min readLW link

(robotics-transformer2.github.io)

Large Language Models Pass the Turing Test

Matrice JacobineApr 2, 2025, 5:41 AM

6 points

0 comments1 min readLW link

(arxiv.org)

What’s up with AI’s vision

Joachim BartosikMay 3, 2025, 1:23 PM

12 points

19 comments1 min readLW link

An Introduction to AI Sandbagging

Teun van der Weij, Felix Hofstätter and Francis Rhys Ward

Apr 26, 2024, 1:40 PM

48 points

13 comments8 min readLW link

A Critique of AI Alignment Pessimism

ExCephJul 19, 2022, 2:28 AM

9 points

1 comment9 min readLW link

$300 for the best sci-fi prompt: the results

RomanSJan 3, 2024, 7:10 PM

16 points

19 comments7 min readLW link

How I’m thinking about GPT-N

delton137Jan 17, 2022, 5:11 PM

54 points

21 comments18 min readLW link

Seeing how well an agentic AI coding tool can do compared to me using an actual real-world example

MassimogJun 1, 2025, 7:24 PM

32 points

2 comments1 min readLW link

(blog.massimogauthier.com)

A case for capabilities work on AI as net positive

Noosphere89Feb 27, 2023, 9:12 PM

10 points

37 comments1 min readLW link

How to measure FLOP/s for Neural Networks empirically?

Marius HobbhahnNov 29, 2021, 3:18 PM

16 points

5 comments7 min readLW link

AI Can’t Write Good Fiction

JustisMillsMar 12, 2025, 6:11 AM

38 points

24 comments7 min readLW link

(justismills.substack.com)

Basic Mathematics of Predictive Coding

Adam ShaiSep 29, 2023, 2:38 PM

49 points

6 comments9 min readLW link

I Would Have Solved Alignment, But I Was Worried That Would Advance Timelines

307thOct 20, 2023, 4:37 PM

124 points

33 comments9 min readLW link

Uncompetitive programming with GPT-3

BezziFeb 6, 2022, 10:19 AM

7 points

8 comments3 min readLW link

OpenAI’s NSFW policy: user safety, harm reduction, and AI consent

8e9Feb 13, 2025, 1:59 PM

4 points

3 comments2 min readLW link

[Question] How might we make better use of AI capabilities research for alignment purposes?

Jemal YoungAug 31, 2022, 4:19 AM

11 points

4 comments1 min readLW link

Gato’s Generalisation: Predictions and Experiments I’d Like to See

Oliver SourbutMay 18, 2022, 7:15 AM

43 points

3 comments10 min readLW link

Distillation of Meta’s Large Concept Models Paper

NickyPMar 4, 2025, 5:33 PM

19 points

3 comments4 min readLW link

Lifelogging for Alignment & Immortality

Dev.ErrataAug 17, 2024, 11:42 PM

13 points

3 comments7 min readLW link

TinyStories: Small Language Models That Still Speak Coherent English

Ulisse MiniMay 28, 2023, 10:23 PM

67 points

8 comments2 min readLW link

(arxiv.org)

[Question] Have we seen any “ReLU instead of sigmoid-type improvements” recently

KvmanThinkingNov 23, 2024, 3:51 AM

2 points

4 comments1 min readLW link

This anime storyboard doesn’t exist: a graphic novel written and illustrated by GPT4

RomanSOct 5, 2023, 2:01 PM

12 points

7 comments55 min readLW link

Towards Better Milestones for Monitoring AI Capabilities

snewmanSep 27, 2023, 9:18 PM

11 points

0 comments14 min readLW link

GPT-4 implicitly values identity preservation: a study of LMCA identity management

OzyrusMay 17, 2023, 2:13 PM

21 points

4 comments13 min readLW link

[Thought Experiment] Tomorrow’s Echo—The future of synthetic companionship.

Vimal NaranOct 26, 2023, 5:54 PM

−7 points

2 comments2 min readLW link

AI Tracker: monitoring current and near-future risks from superscale models

Edouard Harris and Jeremie Harris

Nov 23, 2021, 7:16 PM

67 points

13 comments3 min readLW link

(aitracker.org)

Questions I’d Want to Ask an AGI+ to Test Its Understanding of Ethics

sweenesmJan 26, 2024, 11:40 PM

14 points

6 comments4 min readLW link

DeepMind: Generally capable agents emerge from open-ended play

Daniel KokotajloJul 27, 2021, 2:19 PM

247 points

53 comments2 min readLW link

(deepmind.com)

Eleuther releases Llemma: An Open Language Model For Mathematics

mako yassOct 17, 2023, 8:03 PM

22 points

0 comments1 min readLW link

(blog.eleuther.ai)

It matters when the first sharp left turn happens

Adam JermynSep 29, 2022, 8:12 PM

45 points

9 comments4 min readLW link

[Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations

Teun van der Weij, Felix Hofstätter, Ollie J, Sam F. Brown and Francis Rhys Ward

Jun 13, 2024, 10:04 AM

84 points

10 comments2 min readLW link

(arxiv.org)

How 2025 AI Forecasts Fared So Far

Adam B, romeo and elifland

May 22, 2025, 9:42 AM

11 points

2 comments8 min readLW link

(theaidigest.org)

Among Us: A Sandbox for Agentic Deception

7vik and Adrià Garriga-alonso

Apr 5, 2025, 6:24 AM

110 points

7 comments7 min readLW link

What’s the future of AI hardware?

Itay DreyfusJun 17, 2024, 1:05 PM

2 points

0 comments8 min readLW link

(productidentity.co)

Predict 2025 AI capabilities (by Sunday)

Jonas V, elifland and Sage Future

Jan 15, 2025, 12:16 AM

55 points

3 comments1 min readLW link

Stupidity is also hard

walkthroughwallsSep 12, 2023, 2:45 AM

−8 points

4 comments2 min readLW link

How should DeepMind’s Chinchilla revise our AI forecasts?

Cleo NardoSep 15, 2022, 5:54 PM

35 points

12 comments13 min readLW link

plex Aug 29, 2021, 3:56 PM
1 point
I think this should be in the AI category, likely under Engineering.