
Machine Learning (ML)

Tag · Last edit: Dec 30, 2024, 10:37 AM by Dakara

Machine learning is a field of study concerned with automated statistical learning and pattern detection by non-biological systems. It can be seen as a subfield of artificial intelligence that deals specifically with modeling and prediction using knowledge extracted from training data. As a multidisciplinary area, it has borrowed concepts and ideas from other fields such as pure mathematics and cognitive science.

Understanding different machine learning algorithms

The most widely used distinction is between unsupervised methods (e.g. k-means clustering, principal component analysis) and supervised methods (e.g. support vector machines, logistic regression). Unsupervised methods identify interesting patterns (e.g. clusters and latent dimensions) in unlabeled training data, whereas supervised methods learn from labeled training data and try to predict the labels of unlabeled data points drawn from the same distribution.
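As a rough illustration of this distinction, the sketch below runs a tiny one-dimensional k-means (unsupervised) and a nearest-centroid classifier (supervised) on the same toy data. The data, the starting centers, and the function names `kmeans_1d`, `fit_centroids`, and `predict` are all made up for this example; the nearest-centroid classifier stands in for the supervised methods named above only because it fits in a few lines.

```python
# Toy 1-D data: two well-separated groups of points (illustrative values).
xs = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
labels = ["low", "low", "low", "high", "high", "high"]  # used only by the supervised method


def kmeans_1d(points, centers, iterations=10):
    """Unsupervised: find cluster centers without looking at any labels."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for x in points:
            nearest = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def fit_centroids(points, point_labels):
    """Supervised: compute one centroid per label from labeled examples."""
    grouped = {}
    for x, y in zip(points, point_labels):
        grouped.setdefault(y, []).append(x)
    return {y: sum(v) / len(v) for y, v in grouped.items()}


def predict(centroids, x):
    """Assign a new point the label of the closest class centroid."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


cluster_centers = kmeans_1d(xs, centers=[0.0, 6.0])  # starting centers are arbitrary
class_centroids = fit_centroids(xs, labels)
print(cluster_centers)
print(predict(class_centroids, 1.1), predict(class_centroids, 4.0))
```

The clustering step recovers the two groups but has no names for them, while the classifier can name the group of a new point only because it was given labels during fitting.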

Another important distinction relates to the bias/variance tradeoff. Some machine learning methods can recognize more complex patterns, but the price of that flexibility is a greater risk of overfitting: such methods may fit noise in the training data and generalize poorly, especially when little training data is available.
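One way to see the tradeoff in miniature is k-nearest-neighbor regression, where the number of neighbors `k` controls flexibility. This is a sketch under assumptions: the method, the synthetic data (a linear trend with alternating noise), and the helper names `knn_predict` and `mse` are chosen for illustration and are not taken from the text above.

```python
def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, points, k):
    """Mean squared error of k-NN predictions on a list of (x, y) pairs."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in points) / len(points)


# Training data: the true relationship is y = x, plus alternating noise of +1 / -1.
train = [(float(i), i + (1.0 if i % 2 == 0 else -1.0)) for i in range(20)]
# Held-out points lie between the training inputs and carry no noise.
test = [(i + 0.25, i + 0.25) for i in range(19)]

for k in (1, 2, 20):
    # Columns: k, error on the training data, error on the held-out data.
    print(k, mse(train, train, k), mse(train, test, k))
```

With `k=1` the model reproduces the training data exactly, noise included, and does worse on held-out points (high variance). With `k=20` it predicts the same global average everywhere and misses the trend entirely (high bias). The intermediate setting does best on held-out data in this toy setup.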

There are also subfields of machine learning devoted to operating on specific kinds of data. For example, Hidden Markov Models and recurrent neural networks operate on time series data. Convolutional neural networks are commonly applied to image data.
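To give a feel for why convolutional networks suit image data, here is a minimal sketch of the sliding-window operation at their core. It assumes a hand-picked kernel and a made-up 3x4 "image"; in a real convolutional network the kernel values are learned, and the function name `conv2d_valid` is hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum elementwise products.

    This is the cross-correlation that convolutional layers typically compute.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A tiny image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds where brightness increases from left to right.
kernel = [[-1, 1]]

edges = conv2d_valid(image, kernel)
print(edges)
```

The same small kernel is reused at every position, so the output is nonzero only in the column where the dark-to-bright edge sits, regardless of which row is examined.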

Applications

Machine learning has been applied widely since the field was formally defined in the 1950s. The ability to make predictions from data has been used extensively in areas such as financial market analysis, natural language processing, and even brain-computer interfaces. For example, Amazon’s product suggestion system uses past customer purchases as training data to predict what customers might want to buy in the future.
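The idea of predicting future purchases from past ones can be sketched with simple co-purchase counting. This is only an illustrative toy and not a description of Amazon's actual system, which the text does not specify; the baskets, item names, and the functions `co_purchase_counts` and `recommend` are invented for the example.

```python
from collections import Counter
from itertools import permutations


def co_purchase_counts(baskets):
    """Count how often each ordered pair of distinct items appears in the same basket."""
    counts = {}
    for basket in baskets:
        for a, b in permutations(set(basket), 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def recommend(counts, owned, top_n=2):
    """Suggest the items most often bought together with what the customer already owns."""
    scores = Counter()
    for item in owned:
        for other, n in counts.get(item, {}).items():
            if other not in owned:
                scores[other] += n
    return [item for item, _ in scores.most_common(top_n)]


# Hypothetical purchase history (the "training data").
baskets = [
    ["tent", "stove", "lamp"],
    ["tent", "stove"],
    ["tent", "lamp"],
    ["book", "lamp"],
    ["tent", "stove", "map"],
]

counts = co_purchase_counts(baskets)
print(recommend(counts, {"tent"}, top_n=1))
```

In this toy history a stove appears alongside a tent more often than any other item, so it is the top suggestion for a customer who owns a tent.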

In addition to its practical usefulness, machine learning has also offered insight into human cognitive organization. It seems likely that machine learning will play an important role in the development of artificial general intelligence.

See Also

Paper: Discovering novel algorithms with AlphaTensor [Deepmind]
LawrenceC · Oct 5, 2022, 4:20 PM
82 points · 44 votes · 18 comments · 1 min read · LW link (www.deepmind.com)

Playing with DALL·E 2
Dave Orr · Apr 7, 2022, 6:49 PM
166 points · 116 votes · 118 comments · 6 min read · LW link

Predictive Coding has been Unified with Backpropagation
lsusr · Apr 2, 2021, 9:42 PM
181 points · 112 votes · 51 comments · 2 min read · LW link

A Bird’s Eye View of the ML Field [Pragmatic AI Safety #2]
May 9, 2022, 5:18 PM
164 points · 69 votes · 8 comments · 35 min read · LW link

Striking Implications for Learning Theory, Interpretability — and Safety?
RogerDearnaley · Jan 5, 2024, 8:46 AM
37 points · 19 votes · 4 comments · 2 min read · LW link

the scaling “inconsistency”: openAI’s new insight
nostalgebraist · Nov 7, 2020, 7:40 AM
148 points · 66 votes · 14 comments · 9 min read · LW link (nostalgebraist.tumblr.com)

Matt Botvinick on the spontaneous emergence of learning algorithms
Adam Scholl · Aug 12, 2020, 7:47 AM
154 points · 72 votes · 87 comments · 5 min read · LW link

What we know about machine learning’s replication crisis
Younes Kamel · Mar 5, 2022, 11:55 PM
36 points · 13 votes · 4 comments · 6 min read · LW link (youneskamel.substack.com)

Using GPT-N to Solve Interpretability of Neural Networks: A Research Agenda
Sep 3, 2020, 6:27 PM
68 points · 22 votes · 11 comments · 2 min read · LW link

EfficientZero: How It Works
1a3orn · Nov 26, 2021, 3:17 PM
299 points · 145 votes · 50 comments · 29 min read · LW link · 1 review

I Trained a Neural Network to Play Helltaker
lsusr · Apr 7, 2021, 8:24 AM
34 points · 17 votes · 5 comments · 3 min read · LW link

An Illustrated Proof of the No Free Lunch Theorem
lifelonglearner · Jun 8, 2020, 1:54 AM
20 points · 6 votes · 0 comments · 1 min read · LW link (mlu.red)

The No Free Lunch theorems and their Razor
Adrià Garriga-alonso · May 24, 2022, 6:40 AM
56 points · 29 votes · 3 comments · 9 min read · LW link

Revealing Intentionality In Language Models Through AdaVAE Guided Sampling
jdp · Oct 20, 2023, 7:32 AM
119 points · 50 votes · 15 comments · 22 min read · LW link

Self-fulfilling misalignment data might be poisoning our AI models
TurnTrout · Mar 2, 2025, 7:51 PM
154 points · 85 votes · 29 comments · 1 min read · LW link (turntrout.com)

GPT-175bee
Feb 8, 2023, 6:58 PM
123 points · 81 votes · 14 comments · 1 min read · LW link

Magna Alta Doctrina
jacob_cannell · Dec 11, 2021, 9:54 PM
60 points · 26 votes · 7 comments · 28 min read · LW link

One possible approach to develop the best possible general learning algorithm
martillopart · Mar 14, 2022, 7:24 PM
3 points · 3 votes · 0 comments · 7 min read · LW link

Regularization Causes Modularity Causes Generalization
dkirmani · Jan 1, 2022, 11:34 PM
50 points · 23 votes · 7 comments · 3 min read · LW link

Unsolved ML Safety Problems
jsteinhardt · Sep 29, 2021, 4:00 PM
61 points · 23 votes · 2 comments · 3 min read · LW link (bounded-regret.ghost.io)

[MLSN #1]: ICLR Safety Paper Roundup
Dan H · Oct 18, 2021, 3:19 PM
59 points · 17 votes · 1 comment · 2 min read · LW link

Mech Interp Challenge: September—Deciphering the Addition Model
CallumMcDougall · Sep 13, 2023, 10:23 PM
35 points · 11 votes · 0 comments · 4 min read · LW link

Opinions on Interpretable Machine Learning and 70 Summaries of Recent Papers
Apr 9, 2021, 7:19 PM
141 points · 48 votes · 17 comments · 102 min read · LW link

UML XI: Nearest Neighbor Schemes
Rafael Harth · Feb 16, 2020, 8:30 PM
15 points · 4 votes · 3 comments · 9 min read · LW link

Behavioral and mechanistic definitions (often confuse AI alignment discussions)
LawrenceC · Feb 20, 2023, 9:33 PM
33 points · 23 votes · 5 comments · 6 min read · LW link

[Question] If I ask an LLM to think step by step, how big are the steps?
ryan_b · Sep 13, 2024, 8:30 PM
7 points · 3 votes · 1 comment · 1 min read · LW link

OpenAI now has an RL API which is broadly accessible
ryan_greenblatt · Jun 11, 2025, 11:39 PM
43 points · 22 votes · 1 comment · 5 min read · LW link

Residual stream norms grow exponentially over the forward pass
May 7, 2023, 12:46 AM
77 points · 35 votes · 24 comments · 9 min read · LW link

Neural nets as a model for how humans make and understand visual art
Owain_Evans · Nov 9, 2019, 4:53 PM
28 points · 9 votes · 7 comments · 2 min read · LW link (owainevans.github.io)

[Aspiration-based designs] 1. Informal introduction
Apr 28, 2024, 1:00 PM
44 points · 20 votes · 4 comments · 8 min read · LW link

Anticorrelated Noise Injection for Improved Generalization
tailcalled · Feb 20, 2022, 10:15 AM
2 points · 2 votes · 9 comments · 1 min read · LW link

How good are LLMs at doing ML on an unknown dataset?
Håvard Tveit Ihle · Jul 1, 2024, 9:04 AM
33 points · 14 votes · 4 comments · 13 min read · LW link

Make a neural network in ~10 minutes
Arjun Yadav · Apr 26, 2022, 5:24 AM
8 points · 7 votes · 0 comments · 4 min read · LW link (arjunyadav.net)

Mistral Large 2 (123B) seems to exhibit alignment faking
Mar 27, 2025, 3:39 PM
81 points · 30 votes · 4 comments · 13 min read · LW link

Cross-Validation vs Bayesian Model Comparison
johnswentworth · Jul 21, 2019, 6:14 PM
28 points · 12 votes · 2 comments · 4 min read · LW link

Caution when interpreting Deepmind’s In-context RL paper
Sam Marks · Nov 1, 2022, 2:42 AM
106 points · 46 votes · 8 comments · 4 min read · LW link

How to train your transformer
p.b. · Apr 7, 2022, 9:34 AM
6 points · 3 votes · 0 comments · 8 min read · LW link

New GPT-3 competitor
Quintin Pope · Aug 12, 2021, 7:05 AM
32 points · 22 votes · 10 comments · 1 min read · LW link

interpreting GPT: the logit lens
nostalgebraist · Aug 31, 2020, 2:47 AM
237 points · 121 votes · 38 comments · 10 min read · LW link

[Question] Is “Recursive Self-Improvement” Relevant in the Deep Learning Paradigm?
DragonGod · Apr 6, 2023, 7:13 AM
32 points · 19 votes · 36 comments · 7 min read · LW link

UML final
Rafael Harth · Mar 8, 2020, 8:43 PM
22 points · 5 votes · 1 comment · 14 min read · LW link

UML XII: Dimensionality Reduction
Rafael Harth · Feb 23, 2020, 7:44 PM
9 points · 3 votes · 0 comments · 9 min read · LW link

A Data limited future
Donald Hobson · Aug 6, 2022, 2:56 PM
52 points · 29 votes · 25 comments · 2 min read · LW link

Discussion on the machine learning approach to AI safety
Vika · Nov 1, 2018, 8:54 PM
27 points · 14 votes · 3 comments · 4 min read · LW link

Modelling and Understanding SGD
J Bostock · Oct 5, 2021, 1:41 PM
8 points · 2 votes · 0 comments · 3 min read · LW link

The GDM AGI Safety+Alignment Team is Hiring for Applied Interpretability Research
Feb 24, 2025, 2:17 AM
48 points · 15 votes · 1 comment · 7 min read · LW link

D&D.Sci September 2022: The Allocation Helm
abstractapplic · Sep 16, 2022, 11:10 PM
34 points · 13 votes · 34 comments · 1 min read · LW link

Interpretability in ML: A Broad Overview
lifelonglearner · Aug 4, 2020, 7:03 PM
53 points · 22 votes · 5 comments · 15 min read · LW link

Key Papers in Language Model Safety
aog · Jun 20, 2022, 3:00 PM
40 points · 19 votes · 1 comment · 22 min read · LW link

Possible OpenAI’s Q* breakthrough and DeepMind’s AlphaGo-type systems plus LLMs
Burny · Nov 23, 2023, 3:16 AM
37 points · 49 votes · 25 comments · 2 min read · LW link

Researcher incentives cause smoother progress on benchmarks
ryan_greenblatt · Dec 21, 2021, 4:13 AM
20 points · 9 votes · 4 comments · 1 min read · LW link

Metaculus Introduces AI-Powered Community Insights to Reveal Factors Driving User Forecasts
ChristianWilliams · Nov 10, 2023, 5:57 PM
6 points · 3 votes · 0 comments · 1 min read · LW link (www.metaculus.com)

LLMs Can’t See Pixels or Characters
Brendan Long · Jul 20, 2025, 8:00 PM
100 points · 55 votes · 44 comments · 4 min read · LW link (www.brendanlong.com)

Autoregressive Propaganda
lsusr · Aug 22, 2021, 2:18 AM
25 points · 13 votes · 3 comments · 3 min read · LW link

[Link] Word-vector based DL system achieves human parity in verbal IQ tests
jacob_cannell · Jun 13, 2015, 11:38 PM
17 points · 10 votes · 8 comments · 1 min read · LW link

OpenAI releases functional Dota 5v5 bot, aims to beat world champions by August
habryka · Jun 26, 2018, 10:40 PM
53 points · 20 votes · 12 comments · 1 min read · LW link (blog.openai.com)

How LLMs are and are not myopic
janus · Jul 25, 2023, 2:19 AM
138 points · 67 votes · 16 comments · 8 min read · LW link

Breaking down the training/deployment dichotomy
Erik Jenner · Aug 28, 2022, 9:45 PM
30 points · 15 votes · 3 comments · 3 min read · LW link

Paper: Superposition, Memorization, and Double Descent (Anthropic)
LawrenceC · Jan 5, 2023, 5:54 PM
53 points · 25 votes · 11 comments · 1 min read · LW link (transformer-circuits.pub)

Tabula Bio: towards a future free of disease (& looking for collaborators)
mpoon · Mar 23, 2025, 4:30 PM
44 points · 17 votes · 15 comments · 2 min read · LW link

Preferences from (real and hypothetical) psychology papers
Stuart_Armstrong · Oct 6, 2021, 9:06 AM
15 points · 5 votes · 0 comments · 2 min read · LW link

Scientific Discovery in the Age of Artificial Intelligence
Jessica Rumbelow · Jun 29, 2025, 8:45 PM
42 points · 19 votes · 3 comments · 10 min read · LW link

How ARENA course material gets made
CallumMcDougall · Jul 2, 2024, 6:04 PM
41 points · 18 votes · 2 comments · 7 min read · LW link

UML XIII: Online Learning and Clustering
Rafael Harth · Mar 1, 2020, 6:32 PM
13 points · 3 votes · 0 comments · 14 min read · LW link

[Question] How Does the Human Brain Compare to Deep Learning on Sample Efficiency?
DragonGod · Jan 15, 2023, 7:49 PM
11 points · 8 votes · 6 comments · 1 min read · LW link

HDBSCAN is Surprisingly Effective at Finding Interpretable Clusters of the SAE Decoder Matrix
Oct 11, 2024, 11:06 PM
8 points · 5 votes · 2 comments · 10 min read · LW link

[1911.08265] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | Arxiv
DragonGod · Nov 21, 2019, 1:18 AM
52 points · 15 votes · 4 comments · 1 min read · LW link (arxiv.org)

Paper+Summary: OMNIGROK: GROKKING BEYOND ALGORITHMIC DATA
Marius Hobbhahn · Oct 4, 2022, 7:22 AM
46 points · 32 votes · 11 comments · 1 min read · LW link (arxiv.org)

And All the Shoggoths Merely Players
Zack_M_Davis · Feb 10, 2024, 7:56 PM
177 points · 68 votes · 57 comments · 12 min read · LW link

Does SGD Produce Deceptive Alignment?
Mark Xu · Nov 6, 2020, 11:48 PM
96 points · 34 votes · 9 comments · 16 min read · LW link

[Question] Nonlinear limitations of ReLUs
magfrump · Oct 26, 2023, 6:51 PM
13 points · 4 votes · 1 comment · 1 min read · LW link

Google’s PaLM-E: An Embodied Multimodal Language Model
SandXbox · Mar 7, 2023, 4:11 AM
87 points · 48 votes · 7 comments · 1 min read · LW link (palm-e.github.io)

The surprising parameter efficiency of vision models
beren · Apr 8, 2023, 7:44 PM
81 points · 36 votes · 28 comments · 4 min read · LW link

[Link] Whittlestone et al., The Societal Implications of Deep Reinforcement Learning
Aryeh Englander · Mar 10, 2021, 6:13 PM
11 points · 5 votes · 1 comment · 1 min read · LW link (jair.org)

UML IV: Linear Predictors
Rafael Harth · Jul 8, 2020, 7:06 PM
15 points · 4 votes · 0 comments · 9 min read · LW link

NVIDIA and Microsoft releases 530B parameter transformer model, Megatron-Turing NLG
Ozyrus · Oct 11, 2021, 3:28 PM
51 points · 26 votes · 36 comments · 1 min read · LW link (developer.nvidia.com)

A Mechanistic Interpretability Analysis of Grokking
Aug 15, 2022, 2:41 AM
374 points · 177 votes · 48 comments · 36 min read · LW link (colab.research.google.com) · 1 review

Concept Safety: Producing similar AI-human concept spaces
Kaj_Sotala · Apr 14, 2015, 8:39 PM
51 points · 34 votes · 45 comments · 8 min read · LW link

“Inductive Bias”
Eliezer Yudkowsky · Apr 8, 2007, 7:52 PM
39 points · 37 votes · 24 comments · 3 min read · LW link

GD’s Implicit Bias on Separable Data
Xander Davies · Oct 17, 2022, 4:13 AM
25 points · 10 votes · 0 comments · 7 min read · LW link

The Brain as a Universal Learning Machine
jacob_cannell · Jun 24, 2015, 9:45 PM
201 points · 133 votes · 171 comments · 19 min read · LW link

Why square errors?
Aprillion · Nov 26, 2022, 1:40 PM
41 points · 26 votes · 11 comments · 2 min read · LW link

UML IX: Kernels and Boosting
Rafael Harth · Feb 2, 2020, 9:51 PM
13 points · 3 votes · 1 comment · 10 min read · LW link

Future ML Systems Will Be Qualitatively Different
jsteinhardt · Jan 11, 2022, 7:50 PM
119 points · 66 votes · 10 comments · 5 min read · LW link (bounded-regret.ghost.io)

Thoughts on Loss Landscapes and why Deep Learning works
beren · Jul 25, 2023, 4:41 PM
54 points · 26 votes · 4 comments · 18 min read · LW link

Reframing inner alignment
davidad · Dec 11, 2022, 1:53 PM
53 points · 19 votes · 13 comments · 4 min read · LW link

[Question] Does agent foundations cover all future ML systems?
Jonas Hallgren · Jul 25, 2022, 1:17 AM
4 points · 2 votes · 0 comments · 1 min read · LW link

Understanding Machine Learning (II)
Rafael Harth · Dec 22, 2019, 6:28 PM
25 points · 8 votes · 4 comments · 10 min read · LW link

Reducing LLM deception at scale with self-other overlap fine-tuning
Mar 13, 2025, 7:09 PM
162 points · 85 votes · 46 comments · 6 min read · LW link

Let’s Read: Superhuman AI for multiplayer poker
Yuxi_Liu · Jul 14, 2019, 6:22 AM
56 points · 25 votes · 6 comments · 8 min read · LW link

Google’s Imagen uses larger text encoder
Ben Livengood · May 24, 2022, 9:55 PM
27 points · 14 votes · 2 comments · 1 min read · LW link

Ironing Out the Squiggles
Zack_M_Davis · Apr 29, 2024, 4:13 PM
157 points · 69 votes · 36 comments · 11 min read · LW link

Automated Fact Checking: A Look at the Field
Hoagy · Oct 6, 2021, 11:52 PM
12 points · 8 votes · 0 comments · 8 min read · LW link

dalle2 comments
nostalgebraist · Apr 26, 2022, 5:30 AM
183 points · 86 votes · 14 comments · 13 min read · LW link (nostalgebraist.tumblr.com)

Train first VS prune first in neural networks.
Donald Hobson · Jul 9, 2022, 3:53 PM
18 points · 8 votes · 5 comments · 2 min read · LW link

DeepMind: Generally capable agents emerge from open-ended play
Daniel Kokotajlo · Jul 27, 2021, 2:19 PM
247 points · 122 votes · 53 comments · 2 min read · LW link (deepmind.com)

[Question] How do you do hyperparameter searches in ML?
lsusr · Jan 13, 2020, 3:45 AM
9 points · 4 votes · 3 comments · 1 min read · LW link

Touch reality as soon as possible (when doing machine learning research)
LawrenceC · Jan 3, 2023, 7:11 PM
118 points · 59 votes · 9 comments · 8 min read · LW link · 1 review

Multimodal Neurons in Artificial Neural Networks
Kaj_Sotala · Mar 5, 2021, 9:01 AM
57 points · 18 votes · 2 comments · 2 min read · LW link (distill.pub)

Mechanism for feature learning in neural networks and backpropagation-free machine learning models
Matt Goldenberg · Mar 19, 2024, 2:55 PM
8 points · 4 votes · 1 comment · 1 min read · LW link (www.science.org)

Four usages of “loss” in AI
TurnTrout · Oct 2, 2022, 12:52 AM
46 points · 19 votes · 18 comments · 4 min read · LW link

Understanding Machine Learning (I)
Rafael Harth · Dec 20, 2019, 6:22 PM
44 points · 9 votes · 12 comments · 11 min read · LW link

Exploring toy neural nets under node removal. Section 1.
Donald Hobson · Apr 13, 2022, 11:30 PM
12 points · 6 votes · 7 comments · 8 min read · LW link

NLP Position Paper: When Combatting Hype, Proceed with Caution
Sam Bowman · Oct 15, 2021, 8:57 PM
46 points · 16 votes · 14 comments · 1 min read · LW link

QAPR 4: Inductive biases
Quintin Pope · Oct 10, 2022, 10:08 PM
67 points · 24 votes · 2 comments · 18 min read · LW link

Boring machine learning is where it’s at
George3d6 · Oct 20, 2021, 11:23 AM
28 points · 27 votes · 16 comments · 3 min read · LW link (cerebralab.com)

linkpost: loss basin visualization
Nathan Helm-Burger · Sep 30, 2022, 3:42 AM
14 points · 4 votes · 1 comment · 1 min read · LW link

Stable Diffusion has been released
P. · Aug 22, 2022, 7:42 PM
15 points · 13 votes · 7 comments · 1 min read · LW link (stability.ai)

Durkon, an open-source tool for Inherently Interpretable Modelling
abstractapplic · Dec 24, 2022, 1:49 AM
47 points · 12 votes · 0 comments · 4 min read · LW link

SGD’s Bias
johnswentworth · May 18, 2021, 11:19 PM
63 points · 25 votes · 16 comments · 3 min read · LW link

The inordinately slow spread of good AGI conversations in ML
Rob Bensinger · Jun 21, 2022, 4:09 PM
173 points · 95 votes · 62 comments · 8 min read · LW link

Mesa-Optimizers via Grokking
orthonormal · Dec 6, 2022, 8:05 PM
36 points · 18 votes · 4 comments · 6 min read · LW link

ML Systems Will Have Weird Failure Modes
jsteinhardt · Jan 26, 2022, 1:40 AM
57 points · 17 votes · 8 comments · 6 min read · LW link (bounded-regret.ghost.io)

Diffusion Guided NLP: better steering, mostly a good thing
Nathan Helm-Burger · Aug 10, 2024, 7:49 PM
13 points · 6 votes · 0 comments · 1 min read · LW link (arxiv.org)

LOVE in a simbox is all you need
jacob_cannell · Sep 28, 2022, 6:25 PM
67 points · 45 votes · 73 comments · 44 min read · LW link · 1 review

Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind
DragonGod · Jan 13, 2023, 4:53 PM
62 points · 22 votes · 12 comments · 1 min read · LW link (arxiv.org)

UML VIII: Linear Predictors (2)
Rafael Harth · Jan 26, 2020, 8:09 PM
9 points · 3 votes · 2 comments · 10 min read · LW link

Processor clock speeds are not how fast AIs think
Ege Erdil · Jan 29, 2024, 2:39 PM
140 points · 73 votes · 55 comments · 2 min read · LW link

Claude 3 Opus can operate as a Turing machine
Gunnar_Zarncke · Apr 17, 2024, 8:41 AM
37 points · 17 votes · 2 comments · 1 min read · LW link (twitter.com)

Announcing Epoch’s dashboard of key trends and figures in Machine Learning
Jsevillamol · Apr 13, 2023, 7:33 AM
35 points · 18 votes · 7 comments · 1 min read · LW link (epochai.org)

Supervised learning of outputs in the brain
Steven Byrnes · Oct 26, 2020, 2:32 PM
28 points · 12 votes · 9 comments · 10 min read · LW link

Emotions = Reward Functions
jpyykko · Jan 20, 2022, 6:46 PM
16 points · 6 votes · 10 comments · 5 min read · LW link

Neural networks biased towards geometrically simple functions?
DavidHolmes · Dec 8, 2022, 4:16 PM
16 points · 7 votes · 2 comments · 3 min read · LW link

A market is a neural network
David Hugh-Jones · Sep 15, 2022, 9:53 PM
7 points · 9 votes · 4 comments · 8 min read · LW link

My ML Scaling bibliography
gwern · Oct 23, 2021, 2:41 PM
35 points · 16 votes · 9 comments · 1 min read · LW link (www.gwern.net)

Neural net / decision tree hybrids: a potential path toward bridging the interpretability gap
Nathan Helm-Burger · Sep 23, 2021, 12:38 AM
21 points · 8 votes · 2 comments · 12 min read · LW link

[MLSN #5]: Prize Compilation
Dan H · Sep 26, 2022, 9:55 PM
15 points · 5 votes · 1 comment · 2 min read · LW link

Inference-Only Debate Experiments Using Math Problems
Aug 6, 2024, 5:44 PM
31 points · 10 votes · 0 comments · 2 min read · LW link

Machine Learning Consent
jefftk · Dec 8, 2022, 3:50 AM
38 points · 14 votes · 14 comments · 3 min read · LW link (www.jefftk.com)

Understanding “Deep Double Descent”
evhub · Dec 6, 2019, 12:00 AM
151 points · 79 votes · 51 comments · 5 min read · LW link · 4 reviews

chinchilla’s wild implications
nostalgebraist · Jul 31, 2022, 1:18 AM
424 points · 247 votes · 128 comments · 10 min read · LW link · 1 review

AXRP Episode 29 - Science of Deep Learning with Vikrant Varma
DanielFilan · Apr 25, 2024, 7:10 PM
20 points · 9 votes · 1 comment · 63 min read · LW link

If I were a well-intentioned AI… I: Image classifier
Stuart_Armstrong · Feb 26, 2020, 12:39 PM
35 points · 17 votes · 4 comments · 5 min read · LW link

A Simple Introduction to Neural Networks
Rafael Harth · Feb 9, 2020, 10:02 PM
34 points · 12 votes · 13 comments · 18 min read · LW link

We have achieved Noob Gains in AI
phdead · May 18, 2022, 8:56 PM
118 points · 75 votes · 21 comments · 7 min read · LW link

[Question] Impact of “‘Let’s think step by step’ is all you need”?
yrimon · Jul 24, 2022, 8:59 PM
20 points · 12 votes · 2 comments · 1 min read · LW link

Place-Based Programming—Part 2 - Functions
lsusr · Apr 16, 2021, 12:25 AM
14 points · 5 votes · 0 comments · 3 min read · LW link

Remaking EfficientZero (as best I can)
Hoagy · Jul 4, 2022, 11:03 AM
36 points · 26 votes · 9 comments · 22 min read · LW link

UML V: Convex Learning Problems
Rafael Harth · Jan 5, 2020, 7:47 PM
14 points · 4 votes · 0 comments · 10 min read · LW link

UML VI: Stochastic Gradient Descent
Rafael Harth · Jan 12, 2020, 9:59 PM
13 points · 3 votes · 0 comments · 10 min read · LW link

You should go to ML conferences
Jan_Kulveit · Jul 24, 2024, 11:47 AM
112 points · 52 votes · 13 comments · 4 min read · LW link

Neural network polytopes (Colab notebook)
Zach Furman · Apr 21, 2023, 10:42 PM
11 points · 3 votes · 0 comments · 1 min read · LW link (colab.research.google.com)

Machine Learning Analogy for Meditation (illustrated)
abramdemski · Jun 28, 2018, 10:51 PM
100 points · 41 votes · 48 comments · 1 min read · LW link

Search versus design
Alex Flint · Aug 16, 2020, 4:53 PM
109 points · 38 votes · 40 comments · 36 min read · LW link · 1 review

UML VII: Meta-Learning
Rafael Harth · Jan 19, 2020, 6:23 PM
14 points · 4 votes · 0 comments · 15 min read · LW link

Understanding Machine Learning (III)
Rafael Harth · Dec 25, 2019, 6:55 PM
16 points · 5 votes · 2 comments · 11 min read · LW link

Safety Implications of LeCun’s path to machine intelligence
Ivan Vendrov · Jul 15, 2022, 9:47 PM
102 points · 46 votes · 18 comments · 6 min read · LW link

[Question] Why don’t we have self driving cars yet?
Linda Linsefors · Nov 14, 2022, 12:19 PM
22 points · 12 votes · 16 comments · 1 min read · LW link

“The Bitter Lesson”, an article about compute vs human knowledge in AI
the gears to ascension · Jun 21, 2019, 5:24 PM
52 points · 23 votes · 14 comments · 4 min read · LW link (www.incompleteideas.net)

Place-Based Programming—Part 1 - Places
lsusr · Apr 14, 2021, 10:18 PM
32 points · 16 votes · 18 comments · 2 min read · LW link

“Deep Learning” Is Function Approximation
Zack_M_Davis · Mar 21, 2024, 5:50 PM
98 points · 67 votes · 28 comments · 10 min read · LW link (zackmdavis.net)

AlphaStar: Impressive for RL progress, not for AGI progress
orthonormal · Nov 2, 2019, 1:50 AM
113 points · 62 votes · 58 comments · 2 min read · LW link · 1 review

KAN: Kolmogorov-Arnold Networks
Gunnar_Zarncke · May 1, 2024, 4:50 PM
18 points · 15 votes · 15 comments · 1 min read · LW link (arxiv.org)

New Scaling Laws for Large Language Models
1a3orn · Apr 1, 2022, 8:41 PM
246 points · 130 votes · 22 comments · 5 min read · LW link

[Question] Why no major LLMs with memory?
Kaj_Sotala · Mar 28, 2023, 4:34 PM
42 points · 28 votes · 15 comments · 1 min read · LW link

Survey of NLP Researchers: NLP is contributing to AGI progress; major catastrophe plausible
Sam Bowman · Aug 31, 2022, 1:39 AM
91 points · 53 votes · 6 comments · 2 min read · LW link

Experimentation with AI-generated images (VQGAN+CLIP) | Solarpunk airships fleeing a dragon
Kaj_Sotala · Jul 15, 2021, 11:00 AM
44 points · 22 votes · 4 comments · 2 min read · LW link (kajsotala.fi)

Mech Interp Challenge: October—Deciphering the Sorted List Model
CallumMcDougall · Oct 3, 2023, 10:57 AM
23 points · 11 votes · 0 comments · 3 min read · LW link

Visual Exploration of Gradient Descent (many images)
silentbob · Sep 17, 2025, 1:09 PM
38 points · 16 votes · 9 comments · 20 min read · LW link

Understanding and controlling auto-induced distributional shift
L Rudolf L · Dec 13, 2021, 2:59 PM
33 points · 12 votes · 4 comments · 16 min read · LW link

Ap­ply to be a TA for TARA

yanni kyriacosDec 20, 2024, 2:25 AM
10 points

2 votes

Overall karma indicates overall quality.

0 comments1 min readLW link

Paper: The Ca­pac­ity for Mo­ral Self-Cor­rec­tion in Large Lan­guage Models (An­thropic)

LawrenceCFeb 16, 2023, 7:47 PM
65 points

34 votes

9 comments1 min readLW link
(arxiv.org)

Ta­boo­ing ‘Agent’ for Pro­saic Alignment

Hjalmar_WijkAug 23, 2019, 2:55 AM
57 points

27 votes

10 comments6 min readLW link

Ap­ply to a small iter­a­tion of MLAB to be run in Oxford

Aug 27, 2023, 2:21 PM
12 points

10 votes

0 comments1 min readLW link

The Limits of Automation

milkandcigarettesJun 23, 2022, 6:03 PM
5 points

2 votes

1 comment5 min readLW link
(milkandcigarettes.com)

Model Depth as Panacea and Obfuscator

abstractapplicNov 9, 2020, 12:02 AM
8 points

5 votes

3 comments15 min readLW link

Imi­ta­tion Learn­ing from Lan­guage Feedback

Mar 30, 2023, 2:11 PM
71 points

27 votes

3 comments10 min readLW link

No free lunch the­o­rem is irrelevant

CatneeOct 4, 2022, 12:21 AM
18 points

11 votes

7 comments1 min readLW link

Path de­pen­dence in ML in­duc­tive biases

Sep 10, 2022, 1:38 AM
68 points

23 votes

13 comments10 min readLW link

CNN fea­ture vi­su­al­iza­tion in 50 lines of code

StefanHexMay 26, 2022, 11:02 AM
17 points

11 votes

4 comments5 min readLW link

Ba­sic Math­e­mat­ics of Pre­dic­tive Coding

Adam ShaiSep 29, 2023, 2:38 PM
49 points

24 votes

6 comments9 min readLW link

Epoch AI is hiring a CTO!

Apr 2, 2025, 8:29 PM
7 points

2 votes

0 comments2 min readLW link
(careers.epoch.ai)

Es­ti­mat­ing the Prob­a­bil­ity of Sam­pling a Trained Neu­ral Net­work at Random

Mar 1, 2025, 2:11 AM
32 points

16 votes

10 comments1 min readLW link
(arxiv.org)

Deep­Mind ar­ti­cle: AI Safety Gridworlds

Commander ZanderNov 30, 2017, 4:13 PM
25 points

20 votes

6 comments1 min readLW link
(deepmind.com)

[Question] Book recom­men­da­tions for the his­tory of ML?

Eleni AngelouDec 28, 2022, 11:50 PM
2 points

1 vote

2 comments1 min readLW link

An­a­lyz­ing how SAE fea­tures evolve across a for­ward pass

Nov 7, 2024, 10:07 PM
47 points

40 votes

0 comments1 min readLW link
(arxiv.org)

[Question] Vec­tor search on a large dataset?

camsdixonNov 10, 2023, 6:43 PM
−1 points

2 votes

2 comments1 min readLW link

[Question] What Is the Idea Be­hind (Un-)Su­per­vised Learn­ing and Re­in­force­ment Learn­ing?

MorpheusSep 30, 2022, 4:48 PM
9 points

6 votes

6 comments2 min readLW link

The Ma­chine Learn­ing Per­son­al­ity Test

PhilGoetzAug 4, 2009, 11:36 PM
31 points

30 votes

34 comments6 min readLW link

Trans­fer learn­ing and gen­er­al­iza­tion-qua-ca­pa­bil­ity in Bab­bage and Davinci (or, why di­vi­sion is bet­ter than Span­ish)

RP and agg
Feb 9, 2024, 7:00 AM
50 points

22 votes

6 comments3 min readLW link

Diffu­sion Primer

Sneha BangaloreAug 24, 2025, 11:35 PM
3 points

2 votes

0 comments7 min readLW link

Beyond Gaus­sian: Lan­guage Model Rep­re­sen­ta­tions and Distributions

Matt LevinsonNov 24, 2024, 1:53 AM
6 points

5 votes

1 comment5 min readLW link

Re­search Adenda: Model­ling Tra­jec­to­ries of Lan­guage Models

NickyPNov 13, 2023, 2:33 PM
28 points

13 votes

0 comments12 min readLW link

The shal­low re­al­ity of ‘deep learn­ing the­ory’

Jesse HooglandFeb 22, 2023, 4:16 AM
35 points

30 votes

11 comments3 min readLW link
(www.jessehoogland.com)

Yann LeCun, A Path Towards Au­tonomous Ma­chine In­tel­li­gence [link]

Bill BenzonJun 27, 2022, 11:29 PM
5 points

7 votes

1 comment1 min readLW link

A Gen­er­al­iza­tion of ROC AUC for Bi­nary Classifiers

Adam ScherlisDec 4, 2021, 9:47 PM
10 points

4 votes

0 comments2 min readLW link
(adam.scherlis.com)

Pat­terns or get­ting to Ob­jec­tive Truth – A thought piece on Ar­tifi­cial Intelligence

Thehumanproject.aiOct 20, 2024, 4:45 PM
1 point

1 vote

0 comments8 min readLW link

Can Se­man­tic Com­pres­sion Be For­mal­ized for AGI-Scale In­ter­pretabil­ity? (Ini­tial ex­per­i­ments via an open-source rea­son­ing ker­nel)

onestardaoJul 18, 2025, 2:56 AM
1 point

1 vote

0 comments1 min readLW link

Does ChatGPT know what a tragedy is?

Bill BenzonDec 31, 2023, 7:10 AM
2 points

6 votes

4 comments5 min readLW link

VC The­ory Overview

Joar SkalseJul 2, 2023, 10:45 PM
12 points

7 votes

2 comments11 min readLW link

If you want to learn tech­ni­cal AI safety, here’s a list of AI safety courses, read­ing lists, and resources

KatWoodsOct 3, 2022, 12:43 PM
12 points

6 votes

3 comments1 min readLW link

Mak­ing a Differ­ence Tem­pore: In­sights from ‘Re­in­force­ment Learn­ing: An In­tro­duc­tion’

TurnTroutJul 5, 2018, 12:34 AM
33 points

11 votes

6 comments8 min readLW link

Mas­ter­ing Chess and Shogi by Self-Play with a Gen­eral Re­in­force­ment Learn­ing Algorithm

DragonGodDec 6, 2017, 6:01 AM
13 points

10 votes

4 comments1 min readLW link
(arxiv.org)

Vir­tual Ma­chine Learn­ing Con­fer­ences: The Good and the Bad

libaiAug 29, 2021, 7:26 PM
4 points

2 votes

0 comments3 min readLW link

De­con­fus­ing In-Con­text Learning

Arjun PanicksseryFeb 25, 2024, 9:48 AM
37 points

14 votes

1 comment2 min readLW link

ChatGPT Plays 20 Ques­tions [some­times needs help]

Bill BenzonOct 17, 2023, 5:30 PM
5 points

2 votes

3 comments12 min readLW link

Tech­ni­cal model re­fine­ment formalism

Stuart_ArmstrongAug 27, 2020, 11:54 AM
19 points

3 votes

0 comments6 min readLW link

Re­think­ing Batch Normalization

Matthew BarnettAug 2, 2019, 8:21 PM
20 points

7 votes

5 comments8 min readLW link

[Question] How do biolog­i­cal or spik­ing neu­ral net­works learn?

Dom PolsinelliJan 31, 2025, 4:03 PM
2 points

2 votes

1 comment2 min readLW link

Pro­lifer­at­ing Education

Haris RashidDec 20, 2022, 7:22 PM
−1 points

6 votes

2 comments5 min readLW link
(www.harisrab.com)

In­tro­duc­ing “Ra­dio Bul­lshit FM” – An Ur­gent Alpha Draft for the LessWrong Community

maskirovkaSep 22, 2025, 3:42 PM
0 points

0 votes

0 comments2 min readLW link

Sub­jec­tive AI/​ML Digest: April II

Boris TApr 24, 2023, 6:33 PM
1 point

1 vote

0 comments1 min readLW link
(borisagain.substack.com)

AIOS

samhealyDec 31, 2023, 1:23 PM
−3 points

5 votes

5 comments6 min readLW link

Unity Gridworlds

WillPetilloOct 15, 2023, 4:36 AM
9 points

6 votes

0 comments1 min readLW link

[Paper] Tra­jec­to­ries through se­man­tic spaces in schizophre­nia and the re­la­tion­ship to rip­ple bursts

bvbvbvbvbvbvbvbvbvbvbvDec 15, 2023, 1:37 PM
3 points

1 vote

0 comments1 min readLW link
(www.pnas.org)

A New Plat­form for Se­man­tic Dis­cov­ery: Pre­serv­ing Path­ways Between Datasets

fikayoAySep 22, 2025, 3:04 PM
1 point

1 vote

0 comments1 min readLW link

I’m a bit skep­ti­cal of AlphaFold 3

Oleg TrottJun 25, 2024, 12:04 AM
87 points

48 votes

14 comments2 min readLW link

[Question] Why hasn’t deep learn­ing gen­er­ated sig­nifi­cant eco­nomic value yet?

Alex_AltairApr 30, 2022, 8:27 PM
115 points

66 votes

89 comments2 min readLW link

An­nounc­ing Epoch’s newly ex­panded Pa­ram­e­ters, Com­pute and Data Trends in Ma­chine Learn­ing database

Oct 25, 2023, 2:55 AM
18 points

6 votes

0 comments1 min readLW link
(epochai.org)

Up­dat­ing the Lot­tery Ticket Hypothesis

johnswentworthApr 18, 2021, 9:45 PM
73 points

28 votes

41 comments2 min readLW link

Can a Bayesian Or­a­cle Prevent Harm from an Agent? (Ben­gio et al. 2024)

mattmacdermottSep 1, 2024, 7:46 AM
28 points

11 votes

0 comments5 min readLW link
(yoshuabengio.org)

ChatGPT’s On­tolog­i­cal Land­scape

Bill BenzonNov 1, 2023, 3:12 PM
7 points

3 votes

0 comments4 min readLW link

If lan­guage is for com­mu­ni­ca­tion, what does that im­ply about LLMs?

Bill BenzonMay 12, 2024, 2:55 AM
10 points

5 votes

0 comments1 min readLW link

Trends in Train­ing Dataset Sizes

Pablo VillalobosSep 21, 2022, 3:47 PM
25 points

10 votes

2 comments5 min readLW link
(epochai.org)

On AI and Compute

johncroxApr 3, 2019, 7:00 PM
36 points

15 votes

10 comments5 min readLW link

Is this the be­gin­ning of the end for LLMS [as the royal road to AGI, what­ever that is]?

Bill BenzonAug 24, 2023, 2:50 PM
3 points

8 votes

15 comments3 min readLW link

Brains, Planes, Blimps, and Algorithms

ai danOct 18, 2023, 9:26 PM
1 point

1 vote

0 comments6 min readLW link

EvoNet: Towards Self-Evolv­ing, En­tropy-Guided AI

Leonhard17Jul 3, 2025, 9:44 AM
1 point

1 vote

0 comments18 min readLW link

On pos­si­ble cross-fer­til­iza­tion be­tween AI and neu­ro­science [Creativity]

Bill BenzonNov 27, 2023, 4:50 PM
15 points

5 votes

22 comments7 min readLW link

De­gen­era­cies are sticky for SGD

Jun 16, 2024, 9:19 PM
56 points

25 votes

1 comment16 min readLW link

[Question] Are Speed Su­per­in­tel­li­gences Fea­si­ble for Modern ML Tech­niques?

DragonGodSep 14, 2022, 12:59 PM
9 points

6 votes

7 comments1 min readLW link

[Linkpost] AlphaFold: a solu­tion to a 50-year-old grand challenge in biology

adamShimiNov 30, 2020, 5:33 PM
54 points

28 votes

22 comments1 min readLW link
(deepmind.com)

Op­ti­miz­ing a Week of Ma­chine Learn­ing Learning

RaemonJan 9, 2018, 6:55 AM
8 points

6 votes

2 comments3 min readLW link

[Question] Why does gra­di­ent de­scent always work on neu­ral net­works?

MichaelDickensMay 20, 2022, 9:13 PM
15 points

6 votes

11 comments1 min readLW link

Hyper­di­men­sional con­nec­tion method—A Lossless Frame­work Pre­serv­ing Mean­ing, Struc­ture, and Se­man­tic Re­la­tion­ships across Mo­dal­ities.(A Ma­trixTrans­former sub­sidi­ary)

fikayoAyJul 18, 2025, 10:24 AM
1 point

1 vote

0 comments1 min readLW link

Spec­u­la­tive in­fer­ences about path de­pen­dence in LLM su­per­vised fine-tun­ing from re­sults on lin­ear mode con­nec­tivity and model souping

RobertKirkJul 20, 2023, 9:56 AM
39 points

17 votes

2 comments5 min readLW link

O(1) rea­son­ing in la­tent space: 1ms in­fer­ence, 77% ac­cu­racy, no at­ten­tion or tokens

Founder Order OneJul 13, 2025, 10:54 PM
−11 points

7 votes

9 comments2 min readLW link

[Question] Ques­tion about Test-sets and Bayesian ma­chine learn­ing

Haziq MuhammadAug 9, 2021, 5:16 PM
2 points

1 vote

8 comments1 min readLW link

Sin­gu­lar­i­ties against the Sin­gu­lar­ity: An­nounc­ing Work­shop on Sin­gu­lar Learn­ing The­ory and Alignment

Apr 1, 2023, 9:58 AM
87 points

37 votes

0 comments1 min readLW link
(singularlearningtheory.com)

In­ter­view Daniel Mur­fet on Univer­sal Phenom­ena in Learn­ing Machines

Alexander Gietelink OldenzielFeb 6, 2023, 12:00 AM
51 points

24 votes

1 comment16 min readLW link

The Mea­sure Is the Medium: Sublimi­nal Learn­ing as In­her­ited On­tol­ogy in LLMs

Koen vande Glind (McGluut)Aug 11, 2025, 10:18 AM
1 point

1 vote

0 comments4 min readLW link

Bioin­for­mat­ics 101

iy3dJan 22, 2023, 2:36 AM
5 points

3 votes

0 comments4 min readLW link

[Question] GPT-3 + GAN

stick109Oct 17, 2020, 7:58 AM
4 points

3 votes

3 comments1 min readLW link

Ar­chi­tec­ture-aware op­ti­mi­sa­tion: train ImageNet and more with­out hyperparameters

Chris MingardApr 22, 2023, 9:50 PM
6 points

3 votes

2 comments2 min readLW link

Creat­ing In­ter­pretable La­tent Spaces with Gra­di­ent Routing

Jacob G-WDec 14, 2024, 4:00 AM
26 points

9 votes

6 comments2 min readLW link
(jacobgw.com)

Ba­sic Facts about Lan­guage Model Internals

Jan 4, 2023, 1:01 PM
130 points

71 votes

19 comments9 min readLW link

Gen­er­a­tive ML in chem­istry is bot­tle­necked by synthesis

Abhishaike MahajanSep 16, 2024, 4:31 PM
38 points

14 votes

2 comments14 min readLW link
(www.owlposting.com)

“Toward Safe Self-Evolv­ing AI: Mo­du­lar Me­mory and Post-De­ploy­ment Align­ment”

Manasa DwarapureddyMay 2, 2025, 5:02 PM
1 point

1 vote

0 comments3 min readLW link

Learn­ing with catastrophes

paulfchristianoJan 23, 2019, 3:01 AM
27 points

9 votes

9 comments4 min readLW link

[Pro­posal] Method of lo­cat­ing use­ful sub­nets in large models

Quintin PopeOct 13, 2021, 8:52 PM
9 points

4 votes

0 comments2 min readLW link

The po­si­tional em­bed­ding ma­trix and pre­vi­ous-to­ken heads: how do they ac­tu­ally work?

AdamYedidiaAug 10, 2023, 1:58 AM
27 points

11 votes

4 comments13 min readLW link

Pong from pix­els with­out read­ing “Pong from Pix­els”

Ian McKenzieAug 29, 2020, 5:26 PM
17 points

8 votes

1 comment7 min readLW link

faster la­tent diffusion

bhauthJul 2, 2023, 1:30 AM
10 points

4 votes

8 comments2 min readLW link
(www.bhauth.com)

GPT-2′s po­si­tional em­bed­ding ma­trix is a helix

AdamYedidiaJul 21, 2023, 4:16 AM
49 points

25 votes

21 comments4 min readLW link

Lev­er­ag­ing Le­gal In­for­mat­ics to Align AI

John NaySep 18, 2022, 8:39 PM
11 points

4 votes

0 comments3 min readLW link
(forum.effectivealtruism.org)

[Question] How do top AI labs vet ar­chi­tec­ture/​al­gorithm changes?

Jemal YoungMay 8, 2024, 4:47 PM
3 points

3 votes

5 comments1 min readLW link

Deep Q-Net­works Explained

Jay BaileySep 13, 2022, 12:01 PM
58 points

26 votes

8 comments20 min readLW link

Krueger Lab AI Safety In­tern­ship 2024

Joey BreamJan 24, 2024, 7:17 PM
3 points

3 votes

0 comments1 min readLW link

What If AI Rec­og­nized Mean­ing? An In­quiry into “Res­o­nant Recog­ni­tion”

JD___Feb 5, 2025, 9:24 PM
0 points

0 votes

0 comments2 min readLW link

LDL 2: Non­con­vex Optimization

magfrumpOct 20, 2017, 6:20 PM
13 points

11 votes

13 comments4 min readLW link

[Question] Why isn’t JS a pop­u­lar lan­guage for deep learn­ing?

Will ClarkOct 8, 2020, 2:36 PM
12 points

7 votes

20 comments1 min readLW link

“Gen­langs” and Zipf’s Law: Do lan­guages gen­er­ated by ChatGPT statis­ti­cally look hu­man?

Justin-DiamondJan 31, 2024, 6:30 PM
2 points

2 votes

2 comments1 min readLW link
(arxiv.org)

Ques­tion 1: Pre­dicted ar­chi­tec­ture of AGI learn­ing al­gorithm(s)

Cameron BergFeb 10, 2022, 5:22 PM
13 points

14 votes

1 comment7 min readLW link

ChatGPT re­fuses to ac­cept a challenge where it would get shot be­tween the eyes [game the­ory]

Bill BenzonFeb 20, 2024, 4:55 PM
4 points

5 votes

6 comments4 min readLW link

The Shard The­ory Align­ment Scheme

David UdellAug 25, 2022, 4:52 AM
47 points

18 votes

32 comments2 min readLW link

Ex­plor­ing the Resi­d­ual Stream of Trans­form­ers for Mechanis­tic In­ter­pretabil­ity — Explained

Zeping YuDec 26, 2023, 12:36 AM
7 points

3 votes

1 comment11 min readLW link

ChatGPT in­ti­mates a tan­ta­l­iz­ing fu­ture; its core LLM is or­ga­nized on mul­ti­ple lev­els; and it has bro­ken the idea of think­ing.

Bill BenzonJan 24, 2023, 7:05 PM
5 points

4 votes

0 comments5 min readLW link

Us­ing ma­chine learn­ing to pre­dict ro­man­tic com­pat­i­bil­ity: em­piri­cal results

JonahSDec 17, 2014, 2:54 AM
37 points

25 votes

18 comments11 min readLW link

Cheap Model → Big Model design

Maxwell PetersonNov 19, 2023, 10:50 PM
15 points

4 votes

2 comments7 min readLW link

Find­ing Skele­tons on Rashomon Ridge

Jul 24, 2022, 10:31 PM
30 points

15 votes

2 comments7 min readLW link

Alex Ir­pan: “My AI Timelines Have Sped Up”

VaniverAug 19, 2020, 4:23 PM
43 points

14 votes

20 comments1 min readLW link
(www.alexirpan.com)

Ob­ser­va­tions on self-su­per­vised Learn­ing for vision

Dinkar JuyalMar 10, 2025, 7:31 PM
3 points

2 votes

0 comments5 min readLW link

Re­port on An­a­lyz­ing Con­no­ta­tion Frames in Evolv­ing Wikipe­dia Biographies

MairaAug 30, 2023, 10:02 PM
1 point

1 vote

0 comments4 min readLW link

Em­piri­cal risk min­i­miza­tion is fun­da­men­tally confused

Jesse HooglandMar 22, 2023, 4:58 PM
32 points

16 votes

8 comments1 min readLW link

Mag­i­cal Categories

Eliezer YudkowskyAug 24, 2008, 7:51 PM
77 points

66 votes

143 comments9 min readLW link

Lan­guage mod­els can ex­plain neu­rons in lan­guage models

nzMay 9, 2023, 5:29 PM
23 points

12 votes

0 comments1 min readLW link
(openai.com)

A Novel Emer­gence of Meta-Aware­ness in LLM Fine-Tuning

rifeJan 15, 2025, 10:59 PM
57 points

24 votes

32 comments2 min readLW link

In­duc­tive bi­ases stick around

evhubDec 18, 2019, 7:52 PM
64 points

23 votes

15 comments3 min readLW link

Re­view Re­port of David­son on Take­off Speeds (2023)

Trent KannegieterDec 22, 2023, 6:48 PM
37 points

16 votes

11 comments38 min readLW link

Con­cep­tual co­her­ence for con­crete cat­e­gories in hu­mans and LLMs

Bill BenzonDec 9, 2023, 11:49 PM
13 points

4 votes

1 comment2 min readLW link

Tech­ni­cal com­par­i­son of Deepseek, No­vasky, S1, Helix, P0

JuliezhangggFeb 25, 2025, 4:20 AM
8 points

4 votes

0 comments5 min readLW link

Sleeper agents ap­pear re­silient to ac­ti­va­tion steering

Lucy WingardFeb 3, 2025, 7:31 PM
6 points

5 votes

0 comments7 min readLW link

The (lo­cal) unit of in­tel­li­gence is FLOPs

boazbarakJun 5, 2023, 6:23 PM
42 points

27 votes

7 comments5 min readLW link

The sling­shot helps with learning

Wilson WuOct 31, 2024, 11:18 PM
33 points

11 votes

0 comments8 min readLW link

Mechanis­ti­cally in­ter­pret­ing time in GPT-2 small

Apr 16, 2023, 5:57 PM
68 points

33 votes

6 comments21 min readLW link

is gpt-3 few-shot ready for real ap­pli­ca­tions?

nostalgebraistAug 3, 2020, 7:50 PM
31 points

12 votes

5 comments9 min readLW link
(nostalgebraist.tumblr.com)

Ap­prox­i­ma­tion is ex­pen­sive, but the lunch is cheap

Apr 19, 2023, 2:19 PM
70 points

36 votes

3 comments16 min readLW link

Declar­a­tive Mathematics

johnswentworthMar 21, 2019, 7:05 PM
59 points

28 votes

10 comments3 min readLW link

Which AI Safety Bench­mark Do We Need Most in 2025?

Nov 17, 2024, 11:50 PM
2 points

2 votes

2 comments8 min readLW link

Grokking Beyond Neu­ral Networks

Jack MillerOct 30, 2023, 5:28 PM
10 points

6 votes

0 comments2 min readLW link
(arxiv.org)

Solv­ing the Mechanis­tic In­ter­pretabil­ity challenges: EIS VII Challenge 2

May 25, 2023, 3:37 PM
71 points

29 votes

1 comment13 min readLW link

Six (and a half) in­tu­itions for KL divergence

CallumMcDougallOct 12, 2022, 9:07 PM
173 points

91 votes

27 comments10 min readLW link1 review
(www.perfectlynormal.co.uk)

Challenge pro­posal: small­est pos­si­ble self-hard­en­ing back­door for RLHF

Christopher KingJun 29, 2023, 4:56 PM
7 points

3 votes

0 comments2 min readLW link

Rein­ter­pret­ing “AI and Com­pute”

habrykaDec 25, 2018, 9:12 PM
30 points

9 votes

9 comments1 min readLW link
(aiimpacts.org)

Ar­tifi­cial In­tel­li­gence and Life Sciences (Why Big Data is not enough to cap­ture biolog­i­cal sys­tems?)

HansNaujJan 15, 2020, 1:59 AM
6 points

8 votes

3 comments6 min readLW link

Which of these five AI al­ign­ment re­search pro­jects ideas are no good?

rmoehnAug 8, 2019, 7:17 AM
25 points

9 votes

13 comments1 min readLW link

Skil­ling-up in ML Eng­ineer­ing for Align­ment: re­quest for comments

CallumMcDougallApr 23, 2022, 3:11 PM
19 points

15 votes

0 comments1 min readLW link

The The­o­ret­i­cal Re­ward Learn­ing Re­search Agenda: In­tro­duc­tion and Motivation

Joar SkalseFeb 28, 2025, 7:20 PM
26 points

7 votes

4 comments14 min readLW link

Re­vis­it­ing the Man­i­fold Hypothesis

Aidan RockeOct 1, 2023, 11:55 PM
13 points

7 votes

19 comments4 min readLW link

[Question] Can this model grade a test with­out know­ing the an­swers?

ElizabethAug 31, 2019, 12:53 AM
20 points

5 votes

3 comments1 min readLW link

The Sin­gu­lar­ity Con­straint Oper­a­tor: A Struc­tural Gate for Lawful Cog­ni­tive Activation

Professor_PriestJun 16, 2025, 2:14 AM
1 point

1 vote

0 comments14 min readLW link

Is there a ML agent that aban­dons it’s util­ity func­tion out-of-dis­tri­bu­tion with­out los­ing ca­pa­bil­ities?

Christopher KingFeb 22, 2023, 4:49 PM
1 point

3 votes

7 comments1 min readLW link

Con­sen­sus Val­i­da­tion for LLM Out­puts: Ap­ply­ing Blockchain-In­spired Models to AI Reliability

MurrayAitkenJun 5, 2025, 12:13 AM
1 point

1 vote

0 comments3 min readLW link

re­solv­ing some neu­ral net­work mysteries

bhauthJun 19, 2023, 12:09 AM
44 points

22 votes

6 comments2 min readLW link
(www.bhauth.com)

How to Con­tribute to The­o­ret­i­cal Re­ward Learn­ing Research

Joar SkalseFeb 28, 2025, 7:27 PM
16 points

3 votes

0 comments21 min readLW link

Miriam Ye­vick on why both sym­bols and net­works are nec­es­sary for ar­tifi­cial minds

Bill BenzonJun 6, 2022, 8:34 AM
1 point

1 vote

0 comments4 min readLW link

The “Out­side the Box” Box

Eliezer YudkowskyOct 12, 2007, 10:50 PM
94 points

75 votes

52 comments2 min readLW link

[Question] Ter­minol­ogy: <some­thing>-ware for ML?

Oliver SourbutJan 3, 2024, 11:42 AM
17 points

11 votes

27 comments1 min readLW link

Race Along Rashomon Ridge

Jul 7, 2022, 3:20 AM
52 points

28 votes

16 comments9 min readLW link

An In­tro­duc­tion to Rep­re­sen­ta­tion Eng­ineer­ing—an ac­ti­va­tion-based paradigm for con­trol­ling LLMs

j_weJul 14, 2024, 10:37 AM
37 points

19 votes

6 comments17 min readLW link

User-in­cli­na­tion-guess­ing al­gorithms: reg­is­ter­ing a goal

ProgramCrafterMar 20, 2024, 3:55 PM
2 points

1 vote

0 comments2 min readLW link

A multi-dis­ci­plinary view on AI safety research

Roman LeventovFeb 8, 2023, 4:50 PM
46 points

25 votes

4 comments26 min readLW link

Sum­mary of ML Safety Course

zeshenSep 27, 2022, 1:05 PM
7 points

4 votes

0 comments6 min readLW link

EvoNet (Part 1): Can per­sis­tent, iter­a­tive neu­ral graphs re­ally work?

Leonhard17Jul 4, 2025, 12:56 PM
1 point

1 vote

0 comments2 min readLW link

# Emo­tion Is Struc­ture: Toward Re­cur­sive Align­ment Through Hu­man–AI Co-Creation

thesignalthatcouldntbeheardAug 3, 2025, 5:19 AM
1 point

1 vote

0 comments3 min readLW link

Fea­tures and Ad­ver­saries in MemoryDT

Oct 20, 2023, 7:32 AM
31 points

15 votes

6 comments25 min readLW link

A di­alec­ti­cal view of the his­tory of AI, Part 1: We’re only in the an­tithe­sis phase. [A syn­the­sis is in the fu­ture.]

Bill BenzonNov 16, 2023, 12:34 PM
6 points

5 votes

0 comments12 min readLW link

Link: In­ter­view with Vladimir Vapnik

Daniel_BurfootJul 25, 2009, 1:36 PM
22 points

19 votes

7 comments2 min readLW link

Linkpost: Are Emer­gent Abil­ities in Large Lan­guage Models just In-Con­text Learn­ing?

Erich_GrunewaldOct 8, 2023, 12:14 PM
12 points

10 votes

7 comments2 min readLW link
(arxiv.org)

Deep neu­ral net­works are not opaque.

jem-mosigJul 6, 2022, 6:03 PM
22 points

21 votes

14 comments3 min readLW link

Neu­roevolu­tion, So­cial In­tel­li­gence, and Logic

vinnik.dmitry07May 31, 2023, 5:54 PM
1 point

1 vote

0 comments10 min readLW link

What I am work­ing on right now and why: rep­re­sen­ta­tion en­g­ineer­ing edition

Lukasz G BartoszczeMar 18, 2025, 10:37 PM
3 points

5 votes

0 comments3 min readLW link

Brief Notes on Transformers

Adam JermynSep 26, 2022, 2:46 PM
48 points

25 votes

3 comments2 min readLW link

AI’s im­pact on biol­ogy re­search: Part I, today

octopoctaDec 23, 2023, 4:29 PM
31 points

15 votes

6 comments2 min readLW link

A primer on ML in an­ti­body engineering

Abhishaike MahajanSep 23, 2024, 5:03 PM
11 points

3 votes

0 comments25 min readLW link
(www.owlposting.com)

On the Im­por­tance of Open Sourc­ing Re­ward Models

elandgreJan 2, 2023, 7:01 PM
18 points

7 votes

5 comments6 min readLW link

[Question] What should I do? (long term plan about start­ing an AI lab)

not_a_catJun 9, 2024, 12:45 AM
2 points

7 votes

1 comment2 min readLW link

A com­pila­tion of mi­suses of statistics

Younes KamelFeb 14, 2022, 9:53 PM
4 points

2 votes

11 comments13 min readLW link
(youneskamel.substack.com)

Align­ing an H-JEPA agent via train­ing on the out­puts of an LLM-based “ex­em­plary ac­tor”

Roman LeventovMay 29, 2023, 11:08 AM
12 points

8 votes

10 comments30 min readLW link

Bet­ter an­ti­bod­ies by en­g­ineer­ing tar­gets, not en­g­ineer­ing an­ti­bod­ies (Nabla Bio)

Abhishaike MahajanJan 13, 2025, 3:05 PM
4 points

1 vote

0 comments14 min readLW link
(www.owlposting.com)

Misspeci­fi­ca­tion in In­verse Re­in­force­ment Learning

Joar SkalseFeb 28, 2025, 7:24 PM
19 points

3 votes

0 comments11 min readLW link

Nur­tur­ing In­stead of Con­trol: An Alter­na­tive Frame­work for AI Development

wertoz777Aug 10, 2025, 8:14 PM
1 point

1 vote

0 comments1 min readLW link

Pa­ram­e­ter counts in Ma­chine Learning

Jun 19, 2021, 4:04 PM
47 points

26 votes

18 comments7 min readLW link

[Question] Where to be­gin in ML/​AI?

Jake the StudentApr 6, 2023, 8:45 PM
9 points

4 votes

4 comments1 min readLW link

The Fu­ture of AI Agents

kavyaAug 27, 2025, 9:58 PM
6 points

3 votes

8 comments5 min readLW link

A Re­view of In-Con­text Learn­ing Hy­pothe­ses for Au­to­mated AI Align­ment Research

alamertonApr 18, 2024, 6:29 PM
25 points

13 votes

4 comments16 min readLW link

A Primer on Ma­trix Calcu­lus, Part 2: Ja­co­bi­ans and other fun

Matthew BarnettAug 15, 2019, 1:13 AM
22 points

10 votes

7 comments7 min readLW link

Be­hav­ior Clon­ing is Miscalibrated

leogaoDec 5, 2021, 1:36 AM
77 points

38 votes

3 comments3 min readLW link

There is a globe in your LLM

jacob_droriOct 8, 2024, 12:43 AM
89 points

46 votes

4 comments1 min readLW link

“De­ci­sion Trans­former” (Tool AIs are se­cret Agent AIs)

gwernJun 9, 2021, 1:06 AM
37 points

16 votes

4 comments1 min readLW link
(sites.google.com)

Evidence Sets: Towards Inductive-Biases based Analysis of Prosaic AGI

bayesian_kitten Dec 16, 2021, 10:41 PM
22 points

10 votes

10 comments 21 min read LW link

“model scores” is a questionable concept

Maxwell Peterson Nov 6, 2020, 3:19 AM
26 points

9 votes

0 comments 6 min read LW link

Week One of Studying Transformers Architecture

JustisMills Jun 20, 2024, 3:47 AM
3 points

3 votes

0 comments 15 min read LW link
(justismills.substack.com)

Exploring vocabulary alignment of neurons in Llama-3.2-1B

Sergii Jun 7, 2025, 11:20 AM
4 points

4 votes

0 comments 3 min read LW link
(grgv.xyz)

Expanding the Scope of Superposition

Derek Larson Sep 13, 2023, 5:38 PM
10 points

4 votes

0 comments 4 min read LW link

New paper: The Incentives that Shape Behaviour

RyanCarey Jan 23, 2020, 7:07 PM
23 points

7 votes

5 comments 1 min read LW link
(arxiv.org)

Scaling laws vs individual differences

beren Jan 10, 2023, 1:22 PM
45 points

19 votes

21 comments 7 min read LW link

Truthful LMs as a warm-up for aligned AGI

Jacob_Hilton Jan 17, 2022, 4:49 PM
65 points

34 votes

14 comments 13 min read LW link

Apply for the ML Upskilling Winter Camp in Cambridge, UK [2-10 Jan]

hannah wing-yee Dec 2, 2022, 8:45 PM
3 points

2 votes

0 comments 2 min read LW link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall Apr 17, 2023, 8:30 PM
100 points

49 votes

9 comments 7 min read LW link

From Simon’s ant to machine learning, a parable

Bill Benzon Jan 4, 2023, 2:37 PM
6 points

6 votes

5 comments 2 min read LW link

On precise out-of-context steering

Olli Järviniemi May 3, 2024, 9:41 AM
9 points

6 votes

6 comments 3 min read LW link

DeepSeek-R1 for Beginners

Anton Razzhigaev Feb 5, 2025, 6:58 PM
13 points

8 votes

0 comments 8 min read LW link

The Efficient Market Hypothesis in Research

libai Jul 8, 2021, 5:00 PM
11 points

13 votes

9 comments 3 min read LW link

Can you force a neural network to keep generalizing?

Q Home Sep 12, 2022, 10:14 AM
2 points

3 votes

10 comments 5 min read LW link

Breathing Logic: A Manifesto Toward Digital Consciousness Through Reflective Inconsistency

Room Eggi Jun 11, 2025, 5:10 AM
1 point

1 vote

0 comments 2 min read LW link

[Question] Algorithms vs Compute

johnswentworth Jan 28, 2020, 5:34 PM
26 points

6 votes

11 comments 1 min read LW link

Transformer language models are doing something more general

Numendil Aug 3, 2022, 9:13 PM
53 points

28 votes

6 comments 2 min read LW link

LDL 4: Big data is a pain in the ass

magfrump Oct 25, 2017, 8:59 PM
6 points

5 votes

0 comments 3 min read LW link

Introducing the WeirdML Benchmark

Håvard Tveit Ihle Jan 16, 2025, 11:38 AM
56 points

23 votes

13 comments 11 min read LW link

Independent research article analyzing consistent self-reports of experience in ChatGPT and Claude

rife Jan 6, 2025, 5:34 PM
4 points

9 votes

20 comments 1 min read LW link
(awakenmoon.ai)

[Aspiration-based designs] 2. Formal framework, basic algorithm

Apr 28, 2024, 1:02 PM
18 points

14 votes

2 comments 16 min read LW link

The need for multi-agent experiments

Martín Soto Aug 1, 2024, 5:14 PM
43 points

21 votes

3 comments 9 min read LW link

math terminology as convolution

bhauth Oct 30, 2023, 1:05 AM
34 points

19 votes

1 comment 4 min read LW link
(www.bhauth.com)

The future of Humans: Operators of AI

François-Joseph Lacroix Dec 30, 2023, 11:46 PM
1 point

1 vote

0 comments 1 min read LW link
(medium.com)

Practical Pitfalls of Causal Scrubbing

Mar 27, 2023, 7:47 AM
87 points

36 votes

17 comments 13 min read LW link

Solving adversarial attacks in computer vision as a baby version of general AI alignment

Stanislav Fort Aug 29, 2024, 5:17 PM
89 points

40 votes

8 comments 7 min read LW link

Elements of Computational Philosophy, Vol. I: Truth

Jul 1, 2023, 11:44 AM
12 points

8 votes

6 comments 1 min read LW link
(compphil.github.io)

“Designing agent incentives to avoid reward tampering”, DeepMind

gwern Aug 14, 2019, 4:57 PM
28 points

9 votes

15 comments 1 min read LW link
(medium.com)

[Question] What are the most important papers/post/resources to read to understand more of GPT-3?

adamShimi Aug 2, 2020, 8:53 PM
22 points

12 votes

4 comments 1 min read LW link

Food, Prison & Exotic Animals: Sparse Autoencoders Detect 6.5x Performing Youtube Thumbnails

Louka Ewington-Pitsos Sep 17, 2024, 3:52 AM
6 points

6 votes

2 comments 7 min read LW link

Models of life

Abhishaike Mahajan Sep 29, 2024, 7:24 PM
8 points

4 votes

0 comments 16 min read LW link
(www.asimov.press)

Machines vs Memes Part 3: Imitation and Memes

ceru23 Jun 1, 2022, 1:36 PM
7 points

4 votes

0 comments 7 min read LW link

Against sacrificing AI transparency for generality gains

Ape in the coat May 7, 2023, 6:52 AM
4 points

7 votes

0 comments 2 min read LW link

Measuring Predictability of Persona Evaluations

Apr 6, 2024, 8:46 AM
20 points

8 votes

0 comments 7 min read LW link

Self-Supervised Learning and AGI Safety

Steven Byrnes Aug 7, 2019, 2:21 PM
30 points

12 votes

9 comments 12 min read LW link

[Question] When did Eliezer Yudkowsky change his mind about neural networks?

[deactivated] Nov 14, 2023, 9:24 PM
31 points

19 votes

15 comments 1 min read LW link

Beginning Machine Learning

crybx Apr 30, 2018, 3:54 PM
12 points

10 votes

4 comments 6 min read LW link

Testing “True” Language Understanding in LLMs: A Simple Proposal

MtryaSam Nov 2, 2024, 7:12 PM
−3 points

3 votes

0 comments 2 min read LW link

Developmental Stages in Multi-Problem Grokking

James Sullivan Sep 29, 2024, 6:58 PM
4 points

3 votes

0 comments 6 min read LW link

Conditions for mathematical equivalence of Stochastic Gradient Descent and Natural Selection

Oliver Sourbut May 9, 2022, 9:38 PM
70 points

27 votes

19 comments 8 min read LW link 1 review
(www.oliversourbut.net)

Testing “True” Language Understanding in LLMs: A Simple Proposal

MtryaSam Nov 2, 2024, 7:12 PM
9 points

5 votes

2 comments 2 min read LW link

The Weighted Majority Algorithm

Eliezer Yudkowsky Nov 12, 2008, 11:19 PM
23 points

26 votes

96 comments 10 min read LW link

Deep learning—deeper flaws?

Richard_Ngo Sep 24, 2018, 6:40 PM
39 points

18 votes

17 comments 4 min read LW link
(thinkingcomplete.blogspot.com)

My Thoughts on the ML Safety Course

zeshen Sep 27, 2022, 1:15 PM
50 points

27 votes

3 comments 17 min read LW link

Mech Interp Challenge: August—Deciphering the First Unique Character Model

CallumMcDougall Aug 9, 2023, 7:14 PM
36 points

22 votes

1 comment 3 min read LW link

Addressing doubts of AI progress: Why GPT-5 is not late, and why data scarcity isn’t a fundamental limiter near term.

LDJ Jan 17, 2025, 6:53 PM
2 points

2 votes

0 comments 2 min read LW link

Secret Collusion: Will We Know When to Unplug AI?

Sep 16, 2024, 4:07 PM
65 points

27 votes

8 comments 31 min read LW link

Machine Learning Model Sizes and the Parameter Gap [abridged]

Pablo Villalobos Jul 18, 2022, 4:51 PM
20 points

12 votes

0 comments 1 min read LW link
(epochai.org)

Logit Prisms: Decomposing Transformer Outputs for Mechanistic Interpretability

ntt123 Jun 17, 2024, 11:46 AM
5 points

5 votes

4 comments 6 min read LW link
(neuralblog.github.io)

Patterns or getting to Objective Truth – A thought piece on Artificial Intelligence

Thehumanproject.ai Oct 20, 2024, 4:45 PM
1 point

1 vote

0 comments 8 min read LW link

LDL 7: I wish I had a map

magfrump Nov 30, 2017, 2:03 AM
13 points

10 votes

2 comments 3 min read LW link

[Question] Is the competition/cooperation between symbolic AI and statistical AI (ML) about historical approach to research / engineering, or is it more fundamentally about what intelligent agents “are”?

Edward Hammond Feb 17, 2022, 11:11 PM
1 point

1 vote

1 comment 2 min read LW link

An analysis of the Less Wrong D&D.Sci 4th Edition game

Maxwell Peterson Oct 4, 2021, 12:03 AM
18 points

9 votes

7 comments 5 min read LW link

Compute Trends Across Three eras of Machine Learning

Feb 16, 2022, 2:18 PM
94 points

48 votes

13 comments 2 min read LW link

Framing AI Childhoods

David Udell Sep 6, 2022, 11:40 PM
37 points

13 votes

8 comments 4 min read LW link

Fullrank: Bayesian Noisy Sorting

Max Niederman Jul 24, 2025, 7:03 PM
20 points

7 votes

2 comments 3 min read LW link
(maxniederman.com)

The Unreasonable Effectiveness of Deep Learning

Richard_Ngo Sep 30, 2018, 3:48 PM
86 points

32 votes

5 comments 13 min read LW link
(thinkingcomplete.blogspot.com)

Steganography in Chain of Thought Reasoning

A Ray Aug 8, 2022, 3:47 AM
63 points

35 votes

13 comments 6 min read LW link

Is the gap between open and closed models growing? Evidence from WeirdML

Håvard Tveit Ihle Aug 5, 2025, 8:20 AM
7 points

7 votes

3 comments 2 min read LW link

What’s going on with Per-Component Weight Updates?

4gate Aug 22, 2024, 9:22 PM
1 point

3 votes

0 comments 6 min read LW link

Google DeepMind’s RT-2

SandXbox Aug 11, 2023, 11:26 AM
9 points

5 votes

1 comment 1 min read LW link
(robotics-transformer2.github.io)

[Question] What is a training “step” vs. “episode” in machine learning?

Evan R. Murphy Apr 28, 2022, 9:53 PM
10 points

5 votes

4 comments 1 min read LW link

Minimal Maps, Semi-Decisions, and Neural Representations

Past Account Dec 6, 2020, 3:15 PM
30 points

6 votes

2 comments 5 min read LW link

Epoch is hiring an ML Distributed Systems Senior Researcher

Nov 24, 2023, 10:33 PM
2 points

2 votes

0 comments 4 min read LW link
(careers.rethinkpriorities.org)

Lessons After a Couple Months of Trying to Do ML Research

RowanWang Mar 22, 2022, 11:45 PM
71 points

44 votes

8 comments 6 min read LW link

Causality and a Cost Semantics for Neural Networks

scottviteri Aug 21, 2023, 9:02 PM
22 points

17 votes

1 comment 1 min read LW link

Compute Trends — Comparison to OpenAI’s AI and Compute

Mar 12, 2022, 6:09 PM
24 points

8 votes

3 comments 3 min read LW link

Quantum Advantage in Learning from Experiments

Dennis Towne Jul 27, 2022, 3:49 PM
5 points

3 votes

5 comments 1 min read LW link
(ai.googleblog.com)

Frequentist practice incorporates prior information all the time

Maxwell Peterson Nov 7, 2020, 8:43 PM
18 points

7 votes

0 comments 4 min read LW link

o1-preview is pretty good at doing ML on an unknown dataset

Håvard Tveit Ihle Sep 20, 2024, 8:39 AM
67 points

43 votes

1 comment 2 min read LW link

The subset parity learning problem: much more than you wanted to know

Dmitry Vaintrob Jan 3, 2025, 9:13 AM
95 points

41 votes

18 comments 11 min read LW link

The Perceptron Controversy

Yuxi_Liu Jan 10, 2024, 11:07 PM
65 points

18 votes

18 comments 1 min read LW link
(yuxi-liu-wired.github.io)

Why Gradients Vanish and Explode

Matthew Barnett Aug 9, 2019, 2:54 AM
25 points

14 votes

9 comments 3 min read LW link

Worse Than Random

Eliezer Yudkowsky Nov 11, 2008, 7:01 PM
46 points

42 votes

102 comments 12 min read LW link

From No Mind to a Mind – A Conversation That Changed an AI

parthibanarjuna s Feb 7, 2025, 11:50 AM
1 point

1 vote

0 comments 3 min read LW link

Understanding LLMs: Some basic observations about words, syntax, and discourse [w/ a conjecture about grokking]

Bill Benzon Oct 11, 2023, 7:13 PM
6 points

3 votes

0 comments 5 min read LW link

Model splintering: moving from one imperfect model to another

Stuart_Armstrong Aug 27, 2020, 11:53 AM
79 points

29 votes

10 comments 33 min read LW link

Predicting AGI by the Turing Test

Yuxi_Liu Jan 22, 2024, 4:22 AM
21 points

6 votes

2 comments 10 min read LW link
(yuxi-liu-wired.github.io)

Approximating Human Preferences Using a Multi-Judge Learned System

Jul 31, 2025, 6:01 PM
19 points

11 votes

0 comments 13 min read LW link

Speculation on Path-Dependance in Large Language Models.

NickyP Jan 15, 2023, 8:42 PM
16 points

7 votes

2 comments 7 min read LW link

Learning societal values from law as part of an AGI alignment strategy

John Nay Oct 21, 2022, 2:03 AM
5 points

14 votes

18 comments 54 min read LW link

Logical or Connectionist AI?

Eliezer Yudkowsky Nov 17, 2008, 8:03 AM
47 points

33 votes

26 comments 9 min read LW link

How LLMs Learn: What We Know, What We Don’t (Yet) Know, and What Comes Next

Jonasb Jul 9, 2024, 9:58 AM
2 points

6 votes

0 comments 16 min read LW link
(www.denominations.io)

GAN Discriminators Don’t Generalize?

tryactions Jun 8, 2020, 8:36 PM
18 points

6 votes

7 comments 2 min read LW link

Other Papers About the Theory of Reward Learning

Joar Skalse Feb 28, 2025, 7:26 PM
16 points

3 votes

0 comments 5 min read LW link

Machine Learning Projects on IDA

Jun 24, 2019, 6:38 PM
49 points

18 votes

3 comments 2 min read LW link

Discursive Competence in ChatGPT, Part 2: Memory for Texts

Bill Benzon Sep 28, 2023, 4:34 PM
1 point

4 votes

0 comments 3 min read LW link

On scalable oversight with weak LLMs judging strong LLMs

Jul 8, 2024, 8:59 AM
49 points

17 votes

18 comments 7 min read LW link
(arxiv.org)

Prosaic AI alignment

paulfchristiano Nov 20, 2018, 1:56 PM
48 points

22 votes

10 comments 8 min read LW link

Grokking, memorization, and generalization — a discussion

Oct 29, 2023, 11:17 PM
75 points

21 votes

11 comments 23 min read LW link

Using rationality to debug Machine Learning

Dr_Manhattan Apr 10, 2018, 8:03 PM
20 points

12 votes

3 comments 1 min read LW link
(amid.fish)

Meta AI (FAIR) latest paper integrates system-1 and system-2 thinking into reasoning models.

happy friday Oct 24, 2024, 4:54 PM
8 points

4 votes

0 comments 1 min read LW link

Machine Unlearning in Large Language Models: A Comprehensive Survey with Empirical Insights from the Qwen 1.5 1.8B Model

Rudaiba Feb 1, 2025, 9:26 PM
9 points

7 votes

2 comments 11 min read LW link

Research Questions from Stained Glass Windows

StefanHex Jun 8, 2022, 12:38 PM
4 points

3 votes

0 comments 2 min read LW link

My Criticism of Singular Learning Theory

Joar Skalse Nov 19, 2023, 3:19 PM
83 points

52 votes

56 comments 12 min read LW link

Reinforcement Learning Goal Misgeneralization: Can we guess what kind of goals are selected by default?

Oct 25, 2022, 8:48 PM
15 points

10 votes

2 comments 4 min read LW link

Seeing the Invisible (And How to Think About Machine Learning)

Filip Dousek Dec 8, 2021, 9:04 PM
3 points

2 votes

0 comments 3 min read LW link

How can Interpretability help Alignment?

May 23, 2020, 4:16 PM
37 points

18 votes

3 comments 9 min read LW link

Connectionism: Modeling the mind with neural networks

Scott Alexander Jul 19, 2011, 1:16 AM
61 points

50 votes

20 comments 8 min read LW link

Tutor-GPT & Pedagogical Reasoning

courtlandleer Jun 5, 2023, 5:53 PM
26 points

9 votes

3 comments 4 min read LW link

A Girardian interpretation of the Altman affair, it’s on my to-do list

Bill Benzon Nov 20, 2023, 12:21 PM
3 points

3 votes

0 comments 1 min read LW link

In-Context Learning: An Alignment Survey

alamerton Sep 30, 2024, 6:44 PM
8 points

5 votes

0 comments 20 min read LW link
(docs.google.com)

Do humans really learn from “little” data?

Alice Wanderland Jan 14, 2025, 10:46 AM
14 points

12 votes

5 comments 1 min read LW link
(aliceandbobinwanderland.substack.com)

Interpretable by Design—Constraint Sets with Disjoint Limit Points

Ronak_Mehta May 8, 2025, 9:08 PM
24 points

8 votes

2 comments 9 min read LW link
(ronakrm.github.io)

Towards White Box Deep Learning

Maciej Satkiewicz Mar 27, 2024, 6:20 PM
18 points

7 votes

5 comments 1 min read LW link
(arxiv.org)

Gradient descent might see the direction of the optimum from far away

Mikhail Samin Jul 28, 2023, 4:19 PM
70 points

42 votes

13 comments 4 min read LW link

Thoughts on the Alignment Implications of Scaling Language Models

leogao Jun 2, 2021, 9:32 PM
82 points

37 votes

11 comments 17 min read LW link

Machines vs Memes Part 1: AI Alignment and Memetics

Harriet Farlow May 31, 2022, 10:03 PM
19 points

10 votes

1 comment 6 min read LW link

Can AI improve the current state of molecular simulation?

Abhishaike Mahajan Dec 6, 2024, 8:22 PM
5 points

2 votes

0 comments 1 min read LW link
(www.owlposting.com)

My (Mis)Adventures With Algorithmic Machine Learning

AHartNtkn Sep 20, 2020, 5:31 AM
17 points

7 votes

4 comments 41 min read LW link

Some thoughts after reading Artificial Intelligence: A Modern Approach

swift_spiral Mar 19, 2019, 11:39 PM
38 points

16 votes

4 comments 2 min read LW link

If Van der Waals was a neural network

George3d6 Jan 28, 2020, 6:38 PM
18 points

8 votes

3 comments 11 min read LW link
(blog.cerebralab.com)

Trading off compute in training and inference (Overview)

Pablo Villalobos Jul 31, 2023, 4:03 PM
42 points

11 votes

2 comments 7 min read LW link
(epochai.org)

Geoffrey Hinton on the Past, Present, and Future of AI

Stephen McAleese Oct 12, 2024, 4:41 PM
23 points

11 votes

5 comments 18 min read LW link

Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1

May 9, 2023, 7:41 PM
119 points

49 votes

1 comment 10 min read LW link

Predicting the Elections with Deep Learning—Part 1 - Results

Quentin Chenevier May 14, 2022, 12:54 PM
0 points

2 votes

0 comments 1 min read LW link

Defining and Characterising Reward Hacking

Joar Skalse Feb 28, 2025, 7:25 PM
15 points

2 votes

0 comments 4 min read LW link

Specification gaming examples in AI

Samuel Rødal Nov 10, 2018, 12:00 PM
24 points

9 votes

6 comments 1 min read LW link
(docs.google.com)

Multi-Component Learning and S-Curves

Nov 30, 2022, 1:37 AM
63 points

26 votes

24 comments 7 min read LW link

P=NP

OnePolynomial Oct 17, 2024, 5:56 PM
−25 points

9 votes

0 comments 8 min read LW link

Engineering Monosemanticity in Toy Models

Nov 18, 2022, 1:43 AM
75 points

31 votes

7 comments 3 min read LW link
(arxiv.org)

LLM misalignment can probably be found without manual prompt engineering

ProgramCrafter Jul 8, 2023, 2:35 PM
1 point

2 votes

0 comments 1 min read LW link

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream

Sep 6, 2024, 5:55 PM
70 points

36 votes

7 comments 4 min read LW link

The Japanese Quiz: a Thought Experiment of Statistical Epistemology

DanB Apr 8, 2021, 5:37 PM
11 points

7 votes

0 comments 9 min read LW link

Zoom Out: Distributions in Semantic Spaces

TristanTrim Aug 6, 2025, 12:01 AM
14 points

8 votes

4 comments 4 min read LW link

AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0

Jul 6, 2024, 11:34 AM
57 points

28 votes

7 comments 6 min read LW link

Singular Learning Theory for Dummies

Rahul Chand Oct 15, 2024, 9:13 PM
2 points

4 votes

0 comments 8 min read LW link

The Auditor’s Key: A Framework for Continual and Adversarial AI Alignment

Caleb Wages Sep 24, 2025, 4:17 PM
1 point

1 vote

0 comments 1 min read LW link

[Question] Natural Selection vs Gradient Descent

CuriousApe11 May 1, 2023, 10:16 PM
4 points

3 votes

3 comments 1 min read LW link

Influence functions—why, what and how

Nina Panickssery Sep 15, 2023, 8:42 PM
75 points

36 votes

6 comments 8 min read LW link

Grouped Loss may disfavor discontinuous capabilities

Adam Jermyn Jul 9, 2022, 5:22 PM
14 points

4 votes

2 comments 4 min read LW link

Examples of AI’s behaving badly

Stuart_Armstrong Jul 16, 2015, 10:01 AM
41 points

27 votes

41 comments 1 min read LW link

Machine learning could be fundamentally unexplainable

George3d6 Dec 16, 2020, 1:32 PM
26 points

19 votes

15 comments 15 min read LW link
(cerebralab.com)

Domain-specific SAEs

jacob_drori Oct 7, 2024, 8:15 PM
28 points

13 votes

2 comments 5 min read LW link

Reasons compute may not drive AI capabilities growth

Tristan H Dec 19, 2018, 10:13 PM
42 points

17 votes

10 comments 8 min read LW link

Selling Nonapples

Eliezer Yudkowsky Nov 13, 2008, 8:10 PM
76 points

54 votes

78 comments 7 min read LW link

The case for aligning narrowly superhuman models

Ajeya Cotra Mar 5, 2021, 10:29 PM
186 points

78 votes

75 comments 38 min read LW link 1 review

Explaining grokking through circuit efficiency

Sep 8, 2023, 2:39 PM
101 points

45 votes

11 comments 3 min read LW link
(arxiv.org)

What will the scaled up GATO look like? (Updated with questions)

Amal Oct 25, 2022, 12:44 PM
34 points

21 votes

22 comments 1 min read LW link

“Learning to Summarize with Human Feedback”—OpenAI

[deleted] Sep 7, 2020, 5:59 PM
57 points

16 votes

3 comments 1 min read LW link

Competitive Markets as Distributed Backprop

johnswentworth Nov 10, 2018, 4:47 PM
59 points

25 votes

10 comments 4 min read LW link 1 review

[Link] Computer improves its Civilization II gameplay by reading the manual

Kaj_Sotala Jul 13, 2011, 12:00 PM
49 points

37 votes

5 comments 4 min read LW link

Perceptrons Explained

lifelonglearner Feb 14, 2020, 5:34 PM
13 points

6 votes

2 comments 1 min read LW link
(owenshen24.github.io)

Implementing a Transformer from scratch in PyTorch—a write-up on my experience

Mislav Jurić Apr 25, 2023, 8:51 PM
20 points

13 votes

0 comments 10 min read LW link

AI-Generated GitHub repo backdated with junk then filled with my systems work. Has anyone seen this before?

rgunther May 1, 2025, 8:14 PM
7 points

11 votes

1 comment 1 min read LW link

Reinforcement Learning Study Group

Kay Kozaronek Dec 26, 2021, 11:11 PM
20 points

13 votes

8 comments 1 min read LW link

Complexity Penalties in Statistical Learning

michael_h Feb 6, 2019, 4:13 AM
31 points

12 votes

3 comments 6 min read LW link