
Grokking (ML)

Tag. Last edit: 29 Feb 2024 5:58 UTC by Morpheus

A phenomenon in machine learning in which a model generalizes to a test set only long after it has fit the training set perfectly.
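As a rough illustration of the definition above, the delay between fitting the training set and generalizing can be read off logged accuracy curves. The sketch below is only a minimal example: the function names, the 0.99 threshold, and the accuracy values are hypothetical and are not taken from any of the posts listed here.

```python
from typing import Optional, Sequence


def first_step_at_or_above(values: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first value >= threshold, or None if never reached."""
    for step, value in enumerate(values):
        if value >= threshold:
            return step
    return None


def generalization_delay(
    train_acc: Sequence[float],
    test_acc: Sequence[float],
    threshold: float = 0.99,  # assumed cutoff for "fits the data"; pick to taste
) -> Optional[int]:
    """Number of logged steps between fitting the training set and the test set.

    Returns None if either curve never reaches the threshold.
    """
    train_step = first_step_at_or_above(train_acc, threshold)
    test_step = first_step_at_or_above(test_acc, threshold)
    if train_step is None or test_step is None:
        return None
    return test_step - train_step


# Made-up curves: training accuracy saturates early, test accuracy much later.
train_acc = [0.2, 0.7, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
test_acc = [0.1, 0.1, 0.1, 0.2, 0.2, 0.3, 0.9, 1.0]

delay = generalization_delay(train_acc, test_acc)
print(delay)  # 5
```

A large positive delay relative to the time needed to fit the training set is the signature the definition describes. What counts as "large" is a judgment call that this sketch does not settle.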

Explaining grokking through circuit efficiency

8 Sep 2023 14:39 UTC
98 points
8 comments · 3 min read · LW link
(arxiv.org)

Grokking, memorization, and generalization — a discussion

29 Oct 2023 23:17 UTC
63 points
10 comments · 23 min read · LW link

Mesa-Optimizers via Grokking

orthonormal · 6 Dec 2022 20:05 UTC
36 points
4 comments · 6 min read · LW link

QAPR 5: grokking is maybe not *that* big a deal?

Quintin Pope · 23 Jul 2023 20:14 UTC
114 points
15 comments · 9 min read · LW link

Paper+Summary: OMNIGROK: GROKKING BEYOND ALGORITHMIC DATA

Marius Hobbhahn · 4 Oct 2022 7:22 UTC
46 points
11 comments · 1 min read · LW link
(arxiv.org)

An interactive introduction to grokking and mechanistic interpretability

7 Aug 2023 19:09 UTC
23 points
3 comments · 1 min read · LW link
(pair.withgoogle.com)

A Mechanistic Interpretability Analysis of Grokking

15 Aug 2022 2:41 UTC
368 points
47 comments · 36 min read · LW link · 1 review
(colab.research.google.com)

Grokking Beyond Neural Networks

Jack Miller · 30 Oct 2023 17:28 UTC
9 points
0 comments · 2 min read · LW link
(arxiv.org)