Scaling Laws

Tag · Last edit: 4 Oct 2021 19:20 UTC by plex

Scaling laws refer to the observed tendency of some machine learning architectures (notably transformers) to improve their performance along a predictable power law as they are given more compute, data, or parameters (model size), assuming they are not bottlenecked on one of the other resources. This trend has been observed to hold consistently across more than six orders of magnitude.

[Figure: scaling laws graph from "Scaling Laws for Neural Language Models"]
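As a rough illustration of what "predictable power law" means here: if loss follows L(x) = a · x^(-alpha) in some resource x (compute, data, or parameters), then log L is a straight line in log x, so the constants can be estimated with a linear fit in log-log space. The sketch below is only an assumption-laden example, not the method of any post listed on this page: it assumes a pure power law with no irreducible-loss offset, the function name `fit_power_law` is hypothetical, and the data and constants are synthetic rather than measured.

```python
import numpy as np


def fit_power_law(x, loss):
    """Fit loss ≈ a * x**(-alpha) by least squares in log-log space.

    Returns (a, alpha). Assumes every x and loss value is positive.
    """
    # In log space the model is linear: log(loss) = log(a) - alpha * log(x).
    slope, intercept = np.polyfit(np.log(x), np.log(loss), deg=1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic, noise-free data for illustration only (not real measurements).
compute = np.logspace(0, 6, num=13)   # spans six orders of magnitude
true_a, true_alpha = 5.0, 0.05        # hypothetical constants
loss = true_a * compute ** (-true_alpha)

a, alpha = fit_power_law(compute, loss)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

On real training runs the points would be noisy, and a bottleneck on one of the other resources would show up as the curve bending away from the fitted line.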

Thoughts on the Alignment Implications of Scaling Language Models

leogao · 2 Jun 2021 21:32 UTC
77 points
11 comments · 17 min read · LW link

A closer look at chess scalings (into the past)

hippke · 15 Jul 2021 8:13 UTC
47 points
14 comments · 4 min read · LW link

NVIDIA and Microsoft releases 530B parameter transformer model, Megatron-Turing NLG

Ozyrus · 11 Oct 2021 15:28 UTC
51 points
34 comments · 1 min read · LW link
(developer.nvidia.com)

/r/MLScaling: new subreddit for NN scaling research/discussion

gwern · 30 Oct 2020 20:50 UTC
19 points
0 comments · 1 min read · LW link
(www.reddit.com)

My ML Scaling bibliography

gwern · 23 Oct 2021 14:41 UTC
34 points
9 comments · 1 min read · LW link
(www.gwern.net)

Parameter counts in Machine Learning

Jsevillamol · 19 Jun 2021 16:04 UTC
42 points
16 comments · 7 min read · LW link

How much chess engine progress is about adapting to bigger computers?

paulfchristiano · 7 Jul 2021 22:35 UTC
111 points
23 comments · 6 min read · LW link

Transformative AI and Compute [Summary]

lennart · 26 Sep 2021 11:41 UTC
5 points
0 comments · 9 min read · LW link

What is Compute? - Transformative AI and Compute [1/4]

lennart · 23 Sep 2021 16:25 UTC
24 points
8 comments · 19 min read · LW link

Proposal: Scaling laws for RL generalization

flodorner · 1 Oct 2021 21:32 UTC
17 points
8 comments · 11 min read · LW link

Forecasting Compute - Transformative AI and Compute [2/4]

lennart · 2 Oct 2021 15:54 UTC
10 points
0 comments · 19 min read · LW link

Compute Governance and Conclusions - Transformative AI and Compute [3/4]

lennart · 14 Oct 2021 8:23 UTC
7 points
0 comments · 5 min read · LW link