Lottery Ticket Hypothesis

Tag · Last edit: 26 Nov 2021 14:08 UTC by Multicore

The Lottery Ticket Hypothesis claims that neural networks used in machine learning get most of their performance from sub-networks (“winning tickets”) that are already present at initialization and that approximate the final policy. Under this model, training works by increasing the weight on the winning-ticket sub-network and reducing the weight on the rest of the network.

The hypothesis was proposed in a paper by Jonathan Frankle and Michael Carbin of MIT CSAIL.
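As a rough illustration of how a winning ticket might be searched for, the sketch below follows the prune-and-reset loop usually associated with the paper: train, remove the smallest-magnitude surviving weights, then reset the remaining weights to their initial values and repeat. It is a toy under stated assumptions, not the authors' implementation. The "network" is a flat list of ten weights, and `toy_train`, `magnitude_prune_mask`, `rewind`, the target values, and the 20% pruning rate are all hypothetical names and choices made for this example.

```python
import random


def magnitude_prune_mask(weights, mask, fraction):
    """Return a new mask that drops the smallest-magnitude `fraction`
    of the weights that are still unpruned (mask value 1)."""
    surviving = [i for i, m in enumerate(mask) if m == 1]
    n_prune = int(len(surviving) * fraction)
    # Sort surviving indices by |weight|, smallest first.
    by_magnitude = sorted(surviving, key=lambda i: abs(weights[i]))
    new_mask = list(mask)
    for i in by_magnitude[:n_prune]:
        new_mask[i] = 0
    return new_mask


def rewind(init_weights, mask):
    """Reset surviving weights to their initial values; pruned weights become zero."""
    return [w * m for w, m in zip(init_weights, mask)]


def toy_train(weights, mask, target, lr=0.1, steps=200):
    """Hypothetical stand-in for training: gradient descent on squared error
    against `target`, updating only the unpruned weights."""
    w = list(weights)
    for _ in range(steps):
        w = [wi - lr * 2 * (wi - ti) * m for wi, ti, m in zip(w, target, mask)]
    return w


rng = random.Random(0)
init = [rng.uniform(-1, 1) for _ in range(10)]  # weights at initialization
# Assumed toy objective: only the last five weights matter after training.
target = [0.0, 0.0, 0.0, 0.0, 0.0, 2.0, -2.0, 1.5, -1.5, 1.0]

mask = [1] * 10
for _ in range(2):  # two rounds of train -> prune -> reset
    trained = toy_train(rewind(init, mask), mask, target)
    mask = magnitude_prune_mask(trained, mask, 0.2)

# Candidate "winning ticket": the surviving sub-network at its initial values.
ticket = rewind(init, mask)
print(mask)
```

In this toy, the weights whose trained values end up near zero are pruned first, so the mask keeps the five weights that the objective actually uses. Whether the resulting sparse sub-network trains as well as the full network in a real setting is exactly the empirical question the hypothesis is about.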

A Mechanistic Interpretability Analysis of Grokking

Neel Nanda and Tom Lieberum

15 Aug 2022 2:41 UTC

372 points

47 comments · 36 min read · LW link · 1 review

(colab.research.google.com)

Gradations of Inner Alignment Obstacles

abramdemski

20 Apr 2021 22:18 UTC

80 points

22 comments · 9 min read · LW link

Updating the Lottery Ticket Hypothesis

johnswentworth

18 Apr 2021 21:45 UTC

73 points

41 comments · 2 min read · LW link

[Question] Does the lottery ticket hypothesis suggest the scaling hypothesis?

Daniel Kokotajlo

28 Jul 2020 19:52 UTC

14 points

17 comments · 1 min read · LW link

Understanding “Deep Double Descent”

evhub

6 Dec 2019 0:00 UTC

149 points

51 comments · 5 min read · LW link · 4 reviews

[Question] What happens to variance as neural network training is scaled? What does it imply about “lottery tickets”?

abramdemski

28 Jul 2020 20:22 UTC

25 points

4 comments · 1 min read · LW link

Understanding the Lottery Ticket Hypothesis

Alex Flint

14 May 2021 0:25 UTC

50 points

9 comments · 8 min read · LW link

Exploring the Lottery Ticket Hypothesis

Rauno Arike

25 Apr 2023 20:06 UTC

53 points

3 comments · 11 min read · LW link

Why Neural Networks Generalise, and Why They Are (Kind of) Bayesian

Joar Skalse

29 Dec 2020 13:33 UTC

74 points

58 comments · 1 min read · LW link · 1 review
