Learning strategies and the Pokemon league parable

I have recently noted a shift in my learning strategy, which I reflectively approve of. In hindsight, it feels obvious.

However, I can vividly recall many people I respect and admire recommending that I try something very similar in the past, and myself scoffing at them and trusting my gut over their advice.

Take this as a word of caution if you feel like the advice I am giving is obviously wrong: I have fallen into this pit, and it took me a while to climb out.

I claim there are two broad learn­ing strate­gies one can fol­low.

The default strategy is what in software engineering lingo is called a waterfall strategy. People attend college, take different courses, and read books, gaining knowledge on a broad collection of subjects. Afterwards, they move to a second phase where they try to apply what they have learned. If they cannot reach their goals with their current knowledge, they back down to the first phase and start again.

I claim that this strat­egy has some glar­ing flaws, which I plan to ex­pose via my ex­pe­rience in an area of great im­por­tance: Poke­mon videogames.

When I got my first video game console, the first game I ever owned was Pokemon Gold. I absolutely loved that game, and I spent many hours absorbed in trying to complete it.

For the most part, the level of challenge was adequate for an 8-year-old, but there comes a point where the game suddenly spikes in difficulty: the Pokemon league. In the league, you have to defeat five trainers with full teams of high-level Pokemon in a row.

When I first confronted the Pokemon league, I was quite underleveled and I was utterly crushed.

My re­sponse to the prob­lem was to back down and go to eas­ier ar­eas, where I could train with eas­ier challenges.

After around 10 hours of training, I came back to the League and defeated the five trainers with relative ease, and I won the title of Pokemon master, officially achieving my most ambitious 8-year-old goal.

“Well Jaime”, you may say, “That does not look like your learn­ing strat­egy had a prob­lem. You suc­cess­fully com­pleted the challenge!”

And yes, I did. But it was awfully inefficient! I leveled at a very slow rate by fighting easier fights, I ended up training for too long, and I picked up tactics for fighting other trainers instead of learning the tactics that would have been optimal for fighting through the league.

If in­stead I had kept fight­ing the train­ers in the league, I would have lev­eled up much faster and I would have learned what tac­tics were good against each of the elite train­ers I had to fight.

I claim that similar things happen to people who, like me two years ago, apply the waterfall strategy:

  • You end up wasting time learning stuff you do not need. Even worse, because of fade-out effects you will forget most of what you learned. Think about how many useful things you learned in school that you still remember today.

  • You do not learn at an op­ti­mal rate, be­cause it’s hard to cal­ibrate for prob­lems that are just above your cur­rent level.

  • There is no clear point at which to transition from learning mode to solving mode, which further results in overtraining.

Is there an alternative? Yes there is! I introduce to you project-based learning, aka the agile strategy. Instead of learning generally and then solving, run headfirst into the problem, and learn to fail effectively. When the brick wall in front of you refuses to move, think about what you need to learn to overcome it, and while you learn the techniques, make an effort to apply them to the task of wall moving.

You will learn more about the prob­lem and whether the tech­niques you are learn­ing are ac­tu­ally use­ful this way.

When I introspect on why I thought the waterfall strategy was a better fit for me than project-based learning, I find the following reasons:

  • Waterfall is a good strategy when it is hard to determine what one needs to learn to solve a problem. It is easy to say “you never know if it is going to be useful” and just keep learning useless stuff. But there is virtue in precision, and I have learned to be willing to make more concrete predictions (“there is a low chance that learning French will pay off in the long run given my current trajectory”).

  • Hu­mans like find­ing con­nec­tions be­tween dis­parate ar­eas, and it feels re­ally pleas­ant to con­nect two differ­ent things to­gether (like for ex­am­ple play­ing Poke­mon videogames and ex­plain­ing the differ­ences be­tween strate­gies for learn­ing). I am ac­tu­ally un­sure of how much of this is true in­sight and how much is just con­fir­ma­tion bias (if I had not played Poke­mon, maybe an­other ex­am­ple would have oc­curred to me).

  • Bang­ing your head against a wall feels re­ally bad, and just re­peat­edly try­ing to solve a prob­lem through sheer willpower is not a strat­egy I would recom­mend. But that is not a very good ex­cuse to skip the first try: some­times the wall is made of pa­per­board, and you want to take ad­van­tage of that.

I am run­ning out of time to write more, and I feel I have not given enough ex­am­ples.

The reflection that initiated this post was comparing me two years ago, reading a new math textbook every month, with me now, taking weeks off at a time to try to solve open research problems. I feel like the second strategy is proving much more useful: it gives me a feel for whether this is a good way of reaching my goals, and a better map of what I need to learn to solve the research problems (it turns out that it’s not so much about learning technical topics as about learning how to write better and stay organized while pursuing research directions).

I keep seeing people who want to contribute to AI alignment research or other important research areas resorting to waterfall strategies instead of project-based learning. This post is for them.

If you have an ex­am­ple you want to share or any thoughts, please con­tribute to the con­ver­sa­tion in the com­ments sec­tion!

Darmani in the comments below points out that I am confusing what is normally called project-based learning (doing projects to learn generally useful skills) with what is normally called pull-based learning (learning skills to do a concrete project). Oops!

This post was meant to recommend the second one, and the agile vs waterfall analogy was meant to point out that if you have a concrete goal you are working towards, you can use it to constantly check whether you are already in a good position to make progress on the problem you really care about.

Fol­low­ing his recom­men­da­tion, I also want to share a note on how I gen­er­ated this post:

The gen­er­a­tor of this post is a com­bi­na­tion of the fol­low­ing ob­ser­va­tions:

1) I see a lot of peo­ple who keep wait­ing for a call to adventure

2) Most knowledge I have acquired through life has turned out to be useless, non-transferable, and/or quick to fade

3) It makes sense to think that people get a better grasp of what skills they need to solve a problem (such as producing high quality AI Alignment research) after they have grappled with the problem. This feels especially true when you are at the edge of a new field, because there is no one else you can turn to who would be able to compress their experience into a digestible format.

4) People (especially in mathematics) have a tendency to wander around aimlessly picking up topics, and then use very little of what they learn. Here I am standing on not very solid ground, because conventional wisdom is that you need to wander around to “see the connections”, but I feel like that might be just confirmation bias creeping in.