Replicating the replication crisis with GPT-3?

I am getting worried that people are having so much fun doing interesting stuff with GPT-3 and AI Dungeon that they're forgetting how easy it is to fool themselves. Maybe we should think about how many different cognitive biases are in play here? Here are some features that make it particularly easy during casual exploration.

First, it works much like autocomplete, which makes it the most natural thing in the world to "correct" the transcript to be more interesting. You can undo and retry, or trim off extra text if it generates more than you want.

Randomness is turned on by default, so retrying gives you a different reply each time, and it's tempting to keep going until you get a good one. It would be better science, but less fun, to keep the entire distribution of replies rather than stopping at a good one. Randomness also makes gambler's-fallacy-style reasoning more likely.
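A toy simulation (my own illustration, not from the post) makes the "stop at a good one" problem concrete: if each sampled completion has some latent quality score, reporting only the best of several retries systematically inflates how good the model looks. The `completion_quality` function below is a hypothetical stand-in for judging one sample.

```python
import random
import statistics

random.seed(0)

def completion_quality():
    # Hypothetical stand-in for "how good one sampled completion is";
    # modeled here as a standard normal draw.
    return random.gauss(0, 1)

TRIALS = 10_000
RETRIES = 10

# Honest reporting: keep every sample you draw.
single = [completion_quality() for _ in range(TRIALS)]

# Cherry-picking: retry several times and keep only the best reply.
best_of_n = [max(completion_quality() for _ in range(RETRIES))
             for _ in range(TRIALS)]

print(f"mean quality, one sample per prompt:  {statistics.mean(single):+.2f}")
print(f"mean quality, best of {RETRIES} retries:     {statistics.mean(best_of_n):+.2f}")
```

Under these assumptions the best-of-10 average sits well above the single-sample average even though the underlying model is identical, which is exactly the selection effect the paragraph above describes.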

Suppose you don't do that. Then you have to decide whether to share the transcript. You will probably share the interesting transcripts and not the boring failures, resulting in a "file drawer" bias.

And even if you don't do that, "interesting" transcripts will be linked to, upvoted, and reshared, for another kind of survivorship bias.

What other biases do you think will be a problem?