Ok, I misread one of gwern’s replies. My original intent was to extract money from the fact that gwern gave (from my vantage point) too high a probability of this being a scam.

Under my original version of the terms, if his P(scam) was 0.1:

he would win $1000 with probability 0.1

he would lose $100 with probability 0.9

yielding an expected value of (0.1)($1000) - (0.9)($100) = $10

Under my original version of the terms, if his P(scam) was 0.05:

he would win $1000 with probability 0.05

he would lose $100 with probability 0.95

yielding an expected value of (0.05)($1000) - (0.95)($100) = -$45

In the second case, he would of course not want to take that bet. I'd thus like to amend my suggested conditions to have gwern put only $52 at stake against my $1000. The break-even point is then P(scam) = 52/1052 ≈ 0.049, so for any P(scam) ≥ 0.05 the bet has positive expected value, and I would expect it to have been satisfactory to gwern.
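The arithmetic above can be sketched in a few lines of Python (the function names here are mine, introduced for illustration, not part of the original exchange):

```python
def expected_value(p_scam, payout=1000, stake=100):
    """Bettor's EV: win `payout` with probability p_scam, lose `stake` otherwise."""
    return p_scam * payout - (1 - p_scam) * stake

def break_even(payout=1000, stake=52):
    """P(scam) at which the bet has zero expected value: stake / (payout + stake)."""
    return stake / (payout + stake)
```

With the original $100 stake, `expected_value(0.1)` gives $10 and `expected_value(0.05)` gives -$45, matching the cases above; with the amended $52 stake, `break_even()` is about 0.049, just under 0.05.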

The problem definition talks about clusters in the space of books, but to me it’s cleaner to look at regions of token-space, and token-sequences as trajectories through that space.

GPT is a generative model, so it can provide a probability distribution over the next token given some previous tokens. I assume that the basic model of a cluster can also provide a probability distribution over the next token.

With these two distribution generators in hand, you could generate books by multiplying the two distributions (and renormalizing) when generating each new token. This will bias the story toward the desired cluster while still letting GPT guide the overall dynamics. Weighting the two contributions will require some hyperparameter tuning.
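A minimal sketch of that combination step, assuming you already have the two next-token distributions as arrays (`gpt_probs` and `cluster_probs` are placeholders, not real APIs), with a single weight `alpha` as the hyperparameter:

```python
import numpy as np

def combined_next_token_probs(gpt_probs, cluster_probs, alpha=0.5):
    """Weighted product of the two distributions (a geometric mixture):
    p(t) proportional to gpt(t)^(1-alpha) * cluster(t)^alpha.
    alpha=0 recovers pure GPT; larger alpha pulls harder toward the cluster."""
    # Work in log-space for numerical stability; epsilon avoids log(0).
    eps = 1e-12
    log_p = (1 - alpha) * np.log(gpt_probs + eps) + alpha * np.log(cluster_probs + eps)
    p = np.exp(log_p - log_p.max())  # subtract max before exp to avoid underflow
    return p / p.sum()               # renormalize to a valid distribution
```

You would then sample the next token from the returned distribution and repeat, token by token.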

You could then fine-tune GPT using the generated books to break the dependency on the original model.

Seems like a fun project to try with GPT-3, though even GPT-2 would probably give some interesting results.