I think some of the “A”’s are rated a bit too highly on somewhat spurious evidence. Still, an interesting read.
AI can now write a book with a mostly consistent plot, given roughly a page of prompting or less.
Score: A+. I actually thought that I’d failed this one, but I looked it up, and surprisingly (to me), it seems AI was in fact capable of this by 2023! See, for instance, Death of an Author, a novella supposedly written 95%+ by ChatGPT, and described by New Scientist as “not awful.” High praise indeed...
I think this one is rated a bit too highly. There’s a significant difference between “automatically generate a novel given a page of prompting about what its plot should look like” and “95% of a book’s raw text was technically output by an LLM at some point, mediated through extensive feedback and guidance from a human author, over the course of several months”.
Come to think of it, this would make for an interesting contest. Build whatever feedback loop you like, train whatever model you like, then arrive at the contest building. The judges will give you a plot outline, and you’ve got an hour to produce a one-page prompt for your system, in twelve-point Times New Roman. The system then has however many hours to write a hundred-page novel. Best output wins the competition[1].
Twitter is still functional, and most users haven’t left the site. The workplace environment is kind of miserable though, and content moderation is still severely lacking (according to both sides of the culture war). Elon Musk is largely washed-up, and won’t be doing anything too groundbreaking with the remainder of his life (outside of politics perhaps, which I won’t rule out).
Score: A? I don’t think I did too badly on this one. Twitter (now “X”) is still fully functional, and it still has a large userbase. There have been multiple waves of layoffs and plenty of reported internal drama there, which sounds pretty miserable to me. Musk’s main focus was his DOGE efforts, so he did go into politics, but outside of that, most people seem to consider him well past his intellectual prime. Obviously this sort of thing is largely subjective, but I think most people would agree my prediction(s) have held up.
This one is a bit political, and I think it lends itself to seeing what you want to see. To give it an objective rating, I think you need a counterfactual. What would a “not-washed-up” Elon Musk have done between 2023 and now? He seems to be doing the same kind of stuff as always, from my perspective: he’s got one of the half dozen or so frontier LLM companies on Earth, and I haven’t seen a major decline in innovation from any of his companies.
Come to think of it, this would make for an interesting contest.
I’ve suggested that if someone were serious about benchmarking creative writing at novel length, they should hold a contest where you must provide a fully automated solution that the organizers run, which is mandated to use up at least $1000 of tokens, and whose output is then released into the real world for marketplace evaluation.
The only issue is that you’d need judges willing to read some number of completely LLM-generated novels in order to rate them.