I did some benchmarking in 2018. I benchmarked an “AI software” we devised, mostly on benchmarks I invented myself, too. That doesn’t look very good, I know, but bear with me!

For one, I gave an unsolved Sudoku puzzle to this software, which goes by two working names, “Spector” and/or “Profounder”. It concluded that for every two distinct cells X and Y: X==Y (the same digit in both) implies that column(X) != column(Y) and row(X) != row(Y). (Zero prior knowledge of Sudoku on Spector’s part is, of course, a necessary condition.)

Given several unsolved Sudoku puzzles, Spector also concluded that subsquare(X) != subsquare(Y). For just one puzzle, the concept of a “3 by 3 subsquare” isn’t economical. It is economical for several of them, though.
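To make the rule concrete, here is a minimal sketch that only checks it against a partially filled grid. It says nothing about how Spector arrives at the rule. The function name `same_value_conflicts` and the example grid `PUZZLE` are made up for illustration and are not taken from the original benchmarks; empty cells are assumed to be encoded as 0.

```python
from itertools import combinations

# Illustrative unsolved puzzle (an assumption, not the benchmark input); 0 = empty.
PUZZLE = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]


def same_value_conflicts(grid):
    """Return pairs of distinct filled cells that hold the same digit
    and share a row, a column, or a 3 by 3 subsquare."""
    filled = [
        (r, c, v)
        for r, row in enumerate(grid)
        for c, v in enumerate(row)
        if v != 0
    ]
    conflicts = []
    for (r1, c1, v1), (r2, c2, v2) in combinations(filled, 2):
        if v1 != v2:
            continue  # the rule only constrains cells with equal digits
        same_row = r1 == r2
        same_col = c1 == c2
        same_box = (r1 // 3, c1 // 3) == (r2 // 3, c2 // 3)
        if same_row or same_col or same_box:
            conflicts.append(((r1, c1), (r2, c2)))
    return conflicts


print(same_value_conflicts(PUZZLE))  # an empty list means the rule holds
```

An empty result means the given cells are consistent with the rule. Writing a second 5 into the top row of this grid would make the pair show up in the list.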

The second benchmark I invented was giving the string “ABCDEFGHIJKLMNOPQRSTUWXYZ” to Spector. The string-generating algorithm would be simpler if the letter “V” weren’t missing. This is how Spector notices that something might be wrong with the given string. (Zero prior knowledge of the alphabet on Spector’s part is, of course, a necessary condition.)
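A rough way to see the irregularity, sketched under one loud assumption: the characters are treated as raw character codes, which arguably smuggles in a little ordering knowledge. The helper `step_breaks` is a hypothetical name and a simple stand-in, not Spector's actual method. It finds the most common step between neighbouring codes and reports where the string deviates from it.

```python
from collections import Counter


def step_breaks(s):
    """Return (index, left_char, right_char) for every neighbouring pair
    whose code-point difference is not the most common difference."""
    steps = [ord(b) - ord(a) for a, b in zip(s, s[1:])]
    common_step, _ = Counter(steps).most_common(1)[0]
    return [
        (i, s[i], s[i + 1])
        for i, step in enumerate(steps)
        if step != common_step
    ]


print(step_breaks("ABCDEFGHIJKLMNOPQRSTUWXYZ"))  # [(20, 'U', 'W')]
```

The rule "start at A, add 1 each time" generates everything except the jump from U to W, so that single position is what stands out.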

Yet another benchmark was the numbers from 3 to 122, each labeled 0 or 1 depending on whether it is nonprime or prime. The simplest generating algorithm is a sort of sieve of Eratosthenes. Not for the numbers, but for their labels. Spector finds and generates it, with zero knowledge about primes.
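For readers who want to see what such a label generator looks like, here is an ordinary sieve of Eratosthenes written to emit the 0/1 labels for 3 to 122. It is the textbook sieve, not the program Spector produced, and the name `sieve_labels` is only illustrative.

```python
def sieve_labels(lo=3, hi=122):
    """Return {n: label} for lo <= n <= hi, where label is 1 for primes
    and 0 for nonprimes, computed by crossing out multiples."""
    label = [1] * (hi + 1)
    label[0] = label[1] = 0
    for p in range(2, int(hi ** 0.5) + 1):
        if label[p]:
            # Every multiple of p from p*p upward is nonprime.
            for multiple in range(p * p, hi + 1, p):
                label[multiple] = 0
    return {n: label[n] for n in range(lo, hi + 1)}


labels = sieve_labels()
print(labels[113], labels[121])  # 1 0
```

Note that the sieve never tests divisibility directly. It only marks labels at regular intervals, which is why it can be read as a rule about the labels rather than about the numbers.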

Another benchmark was inspired by a mistake someone made. There is a nursing school here somewhere, which sends its students to practice in a nearby hospital for a day or two every week, except for freshmen in the first year. They teach them everything else in this school, of course, including gym (boys and girls separated there), and they feed them all once a day, too. It’s standard in this part of the world. But the school does not feed them when they are at the hospital.

So they forgot to feed the girls from the 2B department on Thursdays, when they are in school. They forgot to include that in their schedule. The boys from 2B ate while the girls were exercising, but the poor girls were forgotten and nobody noticed.

I gave Spector the school schedule in CSV format and asked whether anything was wrong with it. Spector did conclude that every student has a lunch break once a day when not practicing, except for those girls on Thursdays. Which was (probability-wise) odd enough to be significant.
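The kind of check involved can be sketched as follows. Everything concrete here is an assumption made for illustration: the column names, the activity labels, and the few rows in `TOY_SCHEDULE` are invented and are not the real schedule. The rule is also hard-coded in this sketch, whereas Spector had to find it on its own.

```python
import csv
import io

# Invented toy data, NOT the real school schedule.
TOY_SCHEDULE = """group,day,activity
2B girls,Wednesday,lessons
2B girls,Wednesday,lunch
2B boys,Thursday,lunch
2B boys,Thursday,gym
2B girls,Thursday,gym
2B girls,Thursday,lessons
2A,Thursday,hospital practice
"""


def days_without_lunch(csv_text):
    """Return (group, day) pairs that have neither a lunch break
    nor hospital practice on that day."""
    activities = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["group"], row["day"])
        activities.setdefault(key, set()).add(row["activity"])
    return sorted(
        key
        for key, acts in activities.items()
        if "lunch" not in acts and "hospital practice" not in acts
    )


print(days_without_lunch(TOY_SCHEDULE))  # [('2B girls', 'Thursday')]
```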

Spector/Profounder is all about one major trick and three to five lesser ones. The major one is to find a generating algorithm for every part of any data it gets. Then, to see whether some small alteration of the data would mean a significantly simpler generation. Then, to evaluate the probabilities and the complexities needed. And then Spector also asks itself which data changes are possible while conserving the rules already observed. That is particularly handy in the unsolved Sudoku case, for example.
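The second trick, looking for a small alteration that makes the data simpler to generate, can be illustrated with a toy search. This is only a sketch under strong assumptions. The "complexity" is a crude proxy (the number of distinct steps between neighbouring character codes), the only alteration tried is a single inserted character, and the names `rule_cost` and `simplifying_insertions` are hypothetical. Spector's real complexity measure is not described here.

```python
def rule_cost(s):
    """Crude stand-in for generator complexity: how many distinct
    code-point steps are needed to walk through the string."""
    return len({ord(b) - ord(a) for a, b in zip(s, s[1:])})


def simplifying_insertions(s):
    """Return every (position, char) whose insertion lowers rule_cost."""
    base = rule_cost(s)
    codes = [ord(ch) for ch in s]
    candidates = [chr(c) for c in range(min(codes), max(codes) + 1)]
    found = []
    for i in range(len(s) + 1):
        for ch in candidates:
            altered = s[:i] + ch + s[i:]
            if rule_cost(altered) < base:
                found.append((i, ch))
    return found


print(simplifying_insertions("ABCDEFGHIJKLMNOPQRSTUWXYZ"))  # [(21, 'V')]
```

Out of all the single insertions tried, only one makes the string cheaper to describe, which is the sense in which the missing “V” is a significant oddity rather than noise.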

We will do some more benchmarking this year.

https://www.oecd.org/going-digital/ai/principles/

Either I have no clue, or …