{compressed, some deletions}
Suppose you have at least one “foundational principle” A = [...words...] → mapped to a token vector, say in binary = [0110110...] → sent to the internal NN. The encoding and decoding processes are non-transparent with respect to attempting to ‘train’ on principle A. If the system’s internal weight matrices are already mostly constant, you can’t add internal principles (it isn’t even clear you can add them while the initially random weights are being de-randomized during training).
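A toy sketch of the point above (my own illustration; the tokenizer, vocab, and 8-bit width are hypothetical, not any real LLM’s scheme): a principle stated in words becomes opaque token ids, then bits, and with frozen weight matrices the “principle” is only processed as input, never installed as an internal rule.

```python
def tokenize(text, vocab):
    """Map words to integer ids; unknown words get id 0 (hypothetical scheme)."""
    return [vocab.get(w, 0) for w in text.lower().split()]

def to_bits(ids, width=8):
    """Concatenate fixed-width binary codes, as in the [0110110...] vector."""
    return "".join(format(i, f"0{width}b") for i in ids)

vocab = {"energy": 5, "is": 2, "conserved": 9}
principle_a = "Energy is conserved"

ids = tokenize(principle_a, vocab)
bits = to_bits(ids)
print(ids)   # [5, 2, 9]
print(bits)  # 000001010000001000001001

# If the weight matrices are frozen, feeding these bits in at inference
# time changes activations, not weights: no internal principle is added.
FROZEN = True
weights = [0.1, -0.3, 0.7]
if not FROZEN:
    weights = [w + 0.01 for w in weights]  # a gradient step would go here
assert weights == [0.1, -0.3, 0.7]  # unchanged
```

The bit string carries no human-readable trace of the principle, which is the non-transparency worry: you can’t inspect the encoding and verify the principle “took.”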
Thanks for the downvotes (to err is human; to error-correct, divine). A good probe of system behavior for me.
How about:
‘Reason correctly (logic) from stated first principles, rely on standard physics and concrete mathematical representations + definitions, and systems can reach valid conclusions (under the first-principle assumptions, etc.) that are *irrefutable* (within their domain of validity).
{M words, N posts} have been written by humanity that fall into the set: {useful in some way but not the way intended, null, anti-useful}. There’s an opportunity cost to reading posts (finite life, attention, etc.). The longer the post, the higher the probability of wasted opportunity for the reader (compression is good).
Nice posts include Einstein’s 1905 papers or the 1915 GR paper.’
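The opportunity-cost claim can be put as a back-of-envelope calculation (all numbers illustrative assumptions of mine, not measured): expected wasted reader-hours scale linearly with post length, so a 10× compression cuts the waste 10×.

```python
def wasted_hours(readers, p_not_useful, words, wpm=250):
    """Expected reader-hours lost to a post that turns out not useful.

    readers: number of people who read it (assumed)
    p_not_useful: probability the post is null or anti-useful (assumed)
    words: post length; wpm: assumed reading speed.
    """
    read_minutes = words / wpm
    return readers * p_not_useful * read_minutes / 60

long_post = wasted_hours(1000, 0.5, 2000)   # 2000-word post
short_post = wasted_hours(1000, 0.5, 200)   # compressed to 200 words
print(long_post, short_post)  # the long version wastes 10x the hours
```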