Not quite. The actual output is the map from tokens to probabilities, and only then one samples a token from that distribution.
So, LLMs are more continuous in this sense than is apparent at first, but time is discrete in LLMs (each discrete step produces the next map from tokens to probabilities, and then a token is sampled from it).
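To make that concrete, here is a minimal sketch of the two stages described above: the model's real output is a full probability distribution over the vocabulary, and only afterwards is one token drawn from it. The four-word vocabulary and the logit values are toy assumptions, not taken from any actual model.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    z = logits - np.max(logits)
    exp = np.exp(z)
    return exp / exp.sum()

rng = np.random.default_rng(0)

# Toy vocabulary and made-up logits standing in for a model's raw output.
vocab = ["the", "cat", "sat", "mat"]
logits = np.array([2.0, 1.0, 0.5, -1.0])

# Stage 1 -- the continuous part: a map from every token to a probability.
probs = softmax(logits)
distribution = dict(zip(vocab, probs))

# Stage 2 -- the discrete step: sample a single next token from that map.
next_token = rng.choice(vocab, p=probs)
```

Note that `distribution` exists in full before any token is chosen; the discreteness enters only at the sampling step.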
Of course, when one thinks about spoken language, time is continuous for audio, so there is still some temptation to use continuous models in connection with language :-) who knows… :-)
Aha! Thank you for that clarification!