You’re missing the possibility that the model had more parameters during training than the models used for inference. It is now common practice to train a large model and then distill it into a series of smaller models that can be deployed according to task needs.
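For readers unfamiliar with distillation: a minimal sketch of the core idea, assuming a Hinton-style soft-label objective where the student is trained to match the teacher's temperature-softened output distribution. The function names and logits here are illustrative, not from any specific model.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # relative preferences among non-top classes ("dark knowledge").
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# A student whose logits match the teacher's incurs zero loss;
# any mismatch produces a positive loss to minimize.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # 0.0
```

In practice this soft-label term is typically mixed with the ordinary hard-label cross-entropy, and the smaller student architecture is what ships for inference.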
dhar174
Karma: 2
- dhar174 8 Apr 2023 17:24 UTC · 1 point · in reply to: GödelPilled’s comment on: GPT-4 Specs: 1 Trillion Parameters?
To those that believe language models do not have internal representations of concepts:
I can help at least partially disprove the assumptions behind that.
There is convincing evidence otherwise, as demonstrated through an actual experiment with a GPT model trained on Othello moves:
https://thegradient.pub/othello/ The researchers’ conclusion: