Generally the model would not have to go through the hassle of invoking a tool in well-formed JSON. Instead, the inference pipeline could catch a trigger (a special token plus a short command, perhaps?) and then splice the result back into the stream character by character, so the model's output would look something like this (| marks token boundaries):
> |Wait, |%INLINE_COMMAND_TOKEN%|zoom(|"s|trawberry|")| is |"|s|t|r|a|w|b|e|r|r|y|"|, so|...
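A minimal sketch of what such a pipeline hook might look like, assuming the token stream above. `%INLINE_COMMAND_TOKEN%` and `zoom(...)` are the invented names from the example, and the string-matching "sampler" here is purely illustrative — a real implementation would hook into the decoding loop and force token IDs rather than post-process strings:

```python
import re
from typing import Iterable, List, Optional

TRIGGER = "%INLINE_COMMAND_TOKEN%"  # hypothetical special token

def run_pipeline(tokens: Iterable[str]) -> List[str]:
    """Pass model tokens through unchanged; after the trigger token,
    buffer tokens until a complete zoom("...") command is seen, then
    inject the spelled-out word one character per forced token."""
    out: List[str] = []
    buffer: Optional[str] = None
    for tok in tokens:
        out.append(tok)  # the command text stays visible in the stream
        if tok == TRIGGER:
            buffer = ""  # start capturing the command that follows
            continue
        if buffer is not None:
            buffer += tok
            m = re.match(r'zoom\("([^"]*)"\)$', buffer)
            if m:  # full command seen: execute it and splice in the result
                buffer = None
                out.append(" is ")
                out.append('"')
                out.extend(m.group(1))  # one character per forced token
                out.append('"')
    return out

# The model emits the trigger mid-thought; the pipeline appends the
# spelled-out word, then generation would resume from there:
stream = ["Wait, ", TRIGGER, "zoom(", '"s', "trawberry", '")']
print("".join(run_pipeline(stream)))
# → Wait, %INLINE_COMMAND_TOKEN%zoom("strawberry") is "strawberry"
```

The key property is that the injected characters arrive as individual tokens, so the model's subsequent reasoning can attend to each letter separately instead of to an opaque multi-character token like `trawberry`.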