Logprobs returned by the OpenAI API are rounded.
This shouldn’t matter for most use cases. But it’s not documented, and if I had known about it yesterday, it would have saved me some time spent looking for bugs in my code to explain the weird patterns on my plots. I also couldn’t find any mention of it anywhere on the internet.
Note that o3 says this is probably because of quantization.
Specific example. Let’s say we have some prompt and the next token has the following probabilities:
These probabilities were calculated from the following logprobs, in the same order:
No clear pattern here, and they don’t look like rounded numbers. But if you subtract the highest logprob from all the logprobs on the list, you get:
And after rounding that to 6 decimal places the pattern becomes clear:
So the logprob resolution is 1⁄16.
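To make the arithmetic concrete, here’s a tiny self-contained sketch with made-up logprobs; the values below are mine, constructed to sit on a 1⁄16 grid the way the real ones did:

```python
# Hypothetical logprobs, constructed so the gaps from the highest one
# are exact multiples of 1/16 (mimicking what the API returns).
raw = [-0.287682, -0.287682 - 3/16, -0.287682 - 7/16,
       -0.287682 - 12/16, -0.287682 - 21/16]

best = max(raw)
for lp in raw:
    diff = round(lp - best, 6)  # subtract the max, round to 6 decimals
    print(f"logprob={lp:.6f}  diff={diff}  diff*16={diff * 16}")
# diff*16 comes out as whole numbers: 0, -3, -7, -12, -21.
```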
(I tried this with 4.1 and 4.1-mini via the chat completions API.)
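If you want to reproduce the check yourself, here’s a minimal sketch using the official openai Python SDK; the prompt and model name are placeholders, and it assumes OPENAI_API_KEY is set in your environment:

```python
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": "Name a random animal."}],
    max_tokens=1,
    logprobs=True,
    top_logprobs=20,  # 20 is the maximum the API allows
)

# Top-20 candidates for the first output token, with their logprobs.
top = resp.choices[0].logprobs.content[0].top_logprobs
best = max(t.logprob for t in top)
for t in top:
    diff = round(t.logprob - best, 6)
    # diff * 16 should come out (very close to) a whole number
    print(f"{t.token!r:>12}  logprob={t.logprob:.6f}  diff*16={diff * 16:.4f}")
```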
This might be to prevent people from stealing the unembedding matrix weights.
I also haven’t seen this mentioned anywhere.
I think most commercial frontier models that offer logprobs take some precautions against distillation. Some logprobs seem to have a noise vector attached too (DeepSeek?), and some, like Grok, only offer the top 8 rather than the top 20. Others don’t offer logprobs at all.
It’s a shame, as logprobs can be a really information-rich and token-efficient way to do evals, ranking, and judging.
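For example, a judge model can return a graded score from a single output token by reading the probabilities directly, instead of sampling a verdict over and over. A hypothetical sketch (the prompt and the assumption that the verdict tokenizes as a bare "Yes"/"No" are mine):

```python
import math
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content":
               "Is this summary faithful to the article? Answer Yes or No.\n\n"
               "<article and summary here>"}],
    max_tokens=1,
    logprobs=True,
    top_logprobs=20,
)

# Convert the first output token's top-20 logprobs into probabilities
# and read off P("Yes") and P("No"). Depending on the tokenizer, the
# verdict may appear with different casing or a leading space, so real
# code should sum over variants.
probs = {t.token: math.exp(t.logprob)
         for t in resp.choices[0].logprobs.content[0].top_logprobs}
p_yes = probs.get("Yes", 0.0)
p_no = probs.get("No", 0.0)
print(f"P(Yes)={p_yes:.4f}  P(No)={p_no:.4f}")
```

One request and one generated token gives you a continuous score, at the cost of the 1⁄16 quantization described above.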