If you’re using inspect evals, you might want to consider optimising your parallelism kwargs for throughput.
The default `max_connections` is quite conservative and can leave evals taking far longer than they need to.
After some quick testing I found that 1000 might be optimal for GPT-4o specifically (see plot below).
I expect the optimal value to depend on (i) the model family / provider, and (ii) whether it's a default model or a custom finetune (for OpenAI specifically).
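For reference, here's a minimal sketch of setting this, assuming the `inspect_ai` Python API (the task name is just an illustrative placeholder); the same option should also be available as `--max-connections` on the CLI:

```python
# Sketch: raising max_connections when running an eval with inspect_ai.
# "inspect_evals/gsm8k" is a placeholder - substitute any registered task.
from inspect_ai import eval

eval(
    "inspect_evals/gsm8k",
    model="openai/gpt-4o",
    max_connections=1000,  # default is much lower; tune per model/provider
)
```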
Codebase here if anyone wants to reproduce: https://github.com/dtch1997/inspect-evals-speedtest