Hi. A question here.
“The system executes code in a Docker container with strict resource limits (TITAN V GPU with 12GB memory, 600-second timeout). This ensures fair comparison between models and tests their ability to work within realistic constraints.”
How can you run llama-3.1/3.3-70b models with 12GB of VRAM?
The LLMs are presented with the ML task and write Python code to solve it. That Python code is what runs in the isolated Docker container with 12GB of memory.
So the LLMs themselves are not run on the TITAN V; they are mostly called through an API. I did run a number of the LLMs locally through Ollama, just not on the TITAN V server but on a larger one.
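For anyone curious, the execution side could be as simple as a `docker run` call with a GPU pin and a hard timeout. Here is a minimal sketch in Python; the image name, mount layout, and network isolation are my assumptions, not details from the post:

```python
import subprocess

def run_generated_code(script_path: str, timeout_s: int = 600):
    """Execute an LLM-generated script inside an isolated container.

    Hypothetical sketch only: the image name, mount layout, and
    network isolation are assumptions, not the authors' exact setup.
    """
    cmd = [
        "docker", "run", "--rm",
        "--gpus", "device=0",              # pin to one GPU (a TITAN V exposes 12GB VRAM)
        "--network", "none",               # assumed: no network access inside the sandbox
        "-v", f"{script_path}:/workspace/solution.py:ro",
        "ml-task-sandbox:latest",          # hypothetical image name
        "python", "/workspace/solution.py",
    ]
    # The 600-second budget from the post; TimeoutExpired is raised on overrun.
    return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
```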
Thanks for the clarification.