My main finding so far is that you can substitute an A100 for an H100, but other chips will blow up, and it isn't worth the effort to fix the code rather than just swapping runtimes.
I hope to have substantive results soon, but I'm running into some bugs getting the evals code to work with Llama models*. Mainly, though, I came here to say that the unaligned Qwen model tried to get me to run code that autoplayed a shockingly relevant YouTube video.
*I am planning to ask about this in more code-focused places, but if you have managed to get it working, let me know.