drstats/export/posts-trekhleb.txt

trekhleb said @retoor So far I've tried training a GPT with ~80M parameters on a single GPU in the browser. It's pretty heavy. It would be interesting to see how a 1.5B-parameter model behaves...,@retoor I'm not sure; it probably depends on the model configuration/implementation and the hardware. But in the browser, for that "homemade GPT", I see that training on WebGPU is around 100x to 1000x faster than on the CPU.