Discussion about this post

Trevor McCourt

"Instead, their GPU estimates measure the power consumption of very small models on large, powerful hardware; this likely results in extremely poor utilization of the GPU and a lot of wasted energy"

This is false. The GPU estimates were based on a theoretical model that underestimates true energy consumption (it essentially assumes ideal utilization). This is made extremely clear in Appendix E of our paper.

https://arxiv.org/pdf/2510.23972

Trevor McCourt

"These architectures use digital pseudorandom number generators (PRNGs), which offer an efficient way to generate numbers with sufficient randomness"

By what definition of extremely efficient?

https://youtu.be/dRuhl6MLC78?si=-m_wWvY95RWjVLub&t=2359

Actually, I make most of your points in this talk. It might be worth a watch for anyone who found this post interesting.
