Groq hardware has the fastest ResNet-50 performance of any commercially available hardware. It can perform over 400,000 multiplications in the time a GPU takes to retrieve one byte from memory.

Only the Groq architecture provides information about power and performance at compile time. What does that mean? You don't need to waste time profiling your code on hardware. You can optimize using the compiler, know the power consumption and completion time at compile time, cap power below a chosen level, and bound the time a model takes to execute. Groq makes quick work of all kinds of workloads.
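To make the idea of compile-time bounds concrete, here is a minimal toy sketch. It is not Groq's compiler or API. It assumes a hypothetical static schedule in which every operation has a fixed, known cycle count and energy cost, so latency and average power can be computed before anything runs. The names `Op`, `estimate`, and `within_budget`, and all the numbers, are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One step of a hypothetical static schedule (all values are made up)."""
    name: str
    cycles: int        # fixed cycle count, assumed known ahead of time
    energy_nj: float   # energy per execution in nanojoules, assumed known


def estimate(schedule, clock_hz):
    """Return (latency in seconds, average power in watts) without running anything."""
    total_cycles = sum(op.cycles for op in schedule)
    if total_cycles == 0:
        return 0.0, 0.0
    latency_s = total_cycles / clock_hz
    energy_j = sum(op.energy_nj for op in schedule) * 1e-9
    return latency_s, energy_j / latency_s


def within_budget(schedule, clock_hz, max_latency_s, max_power_w):
    """Check the estimated latency and power against caller-supplied limits."""
    latency_s, power_w = estimate(schedule, clock_hz)
    return latency_s <= max_latency_s and power_w <= max_power_w


# Illustrative schedule: 1,000 cycles and 500 nJ in total.
schedule = [
    Op("load", 100, 50.0),
    Op("matmul", 800, 400.0),
    Op("store", 100, 50.0),
]

latency_s, power_w = estimate(schedule, clock_hz=1e9)
print(f"latency: {latency_s * 1e6:.2f} us, average power: {power_w:.2f} W")
print("fits budget:", within_budget(schedule, 1e9, max_latency_s=2e-6, max_power_w=1.0))
```

The point of the sketch is only that when every step's cost is fixed in advance, checking a power cap or a deadline becomes arithmetic on the schedule rather than a profiling run on hardware.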