Sure, ANN computations are mostly multiplication (or rather multiply-and-add): multiply each ANN input by a weight (parameter) and accumulate the results. Parallelized, that becomes matrix multiplication, which is the basic operation supported by accelerators like GPUs and TPUs.
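To make the multiply-and-accumulate picture concrete, here's a minimal sketch in plain Python. The function names and the toy numbers are made up for illustration, and a real framework would hand this off to an optimized matrix-multiply kernel rather than loop like this.

```python
def neuron_output(inputs, weights):
    """One neuron: multiply each input by its weight and accumulate."""
    total = 0.0
    for x, w in zip(inputs, weights):
        total += x * w  # one multiply-accumulate (MAC)
    return total


def layer_output(inputs, weight_matrix):
    """A layer: one row of weights per neuron, i.e. a matrix-vector product."""
    return [neuron_output(inputs, row) for row in weight_matrix]


def batch_output(batch, weight_matrix):
    """Many input vectors at once: a matrix-matrix product."""
    return [layer_output(inputs, weight_matrix) for inputs in batch]


# Toy values, chosen only for illustration.
inputs = [1.0, 2.0, 3.0]
weight_matrix = [
    [0.5, -1.0, 2.0],   # weights of neuron 0
    [1.0, 0.0, -1.0],   # weights of neuron 1
]
print(layer_output(inputs, weight_matrix))  # [4.5, -2.0]
```

Every `total += x * w` is independent of the ones in other rows, which is why the whole thing parallelizes so well on a GPU or TPU.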

Still, even with modern accelerators it's a lot of computation, and that's what drives the higher price per token of larger models vs. smaller ones.
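For a rough sense of scale, a common rule of thumb is that a forward pass costs about 2 floating-point operations per parameter per token (one multiply plus one add per weight). The sketch below assumes that rule and ignores attention and other overheads. The two model sizes are hypothetical, picked only to show the ratio.

```python
def forward_flops_per_token(n_params):
    """Approximate forward-pass FLOPs per token.

    Assumption: each weight takes part in one multiply and one add
    per token, so the cost is roughly 2 * n_params.
    """
    return 2 * n_params


# Hypothetical model sizes, for illustration only.
small_params = 7_000_000_000
large_params = 70_000_000_000

small_flops = forward_flops_per_token(small_params)
large_flops = forward_flops_per_token(large_params)
ratio = large_flops / small_flops

print(f"small: {small_flops:.1e} FLOPs/token")
print(f"large: {large_flops:.1e} FLOPs/token")
print(f"ratio: {ratio:.0f}x")
```

Under that assumption, compute per token scales linearly with parameter count, so a model with 10x the parameters needs roughly 10x the arithmetic for every token it generates.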