Sure, ANN computations are mostly multiplication (or rather multiply-and-add): multiply each ANN input by a weight (parameter) and accumulate the results. That gets parallelized into matrix multiplication, which is the basic operation that accelerators like GPUs and TPUs are built to support.
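To make the multiply-and-accumulate part concrete, here's a minimal sketch in plain Python of one fully connected layer written as explicit loops. The function name `dense_layer` and the toy numbers are made up for illustration; a real framework would hand the same arithmetic to the accelerator as a single matrix multiply rather than loop like this.

```python
def dense_layer(inputs, weights):
    """Compute one layer's outputs with explicit multiply-accumulate steps.

    inputs:  list of n_in numbers
    weights: n_out rows, each a list of n_in weights (hypothetical layout)
    Returns (outputs, number of multiply-accumulate operations).
    """
    outputs = []
    macs = 0
    for row in weights:
        acc = 0.0
        for x, w in zip(inputs, row):
            acc += x * w  # one multiply-accumulate
            macs += 1
        outputs.append(acc)
    return outputs, macs


# Toy example: 3 inputs feeding 2 output units.
inputs = [1.0, 2.0, 3.0]
weights = [
    [0.5, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]
outputs, macs = dense_layer(inputs, weights)
print(outputs, macs)  # [3.5, 6.0] 6
```

The count is just n_in times n_out, i.e. one multiply-accumulate per weight, which is why the amount of work tracks the number of parameters.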
Still, even with modern accelerators it's a lot of computation, and that is what drives the higher price per token of larger models vs smaller ones.
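As a rough back-of-the-envelope sketch of why size matters: if you assume each weight takes part in about one multiply and one add per generated token (a common rule of thumb that ignores attention over the context, memory bandwidth, batching and so on), compute per token scales linearly with parameter count. The 7B and 70B sizes below are hypothetical examples, not any particular model.

```python
def forward_flops_per_token(n_params):
    # Rule-of-thumb assumption: one multiply and one add per weight per token.
    return 2 * n_params


small = forward_flops_per_token(7_000_000_000)   # hypothetical 7B-parameter model
large = forward_flops_per_token(70_000_000_000)  # hypothetical 70B-parameter model

print(f"small: {small:.1e} FLOPs/token")
print(f"large: {large:.1e} FLOPs/token")
print(f"ratio: {large / small:.0f}x")
```

Under that assumption the larger model needs about 10x the arithmetic per token, though actual pricing depends on plenty of other factors too.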