Given the rate and the magnitude of algorithm evolution, many of those alternative AI chip designs may become obsolete even before their commercial releases. AI algorithms of tomorrow might demand different compute architectures, memory resources, data-transfer capabilities, etc.
There are numerous examples where algorithmic improvements have delivered larger performance gains than hardware improvements.
* A linear programming problem that would have taken 82 years to solve in 1988 could be solved in about one minute in 2003. Hardware accounted for a 1,000-fold speedup, while algorithmic advances accounted for a 43,000-fold speedup (a rough sanity check of these numbers follows this list).
* Between 1991 and 2013, mixed-integer solvers sped up by a factor of 580,000 from algorithms alone, while the peak performance of supercomputers increased by only a factor of 320,000.
* Similar gains are rumored for other classes of constrained optimization problems and for prime number factorization.
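As a rough back-of-the-envelope check of the linear programming example, the hardware and algorithmic factors compose multiplicatively, and together they are just about enough to turn an 82-year computation into a one-minute one (the exact figures quoted above are assumed here):

```python
# Sanity check: do a 1,000x hardware speedup and a 43,000x algorithmic
# speedup together explain going from 82 years to roughly one minute?

MINUTES_PER_YEAR = 365.25 * 24 * 60

hardware_speedup = 1_000           # 1988 -> 2003 hardware improvement
algorithm_speedup = 43_000         # 1988 -> 2003 algorithmic improvement
combined_speedup = hardware_speedup * algorithm_speedup  # 43,000,000x

runtime_1988_minutes = 82 * MINUTES_PER_YEAR              # ~43.1 million minutes
runtime_2003_minutes = runtime_1988_minutes / combined_speedup

print(f"combined speedup: {combined_speedup:,}x")
print(f"estimated 2003 runtime: {runtime_2003_minutes:.2f} minutes")  # ~1.0
```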
DeepScale, a spin-out of UC Berkeley research, squeezes AI for advanced driver assistance systems (ADAS) and autonomous driving onto automotive-grade chips (as opposed to GPUs). In just a couple of years, its neural network models have demonstrated a 30-fold speedup over leading object-detection models through algorithmic improvements alone, while cutting energy and memory footprints enough to run on existing hardware.
Another example of such algorithmic leapfrogging came from researchers at the Allen Institute for Artificial Intelligence. Using a novel mathematical approach based on binarization of neural networks, they showed that they could drastically increase speed while reducing power and memory requirements.
This enables even the most advanced deep-learning models to be deployed on a device as small as a $5 Raspberry Pi. The researchers recently spun out the algorithms and processing tools as XNOR.ai* to deploy AI on edge devices and drive further algorithmic advances for AI.
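The core binarization idea can be sketched roughly as follows. This is an illustrative example of XNOR-Net-style weight binarization, not the Allen Institute's actual code: real-valued weights are approximated by their signs plus a single scaling factor, so memory shrinks by roughly 32x and dot products reduce to sign agreements, which map onto cheap XNOR-and-popcount operations on bit-packed hardware.

```python
import numpy as np

# Illustrative sketch of weight binarization (XNOR-Net-style approximation).
# A real-valued weight vector W is approximated as alpha * sign(W), where
# alpha = mean(|W|). The dot product x . W then needs only additions,
# subtractions, and one multiply by alpha.

def binarize(w: np.ndarray):
    """Return binary weights (+1/-1) and the scaling factor alpha."""
    alpha = np.abs(w).mean()
    b = np.sign(w)
    b[b == 0] = 1.0  # avoid zeros so every weight is exactly +1 or -1
    return b, alpha

def binary_dot(x: np.ndarray, b: np.ndarray, alpha: float) -> float:
    """Approximate x . w using the binarized weights."""
    return alpha * float(np.dot(x, b))

rng = np.random.default_rng(0)
w = rng.normal(size=1024)   # full-precision weights
x = rng.normal(size=1024)   # input activations

b, alpha = binarize(w)
print("exact dot product:      ", float(np.dot(x, w)))
print("binarized approximation:", binary_dot(x, b, alpha))
```

The approximation is lossy, so in practice such networks are trained with the binarization in the loop rather than converted after the fact; the sketch only shows why the inference-time arithmetic becomes so cheap.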