March 14, 2019 at 9:51 am #29446
#News(IoTStack) [ via IoTForIndiaGroup ]
Companies battle it out to get artificial intelligence to the edge using various chip architectures as their weapons of choice.
It is possible to run inference models on microcontrollers and relatively low-end chips, but most machine-learning functions need a boost from one of a long list of optional CPU add-ins based on FPGAs, ASICs and other SoC configurations, as well as combinations of GPUs, CPUs and occasionally a special-purpose ASIC like Google’s Tensor Processing Unit, said NXP’s Levy.
Most of that help comes in the form of accelerators. These FPGAs, SoCs, ASICs and other special-purpose chips are designed to help resource-constrained, x86-based devices process large volumes of image or audio data through one layer after another of analytic criteria, so the app can correctly calculate and weight the value of each.
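To make the "one layer after another" idea concrete, here is a minimal sketch of layered inference in plain Python. The weights, layer sizes and function names are hypothetical, chosen only for illustration, and are not taken from any vendor's model or toolchain. The multiply-accumulate count at the end hints at the kind of work an accelerator would be expected to offload.

```python
# Minimal sketch of layer-by-layer inference.
# All weights and names are hypothetical, for illustration only.

def relu(values):
    """Clamp negative activations to zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """Each output is a weighted sum of every input, plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def run_inference(inputs, layers):
    """Pass the data through each layer in turn; no ReLU on the last layer."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return dense(activations, weights, biases)

def count_macs(layers):
    """Count multiply-accumulate operations needed for one input."""
    return sum(len(row) for weights, _ in layers for row in weights)

# Two tiny layers: 2 inputs -> 2 hidden units -> 1 output score.
LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 2.0]], [0.5]),
]

scores = run_inference([2.0, 1.0], LAYERS)
macs = count_macs(LAYERS)
print(scores, macs)
```

Even this toy network needs six multiply-accumulates per input. Real image or audio models need orders of magnitude more, which is the workload the accelerators described above are meant to absorb.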
Intel and Nvidia have made sallies toward the edge AI market. Efforts such as Nvidia’s Jetson have not been convincing, Kaul said. Jetson is a GPU module platform with a 7.5W power budget, which is a fraction of Nvidia’s more typical 70W but still too high for edge applications, which tend not to rise above 5W.
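The power figures quoted above can be compared with some quick arithmetic. This is a rough sketch using only the three numbers from the text; the constant and function names are made up for illustration.

```python
# Power figures as quoted in the article; names are illustrative only.
JETSON_W = 7.5        # Jetson module power budget
TYPICAL_GPU_W = 70.0  # Nvidia's "more typical" figure
EDGE_CEILING_W = 5.0  # level edge applications tend not to exceed

def fits_edge_budget(power_w, ceiling_w=EDGE_CEILING_W):
    """Return True if a part's power draw is within the edge ceiling."""
    return power_w <= ceiling_w

fraction_of_typical = JETSON_W / TYPICAL_GPU_W  # roughly 0.11
overshoot = JETSON_W / EDGE_CEILING_W           # 1.5

print(f"Jetson draws {fraction_of_typical:.0%} of the typical GPU figure")
print(f"Jetson draws {overshoot:.1f}x the edge ceiling")
print("Fits edge budget:", fits_edge_budget(JETSON_W))
```

On these numbers the module draws about a tenth of the typical GPU figure, yet still 1.5 times the 5W ceiling cited for edge applications.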
Two new efforts at inference hardware
Xilinx has tried to capitalize on its experience in FPGAs and in systems-level design with a new product line and roadmap designed to address as many parts of the edge/device market as possible.
The company describes an Adaptive Compute Acceleration Platform that “draws on the strength of CPUs, GPUs and FPGAs to accelerate any application.”
Xilinx presentations describe a broad product line, a list of use cases and details about its AI engine core, the goal for which is to deliver three to eight times the performance per silicon area of traditional approaches and provide high-performance DSP capabilities.
Flex Logix, meanwhile, has created a reconfigurable neural accelerator that uses low DRAM bandwidth. The target spec for silicon area and power is due during the first half of next year, with tape-out in the second half of the year. The inferencing engine will act as a CPU, not simply a larger, fancier accelerator. It offers a modular, scalable architecture intended to reduce the time and energy cost of moving data, both by reducing the need to move it and by improving the way data and matrix calculations are loaded, which reduces bottlenecks.
The chip treats DRAM as if it were dedicated to a single processor block rather than managing it as one big pool of memory. The DRAM does not feed data to several parts of the chip simultaneously. “Treating DRAM, which is really expensive, as one big pool of memory flowing into one processor block is typical of Von Neumann architecture, but isn’t going to be the winning architecture for neural nets,” Tate said.