The past decade has seen tremendous progress in the application of neural networks to computer vision. These networks are trained with millions of manually annotated images. When presented with a new image, the trained network reliably locates and identifies objects. Behind the scenes, billions of calculations are made on each video frame: a perfect fit for the huge compute resources of cloud farms.
Unfortunately, network bandwidth, latency and privacy concerns make cloud deployments a non-starter for most applications. Running standard deep learning frameworks at the edge requires expensive specialized hardware. We have developed a number of technical innovations that collectively make edge AI viable for mainstream applications.
We are experts at simplifying complex neural networks without sacrificing accuracy. We refer to this as Software Acceleration; it comprises both algorithmic innovations and their efficient implementation. Our neural networks often consist of fewer layers with sparse connections. We leverage cascades and the statistics of the underlying signals to make informed guesses and further reduce computation. Connection weights are quantized, sometimes even binarized, and all of our implementations run fixed-point arithmetic. Invision Software Acceleration enables the cost-effective deployment of AI on low-powered IoT devices (single-core ARM) as well as high-throughput (8K @ 120 fps) applications on more advanced hardware.
We are pragmatic rather than dogmatic. We leverage a variety of frameworks such as TensorFlow and Torch for training our networks, but have developed our own embedded software stack in C++. This was the only way for us to achieve the execution speed and memory utilization needed for mainstream deployment. Existing frameworks such as Caffe2 and TensorFlow incur 500% to 700% memory overhead simply to hold their models. While newer iterations such as TensorFlow Lite have reduced this overhead, our implementations typically achieve less than 10% overhead and are compatible with all mainstream embedded operating systems.
Our accelerated software runs almost anywhere, from the cheapest single-core ARM processors to accelerated hardware such as DSPs, VPUs, GPUs, and FPGAs. Our software has been proven on Nvidia GPUs, Intel CPUs, Ambarella S2L & S3L, NXP i.MX6 & Layerscape, TI TDA2, as well as a variety of Axis Communications cameras, and we continue to add platforms with new customers. Taking a pure software approach to AI provides tremendous flexibility, allowing our customers to source from multiple vendors and benefit from improvements in accuracy and speed through continual secured updates.
Some more advanced applications, such as Advanced Driver Assistance Systems, require the fusion of multiple sensors. We have developed a cutting-edge approach to processing multiple data sources acquired asynchronously at heterogeneous frame rates. Existing methods use heuristics or neural networks to fuse metadata. Our approach delivers higher accuracy with a smaller computational footprint than existing methods, yielding both cost and packaging advantages.
Cloud and Fog
Advanced driver assistance systems need to function locally without the latency of a cloud network connection. Similarly, factory safety or building security systems need to function in the event of a network outage. To support these applications, and ease migration from legacy systems, our modular AI functions can be run at the edge, in the cloud, or on a local fog server.