3-D Sensing Using Cameras

Our proprietary and patented AI platform addresses today’s critical traffic problems. Over 16 patents protect our core, re-usable technologies:

  • 3-D Engine – Real-time geo-spatial 3-D insights are extracted from a single camera, including the position, size, speed, and heading of objects. This data is critical for mobility applications.
  • Mesh Sensing – Multiple cameras collaborate to cover large areas and present consolidated insights that enhance operational performance.
  • Distributed Edge Processing – Processing video at the edge preserves privacy and reduces bandwidth costs.
  • Generative AI Learning – Our platform allows each sensor to learn to detect anomalies by leveraging Generative AI and to share that improvement with other sensors.

Cameras are incredibly powerful and information-rich sensors. Compared to even the most advanced LiDAR, they offer much higher spatial resolution, higher frame rates, and the ability to perceive texture, making them the ideal sensor for detection, classification, and tracking. But cameras have a drawback: they cannot perceive the 3-D world, because the image sensor in the camera transforms the 3-D world into 2-D pixels.
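The depth ambiguity can be seen with a minimal pinhole-projection sketch (the focal length and points below are illustrative values, not Invision parameters): every 3-D point along a viewing ray lands on the same 2-D pixel, so a single image alone cannot recover distance.

```python
def project(point_3d, focal_px=1000.0):
    """Project a 3-D point (x, y, z) in camera coordinates to a 2-D pixel."""
    x, y, z = point_3d
    return (focal_px * x / z, focal_px * y / z)

near = (1.0, 0.5, 10.0)   # a point 10 m away
far = (2.0, 1.0, 20.0)    # a different point, twice as far, on the same ray

print(project(near))  # (100.0, 50.0)
print(project(far))   # (100.0, 50.0) -- identical pixel: depth is lost
```

Recovering the lost third dimension therefore requires extra knowledge about the scene, which is exactly what the calibration described below supplies.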

Invision has developed the technology needed to give any camera the ability to perceive the 3-D world by teaching the camera about its surroundings. In so doing, we empower regular cameras with LiDAR-like capabilities. Here’s how it’s done:

A camera-based geo-referenced 3-D map is created from a video walk-through or drive-through of the site.

Using this map, any camera pointed at the site can be automatically calibrated. This allows the camera to sense in 3-D: geo-positions, speeds, sizes, and accelerations are extracted. This, in turn, makes it possible to apply spatial reasoning to detected objects, increasing detection accuracy: is the detected vehicle moving on the road surface? Are its orientation and acceleration physically plausible? Is its size consistent with its geo-location?
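The questions above can be sketched as simple sanity checks on geo-referenced detections. The thresholds, class sizes, and function names here are assumptions made for illustration, not Invision's actual rules:

```python
# Assumed typical vehicle lengths (metres) and a plausibility bound on
# acceleration -- illustrative values only.
TYPICAL_LENGTH_M = {"car": (3.5, 5.5), "truck": (6.0, 18.0)}
MAX_ACCEL_MS2 = 8.0

def plausible_detection(cls, length_m, height_above_road_m, accel_ms2):
    """Return True if a geo-referenced detection passes basic sanity checks."""
    lo, hi = TYPICAL_LENGTH_M.get(cls, (0.0, float("inf")))
    on_road = abs(height_above_road_m) < 0.5     # moving on the road surface?
    sized_ok = lo <= length_m <= hi              # size consistent with class?
    accel_ok = abs(accel_ms2) <= MAX_ACCEL_MS2   # physically plausible motion?
    return on_road and sized_ok and accel_ok

print(plausible_detection("car", 4.6, 0.1, 2.0))   # True
print(plausible_detection("car", 12.0, 0.1, 2.0))  # False: too long for a car
```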

With each camera extracting geo-referenced metadata, collaborative multi-camera detection and tracking becomes possible. Objects are tracked from entrance to exit across multiple cameras with high integrity.
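Because every camera reports positions in the same world frame, handing a track from one camera to the next can be sketched as a nearest-neighbour match in that frame. The gating distance and data shapes below are illustrative assumptions, not Invision's implementation:

```python
import math

GATE_M = 1.0  # assumed association gate in metres

def associate(exiting, entering):
    """Match tracks leaving one camera to detections entering another,
    by distance in the shared world frame."""
    matches = {}
    for tid, (x1, y1) in exiting.items():
        best, best_d = None, GATE_M
        for did, (x2, y2) in entering.items():
            d = math.hypot(x1 - x2, y1 - y2)
            if d < best_d:
                best, best_d = did, d
        if best is not None:
            matches[tid] = best
    return matches

cam_a = {"track-7": (105.2, 40.1)}                        # last seen in camera A
cam_b = {"det-3": (105.4, 40.0), "det-4": (90.0, 12.0)}   # new detections in B
print(associate(cam_a, cam_b))  # {'track-7': 'det-3'}
```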

Invision’s technology is highly generic. It works with any off-the-shelf camera at any site and achieves a localization accuracy of 10 cm. It does not rely on a flat ground-plane assumption. In practice, that assumption is almost never met, and relying on it can lead to poor performance, with localization errors of as much as two meters (learn more here).
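A back-of-the-envelope calculation shows how quickly the flat-plane assumption breaks down. If a camera at height h intersects a viewing ray with a flat z = 0 plane while the ground actually rises under the object, the estimated range overshoots the true one. The camera height, range, and elevation below are illustrative numbers, not measured values:

```python
def flat_plane_range(cam_height_m, true_range_m, elevation_m):
    """Range obtained by intersecting the viewing ray with a flat z=0 plane,
    when the object actually sits on ground at z=elevation_m."""
    # The ray drops (cam_height - elevation) over true_range; extending it
    # down to z=0 scales the range by cam_height / (cam_height - elevation).
    return true_range_m * cam_height_m / (cam_height_m - elevation_m)

est = flat_plane_range(cam_height_m=10.0, true_range_m=50.0, elevation_m=0.4)
print(round(est - 50.0, 2))  # ~2 m of error from just 0.4 m of elevation
```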

Karim Ali explains Invision AI technology