Engineering the future: How we built the world's first video AI Agents

By Dunchadhn Lyons | 5 minute read

At Spot AI, we’re excited to launch AI Agents, a way for teams to turn their video data into actionable insights for safety, operational efficiency, and security. AI Agents analyze all of a customer’s video in real time and respond automatically to the events they detect.

Development and design of AI Agents

AI Agents emerged from iterative design discussions among Spot’s AI engineering team, driven by the need to address complex video intelligence use cases. Through research, customer feedback, and collaborative brainstorming, we observed a common thread: many of the most impactful use cases could not be fully addressed by any single AI system. This realization led us to build a framework that integrates all of Spot’s AI capabilities (object detection, tracking, facial and license plate recognition, attribute recognition, and text-based video search) into a composable AI architecture.

This modular approach lets users configure AI Agents that detect and respond to specific events, such as Forklift Near-Misses, Possible Falls, Missing Personal Protective Equipment, and Unattended Workstations. The platform is also extensible: customers can build custom AI Agents for unique, long-tail scenarios within their operations.
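To make the idea of composition concrete, here is a minimal sketch of how small detection building blocks could be combined into one agent condition. The `Detection` schema, the helper names, and the zone and attribute labels are all hypothetical and chosen for illustration; they are not Spot AI's actual API.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Detection:
    """One detected object in a frame (hypothetical schema)."""
    label: str                       # e.g. "person", "forklift"
    zone: str                        # e.g. "loading_dock"
    attributes: FrozenSet[str] = frozenset()  # e.g. {"hard_hat"}


# A condition is any function from a frame's detections to True/False.
Condition = Callable[[List[Detection]], bool]


def present(label: str, zone: str) -> Condition:
    """True when at least one object with this label is in the zone."""
    return lambda dets: any(d.label == label and d.zone == zone for d in dets)


def person_missing(attribute: str, zone: str) -> Condition:
    """True when some person in the zone lacks the given attribute."""
    return lambda dets: any(
        d.label == "person" and d.zone == zone and attribute not in d.attributes
        for d in dets
    )


def all_of(*conditions: Condition) -> Condition:
    """Compose building blocks: every condition must hold on the same frame."""
    return lambda dets: all(c(dets) for c in conditions)


# A custom agent condition assembled from the blocks above.
missing_ppe_near_forklift = all_of(
    present("forklift", "loading_dock"),
    person_missing("hard_hat", "loading_dock"),
)

frame = [
    Detection("forklift", "loading_dock"),
    Detection("person", "loading_dock", frozenset({"safety_vest"})),
]
print(missing_ppe_near_forklift(frame))  # True
```

Because each block is just a function, a new long-tail scenario only needs a new combination of existing blocks rather than a new model.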

The evolution of AI Agents into the first video RAG system

Central to AI Agents is our Proposer-Verifier framework, which adds a validation layer that improves the accuracy and reliability of AI Agents, especially in safety-critical situations. In detecting forklift near-miss incidents, for instance, where precision is paramount, the verifier acts as a safeguard by cross-checking each detected event against stringent accuracy thresholds. Only verified, high-confidence events trigger alerts or automated actions, which keeps the response system reliable in high-stakes environments.
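The two-stage pattern can be sketched as follows. This is an illustration under stated assumptions, not Spot AI's implementation: the `Candidate` type, the threshold values, and the stand-in verifier are all hypothetical. The point is only that a cheap, permissive proposer stage nominates events and a stricter verifier stage decides which ones may trigger an alert.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """An event nominated by the proposer (hypothetical schema)."""
    event_type: str
    timestamp: float
    proposer_score: float  # confidence from the fast first-stage detector


def run_proposer_verifier(
    candidates: Iterable[Candidate],
    verifier: Callable[[Candidate], float],
    propose_threshold: float = 0.3,   # assumed value: permissive, favors recall
    verify_threshold: float = 0.9,    # assumed value: strict, favors precision
) -> List[Candidate]:
    """Return only candidates that pass both the proposer and verifier stages."""
    verified = []
    for candidate in candidates:
        if candidate.proposer_score < propose_threshold:
            continue  # never proposed, so the verifier is not invoked
        if verifier(candidate) >= verify_threshold:
            verified.append(candidate)
    return verified


# Stand-in verifier: looks up a precomputed score by timestamp.
# A real verifier would re-examine the underlying video clip.
verifier_scores = {10.0: 0.95, 20.0: 0.99, 30.0: 0.50}

candidates = [
    Candidate("forklift_near_miss", 10.0, 0.8),  # proposed and verified
    Candidate("forklift_near_miss", 20.0, 0.2),  # below the propose threshold
    Candidate("forklift_near_miss", 30.0, 0.6),  # proposed, fails verification
]
alerts = run_proposer_verifier(candidates, lambda c: verifier_scores[c.timestamp])
print([c.timestamp for c in alerts])  # [10.0]
```

Splitting the work this way lets the expensive check run only on the small set of proposed events.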

AI Agents represent a significant advancement in video AI: the first Video Retrieval-Augmented Generation (RAG) system designed specifically for real-time video intelligence. This evolution came from our response to increasingly complex challenges, which led us to a framework centered on text, reasoning, and semantic memory operating in real time. AI Agents are built on rule-based retrieval from composable AI, text-based video search, and proposer-verifier reasoning, which together form a Video RAG architecture analogous to RAG in text-based AI. This positions Spot’s AI Agents as a reasoning-driven approach to video intelligence, engineered to meet the needs of an evolving market.

Technical functionality of AI Agents

AI Agents use Spot AI’s proprietary algorithms, combining object and attribute recognition with text-based search and proposer-verifier reasoning to analyze video data. These agents detect anomalies, patterns, and environmental factors relevant to safety, operations, and security. By combining object and attribute detection, environmental context (e.g., location and time), and behavioral analysis (e.g., motion and duration), AI Agents monitor scenes and detect events with high fidelity.

Each AI Agent runs autonomously on “if-then” logic: when a configured event is detected, the agent executes a predefined response. Programmed actions include alerting staff, generating reports, and, soon, dynamically adjusting on-site equipment to meet situational needs. The framework’s flexibility supports extensive customization to address specific operational objectives, improving safety compliance and reducing downtime.
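A minimal sketch of this if-then pattern is shown below. The `Rule` type, the event fields, the five-minute threshold, and the action functions are hypothetical examples; real actions would send notifications rather than return strings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, object]  # hypothetical event record, e.g. {"type": ..., "camera": ...}


@dataclass
class Rule:
    """One if-then rule: when `condition` holds, run every action."""
    name: str
    condition: Callable[[Event], bool]
    actions: List[Callable[[Event], str]]


def alert_staff(event: Event) -> str:
    # Placeholder action: a real one would send an email, text, or push message.
    return f"ALERT: {event['type']} at {event['camera']}"


def log_report(event: Event) -> str:
    # Placeholder action: a real one would write to a report store.
    return f"REPORT: {event['type']} logged"


def evaluate(rules: List[Rule], event: Event) -> List[str]:
    """Apply every rule to the event and collect the outputs of fired actions."""
    outputs: List[str] = []
    for rule in rules:
        if rule.condition(event):
            outputs.extend(action(event) for action in rule.actions)
    return outputs


rules = [
    Rule(
        name="unattended workstation over 5 minutes",  # threshold is an assumption
        condition=lambda e: e["type"] == "unattended_workstation"
        and e["duration_s"] > 300,
        actions=[alert_staff, log_report],
    )
]

event = {"type": "unattended_workstation", "camera": "line-3", "duration_s": 420}
for line in evaluate(rules, event):
    print(line)
# ALERT: unattended_workstation at line-3
# REPORT: unattended_workstation logged
```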

Platform customization and control

AI Agents’ composable design lets users define custom configurations for tailored use cases. For example, a safety-focused user may build an agent dedicated to detecting missing PPE, while an operational team may deploy an agent focused on workflow inefficiencies. The platform’s building block structure offers a user-driven solution, empowering users to deploy AI Agents that best support their unique objectives.

Data scalability and utilization

Given the sheer volume of video data generated by modern video systems, AI Agents provide a scalable method to extract value from this data. For instance, a facility with 64 cameras generates over 30 terabytes of video data per month. AI Agents streamline data analysis by focusing on critical events, allowing teams to prioritize safety and security without the need for continuous manual review.
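As a back-of-the-envelope check on that figure, the snippet below works out what 30 terabytes per month across 64 cameras implies per camera. It assumes decimal units (1 TB = 10^12 bytes), a 30-day month, and continuous recording; since the figure above is "over 30 terabytes", the results are lower bounds.

```python
def per_camera_load(total_tb_per_month: float, cameras: int, days: int = 30):
    """Return (GB per camera per day, average Mbps per camera).

    Assumes decimal units, continuous recording, and a `days`-long month.
    """
    bytes_per_camera = total_tb_per_month * 1e12 / cameras
    gb_per_day = bytes_per_camera / days / 1e9
    seconds = days * 24 * 60 * 60
    mbps = bytes_per_camera * 8 / seconds / 1e6
    return gb_per_day, mbps


gb_per_day, mbps = per_camera_load(30, 64)
print(f"{gb_per_day:.1f} GB per camera per day")   # about 15.6
print(f"{mbps:.2f} Mbps average per camera")       # about 1.45
```

Roughly 15.6 GB per camera per day, around the clock, is far more footage than a team could review manually, which is the gap automated event detection is meant to close.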

AI Agents analyze video data in real time, flagging and categorizing events of interest. Each Agent is designed to respond promptly to incidents, enabling swift action through automated notifications and responses that reduce the time and resources typically required for manual monitoring and response.

Key advantages of AI Agents

  1. Proactive event detection: AI Agents deliver instant alerts via email, text, or push notifications, allowing immediate response to critical events.
  2. Customizable solutions: With customizable logic and modular AI capabilities, users can design AI Agents specific to their operational needs.
  3. Historical data analysis: New AI Agents can retroactively analyze archived footage, providing insights into past events that align with newly configured parameters.
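The historical analysis item above can be pictured as replaying a newly defined agent condition over stored event records. The sketch below is only an illustration: the `ArchivedEvent` schema, the camera names, and the idea that the archive is a list of metadata records are assumptions, not a description of how Spot AI stores footage.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class ArchivedEvent:
    """Metadata for one stored event (hypothetical schema)."""
    timestamp: str     # ISO 8601, all in one timezone, so strings sort by time
    camera: str
    label: str
    duration_s: float


def backfill(
    archive: Iterable[ArchivedEvent],
    agent: Callable[[ArchivedEvent], bool],
    since: str,
) -> List[ArchivedEvent]:
    """Apply a newly configured agent condition to events recorded since `since`."""
    return [e for e in archive if e.timestamp >= since and agent(e)]


archive = [
    ArchivedEvent("2024-05-01T08:15:00", "dock-1", "unattended_workstation", 120.0),
    ArchivedEvent("2024-05-03T14:02:00", "dock-2", "unattended_workstation", 900.0),
    ArchivedEvent("2024-05-04T09:30:00", "dock-1", "possible_fall", 5.0),
]


def new_agent(event: ArchivedEvent) -> bool:
    # Agent configured today: workstation unattended for more than 10 minutes.
    return event.label == "unattended_workstation" and event.duration_s > 600


matches = backfill(archive, new_agent, since="2024-05-02T00:00:00")
print([(m.timestamp, m.camera) for m in matches])
# [('2024-05-03T14:02:00', 'dock-2')]
```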

Future developments

The current capabilities of AI Agents are only the beginning. We’re developing enhancements that will enable AI Agents to physically interact with the environment by triggering alarms, stopping equipment, or activating lights and sounds in real time to mitigate risks as they arise.

Upcoming updates will also allow AI Agents to synthesize trends from historical data, providing deeper insight into operational patterns. This capability will help users optimize processes, strengthen safety measures, and improve efficiency.

For a demonstration or further details on building your first Video AI Agent, contact us at success@spotai.co. Discover how AI Agents can transform your video data into actionable intelligence for safety, security, and operational excellence.
