3-minute presentation. Unmute for audio explanation.

Key Contributions

HARMONIC Architecture

A cognitive-robotic architecture that enables flexible integration of OntoAgent, a cognitive architecture, with low-level robot planning and control using Behavior Trees.

Multi-Robot Planning

A cognitive strategy for multi-robot planning and execution with metacognition, communication, and explainability. This includes the robots' natural language understanding and generation, their reasoning about plans, goals, and attitudes, and their ability to explain the reasons for their own and others' actions.

Simulation Evaluation

Simulation experiments on a search task carried out by a robotic team, which showcase the human-robot team's interactions and its ability to handle complex, real-world scenarios.

Overview

We introduce HARMONIC (Human-AI Robotic Team Member Operating with Natural Intelligence and Communication), a cognitive-robotic architecture that integrates the OntoAgent cognitive framework with general-purpose robot control systems applied to human-robot teaming (HRT). We also present a cognitive strategy for robots that incorporates metacognition, natural language communication, and explainability capabilities required for collaborative partnerships in HRT. Through simulation experiments involving a joint search task performed by a heterogeneous team of a UGV, a drone, and a human operator, we demonstrate the system's ability to coordinate actions between robots with heterogeneous capabilities, adapt to complex scenarios, and facilitate natural human-robot communication. Evaluation results show that robots using the OntoAgent architecture within the HARMONIC framework can reason about plans, goals, and team member attitudes while providing clear explanations for their decisions, which are essential prerequisites for realistic human-robot teaming.

HARMONIC System Architecture

An overview of the HARMONIC framework, showing the Strategic and Tactical components representing the high-level planning (System 2) and low-level execution (System 1), respectively.

OntoAgent in the Strategic Layer (System 2)

OntoAgent is a content-centric cognitive architecture designed for social intelligent agents through computational cognitive modeling. It dynamically acquires, maintains, and expands large-scale knowledge bases essential for perception, reasoning, and action.

Its memory structure has three components: a Situation Model for active concept instances, Long-Term Semantic Memory, and Episodic Memory. Goal-directed behavior is managed via a goal agenda and prioritizer that selects stored plans or initiates first-principles reasoning when needed.
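To make this concrete, below is a minimal Python sketch of such a memory-plus-agenda organization. All names (SituationModel, GoalAgenda, plan placeholders, and so on) are illustrative stand-ins for the components described above, not OntoAgent's actual API.

    from dataclasses import dataclass, field

    # Illustrative stand-ins for OntoAgent's three memory components.
    @dataclass
    class SituationModel:
        instances: dict = field(default_factory=dict)   # active concept instances

    @dataclass
    class SemanticMemory:
        concepts: dict = field(default_factory=dict)    # long-term ontological knowledge

    @dataclass
    class EpisodicMemory:
        episodes: list = field(default_factory=list)    # remembered situations and events

    @dataclass
    class Goal:
        name: str
        priority: float
        stored_plan: list | None = None                 # plan retrieved from memory, if any

    class GoalAgenda:
        """Holds pending goals; the prioritizer picks what to pursue next."""
        def __init__(self):
            self.goals: list[Goal] = []

        def add(self, goal: Goal):
            self.goals.append(goal)

        def next_goal(self) -> Goal | None:
            # Prioritizer: highest-priority pending goal first.
            return max(self.goals, key=lambda g: g.priority, default=None)

    def pursue(goal: Goal) -> list:
        # Reuse a stored plan when one exists; otherwise fall back to
        # first-principles reasoning (represented here by a placeholder).
        if goal.stored_plan is not None:
            return goal.stored_plan
        return [f"decompose-and-plan({goal.name})"]

    agenda = GoalAgenda()
    agenda.add(Goal("report-findings", priority=0.4, stored_plan=["compose-report"]))
    agenda.add(Goal("search-building", priority=0.9))
    print(pursue(agenda.next_goal()))   # ['decompose-and-plan(search-building)']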

Built on a service-based infrastructure, it integrates native and external capabilities such as robotic vision and text-to-speech. The pivotal element is OntoGraph, a knowledge base API that unifies knowledge representation through inheritance, flexible organization into "spaces," and efficient querying via a graph database view. This underpins robust natural language understanding and meaning-based text generation using a semantic lexicon to resolve complex linguistic phenomena, as well as ontological interpretation of visual inputs.
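The toy sketch below illustrates the kind of knowledge-base API this describes: frames organized into named spaces, property inheritance over IS-A links, and simple constraint queries. It is a pattern sketch under simplifying assumptions; OntoGraph's actual interface and graph-database view are not reproduced here, and all identifiers are invented for the example.

    class KnowledgeGraph:
        """Toy frame store: frames live in named spaces and may inherit
        property fillers from parent frames via an IS-A link."""

        def __init__(self):
            self.frames = {}   # id -> {"space": str, "isa": id or None, "props": dict}

        def add_frame(self, fid, space, isa=None, **props):
            self.frames[fid] = {"space": space, "isa": isa, "props": props}

        def get(self, fid, prop):
            """Look up a property, walking up the IS-A chain (inheritance)."""
            frame = self.frames.get(fid)
            while frame is not None:
                if prop in frame["props"]:
                    return frame["props"][prop]
                frame = self.frames.get(frame["isa"])
            return None

        def query(self, space, **constraints):
            """Return frame ids in a space whose (inherited) properties match."""
            return [fid for fid, f in self.frames.items()
                    if f["space"] == space
                    and all(self.get(fid, k) == v for k, v in constraints.items())]

    # Ontology space: a UGV is a kind of robot that can also grasp objects.
    kg = KnowledgeGraph()
    kg.add_frame("ROBOT", space="ontology", can_move=True)
    kg.add_frame("UGV", space="ontology", isa="ROBOT", can_grasp=True)

    # Situation space: a specific UGV instance inherits from the concept.
    kg.add_frame("UGV-1", space="situation", isa="UGV", location="hallway")
    print(kg.get("UGV-1", "can_move"))                # True, via inheritance
    print(kg.query("situation", location="hallway"))  # ['UGV-1']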

Individual Capabilities

  • Attention management
  • Perception interpretation
  • Utility-based and analogical decision-making, enhanced by metacognitive abilities and supported by the OntoAgent cognitive architecture (a minimal utility-based selection sketch follows this list)
  • Prioritizing goals and plans
  • Executing and monitoring plans
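As a rough illustration of how utility-based selection with a metacognitive confidence check can work, the sketch below scores candidate plans against weighted features and defers to a teammate when the margin between the top options is small. The plan names, features, weights, and threshold are all invented for the example; they are not taken from the system.

    # Candidate plans for the current goal, scored on a few illustrative features.
    candidates = {
        "search-north-wing": {"expected_success": 0.7, "time_cost": 0.4, "risk": 0.2},
        "search-basement":   {"expected_success": 0.5, "time_cost": 0.6, "risk": 0.5},
    }

    # Weights encode the agent's current preferences (success matters most).
    weights = {"expected_success": 1.0, "time_cost": -0.5, "risk": -0.8}

    def utility(features):
        return sum(weights[k] * v for k, v in features.items())

    scores = {plan: utility(f) for plan, f in candidates.items()}
    best, best_score = max(scores.items(), key=lambda kv: kv[1])

    # Metacognitive check: if the top options are nearly tied, confidence is
    # low and the agent may consult a human teammate instead of acting.
    margin = best_score - sorted(scores.values())[-2]
    decision = ("ask-human", best) if margin < 0.1 else ("execute", best)
    print(decision)   # ('execute', 'search-north-wing')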

Team-Oriented Capabilities

  • Natural language communication
  • Explaining decisions
  • Assessing decision confidence
  • Evaluating the trustworthiness of teammates

Behavior Trees in the Tactical Layer (System 1)

The tactical layer implements a blackboard through the State Manager, which maintains condition and state variables. Sensory inputs are continuously written into the State Manager, where the attention module filters and packages perception data for the strategic layer. While the strategic layer runs asynchronously, real-time control is achieved through LiveBT, which reads from the State Manager in parallel and selects appropriate policies and actions stored in the Action Manager. This structure enables robots to maintain responsive behavior while carrying out higher-level commands.
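The schematic sketch below shows this blackboard arrangement: sensors write into a shared state store, an attention module packages percepts for the strategic layer, and a behavior-tree tick reads the same store to pick actions. The class names mirror the components named above, but this is an illustration of the pattern under simplifying assumptions, not the HARMONIC implementation.

    import threading, time

    class StateManager:
        """Blackboard: sensors write condition/state variables; readers poll it."""
        def __init__(self):
            self._lock = threading.Lock()
            self._state = {}

        def write(self, key, value):
            with self._lock:
                self._state[key] = value

        def read(self, key, default=None):
            with self._lock:
                return self._state.get(key, default)

    class AttentionModule:
        """Filters raw state into compact percepts for the strategic layer."""
        def package_percepts(self, sm: StateManager) -> dict:
            return {"obstacle": sm.read("obstacle_distance", 999) < 0.5,
                    "command": sm.read("strategic_command")}

    class ActionManager:
        """Holds low-level skills/policies keyed by name."""
        def __init__(self):
            self.skills = {"stop": lambda: print("stopping"),
                           "goto": lambda: print("navigating")}

        def run(self, name):
            self.skills.get(name, lambda: None)()

    def live_bt_tick(sm: StateManager, am: ActionManager):
        """One tick of the reactive behavior tree (highest priority first)."""
        if sm.read("obstacle_distance", 999) < 0.5:
            am.run("stop")               # reactive safety behavior
        elif sm.read("strategic_command") == "search-room":
            am.run("goto")               # execute the strategic layer's command

    # Real-time loop: runs at a fixed rate while the strategic layer
    # (OntoAgent) consumes the packaged percepts asynchronously.
    sm, am = StateManager(), ActionManager()
    sm.write("obstacle_distance", 2.0)
    sm.write("strategic_command", "search-room")
    for _ in range(3):
        live_bt_tick(sm, am)
        time.sleep(0.1)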

Visual demonstration of the tactical component, showing low-level planning and execution in the HARMONIC framework.

Capabilities

  • Runs low-level planners and control algorithms for decision making.
  • Manages skills and policies.
  • Processes sensor inputs.
  • Plans motor actions to execute high-level commands received from the strategic layer (e.g., "pick up that screwdriver"); see the behavior-tree sketch after this list.
  • Handles reactive responses, such as collision avoidance.
  • Attends to the robot's own needs.
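For readers unfamiliar with behavior trees, here is a minimal, self-contained sketch of the composite/leaf structure such a tactical layer builds on, with reactive collision avoidance taking priority over execution of a strategic command. The node classes and the example tree are illustrative only and do not reflect LiveBT's internals.

    # Minimal behavior-tree node types (illustrative; not the LiveBT implementation).
    SUCCESS, FAILURE, RUNNING = "SUCCESS", "FAILURE", "RUNNING"

    class Selector:
        """Ticks children in order until one succeeds or is running."""
        def __init__(self, *children): self.children = children
        def tick(self, bb):
            for child in self.children:
                status = child.tick(bb)
                if status != FAILURE:
                    return status
            return FAILURE

    class Sequence:
        """Ticks children in order; stops as soon as one is not successful."""
        def __init__(self, *children): self.children = children
        def tick(self, bb):
            for child in self.children:
                status = child.tick(bb)
                if status != SUCCESS:
                    return status
            return SUCCESS

    class Condition:
        def __init__(self, fn): self.fn = fn
        def tick(self, bb): return SUCCESS if self.fn(bb) else FAILURE

    class Action:
        def __init__(self, fn): self.fn = fn
        def tick(self, bb): return self.fn(bb)

    def evasive_stop(bb):
        print("evasive stop")
        return SUCCESS

    def plan_grasp(bb):
        print("plan grasp motion for the screwdriver")
        return RUNNING

    # Reactive collision avoidance takes priority over executing the
    # strategic layer's command (e.g., "pick up that screwdriver").
    tree = Selector(
        Sequence(Condition(lambda bb: bb.get("obstacle_close", False)),
                 Action(evasive_stop)),
        Sequence(Condition(lambda bb: bb.get("command") == "pickup"),
                 Action(plan_grasp)),
    )

    blackboard = {"obstacle_close": False, "command": "pickup"}
    print(tree.tick(blackboard))   # RUNNING: the pickup command is being executed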

Evaluation Results

View in full screen mode for clarity.


For further details on OntoAgent, please refer to the following books:

Linguistics for the Age of AI

MIT Press, 2021

Agents in the Long Game of AI

MIT Press, 2024


Acknowledgments

This research was supported in part by grant #N00014-23-1-2060 from the U.S. Office of Naval Research. Any opinions or findings expressed in this material are those of the authors and do not necessarily reflect the views of the Office of Naval Research.
