A Survey of Semantic Reasoning Frameworks for Robotic Systems
Weiyu Liu,
Angel Daruna,
Kartik Ramachandruni**,
Maithili Patel**,
Sonia Chernova
Robotics and Autonomous Systems (RAS) 2023
Paper | Abstract
Robots are increasingly transitioning from specialized, single-task machines to general-purpose systems that operate in diverse and dynamic environments. To address the challenges associated with operation in real-world domains, robots must effectively generalize knowledge, learn, and be transparent in their decision making. This survey examines Semantic Reasoning techniques for robotic systems, which enable robots to encode and use semantic knowledge, including concepts, facts, ideas, and beliefs about the world. Continually perceiving, understanding, and generalizing semantic knowledge allows a robot to identify the meaningful patterns shared between problems and environments, and therefore more effectively perform a wide range of real-world tasks. We identify the three common components that make up a computational Semantic Reasoning Framework: knowledge sources, computational frameworks, and world representations. We analyze the existing implementations and the key characteristics of these components, highlight the many interactions that occur between them, and examine their integration for solving robotic tasks related to five aspects of the world, including objects, spaces, agents, tasks, and actions. By analyzing the computational formulation and underlying mechanisms of existing methods, we provide a unified view of the wide range of semantic reasoning techniques and identify open areas for future research.
Attentive Task-Net: Self Supervised Task-Attention Network for Imitation Learning using Video Demonstration
Kartik Ramachandruni,
Madhu Vankadari,
Anima Majumder,
Samrat Dutta,
Swagat Kumar
International Conference on Robotics and Automation (ICRA) 2020
Paper | Abstract
This paper proposes an end-to-end self-supervised feature representation network named Attentive Task-Net, or AT-Net, for video-based task imitation. The proposed AT-Net incorporates a novel multi-level spatial attention module to identify the intended task demonstrated by the expert. The neural connections in AT-Net ensure that relevant information in the demonstration is amplified and irrelevant information is suppressed while learning task-specific feature embeddings. This is achieved by a weighted combination of multiple intermediate feature maps of the input image at different stages of the CNN pipeline, where the weights are given by compatibility scores predicted by the attention module for the respective feature maps. The AT-Net is trained using a metric learning loss that decreases the distance between feature representations of concurrent frames from multiple viewpoints and increases the distance between temporally consecutive frames. The AT-Net features are then used to formulate a reinforcement learning problem for task imitation. Through experiments on the publicly available Multi-view Pouring dataset, it is demonstrated that the output of the attention module highlights the task-specific objects while suppressing the rest of the background. The efficacy of the proposed method is further validated through qualitative and quantitative comparison with a state-of-the-art technique, along with extensive ablation studies. The proposed method is used to imitate a pouring task, where an RL agent is trained with AT-Net features in the Gazebo simulator. Our findings show that AT-Net achieves a 6.5% decrease in alignment error and reduces the number of training iterations by almost 155k relative to the state-of-the-art while satisfactorily imitating the intended task.
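To make the two core ideas above concrete, here is a minimal sketch (not the authors' code) of (1) a compatibility-score spatial attention module that weights an intermediate CNN feature map against a global descriptor, and (2) a time-contrastive triplet loss that pulls together concurrent frames from different viewpoints and pushes apart temporally nearby frames. All class and variable names are illustrative assumptions.

```python
# Hedged sketch of attention-weighted feature pooling and a
# time-contrastive triplet loss; names and dimensions are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialAttention(nn.Module):
    """Score one intermediate feature map against a global descriptor."""

    def __init__(self, local_channels: int, global_dim: int):
        super().__init__()
        # Project local features to the global descriptor's dimension.
        self.project = nn.Conv2d(local_channels, global_dim, kernel_size=1)

    def forward(self, local_feat: torch.Tensor, global_feat: torch.Tensor):
        # local_feat: (B, C, H, W); global_feat: (B, D)
        proj = self.project(local_feat)                              # (B, D, H, W)
        B, D, H, W = proj.shape
        # Dot-product compatibility score at every spatial location.
        scores = (proj * global_feat.view(B, D, 1, 1)).sum(dim=1)    # (B, H, W)
        attn = F.softmax(scores.view(B, -1), dim=1).view(B, 1, H, W)
        # Attention-weighted summary of the feature map.
        weighted = (attn * proj).sum(dim=(2, 3))                     # (B, D)
        return weighted, attn


def time_contrastive_triplet_loss(anchor, positive, negative, margin=0.2):
    """Concurrent multi-view frames are positives; temporally
    adjacent frames from the same view are negatives."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```

In a multi-level setup, one such attention module would be attached to each chosen stage of the CNN and the weighted summaries concatenated to form the task-specific embedding.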
Vision-based control of UR5 robot to track a moving object under occlusion using Adaptive Kalman Filter
Kartik Ramachandruni,
Shivam Jaiswal,
Suril V. Shah
Advances In Robotics (AIR) 2019
Paper | Abstract
This paper presents a robust method to track a moving object under occlusion using an off-the-shelf monocular camera and a 6 degree-of-freedom (DOF) articulated arm. The visual servoing problem of tracking a known object using data from a monocular camera can be solved with a simple closed-loop controller. However, such a system frequently fails when the object cannot be detected; to overcome this problem, an estimation-based tracking system is required. This work employs an Adaptive Kalman Filter (AKF) to improve the visual feedback of the camera. The role of the AKF is to estimate the position of the object when it is occluded or out of view and to remove the noise and uncertainties associated with the visual data. Two estimation models for the AKF are compared; of these, the Mean-Adaptive acceleration model is implemented on a 6-DOF UR5 articulated arm with a monocular camera mounted in an eye-in-hand configuration to follow the known object in 2D Cartesian space (without using depth information).
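The sketch below illustrates the general pattern described above: a Kalman filter over the object's image-plane position that keeps predicting through occlusions (no measurement) and adapts its measurement noise online. It assumes a simple constant-velocity motion model rather than the paper's Mean-Adaptive acceleration model, and the innovation-based noise adaptation rule is likewise an illustrative assumption.

```python
# Minimal adaptive Kalman-filter sketch for 2D image-plane tracking under
# occlusion; motion model and adaptation rule are simplifying assumptions.
import numpy as np


class AdaptiveKalmanTracker2D:
    def __init__(self, dt=1.0 / 30.0):
        # State: [x, y, vx, vy]; measurement: [x, y] pixel position.
        self.x = np.zeros(4)
        self.P = np.eye(4) * 10.0
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * 1e-2      # process noise
        self.R = np.eye(2) * 5.0       # measurement noise (adapted online)
        self.alpha = 0.3               # forgetting factor for adaptation

    def step(self, z=None):
        """One predict/update cycle; pass z=None when the object is occluded."""
        # Predict.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        if z is None:
            return self.x[:2]          # rely on the motion model alone
        # Update with the detected pixel position.
        z = np.asarray(z, dtype=float)
        y = z - self.H @ self.x                        # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        # Adapt R from the innovation so noisy detections are down-weighted.
        self.R = (1 - self.alpha) * self.R + self.alpha * np.outer(y, y)
        return self.x[:2]
```

In use, the controller would call step((u, v)) with each detected pixel position and step(None) whenever detection fails, feeding the returned estimate to the visual servoing loop.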