Research
I am currently working on user-assistive robots that learn user organizational preferences, such as object placement
and arrangement preferences, from passive observation rather than explicit task instructions or demonstrations.
My research aims to develop semantic reasoning techniques that learn novel rearrangement preferences
by integrating contextual cues from passive observations of partially arranged environments
(e.g., a half-empty fridge or cabinet), in order to generalize to previously unseen objects and households.
My broader research interests include employing user interaction to resolve
goal uncertainty in assistive tasks, user-adaptive task planning for human-robot collaboration, and robot imitation
learning from video demonstrations.
|
ConSOR: A Context-Aware Semantic Object Rearrangement Framework
for Partially Arranged Scenes
Kartik Ramachandruni,
Max Zuo,
Sonia Chernova
International Conference on Intelligent Robots and Systems (IROS) 2023, IROS ARC 2023 workshop
Paper | Code | Abstract | Poster | Workshop
Object rearrangement is the problem of enabling a robot to identify the correct object placement
in a complex environment. Prior work on object rearrangement has explored a diverse set of techniques
for following user instructions to achieve some desired goal state. Logical predicates, images of the
goal scene, and natural language descriptions have all been used to instruct a robot in how to arrange
objects. In this work, we argue that burdening the user with specifying goal scenes is not necessary
in partially arranged environments, such as common household settings. Instead, we show that contextual
cues from partially arranged scenes (i.e., the placement of some number of pre-arranged objects in the
environment) provide sufficient context to enable robots to perform object rearrangement without any
explicit user goal specification. We introduce ConSOR, a Context-aware Semantic Object Rearrangement
framework that utilizes contextual cues from a partially arranged initial state of the environment to
complete the arrangement of new objects, without explicit goal specification from the user. We
demonstrate that ConSOR strongly outperforms two baselines in generalizing to novel object arrangements
and unseen object categories. The code and data are available at https://github.com/kartikvrama/consor.
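To make the problem setup concrete, here is a minimal illustrative sketch of context-aware placement: a new object is assigned to the container whose already-placed objects are semantically most similar to it. This toy example is not ConSOR's actual model; the embedding lookup and container names are hypothetical stand-ins.

    # Illustrative sketch only: place a new object by semantic similarity to the
    # objects already present in each container. Not ConSOR's actual architecture;
    # the embedding table and container names below are hypothetical.
    import numpy as np

    def place_new_object(new_obj, containers, embeddings):
        """Pick the container whose pre-arranged objects best match new_obj."""
        new_vec = embeddings[new_obj]
        best_container, best_score = None, -np.inf
        for container, placed_objects in containers.items():
            if not placed_objects:        # empty containers give no contextual cue
                continue
            vecs = np.stack([embeddings[o] for o in placed_objects])
            centroid = vecs.mean(axis=0)  # summary of the container's contents
            # cosine similarity between the new object and the container centroid
            score = np.dot(new_vec, centroid) / (
                np.linalg.norm(new_vec) * np.linalg.norm(centroid) + 1e-8)
            if score > best_score:
                best_container, best_score = container, score
        return best_container

    # Hypothetical usage with a half-arranged kitchen:
    # containers = {"fridge_shelf": ["milk", "juice"], "cabinet_shelf": ["rice", "pasta"]}
    # place_new_object("yogurt", containers, embeddings)  -> "fridge_shelf"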
|
UHTP: A User-Aware Hierarchical Task Planning Framework for
Communication-Free, Mutually-Adaptive Human-Robot Collaboration
Kartik Ramachandruni*,
Cassandra Kent*,
Sonia Chernova
Transactions on Human-Robot Interaction (THRI) 2023, ACC HAI 2022 workshop
Paper | Abstract | Workshop
Collaborative human-robot task execution approaches require
mutual adaptation, allowing both the human and robot partners to
take active roles in action selection and role assignment to
achieve a single shared goal. Prior works have utilized a
leader-follower paradigm in which either agent must follow the
actions specified by the other agent. We introduce the User-aware
Hierarchical Task Planning (UHTP) framework, a communication-free
human-robot collaborative approach for adaptive execution of
multi-step tasks that moves beyond the leader-follower paradigm.
Specifically, our approach enables the robot to observe the human,
perform actions that support the human's decisions, and actively
select actions that maximize the expected efficiency of the
collaborative task. In turn, the human chooses actions based on
their observation of the task and the robot, without being
dictated by a scheduler or the robot. We evaluate UHTP both in
simulation and in a human subjects experiment of a collaborative
drill assembly task. Our results show that UHTP achieves more
efficient task plans and shorter task completion times than
non-adaptive baselines across a wide range of human behaviors,
that interacting with a UHTP-controlled robot reduces the human's
cognitive workload, and that humans prefer to work with our
adaptive robot over a fixed-policy alternative.
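The core idea of the robot's action selection can be sketched in a few lines: given a belief over what the human is currently doing, the robot picks the available action that minimizes the expected remaining cost of the shared task. This is a simplified illustration under assumed interfaces, not the UHTP implementation; the cost estimator and action names are hypothetical.

    # Hedged sketch of user-aware action selection: choose the robot action that
    # minimizes expected task completion cost under a belief over the human's action.
    # remaining_cost(human_action, robot_action) is a hypothetical estimator of the
    # cost left in the task when the two agents work on those steps concurrently.
    def select_robot_action(available_actions, human_action_belief, remaining_cost):
        best_action, best_expected = None, float("inf")
        for robot_action in available_actions:
            expected = 0.0
            for human_action, prob in human_action_belief.items():
                if robot_action == human_action:
                    # Duplicating the human's current step adds no progress,
                    # so treat the robot as contributing nothing here.
                    expected += prob * remaining_cost(human_action, None)
                else:
                    expected += prob * remaining_cost(human_action, robot_action)
            if expected < best_expected:
                best_action, best_expected = robot_action, expected
        return best_action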
|
A Survey of Semantic Reasoning Frameworks for Robotic Systems
Weiyu Liu*,
Angel Daruna*,
Kartik Ramachandruni**,
Maithili Patel**,
Sonia Chernova
Robotics and Autonomous Systems (RAS) 2023
Paper | Abstract
Robots are increasingly transitioning from specialized, single-task machines to general-purpose
systems that operate in diverse and dynamic environments. To address the challenges associated
with operation in real-world domains, robots must effectively generalize knowledge, learn, and be
transparent in their decision making. This survey examines Semantic Reasoning techniques for
robotic systems, which enable robots to encode and use semantic knowledge, including concepts,
facts, ideas, and beliefs about the world. Continually perceiving, understanding, and generalizing
semantic knowledge allows a robot to identify the meaningful patterns shared between problems and
environments, and therefore more effectively perform a wide range of real-world tasks. We identify
the three common components that make up a computational Semantic Reasoning Framework: knowledge
sources, computational frameworks, and world representations. We analyze the existing
implementations and the key characteristics of these components, highlight the many interactions
that occur between them, and examine their integration for solving robotic tasks related to five
aspects of the world, including objects, spaces, agents, tasks, and actions. By analyzing the
computational formulation and underlying mechanisms of existing methods, we provide a unified view
of the wide range of semantic reasoning techniques and identify open areas for future
research.
|
Attentive Task-Net: Self Supervised Task-Attention Network for Imitation Learning using
Video Demonstration
Kartik Ramachandruni,
Madhu Vankadari,
Anima Majumder,
Samrat Dutta,
Swagat Kumar
International Conference on Robotics and Automation (ICRA) 2020
Paper | Abstract
This paper proposes an end-to-end self-supervised feature representation network named Attentive
Task-Net or AT-Net for video-based task imitation. The proposed AT-Net incorporates a novel
multi-level spatial attention module to identify the intended task demonstrated by the expert. The
neural connections in AT-Net ensure the relevant information in the demonstration is amplified and
the irrelevant information is suppressed while learning task-specific feature embeddings. This is
achieved by a weighted combination of multiple intermediate feature maps of the input image at
different stages of the CNN pipeline. The weights of the combination are given by the
compatibility scores, predicted by the attention module for respective feature maps. The AT-Net is
trained using a metric learning loss which aims to decrease the distance between the feature
representations of concurrent frames from multiple viewpoints and increase the distance between
temporally consecutive frames. The AT-Net features are then used to formulate a reinforcement
learning problem for task imitation. Through experiments on the publicly available Multi-view
pouring dataset, it is demonstrated that the output of the attention module highlights the
task-specific objects while suppressing the rest of the background. The efficacy of the proposed
method is further validated by qualitative and quantitative comparison with a state-of-the-art
technique along with intensive ablation studies. The proposed method is implemented to imitate a
pouring task in which an RL agent is trained using AT-Net features in the Gazebo simulator. Our findings show
that AT-Net achieves a 6.5% decrease in alignment error along with a reduction in the number of
training iterations by almost 155k over the state-of-the-art while satisfactorily imitating the
intended task.
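The metric-learning objective described above (pull together concurrent frames from different viewpoints, push apart temporally adjacent frames) can be realized with a triplet margin loss. The sketch below is one common instantiation under assumed tensor shapes and margin; it is illustrative rather than the paper's exact formulation.

    # Hedged sketch of a multi-view triplet loss: anchor and positive are embeddings
    # of the same moment seen from two viewpoints; negative is a temporally nearby
    # frame from the anchor's viewpoint. Margin and shapes are illustrative.
    import torch
    import torch.nn.functional as F

    def multiview_triplet_loss(anchor, positive, negative, margin=0.2):
        """anchor, positive, negative: embedding batches of shape (B, D)."""
        d_pos = F.pairwise_distance(anchor, positive)  # concurrent, cross-view pairs
        d_neg = F.pairwise_distance(anchor, negative)  # temporally consecutive pairs
        return torch.clamp(d_pos - d_neg + margin, min=0.0).mean()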
|
Design and source code from Jon Barron's website