On Visual Feature Representations for Transition State Learning in Robotic Task Demonstrations

Title: On Visual Feature Representations for Transition State Learning in Robotic Task Demonstrations
Publication Type: Journal Article
Year of Publication: 2015
Authors: Garg, A., Krishnan, S., Murali, A., Pokorny, F. T., Abbeel, P., Darrell, T., & Goldberg, K.
Volume: 44
Date Published: 2015
Abstract

Robot learning from raw trajectory data is challenging due to temporal and spatial inconsistencies. A key problem is extracting conceptual task structure from repeated human demonstrations. In prior work, we proposed a Switched Linear Dynamical System (SLDS) characterization of the demonstrations; the key insight is that switching events induce a density over the state space. A mixture model characterization of this density, called Transition State Clustering (TSC), extracts the latent task structure. However, robotics is increasingly moving towards state spaces derived from vision, e.g., from Convolutional Neural Networks (CNNs). This workshop paper describes an extension called Transition State Clustering with Deep Learning (TSC-DL), in which we explore augmenting kinematic and dynamic states with features from pre-trained deep CNNs. We report results on two datasets comparing CNN architectures (AlexNet and VGG), choices of convolutional layer for featurization, dimensionality reduction techniques, and visual feature encodings. We find that TSC-DL matches manual annotations with up to 0.806 Normalized Mutual Information (NMI). We also find that using both kinematic and visual data yields increases of up to 0.215 NMI compared to using kinematics alone. Video results: http://berkeleyautomation.github.io/tsc-dl/
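
To make the pipeline summarized in the abstract concrete, the following is a minimal, self-contained Python sketch of a TSC-DL-style analysis. It is not the authors' implementation: synthetic arrays stand in for real demonstrations, the matrix visual_feats is a placeholder for per-frame activations from a pre-trained AlexNet or VGG convolutional layer, a mixture model over consecutive state pairs stands in for the SLDS fit, and fixed-size Gaussian mixtures (scikit-learn's GaussianMixture) replace the paper's nonparametric mixture models. All variable names and component counts are illustrative assumptions.

# Minimal TSC-DL-style sketch (illustrative only, not the authors' code).
# Assumptions: synthetic data; `visual_feats` stands in for pre-trained CNN
# activations; a GMM over (x_t, x_{t+1}) pairs stands in for the SLDS fit;
# fixed component counts replace the paper's nonparametric mixtures.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.metrics import normalized_mutual_info_score

rng = np.random.default_rng(0)
T, kin_dim, cnn_dim = 500, 7, 4096   # timesteps, kinematic dims, CNN feature dims

# Stand-ins for one demonstration: a kinematic trajectory and per-frame visual features.
kinematics = np.cumsum(rng.normal(size=(T, kin_dim)), axis=0)  # e.g., end-effector pose over time
visual_feats = rng.normal(size=(T, cnn_dim))                   # e.g., one conv-layer activation vector per frame

# 1) Reduce the high-dimensional visual features and append them to the kinematic state.
visual_small = PCA(n_components=10).fit_transform(visual_feats)
state = np.hstack([kinematics, visual_small])

# 2) Approximate the switched linear dynamics: label each timestep by fitting a
#    mixture model over consecutive state pairs (x_t, x_{t+1}).
pairs = np.hstack([state[:-1], state[1:]])
regimes = GaussianMixture(n_components=5, random_state=0).fit_predict(pairs)

# 3) Transition states are the timesteps where the regime label switches.
switch_idx = np.where(np.diff(regimes) != 0)[0] + 1
transition_states = state[switch_idx]

# 4) Cluster the transition states; the clusters are the recovered task structure.
tsc_labels = GaussianMixture(n_components=4, random_state=0).fit_predict(transition_states)

# 5) Compare against manual annotations (random stand-ins here) using NMI, the
#    agreement metric reported in the paper.
manual = rng.integers(0, 4, size=len(switch_idx))
print("NMI vs. manual annotation:", normalized_mutual_info_score(manual, tsc_labels))

In practice, visual_feats would be obtained by running each video frame through a pre-trained CNN and taking the activations of a chosen convolutional layer; which architecture and layer to use is exactly the featurization choice the paper compares.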

URL: http://www.icsi.berkeley.edu/pubs/vision/onvisualfeaturereps15.pdf
ICSI Research Group: Vision