For high degree-of-freedom manipulators such as PR2 and Baxter, a key problem is finding a trajectory that is not only valid geometrically (i.e., feasible and obstacle-free, the criterion most existing planners focus on) but also satisfies the user's preferences. These preferences depend on the surrounding context and the task being performed.

In this work, we propose an algorithm that learns such preferences by eliciting online feedback from the user, where the feedback need not be an optimal demonstration. We demonstrate that the robot generalizes what it learns, producing preferred trajectories in new environments and situations such as household chores and grocery-checkout tasks.
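This learning-from-sub-optimal-feedback setting can be illustrated with a preference-perceptron-style update: the robot proposes a trajectory, the user slightly improves it (e.g., by dragging a waypoint), and the scoring weights move toward the improvement. A minimal sketch, assuming a linear score over trajectory features; the function names and toy feature vectors below are illustrative, not from the paper's implementation:

```python
import numpy as np

def update(w, phi_proposed, phi_improved):
    # Coactive-style update: shift the weight vector toward the user's
    # (possibly sub-optimal) improvement and away from the robot's proposal.
    return w + (phi_improved - phi_proposed)

def score(w, phi):
    # Linear scoring function over trajectory features.
    return float(w @ phi)

# Toy example with 3-dimensional trajectory features (hypothetical values).
w = np.zeros(3)
phi_proposed = np.array([1.0, 0.0, 2.0])   # robot's planned trajectory
phi_improved = np.array([0.5, 1.0, 1.0])   # user's corrected trajectory

w = update(w, phi_proposed, phi_improved)
# After one update, the corrected trajectory scores higher than the proposal.
print(score(w, phi_improved) > score(w, phi_proposed))  # True
```

The key property of this style of update is that it only requires a relative improvement from the user, never a fully optimal demonstration.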

Multiple trajectories for moving an egg container: the robot plans a bad trajectory (waypoints 1-2-4) that brings the knife close to the flower; as feedback, the user corrects waypoint 2 and moves it to waypoint 3. (Right) A user providing zero-G feedback on Baxter.

Popular Press: Discovery Channel Daily Planet (at 6:00 minutes), IEEE Spectrum, Daily Mail (UK), TechCrunch, FOX News, Kurzweil AI, CBS News, CNET, NBC News, Huffington Post (UK), Gizmodo, PopSci, Slashdot (front page), ACM TechNews, French Tribune

Publications

Learning Trajectory Preferences for Manipulators via Iterative Improvement. In NIPS, 2013.

Ashesh Jain, Brian Wojcik, Thorsten Joachims, and Ashutosh Saxena. [PDF]

Beyond geometric path planning: Learning context-driven trajectory preferences via sub-optimal feedback. In ISRR, 2013.

Ashesh Jain, Shikhar Sharma, and Ashutosh Saxena. [PDF]

An earlier version of this work was presented at the ICML workshop on Robot Learning, June 2013. [PDF]

PowerPoint slides [PPT]

Video

People