Understanding human behavior has received considerable attention in the neuroscience literature. Existing research addresses a central question: given measurements of a human movement, what underlying optimality criterion did the human optimize to produce it? Inverse Optimal Control (IOC) is a well-established approach that interprets biological movements through the lens of optimal control theory: it learns the criterion that best explains the demonstrated behavior. Thus far, gradient-based techniques have been used to recover the unknown behavior cost, but they are limited to finding only locally optimal parameters. In this paper, behavior learning is modeled as an Inverse Linear Quadratic Regulator (ILQR) problem, in which linear behavior dynamics and a quadratic cost are assumed. An efficient meta-heuristic, Particle Swarm Optimization (PSO), is used to retrieve the unknown cost in the proposed ILQR problem. Moreover, an evolving-ILQR algorithm is proposed that refines the learned cost whenever new, unseen demonstrations become available, mitigating over-fitting. The reach-to-grasp behavior is studied to evaluate the proposed approaches. The results are encouraging and consistent with findings in the neuroscience literature. The evolving-ILQR algorithm is additionally evaluated in successive scenarios, in which the behavior cost retrieved so far is incrementally refined as new, unseen demonstrations arrive.
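To make the ILQR-via-PSO idea concrete, the following is a minimal sketch, not the paper's actual method: it assumes a known scalar linear system x(t+1) = a·x(t) + b·u(t), a hidden quadratic cost Σ q·x² + r·u² with the control weight r fixed to 1 (the overall cost scale is unobservable), and an expert demonstration generated by the optimal LQR policy. A basic PSO then searches for the state weight q whose induced policy best reproduces the demonstrated trajectory. All names and constants (A, B, Q_TRUE, swarm settings) are illustrative assumptions.

```python
import random

A, B = 0.9, 0.5      # assumed known linear dynamics
R = 1.0              # control weight fixed to 1 (cost scale is unobservable)
Q_TRUE = 4.0         # hidden state weight the demonstrator optimized

def lqr_gain(q, r=R, iters=200):
    """Solve the scalar discrete Riccati equation by fixed-point iteration."""
    p = q
    for _ in range(iters):
        p = q + A * A * p - (A * B * p) ** 2 / (r + B * B * p)
    return A * B * p / (r + B * B * p)   # optimal feedback gain k, u = -k*x

def rollout(k, x0=1.0, horizon=30):
    """Closed-loop state trajectory under the feedback law u = -k*x."""
    xs, x = [], x0
    for _ in range(horizon):
        xs.append(x)
        x = A * x + B * (-k * x)
    return xs

demo = rollout(lqr_gain(Q_TRUE))   # the "human" demonstration

def loss(q):
    """Mismatch between the demonstration and the candidate cost's policy."""
    traj = rollout(lqr_gain(q))
    return sum((d - t) ** 2 for d, t in zip(demo, traj))

def pso(n_particles=20, iters=60, lo=0.1, hi=20.0):
    """Basic global-best PSO over the single unknown weight q."""
    random.seed(0)
    pos = [random.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest, pbest_f = pos[:], [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g], pbest_f[g]
    w, c1, c2 = 0.7, 1.5, 1.5            # inertia and attraction coefficients
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = random.random(), random.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            pos[i] = min(max(pos[i] + vel[i], lo), hi)   # clamp to search box
            f = loss(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i], f
    return gbest

q_hat = pso()   # recovered state weight, close to Q_TRUE
```

Because PSO evaluates the loss directly, it needs no gradient of the policy with respect to the cost weights and can escape the local optima that gradient-based IOC methods are limited to; the evolving-ILQR idea would correspond to re-running such a search warm-started from the current estimate whenever new demonstrations are appended to the loss.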