Wang, Peng Edward
Explainable AI for Human Action Prediction in Human-Robot Collaboration towards Smart Manufacturing
INTRODUCTION:
While some manufacturing operations can be automated relatively easily, many involve sophisticated skills and adaptations and are not easily automatable. Integrating humans with machines/robots can provide an effective solution for highly efficient automation of complex manufacturing processes. The integration transforms human skills/knowledge into programmable robot motions and relaxes the degree of human involvement, making the automation easier to realize as well as more reliable, robust, and adaptive. This is human-robot collaboration (HRC), or co-robotics, which enables robots to collaborate and coordinate effectively and safely with multiple other agents, either people or robots. Effective solutions for HRC require deep analysis of human actions that are dominated by human skills/knowledge, followed by adaptation and embedding of that knowledge into the planning and control of robot motion. The realization of HRC depends on multiple factors, one of which is human action recognition and prediction from videos and other sensor data captured on the factory floor. This work will study how deep learning can be leveraged to accurately recognize and predict human actions.
METHODS:
This research investigates deep convolutional neural networks (DCNNs) to recognize human actions and identify the context of each action, for accurate and reliable inference of human workers' intentions in performing manufacturing tasks. Specifically, well-established DCNN architectures will be adapted through transfer-learning-enabled fine-tuning for recognition of human worker actions. In the transferred DCNN, the convolutional and pooling layers for feature extraction are fixed, while only the newly added fully connected layers are tuned from scratch with random initialization. The transfer learning is based on the assumption that feature learning (especially lower-level feature learning) in a DCNN generalizes across application domains.
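The freeze-the-extractor, tune-the-head idea can be illustrated with a minimal numpy sketch. Everything here is a stand-in assumption, not the group's actual setup: a random ReLU projection plays the role of the pretrained convolutional/pooling layers, the toy two-class data stands in for action images, and the dimensions and learning rate are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for the pretrained feature extractor: its weights
# are frozen and never updated, mirroring the fixed conv/pooling layers.
W_frozen = rng.standard_normal((4, 16))

def extract_features(X):
    """Frozen feature extractor (weights never updated during tuning)."""
    return np.maximum(0.0, X @ W_frozen)       # ReLU features

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy two-class data standing in for action images.
X = rng.standard_normal((200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
Y = np.eye(2)[y]

# Newly added classification head, randomly initialized, tuned from scratch.
W_head = 0.01 * rng.standard_normal((16, 2))
feats = extract_features(X)                    # computed once; extractor is fixed
for _ in range(500):                           # plain gradient descent on the head
    P = softmax(feats @ W_head)
    grad = feats.T @ (P - Y) / len(X)
    W_head -= 0.1 * grad                       # only the head is updated

acc = (np.argmax(feats @ W_head, axis=1) == y).mean()
print(f"training accuracy of the tuned head: {acc:.2f}")
```

Because the extractor is fixed, its features need only be computed once per sample, which is also why fine-tuning a transferred head is far cheaper than training the full network.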
To address the black-box nature of deep learning networks, we will investigate a post-hoc technique, layer-wise relevance propagation (LRP), for quantifying the contributions of individual input values to the output (i.e., the classification decision) of DCNNs, so as to determine what information in the images the DCNN uses to distinguish between human actions. The quantified values, known as relevance scores, obtained by applying LRP to individual DCNN decisions on several samples of each input type, are analyzed for their consistency among samples belonging to the same action.
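As an illustration of how LRP redistributes an output score back to the input, here is a minimal numpy sketch of the epsilon rule for a tiny two-layer ReLU network. The network, its random weights, and the epsilon value are hypothetical; in practice relevance is propagated through the full DCNN, layer by layer, with rules chosen per layer type.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny two-layer ReLU network with hypothetical random weights (no biases,
# so total relevance is conserved layer to layer up to the epsilon term).
W1 = rng.standard_normal((8, 6))    # input dim 8 -> hidden dim 6
W2 = rng.standard_normal((6, 3))    # hidden dim 6 -> 3 action classes

def forward(x):
    """Return the activation at every layer for one input vector."""
    a1 = np.maximum(0.0, x @ W1)
    out = a1 @ W2
    return [x, a1, out]

def lrp_epsilon(activations, weights, eps=1e-9):
    """Propagate relevance of the winning class back to the input (LRP-eps)."""
    out = activations[-1]
    R = np.zeros_like(out)
    c = int(np.argmax(out))
    R[c] = out[c]                              # start from the top logit
    for a, W in zip(reversed(activations[:-1]), reversed(weights)):
        z = a @ W                              # upper-layer pre-activations
        s = R / (z + eps * np.where(z >= 0, 1.0, -1.0))  # stabilized ratio
        R = a * (W @ s)                        # redistribute to the lower layer
    return R

x = rng.standard_normal(8)
acts = forward(x)
R_in = lrp_epsilon(acts, [W1, W2])
print(R_in)                                    # per-input relevance scores
print(R_in.sum(), acts[-1].max())              # approximately equal (conservation)
```

The last line shows the conservation property that makes the scores interpretable: the relevance assigned to the inputs sums (up to the epsilon stabilizer) to the output score being explained, so each input value receives a meaningful share of the decision.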
Personnel:
- Edward Wang
- Matthew Russel (Ph.D. student)
- Mellissa Anders (master's student)
- Joe Kershaw (master's student)
Computational Methods:
- Deep learning (to be developed by the group)
Software:
- Matlab (available at UK)
- Python (free access online)
UK and non-UK collaborators:
None
Grants:
Publications:
Center for Computational Sciences