Abstract

Human–robot collaboration (HRC) has become an integral element of many manufacturing and service industries. A fundamental requirement for safe HRC is understanding and predicting human trajectories and intentions, especially when humans and robots operate in close proximity. Although existing research emphasizes predicting either human motions or intentions, predicting both simultaneously remains a key challenge. This paper addresses this gap by developing a multi-task learning framework built on a bidirectional long short-term memory (Bi-LSTM) encoder–decoder architecture that takes motion data from both human and robot trajectories as input and performs two tasks simultaneously: human trajectory prediction and human intention prediction. The first task predicts human trajectories by reconstructing the motion sequences. The second task evaluates two approaches to intention prediction: a supervised method, a support vector machine (SVM) that classifies human intention from the latent representation, and an unsupervised method, a hidden Markov model (HMM) that decodes the latent features. Four encoder designs are evaluated for feature extraction: interaction-attention, interaction-pooling, interaction-seq2seq, and seq2seq. The framework is validated through a case study of a desktop disassembly task with robots operating at different speeds. The results include a comparison of the encoder designs, an analysis of the impact of incorporating robot motion into the encoder, and detailed visualizations. The findings show that the proposed framework can accurately predict both human trajectories and intentions.
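
To make the shared-encoder, two-task arrangement concrete, the following is a minimal sketch in PyTorch. It is not the authors' implementation: every dimension, layer size, and name here (e.g., BiLSTMEncoderDecoder, horizon) is an illustrative assumption, and it corresponds most closely to a plain seq2seq encoder rather than the interaction-attention or interaction-pooling variants.

```python
# Minimal sketch (not the paper's code): a Bi-LSTM encoder-decoder whose
# shared latent vector serves both tasks described in the abstract.
# All dimensions are assumed for illustration.
import torch
import torch.nn as nn

class BiLSTMEncoderDecoder(nn.Module):
    def __init__(self, input_dim=12, hidden_dim=64, latent_dim=32,
                 human_dim=6, horizon=20):
        super().__init__()
        # Encoder consumes concatenated human + robot motion features.
        self.encoder = nn.LSTM(input_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        self.to_latent = nn.Linear(2 * hidden_dim, latent_dim)
        # Decoder unrolls the latent vector into a future human trajectory.
        self.decoder = nn.LSTM(latent_dim, hidden_dim, batch_first=True)
        self.readout = nn.Linear(hidden_dim, human_dim)
        self.horizon = horizon

    def forward(self, x):
        # x: (batch, time, input_dim) observed human + robot motion
        _, (h, _) = self.encoder(x)
        # Concatenate the final forward and backward hidden states.
        z = self.to_latent(torch.cat([h[-2], h[-1]], dim=-1))
        # Task 1: repeat the latent vector across the prediction horizon.
        dec_out, _ = self.decoder(z.unsqueeze(1).repeat(1, self.horizon, 1))
        traj = self.readout(dec_out)  # predicted human trajectory
        return traj, z                # z also feeds the intention task
```

The latent vectors would then feed the two intention-prediction routes the abstract names. Again a hedged sketch with stand-in data, using scikit-learn's SVC and hmmlearn's GaussianHMM as plausible off-the-shelf choices for the SVM and HMM, respectively:

```python
# Hypothetical downstream use of latent features Z for intention prediction;
# Z and y are stand-ins, not data from the paper.
import numpy as np
from sklearn.svm import SVC
from hmmlearn.hmm import GaussianHMM

Z = np.random.randn(200, 32)              # stand-in latent features
y = np.random.randint(0, 4, size=200)     # stand-in intention labels

svm = SVC(kernel="rbf").fit(Z, y)         # supervised route (SVM)
hmm = GaussianHMM(n_components=4).fit(Z)  # unsupervised route (HMM)
states = hmm.predict(Z)                   # decoded states read as intentions
```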
