Publication details
Zhou, Lingli, Zhang, Haofeng, Long, Yang, Shao, Ling & Yang, Jingyu (2019). Depth Embedded Recurrent Predictive Parsing Network for Video Scenes. IEEE Transactions on Intelligent Transportation Systems 20(12): 4643-4654.
- Publication type: Journal Article
- ISSN/ISBN: 1524-9050 (print), 1558-0016 (electronic)
- DOI: 10.1109/TITS.2019.2909053
Semantic segmentation-based scene parsing plays
an important role in automatic driving and autonomous navigation.
However, most of the previous models only consider
static images, and fail to parse sequential images because they
do not take the spatial-temporal continuity between consecutive
frames in a video into account. In this paper, we propose a depth
embedded recurrent predictive parsing network (RPPNet), which
analyzes preceding consecutive stereo pairs to predict the parsing result.
In this way, RPPNet effectively learns the dynamic information
from historical stereo pairs, so as to correctly predict the
representations of the next frame. The other contribution of
this paper is to systematically study the video scene parsing
(VSP) task, in which we use RPPNet to enhance conventional
image parsing features by adding spatial-temporal information.
The experimental results show that our proposed method RPPNet
can achieve fine predictive parsing results on the Cityscapes dataset, and
the predictive features of RPPNet can significantly improve
conventional image parsing networks in the VSP task.