Review on the 3D-Object Detection Based on Deep Learning
PENG Yuhui; ZHENG Weihong; ZHANG Jianfeng
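Abstract:
Building on an introduction to 2D object detection methods, this paper discusses the deep neural networks used in deep-learning-based 3D object detection. It covers three basic classes of methods: indirect processing, direct processing, and fusion processing. It then analyzes and compares the strengths and weaknesses of each network in terms of 3D detection speed and accuracy, providing a reference for selecting object detection methods for vehicle-mounted LiDAR.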
Keywords: autonomous driving; deep learning; LiDAR; object detection; 3D point cloud
Foundation: Major Industry-University Cooperation Project of the Fujian Provincial Department of Science and Technology (2017H6007)
DOI: 10.19620/j.cnki.1000-3703.20190695