Monocular 3D Object Detection for Autonomous Vehicle

Wasif Maqsood

URI: http://repository.cuilahore.edu.pk/xmlui/handle/123456789/4153

Date: 2024-06-27

Abstract:

Environment perception, 3D object detection, and estimation of the distance of objects from the camera are among the most active topics in computer vision and robotics, and are widely explored with the aim of maximizing detection accuracy for autonomous vehicles. For reliable and safe driving, self-driving cars must perceive their surroundings accurately. 3D object detection and distance estimation are challenging tasks because of the varying viewing angles of moving vehicles and the computational resources required to process video data. Distance estimation from the camera is widely used in autonomous vehicles and robots for safe driving. In this research, a two-stage deep learning architecture is proposed for 3D object detection, pose estimation, and estimation of object distance using monocular cameras installed in vehicles. In contrast to existing methods that only regress 3D dimensions, we propose a method in which a deep neural network regresses 2D bounding boxes, geometric estimates, and the distance from the camera; these estimates are then used to regress accurate 3D object properties and to estimate pose in order to construct a stable 3D bounding box. Our model is tested on the KITTI dataset, which consists of images of vehicles in different environments. The dataset contains separate sets for training and testing (7481 and 7518 images, respectively), with cars and pedestrians as the main target classes. In this thesis we discuss deep learning techniques for computer vision. More precisely, we focus on 3D bounding boxes and distance estimation from the scene using only a single camera, for autonomous vehicles and robots. In this chapter we present an introduction to the problem, together with the contributions and objectives of this thesis.
