Qualification Type: | PhD |
---|---|
Location: | Leeds |
Funding for: | UK Students |
Funding amount: | £19,237 per year for 3.5 years |
Hours: | Full Time |
Placed On: | 28th November 2024 |
Closes: | 28th February 2025 |
Project Title: PhD Studentship: Scene Understanding in Challenging Environments with Multimodal Deep Learning
Number of Positions: 1
School/Faculty: Computer Science
Closing Date: 28 February 2025
Eligibility: UK Only
Funding: School of Computer Science Studentship consisting of the award of fees, together with a tax-free maintenance grant of £19,237 per year for 3.5 years.
Lead Supervisor
Dr Qian Xie: q.xie2@leeds.ac.uk
Project summary
Accurate 3D scene understanding is critical for robotics, autonomous driving, and similar applications. While current models perform well in standard environments (e.g., good lighting, no occlusions), they struggle in challenging conditions like darkness, fog, rain, and occlusions. This project aims to advance scene understanding in such environments using multimodal deep learning approaches. Leveraging cutting-edge technologies such as large models, diffusion models, and vision transformers, the research will focus on integrating diverse data inputs, including 2D RGB images, thermal images, depth maps, and 3D point clouds.
Applications in robotics, autonomous driving, and similar fields rely heavily on accurate 3D scene understanding. While state-of-the-art models have achieved remarkable performance in standard environments (e.g., with good lighting, no occlusions, and clean image data), understanding 3D scenes in challenging environments (e.g., darkness, occlusions, smoke, fog, rain, or snow) remains a significant challenge. Conventional approaches rely heavily on single-modal data (e.g., 2D RGB images), which limits their effectiveness in non-standard scenarios. Moreover, current models trained on standard conditions often fail to generalize to these difficult environments, leaving room for substantial improvement.
This project aims to leverage advanced deep learning technologies, including large models, diffusion models, and vision transformers, to develop novel algorithms for multimodal data integration. The goal is to enhance machine perception and understanding of complex scenes in challenging conditions using diverse data inputs, such as 2D RGB images, thermal images, depth images, and 3D point clouds.
Informal enquiries are welcome: please contact the lead supervisor with your CV and any questions.
Entry requirements
A first-class or upper second-class British Bachelor's honours degree (or equivalent) in an appropriate discipline.
Subject Area
Computer Science & IT, Artificial Intelligence
Keywords
3D, artificial intelligence, computer vision