3D Visual Programming Language

Mono3DVLT: Monocular-Video-Based 3D Visual Language Tracking

Abstract: Visual-Language Tracking (VLT) is emerging as a promising paradigm to bridge the human-machine performance gap. For single objects, VLT broadens the problem scope to text-driven video ...

IEEE

Vision-Language Models for 3D Scene Understanding: Applications and Developments

Abstract: This article focuses on the applications and advances of Visual Language Modeling (VLM) in 3D scene understanding. The article details several mainstream visual language models and analyzes ...
