Abstract: This work introduces Talk2BEV, a large vision-language model (LVLM) interface for bird’s-eye view (BEV) maps commonly used in autonomous driving. While existing perception systems for ...
Abstract: Vision-and-Language Navigation in Continuous Environments (VLN-CE) requires agents to navigate 3D environments based on visual observations and natural language instructions. Existing ...