Release Time: 2024-12-22
Indexed by: Journal Papers
Document Code: 471432
Date of Publication: 2025-05-26
Journal: IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Volume: 35
Issue: 5
Page Number: 4384-4396
ISSN: 1051-8215
Key Words: BLIP; image-to-text generation; multi-modal understanding; vision-language tracking