Date of Publication: 2024-12-21
Journal: IEEE Transactions on Circuits and Systems for Video Technology
ISSN No.: 1051-8215
Key Words: BLIP; image-to-text generation; multi-modal understanding; vision-language tracking