Publications

Preprints

  1. How (Mis)calibrated is Your Federated CLIP and What To Do About It?
    Mainak Singha, Masih Aminbeidokhti, Paolo Casari, Gianni Franchi, Elisa Ricci, Subhankar Roy
    arXiv Preprint, 2025.
    [ arXiv ]   [ Code ]   [ BibTeX ]

Conferences

  1. CLIPoint3D: Language-Grounded Few-Shot Unsupervised 3D Point Cloud Domain Adaptation
    Mainak Singha, Sarthak Mehrotra, Paolo Casari, Subhasis Chaudhuri, Elisa Ricci, Biplab Banerjee
    Computer Vision and Pattern Recognition (CVPR), 2026.
    [ arXiv ]   [ Code ]   [ BibTeX ]

  2. BioVLM: Routing Prompts, Not Parameters, for Cross-Modality Generalization in Biomedical VLMs
    Mainak Singha, Tanisha Gupta, Ankit Jha, Muhammad Haris Khan, Sayantani Ghosh, Biplab Banerjee
    Findings of the Association for Computational Linguistics (ACL Findings), 2026.
    [ arXiv ]   [ Code ]   [ BibTeX ]

  3. GeoMeld: Toward Semantically Grounded Foundation Models for Remote Sensing
    Maram Hassan, Aminur Hossain, Savitra Roy, Souparna Bhowmik, Ayush Patel, Mainak Singha, Subhasis Chaudhuri, Muhammad Haris Khan, Biplab Banerjee
    Computer Vision and Pattern Recognition (CVPR) Workshops, 2026.
    [ arXiv ]   [ Code ]   [ BibTeX ]

  4. Bi-Modal Textual Prompt Learning for Vision-Language Models in Remote Sensing
    Pankhi Kashyap, Mainak Singha, Biplab Banerjee
    IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2026.
    [ arXiv ]   [ Code ]   [ BibTeX ]

  5. FedMVP: Federated Multi-modal Visual Prompt Tuning for Vision-Language Models
    Mainak Singha, Subhankar Roy, Sarthak Mehrotra, Ankit Jha, Moloud Abdar, Biplab Banerjee, Elisa Ricci
    International Conference on Computer Vision (ICCV), 2025.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  6. OSLoPrompt: Bridging Low-Supervision Challenges and Open-Set Domain Generalization in CLIP
    Mohamad Hassan N C, Divyam Gupta, Mainak Singha, Sai Bhargav Rongali, Ankit Jha, Muhammad Haris Khan, Biplab Banerjee
    Computer Vision and Pattern Recognition (CVPR), 2025.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  7. SDHSI-Net: Learning Better Representations for Hyperspectral Images via Self-Distillation [Oral]
    Prachet Dev Singh, Shyamsundar Paramasivam, Sneha Barman, Mainak Singha, Ankit Jha, Girish Mishra, Biplab Banerjee
    IEEE India Geoscience and Remote Sensing Symposium (InGARSS), 2025.
    [ arXiv ]   [ Code ]   [ BibTeX ]

  8. Reconstruction Guided Few-shot Network For Remote Sensing Image Classification [Oral]
    Mohit Jaiswal, Naman Jain, Shivani Pathak, Mainak Singha, Nikunja Bihari Kar, Ankit Jha, Biplab Banerjee
    IEEE India Geoscience and Remote Sensing Symposium (InGARSS), 2025.
    [ arXiv ]   [ Code ]   [ BibTeX ]

  9. MMLGNet: Cross-Modal Alignment of Remote Sensing Data using CLIP [Oral]
    Aditya Chaudhary, Sneha Barman, Mainak Singha, Ankit Jha, Girish Mishra, Biplab Banerjee
    IEEE India Geoscience and Remote Sensing Symposium (InGARSS), 2025.
    [ arXiv ]   [ Code ]   [ BibTeX ]

  10. Elevating All Zero-Shot Sketch-Based Image Retrieval Through Multimodal Prompt Learning
    Mainak Singha, Ankit Jha, Divyam Gupta, Pranav Singla, Biplab Banerjee
    European Conference on Computer Vision (ECCV), 2024.
    [ PDF ]   [ arXiv ]   [ Code ]   [ Project ]   [ HTML ]   [ BibTeX ]

  11. COSMo: CLIP Talks on Open-Set Multi-Target Domain Adaptation
    Munish Monga, Sachin Kumar Giroh, Ankit Jha, Mainak Singha, Biplab Banerjee, Jocelyn Chanussot
    British Machine Vision Conference (BMVC), 2024.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  12. Unknown Prompt, the only Lacuna: Unveiling CLIP’s Potential for Open Domain Generalization
    Mainak Singha, Ankit Jha, Shirsha Bose, Ashwin Nair, Moloud Abdar, Biplab Banerjee
    Computer Vision and Pattern Recognition (CVPR), 2024.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  13. CDAD-Net: Bridging Domain Gaps in Generalized Category Discovery
    Sai Bhargav Rongali, Sarthak Mehrotra, Ankit Jha, Mohamad Hassan N C, Shirsha Bose, Tanisha Gupta, Mainak Singha, Biplab Banerjee
    Computer Vision and Pattern Recognition (CVPR) Workshops, 2024.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  14. GraphVL: Graph-Enhanced Semantic Modeling via Vision-Language Models for Generalized Class Discovery
    Bhupendra Solanki, Ashwin R Nair, Mainak Singha, Souradeep Mukhopadhyay, Ankit Jha, Biplab Banerjee
    Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), 2024.
    [ PDF ]   [ arXiv ]   [ HTML ]   [ BibTeX ]

  15. StyLIP: Multi-Scale Style-Conditioned Prompt Learning for CLIP-based Domain Generalization
    Shirsha Bose, Ankit Jha, Enrico Fini, Mainak Singha, Biplab Banerjee, Elisa Ricci
    Winter Conference on Applications of Computer Vision (WACV), 2024.
    [ PDF ]   [ arXiv ]   [ HTML ]   [ BibTeX ]

  16. C-SAW: Self-Supervised Prompt Learning for Image Generalization in Remote Sensing [Best Paper Award]
    Avigyan Bhattacharya, Mainak Singha, Ankit Jha, Biplab Banerjee
    Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), 2023.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  17. GOPro: Generate and Optimize Prompts in CLIP using Self-Supervised Learning
    Mainak Singha, Ankit Jha, Biplab Banerjee
    British Machine Vision Conference (BMVC), 2023.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  18. AD-CLIP: Adapting Domains in Prompt Space Using CLIP
    Mainak Singha, Harsh Pal, Ankit Jha, Biplab Banerjee
    International Conference on Computer Vision (ICCV) Workshops, 2023.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  19. HAVE-Net: Hallucinated Audio-Visual Embeddings for Few-Shot Classification with Unimodal Cues [Best Paper Award]
    Ankit Jha, Debabrata Pal, Mainak Singha, Naman Agarwal, Biplab Banerjee
    European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) Workshops, 2023.
    [ PDF ]   [ arXiv ]   [ HTML ]   [ BibTeX ]

  20. APPLeNet: Visual Attention Parameterized Prompt Learning for Few-Shot Remote Sensing Image Generalization Using CLIP
    Mainak Singha, Ankit Jha, Bhupendra Solanki, Shirsha Bose, Biplab Banerjee
    Computer Vision and Pattern Recognition (CVPR) Workshops, 2023.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

Journals

  1. Meta-Learning to Teach Semantic Prompts for Open Domain Generalization in Vision-Language Models
    Shirsha Bose, Mainak Singha, Ankit Jha, Souradeep Mukhopadhyay, Biplab Banerjee
    Transactions on Machine Learning Research (TMLR), 2025.
    [ PDF ]   [ HTML ]   [ BibTeX ]

  2. RS3Lip: Consistency for Remote Sensing Image Classification on Part Embeddings using Self-Supervised Learning and CLIP
    Ankit Jha, Mainak Singha, Avigyan Bhattacharya, Biplab Banerjee
    Computer Vision and Image Understanding (CVIU), 2025.
    [ PDF ]   [ HTML ]   [ BibTeX ]

  3. Towards Molecular Structure Discovery from Cryo-ET Density Volumes via Modelling Auxiliary Semantic Prototypes
    Ashwin Nair, Xingjian Li, Bhupendra Solanki, Souradeep Mukhopadhyay, Ankit Jha, Mostofa Rafid Uddin, Mainak Singha, Biplab Banerjee, Min Xu
    Briefings in Bioinformatics, 2024.
    [ PDF ]   [ HTML ]   [ BibTeX ]