Publications

Preprints

  1. How (Mis)calibrated is Your Federated CLIP and What To Do About It?
    Mainak Singha, Masih Aminbeidokhti, Paolo Casari, Elisa Ricci, Subhankar Roy
    arXiv Preprint, 2025.
    [ arXiv ]   [ Code ]   [ BibTeX ]

Conferences

  1. FedMVP: Federated Multi-modal Visual Prompt Tuning for Vision-Language Models
    Mainak Singha, Subhankar Roy, Sarthak Mehrotra, Ankit Jha, Moloud Abdar, Biplab Banerjee, Elisa Ricci
    International Conference on Computer Vision (ICCV), 2025.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  2. OSLoPrompt: Bridging Low-Supervision Challenges and Open-Set Domain Generalization in CLIP
    Mohamad Hassan N C, Divyam Gupta, Mainak Singha, Sai Bhargav Rongali, Ankit Jha, Muhammad Haris Khan, Biplab Banerjee
    Computer Vision and Pattern Recognition Conference (CVPR), 2025.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  3. Elevating All Zero-Shot Sketch-Based Image Retrieval Through Multimodal Prompt Learning
    Mainak Singha, Ankit Jha, Divyam Gupta, Pranav Singla, Biplab Banerjee
    European Conference on Computer Vision (ECCV), 2024.
    [ PDF ]   [ arXiv ]   [ Code ]   [ Project ]   [ HTML ]   [ BibTeX ]

  4. COSMo: CLIP Talks on Open-Set Multi-Target Domain Adaptation
    Munish Monga, Sachin Kumar Giroh, Ankit Jha, Mainak Singha, Biplab Banerjee, Jocelyn Chanussot
    British Machine Vision Conference (BMVC), 2024.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  5. Unknown Prompt, the only Lacuna: Unveiling CLIP’s Potential for Open Domain Generalization
    Mainak Singha, Ankit Jha, Shirsha Bose, Ashwin Nair, Moloud Abdar, Biplab Banerjee
    Computer Vision and Pattern Recognition Conference (CVPR), 2024.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  6. CDAD-Net: Bridging Domain Gaps in Generalized Category Discovery
    Sai Bhargav Rongali, Sarthak Mehrotra, Ankit Jha, Mohamad Hassan N C, Shirsha Bose, Tanisha Gupta, Mainak Singha, Biplab Banerjee
    Computer Vision and Pattern Recognition Conference (CVPR) Workshops, 2024.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  7. GraphVL: Graph-Enhanced Semantic Modeling via Vision-Language Models for Generalized Class Discovery
    Bhupendra Solanki, Ashwin R Nair, Mainak Singha, Souradeep Mukhopadhyay, Ankit Jha, Biplab Banerjee
    Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), 2024.
    [ PDF ]   [ arXiv ]   [ HTML ]   [ BibTeX ]

  8. StyLIP: Multi-Scale Style-Conditioned Prompt Learning for CLIP-based Domain Generalization
    Shirsha Bose, Ankit Jha, Enrico Fini, Mainak Singha, Biplab Banerjee, Elisa Ricci
    Winter Conference on Applications of Computer Vision (WACV), 2024.
    [ PDF ]   [ arXiv ]   [ HTML ]   [ BibTeX ]

  9. C-SAW: Self-Supervised Prompt Learning for Image Generalization in Remote Sensing [Best Paper Award]
    Avigyan Bhattacharya, Mainak Singha, Ankit Jha, Biplab Banerjee
    Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), 2023.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  10. GOPro: Generate and Optimize Prompts in CLIP using Self-Supervised Learning
    Mainak Singha, Ankit Jha, Biplab Banerjee
    British Machine Vision Conference (BMVC), 2023.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  11. AD-CLIP: Adapting Domains in Prompt Space Using CLIP
    Mainak Singha, Harsh Pal, Ankit Jha, Biplab Banerjee
    International Conference on Computer Vision (ICCV) Workshops, 2023.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

  12. HAVE-Net: Hallucinated Audio-Visual Embeddings for Few-Shot Classification with Unimodal Cues [Best Paper Award]
    Ankit Jha, Debabrata Pal, Mainak Singha, Naman Agarwal, Biplab Banerjee
    European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) Workshops, 2023.
    [ PDF ]   [ arXiv ]   [ HTML ]   [ BibTeX ]

  13. APPLeNet: Visual Attention Parameterized Prompt Learning for Few-Shot Remote Sensing Image Generalization Using CLIP
    Mainak Singha, Ankit Jha, Bhupendra Solanki, Shirsha Bose, Biplab Banerjee
    Computer Vision and Pattern Recognition Conference (CVPR) Workshops, 2023.
    [ PDF ]   [ arXiv ]   [ Code ]   [ HTML ]   [ BibTeX ]

Journals

  1. Meta-Learning to Teach Semantic Prompts for Open Domain Generalization in Vision-Language Models
    Shirsha Bose, Mainak Singha, Ankit Jha, Souradeep Mukhopadhyay, Biplab Banerjee
    Transactions on Machine Learning Research (TMLR), 2025.
    [ PDF ]   [ HTML ]   [ BibTeX ]

  2. RS3Lip: Consistency for Remote Sensing Image Classification on Part Embeddings using Self-Supervised Learning and CLIP
    Ankit Jha, Mainak Singha, Avigyan Bhattacharya, Biplab Banerjee
    Computer Vision and Image Understanding (CVIU), 2025.
    [ PDF ]   [ HTML ]   [ BibTeX ]

  3. Towards Molecular Structure Discovery from Cryo-ET Density Volumes via Modelling Auxiliary Semantic Prototypes
    Ashwin Nair, Xingjian Li, Bhupendra Solanki, Souradeep Mukhopadhyay, Ankit Jha, Mostofa Rafid Uddin, Mainak Singha, Biplab Banerjee, Min Xu
    Briefings in Bioinformatics, 2024.
    [ PDF ]   [ HTML ]   [ BibTeX ]