I am a Ph.D. student in Computer Science at IIIT Hyderabad, India, and a member of the CVIT lab, jointly advised by C. V. Jawahar (IIIT Hyderabad) and Karteek Alahari (THOTH group, Inria). My research focuses on developing video understanding models that generalize effectively to previously unseen domains.

In 2019, I spent three wonderful months as a visiting researcher at the THOTH group, Inria. Before starting my Ph.D., I was a Junior Research Fellow (JRF) at IIT KGP. Earlier, I was a Research Assistant at ISI, Kolkata.

My Ph.D. is funded by the Google India Ph.D. Fellowship, 2017 (one of four recipients). Thank you, Google!

Recent News
  • [2022/12/10] Ego-Exo4D accepted at CVPR, 2024
  • [2023/11/30] Ego-Exo4D has been launched! [arxiv] [blog] [website]
  • [2022/12/10] Our paper CleanAdapt won the best paper runner-up award at ICVGIP, 2022.
  • [2022/10/23] One paper accepted at ICVGIP, 2022.
  • [2020/10/12] One paper accepted at ICPR, 2020.

Selected Publications

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives   
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, ..., Avijit Dasgupta* et al.
CVPR 2024 (Oral)

TL;DR: Ego-Exo4D is a diverse, large-scale multimodal multiview video dataset capturing skilled human activities.
pdf / webpage / bibtex / blog

Overcoming Label Noise for Source-free Unsupervised Video Domain Adaptation   
Avijit Dasgupta, C. V. Jawahar, Karteek Alahari
ICVGIP 2022 (Best Paper Runner-up)

TL;DR: CleanAdapt proposes a self-training source-free video domain adaptation method, generating pseudo-labels from a source pre-trained model to bridge domain gaps. Achieving a 7% gain over source-only models, CleanAdapt outperforms state-of-the-art approaches in video adaptation on various datasets.
pdf / extended version / webpage / bibtex

Context Aware Group Activity Recognition   
Avijit Dasgupta, C. V. Jawahar, Karteek Alahari
ICPR, 2020

We show the efficacy of using contextual information, such as scene labels and human keypoints, for group activity recognition.

A Fully Convolutional Neural Network based Structured Prediction Approach Towards the Retinal Vessel Segmentation   
Avijit Dasgupta*, Sonam Singh* (*equal contribution)
ISBI, 2017


