Tanay Dixit

I’m a senior undergraduate in the Electrical Engineering department at IIT Madras. My research interests lie mainly in applications of Deep Learning, particularly in Natural Language Processing. I am interested in studying the vulnerabilities of systems and in quantifying and analyzing the robustness of models.
During my junior year, I interned remotely at the University of Washington’s H2lab, where I worked with Bhargavi Paranjape and Prof. Hannaneh Hajishirzi on counterfactual data augmentation in NLP. I spent the summer of 2022 interning at the USC Information Sciences Institute as part of the LUKA Lab, where I worked with Prof. Muhao Chen, Fei Wang, and Ehsan Qasemi on improving the factuality of abstractive summarization systems. Currently, as part of my B.Tech thesis, I’m working with Prof. Mitesh Khapra on developing and analyzing systems for low-resource Indian languages.

Aside from my research, I was the head of the Analytics Club at IIT Madras for 2021-22; it is one of the most active and sought-after groups in the university, applying Machine Learning to real-world problems. For further info, visit our GitHub organization. I continue to contribute as a member of the Student Council.

Previously, I had the opportunity to contribute substantially to two early-stage startups, BlueBarrel Solutions and NikkOtto. I also interned at Subex, where I worked with Vision & Language models with the aim of automating the invoice verification process.


  • CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation
    Tanay Dixit, Bhargavi Paranjape, Hannaneh Hajishirzi, Luke Zettlemoyer
    Proc. Empirical Methods in Natural Language Processing (Findings EMNLP), 2022

  • Benchmarking generalization via in-context instructions on 1,600+ language tasks
    Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Maitreya Patel, Kuntal Kumar Pal, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Shailaja Keyur Sampat, Savan Doshi, Siddhartha Mishra, Sujan Reddy, Sumanta Patro, Tanay Dixit, Xudong Shen, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi, Noah A Smith, Daniel Khashabi
    Proc. Empirical Methods in Natural Language Processing (EMNLP), 2022

  • Perturbation CheckLists for Evaluating NLG Evaluation Metrics
    Ananya B. Sai, Tanay Dixit, Dev Sheth, Sreyas Mohan, Mitesh Khapra
    Proc. Empirical Methods in Natural Language Processing (EMNLP), 2021

  • NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
    Kaustubh D. Dhole et al.
    GEM Workshop, Association for Computational Linguistics (ACL), 2021


  • IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation metrics for Indian Languages
    Ananya B. Sai, Vignesh Nagarajan, Tanay Dixit, Raj Dabre, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

Some of my favourite projects

Multilingual Text Classification
Training robust models on sparse multilingual data to improve out-of-distribution robustness

Style Transfer
Making use of some recent generative modelling techniques to understand how Snapchat filters work

Document Layout Understanding
Extending BERT and Transformers to images for self-supervised learning, which truly shows the power of these modelling techniques.


  • One of my favorite quotes: “I wish there was a way to know you’re in the good old days before you’ve actually left them.” ~Andy Bernard (The Office!)
  • I like to play tennis in my free time. Nadal is officially the 🐐!
  • I love trying out different cuisines and dishes, and I’m on a quest to dine at all the best restaurants that exist. Feel free to drop in some recommendations 😁