Tanay Dixit
I’m a senior undergraduate in the Electrical Engineering department at IIT Madras. My research interests lie mainly in applications of Deep Learning, particularly in Natural Language Processing. I am interested in studying the vulnerabilities of systems and in quantifying and analyzing the robustness of models.
During my junior year, I interned remotely at the University of Washington’s H2lab, where I worked with Bhargavi Paranjape and Prof. Hannaneh Hajishirzi on topics related to counterfactual data augmentation in NLP. I spent the summer of 2022 interning at the USC Information Sciences Institute, where I was part of the LUKA Lab and worked with Prof. Muhao Chen, Fei Wang, and Ehsan Qasemi on improving the factuality of abstractive summarization systems. Currently, as part of my B.Tech thesis, I’m working with Prof. Mitesh Khapra on developing and analyzing systems for low-resource Indian languages.
Aside from my research, I was the head of the Analytics Club at IIT Madras for ’21-22; it’s one of the most active and sought-after groups in the University, working to address real-world problems using Machine Learning. For further info, visit our GitHub organization. I continue to contribute as a member of the Student Council.
Previously, I had the opportunity to contribute to some early-stage startups, namely BlueBarrel Solutions and NikkOtto. I also interned at Subex, where I worked with Vision & Language models with the aim of automating the invoice verification process.
Publications
- CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation
Tanay Dixit, Bhargavi Paranjape, Hannaneh Hajishirzi, Luke Zettlemoyer
Proc. Empirical Methods in Natural Language Processing (Findings EMNLP), 2022
[Paper][code]
- Benchmarking generalization via in-context instructions on 1,600+ language tasks
Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Maitreya Patel, Kuntal Kumar Pal, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Shailaja Keyur Sampat, Savan Doshi, Siddhartha Mishra, Sujan Reddy, Sumanta Patro, Tanay Dixit, Xudong Shen, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi, Noah A Smith, Daniel Khashabi
Proc. Empirical Methods in Natural Language Processing (EMNLP), 2022
[Paper][code]
- Perturbation CheckLists for Evaluating NLG Evaluation Metrics
Ananya B. Sai, Tanay Dixit, Dev Sheth, Sreyas Mohan, Mitesh Khapra
Proc. Empirical Methods in Natural Language Processing (EMNLP), 2021
[Paper][code][website]
- NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole et al.
GEM Workshop, Association for Computational Linguistics (ACL), 2021
[Paper][code]
Preprints
- IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation metrics for Indian Languages
Ananya B. Sai, Vignesh Nagarajan, Tanay Dixit, Raj Dabre, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra
[pre-print]
Some of my favourite projects
Multilingual Text Classification
Training robust models on sparse multilingual data to improve out-of-distribution robustness
Style Transfer
Making use of some recent generative modelling techniques to understand how Snapchat filters work
Document Layout Understanding
Extending BERT and Transformers to images for self-supervised learning, which truly shows the power of these modelling techniques.
Miscellaneous
- One of my favorite quotes: “I wish there was a way to know you’re in the good old days before you’ve actually left them.” ~Andy Bernard (The Office!)
- I like to play tennis in my free time; Nadal is officially the 🐐!
- I love trying out different cuisines and dishes. I’m on a quest to dine at all the best restaurants that exist, so feel free to drop in some recommendations 😁