Manan Dey

I am currently working as a Software Engineer at SAP Labs, Bangalore. Previously, I worked as a data-science intern at Impact Analytics, Bangalore, and as a summer intern at the Indian Institute of Technology, Guwahati (IITG) under the guidance of Dr. Pradip K. Das.

My research interests lie at the intersection of Machine Learning, Deep Learning, and Natural Language Processing. A list of my publications can be found here.

I also love contributing to open-source projects and have been an active contributor to projects such as the BigScience workshop and the BigCode project. A list of my open-source contributions can be found here.

updates

May 4, 2023	Our paper “🎅SantaCoder: don’t reach for the stars!” won the Best Paper Award at the Deep Learning for Code workshop (DL4C), ICLR 2023
Nov 30, 2022	Our paper “BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset” got accepted at the NeurIPS Datasets and Benchmarks Track, 2022, and the pre-print “BLOOM: A 176B-Parameter Open-Access Multilingual Language Model” was also released.
Oct 6, 2022	Our paper “How sensitive are translation systems to extra contexts? Mitigating gender bias in Neural Machine Translation models through relevant contexts” got accepted at the Findings of EMNLP, 2022
Aug 24, 2022	Our task “Indic Cause and Effect”, which measures a model’s ability to perform causal reasoning in three Indic languages (Bengali, Hindi, and Malayalam), got accepted at Google BIG-bench 🪑
May 27, 2022	Our paper “You reap what you sow: On the Challenges of Bias Evaluation Under Multilingual Settings” got accepted at the Proceedings of BigScience Episode #5 – Workshop on Challenges & Perspectives in Creating Large Language Models.
Jan 29, 2022	Our paper “Multitask Prompted Training Enables Zero-Shot Task Generalization” got accepted at ICLR 2022 as a Spotlight.
Jan 24, 2022	Our paper “PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts” got accepted at ACL, Demo Track, 2022
Dec 20, 2021	Our pre-print “Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP” was released as part of the BigScience workshop 🌸 Tokenization working group.


selected publications

ICLR

Multitask Prompted Training Enables Zero-Shot Task Generalization

Victor Sanh, Albert Webson, Colin Raffel, Stephen Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma, Eliza Szczechla, and 25 more authors

International Conference on Learning Representations (ICLR) (Spotlight), 2022


Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language model training (Radford et al., 2019). Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping general natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts using varying natural language. These prompted datasets allow for benchmarking the ability of a model to perform completely unseen tasks specified in natural language. We fine-tune a pretrained encoder-decoder model (Raffel et al., 2020; Lester et al., 2021) on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several datasets, often outperforming models 16× its size. Further, our model attains strong performance on a subset of tasks from the BIG-Bench benchmark, outperforming models 6× its size.