miscellaneous

a list of my miscellaneous activities.

updates

May 4, 2023 Our paper “🎅SantaCoder: don’t reach for the stars!” won the Best Paper Award at the Deep Learning for Code workshop (DL4C), ICLR 2023
Nov 30, 2022 Our paper BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset got accepted at the NeurIPS 2022 Datasets and Benchmarks Track, and the pre-print BLOOM: A 176B-Parameter Open-Access Multilingual Language Model was also released.
Oct 6, 2022 Our paper “How sensitive are translation systems to extra contexts? Mitigating gender bias in Neural Machine Translation models through relevant contexts” got accepted at the Findings of EMNLP 2022.
Aug 24, 2022 Our task “Indic Cause and Effect”, which measures a model’s ability to perform causal reasoning in three Indic languages (Bengali, Hindi and Malayalam), got accepted at Google BIG-bench 🪑
May 27, 2022 Our paper You reap what you sow: On the Challenges of Bias Evaluation Under Multilingual Settings got accepted to the Proceedings of BigScience Episode #5 – Workshop on Challenges & Perspectives in Creating Large Language Models.
Jan 29, 2022 Our paper Multitask Prompted Training Enables Zero-Shot Task Generalization got accepted at ICLR 2022 as a Spotlight.
Jan 24, 2022 Our paper PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts got accepted at the ACL 2022 Demo Track.
Dec 20, 2021 Our pre-print Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP was released as part of the BigScience workshop 🌸 Tokenization working group.
Jul 30, 2021 Our team (Sentence-Transformers) was one of the special nominees at the Flax Community Event organized by Hugging Face 🤗 and Google.
Mar 22, 2021 Participated in the HuggingFace 🤗 XLSR Fine-Tuning Sprint and submitted models fine-tuned on the Common Voice dataset, achieving the best WER score on the test sets of multiple languages (Tamil, Irish and Punjabi).
Dec 11, 2020 Our paper “Evaluating Gender Bias in Natural Language Inference” got accepted at the Workshop on Dataset Curation and Security - NeurIPS 2020
Nov 30, 2020 Participated in the HuggingFace 🤗 Dataset sprint and was one of the core contributors. Added multiple datasets to the Datasets Library (XQuAD-R, msr_genomics_kbcomp, hippocorpus).
Dec 14, 2019 Received the XPRIZE AI for Good Travel Grant as well as a travel grant from NeurIPS for presenting our paper at the AI for Social Good workshop at the conference in 2019.
Dec 14, 2019 Our paper “Assessing Viewer’s Mental Health by Detecting Depression in YouTube Videos” got accepted at the AI for Social Good Workshop at NeurIPS 2019.
Oct 5, 2019 Our paper and abstract got selected for poster presentation at the 3rd International Workshop on Mining Actionable Insights from Social Networks, 2019 and the Montreal AI Symposium 2019, respectively.

activities