miscellaneous
a list of my miscellaneous activities.
updates
May 4, 2023 | Our paper “🎅SantaCoder: don’t reach for the stars!” won the Best Paper Award at the Deep Learning for Code workshop (DL4C), ICLR 2023 |
---|---|
Nov 30, 2022 | Our paper “BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset” got accepted at the NeurIPS Datasets and Benchmarks Track, 2022, and the pre-print “BLOOM: A 176B-Parameter Open-Access Multilingual Language Model” was also released. |
Oct 6, 2022 | Our paper “How sensitive are translation systems to extra contexts? Mitigating gender bias in Neural Machine Translation models through relevant contexts” got accepted at the Findings of EMNLP, 2022 |
Aug 24, 2022 | Our task “Indic Cause and Effect”, which measures a model’s ability to perform causal reasoning in 3 different Indic languages (Bengali, Hindi and Malayalam), got accepted at Google BIG-bench 🪑 |
May 27, 2022 | Our paper “You reap what you sow: On the Challenges of Bias Evaluation Under Multilingual Settings” got accepted at the Proceedings of BigScience Episode #5 – Workshop on Challenges & Perspectives in Creating Large Language Models. |
Jan 29, 2022 | Our paper “Multitask Prompted Training Enables Zero-Shot Task Generalization” got accepted at ICLR 2022 as a Spotlight. |
Jan 24, 2022 | Our paper “PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts” got accepted at the ACL 2022 Demo Track. |
Dec 20, 2021 | Our pre-print “Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP” was released as part of the BigScience workshop 🌸 Tokenization working group. |
Jul 30, 2021 | Our team (Sentence-Transformers) was one of the special nominees at the Flax Community Event organized by Hugging Face 🤗 and Google. |
Mar 22, 2021 | Participated in the HuggingFace 🤗 XLSR Fine-Tuning Sprint and submitted models fine-tuned on the Common Voice dataset, achieving the best WER score on the test sets of multiple languages (Tamil, Irish and Punjabi) |
Dec 11, 2020 | Our paper “Evaluating Gender Bias in Natural Language Inference” got accepted at the Workshop on Dataset Curation and Security - NeurIPS 2020 |
Nov 30, 2020 | Participated in the HuggingFace 🤗 Dataset sprint and was one of the core contributors. Added multiple datasets to the Datasets Library (XQuAD-R, msr_genomics_kbcomp, hippocorpus). |
Dec 14, 2019 | Received the XPRIZE AI for Good Travel Grant as well as a Travel Grant from NeurIPS for presenting our paper at the AI for Social Good workshop at the conference in 2019. |
Dec 14, 2019 | Our paper “Assessing Viewer’s Mental Health by Detecting Depression in YouTube Videos” got accepted at the AI for Social Good Workshop at NeurIPS 2019. |
Oct 5, 2019 | Our paper got selected for poster presentation at the 3rd International Workshop on Mining Actionable Insights from Social Networks, 2019, and our abstract got selected for poster presentation at the Montreal AI Symposium 2019. |