Announcement_17
Our paper BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset, got accepted at the NeurIPS Datasets and Benchmarks Track, 2022 and the BLOOM: A 176B-Parameter Open-Access Multilingual Language Model pre-print was also released.