BigScience Large Open-science Open-access Multilingual Language Model (BLOOM) was the world's largest open multilingual language model. It was the main outcome of the BigScience collaborative initiative, which was led by Hugging Face and involved several hundred researchers and engineers from France and abroad, representing both academia and the private sector. It is no longer the largest multilingual model, but it played a pivotal role in the open-science and multilingual LLM space.

BLOOM is a 176-billion-parameter Transformer-based autoregressive large language model. It was trained from March to July 2022 on approximately 366 billion tokens (1.6 TB). The model, the code base, and the data used to train it are distributed under free licenses.
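
To put these figures in proportion, the short sketch below derives a few ratios from them. It is a back-of-the-envelope calculation, not anything published by the project: the function names are hypothetical, the terabyte is assumed to be decimal (10^12 bytes), and the memory estimate assumes 16-bit weights and counts only the weights themselves.

```python
# Figures quoted in the text above.
N_PARAMS = 176e9        # model parameters
N_TOKENS = 366e9        # training tokens (approximate)
CORPUS_BYTES = 1.6e12   # 1.6 TB, assuming decimal terabytes


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


def bytes_per_token(corpus_bytes: float, tokens: float) -> float:
    """Average number of corpus bytes represented by one token."""
    return corpus_bytes / tokens


def weight_memory_gb(params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone, in decimal gigabytes.

    bytes_per_param=2 is an assumption (16-bit weights); activations,
    optimizer state, and caches are not included.
    """
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"tokens per parameter: {tokens_per_parameter(N_TOKENS, N_PARAMS):.2f}")
    print(f"bytes per token:      {bytes_per_token(CORPUS_BYTES, N_TOKENS):.2f}")
    print(f"weights at 16 bits:   {weight_memory_gb(N_PARAMS):.0f} GB")
```

Under these assumptions the model saw roughly two training tokens per parameter, and holding the weights alone takes on the order of 350 GB, which is why running the full model requires several accelerators.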