sailor2 - Ollama

![logo](/assets/mchiang0610/sailor2/a76a9182-cc11-47e1-bb50-478ad4ccb157)

Sailor2 is a community-driven initiative that brings cutting-edge multilingual language models to South-East Asia (SEA). Our research highlights a strong demand for models in the **8B and 20B** parameter range for production use, alongside **1B models** for specialized applications, such as speculative decoding and research purposes. These models, released under the **Apache 2.0 license**, provide enhanced accessibility to advanced language technologies across the region.

Sailor2 builds on the multilingual model Qwen2.5 and is continually pre-trained on 500B tokens to better support 15 languages with a single unified model. These languages are English, Chinese, Burmese, Cebuano, Ilocano, Indonesian, Javanese, Khmer, Lao, Malay, Sundanese, Tagalog, Thai, Vietnamese, and Waray. By addressing the growing demand for diverse, robust, and accessible language models, Sailor2 seeks to serve underserved communities in SEA with open, inclusive, and accessible multilingual LLMs. The Sailor2 models come in three sizes, 1B, 8B, and 20B, which are expanded from the Qwen2.5 base models of 0.5B, 7B, and 14B, respectively.
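As a rough illustration of how one of the three sizes might be queried through a local Ollama server, here is a minimal Python sketch. It assumes the server listens on the default `http://localhost:11434` and that the models are published under the tags `sailor2:1b`, `sailor2:8b`, and `sailor2:20b`; the tag names are an assumption and should be checked against the model's tag list. The helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumed tag names for the three sizes; verify against the published tags.
SAILOR2_TAGS = {"1b": "sailor2:1b", "8b": "sailor2:8b", "20b": "sailor2:20b"}


def build_generate_request(size, prompt, host="http://localhost:11434"):
    """Build (but do not send) a POST request for Ollama's /api/generate."""
    if size not in SAILOR2_TAGS:
        raise ValueError(
            f"unknown size {size!r}; expected one of {sorted(SAILOR2_TAGS)}"
        )
    # stream=False asks the server for a single JSON object instead of chunks.
    payload = {"model": SAILOR2_TAGS[size], "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(size, prompt):
    """Send the request and return the generated text.

    Requires a running Ollama server with the chosen model already pulled.
    """
    with urllib.request.urlopen(build_generate_request(size, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With a server running, a call such as `generate("8b", "Terjemahkan ke bahasa Inggris: Selamat pagi, apa kabar?")` would return the model's reply as a string. Building the request separately from sending it keeps the size-to-tag mapping easy to inspect without network access.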

Sailor2 is a multilingual language model designed for South-East Asia, available in 1B, 8B, and 20B parameter sizes.
