gemma3:1b - Ollama framework

![Google Gemma 3 logo](/assets/library/gemma3/b54bf767-f9c5-4284-b551-a49aebe3a3c2)

> This model requires Ollama 0.6 or later. [Download Ollama](https://ollama.ac.cn/download)

Gemma is a lightweight family of models from Google built on Gemini technology. The 4B, 12B, and 27B Gemma 3 models are multimodal, processing both text and images, and feature a 128K context window; the 1B model is text-only with a 32K context window. The Gemma 3 models support over 140 languages. Available in 1B, 4B, 12B, and 27B parameter sizes, they excel at tasks like question answering, summarization, and reasoning, and their compact design allows deployment on resource-limited devices.

## Models

### Text

**1B parameter model** (32k context window)

```
ollama run gemma3:1b
```
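
Besides the interactive CLI, a running Ollama server also accepts requests over a local REST API. The following is a minimal sketch, assuming the server listens on its default address `http://localhost:11434` and that `gemma3:1b` has already been pulled. The helper names `build_request` and `generate` are illustrative and not part of Ollama. The `num_ctx` option requests a context length for the call; the value 32768 is an assumption about what "32k" means in tokens, and Ollama may otherwise use a smaller default.

```python
import json
import urllib.request

# Assumed default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="gemma3:1b", num_ctx=32768):
    """Build the JSON payload for a single non-streaming completion."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of streamed chunks
        "options": {"num_ctx": num_ctx},  # assumed token count for "32k"
    }


def generate(prompt, **kwargs):
    """Send the prompt to a running Ollama server and return the generated text."""
    data = json.dumps(build_request(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Summarize what a context window is in one sentence.")` should return the model's reply as a string. Keeping payload construction separate from the network call makes the request easy to inspect without a server.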

### Multimodal (Vision)

**4B parameter model** (128k context window)

```
ollama run gemma3:4b
```

**12B parameter model** (128k context window)

```
ollama run gemma3:12b
```

**27B parameter model** (128k context window)

```
ollama run gemma3:27b
```
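
For the vision-capable sizes, an image can be attached to a request through the same local REST API by passing base64-encoded image data in an `images` list. This is a minimal sketch, assuming a server at the default `http://localhost:11434` and that `gemma3:4b` has been pulled. The helper names `build_vision_request` and `describe_image` are hypothetical.

```python
import base64
import json
import urllib.request

# Assumed default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_vision_request(prompt, image_bytes, model="gemma3:4b"):
    """Build a payload that attaches one image to the prompt."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "prompt": prompt,
        "images": [encoded],  # base64-encoded image data, one entry per image
        "stream": False,
    }


def describe_image(path, prompt="Describe this image."):
    """Read an image file and ask the model about it (needs a running server)."""
    with open(path, "rb") as f:
        payload = build_vision_request(prompt, f.read())
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The same payload shape should work for `gemma3:12b` and `gemma3:27b` by changing the `model` argument.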

## Evaluation

![Chatbot Arena ELO Score](/assets/library/gemma3/89dc5a19-179e-4dd3-8e5d-12ad54973148)

### Benchmark Results

These models were evaluated against a large collection of different datasets and
metrics to cover different aspects of text generation:

#### Reasoning, logic and code capabilities

| Benchmark                      | Metric         | Gemma 3 PT 1B  | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B |
| ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:|
| [HellaSwag][hellaswag]         | 10-shot        |      62.3      |     77.2      |      84.2      |      85.6      |
| [BoolQ][boolq]                 | 0-shot         |      63.2      |     72.3      |      78.8      |      82.4      |
| [PIQA][piqa]                   | 0-shot         |      73.8      |     79.6      |      81.8      |      83.3      |
| [SocialIQA][socialiqa]         | 0-shot         |      48.9      |     51.9      |      53.4      |      54.9      |
| [TriviaQA][triviaqa]           | 5-shot         |      39.8      |     65.8      |      78.2      |      85.5      |
| [Natural Questions][naturalq]  | 5-shot         |      9.48      |     20.0      |      31.4      |      36.1      |
| [ARC-c][arc]                   | 25-shot        |      38.4      |     56.2      |      68.9      |      70.6      |
| [ARC-e][arc]                   | 0-shot         |      73.0      |     82.4      |      88.3      |      89.0      |
| [WinoGrande][winogrande]       | 5-shot         |      58.2      |     64.7      |      74.3      |      78.8      |
| [BIG-Bench Hard][bbh]          |                |      28.4      |     50.9      |      72.6      |      77.7      |
| [DROP][drop]                   | 3-shot, F1     |      42.4      |     60.1      |      72.2      |      77.2      |
| [AGIEval][agieval]             | 3-5-shot       |      22.2      |     42.1      |      57.4      |      66.2      |
| [MMLU][mmlu]                   | 5-shot, top-1  |      26.5      |     59.6      |      74.5      |      78.6      |
| [MATH][math]                   | 4-shot         |       --       |     24.2      |      43.3      |      50.0      |
| [GSM8K][gsm8k]                 | 5-shot, maj@1  |      1.36      |     38.4      |      71.0      |      82.6      |
| [GPQA][gpqa]                   |                |      9.38      |     15.0      |      25.4      |      24.3      |
| [MMLU][mmlu] (Pro)             | 5-shot         |      11.2      |     23.7      |      40.8      |      43.9      |
| [MBPP][mbpp]                   | 3-shot         |      9.80      |     46.0      |      60.4      |      65.6      |
| [HumanEval][humaneval]         | pass@1         |      6.10      |     36.0      |      45.7      |      48.8      |
| [MMLU][mmlu] (Pro COT)         | 5-shot         |      9.7       |      --       |       --       |       --       |

[hellaswag]: https://arxiv.org/abs/1905.07830
[boolq]: https://arxiv.org/abs/1905.10044
[piqa]: https://arxiv.org/abs/1911.11641
[socialiqa]: https://arxiv.org/abs/1904.09728
[triviaqa]: https://arxiv.org/abs/1705.03551
[naturalq]: https://github.com/google-research-datasets/natural-questions
[arc]: https://arxiv.org/abs/1911.01547
[winogrande]: https://arxiv.org/abs/1907.10641
[bbh]: https://paperswithcode.com/dataset/bbh
[drop]: https://arxiv.org/abs/1903.00161
[agieval]: https://arxiv.org/abs/2304.06364
[mmlu]: https://arxiv.org/abs/2009.03300
[math]: https://arxiv.org/abs/2103.03874
[gsm8k]: https://arxiv.org/abs/2110.14168
[gpqa]: https://arxiv.org/abs/2311.12022
[mbpp]: https://arxiv.org/abs/2108.07732
[humaneval]: https://arxiv.org/abs/2107.03374
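
To compare scores across model sizes programmatically, rows of the reasoning, logic and code table above can be parsed directly from the markdown. The sketch below is illustrative only: `parse_row` and `gain` are hypothetical helpers, and treating any non-numeric cell as a missing value is an assumption about how the table marks unavailable results.

```python
def parse_row(line):
    """Split one markdown table row into (benchmark, metric, scores)."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    name, metric, raw_scores = cells[0], cells[1], cells[2:]
    scores = []
    for cell in raw_scores:
        try:
            value = float(cell)
        except ValueError:
            value = None  # placeholder such as "--"
        # float("NaN") parses successfully, so detect it via value != value
        if value is not None and value != value:
            value = None
        scores.append(value)
    return name, metric, scores


def gain(scores, small=0, large=-1):
    """Score difference between two model sizes, or None if either is missing."""
    a, b = scores[small], scores[large]
    if a is None or b is None:
        return None
    return round(b - a, 2)


row = "| [GSM8K][gsm8k] | 5-shot, maj@1 | 1.36 | 38.4 | 71.0 | 82.6 |"
name, metric, scores = parse_row(row)
print(name, metric, gain(scores))  # difference between the 27B and 1B scores
```

Returning `None` for missing cells keeps rows such as MATH, which has no 1B score, from producing a misleading difference.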

#### Multilingual capabilities

| Benchmark                            | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B |
| ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:|
| [MGSM][mgsm]                         |      2.04     |      34.7     |      64.3      |      74.3      |
| [Global-MMLU-Lite][global-mmlu-lite] |      24.9     |      57.0     |      69.4      |      75.7      |
| [Belebele][belebele]                 |      26.6     |      59.4     |      78.0      |       --       |
| [WMT24++][wmt24pp] (ChrF)            |      36.7     |      48.4     |      53.9      |      55.7      |
| [FloRes][flores]                     |      29.5     |      39.2     |      46.0      |      48.8      |
| [XL-Sum][xlsum]                      |      4.82     |      8.55     |      12.2      |      14.9      |
| [XQuAD][xquad] (all)                 |      43.9     |      68.0     |      74.5      |      76.8      |

[mgsm]: https://arxiv.org/abs/2210.03057
[flores]: https://arxiv.org/abs/2106.03193
[belebele]: https://arxiv.org/abs/2308.16884
[xlsum]: https://arxiv.org/abs/2106.13822
[xquad]: https://arxiv.org/abs/1910.11856v3
[global-mmlu-lite]: https://hugging-face.cn/datasets/CohereForAI/Global-MMLU-Lite
[wmt24pp]: https://arxiv.org/abs/2502.12404v1

#### Multimodal capabilities

| Benchmark                      | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B |
| ------------------------------ |:-------------:|:--------------:|:--------------:|
| [COCOcap][coco-cap]            |      102      |      111      |      116      |
| [DocVQA][docvqa] (val)         |      72.8     |      82.3     |      85.6     |
| [InfoVQA][info-vqa] (val)      |      44.1     |      54.8     |      59.4     |
| [MMMU][mmmu] (pt)              |      39.2     |      50.3     |      56.1     |
| [TextVQA][textvqa] (val)       |      58.9     |      66.5     |      68.6     |
| [RealWorldQA][realworldqa]     |      45.5     |      52.2     |      53.9     |
| [ReMI][remi]                   |      27.3     |      38.5     |      44.8     |
| [AI2D][ai2d]                   |      63.2     |      75.2     |      79.0     |
| [ChartQA][chartqa]             |      45.4     |      60.9     |      63.8     |
| [ChartQA][chartqa] (augmented) |      81.8     |      88.5     |      88.7     |
| [VQAv2][vqav2]                 |       --      |       --      |       --      |
| [BLINK][blinkvqa]              |      38.0     |      35.9     |      39.6     |
| [OKVQA][okvqa]                 |      51.0     |      58.7     |      60.2     |
| [TallyQA][tallyqa]             |      42.5     |      51.8     |      54.3     |
| [SpatialSense VQA][ss-vqa]     |      50.9     |      60.0     |      59.4     |
| [CountBenchQA][countbenchqa]   |      26.1     |      17.8     |      68.0     |

[coco-cap]: https://cocodataset.org/#home
[docvqa]: https://www.docvqa.org/
[info-vqa]: https://arxiv.org/abs/2104.12756
[mmmu]: https://arxiv.org/abs/2311.16502
[textvqa]: https://textvqa.org/
[realworldqa]: https://paperswithcode.com/dataset/realworldqa
[remi]: https://arxiv.org/html/2406.09175v1
[ai2d]: https://allenai.org/data/diagrams
[chartqa]: https://arxiv.org/abs/2203.10244
[vqav2]: https://visualqa.org/index.html
[blinkvqa]: https://arxiv.org/abs/2404.12390
[okvqa]: https://okvqa.allenai.org/
[tallyqa]: https://arxiv.org/abs/1810.12440
[ss-vqa]: https://arxiv.org/abs/1908.02660
[countbenchqa]: https://github.com/google-research/big_vision/blob/main/big_vision/datasets/countbenchqa/
