Philip Abao
published · 2026·05·25

Project Pidgen: v5 to v7

Three versions of Project Pidgen.

Project Pidgen is an ambitious effort to improve the model’s “thought”: its ability to perceive the “future” of its present action. The model has improved significantly over its previous version.

Overview

Metric          PidgenV5  PidgenV6  PidgenV7
Params          9.34B     7.31B     865M
Tokens seen     8B        12B       5B
GPU memory      34.4 GB   27.2 GB   6.7 GB
Val perplexity  9.91      8.17      8.94

Benchmarks

These five are the benchmarks all three versions report.

Figure 1. Accuracy (%) on the five shared benchmarks (PIQA, ARC-Easy, Winogrande, HellaSwag, ARC-Challenge) for PidgenV5 · 9.34B, PidgenV6 · 7.31B, and PidgenV7 · 865M.

Full results below. Dashes mark benchmarks a given version did not report.

Benchmark       PidgenV5  PidgenV6  PidgenV7
PIQA            69.1      70.7      64.0
ARC-Easy        54.9      57.8      48.0
Winogrande      51.4      53.4      56.5
HellaSwag       41.6      46.9      40.0
ARC-Challenge   32.8      34.4      29.5
MMLU (5-shot): 26.0, 27.3
SciQ: 74.5
OpenBookQA: 30.5
LAMBADA: 22.5
Val perplexity  9.91      8.17      8.94

Reading the results

These are early checkpoints. The numbers are best read against chance accuracy and the token budget, not against finished models. All three versions were trained on 5 to 12 billion tokens, far short of what models of these sizes would normally train on.
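As a rough aid to reading the numbers this way, the sketch below subtracts an assumed chance level from each shared-benchmark accuracy and computes tokens seen per parameter from the overview table. The chance levels are assumptions, not figures from this post: 50% for the two-option tasks (PIQA, Winogrande) and roughly 25% for the four-option ones (ARC questions occasionally have three or five options). The function names are illustrative.

```python
# Accuracy (%) copied from the results table, ordered (PidgenV5, PidgenV6, PidgenV7).
ACCURACY = {
    "PIQA": (69.1, 70.7, 64.0),
    "ARC-Easy": (54.9, 57.8, 48.0),
    "Winogrande": (51.4, 53.4, 56.5),
    "HellaSwag": (41.6, 46.9, 40.0),
    "ARC-Challenge": (32.8, 34.4, 29.5),
}

# ASSUMED chance accuracy (%): 2 options -> 50, 4 options -> 25.
CHANCE = {
    "PIQA": 50.0,
    "ARC-Easy": 25.0,
    "Winogrande": 50.0,
    "HellaSwag": 25.0,
    "ARC-Challenge": 25.0,
}

# Parameter counts and tokens seen, copied from the overview table.
PARAMS = {"PidgenV5": 9.34e9, "PidgenV6": 7.31e9, "PidgenV7": 865e6}
TOKENS = {"PidgenV5": 8e9, "PidgenV6": 12e9, "PidgenV7": 5e9}


def margin_over_chance():
    """Percentage points above the assumed chance level, per benchmark and version."""
    return {
        name: tuple(round(acc - CHANCE[name], 1) for acc in accs)
        for name, accs in ACCURACY.items()
    }


def tokens_per_param():
    """Training tokens seen per model parameter, per version."""
    return {v: round(TOKENS[v] / PARAMS[v], 2) for v in PARAMS}


if __name__ == "__main__":
    for name, margins in margin_over_chance().items():
        print(f"{name:14s} {margins}")
    print(tokens_per_param())
```

Under these assumptions, Winogrande sits 1.4 to 6.5 points above chance across the three versions, and tokens per parameter come out to about 0.86, 1.64, and 5.78. For comparison, a commonly cited rule of thumb for compute-optimal training is on the order of 20 tokens per parameter.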

Long-context retrieval

PidgenV5 · 9.34B: 0/10
PidgenV6 · 7.31B: 1/10
PidgenV7 · 865M: 10/10
Figure 2. Needle-in-haystack pass rate, 10 trials per version, context ~2k–5k tokens.
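The post does not describe the test harness, so the following is only a minimal sketch of how a needle-in-haystack pass rate like this could be measured. Everything in it is an assumption for illustration: the filler text, the "magic number" needle, the substring-match scoring, the use of word counts as a crude stand-in for tokens, and the hypothetical `generate` callable that wraps whichever model is being tested.

```python
import random
from typing import Callable

# ASSUMED filler sentence; any unrelated text would do.
FILLER = "The sky was clear and the market was quiet that morning."


def build_prompt(rng: random.Random, n_words: int) -> tuple[str, str]:
    """Return (prompt, secret) with one needle sentence hidden in filler text."""
    secret = str(rng.randint(100000, 999999))
    needle = f"The magic number is {secret}."
    n_sentences = max(1, n_words // len(FILLER.split()))
    sentences = [FILLER] * n_sentences
    # Place the needle at a random depth, from the very start to the very end.
    sentences.insert(rng.randint(0, n_sentences), needle)
    prompt = " ".join(sentences) + "\nWhat is the magic number?"
    return prompt, secret


def pass_rate(generate: Callable[[str], str], trials: int = 10, seed: int = 0) -> str:
    """Run `trials` retrieval prompts and report passes as 'k/n'."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(trials):
        # Word count stands in for the ~2k-5k token context length.
        n_words = rng.randint(2000, 5000)
        prompt, secret = build_prompt(rng, n_words)
        # A trial passes if the secret appears anywhere in the model's reply.
        if secret in generate(prompt):
            passed += 1
    return f"{passed}/{trials}"
```

Calling `pass_rate(my_model_fn)` with a function that maps a prompt string to the model's reply returns a string in the same "k/10" form as the figure. Substring matching is lenient; a stricter harness might require an exact-match answer.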

Inspired by

  1. Geoffrey Hinton · deep learning
  2. Richard Sutton · reinforcement learning
  3. Yann LeCun · convolutional networks
  4. Jürgen Schmidhuber · recurrent networks
  5. Yoshua Bengio · neural language models