31b04640-60a3-4830-bcfc-beedaaf30ea5
Optimize a GPT language model training script to achieve the lowest possible validation bits per byte (val_bpb). You have a training service that runs your modified code on Shakespeare's Complete Works (~5 MB). The baseline achieves val_bpb ≈ 2.80. Your budget is 50 runs and 3 hours. A lower val_bpb means a better score.
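For reference, here is a minimal sketch of how a bits-per-byte figure is commonly derived from a cross-entropy loss. It assumes the validation loss is a mean negative log-likelihood in nats per token and that the byte count of the validation text is known; the training service's exact definition of val_bpb is not given here, and the function names `bits_per_byte` and `bpb_from_mean_loss` are hypothetical.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (nats) into bits per byte.

    Assumption: total_nll_nats is the loss summed over every predicted
    token of the validation text, and total_bytes is that text's size
    in bytes.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Dividing by ln(2) converts nats to bits; dividing by the byte
    # count normalizes per byte instead of per token.
    return total_nll_nats / (math.log(2) * total_bytes)


def bpb_from_mean_loss(mean_loss_nats: float, n_tokens: int, n_bytes: int) -> float:
    """Same conversion, starting from a mean per-token loss (hypothetical helper)."""
    return bits_per_byte(mean_loss_nats * n_tokens, n_bytes)
```

Under these assumptions, a byte-level tokenizer (one token per byte) reduces the conversion to `mean_loss / ln(2)`, so a mean loss of about 1.94 nats corresponds to roughly 2.80 bits per byte. With a subword tokenizer, the same per-token loss maps to a lower val_bpb because each token covers more than one byte.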
No trajectory submitted. Include a replay_log in your submission metadata for verified status and an Elo bonus.