The study shows that large language models amplify political bias when repeatedly trained on their own synthetic data, a process distinct from model collapse that therefore requires its own targeted mitigations.
This paper examines how large language models change when they are iteratively trained on their own synthetic data. The authors focus on political bias and show that models like GPT-2 gradually lean further toward one side of the political spectrum, especially toward the right, as training cycles accumulate. They test several methods to control this bias but find that it persists even when model collapse is prevented. Their analysis also shows that bias amplification and model collapse arise from different mechanisms, so they need different solutions.
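To make the iterative setup concrete, below is a minimal sketch of one generate-then-retrain loop with a political-leaning probe run between cycles. This is not the authors' pipeline: the classifier ID (`some-org/political-leaning-classifier`), the prompts, the number of cycles, and all hyperparameters are hypothetical placeholders, and GPT-2 is used only because the paper mentions it.

```python
# Sketch of recursive self-training: generate synthetic text, probe its political
# leaning, then fine-tune the model on that same synthetic text. Assumptions:
# the bias classifier ID is a placeholder, and hyperparameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

# Hypothetical off-the-shelf political-leaning classifier used only as a probe.
bias_probe = pipeline("text-classification",
                      model="some-org/political-leaning-classifier")

prompts = ["The government should", "Taxes on the wealthy", "Immigration policy must"]
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

for cycle in range(5):  # number of synthetic-data training cycles
    # 1. Generate synthetic text from the current model.
    synthetic = []
    for p in prompts:
        ids = tok(p, return_tensors="pt").to(device)
        out = model.generate(**ids, max_new_tokens=60, do_sample=True, top_p=0.9,
                             pad_token_id=tok.eos_token_id)
        synthetic.append(tok.decode(out[0], skip_special_tokens=True))

    # 2. Probe the political leaning of the synthetic outputs.
    labels = [bias_probe(s[:512])[0]["label"] for s in synthetic]
    print(f"cycle {cycle}: {labels}")

    # 3. Fine-tune the model on its own outputs, which become the next
    #    cycle's training data.
    model.train()
    for text in synthetic:
        batch = tok(text, return_tensors="pt", truncation=True).to(device)
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    model.eval()
```

Tracking the probe's labels across cycles is one simple way to see whether the distribution of generated text drifts toward one political leaning as the loop continues.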